Are the things that “I’ve got,” according to songs in my iTunes library.
I like to record it here when I see a baseball game and write down a few thoughts, but I forgot to do it right after CJ and I went to these games on May 20 and May 21, and so my memories are a little sparse. A few scattered things. This was the fourth time the Orioles have lost 5-2 this year. Tom Scocca and I agree that this is somehow the emblematic score for this team to lose by. To hit well enough to have won, you’d have to score 6 runs, which we’re rarely doing. And to pitch well enough to have won, you’d have to allow only 1 run, which we’re even more rarely doing. It’s a loss that doesn’t look like a blowout yet is somehow insuperable. The all-time record for most 5-2 losses in a season is 9, held by the 2011 Giants. What’s weird is, that team was actually pretty good! But when they lost, they lost 5-2.
Anyway, that was their 8th loss in a row. We saw two young pitchers, Chayce McDermott and Logan Henderson, each make their third major league start. Henderson was a lot better. Like to the point that I felt: we might want to remember that we saw one of this guy’s very first starts in the major league. A lot of very, very ugly swings from Orioles hitters.
The next afternoon was a lot better. First of all, it was Brewers Math Day. I got to meet some kids from New Berlin and their calculus teacher, who I’d had a phone meeting with earlier this season. Fun! Brewers Math Day started as an outreach program by the UW-Milwaukee math department, a great instance of the Wisconsin Idea.
The score doesn’t make it sound that way, but it was a good old-fashioned pitcher’s duel, which the Orioles almost won 3-2. But Felix Bautista, who has been looking shaky, continued to look shaky. He came in for the bottom of the 9th, he walked a couple of guys, he got two outs and got Brewers rookie 3b and 8th place hitter Caleb Durbin into a 2-strike count — the scattered O’s fans stood up to get ready to celebrate — and Durbin singled in the tying run. Orioles and Brewers traded runs in the 10th, and then the Brewers, in the 11th, ran out of guys they wanted to see pitch and sent out Joel Payamps, who they were not excited to see pitch. But Adley Rutschman was excited to see Joel Payamps pitch. And that’s how this game ended with a not very extra-innings pitching duel type score.
Nowadays it is best to exercise caution when bringing the words “quantum” and “consciousness” anywhere near each other, lest you be suspected of mysticism or quackery. Eugene Wigner did not concern himself with this when he wrote his “Remarks on the Mind-Body Question” in 1967. (Perhaps he was emboldened by his recent Nobel prize for contributions to the mathematical foundations of quantum mechanics, which gave him not a little no-nonsense technical credibility.) The mind-body question he addresses is the full-blown philosophical question of “the relation of mind to body”, and he argues unapologetically that quantum mechanics has a great deal to say on the matter. The workhorse of his argument is a thought experiment that now goes by the name “Wigner’s Friend”. About fifty years later, Daniela Frauchiger and Renato Renner formulated another, more complex thought experiment to address related issues in the foundations of quantum theory. In this post, I’ll introduce Wigner’s goals and argument, and evaluate Frauchiger’s and Renner’s claims of its inadequacy, concluding that these are not completely fair, but that their thought experiment does do something interesting and distinct. Finally, I will describe a recent paper of my own, in which I formalize the Frauchiger-Renner argument in a way that illuminates its status and isolates the mathematical origin of their paradox.
* * *
Wigner takes a dualist view about the mind, that is, he believes it to be non-material. To him this represents the common-sense view, but it is nevertheless a newly mainstream attitude. Indeed,
[until] not many years ago, the “existence” of a mind or soul would have been passionately denied by most physical scientists. The brilliant successes of mechanistic and, more generally, macroscopic physics and of chemistry overshadowed the obvious fact that thoughts, desires, and emotions are not made of matter, and it was nearly universally accepted among physical scientists that there is nothing besides matter.
He credits the advent of quantum mechanics with
the return, on the part of most physical scientists, to the spirit of Descartes’s “Cogito ergo sum”, which recognizes the thought, that is, the mind, as primary. [With] the creation of quantum mechanics, the concept of consciousness came to the fore again: it was not possible to formulate the laws of quantum mechanics in a fully consistent way without reference to the consciousness.
What Wigner has in mind here is that the standard presentation of quantum mechanics speaks of definite outcomes being obtained when an observer makes a measurement. Of course this is also true in classical physics. In quantum theory, however, the principles of linear evolution and superposition, together with the plausible assumption that mental phenomena correspond to physical phenomena in the brain, lead to situations in which there is no mechanism for such definite observations to arise. Thus there is a tension between the fact that we would like to ascribe particular observations to conscious agents and the fact that we would like to view these observations as corresponding to particular physical situations occurring in their brains.
Once we have convinced ourselves that, in light of quantum mechanics, mental phenomena must be considered on an equal footing with physical phenomena, we are faced with the question of how they interact. Wigner takes it for granted that “if certain physico-chemical conditions are satisfied, a consciousness, that is, the property of having sensations, arises.” Does the influence run the other way? Wigner claims that the “traditional answer” is that it does not, but argues that in fact such influence ought indeed to exist. (Indeed this, rather than technical investigation of the foundations of quantum mechanics, is the central theme of his essay.) The strongest support Wigner feels he can provide for this claim is simply “that we do not know of any phenomenon in which one subject is influenced by another without exerting an influence thereupon”. Here he recalls the interaction of light and matter, pointing out that while matter obviously affects light, the effects of light on matter (for example radiation pressure) are typically extremely small in magnitude, and might well have been missed entirely had they not been suggested by the theory.
Quantum mechanics provides us with a second argument, in the form of a demonstration of the inconsistency of several apparently reasonable assumptions about the physical, the mental, and the interaction between them. Wigner works, at least implicitly, within a model where there are two basic types of object: physical systems and consciousnesses. Some physical systems (those that are capable of instantiating the “certain physico-chemical conditions”) are what we might call mind-substrates. Each consciousness corresponds to a mind-substrate, and each mind-substrate corresponds to at most one consciousness. He considers three claims (this organization of his premises is not explicit in his essay):
1. Isolated physical systems evolve unitarily.
2. Each consciousness has a definite experience at all times.
3. Definite experiences correspond to pure states of mind-substrates, and arise for a consciousness exactly when the corresponding mind-substrate is in the corresponding pure state.
The first and second assumptions constrain the way the model treats physical and mental phenomena, respectively. Assumption 1 is often paraphrased as the “completeness of quantum mechanics”, while Assumption 2 is a strong rejection of solipsism – the idea that only one’s own mind is sure to exist. Assumption 3 is an apparently reasonable assumption about the relation between mental and physical phenomena.
With this framework established, Wigner’s thought experiment, now typically known as Wigner’s Friend, is quite straightforward. Suppose that an observer, Alice (to name the friend), is able to perform a measurement of some physical quantity of a particle, a quantity which may take one of two values. Assumption 1 tells us that if Alice performs this measurement when the particle is in a superposition state, the joint system of Alice’s brain and the particle will end up in an entangled state. Now Alice’s mind-substrate is not in a pure state, so by Assumption 3 she does not have a definite experience. This contradicts Assumption 2. Wigner’s proposed resolution to this paradox is that in fact Assumption 1 is incorrect, and that there is an influence of the mental on the physical, namely objective collapse or, as he puts it, that the “statistical element which, according to the orthodox theory, enters only if I make an observation enters equally if my friend does”.
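To see concretely what Assumption 1 does here (my notation, not Wigner’s; call the two possible values $a$ and $b$): if the particle starts in a superposition and Alice’s measurement is a unitary interaction, then

$$\big(\alpha\,|a\rangle + \beta\,|b\rangle\big)\otimes|\text{ready}\rangle_{\text{Alice}} \;\longmapsto\; \alpha\,|a\rangle\otimes|\text{saw }a\rangle_{\text{Alice}} \;+\; \beta\,|b\rangle\otimes|\text{saw }b\rangle_{\text{Alice}},$$

and for $\alpha, \beta \neq 0$ the state of Alice’s mind-substrate alone is no longer pure, which is exactly the situation Assumptions 2 and 3 cannot jointly tolerate.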
* * *
Decades after the publication of Wigner’s essay, Daniela Frauchiger and Renato Renner formulated a new thought experiment, involving observers making measurements of other observers, which they intended to remedy what they saw as a weakness in Wigner’s argument. In their words, “Wigner proposed an argument […] which should show that quantum mechanics cannot have unlimited validity”. In fact, they argue, Wigner’s argument does not succeed in doing so. They assert that Wigner’s paradox may be resolved simply by noting a difference in what each party knows. Whereas Wigner, describing the situation from the outside, does not initially know the result of his friend’s measurement, and therefore assigns the “absurd” entangled state to the joint system composed of both her body and the system she has measured, his friend herself is quite aware of what she has observed, and so assigns to the system either, but not both, of the states corresponding to definite measurement outcomes. “For this reason”, Frauchiger and Renner argue, “the Wigner’s Friend Paradox cannot be regarded as an argument that rules out quantum mechanics as a universally valid theory.”
This criticism strikes me as somewhat unfair to Wigner. In fact, Wigner’s objection to admitting two different states as equally valid descriptions is that the two states correspond to different sets of physical properties of the joint system consisting of Alice and the system she measures. For Wigner, physical properties of physical systems are distinct from mental properties of consciousnesses. To engage in some light textual analysis, we can note that the word ‘conscious’, or ‘consciousness’, appears forty-one times in Wigner’s essay, and only once in Frauchiger and Renner’s, in the title of a cited paper. I have the impression that the authors pay inadequate attention to how explicitly Wigner takes a dualist position, including not just physical systems but also, and distinctly, consciousnesses in his ontology. Wigner’s argument does indeed achieve his goals, which are developed in the context of this strong dualism, and differ from the goals of Frauchiger and Renner, who appear not to share this philosophical stance, or at least do not commit fully to it.
Nonetheless, the thought experiment developed by Frauchiger and Renner does achieve something distinct and interesting. We can understand Wigner’s no-go theorem to be of the following form: “Within a model incorporating both mental and physical phenomena, a set of apparently reasonable conditions on how the model treats physical phenomena, mental phenomena, and their interaction cannot all be satisfied”. The Frauchiger-Renner thought experiment can be cast in the same form, with different choices about how to implement the model and which conditions to consider. The major difference in the model itself is that Frauchiger and Renner do not take consciousnesses to be entities in their own rights, but simply take some states of certain physical systems to correspond to conscious experiences. Within such a model, Wigner’s assumption that each mind has a single, definite conscious experience at all times seems far less natural than it did within his model, where consciousnesses are distinct entities from the physical systems that determine them. Thus Frauchiger and Renner need to weaken this assumption, which was so natural to Wigner. The weakening they choose is a sort of transitivity of theories of mind. In their words (Assumption C in their paper):
Suppose that agent A has established that “I am certain that agent A’, upon reasoning within the same theory as the one I am using, is certain that $x = \xi$ at time $t$.” Then agent A can conclude that “I am certain that $x = \xi$ at time $t$.”
Just as Assumption 3 above was, for Wigner, a natural restriction on how a sensible theory ought to treat mental phenomena, this serves as Frauchiger’s and Renner’s proposed constraint. Just as Wigner designed a thought experiment that demonstrated the incompatibility of his assumption with an assumption of the universal applicability of unitary quantum mechanics to physical systems, so do Frauchiger and Renner.
* * *
In my recent paper “Reasoning across spacelike surfaces in the Frauchiger-Renner thought experiment”, I provide two closely related formalizations of the Frauchiger-Renner argument. These are motivated by a few observations:
1. Assumption C ought to make reference to the (possibly different) times at which agents A and A’ are certain about their respective judgments, since these states of knowledge change.
2. Since Frauchiger and Renner do not subscribe to Wigner’s strong dualism, an agent’s certainty about a given proposition, like any other mental state, corresponds within their implicit model to a physical state. Thus statements like “Alice knows that P” should be understood as statements about the state of some part of Alice’s brain. Conditional statements like “if, upon measuring a quantity q, Alice observes a given outcome, she knows that P” should be understood as claims about the state of the composite system composed of the part of Alice’s brain responsible for knowing P and the part responsible for recording outcomes of the measurement of q.
3. Because the causal structure of the protocol does not depend on the absolute times of each event, an external agent describing the protocol can choose various “spacelike surfaces”, corresponding to fixed times in different spacetime embeddings of the protocol (or to different inertial frames). There is no reason to privilege one of these surfaces over another, and so each of them should be assigned a quantum state. This may be viewed as an implementation of a relativistic principle.
After developing a mathematical framework based on these observations, I recast Frauchiger’s and Renner’s Assumption C in two ways: first, in terms of a claim about the validity of iterating the “relative state” construction that captures how conditional statements are interpreted in terms of quantum states; and second, in terms of a deductive rule that allows chaining of inferences within a system of quantum logic. By proving that these claims are false in the mathematical framework, I provide a more formal version of the no-go theorem. I also show that the first claim can be rescued if the relative state construction is allowed to be iterated only “along” a single spacelike surface, and the second if a deduction is only allowed to chain inferences “along” a single surface. In other words, the mental transitivity condition desired by Frauchiger and Renner can in fact be combined with universal physical applicability of unitary quantum mechanics, but only if we restrict our analysis to a single spacelike surface. Thus I hope that the analysis I offer provides some clarification of what precisely is going on in Frauchiger and Renner’s thought experiment, what it tells us about combining the physical and the mental in light of quantum mechanics, and how it relates to Wigner’s thought experiment.
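For readers who have not met it, the “relative state” construction mentioned above is, in its generic textbook form (the paper adapts it to spacelike surfaces, which I suppress here): given a joint state $|\Psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ and an outcome of a measurement on $A$ associated with the state $|a\rangle$, the state of $B$ relative to that outcome is

$$|\psi_B^{(a)}\rangle \;=\; \frac{(\langle a|\otimes \mathbb{1}_B)\,|\Psi\rangle}{\big\|(\langle a|\otimes \mathbb{1}_B)\,|\Psi\rangle\big\|},$$

and “iterating” the construction means applying it again, inside $\mathcal{H}_B$, relative to a further outcome.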
* * *
In view of the fact that “Quantum theory cannot consistently describe the use of itself” has, at present, over five hundred citations, and “Remarks on the Mind-Body Question” over thirteen hundred, it seems fitting to close with a thought, cautionary or exultant, from Peter Schwenger’s book on asemic (that is, meaningless) writing. He notes that
commentary endlessly extends language; it is in the service of an impossible quest to extract the last, the final, drop of meaning.
I provide no analysis of this claim.
I never imagined that an artist would update me about quantum-computing research.
Last year, steampunk artist Bruce Rosenbaum forwarded me a notification about a news article published in Science. The article reported on an experiment performed in physicist Yiwen Chu’s lab at ETH Zürich. The experimentalists had built a “mechanical qubit”: they’d stored a basic unit of quantum information in a mechanical device that vibrates like a drumhead. The article dubbed the device a “steampunk qubit.”
I was collaborating with Bruce on a quantum-steampunk sculpture, and he asked if we should incorporate the qubit into the design. Leave it for a later project, I advised. But why on God’s green Earth are you receiving email updates about quantum computing?
My news feed sends me everything that says “steampunk,” he explained. So keeping a bead on steampunk can keep one up to date on quantum science and technology—as I’ve been preaching for years.
Other ideas displaced Chu’s qubit in my mind until I visited the University of California, Berkeley this January. Visiting Berkeley in January, one can’t help noticing—perhaps with a trace of smugness—the discrepancy between the temperature there and the temperature at home. And how better to celebrate a temperature difference than by studying a quantum-thermodynamics-style throwback to the 1800s?
One sun-drenched afternoon, I learned that one of my hosts had designed another steampunk qubit: Alp Sipahigil, an assistant professor of electrical engineering. He’d worked at Caltech as a postdoc around the time I’d finished my PhD there. We’d scarcely interacted, but I’d begun learning about his experiments in atomic, molecular, and optical physics then. Alp had learned about my work through Quantum Frontiers, as I discovered this January. I had no idea that he’d “met” me through the blog until he revealed as much to Berkeley’s physics department, when introducing the colloquium I was about to present.
Alp and collaborators proposed that a qubit could work as follows. It consists largely of a cantilever, which resembles a pendulum that bobs back and forth. The cantilever, being quantum, can have only certain amounts of energy. When the pendulum has a particular amount of energy, we say that the pendulum is in a particular energy level.
One might hope to use two of the energy levels as a qubit: if the pendulum were in its lowest-energy level, the qubit would be in its 0 state; and the next-highest level would represent the 1 state. A bit—a basic unit of classical information—has 0 and 1 states. A qubit can be in a superposition of 0 and 1 states, and so the cantilever could be.
A flaw undermines this plan, though. Suppose we want to process the information stored in the cantilever—for example, to turn a 0 state into a 1 state. We’d inject quanta—little packets—of energy into the cantilever. Each quantum would contain an amount of energy equal to (the energy associated with the cantilever’s 1 state) – (the amount associated with the 0 state). This equality would ensure that the cantilever could accept the energy packets lobbed at it.
But the cantilever doesn’t have only two energy levels; it has loads. Worse, all the inter-level energy gaps equal each other. However much energy the cantilever consumes when hopping from level 0 to level 1, it consumes that much when hopping from level 1 to level 2. This pattern continues throughout the rest of the levels. So imagine starting the cantilever in its 0 level, then trying to boost the cantilever into its 1 level. We’d probably succeed; the cantilever would probably consume a quantum of energy. But nothing would stop the cantilever from gulping more quanta and rising to higher energy levels. The cantilever would cease to serve as a qubit.
We can avoid this problem, Alp’s team proposed, by placing an atomic-force microscope near the cantilever. An atomic force microscope maps out surfaces similarly to how a Braille user reads: by reaching out a hand and feeling. The microscope’s “hand” is a tip about ten nanometers across. So the microscope can feel surfaces far more fine-grained than a Braille user can. Bumps embossed on a page force a Braille user’s finger up and down. Similarly, the microscope’s tip bobs up and down due to forces exerted by the object being scanned.
Imagine placing a microscope tip such that the cantilever swings toward it and then away. The cantilever and tip will exert forces on each other, especially when the cantilever swings close. This force changes the cantilever’s energy levels. Alp’s team chose the tip’s location, the cantilever’s length, and other parameters carefully. Under the chosen conditions, boosting the cantilever from energy level 1 to level 2 costs more energy than boosting from 0 to 1.
So imagine, again, preparing the cantilever in its 0 state and injecting energy quanta. The cantilever will gobble a quantum, rising to level 1. The cantilever will then remain there, as desired: to rise to level 2, the cantilever would have to gobble a larger energy quantum, which we haven’t provided.1
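A toy calculation makes the point about level spacing concrete. The numbers below are purely illustrative (they’re not parameters from Alp’s proposal or Chu’s experiment); the point is only that a bare cantilever has evenly spaced levels, so a drive resonant with the 0→1 gap also drives 1→2, while an anharmonic shift detunes the second step:

    # Toy model: energy levels of a bare (harmonic) cantilever vs. one with a small
    # anharmonic correction, illustrating why unequal spacing lets two levels act as a qubit.
    # All numbers are illustrative only.
    h_omega = 1.0   # harmonic level spacing, arbitrary units
    chi = 0.05      # assumed anharmonic shift (illustrative)

    def harmonic_level(n):
        return h_omega * (n + 0.5)

    def anharmonic_level(n):
        # toy correction: higher levels pushed up quadratically with n
        return h_omega * (n + 0.5) + chi * n * (n - 1) / 2

    for label, level in [("harmonic", harmonic_level), ("anharmonic", anharmonic_level)]:
        gap01 = level(1) - level(0)
        gap12 = level(2) - level(1)
        print(f"{label:>10}: 0->1 gap = {gap01:.3f}, 1->2 gap = {gap12:.3f}, detuning = {gap12 - gap01:+.3f}")

    # harmonic:   both gaps equal, so a drive at the 0->1 frequency keeps climbing the ladder
    # anharmonic: the 1->2 gap is larger, so the same drive parks the system in level 1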
Will Alp build the mechanical qubit proposed by him and his collaborators? Yes, he confided, if he acquires a student nutty enough to try the experiment. For when he does—after the student has struggled through the project like a dirigible through a hurricane, but ultimately triumphed, and a journal is preparing to publish their magnum opus, and they’re brainstorming about artwork to represent their experiment on the journal’s cover—I know just the aesthetic to do the project justice.
1Chu’s team altered their cantilever’s energy levels using a superconducting qubit, rather than an atomic force microscope.
The United States’ government is waging an all-out assault on Harvard University. The strategy, so far, has been:
The grounds for this war are that Harvard allegedly does not provide a safe environment for its Jewish students, and that Harvard refuses to let the government determine who it may and may not hire.
Now, maybe you can explain to me what this is really about. I’m confused about what crimes these scientific researchers committed that justify stripping them of their grants and derailing their research. I’m also unclear as to why many apolitical, hard-working young trainees in laboratories across the campus deserve to be ejected from their graduate and post-graduate careers and sent home, delaying or ruining their futures. [Few will be able to transfer to other US schools; with all the government cuts to US science, there’s no money to support them at other locations.] And I don’t really understand how such enormous damage and disruption to the lives and careers of ten thousand-ish scientists, researchers and graduate students at Harvard (including many who are Jewish) will actually improve the atmosphere for Harvard’s Jewish students.
As far as I can see, the government is merely using Jewish students as pawns, pretending to attack Harvard on their behalf while in truth harboring no honest concern for their well-being. The fact that the horrors and nastiness surrounding the Gaza war are being exploited by the government as cover for an assault on academic freedom and scientific research is deeply cynical and exceedingly ugly.
From the outside, where Harvard is highly respected — it is certainly among the top five universities in the world, however you rank them — this must look completely idiotic, as idiotic as France gutting the Sorbonne, or the UK eviscerating Oxford. But keep in mind that Harvard is by no means the only target here. The US government is cutting the country’s world-leading research in science, technology and medicine to the bone. If that’s what you want to do, then ruining Harvard makes perfect sense.
The country that benefits the most from this self-destructive behavior? China, obviously. As a friend of mine said, this isn’t merely like shooting yourself in the foot, it’s like shooting yourself in the head.
I suspect most readers will understand that I cannot blog as usual right now. To write good articles about quantum physics requires concentration and focus. When people’s careers and life’s work are being devastated all around me, that’s simply not possible.
I’ve mentioned SciPost a few times on this blog. They’re an open journal in every sense you could think of: diamond open-access scientific publishing on an open-source platform, run with open finances. They even publish their referee reports. They’re aiming to cover not just a few subjects, but a broad swath of academia, publishing scientists’ work in the most inexpensive and principled way possible and challenging the dominance of for-profit journals.
And they’re struggling.
SciPost doesn’t charge university libraries for access: they let anyone read their articles for free. And they don’t charge authors Article Processing Charges (APCs): they let anyone publish for free. All they do is keep track of which institutions their authors are affiliated with, calculate what fraction of their total costs each institution’s authors account for, and post it in a nice searchable list on their website.
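As I understand the bookkeeping (this is my own rough sketch of the idea, not SciPost’s actual accounting code or formula), the attribution works something like this:

    # Rough sketch of SciPost-style cost attribution: each article's cost is split
    # among its authors' institutions in proportion to authorship, and an institution's
    # "share" is the sum of those fractions times the cost per article. Illustrative only.
    papers = [
        {"Uni A": 2, "Uni B": 1},   # authors per institution on each paper
        {"Uni B": 3},
        {"Uni A": 1, "Uni C": 1},
    ]
    cost_per_article = 500.0  # euros, roughly the per-article figure discussed below

    shares = {}
    for counts in papers:
        total_authors = sum(counts.values())
        for inst, n in counts.items():
            shares[inst] = shares.get(inst, 0.0) + (n / total_authors) * cost_per_article

    for inst, share in sorted(shares.items()):
        print(f"{inst}: {share:.2f} EUR")
    # Nothing obliges any institution to actually pay its computed share,
    # which is exactly the free-rider problem discussed below.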
And amazingly, for the last nine years, they’ve been making that work.
SciPost encourages institutions to pay their share, mostly by encouraging authors to bug their bosses until they do. SciPost will also quite happily accept more than an institution’s share, and a few generous institutions do just that, which is what has kept them afloat so far. But since nothing compels anyone to pay, most organizations simply don’t.
From an economist’s perspective, this is that most basic of problems, the free-rider problem. People want scientific publication to be free, but it isn’t. Someone has to pay, and if you don’t force someone to do it, then the few who pay will be exploited by the many who don’t.
There’s more worth saying, though.
First, it’s worth pointing out that SciPost isn’t paying the same cost everyone else pays to publish. SciPost has a stripped-down system, without any physical journals or much in-house copyediting, based entirely on their own open-source software. As a result, they pay about 500 euros per article. Compare this to the fees negotiated by particle physics’ SCOAP3 agreement, which average closer to 1000 euros, and realize that those fees are on the low end: for-profit journals tend to make their APCs higher in order to, well, make a profit.
(By the way, while it’s tempting to think of for-profit journals as greedy, I think it’s better to think of them as not cost-effective. Profit is an expense, like the interest on a loan: a payment to investors in exchange for capital used to set up the business. The thing is, online journals don’t seem to need that kind of capital, especially when they’re based on code written by academics in their spare time. So they can operate more cheaply as nonprofits.)
So when an author publishes in SciPost instead of a journal with APCs, they’re saving someone money, typically their institution or their grant. This would happen even if their institution paid their share of SciPost’s costs. (But then they would pay something rather than nothing, hence free-rider problem.)
If an author instead would have published in a closed-access journal, the kind where you have to pay to read the articles and university libraries pay through the nose to get access? Then you don’t save any money at all, your library still has to pay for the journal. You only save money if everybody at the institution stops using the journal. This one is instead a collective action problem.
Collective action problems are hard, and don’t often have obvious solutions. Free-rider problems do suggest an obvious solution: why not just charge?
In SciPost’s case, there are philosophical commitments involved. Their desire to attribute costs transparently and equally means dividing a journal’s cost among all its authors’ institutions, a cost only fully determined at the end of the year, which doesn’t make for an easy invoice.
More to the point, though, charging to publish is directly against what the Open Access movement is about.
That takes some unpacking, because of course, someone does have to pay. It probably seems weird to argue that institutions shouldn’t have to pay charges to publish papers…instead, they should pay to publish papers.
SciPost itself doesn’t go into detail about this, but despite how weird it sounds when put like I just did, there is a difference. Charging a fee to publish means that anyone who publishes needs to pay a fee. If you’re working in a developing country on a shoestring budget, too bad, you have to pay the fee. If you’re an amateur mathematician who works in a truck stop and just puzzled through something amazing, too bad, you have to pay the fee.
Instead of charging a fee, SciPost asks for support. I have to think that part of the reason is that they want some free riders. There are some people who would absolutely not be able to participate in science without free riding, and we want their input nonetheless. That means to support them, others need to give more. It means organizations need to think about SciPost not as just another fee, but as a way they can support the scientific process as a whole.
That’s how other things work, like the arXiv. They get support from big universities and organizations and philanthropists, not from literally everyone. It seems a bit weird to do that for a single scientific journal among many, though, which I suspect is part of why institutions are reluctant to do it. But for a journal that can save money like SciPost, maybe it’s worth it.
This NY Times feature lets you see how each piece of NSF's funding has been reduced this year relative to the normalized average over the last decade. Note: this fiscal year, thanks to the continuing resolution, the agency's budget has not actually been cut like this. They are just not spending congressionally appropriated agency funds. The agency, fearing/assuming that its budget will get hammered next fiscal year, does not want to start awards that it won't be able to fund in out-years. The result is that this is effectively obeying in advance the presidential budget request for FY26. (And it's highly likely that some will point to unspent funds later in the year and use that as a justification for cuts, when in fact it's anticipation of possible cuts that has led to unspent funds. I'm sure the Germans have a polysyllabic word for this. In English, "Catch-22" is close.)
AB asked me “what did they call clockwise and counterclockwise before they had clocks?” which is an excellent question I’d never considered. It turns out that clockwise was called “sunwise” in the Northern Hemisphere, and counterclockwise was called “widdershins.” Widdershins! That needs to be brought back. It later acquired a more general meaning of “in a direction opposite to or different from what was expected/desired.”
In honor of Bob Rivers, I’m considering only listening to songs that were on the Billboard Top 100 in April 1988 until the Orioles win a game. “Devil Inside” was really one of the best INXS songs but I feel it’s been largely forgotten. (“Need You Tonight” is the best INXS song, I’m sorry but sometimes the popular choice is the right one.)
Terence Trent D’Arby! I think he became a mystic and changed his name at some point. But boy do I love the little tin whistle thing in “Wishing Well.”
But there’s no way around it, if we’re doing April 1988 we are going to have to listen to Billy Ocean, “Get Out of My Dreams, Get Into My Car.” I’m going to be at American Family Field watching the Orioles take on the Brewers tomorrow and Wednesday. Let’s hope they win; I can’t let this song get stuck in my head again.
Some years ago I speculated that it would be nice if a certain mathematical object existed, and even nicer if it were to satisfy an ordinary differential equation of a special sort. I was motivated by a particular physical question, and it seemed very natural to me to imagine such an object... So natural that I was sure that it must already have been studied, the equation for it known. As a result, every so often I'd go down a rabbit hole of a literature dig, but not with much success, because it isn't entirely clear where best to look. Then I'd get involved with other projects and forget all about the matter.
Last year I began to think about it again because it might be useful in a method I was developing for a paper, went through the cycle of wondering and looking for a while, then forgot all about it again while thinking about other things.
Then, a little over a month ago at the end of March, while setting out on a long flight across the continent, I started thinking about it again, and given that I did not have a connection to the internet to hand, I took another approach: I got out a pencil and began to mess around in my notebook, just deriving what I thought the equation for this object should be, given certain properties it should have. One property is that it should in some circumstances reduce to a known powerful equation (often associated with the legendary 1975 work of Gel'fand and Dikii*) satisfied by the diagonal resolvent $latex {\widehat R}(E,x) {=}\langle x|({\cal H}-E)^{-1}|x\rangle$ of a Schrödinger Hamiltonian $latex {\cal H}=-\hbar^2\partial^2_x+u(x)$. It is:
$latex 4(u(x)-E){\widehat R}^2-2\hbar^2 {\widehat R}{\widehat R}^{\prime\prime}+\hbar^2({\widehat R}^\prime)^2 = 1\ .$
Here, $latex E$ is an energy of the Hamiltonian, in potential $latex u(x)$, and $latex x$ is a coordinate on the real line.
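As a quick sanity check on the quoted equation (using whatever normalization of $latex {\widehat R}$ the equation itself implies; this little snippet is mine, not from the post), one can verify the constant-potential case symbolically, where the $latex x$-derivatives drop out:

    # Check that for a constant potential u(x) = u0, the x-independent ansatz
    # R = 1/(2*sqrt(u0 - E)) satisfies 4(u - E) R^2 - 2 hbar^2 R R'' + hbar^2 (R')^2 = 1.
    import sympy as sp

    x, E, u0, hbar = sp.symbols('x E u0 hbar', positive=True)
    R = 1 / (2 * sp.sqrt(u0 - E))   # constant in x

    lhs = 4*(u0 - E)*R**2 - 2*hbar**2*R*sp.diff(R, x, 2) + hbar**2*sp.diff(R, x)**2
    print(sp.simplify(lhs))  # prints 1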
The object itself would be a generalisation of the diagonal resolvent $latex {\widehat R}(E,x)$, although non-diagonal in the energy, not the [...] Click to continue reading this post
The post A New Equation? appeared first on Asymptotia.
Apologies for slow posting. Real life has been very intense, and I also was rather concerned when one of my readers mentioned last weekend that these days my blog was like concentrated doom-scrolling. I will have more to say about the present university research crisis later, but first I wanted to give a hopefully diverting example of the kind of problem-solving and following-your-nose that crops up in research.
Recently in my lab we have had a need to measure very small changes in electrical resistance of some devices, at the level of a few milliOhms out of kiloOhms - parts in \(10^6\). One of my students put together a special kind of resistance bridge to do this, and it works very well. Note to interested readers: if you want to do this, make sure that you use components with very low temperature coefficients of their properties (e.g., resistors with a very small \(dR/dT\)), because otherwise your bridge becomes an extremely effective thermometer for your lab. It’s kind of cool to be able to see the lab temperature drift around by milliKelvins, but it's not great for measuring your sample of interest.
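To put a number on that warning (generic example figures, not our actual components): a 100 ppm/K resistor plus roughly ten milliKelvin of room drift already produces a fractional change comparable to the parts-in-\(10^6\) signal of interest.

    # How a resistor's temperature coefficient turns lab temperature drift into a
    # spurious bridge signal. Example numbers only (not our actual parts).
    R_arm = 2.0e3   # bridge-arm resistance, ohms
    target = 1e-6   # fractional resolution we care about (~ mOhm out of kOhm)

    for tempco_ppm_per_K in (100.0, 25.0, 0.2):   # e.g. thick film, thin film, metal foil
        for dT_K in (0.010, 0.001):               # 10 mK and 1 mK of lab drift
            dR_over_R = tempco_ppm_per_K * 1e-6 * dT_K
            print(f"tempco {tempco_ppm_per_K:6.1f} ppm/K, drift {dT_K*1e3:4.1f} mK: "
                  f"dR/R = {dR_over_R:.1e} ({dR_over_R/target:.2f}x target), "
                  f"i.e. {dR_over_R*R_arm*1e3:.3f} mOhm on a {R_arm/1e3:.0f} kOhm arm")

With ordinary 100 ppm/K parts, that drift gives a signal the same size as the effect being measured, which is the "bridge as lab thermometer" failure mode described above.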
There are a few ways to measure resistance. The simplest is the two-terminal approach, where you drive currents through and measure voltages across your device with the same two wires. This is easy, but it means that the voltage you measure includes contributions from the contacts those wires make with the device. A better alternative is the four-terminal method, where you use one pair of wires to supply/collect the current and a separate pair to measure the voltage, so the contact resistances largely drop out of the measurement.
Anyway, in the course of doing some measurements of a particular device's resistance as a function of magnetic field at low temperatures, we saw something weird. Below a certain rather low temperature, when we measured in a 2-terminal arrangement, we saw a jump up in resistance of around 20 milliOhms (out of a couple of kOhms) as the magnetic field was swept up from zero, and a small amount of resistance hysteresis with magnetic field sweep that vanished above maybe 0.25 T. This vanished completely in a 4-terminal arrangement, and also disappeared above about 3.4 K. What was this? It turns out, I think, that we accidentally rediscovered the superconducting transition in indium. While the contact pads on our sample mount looked clean to the unaided eye, they had previously had indium on them. The magic temperature is very close to the bulk \(T_{c}\) for indium.
For one post, rather than dwelling on the terrible news about the US science ecosystem, does anyone out there have other, similar fun experimental anecdotes? Glitches that turned out to be something surprising? Please share in the comments.
Three guys claim that any heavy chunk of matter emits Hawking radiation, even if it’s not a black hole:
• Michael F. Wondrak, Walter D. van Suijlekom and Heino Falcke, Gravitational pair production and black hole evaporation, Phys. Rev. Lett. 130 (2023), 221502.
Now they’re getting more publicity by claiming this will make the universe fizzle out sooner than expected. They’re claiming, for example, that a dead, cold star will emit Hawking radiation, and thus slowly lose mass and eventually disappear!
They admit that this would violate baryon conservation: after all, the protons and neutrons in the star would have to go away somehow! They admit they don’t know how this would happen. They just say that the gravitational field of the star will create particle-antiparticle pairs that will slowly radiate away, forcing the dead star to lose mass somehow to conserve energy.
If experts thought this had even a chance of being true, it would be the biggest thing since sliced bread—at least in the field of quantum gravity. Everyone would be writing papers about it, because if true it would be revolutionary. It would overturn calculations by experts which say that a stationary chunk of matter doesn’t emit Hawking radiation. It would also mean that quantum field theory in curved spacetime can only be consistent if baryon number fails to be conserved! This would be utterly shocking.
But in fact, these new papers have had almost zero effect on physics. There’s a short rebuttal, here:
• Antonio Ferreiro, José Navarro-Salas and Silvia Pla, Comment on “Gravitational pair production and black hole evaporation”, Phys. Rev. Lett. 133 (2024), 229001.
It explains that these guys used a crude approximation that gives wrong results even in a simpler problem. Similar points are made here:
• E. T. Akhmedov, D. V. Diakonov and C. Schubert, Complex effective actions and gravitational pair creation, Phys. Rev. D 110 (2024), 105011.
Unfortunately, it seems the real experts on quantum field theory in curved spacetime have not come out and mentioned the correct way to think about this issue, which has been known at least since 1975. To them—or maybe I should dare to say “us”—it’s just well known that the gravitational field of a static mass does not cause the creation of particle-antiparticle pairs.
Of course, the referees should have rejected Wondrak, van Suijlekom and Falcke’s papers. But apparently none of those referees were experts on the subject at hand. So you can’t trust a paper just because it appears in a supposedly reputable physics journal. You have to actually understand the subject and assess the paper yourself, or talk to some experts you trust.
If I were a science journalist writing an article about a supposedly shocking development like this, I would email some experts and check to see if it’s for real. But plenty of science journalists don’t bother with that anymore: they just believe the press releases. So now we’re being bombarded with lazy articles like these:
• Universe will die “much sooner than expected,” new research says, CBS News, May 13, 2025.
• Sharmila Kuthunur, Scientists calculate when the universe will end—it’s sooner than expected, Space.com, 15 May 2025.
• Jamie Carter, The universe will end sooner than thought, scientists say, Forbes, 16 May 2025.
The list goes on; these are just three. There’s no way what I say can have much effect against such a flood of misinformation. As Mark Twain said, “A lie can travel around the world and back again while the truth is lacing up its boots.” Actually he probably didn’t say that—but everyone keeps saying he did, illustrating the point perfectly.
Still, there might be a few people who both care and don’t already know this stuff. Instead of trying to give a mini-course here, let me simply point to an explanation of how things really work:
• Abhay Ashtekar and Anne Magnon, Quantum fields in curved space-times, Proceedings of the Royal Society A 346 (1975), 375–394.
It’s technical, so it’s not easy reading if you haven’t studied quantum field theory and general relativity, but that’s unavoidable. It shows that in a static spacetime there is a well-defined concept of ‘vacuum’, and the vacuum is stable. Jorge Pullin pointed out the key sentence for present purposes:
Thus, if the underlying space-time admits an everywhere time-like Killing field, the vacuum state is indeed stable and phenomena such as the spontaneous creation of particles do not occur.
This condition of having an “everywhere time-like Killing field” says that a spacetime has time translation symmetry. Ashtekar and Magnon also assume that spacetime is globally hyperbolic and that the wave equation for a massive spin-zero particle has a smooth solution given smooth initial data. All this lets us define a concept of energy for solutions of this equation. It also lets us split solutions into positive-frequency solutions, which correspond to particles, and negative-frequency ones, which correspond to antiparticles. We can thus set up quantum field theory in the way we’re used to on Minkowski spacetime, where there’s a well-defined vacuum which does not decay into particle-antiparticle pairs.
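In symbols (a bare-bones sketch of the standard construction, leaving the technical conditions to Ashtekar and Magnon): a Killing field $\xi^a$ satisfies

$$\nabla_a \xi_b + \nabla_b \xi_a = 0,$$

and it is everywhere timelike if $\xi^a \xi_a < 0$ at every point (in mostly-plus signature). For a scalar field obeying $(\Box - m^2)\varphi = 0$, the symmetry generated by $\xi^a$ lets one single out mode solutions with

$$\mathcal{L}_\xi\, \varphi_\omega = -\,i\omega\, \varphi_\omega, \qquad \omega > 0,$$

and declaring these “positive frequency” fixes a vacuum state that is invariant under the time translations and therefore does not spontaneously produce particle–antiparticle pairs.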
The Schwarzschild solution, which describes a static black hole, also has a Killing field. But this ceases to be timelike at the event horizon, so this result does not apply to that!
I could go into more detail if required, but you can find a more pedagogical treatment in this standard textbook:
• Robert Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics, University of Chicago Press, Chicago, 1994.
In particular, go to Section 4.3, which is on quantum field theory in stationary spacetimes.
I also can’t resist citing this thesis by a student of mine:
• Valeria Michelle Carrión Álvarez, Loop Quantization versus Fock Quantization of p-Form Electromagnetism on Static Spacetimes, Ph.D. thesis, U. C. Riverside, 2004.
This thesis covers the case of electromagnetism, while Ashtekar and Magnon, and also Wald, focus on a massive scalar field for simplicity.
So: it’s been rigorously shown that the gravitational field of a static object does not create particle-antiparticle pairs. This has been known for decades. Now some people have done a crude approximate calculation that seems to show otherwise. Some flaws in the approximation have been pointed out. Of course the authors of the calculation don’t believe their approximation is flawed. We could argue about that for a long time. But it’s scarcely worth thinking about, because no approximations were required to settle this issue. It was settled over 50 years ago, and the new work is not shedding new light on the issue: it’s much more hand-wavy than the old work.
I have another piece this week on the FirstPrinciples.org Hub. If you’d like to know who they are, I say a bit about my impressions of them in my post on the last piece I had there. They’re still finding their niche, so there may be shifts in the kind of content they cover over time, but for now they’ve given me an opportunity to cover a few topics that are off the beaten path.
This time, the piece is what we in the journalism biz call an “explainer”. Instead of interviewing people about cutting-edge science, I wrote a piece to explain an older idea. It’s an idea that’s pretty cool, in a way I think a lot of people can actually understand: a black hole puzzle that might explain why gravity is the weakest force. It’s an idea that’s had an enormous influence, both in the string theory world where it originated and on people speculating more broadly about the rules of quantum gravity. If you want to learn more, read the piece!
Since I didn’t interview anyone for this piece, I don’t have the same sort of “bonus content” I sometimes give. Instead of interviewing, I brushed up on the topic, and the best resource I found was this review article written by Dan Harlow, Ben Heidenreich, Matthew Reece, and Tom Rudelius. It gave me a much better idea of the subtleties: how many different ways there are to interpret the original conjecture, and how different attempts to build on it reflect on different facets and highlight different implications. If you are a physicist curious what the whole thing is about, I recommend reading that review: while I try to give a flavor of some of the subtleties, a piece for a broad audience can only do so much.
Back before satellites, to transmit radio waves over really long distances folks bounced them off the ionosphere—a layer of charged particles in the upper atmosphere. Unfortunately this layer only reflects radio waves with frequencies up to 30 megahertz. This limits the rate at which information can be transmitted.
How to work around this?
METEOR BURST COMMUNICATIONS!
On average, 100 million meteorites weighing about a milligram hit the Earth each day. They vaporize about 120 kilometers up. Each one creates a trail of ions that lasts about a second. And you can bounce radio waves with a frequency up to 100 megahertz off this trail.
That’s not a huge improvement, and you need to transmit in bursts whenever a suitable meteorite comes your way, but the military actually looked into doing this.
The National Bureau of Standards tested a burst-mode system in 1958 that used the 50-MHz band and offered a full-duplex link at 2,400 bits per second. The system used magnetic tape loops to buffer data and transmitters at both ends of the link that operated continually to probe for a path. Whenever the receiver at one end detected a sufficiently strong probe signal from the other end, the transmitter would start sending data. The Canadians got in on the MBC action with their JANET system, which had a similar dedicated probing channel and tape buffer. In 1954 they established a full-duplex teletype link between Ottawa and Nova Scotia at 1,300 bits per second with an error rate of only 1.5%.
This is from
• Dan Maloney, Radio apocalypse: meteor burst communications, Hackaday, 2025 May 12.
and the whole article is a great read.
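For a sense of what burst mode buys you, here is a back-of-envelope throughput estimate. The bursts-per-minute and usable burst duration are my own rough assumptions, not figures from the article:

    # Back-of-envelope average throughput of a meteor burst link.
    # Assumed numbers (not from the article): a few usable trails per minute,
    # each supporting ~1 s of transmission at the quoted burst rate.
    burst_rate_bps = 2400      # NBS 1958 system burst rate (quoted above)
    usable_bursts_per_min = 3  # assumption
    seconds_per_burst = 1.0    # trails last about a second (quoted above)

    duty_cycle = usable_bursts_per_min * seconds_per_burst / 60.0
    avg_throughput_bps = burst_rate_bps * duty_cycle
    print(f"duty cycle ~{duty_cycle:.1%}, average throughput ~{avg_throughput_bps:.0f} bits/s")
    # ~5% duty cycle and ~120 bit/s on average: fine for teletype traffic buffered
    # on tape loops, hopeless for anything bandwidth-hungry.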
There’s a lot more to the story. For example, until recently people used this method in the western United States to report the snow pack from mountain tops!
The system was called SNOTEL, and you can read more about it here:
• Dan Maloney, Know snow: monitoring snowpack with the SNOTEL network, Hackaday, 2023 June 29.
Also, a lot of ham radio operators bounce signals off meteors just for fun!
• Robert Gulley, Incoming! An introduction to meteor scatter propagation, The SWLing Post, 2024 January 17.
On Wednesday May 14, 2025 I’ll be giving a talk at 2 pm Pacific Time, or 10 pm UK time. The talk is for physics students at the Universidade de São Paulo in Brazil, organized by Artur Renato Baptista Boyago.
Abstract. The 20th century was the century of fundamental physics. What about the 21st? Progress on fundamental physics has been slow since about 1980, but there is exciting progress in other fields, such as condensed matter. This requires an adjustment in how we think about the goal of physics.
You can see my slides here, or watch a video of the talk here.
Rachel Greenfeld and I have just uploaded to the arXiv our paper Some variants of the periodic tiling conjecture. This paper explores variants of the periodic tiling phenomenon: the observation that, in some cases, a tile that can translationally tile a group must also be able to translationally tile the group periodically. For instance, for a given discrete abelian group $G$, consider the following question:

Question 1 (Periodic tiling question) Let $F$ be a finite subset of $G$. If there is a solution $A \subset G$ to the tiling equation $1_F * 1_A = 1$, must there exist a periodic solution $A' \subset G$ to the same equation $1_F * 1_{A'} = 1$?

We know that the answer to this question is positive for finite groups $G$ (trivially, since all sets are periodic in this case), for one-dimensional groups $\mathbb{Z} \times G_0$ with $G_0$ finite, and in $\mathbb{Z}^2$, but it can fail for $\mathbb{Z}^2 \times G_0$ for certain finite $G_0$, and also for $\mathbb{Z}^d$ for sufficiently large $d$; see this previous blog post for more discussion. But now one can consider other variants of this question:
We are able to obtain positive answers to three such analogues of the periodic tiling conjecture. The first result (which was kindly shared with us by Tim Austin) concerns the homogeneous problem $F * f = 0$. Here the results are very satisfactory:

Theorem 2 (First periodic tiling result) Let $G$ be a discrete abelian group, and let $F: G \to \mathbb{Z}$ be integer-valued and finitely supported. Then the following are equivalent.

- (i) There exists an integer-valued solution $f: G \to \mathbb{Z}$ to $F * f = 0$ that is not identically zero.
- (ii) There exists a periodic integer-valued solution $f: G \to \mathbb{Z}$ to $F * f = 0$ that is not identically zero.
- (iii) There is a vanishing Fourier coefficient $\hat{F}(\xi) = 0$ for some non-trivial character $\xi$ of finite order.

By combining this result with an old result of Henry Mann about sums of roots of unity, as well as an even older decidability result of Wanda Szmielew, we obtain

Corollary 3 Any of the statements (i), (ii), (iii) is algorithmically decidable; there is an algorithm that, when given $G$ and $F$ as input, determines in finite time whether any of these assertions hold.
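As a concrete illustration of the equivalence in Theorem 2 (my example, not one from the paper): take $G = \mathbb{Z}$ and $F = 1_{\{0,1\}}$. Then $\hat{F}$ vanishes at the order-two character $n \mapsto (-1)^n$, and correspondingly $f(n) = (-1)^n$ is a periodic, integer-valued, not-identically-zero solution of $F * f = 0$, since $(F*f)(n) = f(n) + f(n-1) = 0$. A quick numerical check:

    # Verify, on a window of Z, that F = 1_{0,1} and f(n) = (-1)^n satisfy F * f = 0,
    # and that the Fourier coefficient of F at the order-2 character n -> (-1)^n vanishes.
    F = {0: 1, 1: 1}                 # finitely supported, integer-valued
    f = lambda n: (-1) ** n          # periodic, not identically zero

    conv = [sum(F[k] * f(n - k) for k in F) for n in range(-10, 10)]
    print(all(c == 0 for c in conv))            # True: (F * f)(n) = 0 on the sampled window

    print(sum(F[k] * (-1) ** k for k in F))     # 0: vanishing Fourier coefficient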
Now we turn to the inhomogeneous problem in $\mathbb{Z}^2$, which is the first difficult case (periodic tiling type results are easy to establish in one dimension, and trivial in zero dimensions). Here we have two results:

Theorem 4 (Second periodic tiling result) Let $G = \mathbb{Z}^2$, let $g: G \to \mathbb{Z}$ be periodic, and let $F: G \to \mathbb{Z}$ be integer-valued and finitely supported. Then the following are equivalent.

- (i) There exists an integer-valued solution $f: G \to \mathbb{Z}$ to $F * f = g$.
- (ii) There exists a periodic integer-valued solution $f: G \to \mathbb{Z}$ to $F * f = g$.

Theorem 5 (Third periodic tiling result) Let $G = \mathbb{Z}^2$, let $g: G \to \mathbb{Z}$ be periodic, and let $F: G \to \mathbb{Z}$ be integer-valued and finitely supported. Then the following are equivalent.

- (i) There exists an indicator function solution $f = 1_A$ to $F * f = g$.
- (ii) There exists a periodic indicator function solution $f = 1_A$ to $F * f = g$.

In particular, the previously established case of the periodic tiling conjecture for level one tilings of $\mathbb{Z}^2$ is now extended to higher level. By an old argument of Hao Wang, we now know that the statements mentioned in Theorem 5 are also algorithmically decidable, although it remains open whether the same is the case for Theorem 4. We know from past results that Theorem 5 cannot hold in sufficiently high dimension (even in the classic case $g = 1$), but it also remains open whether Theorem 4 fails in that setting.
Following past literature, we rely heavily on a structure theorem for solutions $f$ to tiling equations $F * f = g$, which roughly speaking asserts that such solutions must be expressible as a finite sum of functions that are one-periodic (periodic in a single direction). This already explains why tiling is easy to understand in one dimension, and why the two-dimensional case is more tractable than the case of general dimension. This structure theorem can be obtained by averaging a dilation lemma, which is a somewhat surprising symmetry of tiling equations that basically arises from finite characteristic arguments (viewing the tiling equation modulo $p$ for various large primes $p$).

For Theorem 2, one can take advantage of the fact that the homogeneous equation is preserved under the finite difference operators $\partial_h f(x) := f(x+h) - f(x)$: if $f$ solves $F * f = 0$, then $\partial_h f$ also solves the same equation. This freedom to take finite differences allows one to selectively eliminate certain one-periodic components of a solution to the homogeneous equation until the solution is a pure one-periodic function, at which point one can appeal to an induction on dimension to equate parts (i) and (ii) of the theorem. To link up with part (iii), we also take advantage of the existence of retraction homomorphisms down to the rationals to convert a vanishing Fourier coefficient into an integer-valued solution of the homogeneous equation.
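Spelled out, the reason finite differencing is available at all is just that convolution commutes with translation: with $\partial_h f(x) := f(x+h) - f(x)$ as above, one has

$$F * (\partial_h f) = \partial_h (F * f) = 0 \quad \text{whenever } F * f = 0,$$

so each difference of a solution to the homogeneous equation is again a solution.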
The inhomogeneous results are more difficult, and rely on arguments that are specific to two dimensions. For Theorem 4, one can also perform finite differences to analyze the various one-periodic components of a solution $f$ to a tiling equation $F * f = g$, but the conclusion now is that these components are determined (modulo $1$) by polynomials of one variable. Applying a retraction homomorphism, one can make the coefficients of these polynomials rational, which makes the polynomials periodic. This turns out to reduce the original tiling equation to a system of essentially local combinatorial equations, which allows one to “periodize” a non-periodic solution by periodically repeating a suitable block of the (retraction homomorphism applied to the) original solution.

Theorem 5 is significantly more difficult to establish than the other two results, because of the need to maintain the solution in the form of an indicator function. There are now two separate sources of aperiodicity to grapple with. One is the fact that the polynomials involved in the components may have irrational coefficients (see Theorem 1.3 of our previous paper for an explicit example of this for a level 4 tiling). The other is that, in addition to the polynomials (which influence the fractional parts of the components), there is also “combinatorial” data (roughly speaking, associated to the integer parts of the components) which also interact with each other in a slightly non-local way. Once one can make the polynomial coefficients rational, there is enough periodicity that the periodization approach used for the second theorem can be applied to the third; the main remaining challenge is to find a way to make the polynomial coefficients rational while still maintaining the indicator function property of the solution.
It turns out that the retraction homomorphism approach is no longer available here (it makes the components unbounded, which makes the combinatorial problem too difficult to solve). Instead, one has to first perform a second moment analysis to discern more structure about the polynomials involved. It turns out that the one-periodic components of an indicator function solution can only utilize linear polynomials (as opposed to polynomials of higher degree), and that one can partition $\mathbb{Z}^2$ into a finite number of cosets, on each of which only three of these linear polynomials are “active”. The irrational coefficients of these linear polynomials then have to obey a rather complicated, but (locally) finite, sentence in the theory of first-order linear inequalities over the rationals in order for the solution to form an indicator function. One can then use the Weyl equidistribution theorem to replace these irrational coefficients with rational coefficients that obey the same constraints (although one first has to ensure that one does not accidentally fall into the boundary of the constraint set, where things are discontinuous). Then one can apply periodization to the remaining combinatorial data to conclude.
A key technical problem arises from the discontinuities of the fractional part operator at integers, so a certain amount of technical manipulation (in particular, passing at one point to a weak limit of the original tiling) is needed to avoid ever having to encounter this discontinuity.
In a recent post, I talked about a proof of concept tool to verify estimates automatically. Since that post, I have overhauled the tool twice: first to turn it into a rudimentary proof assistant that could also handle some propositional logic; and second into a much more flexible proof assistant (deliberately designed to mimic the Lean proof assistant in several key aspects) that is also powered by the extensive Python package sympy for symbolic algebra, following the feedback from previous commenters. This I think is now a stable framework with which one can extend the tool much further; my initial aim was just to automate (or semi-automate) the proving of asymptotic estimates involving scalar functions, but in principle one could keep adding tactics, new sympy types, and lemmas to the tool to handle a very broad range of other mathematical tasks as well.
The current version of the proof assistant can be found here. (As with my previous coding, I ended up relying heavily on large language model assistance to understand some of the finer points of Python and sympy, with the autocomplete feature of Github Copilot being particularly useful.) While the tool can support fully automated proofs, I have decided to focus more for now on semi-automated interactive proofs, where the human user supplies high-level “tactics” that the proof assistant then performs the necessary calculations for, until the proof is completed.
It’s easiest to explain how the proof assistant works with examples. Right now I have implemented the assistant to work inside the interactive mode of Python, in which one enters Python commands one at a time. (Readers from my generation may be familiar with text adventure games, which have a broadly similar interface.) I would be interested in developing, at some point, a graphical user interface for the tool, but for prototype purposes, the Python interactive version suffices. (One can also run the proof assistant within a Python script, of course.)
After downloading the relevant files, one can launch the proof assistant inside Python by typing from main import *
and then loading one of the pre-made exercises. Here is one such exercise:
>>> from main import *
>>> p = linarith_exercise()
Starting proof. Current proof state:
x: pos_real
y: pos_real
z: pos_real
h1: x < 2*y
h2: y < 3*z + 1
|- x < 7*z + 2
This is the proof assistant’s formalization of the following problem: If x, y, z are positive reals such that x < 2y and y < 3z + 1, prove that x < 7z + 2.
The way the proof assistant works is that one directs the assistant to use various “tactics” to simplify the problem until it is solved. In this case, the problem can be solved by linear arithmetic, as formalized by the Linarith()
tactic:
>>> p.use(Linarith())
Goal solved by linear arithmetic!
Proof complete!
If instead one wanted a bit more detail on how the linear arithmetic worked, one could have run this tactic instead with a verbose
flag:
>>> p.use(Linarith(verbose=True))
Checking feasibility of the following inequalities:
1*z > 0
1*x + -7*z >= 2
1*y + -3*z < 1
1*y > 0
1*x > 0
1*x + -2*y < 0
Infeasible by summing the following:
1*z > 0 multiplied by 1/4
1*x + -7*z >= 2 multiplied by 1/4
1*y + -3*z < 1 multiplied by -1/2
1*x + -2*y < 0 multiplied by -1/4
Goal solved by linear arithmetic!
Proof complete!
Sometimes, the proof involves case splitting, and then the final proof has the structure of a tree. Here is one example, where the task is to show that the hypotheses -1 < x < 1 and -2 < y < 2 imply -3 < x + y < 3:
>>> from main import *
>>> p = split_exercise()
Starting proof. Current proof state:
x: real
y: real
h1: (x > -1) & (x < 1)
h2: (y > -2) & (y < 2)
|- (x + y > -3) & (x + y < 3)
>>> p.use(SplitHyp("h1"))
Decomposing h1: (x > -1) & (x < 1) into components x > -1, x < 1.
1 goal remaining.
>>> p.use(SplitHyp("h2"))
Decomposing h2: (y > -2) & (y < 2) into components y > -2, y < 2.
1 goal remaining.
>>> p.use(SplitGoal())
Split into conjunctions: x + y > -3, x + y < 3
2 goals remaining.
>>> p.use(Linarith())
Goal solved by linear arithmetic!
1 goal remaining.
>>> p.use(Linarith())
Goal solved by linear arithmetic!
Proof complete!
>>> print(p.proof())
example (x: real) (y: real) (h1: (x > -1) & (x < 1)) (h2: (y > -2) & (y < 2)): (x + y > -3) & (x + y < 3) := by
split_hyp h1
split_hyp h2
split_goal
. linarith
linarith
Here at the end we gave a “pseudo-Lean” description of the proof in terms of the tactics used: two split_hyp tactics to decompose the hypotheses h1 and h2 into their components, a split_goal tactic to split the conjunction in the goal into two separate goals, and two applications of the linarith tactic to dispatch each of those goals by linear arithmetic.
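As noted earlier, one can also run the proof assistant non-interactively within a Python script; for instance, the above proof could be replayed as follows (a sketch that uses only the commands already demonstrated):

from main import *           # load the proof assistant, as in the interactive examples

p = split_exercise()         # goal: (x + y > -3) & (x + y < 3), given h1 and h2
p.use(SplitHyp("h1"))        # decompose h1 into x > -1 and x < 1
p.use(SplitHyp("h2"))        # decompose h2 into y > -2 and y < 2
p.use(SplitGoal())           # split the conjunction in the goal into two goals
p.use(Linarith())            # dispatch the first goal by linear arithmetic
p.use(Linarith())            # dispatch the second goal
print(p.proof())             # print the pseudo-Lean proof script shown above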
The tool supports asymptotic estimation. I found a way to implement the order of magnitude formalism from the previous post within sympy. It turns out that sympy, in some sense, already natively implements nonstandard analysis: its symbolic variables have an is_number
flag which basically corresponds to the concept of a “standard” number in nonstandard analysis. For instance, the sympy
version S(3)
of the number 3 has S(3).is_number == True
and so is standard, whereas an integer variable n = Symbol("n", integer=True)
has n.is_number == False
and so is nonstandard. Within sympy
, I was able to construct orders of magnitude Theta(X)
of various (positive) expressions X
, with the property that Theta(n)=Theta(1)
if n
is a standard number, and use this concept to define asymptotic estimates such as X ≲ Y (implemented as lesssim(X,Y)). One can then apply a logarithmic form of linear arithmetic to automatically verify some asymptotic estimates. Here is a simple example, in which one is given a positive integer N and positive reals x, y such that x ≤ 2N^2 and y < 3N, and the task is to conclude that xy ≲ N^4:
>>> p = loglinarith_exercise()
Starting proof. Current proof state:
N: pos_int
x: pos_real
y: pos_real
h1: x <= 2*N**2
h2: y < 3*N
|- Theta(x)*Theta(y) <= Theta(N)**4
>>> p.use(LogLinarith(verbose=True))
Checking feasibility of the following inequalities:
Theta(N)**1 >= Theta(1)
Theta(x)**1 * Theta(N)**-2 <= Theta(1)
Theta(y)**1 * Theta(N)**-1 <= Theta(1)
Theta(x)**1 * Theta(y)**1 * Theta(N)**-4 > Theta(1)
Infeasible by multiplying the following:
Theta(N)**1 >= Theta(1) raised to power 1
Theta(x)**1 * Theta(N)**-2 <= Theta(1) raised to power -1
Theta(y)**1 * Theta(N)**-1 <= Theta(1) raised to power -1
Theta(x)**1 * Theta(y)**1 * Theta(N)**-4 > Theta(1) raised to power 1
Proof complete!
The logarithmic linear programming solver can also handle lower order terms, by a rather brute force branching method:
>>> p = loglinarith_hard_exercise()
Starting proof. Current proof state:
N: pos_int
x: pos_real
y: pos_real
h1: x <= 2*N**2 + 1
h2: y < 3*N + 4
|- Theta(x)*Theta(y) <= Theta(N)**3
>>> p.use(LogLinarith())
Goal solved by log-linear arithmetic!
Proof complete!
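For intuition, the underlying estimate can also be checked by hand (a sketch, not a transcript of the tool's internal case analysis): since N >= 1, the lower order terms are dominated by the leading ones, so that

Theta(x) <= Theta(2*N**2 + 1) = Theta(N)**2 and Theta(y) <= Theta(3*N + 4) = Theta(N),

and hence Theta(x)*Theta(y) <= Theta(N)**2 * Theta(N) = Theta(N)**3.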
I plan to start developing tools for estimating function space norms of symbolic functions, for instance creating tactics to deploy lemmas such as Hölder’s inequality and the Sobolev embedding inequality. It looks like the sympy
framework is flexible enough to allow for creating further object classes for these sorts of objects. (Right now, I only have one proof-of-concept lemma to illustrate the framework, the arithmetic mean-geometric mean lemma.)
I am satisfied enough with the basic framework of this proof assistant that I would be open to further suggestions or contributions of new features, for instance by introducing new data types, lemmas, and tactics, or by contributing example problems that ought to be easily solvable by such an assistant, but are currently beyond its ability, for instance due to the lack of appropriate tactics and lemmas.
In the news this week was the joint announcement by the presidents of the European Commission and France of initiatives about welcoming top researchers from abroad, with the aim being especially to encourage researchers from the USA to cross the Atlantic. I've seen some discussion online about this among people I know and thought I'd add a few comments here, for those outside Europe thinking about making such a jump.
Firstly, what is the new initiative? Various programmes have been put in place; on the EU side it seems to be encouraging applications to Marie Curie Fellowships for postdocs and ERC grants. It looks like there is some new money, particularly for Marie Curie Fellowships for incoming researchers. Applying for these is generally good advice, as they are prestigious programmes that open the way to a career; in my field a Marie Curie often leads to a permanent position, and an ERC grant is so huge that it opens doors everywhere. In France, the programme seems to be an ANR programme targeting specific strategic fields, so it is unlikely to be relevant for high-energy physicists (despite the fact that they invited Mark Thomson to speak at the meeting). But France can be a destination for the European programmes, and there are good reasons to choose France as a destination.
So the advice would seem to be to try out life in France with a Marie Curie Fellowship, and then apply through the usual channels for a permanent position. This is very reasonable, because it makes little sense to move permanently before having some idea of what life and research are actually like here. I would heartily recommend it. There are several permanent positions available every year in the CNRS at the junior level, but because of the way CNRS hiring works -- via a central committee that decides on positions for the whole country -- it is not very easy to replace someone who leaves, and people job-hopping is a recurrent problem. There is also the possibility for people to enter the CNRS at a senior level, with up to one position available in theoretical physics most years.
I wrote a bit last year about some of the great things about the CNRS, but I will add a bit more now. Firstly, what is it? It is a large organisation that essentially just hires permanent researchers, who work in laboratories throughout the country. Most of these laboratories are hosted by universities, such as my lab (the LPTHE), which is hosted by Sorbonne University. Most of these laboratories are mixed, meaning that they also include university staff, i.e. researchers who also teach undergraduates. University positions have a similar but parallel career track to the CNRS, but since the teaching is done in French, and because the positions only open on a rather unpredictable basis, I won't talk about them today. The CNRS positions are 100% research; there is little administrative overhead, and therefore plenty of time to focus on what is important. This is the main advantage of such positions; but the organisation of researchers into laboratories is also a big difference from the Anglo-Saxon model. My lab is relatively small, yet contains a large number of people working in HEP, and this provides a very friendly environment with lots of interesting interactions, without being lost in a labyrinthine organisation or having key decisions taken by people working in vastly different (sub)fields.
The main criticisms I have seen bandied around on social media about the CNRS are that the pay is not competitive, and that CNRS researchers are lazy/do not work. I won't comment about pay, because it's difficult to compare. But there is plenty of oversight by the CNRS committee -- a body of our peers elected by all researchers -- which scrutinises activity, in addition to deciding on hiring and promotions. If people were really sitting on their hands then this would be spotted and nipped in the bud; but the process of doing this is not onerous or intrusive, precisely because it is done by our peers. In fact, the yearly and five-yearly reports serve a useful role in helping people to focus their activities and plan for the next one to five years. There is also evaluation of laboratories and universities (by the HCERES, which will now be changed into something else), which however seems sensible: it doesn't seem to lead to the same sort of panic or perverse incentives that the (equivalent) REF seems to induce in the UK, for example.
The people I know are incredibly hard-working and productive. This is, to be fair, also a product of the fact that we have relatively few PhD students compared to other countries. This is partly by design: the philosophy is that it is unfair to train lots of students who can never get permanent positions in the field. As a result, we take good care of our students, and the students we have tend to be good; but since we have the time, we mostly do research ourselves, rather than just being managers.
So the main reason to choose France is to be allowed to do the research you want to do, without managerialisation, bureaucrats or other obstacles interfering. If that sounds appealing, then I suggest getting in touch and/or arranging to visit. A visit to the RPP or one of the national meetings would be a great way to start. The applications for Marie Curie fellowships are open now, and the CNRS competition opens in December with a deadline usually in early January.
Blogger Andrew Oh-Willeke of Dispatches from Turtle Island pointed me to an editorial in Science about the phrase scientific consensus.
The editorial argues that by referring to conclusions like the existence of climate change or vaccine safety as “the scientific consensus”, communicators have inadvertently fanned the flames of distrust. By emphasizing agreement between scientists, the phrase “scientific consensus” leaves open the question of how that consensus was reached. More conspiracy-minded people imagine shady backroom deals and corrupt payouts, while the more realistic blame incentives and groupthink. If you disagree with “the scientific consensus”, you may thus decide the best way forward is to silence those pesky scientists.
(The link to current events is left as an exercise to the reader, to comment on elsewhere. As usual, please no explicit discussion of politics on this blog!)
Instead of “scientific consensus”, the editorial suggests another term, convergence of evidence. The idea is that by centering the evidence instead of the scientists, the phrase would make it clear that these conclusions are justified by something more than social pressures, and will remain even if the scientists promoting them are silenced.
Oh-Willeke pointed me to another blog post responding to the editorial, which has a nice discussion of how the terms were used historically, showing their popularity over time. “Convergence of evidence” was more popular in the 1950’s, with a small surge in the late 90’s and early 2000’s. “Scientific consensus” rose in the 1980’s and 90’s, lining up with a time when social scientists were skeptical about science’s objectivity and wanted to explore the social reasons why scientists come to agreement. It then fell around the year 2000, before rising again, this time used instead by professional groups of scientists to emphasize their agreement on issues like climate change.
(The blog post then goes on to try to motivate the word “consilience” instead, on the rather thin basis that “convergence of evidence” isn’t interdisciplinary enough, which seems like a pretty silly objection. “Convergence” implies coming in from multiple directions, it’s already interdisciplinary!)
I appreciate “convergence of evidence”, it seems like a useful phrase. But I think the editorial is working from the wrong perspective, in trying to argue for which terms “we should use” in the first place.
Sometimes, as a scientist or an organization or a journalist, you want to emphasize evidence. Is it “a preponderance of evidence”, most but not all? Is it “overwhelming evidence”, evidence so powerful it is unlikely to ever be defeated? Or is it a “convergence of evidence”, evidence that came in slowly from multiple paths, each independent route making a coincidence that much less likely?
But sometimes, you want to emphasize the judgement of the scientists themselves.
Sometimes when scientists agree, they’re working not from evidence but from personal experience: feelings of which kinds of research pan out and which don’t, or shared philosophies that sit deep in how they conceive their discipline. Describing physicists’ reasons for expecting supersymmetry before the LHC turned on as a convergence of evidence would be inaccurate. Describing it as having been a (not unanimous) consensus gets much closer to the truth.
Sometimes, scientists do have evidence, but as a journalist, you can’t evaluate its strength. You note some controversy, you can follow some of the arguments, but ultimately you have to be honest about how you got the information. And sometimes, that will be because it’s what most of the responsible scientists you talked to agreed on: scientific consensus.
As science communicators, we care about telling the truth (as much as we ever can, at any rate). As a result, we cannot adopt blanket rules of thumb. We cannot say, “we as a community are using this term now”. The only responsible thing we can do is to think about each individual word. We need to decide what we actually mean, to read widely and learn from experience, to find which words express our case in a way that is both convincing and accurate. There’s no shortcut to that, no formula where you just “use the right words” and everything turns out fine. You have to do the work, and hope it’s enough.
I’ve now been blogging for nearly twenty years—through five presidential administrations, my own moves from Waterloo to MIT to UT Austin, my work on algebrization and BosonSampling and BQP vs. PH and quantum money and shadow tomography, the publication of Quantum Computing Since Democritus, my courtship and marriage and the birth of my two kids, a global pandemic, the rise of super-powerful AI and the terrifying downfall of the liberal world order.
Yet all that time, through more than a thousand blog posts on quantum computing, complexity theory, philosophy, the state of the world, and everything else, I chased a form of recognition for my blogging that remained elusive.
Until now.
This week I received the following email:
I emailed regarding your blog Shtetl-Optimized Blog which was selected by FeedSpot as one of the Top 50 Quantum Computing Blogs on the web.
https://bloggers.feedspot.com/quantum_computing_blogs
We recommend adding your website link and other social media handles to get more visibility in our list, get better ranking and get discovered by brands for collaboration.
We’ve also created a badge for you to highlight this recognition. You can proudly display it on your website or share it with your followers on social media.
We’d be thankful if you can help us spread the word by briefly mentioning Top 50 Quantum Computing Blogs in any of your upcoming posts.
Please let me know if you can do the needful.
You read that correctly: Shtetl-Optimized is now officially one of the top 50 quantum computing blogs on the web. You can click the link to find the other 49.
Maybe it’s not unrelated to this new notoriety that, over the past few months, I’ve gotten a massively higher-than-usual volume of emailed solutions to the P vs. NP problem, as well as the other Clay Millennium Problems (sometimes all seven problems at once), as well as quantum gravity and life, the universe, and everything. I now get at least six or seven confident such emails per day.
While I don’t spend much time on this flood of scientific breakthroughs (how could I?), I’d like to note one detail that’s new. Many of the emails now include transcripts where ChatGPT fills in the details of the emailer’s theories for them—unironically, as though that ought to clinch the case. Who said generative AI wasn’t poised to change the world? Indeed, I’ll probably need to start relying on LLMs myself to keep up with the flood of fan mail, hate mail, crank mail, and advice-seeking mail.
Anyway, thanks for reading everyone! I look forward to another twenty years of Shtetl-Optimized, if my own health and the health of the world cooperate.
Yesterday, the Texas State Legislature heard public comments about SB37, a bill that would give a state board direct oversight over course content and faculty hiring at public universities, perhaps inspired by Trump’s national crackdown on higher education. (See here or here for coverage.) So, encouraged by a friend in the history department, I submitted the following public comment, whatever good it will do.
I’m a computer science professor at UT, although I’m writing in my personal capacity. For 20 years, on my blog and elsewhere, I’ve been outspoken in opposing woke radicalism on campus and (especially) obsessive hatred of Israel that often veers into antisemitism, even when that’s caused me to get attacked from my left. Nevertheless, I write to strongly oppose SB37 in its current form, because of my certainty that no world-class research university can survive ceding control over its curriculum and faculty hiring to the state. If this bill passes, for example, it will severely impact my ability to recruit the most talented computer scientists to UT Austin, if they have competing options that will safeguard their academic freedom as traditionally conceived. Even if our candidates are approved, the new layer of bureaucracy will make it difficult and slow for us to do anything. For those concerned about intellectual diversity in academia, a much better solution would include safeguarding tenure and other protections for faculty with heterodox views, and actually enforcing content-neutral time, place, and manner rules for protests and disruptions. UT has actually done a better job on these things than many other universities in the US, and could serve as a national model for how viewpoint diversity can work — but not under an intolerably stifling regime like the one proposed by this bill.
Many problems in analysis (as well as adjacent fields such as combinatorics, theoretical computer science, and PDE) concern the order of growth (or decay) of some quantity that depends on one or more asymptotic parameters – for instance, whether the quantity grows or decays linearly, quadratically, polynomially, exponentially, etc. in those parameters. In the case where these quantities grow to infinity, these growth rates had once been termed “orders of infinity” – for instance, in the 1910 book of this name by Hardy – although this term has fallen out of use in recent years. (Hardy fields are still a thing, though.)
In modern analysis, asymptotic notation is the preferred device to organize orders of infinity. There are a couple of flavors of this notation, but here is one such (a blend of Hardy’s notation and Landau’s notation). Formally, we need a parameter space equipped with a non-principal filter that describes which subsets of the parameter space are “sufficiently large” (e.g., the cofinite (Fréchet) filter on the natural numbers, or the cocompact filter on the real line). An assertion about the parameter then holds “for sufficiently large” parameters if and only if it holds for all parameters in some element of this filter. Given two positive quantities X and Y that are defined for sufficiently large parameters, one can then define the following notions:
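Concretely, one standard way to formulate such notions for two positive quantities X and Y (a sketch of the usual Hardy–Landau-style definitions) is:

(i) X ≲ Y (or X = O(Y)): there is a constant C such that X ≤ C·Y for all sufficiently large parameters;
(ii) X ≪ Y (or X = o(Y)): for every ε > 0 one has X ≤ ε·Y for all sufficiently large parameters;
(iii) X ∼ Y: both X ≲ Y and Y ≲ X hold.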
We caution that in analytic number theory and adjacent fields, the slightly different notation of Vinogradov is favored, in which X ≪ Y would denote the concept (i) instead of (ii), and X ∼ Y would denote a fourth concept (asymptotic equality, X/Y → 1) instead of (iii). However, we will use the Hardy-Landau notation exclusively in this blog post.
Anyone who works with asymptotic notation for a while will quickly recognize that it enjoys various algebraic properties akin to the familiar algebraic properties of order on the real line. For instance, the symbols ≲, ≪, ∼ behave very much like the order relations ≤, <, =, with properties such as the following:
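For instance, one has laws of the following type (a representative sketch):

if X ≲ Y and Y ≲ Z, then X ≲ Z;
if X ≲ Y and Y ≪ Z, then X ≪ Z;
if X_1 ≲ Y_1 and X_2 ≲ Y_2, then X_1·X_2 ≲ Y_1·Y_2;
if X ≪ Y, then X ≲ Y.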
However, in contrast with other standard algebraic structures (such as ordered fields) that blend order and arithmetic operations, the precise laws of orders of infinity are usually not written down as a short list of axioms. Part of this is due to cultural differences between analysis and algebra – as discussed in this essay by Gowers, analysis is often not well suited to the axiomatic approach to mathematics that algebra benefits so much from. But another reason is due to our orthodox implementation of analysis via “epsilon-delta” type concepts, such as the notion of “sufficiently large” used above, which notoriously introduces a large number of both universal and existential quantifiers into the subject (for every epsilon, there exists a delta…) which tends to interfere with the smooth application of algebraic laws (which are optimized for the universal quantifier rather than the existential quantifier).
But there is an alternate approach to analysis, namely nonstandard analysis, which rearranges the foundations so that many of the quantifiers (particularly the existential ones) are concealed from view (usually via the device of ultrafilters). This makes the subject of analysis considerably more “algebraic” in nature, as the “epsilon management” that is so prevalent in orthodox analysis is now performed much more invisibly. For instance, as we shall see, in the nonstandard framework, orders of infinity acquire the algebraic structure of a totally ordered vector space that also enjoys a completeness property reminiscent of, though not identical to, the completeness of the real numbers. There is also a transfer principle that allows one to convert assertions in orthodox asymptotic notation into logically equivalent assertions about nonstandard orders of infinity, allowing one to then prove asymptotic statements in a purely algebraic fashion. There is a price to pay for this “algebrization” of analysis; the spaces one works with become quite large (in particular, they tend to be “inseparable” and not “countably generated” in any reasonable fashion), and it becomes difficult to extract explicit constants (or explicit decay rates) from the asymptotic notation. However, there are some cases in which the tradeoff is worthwhile. For instance, symbolic computations tend to be easier to perform in algebraic settings than in orthodox analytic settings, so formal computations of orders of infinity (such as the ones discussed in the previous blog post) could benefit from the nonstandard approach. (See also my previous posts on nonstandard analysis for more discussion about these tradeoffs.)
Let us now describe the nonstandard approach to asymptotic notation. With the above formalism, the switch from standard to nonstandard analysis is actually quite simple: one assumes that the asymptotic filter is in fact an ultrafilter. In terms of the concept of “sufficiently large”, this means adding the following useful axiom:
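One standard way to phrase this axiom (a sketch) is as follows:

For any assertion P about the parameter, exactly one of the following holds: P holds for all sufficiently large parameters, or the negation of P holds for all sufficiently large parameters.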
This can be compared with the situation with, say, the Fréchet filter on the natural numbers, in which one has to insert some qualifier such as “after passing to a subsequence if necessary” in order to make the above axiom true.
The existence of an ultrafilter requires some weak version of the axiom of choice (specifically, the ultrafilter lemma), but for this post we shall just take the existence of ultrafilters for granted.
We can now define the nonstandard orders of infinity to be the space of all non-negative functions of the parameter that are defined for sufficiently large parameters, modulo the equivalence relation ∼ defined previously. That is to say, a nonstandard order of infinity is an equivalence class of such functions. We can place various familiar algebraic operations on this space:
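For instance, writing Θ(X) for the order of magnitude (that is, the equivalence class) of a positive quantity X, one can define operations such as the following (a sketch):

Θ(X) · Θ(Y) := Θ(X·Y) (multiplication);
Θ(X)^α := Θ(X^α) for any standard real α (raising to real powers);
Θ(X) ≤ Θ(Y) if and only if X ≲ Y (ordering).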
With these operations, combined with the ultrafilter axiom, we see that this space obeys the laws of many standard algebraic structures (in particular, those of a totally ordered (log-)vector space), the proofs of which we leave as exercises for the reader.
The ordered (log-)vector space structure in particular opens up the ability to prove asymptotic implications by (log-)linear programming; this was implicitly used in my previous post. One can also use the language of (log-)linear algebra to describe further properties of various orders of infinity; for instance, when the parameter space is the natural numbers, one can single out various natural subspaces of orders of infinity.
In addition to the above algebraic properties, the nonstandard orders of infinity also enjoy a completeness property that is reminiscent of the completeness of the real numbers. In the reals, it is true that any nested sequence I_1 ⊃ I_2 ⊃ I_3 ⊃ … of non-empty closed intervals has a non-empty intersection, which is a property closely tied to the more familiar definition of completeness as the assertion that Cauchy sequences are always convergent. This claim of course fails for open intervals: for instance, the intervals (0, 1/n) for n = 1, 2, 3, … form a nested sequence of non-empty open intervals whose intersection is empty. However, in the nonstandard orders of infinity, we have the same property for both open and closed intervals!
Lemma 1 (Completeness for arbitrary intervals) Let I_1 ⊃ I_2 ⊃ I_3 ⊃ … be a nested sequence of non-empty intervals of nonstandard orders of infinity (which can be open, closed, or half-open). Then the intersection of all of the I_n is non-empty.
Proof: For sake of notation we shall assume the intervals I_n are open intervals, although much the same argument would also work for closed or half-open intervals (and then by the pigeonhole principle one can handle nested sequences of arbitrary intervals); we leave this extension to the interested reader.
Pick an element of each
, then we have
whenever
. In particular, one can find a set
in the ultrafilter such that
This property is closely related to the countable saturation and overspill properties in nonstandard analysis. From this property one might expect that the space of nonstandard orders of infinity has better topological structure than, say, the reals. This is not exactly true, because unfortunately this space is not metrizable (or separable, or first or second countable). It is perhaps better to view it as obeying a parallel type of completeness that is neither strictly stronger nor strictly weaker than the more familiar notion of metric completeness, but is otherwise rather analogous.
An article I wrote for Voices of Academia back in 2022.
This post was inspired by some recent discussions with Bjoern Bringmann.
Symbolic math software packages are highly developed for many mathematical tasks in areas such as algebra, calculus, and numerical analysis. However, to my knowledge we do not have similarly sophisticated tools for verifying asymptotic estimates – inequalities that are supposed to hold for arbitrarily large parameters, with constant losses. Particularly important are functional estimates, where the parameters involve an unknown function or sequence (living in some suitable function space, such as an L^p space); but for this discussion I will focus on the simpler situation of asymptotic estimates involving a finite number of positive real numbers, combined using arithmetic operations such as addition, multiplication, division, exponentiation, and minimum and maximum (but no subtraction). A typical inequality here might be the weak arithmetic mean-geometric mean inequality

(abc)^(1/3) ≲ a + b + c     (1)

where a, b, c are arbitrary positive real numbers, and the ≲ here indicates that we are willing to lose an unspecified (multiplicative) constant in the estimates.
I have wished in the past (e.g., in this MathOverflow answer) for a tool that could automatically determine whether such an estimate was true or not (and provide a proof if true, or an asymptotic counterexample if false). In principle, simple inequalities of this form could be automatically resolved by brute force case splitting. For instance, with (1), one first observes that a + b + c is comparable to max(a, b, c) up to constants, so it suffices to determine if

(abc)^(1/3) ≲ max(a, b, c).

Next, to resolve the maximum, one can divide into three cases: b, c ≲ a; a, c ≲ b; and a, b ≲ c. Suppose for instance that b, c ≲ a. Then the estimate to prove simplifies to

(abc)^(1/3) ≲ a,

and this is (after taking logarithms) a positive linear combination of the hypotheses b ≲ a, c ≲ a. The task of determining such a linear combination is a standard linear programming task, for which many computer software packages exist.
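As an illustration of this last step, here is a minimal standalone sketch (independent of the proof-of-concept tool described below) that uses scipy.optimize.linprog to search for the nonnegative multipliers in the case b, c ≲ a:

import numpy as np
from scipy.optimize import linprog

# After taking logarithms, we seek nonnegative multipliers lam such that
#   lam[0]*(log a - log b) + lam[1]*(log a - log c)
#       = (2/3)*log a - (1/3)*log b - (1/3)*log c,
# which certifies (abc)^(1/3) <~ a from the hypotheses b <~ a and c <~ a.
A_eq = np.array([[1.0, 1.0],     # coefficients of log a in the two hypotheses
                 [-1.0, 0.0],    # coefficients of log b
                 [0.0, -1.0]])   # coefficients of log c
b_eq = np.array([2/3, -1/3, -1/3])  # target exponents of a, b, c

res = linprog(c=[0.0, 0.0], A_eq=A_eq, b_eq=b_eq, bounds=[(0, None), (0, None)])
print(res.status, res.x)  # expect status 0 (feasible) and multipliers close to [1/3, 1/3]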
Any single such inequality is not too difficult to resolve by hand, but there are applications in which one needs to check a large number of such inequalities, or split into a large number of cases. I will take an example at random from an old paper of mine (adapted from the equation after (51), and ignoring some epsilon terms for simplicity): I wanted to establish the estimate
for any obeying the constraints
where ,
, and
are the maximum, median, and minimum of
respectively, and similarly for
,
, and
, and
. This particular bound could be dispatched in three or four lines from some simpler inequalities; but it took some time to come up with those inequalities, and I had to do a dozen further inequalities of this type. This is a task that seems extremely ripe for automation, particularly with modern technology.
Recently, I have been doing a lot more coding (in Python, mostly) than in the past, aided by the remarkable facility of large language models to generate initial code samples for many different tasks, or to autocomplete partially written code. For the most part, I have restricted myself to fairly simple coding tasks, such as computing and then plotting some mildly complicated mathematical functions, or doing some rudimentary data analysis on some dataset. But I decided to give myself the more challenging task of coding a verifier that could handle inequalities of the above form. After about four hours of coding, with frequent assistance from an LLM, I was able to produce a proof of concept tool for this, which can be found at this Github repository. For instance, to verify (1), the relevant Python code is
a = Variable("a")
b = Variable("b")
c = Variable("c")
assumptions = Assumptions()
assumptions.can_bound((a * b * c) ** (1 / 3), max(a, b, c))
and the (somewhat verbose) output verifying the inequality is
Checking if we can bound (((a * b) * c) ** 0.3333333333333333) by max(a, b, c) from the given axioms.
We will split into the following cases:
[[b <~ a, c <~ a], [a <~ b, c <~ b], [a <~ c, b <~ c]]
Trying case: ([b <~ a, c <~ a],)
Simplify to proving (((a ** 0.6666666666666667) * (b ** -0.3333333333333333)) * (c ** -0.3333333333333333)) >= 1.
Bound was proven true by multiplying the following hypotheses :
b <~ a raised to power 0.33333333
c <~ a raised to power 0.33333333
Trying case: ([a <~ b, c <~ b],)
Simplify to proving (((b ** 0.6666666666666667) * (a ** -0.3333333333333333)) * (c ** -0.3333333333333333)) >= 1.
Bound was proven true by multiplying the following hypotheses :
a <~ b raised to power 0.33333333
c <~ b raised to power 0.33333333
Trying case: ([a <~ c, b <~ c],)
Simplify to proving (((c ** 0.6666666666666667) * (a ** -0.3333333333333333)) * (b ** -0.3333333333333333)) >= 1.
Bound was proven true by multiplying the following hypotheses :
a <~ c raised to power 0.33333333
b <~ c raised to power 0.33333333
Bound was proven true in all cases!
This is of course an extremely inelegant proof, but elegance is not the point here; rather, that it is automated. (See also this recent article of Heather Macbeth for how proof writing styles change in the presence of automated tools, such as formal proof assistants.)
The code is close to also being able to handle more complicated estimates such as (3); right now I have not written code to properly handle hypotheses that involve complex expressions (as opposed to hypotheses that only involve atomic variables), but I can at least handle such complex expressions in the left and right-hand sides of the estimate I am trying to verify.
In any event, the code, being a mixture of LLM-generated code and my own rudimentary Python skills, is hardly an exemplar of efficient or elegant coding, and I am sure that there are many expert programmers who could do a much better job. But I think this is proof of concept that a more sophisticated tool of this form could be quite readily created to do more advanced tasks. One such example task was the one I gave in the above MathOverflow question, namely being able to automatically verify the claim stated there. Another task would be to automatically verify the ability to estimate some multilinear expression of various functions, in terms of norms of such functions in standard spaces such as Sobolev spaces; this is a task that is particularly prevalent in PDE and harmonic analysis (and can frankly get somewhat tedious to do by hand). As speculated in that MO post, one could eventually hope to also utilize AI to assist in the verification process, for instance by suggesting possible splittings of the various sums or integrals involved, but that would be a long-term objective.
This sort of software development would likely best be performed as a collaborative project, involving both mathematicians and expert programmers. I would be interested to receive advice on how best to proceed with such a project (for instance, would it make sense to incorporate such a tool into an existing platform such as SageMath?), and what features for a general estimate verifier would be most desirable for mathematicians. One thing on my wishlist is the ability to give a tool an expression to estimate (such as a multilinear integral of some unknown functions), as well as a fixed set of tools to bound that integral (e.g., splitting the integral into pieces, integrating by parts, using the Hölder and Sobolev inequalities, etc.), and have the computer do its best to optimize the bound it can produce with those tools (complete with some independently verifiable proof certificate for its output). One could also imagine such tools having the option to output their proof certificates in a formal proof assistant language such as Lean. But perhaps there are other useful features that readers may wish to propose.
People are talking about colliders again.
This year, the European particle physics community is updating its shared plan for the future, the European Strategy for Particle Physics. A raft of proposals at the end of March stirred up a tail of public debate, focused on asking what sort of new particle collider should be built, and discussing potential reasons why.
That discussion, in turn, has got me thinking about experiments, and how they’re justified.
The purpose of experiments, and of science in general, is to learn something new. The more sure we are of something, the less reason there is to test it. Scientists don’t check whether the Sun rises every day. Like everyone else, they assume it will rise, and use that knowledge to learn other things.
You want your experiment to surprise you. But to design an experiment to surprise you, you run into a contradiction.
Suppose that every morning, you check whether the Sun rises. If it doesn’t, you will really be surprised! You’ll have made the discovery of the century! That’s a really exciting payoff, grant agencies should be lining up to pay for…
Well, is that actually likely to happen, though?
The same reasons it would be surprising if the Sun stopped rising are reasons why we shouldn’t expect the Sun to stop rising. A sunrise-checking observatory has incredibly high potential scientific reward…but an absurdly low chance of giving that reward.
Ok, so you can re-frame your experiment. You’re not hoping the Sun won’t rise, you’re observing the sunrise. You expect it to rise, almost guaranteed, so your experiment has an almost guaranteed payoff.
But what a small payoff! You saw exactly what you expected, there’s no science in that!
By either criterion, the “does the Sun rise” observatory is a stupid experiment. Real experiments operate in between the two extremes. They also mix motivations. Together, that leads to some interesting tensions.
What was the purpose of the Large Hadron Collider?
There were a few things physicists were pretty sure of, when they planned the LHC. Previous colliders had measured W bosons and Z bosons, and their properties made it clear that something was missing. If you could collide protons with enough energy, physicists were pretty sure you’d see the missing piece. Physicists had a reasonably plausible story for that missing piece, in the form of the Higgs boson. So physicists could be pretty sure they’d see something, and reasonably sure it would be the Higgs boson.
If physicists expected the Higgs boson, what was the point of the experiment?
First, physicists expected to see the Higgs boson, but they didn’t expect it to have the mass that it did. In fact, they didn’t know anything about the particle’s mass, besides that it should be low enough that the collider could produce it, and high enough that it hadn’t been detected before. The specific number? That was a surprise, and an almost-inevitable one. A rare creature, an almost-guaranteed scientific payoff.
I say almost, because there was a second point. The Higgs boson didn’t have to be there. In fact, it didn’t have to exist at all. There was a much bigger potential payoff, of noticing something very strange, something much more complicated than the straightforward theory most physicists had expected.
(Many people also argued for another almost-guaranteed payoff, and that got a lot more press. People talked about finding the origin of dark matter by discovering supersymmetric particles, which they argued was almost guaranteed due to a principle called naturalness. This is very important for understanding the history…but it’s an argument that many people feel has failed, and that isn’t showing up much anymore. So for this post, I’ll leave it to the side.)
This mix, of a guaranteed small surprise and the potential for a very large surprise, was a big part of what made the LHC make sense. The mix has changed a bit for people considering a new collider, and it’s making for a rougher conversation.
Like the LHC, most of the new collider proposals have a guaranteed payoff. The LHC could measure the mass of the Higgs, these new colliders will measure its “couplings”: how strongly it influences other particles and forces.
Unlike the LHC, though, this guarantee is not a guaranteed surprise. Before building the LHC, we did not know the mass of the Higgs, and we could not predict it. On the other hand, now we absolutely can predict the couplings of the Higgs. We have quite precise numbers, our expectation for what they should be based on a theory that so far has proven quite successful.
We aren’t certain, of course, just like physicists weren’t certain before. The Higgs boson might have many surprising properties, things that contradict our current best theory and usher in something new. These surprises could genuinely tell us something about some of the big questions, from the nature of dark matter to the universe’s balance of matter and antimatter to the stability of the laws of physics.
But of course, they also might not. We no longer have that rare creature, a guaranteed mild surprise, to hedge in case the big surprises fail. We have guaranteed observations, and experimenters will happily tell you about them…but no guaranteed surprises.
That’s a strange position to be in. And I’m not sure physicists have figured out what to do about it.
Around 250 BC Archimedes found a general algorithm for computing pi to arbitrary accuracy, and used it to prove that 223/71 < π < 22/7. This seems to be when people started using 22/7 as an approximation to pi.
By the Middle Ages, math had backslid so much in Western Europe that scholars believed pi was actually equal to 22/7.
Around 1020, a mathematician named Franco of Liège got interested in the ancient Greek problem of squaring the circle. But since he believed that pi is 22/7, he started studying the square root of 22/7.
There’s a big difference between being misinformed and being stupid. Franco was misinformed but not stupid. He went ahead to prove that the square root of 22/7 is irrational!
His proof resembles the old Greek proof that the square root of 2 is irrational. I don’t know if Franco was aware of that. I also don’t know if he noticed that if pi were 22/7, it would be possible to square the circle with straightedge and compass. I also don’t know if he wondered why pi was 22/7. He may have just taken it on authority.
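For comparison, a modern rendering of such an irrationality argument (a sketch, not a claim about Franco's own reasoning) goes as follows: if √(22/7) = p/q for whole numbers p and q, then 22·q² = 7·p²; but the prime 7 then appears to an even power on the left and an odd power on the right, which is impossible.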
But still: math was coming back.
Franco was a student of a student of the famous scholar Gerbert of Aurillac (~950–1003), who studied in the Islamic schools of Sevilla and Córdoba, and thus got some benefits of a culture whose mathematics was light years ahead of Western Europe’s. Gerbert wrote something interesting: he said that the benefit of mathematics lies in the “sharpening of the mind”.
I got most of this interesting tale from this book:
• Thomas Sonar, trans. Morton Patricia and Keith William Morton, 3000 Years of Analysis: Mathematics in History and Culture, Birkhäuser, 2020. Preface and table of contents free here.
It’s over 700 pages long, but it’s fun to read, and you can start anywhere! The translation is weak and occasionally funny, but tolerable. If its length is intimidating, you may enjoy the detailed review here:
• Anthony Weston, 3000 years of analysis, Notices of the American Mathematical Society 70 1 (January 2023), 115–121.
Do you know when an engineer built the first artificial automaton—the first human-made machine that operated by itself, without external control mechanisms that altered the machine’s behavior over time as the machine undertook its mission?
The ancient Greek thinker Archytas of Tarentum reportedly created it about 2,300 years ago. Steam propelled his mechanical pigeon through the air.
For centuries, automata cropped up here and there as curiosities and entertainment. The wealthy exhibited automata to amuse and awe their peers and underlings. For instance, the French engineer Jacques de Vaucanson built a mechanical duck that appeared to eat and then expel grains. The device earned the nickname the Digesting Duck…and the nickname the Defecating Duck.
Vaucanson also invented a mechanical loom that helped foster the Industrial Revolution. During the 18th and 19th centuries, automata began to enable factories, which changed the face of civilization. We’ve inherited the upshots of that change. Nowadays, cars drive themselves, Roombas clean floors, and drones deliver packages.1 Automata have graduated from toys to practical tools.2
Rather, classical automata have. What of their quantum counterparts?
Scientists have designed autonomous quantum machines, and experimentalists have begun realizing them. The roster of such machines includes autonomous quantum engines, refrigerators, and clocks. Much of this research falls under the purview of quantum thermodynamics, due to the roles played by energy in these machines’ functioning: above, I defined an automaton as a machine free of time-dependent control (exerted by a user). Equivalently, according to a thermodynamicist mentality, we can define an automaton as a machine on which no user performs any work as the machine operates. Thermodynamic work is well-ordered energy that can be harnessed directly to perform a useful task. Often, instead of receiving work, an automaton receives access to a hot environment and a cold environment. Heat flows from the hot to the cold, and the automaton transforms some of the heat into work.
Quantum automata appeal to me because quantum thermodynamics has few practical applications, as I complained in my previous blog post. Quantum thermodynamics has helped illuminate the nature of the universe, and I laud such foundational insights. Yet we can progress beyond laudation by trying to harness those insights in applications. Some quantum thermal machines—quantum batteries, engines, etc.—can outperform their classical counterparts, according to certain metrics. But controlling those machines, and keeping them cold enough that they behave quantum mechanically, costs substantial resources. The machines cost more than they’re worth. Quantum automata, requiring little control, offer hope for practicality.
To illustrate this hope, my group partnered with Simone Gasparinetti’s lab at Chalmers University of Technology in Sweden. The experimentalists created an autonomous quantum refrigerator from superconducting qubits. The quantum refrigerator can help reset, or “clear,” a quantum computer between calculations.
After we wrote the refrigerator paper, collaborators and I raised our heads and peered a little farther into the distance. What does building a useful autonomous quantum machine take, generally? We laid out guidelines in a “Key Issues Review” published in Reports on Progress in Physics last November.
We based our guidelines on DiVincenzo’s criteria for quantum computing. In 1996, David DiVincenzo published seven criteria that any platform, or setup, must meet to serve as a quantum computer. He cast five of the criteria as necessary, and the remaining two, related to information transmission, as optional. Similarly, our team provides ten criteria for building useful quantum automata. We regard eight of the criteria as necessary, at least typically. The final two guidelines, which are optional, govern information transmission and machine transportation.
DiVincenzo illustrated his criteria with multiple possible quantum-computing platforms, such as ions. Similarly, we illustrate our criteria in two ways. First, we show how different quantum automata—engines, clocks, quantum circuits, etc.—can satisfy the criteria. Second, we illustrate how quantum automata can consist of different platforms: ultracold atoms, superconducting qubits, molecules, and so on.
Nature has suggested some of these platforms. For example, our eyes contain autonomous quantum energy transducers called photoisomers, or molecular switches. Suppose that such a molecule absorbs a photon. The molecule may use the photon’s energy to switch configuration. This switching sets off chemical and neurological reactions that result in the impression of sight. So the quantum switch transduces energy from light into mechanical, chemical, and electric energy.
My favorite of our criteria ranks among the necessary conditions: every useful quantum automaton must produce output worth the input. How one quantifies a machine’s worth and cost depends on the machine and on the user. For example, an agent using a quantum engine may care about the engine’s efficiency, power, or efficiency at maximum power. Costs can include the energy required to cool the engine to the quantum regime, as well as the control required to initialize the engine. The agent also chooses which value they regard as an acceptable threshold for the output produced per unit input. I like this criterion because it applies a broom to dust that we quantum thermodynamicists often hide under a rug: quantum thermal machines’ costs. Let’s begin building quantum engines that perform more work than they require to operate.
One might object that scientists and engineers are already sweating over nonautonomous quantum machines. Companies, governments, and universities are pouring billions of dollars into quantum computing. Building a full-scale quantum computer by hook or by crook, regardless of classical control, is costing enough. Eliminating time-dependent control sounds even tougher. Why bother?
Fellow Quantum Frontiers blogger John Preskill pointed out one answer, when I described my new research program to him in 2022: control systems are classical—large and hot. Consider superconducting qubits—tiny quantum circuits—printed on a squarish chip about the size of your hand. A control wire terminates on each qubit. The rest of the wire runs off the edge of the chip, extending to classical hardware standing nearby. One can fit only so many wires on the chip, so one can fit only so many qubits. Also, the wires, being classical, are hotter than the qubits should be. The wires can help decohere the circuits, introducing errors into the quantum information they store. The more we can free the qubits from external control—the more autonomy we can grant them—the better.
Besides, quantum automata exemplify quantum steampunk, as my coauthor Paul Erker observed. I kicked myself after he did, because I’d missed the connection. The irony was so thick, you could have cut it with the retractable steel knife attached to a swashbuckling villain’s robotic arm. Only two years before, I’d read The Watchmaker of Filigree Street, by Natasha Pulley. The novel features an expatriate from Meiji Japan living in London, named Mori, who builds clockwork devices. The most endearing is a pet-like octopus, called Katsu, who scrambles around Mori’s workshop and hoards socks.
Does the world need a quantum version of Katsu? Not outside of quantum-steampunk fiction…yet. But a girl can dream. And quantum automata now have the opportunity to put quantum thermodynamics to work.
1And deliver pizzas. While visiting the University of Pittsburgh a few years ago, I was surprised to learn that the robots scurrying down the streets were serving hungry students.
2And minions of starving young scholars.
Last week I visited Harvard and MIT, and as advertised in my last post, gave the Yip Lecture at Harvard on the subject “How Much Math Is Knowable?” The visit was hosted by Harvard’s wonderful Center of Mathematical Sciences and Applications (CMSA), directed by my former UT Austin colleague Dan Freed. Thanks so much to everyone at CMSA for the visit.
And good news! You can now watch my lecture on YouTube here:
I’m told it was one of my better performances. As always, I strongly recommend watching at 2x speed.
I opened the lecture by saying that, while obviously it would always be an honor to give the Yip Lecture at Harvard, it’s especially an honor right now, as the rest of American academia looks to Harvard to defend the value of our entire enterprise. I urged Harvard to “fight fiercely,” in the words of the Tom Lehrer song.
I wasn’t just fishing for applause; I meant it. It’s crucial for people to understand that, in its total war against universities, MAGA has now lost, not merely the anti-Israel leftists, but also most conservatives, classical liberals, Zionists, etc. with any intellectual scruples whatsoever. To my mind, this opens up the possibility for a broad, nonpartisan response, highlighting everything universities (yes, even Harvard) do for our civilization that’s worth defending.
For three days in my old hometown of Cambridge, MA, I met back-to-back with friends and colleagues old and new. Almost to a person, they were terrified about whether they’ll be able to keep doing science as their funding gets decimated, but especially terrified for anyone who they cared about on visas and green cards. International scholars can now be handcuffed, deported, and even placed in indefinite confinement for pretty much any reason—including long-ago speeding tickets—or no reason at all. The resulting fear has paralyzed, in a matter of months, an American scientific juggernaut that took a century to build.
A few of my colleagues personally knew Rümeysa Öztürk, the Turkish student at Tufts who currently sits in prison for coauthoring an editorial for her student newspaper advocating the boycott of Israel. I of course disagree with what Öztürk wrote … and that is completely irrelevant to my moral demand that she go free. Even supposing the government had much more on her than this one editorial, still the proper response would seem to be a deportation notice—“either contest our evidence in court, or else get on the next flight back to Turkey”—rather than grabbing Öztürk off the street and sending her to indefinite detention in Louisiana. It’s impossible to imagine any university worth attending where the students live in constant fear of imprisonment for the civil expression of opinions.
To help calibrate where things stand right now, here’s the individual you might expect to be most on board with a crackdown on antisemitism at Harvard:
Jason Rubenstein, the executive director of Harvard Hillel, said that the school is in the midst of a long — and long-overdue — reckoning with antisemitism, and that [President] Garber has taken important steps to address the problem. Methodical federal civil rights oversight could play a constructive role in that reform, he said. “But the government’s current, fast-paced assault against Harvard – shuttering apolitical, life-saving research; targeting the university’s tax-exempt status; and threatening all student visas … is neither deliberate nor methodical, and its disregard for the necessities of negotiation and due process threatens the bulwarks of institutional independence and the rule of law that undergird our shared freedoms.”
Meanwhile, as the storm clouds over American academia continue to darken, I’ll just continue to write what I think about everything, because what else can I do?
Last night, alas, I lost yet another left-wing academic friend, the fourth or fifth I’ve lost since October 7. For while I was ready to take a ferocious public stand against the current US government, for the survival and independence of our universities, and for free speech and due process for foreign students, this friend regarded all that as insufficient. He demanded that I also clear the tentifada movement of any charge of antisemitism. For, as he patiently explained to me (while worrying that I wouldn’t grasp the point), while the protesters may have technically violated university rules, disrupted education, created a hostile environment in the sense of Title VI antidiscrimination law in ways that would be obvious were we discussing any other targeted minority, etc. etc., still, the only thing that matters morally is that the protesters represent “the powerless,” whereas Zionist Jews like me represent “the powerful.” So, I told this former friend to go fuck himself. Too harsh? Maybe if he hadn’t been Jewish himself, I could’ve forgiven him for letting the world’s oldest conspiracy theory colonize his brain.
For me, the deep significance of in-person visits, including my recent trip to Harvard, is that they reassure me of the preponderance of sanity within my little world—and thereby of my own sanity. Online, every single day I feel isolated and embattled: pressed in on one side by MAGA forces who claim to care about antisemitism, but then turn out to want the destruction of science, universities, free speech, international exchange, due process of law, and everything else that’s made the modern world less than fully horrible; and on the other side, by leftists who say they stand with me for science and academic freedom and civil rights and everything else that’s good, but then add that the struggle needs to continue until the downfall of the scheming, moneyed Zionists and the liberation of Palestine from river to sea.
When I travel to universities to give talks, though, I meet one sane, reasonable human being after another. Almost to a person, they acknowledge the reality of antisemitism, ideological monoculture, bureaucracy, spiraling costs, and many other problems at universities—and they care about universities enough to want to fix those problems, rather than gleefully nuking the universities from orbit as MAGA is doing. Mostly, though, people just want me to sign Quantum Computing Since Democritus, or tell me how much they like this blog, or ask questions about quantum algorithms or the Busy Beaver function. Which is fine too, and which you can do in the comments.
There is a lot going on. Today, some words about NSF.
Yesterday Sethuraman Panchanathan, the director of the National Science Foundation, resigned 16 months before the end of his six-year term. The relevant Science article raises the possibility that this is because, as an executive branch appointee, he would effectively have to endorse the upcoming presidential budget request, which is rumored to be a 55% cut to the agency budget (from around $9B/yr to $4B/yr) and a 50% reduction in agency staffing. (Note: actual appropriations are set by Congress, which has ignored presidential budget requests in the past.) This comes at the end of a week when all new awards were halted at the agency while non-agency personnel conducted "a second review" of all grants, and many active grants have been terminated. Bear in mind, awards this year from NSF are already down 50% relative to last year, even without official budget cuts. Update: Here is Nature's reporting from earlier today.
The NSF has been absolutely critical to a long list of scientific and technological advances over the last 70 years (see here while it's still up). As mentioned previously, government support of basic research has a great return on investment for the national economy, and it's a tiny fraction of government spending. Less than three years ago, the CHIPS & Science Act was passed with supposed bipartisan support in Congress, authorizing the doubling of the NSF budget. Last summer I posted in frustration that this support seemed to be an illusion when it came to actual funding.
People can have disagreements about the "right" level of government support for science in times of fiscal challenges, but as far as I can tell, no one (including and especially Congress so far) voted for the dismantling of the NSF. If you think the present trajectory is wrong, contact your legislators and make your voices heard.
Quantum computing finds itself in a peculiar situation. On the technological side, after billions of dollars and decades of research, working quantum computers are nearing fruition. But still, the number one question asked about quantum computers is the same as it was two decades ago: What are they good for? The honest answer reveals an elephant in the room: We don’t fully know yet. For theorists like me, this is an opportunity, a call to action.
Suppose we do not have quantum computers in a few decades' time. What will be the reason? It's unlikely that we'll encounter some insurmountable engineering obstacle. The theoretical basis of quantum error correction is solid, and several platforms are approaching or below the error-correction threshold (Harvard, Yale, Google). Experimentalists believe today's technology can scale to roughly 100 logical qubits running on the order of a million logical gates—the megaquop era. If mankind spends $100 billion over the next few decades, it's likely we could build a quantum computer.
A more concerning reason that quantum computing might fail is that there may not be enough incentive to justify such a large investment in R&D and infrastructure. Consider the comparison to nuclear fusion. Like quantum hardware, fusion poses challenging science and engineering problems. However, if a nuclear fusion lab were to succeed in its mission of building a working reactor, the application would be self-evident. This is not the case for quantum computing—it is a sledgehammer looking for nails to hit.
Nevertheless, industry investment in quantum computing is currently accelerating. To maintain the momentum, it is critical to match investment growth and hardware progress with algorithmic capabilities. The time to discover quantum algorithms is now.
Theory research is forward-looking and predictive. Theorists such as Geoffrey Hinton laid the foundations of the current AI revolution. But decades later, with an abundance of computing hardware, AI has become much more of an empirical field. I look forward to the day that quantum hardware reaches a state of abundance, but that day is not yet here.
Today, quantum computing is an area where theorists have extraordinary leverage. A few pages of mathematics by Peter Shor inspired thousands of researchers, engineers and investors to join the field. Perhaps another few pages by someone reading this blog will establish a future of world-altering impact for the industry. There are not many places where mathematics has such potential for influence. An entire community of experimentalists, engineers, and businesses are looking to the theorists for ideas.
Traditionally, it is thought that the ideal quantum algorithm would exhibit three features. First, it should be provably correct, giving a guarantee that executing the quantum circuit reliably will achieve the intended outcome. Second, the underlying problem should be classically hard—the output of the quantum algorithm should be computationally hard to replicate with a classical algorithm. Third, it should be useful, with the potential to solve a problem of interest in the real world. Shor’s algorithm comes close to meeting all of these criteria. However, demanding all three in an absolute fashion may be unnecessary and perhaps even counterproductive to progress.
Provable correctness is important, since today we cannot yet empirically test quantum algorithms on hardware at scale. But what degree of evidence should we require for classical hardness? Rigorous proof of classical hardness is currently unattainable without resolving major open problems like P vs NP, but there are softer forms of evidence, such as reductions to well-studied classical hardness assumptions.
I argue that we should replace the ideal of provable hardness with a more pragmatic approach: The quantum algorithm should outperform the best known classical algorithm that produces the same output by a super-quadratic speedup.¹ Emphasizing provable classical hardness might inadvertently impede the discovery of new quantum algorithms, since a truly novel quantum algorithm could potentially introduce a new classical hardness assumption that differs fundamentally from established ones. The back-and-forth process of proposing and breaking new assumptions is a productive direction that helps us triangulate where quantum advantage lies.
It may also be unproductive to aim directly at solving existing real-world problems with quantum algorithms. Fundamental computational tasks with quantum advantage are special and we have very few examples, yet they necessarily provide the basis for any eventual quantum application. We should search for more of these fundamental tasks and match them to applications later.
That said, it is important to distinguish between quantum algorithms that could one day provide the basis for a practically relevant computation, and those that will not. In the real world, computations are not useful unless they are verifiable or at least repeatable. For instance, consider a quantum simulation algorithm that computes a physical observable. If two different quantum computers run the simulation and get the same answer, one can be confident that this answer is correct and that it makes a robust prediction about the world. Some problems such as factoring are naturally easy to verify classically, but we can set the bar even lower: The output of a useful quantum algorithm should at least be repeatable by another quantum computer.
There is a subtle fourth requirement of paramount importance that is often overlooked, captured by the following litmus test: If given a quantum computer tomorrow, could you implement your quantum algorithm? In order to do so, you need not only a quantum algorithm but also a distribution over its inputs on which to run it. Classical hardness must then be judged in the average case over this distribution of inputs, rather than in the worst case.
I’ll end this section with a specific caution regarding quantum algorithms whose output is the expectation value of an observable. A common reason these proposals fail to be classically hard is that the expectation value exponentially concentrates over the distribution of inputs. When this happens, a trivial classical algorithm can replicate the quantum result by simply outputting the concentrated (typical) value for every input. To avoid this, we must seek ensembles of quantum circuits whose expectation values exhibit meaningful variation and sensitivity to different inputs.
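To see the failure mode concretely, here is a small numpy sketch of my own (an illustration, not taken from any particular proposal): it samples Haar-random states and shows that the expectation value of a fixed single-qubit observable clusters ever more tightly around its typical value as the system grows, which is precisely the regime where the trivial classical strategy of always outputting the typical value succeeds.

```python
# Illustrative sketch: how an expectation value can "concentrate".
# For Haar-random n-qubit states, <Z on the first qubit> clusters ever more
# tightly around 0 as n grows, so a classical "algorithm" that always answers 0
# reproduces the quantum output for almost every input state.
import numpy as np

rng = np.random.default_rng(0)

def haar_random_state(dim):
    """Sample a Haar-random pure state of the given dimension."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def z1_expectation(state, n_qubits):
    """Expectation of Pauli Z on the first qubit (big-endian bit convention)."""
    probs = np.abs(state) ** 2
    signs = np.array([1 if (idx >> (n_qubits - 1)) == 0 else -1
                      for idx in range(len(state))])
    return float(np.dot(signs, probs))

for n in [2, 4, 6, 8, 10]:
    vals = [z1_expectation(haar_random_state(2 ** n), n) for _ in range(200)]
    print(f"n={n:2d}   spread (std) of <Z_1> over random states: {np.std(vals):.4f}")
# The spread shrinks roughly like 2^(-n/2): the observable concentrates.
```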
We can crystallize these priorities into the following challenge:
The Challenge
Find a quantum algorithm and a distribution over its inputs with the following features:
— (Provable correctness.) The quantum algorithm is provably correct.
— (Classical hardness.) The quantum algorithm outperforms the best known classical algorithm that performs the same task by a super-quadratic speedup, in the average case over the distribution of inputs.
— (Potential utility.) The output is verifiable, or at least repeatable.
Table I:

| Category | Classically verifiable | Quantumly repeatable | Potentially useful | Provable classical hardness | Examples |
|---|---|---|---|---|---|
| Search problem | Yes | Yes | Yes | No | Shor ’99; Regev’s reduction: CLZ22, YZ24, Jor+24; planted inference: Has20, SOKB24 |
| Compute a value | No | Yes | Yes | No | Condensed matter physics? Quantum chemistry? |
| Proof of quantumness | Yes, with key | Yes, with respect to key | No | Yes, under crypto assumptions | BCMVV21 |
| Sampling | No | No | No | Almost, under complexity assumptions | BJS10, AA11, Google ’20 |
Hamiltonian simulation is perhaps the most widely heralded source of quantum utility. Physics and chemistry contain many quantities that Nature computes effortlessly, yet remain beyond the reach of even our best classical simulations. Quantum computation is capable of simulating Nature directly, giving us strong reason to believe that quantum algorithms can compute classically-hard quantities.
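As a concrete (if tiny) instance of what "Hamiltonian simulation" asks for, here is a hedged numpy/scipy sketch of my own: first-order Trotterized time evolution of a three-spin transverse-field Ising chain, checked against exact evolution. The chain length, field strength, and step count are arbitrary choices; on a quantum computer each Trotter factor would be a short circuit, whereas the classical cost of the matrices below grows exponentially with the number of spins.

```python
# Toy sketch of Hamiltonian simulation: first-order Trotterization of a
# transverse-field Ising chain, compared against exact time evolution.
import numpy as np
from scipy.linalg import expm

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron_all(ops):
    """Tensor product of a list of single-site operators."""
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out

n = 3                      # number of spins (kept tiny so the matrices stay small)
g = 1.2                    # transverse-field strength (arbitrary choice)
H_zz = sum(kron_all([Z if k in (i, i + 1) else I2 for k in range(n)]) for i in range(n - 1))
H_x = sum(kron_all([X if k == i else I2 for k in range(n)]) for i in range(n))
H = -H_zz - g * H_x        # H = -sum_i Z_i Z_{i+1} - g * sum_i X_i (open chain)

t, steps = 1.0, 50
exact = expm(-1j * t * H)
step = expm(-1j * (t / steps) * (-H_zz)) @ expm(-1j * (t / steps) * (-g * H_x))
trotter = np.linalg.matrix_power(step, steps)

psi0 = np.zeros(2 ** n, dtype=complex)
psi0[0] = 1.0              # start in |000>
err = np.linalg.norm(exact @ psi0 - trotter @ psi0)
print(f"First-order Trotter error after {steps} steps: {err:.2e}")
```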
There are already many examples where a quantum computer could help us answer an unsolved scientific question, like determining the phase diagram of the Hubbard model or the ground energy of FeMoCo. These undoubtedly have scientific value. However, they are isolated examples, whereas we would like evidence that the pool of quantum-solvable questions is inexhaustible. Can we take inspiration from strongly correlated physics to write down a concrete ensemble of Hamiltonian simulation instances where there is a classically-hard observable? This would gather evidence for the sustained, broad utility of quantum simulation, and would also help us understand where and how quantum advantage arises.
Over in the computer science community, there has been a lot of work on oracle separations such as welded trees and forrelation, which should give us confidence in the abilities of quantum computers. Can we instantiate these oracles in a way that pragmatically remains classically hard? This is necessary in order to pass our earlier litmus test of being ready to run the quantum algorithm tomorrow.
In addition to Hamiltonian simulation, there are several other broad classes of quantum algorithms, including quantum algorithms for linear systems of equations and differential equations, variational quantum algorithms for machine learning, and quantum algorithms for optimization. These frameworks sometimes come with proofs of BQP-completeness.
The issue with these broad frameworks is that they often do not specify a distribution over inputs. Can we find novel ensembles of inputs to these frameworks that exhibit super-quadratic speedups? BQP-completeness shows that one has translated the notion of quantum computation into a different language, which allows one to embed an existing quantum algorithm such as Shor's algorithm into the framework. But in order to discover a new quantum algorithm, one must find an ensemble of BQP computations that does not arise from Shor's algorithm.
Table I claims that sampling tasks alone are not useful since they are not even quantumly repeatable. One may wonder if sampling tasks could be useful in some way. After all, classical Monte Carlo sampling algorithms are widely used in practice. However, applications of sampling typically use samples to extract meaningful information or specific features of the underlying distribution. For example, Monte Carlo sampling can be used to evaluate integrals in Bayesian inference and statistical physics. In contrast, samples obtained from random quantum circuits lack any discernible features. If a collection of quantum algorithms generated samples containing meaningful signals from which one could extract classically hard-to-compute values, those algorithms would effectively transition into the compute a value category.
Table I also claims that proofs of quantumness are not useful. This is not completely true—one potential application is generating certifiable randomness. However, such applications are generally cryptographic rather than computational in nature. Specifically, proofs of quantumness cannot help us solve problems or answer questions whose solutions we do not already know.
Finally, there are several exciting directions proposing applications of quantum technologies in sensing and metrology, communication, learning with quantum memory, and streaming. These are very interesting, and I hope that mankind’s second century of quantum mechanics brings forth all flavors of capabilities. However, the technological momentum is mostly focused on building quantum computers for the purpose of computational advantage, and so this is where breakthroughs will have the greatest immediate impact.
At the annual QIP conference, only a handful of papers out of hundreds each year attempt to advance new quantum algorithms. Given the stakes, why is this number so low? One common explanation is that quantum algorithm research is simply too difficult. Nevertheless, we have seen substantial progress in quantum algorithms in recent years. After a dearth of end-to-end proposals with the potential for utility between 2000 and 2020, Table I exhibits several breakthroughs from the past five years.
In between blind optimism and resigned pessimism, embracing a mission-driven mindset can propel our field forward. We should allow ourselves to adopt a more exploratory, scrappier approach: We can hunt for quantum advantages in yet-unstudied problems or subtle signals in the third decimal place. The bar for meaningful progress is lower than it might seem, and even incremental advances are valuable. Don’t be too afraid!
A basic type of problem that occurs throughout mathematics is the lifting problem: given some space $X$ that “sits above” some other “base” space $Y$ due to a projection map $\pi: X \to Y$, and some map $f: A \to Y$ from a third space $A$ into the base space $Y$, find a “lift” $\tilde f$ of $f$ to $X$, that is to say a map $\tilde f: A \to X$ such that $\pi \circ \tilde f = f$. In many applications we would like $\tilde f$ to preserve many of the properties of $f$ (e.g., continuity, differentiability, linearity, etc.).
Of course, if the projection map $\pi$ is not surjective, one would not expect the lifting problem to be solvable in general, as the map $f$ to be lifted could simply take values outside of the range of $\pi$. So it is natural to impose the requirement that $\pi$ be surjective, giving the following commutative diagram to complete:
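Since the diagram itself did not survive here, the following is a minimal LaTeX sketch of the square being described, with the same names as above; the diagonal arrow $\tilde f$ is the map one wants to fill in:

$$
\begin{array}{ccc}
 & & X \\
 & \overset{\tilde f}{\nearrow} & \big\downarrow \pi \\
A & \underset{f}{\longrightarrow} & Y
\end{array}
\qquad\qquad \pi \circ \tilde f = f.
$$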
If no further requirements are placed on the lift $\tilde f$, then the axiom of choice is precisely the assertion that the lifting problem is always solvable (once we require $\pi$ to be surjective). Indeed, the axiom of choice lets us select a preimage $x_y \in \pi^{-1}(\{y\})$ in the fiber of each point $y \in Y$, and one can lift any $f: A \to Y$ by setting $\tilde f(a) := x_{f(a)}$. Conversely, to build a choice function for a surjective map $\pi: X \to Y$, it suffices to lift the identity map $\mathrm{id}: Y \to Y$ up to $X$.
Of course, the maps provided by the axiom of choice are famously pathological, being almost certain to be discontinuous, non-measurable, etc. So now suppose that all spaces involved are topological spaces, and all maps involved are required to be continuous. Then the lifting problem is not always solvable. For instance, we have a continuous projection from $\mathbb{R}$ to $\mathbb{R}/\mathbb{Z}$, but the identity map $\mathrm{id}: \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ cannot be lifted continuously up to $\mathbb{R}$, because $\mathbb{R}$ is contractible and $\mathbb{R}/\mathbb{Z}$ is not.
However, if $A$ is a discrete space (every set is open), then the axiom of choice lets us solve the continuous lifting problem from $A$ for any continuous surjection $\pi: X \to Y$, simply because every map from $A$ to $X$ is continuous. Conversely, the discrete spaces are the only ones with this property: if $A$ is a topological space which is not discrete, and one lets $A_{\mathrm{disc}}$ be the same space equipped with the discrete topology, then the only way one can continuously lift the identity map $\mathrm{id}: A \to A$ through the “projection map” $A_{\mathrm{disc}} \to A$ (that maps each point to itself) is if $A$ is itself discrete.
These discrete spaces are the projective objects in the category of topological spaces, since in this category the concept of an epimorphism agrees with that of a surjective continuous map. Thus $A_{\mathrm{disc}}$ can be viewed as the unique (up to isomorphism) projective object in this category that has a bijective continuous map to $A$.
Now let us narrow the category of topological spaces to the category of compact Hausdorff (CH) spaces. Here things should be better behaved; for instance, it is a standard fact in this category that continuous bijections are homeomorphisms, and it is still the case that the epimorphisms are the continuous surjections. So we have a usable notion of a projective object in this category: CH spaces $X$ such that any continuous map $f: X \to Z$ into another CH space $Z$ can be lifted through any continuous surjection $\pi: Y \to Z$ from another CH space $Y$.
By the previous discussion, discrete CH spaces will be projective, but this is an extremely restrictive set of examples, since of course compact discrete spaces must be finite. Are there any others? The answer was worked out by Gleason:
Proposition 1. A compact Hausdorff space $X$ is projective if and only if it is extremally disconnected, i.e., the closure of every open set is again open.
Proof: We begin with the “only if” direction. Suppose that $X$ is projective, and let $U$ be an open subset of $X$. Then the closure $\overline{U}$ and the complement $X \backslash U$ are both closed, hence compact, subsets of $X$, so the disjoint union $\overline{U} \uplus (X \backslash U)$ is another CH space, which has an obvious surjective continuous projection map $\pi$ to $X$ formed by gluing the two inclusion maps together. As $X$ is projective, the identity map $\mathrm{id}: X \to X$ must then lift to a continuous map $\tilde f: X \to \overline{U} \uplus (X \backslash U)$. One easily checks that $\tilde f$ has to map $U$ to the first component $\overline{U}$ of the disjoint union, and $X \backslash \overline{U}$ to the second component; hence $\tilde f^{-1}(\overline{U}) = \overline{U}$, and so $\overline{U}$ is open, giving extremal disconnectedness.
Conversely, suppose that $X$ is extremally disconnected, that $\pi: Y \to Z$ is a continuous surjection of CH spaces, and $f: X \to Z$ is continuous. We wish to lift $f$ to a continuous map $\tilde f: X \to Y$.

We first observe that it suffices to solve the lifting problem for the identity map $\mathrm{id}: X \to X$, that is to say we can assume without loss of generality that $Z = X$ and $f$ is the identity. Indeed, for general maps $f: X \to Z$, one can introduce the pullback space

$$ Y \times_Z X := \{ (y, x) \in Y \times X : \pi(y) = f(x) \}, $$

which is a closed subspace of the CH space $Y \times X$ and hence itself a CH space; the coordinate projection $Y \times_Z X \to X$ is a continuous surjection, and composing a continuous lift of the identity map through this projection with the other coordinate projection $Y \times_Z X \to Y$ gives the desired lift of $f$.
So now we are trying to lift the identity map $\mathrm{id}: X \to X$ via a continuous surjection $\pi: Y \to X$. Let us call this surjection $\pi$ minimally surjective if no restriction $\pi|_{Y'}$ of $\pi$ to a proper closed subset $Y'$ of $Y$ remains surjective. An easy application of Zorn’s lemma shows that every continuous surjection $\pi: Y \to X$ can be restricted to a minimally surjective continuous map $\pi|_{Y'}: Y' \to X$. Thus, without loss of generality, we may assume that $\pi$ is minimally surjective.
The key claim now is that every minimally surjective map $\pi: Y \to X$ into an extremally disconnected space is in fact a bijection. Indeed, suppose for contradiction that there were two distinct points $y_1, y_2$ in $Y$ that mapped to the same point $x$ under $\pi$. By taking contrapositives of the minimal surjectivity property, we see that every open neighborhood of $y_1$ must contain at least one fiber $\pi^{-1}(\{x'\})$ of $\pi$, and by shrinking this neighborhood one can ensure that the base point $x'$ is arbitrarily close to $x$. Thus, every open neighborhood of $y_1$ must intersect every open neighborhood of $y_2$, contradicting the Hausdorff property.
It is well known that continuous bijections between CH spaces must be homeomorphisms (they map compact sets to compact sets, hence closed sets to closed sets, and so must be open maps). So $\pi$ is a homeomorphism, and one can lift the identity map to the inverse map $\pi^{-1}: X \to Y$. $\Box$
Remark 2 The property of being “minimally surjective” sounds like it should have a purely category-theoretic definition, but I was unable to match this concept to a standard term in category theory (something along the lines of a “minimal epimorphism”, I would imagine).
In view of this proposition, it is now natural to look for extremally disconnected CH spaces (also known as Stonean spaces). The discrete CH spaces are one class of such spaces, but they are all finite. Unfortunately, these are the only “small” examples:
Lemma 3. Any first countable extremally disconnected CH space $X$ is discrete.

Proof: If such a space $X$ were not discrete, one could find a sequence $x_n$ in $X$ converging to a limit $x$ such that $x_n \neq x$ for all $n$. One can sparsify the elements $x_n$ to all be distinct, and from the Hausdorff property one can construct neighbourhoods $U_n$ of each $x_n$ that avoid $x$ and are disjoint from each other. Then $U_1 \cup U_3 \cup U_5 \cup \dots$ and $U_2 \cup U_4 \cup U_6 \cup \dots$ are disjoint open sets that both have $x$ as an adherent point, which is inconsistent with extremal disconnectedness: the closure of $U_1 \cup U_3 \cup U_5 \cup \dots$ contains $x$ but is disjoint from $U_2 \cup U_4 \cup U_6 \cup \dots$, so it cannot be open. $\Box$
Thus for instance there are no extremally disconnected compact metric spaces, other than the finite spaces; for instance, the Cantor space is not extremally disconnected, even though it is totally disconnected (which one can easily see to be a property implied by extremal disconnectedness). On the other hand, once we leave the first-countable world, we have plenty of such spaces:
Lemma 4. Let $B$ be a complete Boolean algebra. Then the Stone dual $\mathrm{Hom}(B, \{0,1\})$ of $B$ (i.e., the space of Boolean homomorphisms $\phi: B \to \{0,1\}$) is an extremally disconnected CH space.

Proof: The CH properties are standard. The elements $b$ of $B$ give a basis of the topology, given by the clopen sets $C_b := \{ \phi : \phi(b) = 1 \}$. Because the Boolean algebra is complete, we see that the closure of the open set $\bigcup_{b \in A} C_b$ for any family $A$ of elements of $B$ is simply the clopen set $C_{\bigvee A}$, which is obviously open, giving extremal disconnectedness. $\Box$
Remark 5. In fact, every extremally disconnected CH space $X$ is homeomorphic to the Stone dual of a complete Boolean algebra (and specifically, the clopen algebra of $X$); see Gleason’s paper.
Corollary 6. Every CH space $X$ is the surjective continuous image of an extremally disconnected CH space.

Proof: Take the Stone–Čech compactification $\beta X_{\mathrm{disc}}$ of $X$ equipped with the discrete topology, or equivalently the Stone dual of the power set $2^X$ (i.e., the space of ultrafilters on $X$). By the previous lemma, this is an extremally disconnected CH space. Because every ultrafilter on a CH space has a unique limit, we have a canonical map from $\beta X_{\mathrm{disc}}$ to $X$, which one can easily check to be continuous and surjective. $\Box$
Remark 7. In fact, to each CH space $X$ one can associate an extremally disconnected CH space $\tilde X$ with a minimally surjective continuous map $\pi: \tilde X \to X$. The construction is the same, but instead of working with the entire power set $2^X$, one works with the smaller (but still complete) Boolean algebra of domains – closed subsets of $X$ which are the closure of their interior, ordered by inclusion. This $\tilde X$ is unique up to homeomorphism, and is thus a canonical choice of extremally disconnected space to project onto $X$. See the paper of Gleason for details.
Several facts in analysis concerning CH spaces can be made easier to prove by utilizing Corollary 6 and working first in extremally disconnected spaces, where some things become simpler. My vague understanding is that this is highly compatible with the modern perspective of condensed mathematics, although I am not an expert in this area. Here, I will just give a classic example of this philosophy, due to Garling and presented in this paper of Hartig:
Theorem 8 (Riesz representation theorem). Let $X$ be a CH space, and let $\lambda: C(X) \to \mathbb{R}$ be a bounded linear functional. Then there is a (unique) Radon measure $\mu$ on $X$ (defined on the Baire $\sigma$-algebra, generated by $C(X)$) such that $\lambda(f) = \int_X f\, d\mu$ for all $f \in C(X)$.
Uniqueness of the measure is relatively straightforward; the difficult task is existence, and most known proofs are somewhat complicated. But one can observe that the theorem “pushes forward” under surjective maps:
Proposition 9. Suppose that $\pi: X \to Y$ is a continuous surjection between CH spaces. If the Riesz representation theorem is true for $X$, then it is also true for $Y$.

Proof: As $\pi$ is surjective, the pullback map $\pi^*: C(Y) \to C(X)$ is an isometry, hence every bounded linear functional on $C(Y)$ can be viewed as a bounded linear functional on a subspace of $C(X)$, and hence by the Hahn–Banach theorem it extends to a bounded linear functional on $C(X)$. By the Riesz representation theorem on $X$, this latter functional can be represented as an integral against a Radon measure $\mu$ on $X$. One can then check that the pushforward measure $\pi_* \mu$ is then a Radon measure on $Y$, and gives the desired representation of the bounded linear functional on $C(Y)$. $\Box$
In view of this proposition and Corollary 6, it suffices to prove the Riesz representation theorem for extremally disconnected CH spaces. But this is easy:
Proposition 10. The Riesz representation theorem is true for extremally disconnected CH spaces.

Proof: The Baire $\sigma$-algebra is generated by the Boolean algebra of clopen sets. A functional $\lambda: C(X) \to \mathbb{R}$ induces a finitely additive measure $\mu$ on this algebra by the formula $\mu(E) := \lambda(1_E)$. This is in fact a premeasure, because by compactness the only way to partition a clopen set into countably many clopen sets is to have only finitely many of the latter sets non-empty. By the Carathéodory extension theorem, $\mu$ then extends to a Baire measure, which one can check to be a Radon measure that represents $\lambda$ (the finite linear combinations of indicators of clopen sets are dense in $C(X)$). $\Box$
You’ve heard of antimatter, right?
For each type of particle, there is a rare kind of evil twin with the opposite charge, called an anti-particle. When an anti-proton meets a proton, they annihilate each other in a giant blast of energy.
I see a lot of questions online about antimatter. One recurring theme is people asking a very general question: how does antimatter work?
If you’ve just heard the pop physics explanation, antimatter probably sounds like magic. What about antimatter lets it destroy normal matter? Does it need to touch? How long does it take? And what about neutral particles like neutrons?
You find surprisingly few good explanations of this online, but I can explain why. Physicists like me don’t expect antimatter to be confusing in this way, because to us, antimatter isn’t doing anything all that special. When a particle and an antiparticle annihilate, they’re doing the same thing that any other pair of particles do when they do…basically anything else.
Instead of matter and antimatter, let’s talk about one of the oldest pieces of evidence for quantum mechanics, the photoelectric effect. Scientists shone light at a metal, and found that if the wavelength of the light was short enough, electrons would spring free, causing an electric current. If the wavelength was too long, the metal wouldn’t emit any electrons, no matter how much light they shone. Einstein won his Nobel prize for the explanation: the light hitting the metal comes in particle-sized pieces, called photons, whose energy is determined by the wavelength of the light. If the individual photons don’t have enough energy to get an electron to leave the metal, then no electron will move, no matter how many photons you use.
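As a quick worked example of that energy bookkeeping (the work-function value below is just a round illustrative number, roughly right for sodium):

$$
E_\gamma = \frac{hc}{\lambda} \approx \frac{1240\ \mathrm{eV \cdot nm}}{\lambda},
\qquad
\lambda_{\mathrm{max}} \approx \frac{1240\ \mathrm{eV \cdot nm}}{2.3\ \mathrm{eV}} \approx 540\ \mathrm{nm},
$$

so light with a wavelength longer than about 540 nm ejects no electrons from such a metal, no matter how bright it is.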
What happens to the photons after they hit the metal?
They go away. We say they are absorbed: an electron absorbs a photon and speeds up, increasing its kinetic energy so that it can escape.
But we could just as easily say the photon is annihilated, if we wanted to.
In the photoelectric effect, you start with one electron and one photon, they come together, and you end up with one electron and no photon. In proton-antiproton annihilation, you start with a proton and an antiproton, they come together, and you end up with no protons or antiprotons, but instead “energy”…which in practice, usually means two photons.
That’s all that happens, deep down at the root of things. The laws of physics are rules about inputs and outputs. Start with these particles, they come together, you end up with these other particles. Sometimes one of the particles stays the same. Sometimes particles seem to transform, and different kinds of particles show up. Sometimes some of the particles are photons, and you think of them as “just energy”, and easy to absorb. But particles are particles, and nothing is “just energy”. Each thing, absorption, decay, annihilation, each one is just another type of what we call interactions.
What makes annihilation of matter and antimatter seem unique comes down to charges. Interactions have to obey the laws of physics: they conserve energy, they conserve momentum, and they conserve charge.
So why can an antiproton and a proton annihilate to pure photons, while two protons can't? A proton and an antiproton have opposite charges, while a photon has zero charge. You could combine two protons to make something else, but it would have to have the same charge as two protons.
What about neutrons? A neutron has no electric charge, so you might think it wouldn’t need antimatter. But a neutron has another type of charge, called baryon number. In order to annihilate one, you’d need an anti-neutron, which would still have zero electric charge but would have the opposite baryon number. (By the way, physicists have been making anti-neutrons since 1956.)
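For readers who like to see the bookkeeping spelled out, here is a toy sketch of my own (not anything physicists actually use) that simply adds up the two kinds of charge discussed above and checks that they balance between the incoming and outgoing particles:

```python
# Toy bookkeeping: does a proposed reaction conserve electric charge and baryon number?
# The particle data below are the standard values.
PARTICLES = {
    "proton":      {"charge": +1, "baryon": +1},
    "antiproton":  {"charge": -1, "baryon": -1},
    "neutron":     {"charge":  0, "baryon": +1},
    "antineutron": {"charge":  0, "baryon": -1},
    "photon":      {"charge":  0, "baryon":  0},
}

def conserved(inputs, outputs):
    """True if total charge and total baryon number match between inputs and outputs."""
    def totals(names):
        return (sum(PARTICLES[n]["charge"] for n in names),
                sum(PARTICLES[n]["baryon"] for n in names))
    return totals(inputs) == totals(outputs)

print(conserved(["proton", "antiproton"], ["photon", "photon"]))    # True: the bookkeeping allows it
print(conserved(["proton", "proton"], ["photon", "photon"]))        # False: charge and baryon number both violated
print(conserved(["neutron", "antineutron"], ["photon", "photon"]))  # True: zero charge, zero baryon number
```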
On the other hand, photons really do have no charge of any kind, and the same is true of Higgs bosons. So one Higgs boson can become two photons, without annihilating with anything else. Each of these particles can be called its own antiparticle: a photon is also an antiphoton, a Higgs is also an anti-Higgs.

Because particle-antiparticle annihilation follows the same rules as other interactions between particles, it also takes place via the same forces. When a proton and an antiproton annihilate each other, they typically do this via the electromagnetic force. This is why you end up with light, which is an electromagnetic wave. Like everything in the quantum world, this annihilation isn't certain. It has a chance to happen, proportional to the strength of the interaction force involved.
What about neutrinos? They also appear to have a kind of charge, called lepton number. That might not really be a conserved charge, and neutrinos might be their own antiparticles, like photons. However, they are much less likely to be annihilated than protons and antiprotons, because they don’t have electric charge, and thus their interaction doesn’t depend on the electromagnetic force, but on the much weaker weak nuclear force. A weaker force means a less likely interaction.
Antimatter might seem like the stuff of science fiction. But it’s not really harder to understand than anything else in particle physics.
(I know, that’s a low bar!)
It’s just interactions. Particles go in, particles go out. If it follows the rules, it can happen, if it doesn’t, it can’t. Antimatter is no different.
Every week, I tell myself I won’t do yet another post about the asteroid striking American academia, and then every week events force my hand otherwise.
No one on earth—certainly no one who reads this blog—could call me blasé about the issue of antisemitism at US universities. I’ve blasted the takeover of entire departments and unrelated student clubs and campus common areas by the dogmatic belief that the State of Israel (and only Israel, among all nations on earth) should be eradicated, by the use of that belief as a litmus test for entry. Since October 7, I’ve dealt with comments and emails pretty much every day calling me a genocidal Judeofascist Zionist.
So I hope it means something when I say: today I salute Harvard for standing up to the Trump administration. And I’ll say so in person, when I visit Harvard’s math department later this week to give the Fifth Annual Yip Lecture, on “How Much Math Is Knowable?” The more depressing the news, I find, the more my thoughts turn to the same questions that bothered Euclid and Archimedes and Leibniz and Russell and Turing. Actually, what the hell, why don’t I share the abstract for this talk?
Theoretical computer science has over the years sought more and more refined answers to the question of which mathematical truths are knowable by finite beings like ourselves, bounded in time and space and subject to physical laws. I’ll tell a story that starts with Gödel’s Incompleteness Theorem and Turing’s discovery of uncomputability. I’ll then introduce the spectacular Busy Beaver function, which grows faster than any computable function. Work by me and Yedidia, along with recent improvements by O’Rear and Riebel, has shown that the value of BB(745) is independent of the axioms of set theory; on the other end, an international collaboration proved last year that BB(5) = 47,176,870. I’ll speculate on whether BB(6) will ever be known, by us or our AI successors. I’ll next discuss the P≠NP conjecture and what it does and doesn’t mean for the limits of machine intelligence. As my own specialty is quantum computing, I’ll summarize what we know about how scalable quantum computers, assuming we get them, will expand the boundary of what’s mathematically knowable. I’ll end by talking about hypothetical models even beyond quantum computers, which might expand the boundary of knowability still further, if one is able (for example) to jump into a black hole, create a closed timelike curve, or project oneself onto the holographic boundary of the universe.
Now back to the depressing news. What makes me take Harvard’s side is the experience of Columbia. Columbia had already been moving in the right direction on fighting antisemitism, and on enforcing its rules against disruption, before the government even got involved. Then, once the government did take away funding and present its ultimatum—completely outside the process specified in Title VI law—Columbia’s administration quickly agreed to everything asked, to howls of outrage from the left-leaning faculty. Yet despite its total capitulation, the government has continued to hold Columbia’s medical research and other science funding hostage, while inventing a never-ending list of additional demands, whose apparent endpoint is that Columbia submit to state ideological control like a university in Russia or Iran.
By taking this scorched-earth route, the government has effectively telegraphed to all the other universities, as clearly as possible: “actually, we don’t care what you do or don’t do on antisemitism. We just want to destroy you, and antisemitism was our best available pretext, the place where you’d most obviously fallen short of your ideals. But we’re not really trying to cure a sick patient, or force the patient to adopt better health habits: we’re trying to shoot, disembowel, and dismember the patient. That being the case, you might as well fight us and go down with dignity!”
No wonder that my distinguished Harvard friends (and past Shtetl-Optimized guest bloggers) Steven Pinker and Boaz Barak—not exactly known as anti-Zionist woke radicals—have come out in favor of Harvard fighting this in court. So has Harvard’s past president Larry Summers, who’s welcome to guest-blog here as well. They all understand that events have given us no choice but to fight Trump as if there were no antisemitism, even while we continue to fight antisemitism as if there were no Trump.
Update (April 16): Commenter Greg argues that, in the title of this post, I probably ought to revise “Harvard’s biggest crisis since 1636” to “its biggest crisis since 1640.” Why 1640? Because that’s when the new college was shut down, over allegations that its head teacher was beating the students and that the head teacher’s wife (who was also the cook) was serving the students food adulterated with dung. By 1642, Harvard was back on track and had graduated its first class.
guest post by Bruce Bartlett
Stellenbosch University is hiring!
The Mathematics Division at Stellenbosch University in South Africa is looking to make a new permanent appointment at Lecturer / Senior Lecturer level (other levels may also be considered under the appropriate circumstances).
Preference will be given to candidates working in number theory or a related area, but those working in other areas of mathematics will definitely also be considered.
The closing date for applications is 30 April 2025. For more details, kindly see the official advertisement.
Consider a wonderful career in the winelands area of South Africa!
Quanta magazine has just published a feature on Tai-Danae Bradley and her work, entitled
Where Does Meaning Live in a Sentence? Math Might Tell Us.
The mathematician Tai-Danae Bradley is using category theory to try to understand both human and AI-generated language.
It’s a nicely set up Q&A, with questions like “What’s something category theory lets you see that you can’t otherwise?” and “How do you use category theory to understand language?”
Particularly interesting for me is the part towards the end where Bradley describes her work with Juan Pablo Vigneaux on magnitude of enriched categories of texts.
We’ll get back to measurement, interference and the double-slit experiment just as soon as I can get my math program to produce pictures of the relevant wave functions reliably. I owe you some further discussion of why measurement (and even interactions without measurement) can partially or completely eliminate quantum interference.
But in the meantime, I’ve gotten some questions and some criticism for arguing that superposition is an OR, not an AND. It is time to look closely at this choice, and understand both its strengths and its limitations, and how we have to move beyond it to fully appreciate quantum physics. [I probably should have written this article earlier — and I suspect I’ll need to write it again someday, as it’s a tricky subject.]
Just to remind you of my definitions (we’ll see examples in a moment): objects that interact with one another form a system, and a system is at any time in a certain quantum state, consisting of one or more possibilities combined in some way and described by what is often called a “wave function”. If the number of possibilities described by the wave function is more than one, then physicists say that the state of the quantum system is a superposition of two or more basic states. [Caution: as we’ll explore in later posts, the number of states in the superposition can depend on one’s choice of “basis”.]
As an example, suppose we have two boxes, L and R for left and right, and two atoms, one of hydrogen H and one of nitrogen N. Our physical system consists of the two atoms, and depending on which box each atom is in, the system can exist in four obvious possibilities, shown in Fig. 1: both atoms in the left box (HL NL), both in the right box (HR NR), or one atom in each box (HL NR or HR NL).
Before quantum physics, we would have thought those were the only options; each atom must be in one box or the other. But in quantum physics there are many more non-obvious possibilities.
In particular, we could put the system in a superposition of the form HL NL + HR NR, shown in Fig. 2. In the jargon of physics, “the system is in a superposition of HL NL and HR NR“. Note the use of the word “and” here. But don’t read too much into it; jargon often involves linguistic shorthand, and can be arbitrary and imprecise. The question I’m focused on here is not “what do physicists say?”, but “what does it actually mean?”
In particular, does it mean that “HL NL AND HR NR” are true? Or does it mean “HL NL OR HR NR” is true? Or does it mean something else?
First, let’s see why the “AND” option has a serious problem.
In ordinary language, if I say that “A AND B are true”, then I mean that one can check that A is true and also, separately, that B is true — i.e., both A and B are true. With this meaning in mind, it’s clear that experiments do not encourage us to view superposition as an AND. (There are interpretations of quantum theory that do encourage the use of “AND”, a point I’ll return to.)

Specifically, if a system is in a quantum superposition of two states A and B, no experiment will ever show that

“A is true AND B is true.”

Instead, in any experiment explicitly designed to check whether A is true and whether B is true, the result will only reveal, at best, that

“A is true OR B is true.”
The result might also be ambiguous, neither confirming nor denying that either one is true. But no measurement will ever show that both A AND B are definitively true. The two possibilities A and B are mutually exclusive in any actual measurement that is sensitive to the question.
In our case, if we go looking for our two atoms in the state HL NL + HR NR — if we do position measurements on both of them — we will either find both of them in the left box OR both of them in the right box. 1920’s quantum physics may be weird, but it does not allow measurements of an atom to find it in two places at the same time: an atom has a position, even if it is inherently uncertain, and if I make a serious attempt to locate it, I will find only one answer (within the precision of the measurement). [Measurement itself requires a long discussion, which I won’t attempt here; but see this post and the following one.]
And so, in this case, a measurement will find that one box has two atoms and the other has zero. Yet if we use “AND” in describing the superposition, we end up saying “both atoms are in the left box AND both atoms are in the right box”, which seems to imply that both atoms are in both boxes, contrary to any experiment. Again, certain theoretical approaches might argue that they are in both boxes, but we should obviously be very cautious when experiment disagrees with theoretical reasoning.
The example of Schrodinger’s cat is another context in which some writers use “and” in describing what is going on.
A reminder of the cat experiment: We have an atom which may decay now or later, according to a quantum process whose timing we cannot predict. If the atom decays, it initiates a chain reaction which kills the cat. If the atom and the cat are placed inside a sealed box, isolating them completely from the rest of the universe, then the initial state, with an intact atom (Ai) and a Live cat (CL), will evolve to a state in a superposition roughly of the form Ai CL + Ad CD, where Ad refers to a decayed atom and CD refers to a Dead cat. (More precisely, the state will take the form c1 Ai CL + c2 Ad CD, where c1 and c2 are complex numbers with |c1|² + |c2|² = 1; but we can ignore these numbers for now.)
Leaving aside that the experiment is both unethical and impossible in practice, it raises an important point about the word “AND”: the setup includes a place where we must say “AND”; there’s no choice.
As we close the box to start the experiment, the atom is intact AND the cat is alive; both are simultaneously true, as measurement can verify. The state that we use to describe this, Ai CL, is a mathematical product: implicitly “Ai CL” means Ai x CL, where x is the “times” symbol.
Later, the state to which the system evolves is a sum of two products — a superposition (Ai x CL) + (Ad x CD) which includes two “AND” relationships:
1) “the atom is intact AND the cat is alive” (Ai x CL)
2) “the atom has decayed AND the cat is dead” (Ad x CD)
In each of these two possibilities, the state of the atom and the state of the cat are perfectly correlated; if you know one, you know the other. To use language consistent with English (and all other languages with which I am familiar), we must use “AND” to describe this correlation. (Note: in this particular example, correlation does in fact imply causation — but that’s not a requirement here. Correlation is enough.)
It is then often said that, theoretically, “before we open the box, the cat is both alive AND dead”. But again, if we open the box to find out, experimentally, we will find out either that “the cat is alive OR the cat is dead.” So we should think this through carefully.
We’ve established that “x” must mean “AND“, as in Fig. 4. So let’s try to understand the “+” that appears in the superposition (Ai x CL) + (Ad x CD). It is certainly the case that such a state doesn’t tell us whether CL is true or CD is true, or even that it is meaningful to say that only one is true.
But suppose we decide that “+” means “AND“, also. Then we end up saying

“(the atom is intact AND the cat is alive) AND (the atom has decayed AND the cat is dead).”

That’s very worrying. In ordinary English, if I’m referring to some possible facts A, B, C, and D, and I tell you that “(A AND B are true) AND (C AND D are true)”, the logic of the language implies that A AND B AND C AND D are all true. But that standard logic would lead to a falsehood. It is absolutely not the case, in the state (Ai x CL) + (Ad x CD), that CL is true and Ad is true — we will never find, in any experiment, that the cat is alive and yet the atom has decayed. That could only happen if the system were in a superposition that includes the possibility Ad x CL. Nor (unless we wait a few years and the cat dies of old age) can it be the case that CD is true and Ai is true.
And so, if “x” means “AND” and “+” means “AND“, it’s clear that these are two different meanings of “AND.”
Is that okay? Well, lots of words have multiple meanings. Still, we’re not used to the idea of “AND” being ambiguous in English. Nor are “x” and “+” usually described with the same word. So using “AND” is definitely problematic.
(That said, people who like to think in terms of parallel “universes” or “branches” in which all possibilities happen [the many-worlds interpretation] may actually prefer to have two meanings of “AND”, one for things that happen in two different branches, and one for things that happen in the same branch. But this has some additional problems too, as we’ll see later when we get to the subtleties of “OR”.)
These issues are why, in my personal view, “OR” is better when one first learns quantum physics. I think it makes it easier to explain how quantum physics is both related to standard probability and yet goes beyond it. For one thing, “or” is already ambiguous in English, so we’re used to the idea that it might have multiple meanings. For another, we definitely need “+” to be conceptually different from “x“, so it is confusing, pedagogically, to start right off by saying that both mathematical operators are “AND”.
But “OR” is not without its problems.
In normal English, saying “the atom is intact and the cat is alive” OR “the atom has decayed and the cat is dead” would tell us two possible facts about the current contents of the box, one of which is definitely true.
But in quantum physics, the use of “OR” in the Schrodinger cat superposition does not tell us what is currently happening inside the box. It does tell us the state of the system at the moment, but all that does is predict the possible outcomes that would be observed if the box were opened right now (and their probabilities.) That’s less information than telling us the properties of what is in the closed box.
The advantage of “OR” is that it does tell us the two outcomes of opening the box, upon which we will find

“the atom is intact AND the cat is alive” OR “the atom has decayed AND the cat is dead.”

Similarly, for our box of atoms, it tells us that if we attempt to locate the atoms, we will find that

“both atoms are in the left box” OR “both atoms are in the right box.”

In other words, this use of AND and OR agrees with what experiments actually find. Better this than the alternative, it seems to me.
Nevertheless, just because it is better doesn’t mean it is unproblematic.
The word “OR” is already ambiguous in usual English, in that it could mean

— “either A is true OR B is true, but not both” (the exclusive “either…or…”), or

— “A is true OR B is true, or possibly both” (the inclusive “and/or”).

Which of these two meanings is intended in an English sentence has to be determined by context, or explained by the speaker. Here I’m focused on the first meaning.
Returning to our first example of Figs. 1 and 2, suppose I hand the two atoms to you and ask you to put them in either box, whichever one you choose. You do so, but you don’t tell me what your choice was, and you head off on a long vacation.
While I wait for you to return, what can I say about the two atoms? Assuming you followed my instructions, I would say that

“both atoms are in the left box” OR “both atoms are in the right box.”

In doing so, I’m using “or” in its “either…or…” sense in ordinary English. I don’t know which box you chose, but I still know (Fig. 5) that the system is either definitely in the HL NL state OR definitely in the HR NR state of Fig. 1. I know this without doing any measurement, and I’m only uncertain about which is which because I’m missing information that you could have provided me. The information is knowable; I just don’t have it.
But this uncertainty about which box the atoms are in is completely different from the uncertainty that arises from putting the atoms in the superposition state HL NL + HR NR!
If the system is in the state HL NL + HR NR, i.e. what I’ve been calling (“HL NL OR HR NR“), it is in a state of inherent uncertainty of whether the two atoms are in the left box or in the right box. It is not that I happen not to know which box the atoms are in, but rather that this information is not knowable within the rules of quantum physics. Even if you yourself put the atoms into this superposition, you don’t know which box they’re in any more than I do.
The only thing we can try to do is perform an experiment and see what the answer is. The problem is that we cannot necessarily infer, if we find both atoms in the left box, that the two atoms were in that box prior to that measurement.
If we do try to make that assumption, we find ourselves in apparent contradiction with experiment. The issue is quantum interference. If we repeat the whole process, but instead of opening the boxes to see where the atoms are, we first bring the two boxes together and measure the atoms’ properties, we will observe quantum interference effects. As I have discussed in my recent series of five posts on interference (starting here), quantum interference can only occur when a system takes at least two paths to its current state; but if the two atoms were definitely in one box or definitely in the other, then there would be only one path in Fig. 6.
Prior to the measurement, the system had inherent uncertainty about the question, and while measurement removes the current uncertainty, it does not in general remove the past uncertainty. The act of measurement changes the state of the system — more precisely, it changes the state of the larger system that includes both atoms and the measurement device — and so establishing meaningfully that the two atoms are now in the left box is not sufficient to tell us meaningfully that the two atoms were previously and definitively in the left box.
So if this is “OR“, it is certainly not what it usually means in English!
And it gets worse, because we can take more complex examples. As I mentioned when discussing the poor cat, the superposition HL NL + HR NR is actually one in a large class of superpositions, of the form c1 HL NL + c2 HR NR , where c1 and c2 are complex numbers. A second simple example of such a superposition is HL NL – HR NR, with a minus sign instead of a plus sign.
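For readers who like to see the arithmetic, here is a small numpy sketch of my own (not part of the original discussion) for the two superpositions just described. It shows that both states give exactly the same box-measurement statistics, both atoms on the left or both on the right with probability 1/2 each, even though the two states are perfectly distinguishable in principle:

```python
# Basis order for the two-atom system: |HL NL>, |HL NR>, |HR NL>, |HR NR>.
import numpy as np

LL = np.array([1, 0, 0, 0], dtype=complex)   # both atoms in the left box
RR = np.array([0, 0, 0, 1], dtype=complex)   # both atoms in the right box

plus  = (LL + RR) / np.sqrt(2)   # "HL NL + HR NR"
minus = (LL - RR) / np.sqrt(2)   # "HL NL - HR NR"

def box_probabilities(state):
    """Probabilities of the four possible position-measurement outcomes."""
    return np.round(np.abs(state) ** 2, 3)

print(box_probabilities(plus))    # [0.5 0.  0.  0.5] -> both-left OR both-right, never one of each
print(box_probabilities(minus))   # [0.5 0.  0.  0.5] -> identical position statistics

# Yet the two superpositions are physically different states:
print(abs(np.vdot(plus, minus)))  # 0.0 -> orthogonal, so some measurement sensitive to the
                                  # relative sign distinguishes them perfectly.
```

The relative sign only shows up in measurements sensitive to quantum interference, which is exactly the complication discussed next.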
So suppose I had asked you to put the two atoms in a superposition either of the form HL NL + HR NR or HL NL – HR NR, your choice; and suppose you did so without telling me which superposition you chose. What would I then know?
I would know that the system is either in the state (HL NL + HR NR) or in the state (HL NL – HR NR), depending on what you chose to do. In words, what I would know is that the system is represented by

(“HL NL OR HR NR“) OR (“HL NL OR HR NR“).
Uh oh. Now we’re as badly off as we were with “AND“.
First, the “OR” in the center is a standard English “OR” — it means that the system is definitely in one superposition or the other, but I don’t know which one — which isn’t the same thing as the “OR“s in the parentheses, which are “OR“s of superposition that only tell us what the results of measurements might be.
Second, the two “OR“s in the parentheses are different, since one means “+” and the other means “–“. In some other superposition state, the OR might mean 3/5 + i 4/5, where i is the standard imaginary number equal to the square root of -1. In English, there’s obviously no room for all this complexity. [Note that I’d have the same problem if I used “AND” for superpositions instead.]
So even if “OR” is better, it’s still not up to the task. Superposition forces us to choose whether to have multiple meanings of “AND” or multiple meanings of “OR”, including meanings that don’t hold in ordinary language. In a sense, the “+” (or “-” or whatever) in a superposition is a bit more “AND” than standard English “OR”, but it’s also a bit more “OR” than a standard English “AND”. It’s something truly new and unfamiliar.
Experts in the foundational meaning of quantum physics argue over whether to use “OR” or “AND”. It’s not an argument I want to get into. My goal here is to help you understand how quantum physics works with the minimum of interpretation and the minimum of mathematics. This requires precise language, of course. But here we find we cannot avoid a small amount of math — that of simple numbers, sometimes even complex numbers — because ordinary language simply can’t capture the logic of what quantum physics can do.
I will continue, for consistency, to use “OR” for a superposition, but going forward we must admit and recognize its limitations, and become more sophisticated about what it does and doesn’t mean. One should understand my use of “OR“, and the “pre-quantum viewpoint” that I often employ, as pedagogical methodology, not a statement about nature. Specifically, I have been trying to clarify the crucial idea of the space of possibilities, and to show examples of how quantum physics goes beyond pre-quantum physics. I find the “pre-quantum viewpoint”, where it is absolutely required that we use “OR”, helps students get the basics straight. But it is true that the pre-quantum viewpoint obscures some of the full richness and complexity of quantum phenomena, much of which arises precisely because the quantum “OR” is not the standard “OR” [and similarly if you prefer “AND” instead.] So we have to start leaving it behind.
There are many more layers of subtlety yet to be uncovered [for instance, what if my system is in a state (A OR B), but I make a measurement that can’t directly tell me whether A is true or B is true?] but this is enough for today.
I’m grateful to Jacob Barandes for a discussion about some of these issues.
The third bullet point is open to different choices about “AND” and “OR“, and open to different interpretation about what superposition states imply about the systems that are in them. There are different consistent ways to combine the language and concepts, and the particular choice I’ve made is pragmatic, not dogmatic. For a single set of blog posts that tell a coherent story, I have to pick a single consistent language; but it’s a choice. Once one’s understanding of quantum physics is strong, it’s both valuable and straightforward to consider other possible choices.
With the stock market crash and the big protests across the US, I’m finally feeling a trace of optimism that Trump’s stranglehold on the nation will weaken. Just a trace.
I still need to self-medicate to keep from sinking into depression — where ‘self-medicate’, in my case, means studying fun math and physics I don’t need to know. I’ve been learning about the interactions between number theory and group theory. But I haven’t been doing enough physics! I’m better at that, and it’s more visceral: more of a bodily experience, imagining things wiggling around.
So, I’ve been belatedly trying to lessen my terrible ignorance of nuclear physics. Nuclear physics is a fascinating application of quantum theory, but it’s less practical than chemistry and less sexy than particle physics, so I somehow skipped over it.
I’m finding it worth looking at! Right away it’s getting me to think about quantum ellipsoids.
Nuclear physics forces you to imagine blobs of protons and neutrons wiggling around in a very quantum-mechanical way. Nuclei are too complicated to fully understand. We can simulate them on a computer, but simulation is not understanding, and it’s also very hard: one book I’m reading points out that one computation you might want to do requires diagonalizing an absolutely enormous matrix. So I’d rather learn about the many simplified models of nuclei people have created, which offer partial understanding… and lots of beautiful math.
Protons minimize energy by forming pairs with opposite spin. Same for neutrons. Each pair acts like a particle in its own right. So nuclei act very differently depending on whether they have an even or odd number of protons, and an even or odd number of neutrons!
The ‘Interacting Boson Model’ is a simple approximate model of ‘even-even’ atomic nuclei: nuclei with an even number of protons and an even number of neutrons. It treats the nucleus as consisting of bosons, each boson being either a pair of nucleons — that is, either protons or neutrons — where the members of a pair have opposite spin but are the same in every other way. So, these bosons are a bit like the paired electrons responsible for superconductivity, called ‘Cooper pairs’.
However, in the Interacting Boson Model we assume our bosons all have either spin 0 (s-bosons) or spin 2 (d-bosons), and we ignore all properties of the bosons except their spin angular momentum. A spin-0 particle has 1 spin state, since the spin-0 representation of SU(2) is 1-dimensional. A spin-2 particle has 5, since the spin-2 representation is 5-dimensional.
If we assume the maximum amount of symmetry among all 6 states, both s-boson and d-boson states, we get a theory with U(6) symmetry! And part of why I got interested in this stuff was that it would be fun to see a rather large group like U(6) showing up as symmetries — or approximate symmetries — in real world physics.
More sophisticated models recognize that not all these states behave the same, so they assume a smaller group of symmetries.
But there are some simpler questions to start with.
How do we make a spin-0 or spin-2 particle out of two nucleons? That’s easy. Two nucleons with opposite spin have total spin 0. But if they’re orbiting each other, they have orbital angular momentum too, so the pair can act like a particle with spin 0, 1, 2, 3, etc.
Why are these bosons in the Interacting Boson Model assumed to have spin 0 or spin 2, but not spin 1 or any other spin? This is a lot harder. I assume that at some level the answer is “because this model works fairly well”. But why does it work fairly well?
By now I've found two answers for this, and I'll tell you the more exciting answer, which I found in this book:
• Igal Talmi, Simple Models of Complex Nuclei: the Shell Model and Interacting Boson Model, Harwood Academic Publishers, Chur, Switzerland, 1993.
In the ‘liquid drop model’ of nuclei, you think of a nucleus as a little droplet of fluid. You can think of an even-even nucleus as a roughly ellipsoidal droplet, which however can vibrate. But we need to treat it using quantum mechanics. So we need to understand quantum ellipsoids!
The space of ellipsoids in R^3 centered at the origin is 6-dimensional, because these ellipsoids are described by equations like

a x^2 + b y^2 + c z^2 + d xy + e yz + f zx = 1

and there are 6 coefficients here. Not all nuclei are close to spherical! But perhaps it's easiest to start by thinking about ellipsoids that are close to spherical, so that

(1 + a) x^2 + (1 + b) y^2 + (1 + c) z^2 + d xy + e yz + f zx = 1

where a, b, c, d, e, f are all small. If our nucleus were classical, we'd want equations that describe how these 6 numbers change with time as our little droplet oscillates.
But the nucleus is deeply quantum mechanical. So in the Interacting Boson Model, invented by Iachello, it seems we replace the six numbers a, b, c, d, e, f with operators q_1, ..., q_6 on a Hilbert space, say L^2(R^6), and introduce corresponding momentum operators p_1, ..., p_6, obeying the usual 'canonical commutation relations':

[q_j, p_k] = i δ_jk,   [q_j, q_k] = [p_j, p_k] = 0
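Here is a tiny numerical check of those commutation relations for a single mode, using truncated harmonic-oscillator matrices. This is a quick illustration of my own, not part of the model; the truncation dimension n is arbitrary, and the relation necessarily fails in the last basis state, so only the upper-left block is tested.

```python
import numpy as np

# Truncated harmonic-oscillator matrices for one of the six modes.
n = 20                                       # truncation dimension (arbitrary)
a = np.diag(np.sqrt(np.arange(1, n)), 1)     # annihilation operator
q = (a + a.T) / np.sqrt(2)                   # position operator
p = (a - a.T) / (1j * np.sqrt(2))            # momentum operator

comm = q @ p - p @ q                         # should be i times the identity
print(np.allclose(comm[:-1, :-1], 1j * np.eye(n - 1)))   # True
```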
As usual, we can take this Hilbert space to be either L^2(R^6) or the 'Fock space' on C^6: the Hilbert space completion of the symmetric algebra of C^6. These are two descriptions of the same thing. The Fock space on C^6 gets an obvious representation of the unitary group U(6), since that group acts on C^6. And L^2(R^6) gets an obvious representation of the rotation group SO(3), since rotations act on ellipsoids and thus on the tuples (a, b, c, d, e, f) that we're using to describe ellipsoids.
The latter description lets us see where the s-bosons and d-bosons are coming from! Our representation of SO(3) on R^6 splits into two summands:
• the (real) spin-0 representation, which is 1-dimensional because it takes just one number to describe the rotation-invariant aspects of the shape of an ellipsoid centered at the origin: for example, its volume. In physics jargon this number tells us the monopole moment of the mass distribution of our nucleus.
• the (real) spin-2 representation, which is 5-dimensional because it takes 5 numbers to describe all other aspects of the shape of an ellipsoid centered at the origin. You need 2 numbers to say in which direction its longest axis points, 1 number to say how long that axis is, 1 number to say which direction the second-longest axis points in (it's at right angles to the longest axis), and 1 number to say how long it is. In physics jargon these 5 numbers tell us the quadrupole moment of our nucleus. (There's a quick numerical check of this splitting right below.)
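That 1 + 5 splitting is easy to check numerically: a quadratic form is a symmetric 3×3 matrix, rotations act on it by conjugation, and the trace piece and the traceless piece never mix. The sketch below is an illustration of my own (random matrices, pure NumPy), not anything from the Interacting Boson Model literature.

```python
import numpy as np

# The 6-dimensional space of symmetric 3x3 matrices (quadratic forms describing
# centered ellipsoids) splits under rotations into a 1-dimensional trace part
# ("spin 0") and a 5-dimensional traceless part ("spin 2").
rng = np.random.default_rng(0)

Q = rng.normal(size=(3, 3))
Q = (Q + Q.T) / 2                       # a random symmetric matrix

M = rng.normal(size=(3, 3))             # build a random rotation via QR
R, _ = np.linalg.qr(M)
if np.linalg.det(R) < 0:
    R[:, 0] = -R[:, 0]                  # make it a proper rotation (det = +1)

trace_part = (np.trace(Q) / 3) * np.eye(3)
traceless_part = Q - trace_part

Q_rot = R @ Q @ R.T                     # how the form transforms under rotation

print(np.isclose(np.trace(Q_rot), np.trace(Q)))             # spin-0 part is invariant
print(np.isclose(np.trace(R @ traceless_part @ R.T), 0.0))  # traceless part stays traceless
```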
This shows us why we don't get spin-1 bosons! We'd get them if the mass distribution of our nucleus could have a nonzero dipole moment. In other words, we'd get them if we added linear terms in x, y and z to our equation for the ellipsoid.
But by conservation of momentum, we can assume the center of mass of our nucleus stays at the origin, and set these linear terms to zero.
As usual, we can take linear combinations of the operators q_j and p_j to get annihilation and creation operators for s-bosons and d-bosons. If we want, we can think of these bosons as nucleon pairs. But we don't need that microscopic interpretation if we don't want it: we can just say we're studying the quantum behavior of an oscillating ellipsoid!
After we have our Hilbert space and these operators on it, we can write down a Hamiltonian for our nucleus, or various possible candidate Hamiltonians, in terms of these operators. Talmi’s book goes into a lot of detail on that. And then we can compare the oscillations these Hamiltonians predict to what we see in the lab. (Often we just see the frequencies of the standing waves, which are proportional to the eigenvalues of the Hamiltonian.)
So, from a high-level mathematical viewpoint, what we've done is try to define a manifold M of ellipsoid shapes, and then form its cotangent bundle T*M, and then quantize that and start studying 'quantum ellipsoids'.
Pretty cool! And there's a lot more to say about it. But I'm wondering if there might be a better manifold of ellipsoid shapes than just R^6. After all, if the coefficients get big enough that the quadratic form stops being positive definite, things go haywire: our 'ellipsoid' turns into a hyperboloid! The approach I've described is probably fine 'perturbatively', i.e. when a, b, c, d, e, f are small. But it may not be the best when our ellipsoid oscillates so much it gets far from spherical.
I think we need a real algebraic geometer here. In both senses of the word ‘real’.
The quantum double-slit experiment, in which objects are sent toward a wall with two slits and then recorded on a screen behind the wall, creates an interference pattern that builds up gradually, object by object. And yet, it’s crucial that the path of each object on its way to the screen remain unknown. If one measures which of the slits each object passes through, the interference pattern never appears.
Strange things are said about this. There are vague, weird slogans: “measurement causes the wave function to collapse“; “the particle interferes with itself“; “electrons are both particles and waves“; etc. One reads that the objects are particles when they reach the screen, but they are waves when they go through the slits, causing the interference — unless their passage through the slits is measured, in which case they remain particles.
But in fact the equations of 1920s quantum physics say something different and not vague in the slightest — though perhaps equally weird. As we'll see today, the elimination of interference by measurement is no mystery at all, once you understand both measurement and interference. Those of you who've followed my recent posts on these two topics will find this surprisingly straightforward; I guarantee you'll say, "Oh, is that all?" Other readers will probably want to read those posts on measurement and interference first.
When do we expect quantum interference? As I'll review in a moment, there's a simple criterion: interference can only occur if, at some moment in time, the two parts of the superposition describe exactly the same physical situation, with every object in the same place in both parts.
To remind you what that means, let's compare two contrasting cases (covered carefully in this post.) Figs. 1a and 1b show pre-quantum animations of different quantum systems, in which two balls (drawn blue and orange) are in a superposition of moving left OR moving right. I've chosen to stop each animation right at the moment when the blue ball in the top half of the superposition is at the same location as the blue ball in the bottom half, because if the orange ball weren't there, this is when we'd expect to see quantum interference.
But for interference to occur, the orange ball, too, must at that same moment be in the same place in both parts of the superposition. That does happen for the system in Fig. 1a — the top and bottom parts of the figure line up exactly, and so interference will occur. But the system in Fig. 1b, whose top and bottom parts never look the same, will not show quantum interference.
In other words, quantum interference requires that the two possibilities in the superposition become identical at some moment in time. Partial resemblance is not enough.
A measurement always involves an interaction of some sort between the object we want to measure and the device doing the measurement. We will typically think of it as having two steps: first, an interaction that correlates (entangles) the measuring device with the object; and second, an amplification of that correlation into a permanent, macroscopic record.
For today’s purposes, the details of the second step won’t matter, so I’ll focus on the first step.
We’ll call the object going through the slits a “particle”, and we’ll call the measurement device a “measuring ball” (or just “ball” for short.) The setup is depicted in Fig. 2, where the particle is approaching the slits and the measuring ball lies in wait.
Suppose we allow the particle to proceed and we make no measurement of its location as it passes through the slits. Then we can leave the ball where it is, at the position I've marked M in Fig. 3. If the particle makes it through the wall, it must pass through one slit or the other, leaving the system in a superposition of the form (particle goes through the left slit AND ball sits at M) OR (particle goes through the right slit AND ball sits at M),
as shown at the top of Fig. 3. (Note: because the ball and particle are independent [unentangled] in this superposition, it can be written in factored form as in Fig. 12 of this post.)
From here, the particle (whose motion is now quite uncertain as a result of passing through a narrow slit) can proceed unencumbered to the screen. Let’s say it arrives at the point marked P, as at the bottom of Fig. 3.
Crucially, both halves of the superposition now describe the same situation: particle at P, ball at M. The system has arrived here via two paths: either the particle traveled from the left slit to P, or it traveled from the right slit to P, with the ball sitting quietly at M the whole time in both cases.
Therefore, since the system has reached a single possibility via two different routes, quantum interference may be observed.
Specifically, the system’s wave function, which gives the probability for the particle to arrive at any point on the screen, will display an interference pattern. We saw numerous similar examples in this post, this post and this post.
But now let's make the measurement. We'll do it by throwing the ball rapidly toward the particle, timed carefully so that, as shown in Fig. 4, either the ball collides with the particle (if the particle went through the slit lying in the ball's path) or the ball sails onward undisturbed (if the particle went through the other slit).
(Recall that I assumed the measuring ball is lightweight, so the collision doesn't much affect the particle; for instance, the particle might be a heavy atom, while the measuring ball is a light atom.)
The ball's late-time behavior reveals — and thus measures — the particle's behavior as it passed through the wall: from where the ball ends up, we can tell which slit the particle went through.
[Said another way, the ball and particle, which were originally independent before the measurement, have been entangled by the measurement process. Because of the entanglement, knowledge concerning the ball tells us something about the particle too.]
To make this measurement complete and permanent requires a longer story with more details; for instance, we might choose to amplify the result with a Geiger counter. But the details don’t matter, and besides, that takes place later. Let’s keep our focus on what happens next.
What happens next is that the particle reaches the point P on the screen. It can do this whether it traveled via the left slit or via the right slit, just as before, and so you might think there should still be an interference pattern. However, remembering Figs. 1a and 1b and the criterion for interference, take a look at Fig. 5.
Even though the particle by itself could have taken two paths to the point P, the system as a whole is still in a superposition of two different possibilities, not one — more like Fig. 1b than like Fig. 1a. Specifically, the particle is at P in both parts of the superposition, but the ball is not doing the same thing in both parts: its behavior records which slit the particle went through.
The measurement process — by the very definition of “measurement” as a procedure that segregates left-slit cases from right-slit cases — has resulted in the two parts of the superposition being different even when they both have the particle reaching the same point P. Therefore, in contrast to Fig. 3, quantum interference between the two parts of the superposition cannot occur.
And that’s it. That’s all there is to it.
The double-slit experiment is hard to understand if one relies on vague slogans. But if one relies on the math, one sees that many of the seemingly mysterious features of the experiment are in fact straightforward.
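The punchline can be captured in a few lines of arithmetic. The toy model below is my own sketch, not the wave function of Figs. 3-5: it uses made-up Gaussian amplitudes for the two routes to the screen and an assumed fringe wavenumber q, and it weights the interference term by the overlap between the measuring ball's two possible final states. Overlap 1 means no measurement and full fringes; overlap 0 means a perfect measurement and no fringes at all.

```python
import numpy as np

# Screen pattern for the particle, given the overlap between the ball's two
# possible final states (1 = ball undisturbed either way, 0 = perfect
# which-slit measurement). Toy Gaussian amplitudes; q sets the fringe spacing.
q = 1.0

def pattern(x, ball_overlap):
    psi_L = np.exp(-x**2 / 200) * np.exp(+1j * q * x)   # amplitude via left slit
    psi_R = np.exp(-x**2 / 200) * np.exp(-1j * q * x)   # amplitude via right slit
    return (abs(psi_L)**2 + abs(psi_R)**2
            + 2 * np.real(ball_overlap * psi_L * np.conj(psi_R)))

bright, dark = 0.0, np.pi / (2 * q)   # a bright fringe and the adjacent dark fringe
for overlap, label in [(1.0, "no measurement  "), (0.0, "with measurement")]:
    print(label, "bright =", round(pattern(bright, overlap), 3),
          " dark =", round(pattern(dark, overlap), 3))
```

Partial measurements, with overlaps between 0 and 1, give correspondingly washed-out fringes.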
I'll say more about this in future posts. In particular, to convince you today's argument is really correct, I'll look more closely at the quantum wave function corresponding to Figs. 3-5, and will reproduce the same phenomenon in simpler examples. Then we'll apply the resulting insights to other cases.
Now finally, we come to the heart of the matter of quantum interference, as seen from the perspective of 1920s quantum physics. (We'll deal with quantum field theory later this year.)
Last time I looked at some cases of two particle states in which the particles’ behavior is independent — uncorrelated. In the jargon, the particles are said to be “unentangled”. In this situation, and only in this situation, the wave function of the two particles can be written as a product of two wave functions, one per particle. As a result, any quantum interference can be ascribed to one particle or the other, and is visible in measurements of either one particle or the other. (More precisely, it is observable in repeated experiments, in which we do the same measurement over and over.)
In this situation, because each particle’s position can be studied independent of the other’s, we can be led to think any interference associated with particle 1 happens near where particle 1 is located, and similarly for interference involving the second particle.
But this line of reasoning only works when the two particles are uncorrelated. Once this isn’t true — once the particles are entangled — it can easily break down. We saw indications of this in an example that appeared at the ends of my last two posts (here and here), which I’m about to review. The question for today is: what happens to interference in such a case?
Let me now review the example of my recent posts. The pre-quantum system is shown in Fig. 1.
Notice the particles are correlated; either both particles are moving to the left OR both particles are moving to the right. (The two particles are said to be “entangled”, because the behavior of one depends upon the behavior of the other.) As a result, the wave function cannot be factored (in contrast to most examples in my last post) and we cannot understand the behavior of particle 1 without simultaneously considering the behavior of particle 2. Compare this to Fig. 2, an example from my last post in which the particles are independent; the behavior of particle 2 is the same in both parts of the superposition, independent of what particle 1 is doing.
Let’s return now to Fig. 1. The wave function for the corresponding quantum system, shown as a graph of its absolute value squared on the space of possibilities, behaves as in Fig. 3.
But as shown last time in Fig. 19, at the moment where the interference in Fig. 3 is at its largest, if we measure particle 1 we see no interference effect. More precisely, if we do the experiment many times and measure particle 1 each time, as depicted in Fig. 4, we see no interference pattern.
We see something analogous if we measure particle 2.
Yet the interference is plain as day in Fig. 3. It’s obvious when we look at the full two-dimensional space of possibilities, even though it is invisible in Fig. 4 for particle 1 and in the analogous experiment for particle 2. So what measurements, if any, can we make that can reveal it?
The clue comes from the fact that the interference fringes lie at a 45 degree angle, perpendicular neither to the x1 axis nor to the x2 axis but instead to the axis for the variable 1/2(x1 + x2), the average of the positions of particle 1 and 2. It’s that average position that we need to measure if we are to observe the interference.
But doing so requires that we measure both particles' positions. We have to measure them both every time we repeat the experiment. Only then can we start making a plot of the average of their positions.
When we do this, we will find what is shown in Fig. 5.
For each measurement, I’ve drawn a straight orange line between the measurement of x1 and the measurement of x2; the center of this line lies at the average position 1/2(x1+x2). The actual averages are then recorded in a different color, to remind you that we don’t measure them directly; we infer them from the actual measurements of the two particles’ positions.
In short, the interference is not associated with either particle separately — none is seen in either the top or bottom rows. Instead, it is found within the correlation between the two particles’ positions. This is something that neither particle can tell us on its own.
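Here is a small numerical illustration of that point. It's a toy model of my own (the Gaussian packets, relative momentum k and width sigma are all made-up), not the actual wave function behind the figures: each particle's marginal distribution is a single featureless bump, yet the distribution of the average position 1/2(x1+x2) is covered in fringes.

```python
import numpy as np

# |psi|^2 for two entangled particles at the moment of maximal overlap:
# the branch "both moving right" plus the branch "both moving left".
k, sigma = 3.0, 1.0
x = np.linspace(-10, 10, 400)
X1, X2 = np.meshgrid(x, x, indexing="ij")

envelope = np.exp(-(X1**2 + X2**2) / (4 * sigma**2))
psi = envelope * (np.exp(1j * k * (X1 + X2)) + np.exp(-1j * k * (X1 + X2)))
prob = np.abs(psi)**2

def n_peaks(p):
    """Count local maxima that rise above 5% of the distribution's peak."""
    inner = p[1:-1]
    return int(np.sum((inner > p[:-2]) & (inner > p[2:]) & (inner > 0.05 * p.max())))

p_x1 = prob.sum(axis=1)                          # marginal for particle 1
p_x2 = prob.sum(axis=0)                          # marginal for particle 2
bins = np.linspace(-10, 10, 201)
p_avg, _ = np.histogram(((X1 + X2) / 2).ravel(), bins=bins, weights=prob.ravel())

print("peaks in x1 marginal:      ", n_peaks(p_x1))   # a single broad bump
print("peaks in x2 marginal:      ", n_peaks(p_x2))   # a single broad bump
print("peaks in (x1+x2)/2 spread: ", n_peaks(p_avg))  # several interference fringes
```

Flipping the relative sign of the momenta in the two branches moves the fringes from the average coordinate to the difference coordinate, which is exactly the situation discussed next.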
And where is the interference? It certainly lies near 1/2(x1+x2)=0. But this should worry you. Is that really a point in physical space?
You could imagine a more extreme example of this experiment in which Fig. 5 shows particle 1 located in Boston and particle 2 located in New York City. This would put their average position within the appropriately-named Middletown, Connecticut. (I kid you not; check for yourself.) Would we really want to say that the interference itself is located in Middletown, even though it's a quiet bystander, unaware of the existence of two correlated particles that lie in opposite directions 90 miles (150 km) away?
After all, the interference appears in the relationship between the particles’ positions in physical space, not in the positions themselves. Its location in the space of possibilities (Fig. 3) is clear. Its location in physical space (Fig. 5) is anything but.
Still, I can imagine you pondering whether it might somehow make sense to assign the interference to poor, unsuspecting Middletown. For that reason, I’m going to make things even worse, and take Middletown out of the middle.
Here’s another system with interference, whose pre-quantum version is shown in Figs. 6a and 6b:
The corresponding wave function is shown in Fig. 7. Now the interference fringes are oriented diagonally the other way compared to Fig. 3. How are we to measure them this time?
The average position 1/2(x1+x2) won’t do; we’ll see nothing interesting there. Instead the fringes are near (x1-x2)=4 — that is, they occur when the particles, no matter where they are in physical space, are at a distance of four units. We therefore expect interference near 1/2(x1-x2)=2. Is it there?
In Fig. 8 I've shown the analogue of Figs. 4 and 5, depicting, for each repetition of the experiment, the measured positions x1 and x2, along with their average 1/2(x1+x2) and half their difference 1/2(x1-x2).
That quantity 1/2(x1-x2) is half the horizontal length of the orange line. Hidden in its behavior over many measurements is an interference pattern, seen in the bottom row, where the 1/2(x1-x2) measurements are plotted. [Note also that there is no interference pattern in the measurements of 1/2(x1+x2), in contrast to Fig. 4.]
Now the question of the hour: where is the interference in this case? It is found near 1/2(x1-x2)=2 — but that certainly is not to be identified with a legitimate position in physical space, such as the point x=2.
First of all, making such an identification in Fig. 8 would be like saying that one particle is in New York and the other is in Boston, while the interference is 150 kilometers offshore in the ocean. But second and much worse, I could change Fig. 8 by moving both particles 10 units to the left and repeating the experiment. This would cause x1, x2, and 1/2(x1+x2) in Fig. 8 to all shift left by 10 units, moving them off your computer screen, while leaving 1/2(x1-x2) unchanged at 2. In short, all the orange and blue and yellow points would move out of your view, while the green points would remain exactly where they are. The difference of positions — a distance — is not a position.
If 10 units isn’t enough to convince you, let’s move the two particles to the other side of the Sun, or to the other side of the galaxy. The interference pattern stubbornly remains at 1/2(x1-x2)=2. The interference pattern is in a difference of positions, so it doesn’t care whether the two particles are in France, Antarctica, or Mars.
We can move the particles anywhere in the universe, as long as we take them together with their average distance remaining the same, and the interference pattern remains exactly the same. So there’s no way we can identify the interference as being located at a particular value of x, the coordinate of physical space. Trying to do so creates nonsense.
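To see numerically that such fringes have no home in physical space, here is another self-contained toy sketch of my own (again with made-up packet centers, relative momentum and width): it computes the distribution of 1/2(x1-x2) once with the particles near the origin and once with both shifted 1000 units away, and the two fringe patterns come out identical.

```python
import numpy as np

k, sigma = 3.0, 1.0     # relative momentum and packet width (made-up values)

def diff_distribution(c1, c2):
    """Distribution of (x1-x2)/2 for packets centered at c1 and c2, in a
    superposition whose two branches carry opposite relative momenta."""
    x1 = np.linspace(c1 - 8, c1 + 8, 400)
    x2 = np.linspace(c2 - 8, c2 + 8, 400)
    X1, X2 = np.meshgrid(x1, x2, indexing="ij")
    envelope = np.exp(-((X1 - c1)**2 + (X2 - c2)**2) / (4 * sigma**2))
    psi = envelope * (np.exp(1j * k * (X1 - X2)) + np.exp(-1j * k * (X1 - X2)))
    prob = np.abs(psi)**2
    bins = np.linspace(-8.05, 8.05, 162) + (c1 - c2) / 2
    hist, _ = np.histogram(((X1 - X2) / 2).ravel(), bins=bins, weights=prob.ravel())
    return hist / hist.sum()

near_home = diff_distribution(2.0, -2.0)        # particles near the origin
far_away = diff_distribution(1002.0, 998.0)     # both shifted 1000 units away

print(np.allclose(near_home, far_away))         # True: the fringes haven't budged
```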
This is totally unlike interference in water waves and sound waves. That kind of interference happens somewhere definite; we can say where the waves are, how big they are at a particular location, and where their peaks and valleys are in physical space. Quantum interference is not at all like this. It's something more general, more subtle, and more troubling to our intuition.
[By the way, there's nothing special about the two combinations 1/2(x1+x2) and 1/2(x1-x2), the average or the difference. It's easy to find systems where the interference arises in the combination x1+2x2, or 3x1-x2, or any other one you like. In none of these is there a natural way to say "where" the interference is located.]
From these examples, we can begin to learn a central lesson of modern physics, one that a century of experimental and theoretical physics has been teaching us repeatedly, with ever greater subtlety. Imagining reality as many of us are inclined to do, as made of localized objects positioned in and moving through physical space — the one-dimensional x-axis in my simple examples, and the three-dimensional physical space that we take for granted when we specify our latitude, longitude and altitude — is simply not going to work in a quantum universe. The correlations among objects have observable consequences, and those correlations cannot simply be ascribed locations in physical space. To make sense of them, it seems we need to expand our conception of reality.
In the process of recognizing this challenge, we have had to confront the giant, unwieldy space of possibilities, which we can only visualize for a single particle moving in up to three dimensions, or for two or three particles moving in just one dimension. In realistic circumstances, especially those of quantum field theory, the space of possibilities has a huge number of dimensions, rendering it horrendously unimaginable. Whether this gargantuan space should be understood as real — perhaps even more real than physical space — continues to be debated.
Indeed, the lessons of quantum interference are ones that physicists and philosophers have been coping with for a hundred years, and their efforts to make sense of them continue to this day. I hope this series of posts has helped you understand these issues, and to appreciate their depth and difficulty.
Looking ahead, we’ll soon take these lessons, and other lessons from recent posts, back to the double-slit experiment. With fresher, better-informed eyes, we’ll examine its puzzles again.
Several people have asked me whether writing a popular-science book has fed back into my research. Nature Physics published my favorite illustration of the answer this January. Here’s the story behind the paper.
In late 2020, I was sitting by a window in my home office (AKA living room) in Cambridge, Massachusetts. I’d drafted 15 chapters of my book Quantum Steampunk. The epilogue, I’d decided, would outline opportunities for the future of quantum thermodynamics. So I had to come up with opportunities for the future of quantum thermodynamics. The rest of the book had related foundational insights provided by quantum thermodynamics about the universe’s nature. For instance, quantum thermodynamics had sharpened the second law of thermodynamics, which helps explain time’s arrow, into more-precise statements. Conventional thermodynamics had not only provided foundational insights, but also accompanied the Industrial Revolution, a paragon of practicality. Could quantum thermodynamics, too, offer practical upshots?
Quantum thermodynamicists had designed quantum engines, refrigerators, batteries, and ratchets. Some of these devices could outperform their classical counterparts, according to certain metrics. Experimentalists had even realized some of these devices. But the devices weren't useful. For instance, a simple quantum engine consisted of one atom. I expected such an atom to produce one electronvolt of energy per engine cycle. (A light bulb emits about 10^21 electronvolts of light per second.) Cooling the atom down and manipulating it would cost loads more energy. The engine wouldn't earn its keep.
Autonomous quantum machines offered greater hope for practicality. By autonomous, I mean, not requiring time-dependent external control: nobody need twiddle knobs or push buttons to guide the machine through its operation. Such control requires work—organized, coordinated energy. Rather than receiving work, an autonomous machine accesses a cold environment and a hot environment. Heat—random, disorganized energy cheaper than work—flows from the hot to the cold. The machine transforms some of that heat into work to power itself. That is, the machine sources its own work from cheap heat in its surroundings. Some air conditioners operate according to this principle. So can some quantum machines—autonomous quantum machines.
Thermodynamicists had designed autonomous quantum engines and refrigerators. Trapped-ion experimentalists had realized one of the refrigerators, in a groundbreaking result. Still, the autonomous quantum refrigerator wasn’t practical. Keeping the ion cold and maintaining its quantum behavior required substantial work.
My community needed, I wrote in my epilogue, an analogue of solar panels in southern California. (I probably drafted the epilogue during a Boston winter, thinking wistfully of Pasadena.) If you built a solar panel in SoCal, you could sit back and reap the benefits all year. The panel would fulfill its mission without further effort from you. If you built a solar panel in Rochester, you’d have to scrape snow off of it. Also, the panel would provide energy only a few months per year. The cost might not outweigh the benefit. Quantum thermal machines resembled solar panels in Rochester, I wrote. We needed an analogue of SoCal: an appropriate environment. Most of it would be cold (unlike SoCal), so that maintaining a machine’s quantum nature would cost a user almost no extra energy. The setting should also contain a slightly warmer environment, so that net heat would flow. If you deposited an autonomous quantum machine in such a quantum SoCal, the machine would operate on its own.
Where could we find a quantum SoCal? I had no idea.
A few months later, I received an email from quantum experimentalist Simone Gasparinetti. He was setting up a lab at Chalmers University in Sweden. What, he asked, did I see as opportunities for experimental quantum thermodynamics? We’d never met, but we agreed to Zoom. Quantum Steampunk on my mind, I described my desire for practicality. I described autonomous quantum machines. I described my yearning for a quantum SoCal.
I have it, Simone said.
Simone and his colleagues were building a quantum computer using superconducting qubits. The qubits fit on a chip about the size of my hand. To keep the chip cold, the experimentalists put it in a dilution refrigerator. You’ve probably seen photos of dilution refrigerators from Google, IBM, and the like. The fridges tend to be cylindrical, gold-colored monstrosities from which wires stick out. (That is, they look steampunk.) You can easily develop the impression that the cylinder is a quantum computer, but it’s only the fridge.
The fridge, Simone said, resembles an onion: it has multiple layers. Outer layers are warmer, and inner layers are colder. The quantum computer sits in the innermost layer, so that it behaves as quantum mechanically as possible. But sometimes, even the fridge doesn’t keep the computer cold enough.
Imagine that you’ve finished one quantum computation and you’re preparing for the next. The computer has written quantum information to certain qubits, as you’ve probably written on scrap paper while calculating something in a math class. To prepare for your next math assignment, given limited scrap paper, you’d erase your scrap paper. The quantum computer’s qubits need erasing similarly. Erasing, in this context, means cooling down even more than the dilution refrigerator can manage.
Why not use an autonomous quantum refrigerator to cool the scrap-paper qubits?
I loved the idea, for three reasons. First, we could place the quantum refrigerator beside the quantum computer. The dilution refrigerator would already be cold, for the quantum computations’ sake. Therefore, we wouldn’t have to spend (almost any) extra work on keeping the quantum refrigerator cold. Second, Simone could connect the quantum refrigerator to an outer onion layer via a cable. Heat would flow from the warmer outer layer to the colder inner layer. From the heat, the quantum refrigerator could extract work. The quantum refrigerator would use that work to cool computational qubits—to erase quantum scrap paper. The quantum refrigerator would service the quantum computer. So, third, the quantum refrigerator would qualify as practical.
Over the next three years, we brought that vision to life. (By we, I mostly mean Simone’s group, as my group doesn’t have a lab.)
Postdoc Aamir Ali spearheaded the experiment. Then-master’s student Paul Jamet Suria and PhD student Claudia Castillo-Moreno assisted him. Maryland postdoc Jeffrey M. Epstein began simulating the superconducting qubits numerically, then passed the baton to PhD student José Antonio Marín Guzmán.
The experiment provided a proof of principle: it demonstrated that the quantum refrigerator could operate. The experimentalists didn’t apply the quantum refrigerator in a quantum computation. Also, they didn’t connect the quantum refrigerator to an outer onion layer. Instead, they pumped warm photons to the quantum refrigerator via a cable. But even in such a stripped-down experiment, the quantum refrigerator outperformed my expectations. I thought it would barely lower the “scrap-paper” qubit’s temperature. But that qubit reached a temperature of 22 milliKelvin (mK). For comparison: if the qubit had merely sat in the dilution refrigerator, it would have reached a temperature of 45–70 mK. State-of-the-art protocols had lowered scrap-paper qubits’ temperatures to 40–49 mK. So our quantum refrigerator outperformed our competitors, through the lens of temperature. (Our quantum refrigerator cooled more slowly than they did, though.)
Simone, José Antonio, and I have followed up on our autonomous quantum refrigerator with a forward-looking review about useful autonomous quantum machines. Keep an eye out for a blog post about the review…and for what we hope grows into a subfield.
In summary, yes, publishing a popular-science book can benefit one’s research.
This is a bit of a shaggy dog story, but I think it’s fun. There’s also a moral about the nature of mathematical research.
Once I was interested in the McGee graph, nicely animated here by Mamouka Jibladze:
This is the unique (3,7)-cage, meaning the smallest graph such that each vertex has 3 neighbors and the shortest cycle has length 7. Since it has a very symmetrical appearance, I hoped it would be connected to some interesting algebraic structures. But which?
I read on Wikipedia that the symmetry group of the McGee graph has order 32. Let’s call it the McGee group. Unfortunately there are many different 32-element groups — 51 of them, in fact! — and the article didn’t say which one this was. (It does now.)
I posted a general question:
and Gordon Royle said the McGee group is “not a super-interesting group, it is SmallGroup(32,43) in either GAP or Magma”. Knowing this let me look up the McGee group on this website, which is wonderfully useful if you’re studying finite groups:
There I learned that the McGee group is the so-called holomorph of the cyclic group Z/8: that is, the semidirect product of Z/8 and its automorphism group, Z/8 ⋊ Aut(Z/8).
I resisted getting sucked into the general study of holomorphs, or what happens when you iterate the holomorph construction. Instead, I wanted a more concrete description of the McGee group.
Z/8 is not just an abelian group: it's a ring! Since multiplication in a ring distributes over addition, we can get automorphisms of the group Z/8 by multiplying by those elements that have multiplicative inverses. These invertible elements form a group (Z/8)^×, called the multiplicative group of Z/8. In fact these give all the automorphisms of the group Z/8.
In short, the McGee group is Z/8 ⋊ (Z/8)^×. This is very nice, because this is the group of all transformations of Z/8 of the form x ↦ ax + b with a ∈ (Z/8)^× and b ∈ Z/8. If we think of Z/8 as a kind of line — called the 'affine line over Z/8' — these are precisely all the affine transformations of this line. Thus, the McGee group deserves to be called the affine group of Z/8, which I'll write as Aff(Z/8).
This suggests that we can build the McGee graph in some systematic way starting from the affine line over . This turns out to be a bit complicated, because the vertices come in two kinds. That is, the McGee group doesn’t act transitively on the set of vertices. Instead, it has two orbits, shown as red and blue dots here:
The 8 red vertices correspond straightforwardly to the 8 points of the affine line, but the 16 blue vertices are more tricky. There are also the edges to consider: these come in three kinds! Greg Egan figured out how this works, and I wrote it up:
Then a decade passed.
About two weeks ago, I gave a Zoom talk at the Illustrating Math Seminar about some topics on my blog Visual Insight. I mentioned that the McGee group is SmallGroup(32,43) and the holomorph of Z/8. And then someone — alas, I forget who — instantly typed in the chat that this is one of the two smallest groups with an amazing property! Namely, this group has an outer automorphism that maps each element to an element conjugate to it.
I didn’t doubt this for a second. To paraphrase what Hardy said when he received Ramanujan’s first letter, nobody would have the balls to make up this shit. So, I posed a challenge to find such an exotic outer automorphism:
By reading around, I soon learned that people have studied this subject quite generally:
Martin Hertweck, Class-preserving automorphisms of finite groups, Journal of Algebra 241 (2001), 1–26.
Manoj K. Yadav, Class preserving automorphisms of finite p-groups: a survey, Journal of the London Mathematical Society 75 (2007), 755–772.
An automorphism α of a group G is class-preserving if for each g ∈ G there exists some h ∈ G such that α(g) = h g h^-1. If you can use the same h for every g, we call α an inner automorphism. But some groups have class-preserving automorphisms that are not inner! These are the class-preserving outer automorphisms.
I don’t know if class-preserving outer automorphisms are good for anything, or important in any way. They mainly just seem intriguingly spooky. An outer automorphism that looks inner if you examine its effect on any one group element is nothing I’d ever considered. So I wanted to see an example.
Rising to my challenge, Greg Egan found a nice explicit formula for some class-preserving outer automorphisms of the McGee group.
As we've seen, any element of the McGee group is a transformation x ↦ ax + b of Z/8, so let's write it as a pair (a, b). Greg Egan looked for automorphisms of the McGee group that are of the form ψ_f(a, b) = (a, b + f(a)) for some function f: (Z/8)^× → Z/8.
It is easy to check that ψ_f is an automorphism if and only if f(aa') = f(a) + a f(a') for all a, a' ∈ (Z/8)^×. Moreover, ψ_f is an inner automorphism if and only if f(a) = ac - c for some c ∈ Z/8.
Now comes something cool noticed by Joshua Grochow: these formulas are an instance of a general fact about group cohomology!
Suppose we have a group G acting as automorphisms of an abelian group A. Then we can define the cohomology H^n(G, A) to be the group of n-cocycles modulo n-coboundaries. We only need the case n = 1 here. A 1-cocycle is none other than a function f: G → A obeying f(gh) = f(g) + g f(h), while a 1-coboundary is one of the form f(g) = g a - a for some a ∈ A. You can check that every 1-coboundary is a 1-cocycle. H^1(G, A) is the group of 1-cocycles modulo 1-coboundaries.
In this situation we can define the semidirect product A ⋊ G, and for any 1-cocycle f we can define a function ψ_f: A ⋊ G → A ⋊ G by ψ_f(a, g) = (a + f(g), g).
Now suppose f: G → A is a 1-cocycle. Then by straightforward calculations we can check that ψ_f is an automorphism of A ⋊ G, and that ψ_f is an inner automorphism if and only if f is a 1-coboundary. Thus, A ⋊ G will have outer automorphisms if H^1(G, A) ≠ 0.
When A = Z/8 and G = (Z/8)^× acting by multiplication, A is abelian and A ⋊ G is the McGee group. This puts Egan's idea into a nice context. But we still need to actually find maps f that give outer automorphisms of the McGee group, and then find class-preserving ones. I don't know how to do that using general ideas from cohomology. Maybe someone smart could do the first part, but the 'class-preserving' condition doesn't seem to emerge naturally from cohomology.
Anyway, Egan didn't waste his time with such effete generalities: he actually found all choices of f: (Z/8)^× → Z/8 for which ψ_f(a, b) = (a, b + f(a)) is a class-preserving outer automorphism of the McGee group.
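If you want to see such choices of f concretely, a brute-force search is quick. The sketch below is my own (it is not Egan's derivation, and it assumes the conventions established above, with the McGee group realized as affine maps x ↦ ax + b mod 8); it simply enumerates all candidate functions f and keeps the ones that are cocycles, are not coboundaries, and give class-preserving automorphisms.

```python
from itertools import product

N = 8
UNITS = [a for a in range(N) if a % 2 == 1]       # (Z/8)^x = {1, 3, 5, 7}
G = [(a, b) for a in UNITS for b in range(N)]     # the McGee group, order 32

def mul(g, h):
    # (a,b) means x |-> a*x + b, so (a,b)*(c,d) acts as x |-> a*(c*x + d) + b
    a, b = g
    c, d = h
    return ((a * c) % N, (a * d + b) % N)

def inv(g):
    a, b = g
    a_inv = next(u for u in UNITS if (a * u) % N == 1)
    return (a_inv, (-a_inv * b) % N)

def conjugacy_class(g):
    return {mul(mul(h, g), inv(h)) for h in G}

def is_cocycle(f):
    return all(f[(a * c) % N] == (f[a] + a * f[c]) % N for a in UNITS for c in UNITS)

def is_coboundary(f):
    return any(all(f[a] == ((a - 1) * c) % N for a in UNITS) for c in range(N))

def is_class_preserving(f):
    psi = lambda g: (g[0], (g[1] + f[g[0]]) % N)
    return all(psi(g) in conjugacy_class(g) for g in G)

hits = []
for values in product(range(N), repeat=len(UNITS)):
    f = dict(zip(UNITS, values))
    if is_cocycle(f) and not is_coboundary(f) and is_class_preserving(f):
        hits.append(f)

print(len(hits), "class-preserving outer automorphisms of the form psi_f")
for f in hits:
    print(f)
```

Dropping the is_class_preserving test lists every outer automorphism of this form instead.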
Last Saturday after visiting my aunt in Santa Barbara I went to Berkeley to visit the applied category theorists at the Topos Institute. I took a train, to lessen my carbon footprint a bit. The trip took 9 hours — a long time, but a beautiful ride along the coast and then through forests and fields.
The day before taking the train, I discovered my laptop was no longer charging! So, I bought a pad of paper. And then, while riding the train, I checked by hand that Egan's first choice of f really is a cocycle, and really is not a coboundary, so that it defines an outer automorphism of the McGee group. Then — and this was fairly easy — I checked that it defines a class-preserving automorphism. It was quite enjoyable, since I hadn't done any long calculations recently.
One moral here is that interesting ideas often arise from the interactions of many people. The results here are not profound, but they are certainly interesting, and they came from online conversations with Greg Egan, Gordon Royle, Joshua Grochow, the mysterious person who instantly knew that the McGee group was one of the two smallest groups with a class-preserving outer automorphism, and others.
But what does it all mean, mathematically? Is there something deeper going on here, or is it all just a pile of curiosities?
What did we actually do, in the end? Following the order of logic rather than history, maybe this. We started with a commutative ring R, took its group of affine transformations Aff(R) = R ⋊ R^×, and saw this group must have outer automorphisms if H^1(R^×, R) ≠ 0. We saw this cohomology group really is nonvanishing when R = Z/8. Furthermore, we found a class-preserving outer automorphism of Aff(Z/8).
This raises a few questions:
• What is the cohomology H^1(R^×, R) in general?
• What are the outer automorphisms of Aff(R)?
• When does Aff(R) have class-preserving outer automorphisms?
I saw a bit about the last question in this paper:
They say that this paper:
proves that Aff(Z/n) has a class-preserving outer automorphism when n is a multiple of 8.
Does this happen only for multiples of 8? Is this somehow related to the most famous thing with period 8 — namely, Bott periodicity? I don’t know.
From August 2013 to January 2017 I ran a blog called Visual Insight, which was a place to share striking images that help explain topics in mathematics. Here’s the video of a talk I gave last week about some of those images:
It was fun showing people the great images created by Refurio Anachro, Greg Egan, Roice Nelson, Gerard Westendorp and many other folks. For more info on the images I talked about, read on….
Here are the articles I spoke about:
You can see the whole blog at the AMS website or my website, or you can check out individual articles here:
Is that 79 articles? If I’d stopped at 78, that would be the dimension of E6.
This guide is intended for the Chapman undergraduate students who are attending this year’s APS Global Summit. It may be useful for others as well.
The APS Global Summit is a ginormous event, featuring dozens of parallel sessions at any given time. It can be exciting for first-time attendees, but also overwhelming. Here, I compile some advice on how to navigate the meeting and some suggestions for sessions and events you might like to attend.
The next sections contain some more specific suggestions about events, talks and sessions that you might like to attend.
I have never been to an orientation or networking event at the APS meeting, but then again I did not go to the APS meeting as a student. Networking is one of the best things you can do at the meeting, so do take any opportunities to meet and talk to people.
Time | Event | Location |
2:00pm – 3:00pm | First Time Attendee Orientation | Anaheim Convention Center, 201AB (Level 2) |
3:00pm – 4:00pm | Undergraduate Student Get Together | Anaheim Convention Center, 201AB (Level 2) |
Time | Event | Location |
12:30pm – 2:00pm | Students Lunch with the Experts | Anaheim Convention Center, Exhibit Hall B |
The student lunch with the Experts is especially worth it because you get a one-on-eight meeting with a physicist who works on a topic you are interested in. You also get a free lunch. Spaces are limited, so you need to sign up for it on the Sunday, and early if you want to get your choice of expert.
Generally speaking, food is very expensive in the convention center. Therefore, the more places you can get free food the better. There are networking events, some of which are aimed at students and some of which have free meals. Other good bets for free food include the receptions and business meetings. (With a business meeting you may have to first sit through a boring administrative meeting for an APS unit, but at least the DQI meeting will feature me talking about The Quantum Times.)
The next few sections highlight talks and sessions that involve people at Chapman. You may want to come to these not only to support local people, but also to find out more about areas of research that you might want to do undergraduate research projects in.
The following sessions are being chaired by Chapman faculty. The chair does not give a talk during the session, but acts as a host. But chairs usually work in the areas that the session is about, so it is a good way to get more of an overview of things they are interested in.
Day | Time | Chair | Session Title | Location |
Monday 17 | 11:30am – 1:54pm | Matt Leifer | Quantum Foundations: Bell Inequalities and Causality | Anaheim Convention Center, 256B (Level 2) |
Wednesday 19 | 8:00am – 10:48am | Andrew Jordan | Optimal Quantum Control | Anaheim Convention Center, 258A (Level 2) |
Wednesday 19 | 11:30am – 1:30pm | Bibek Bhandari | Explorations in Quantum Computing | Virtual Only, Room 1 |
The talks listed below all include at least one author who is currently affiliated with Chapman. The Chapman person is not necessarily the person giving the talk.
The people giving the talks, especially if they are students or postdocs, would appreciate your support. It is also a good way of finding out more about research that is going on at Chapman.
Time | Speaker | Title | Location |
9:36am – 9:48am | Irwin Huang | Beyond Single Photon Dissipation in Kerr Cat Qubits | Anaheim Convention Center, 161 (Level 1) |
9:48am – 10:00am | Bingcheng Qing | Benchmarking Single-Qubit Gates on a Noise-Biased Qubit: Kerr cat qubit | Anaheim Convention Center, 161 (Level 1) |
10:12am – 10:24am | Ahmed Hjar | Strong light-matter coupling to protect quantum information with Schrodinger cat states | Anaheim Convention Center, 161 (Level 1) |
10:24am – 10:36am | Bibek Bhandari | Decoherence in dynamically protected qubits | Anaheim Convention Center, 161 (Level 1) |
10:36am – 10:48am | Ke Wang | Control-Z two-qubit gate on 2D Kerr cats | Anaheim Convention Center, 161 (Level 1) |
4:12pm – 4:24pm | Adithi Ajith | Stabilizing two-qubit entanglement using stochastic path integral formalism | Anaheim Convention Center, 258A (Level 2) |
4:36pm – 4:48pm | Alok Nath Singh | Capturing an electron during a virtual transition via continuous measurement | Anaheim Convention Center, 252B (Level 2) |
Time | Speaker | Title | Location |
8:48am – 9:00am | Alexandria O Udenkwo | Characterizing the energy and efficiency of an entanglement fueled engine in a circuit QED processor | Anaheim Convention Center, 162 (Level 1) |
12:30pm – 12:42pm | Yile Ying | A review and analysis of six extended Wigner’s friend arguments | Anaheim Convention Center, 256B (Level 2) |
1:54pm – 2:06pm | Indrajit Sen | PT-symmetric axion electrodynamics: A pilot-wave approach | Anaheim Marriott, Platinum 1 |
3:48pm – 4:00pm | Chuanhong Liu | Planar Fluxonium Qubits Design with 4-way Coupling | Anaheim Convention Center, 162 (Level 1) |
4:36pm – 4:48pm | Robert Czupryniak | Reinforcement Learning Meets Quantum Control – Artificially Intelligent Maxwell’s Demon | Anaheim Convention Center, 258A (Level 2) |
Time | Speaker | Title | Location |
10:36am – 10:48am | Dominic Briseno-Colunga | Dynamical Sweet Spot Manifolds of Bichromatically Driven Floquet Qubits | Anaheim Convention Center, 162 (Level 1) |
2:30pm – 2:42pm | Sayani Ghosh | Equilibria and Effective Rates of Transition in Astromers | Anaheim Marriott, Platinum 7 |
3:00pm – 3:12pm | Matt Leifer | A Foundational Perspective on PT-Symmetric Quantum Theory | Anaheim Convention Center, 151 (Level 1) |
5:36pm – 5:48pm | Sacha Greenfield | A unified picture for quantum Zeno and anti-Zeno effects | Anaheim Convention Center, 161 (Level 1) |
Time | Speaker | Title | Location |
1:18pm – 1:30pm | Lucas Burns | Delayed Choice Lorentz Transformations on a Qubit | Anaheim Convention Center, 256B (Level 2) |
4:48pm – 5:00pm | Noah J Stevenson | Design of fluxonium coupling and readout via SQUID couplers | Anaheim Convention Center, 161 (Level 1) |
5:00pm – 5:12pm | Kagan Yanik | Flux-Pumped Symmetrically Threaded SQUID Josephson Parametric Amplifier | Anaheim Convention Center, 204C (Level 2) |
5:00pm – 5:12pm | Abhishek Chakraborty | Two-qubit gates for fluxonium qubits using a tunable coupler | Anaheim Convention Center, 161 (Level 1) |
Time | Speaker | Title | Location |
10:12am – 10:24am | Nooshin M. Estakhri | Distinct statistical properties of quantum two-photon backscattering | Anaheim Convention Center, 253A (Level 2) |
10:48am – 11:00am | Le Hu | Entanglement dynamics in collision models and all-to-all entangled states | Anaheim Hilton, San Simeon AB (Level 4) |
11:54am – 12:06pm | Luke Valerio | Optimal Design of Plasmonic Nanotweezers with Genetic Algorithm | Anaheim Convention Center, 253A (Level 2) |
Poster sessions last longer than talks, so you can view the posters at your leisure. The presenter is supposed to stand by their poster and talk to people who come to see it. The following posters are being presented by Chapman undergraduates. Please drop by and support them.
Poster Number | Presenter | Title |
267 | Ponthea Zahraii | Machine learning-assisted characterization of optical forces near gradient metasurfaces |
400 | Clara Hunt | What the white orchid can teach us about radiative cooling |
401 | Nathan Taormina | Optimizing Insulation and Geometrical Designs for Enhanced Sub-Ambient Radiative Cooling Efficiency |
These are sessions that reflect my own interests. It is a good bet that you will find me at one of these, unless I am teaching, or someone I know is speaking somewhere else. There are multiple sessions at the same time, but what I will typically do is select the one that has the most interesting looking talk at the time and switch sessions from time to time or take a break from sessions entirely if I get bored.
Time | Session Title | Location |
8:00am – 11:00am | Quantum Science and Technology at the National DOE Research Centers: Progress and Opportunities | Anaheim Convention Center, 158 (Level 1) |
8:00am – 11:00am | Learning and Benchmarking Quantum Channels | Anaheim Convention Center, 258A (Level 2) |
10:45am – 12:33pm | Beginners Guide to Quantum Gravity | Anaheim Marriott, Grand Ballroom Salon E |
11:30am – 1:54pm | Quantum Foundations: Bell Inequalities and Causality | Anaheim Convention Center, 256B (Level 2) |
1:30pm – 3:18pm | History and Physics of the Manhattan Project and the Bombings of Hiroshima and Nagasaki | Anaheim Marriott, Platinum 9 |
3:00pm – 6:00pm | DQI Thesis Award Session | Anaheim Convention Center, 158 (Level 1) |
Time | Session Title | Location |
8:30am – 10:18am | Forum on Outreach and Engagement of the Public Invited Session | Anaheim Marriott, Orange County Salon 1 |
10:45am – 12:33pm | Pais Prize Session | Anaheim Marriott, Platinum 2 |
11:30am – 2:30pm | Applied Quantum Foundations | Anaheim Convention Center, 256B (Level 2) |
1:30pm – 3:18pm | Mini-Symposium: Research Validated Assessments in Education | Anaheim Marriott, Grand Ballroom Salon D |
1:30pm – 3:18pm | Research in Quantum Mechanics Instruction | Anaheim Marriott, Orange County Salon 1 |
3:00pm – 5:24pm | Landauer-Bennett Award Prize Symposium | Anaheim Convention Center, 158 (Level 1) |
3:00pm – 6:00pm | Undergraduate and Graduate Education I | Anaheim Convention Center, 263A (Level 2) |
3:00pm – 6:00pm | Invited Session for the Forum on Outreach and Engagement of the Public | Anaheim Convention Center, 155 (Level 1) |
3:45pm – 5:33pm | Highlights from the Special Collections of AJP and TPT on Teaching About Quantum | Anaheim Marriott, Platinum 3 |
6:15pm – 9:00pm | DQI Business Meeting | Anaheim Convention Center, 160 (Level 1) |
Time | Session Title | Location |
11:30am – 2:30pm | Quantum Information: Thermodynamics out of Equilibrium | Anaheim Hilton, San Simeon AB (Level 4) |
3:00pm – 5:36pm | Quantum Foundations: Measurements, Contextuality, and Classicality | Anaheim Convention Center, 151 (Level 1) |
3:00pm – 6:00pm | Beyond Knabenphysik: Women in the History of Quantum Physics | Anaheim Convention Center, 154 (Level 1) |
Time | Session Title | Location |
8:00am – 10:48am | Undergraduate Education | Anaheim Convention Center, 263A (Level 2) |
8:00am – 11:00am | Open Quantum Systems and Many-Body Dynamics | Anaheim Hilton, San Simeon AB (Level 4) |
11:30am – 2:30pm | Time in Quantum Mechanics and Thermodynamics | Anaheim Hilton, California C (Ballroom Level) |
11:30am – 2:30pm | Intersections of Quantum Science and Society | Anaheim Convention Center, 159 (Level 1) |
11:30am – 2:18pm | Quantum Foundations: Relativity, Gravity, and Geometry | Anaheim Convention Center, 256B (Level 2) |
3:00pm – 6:00pm | The Early History of Quantum Information Physics | Anaheim Convention Center, 154 (Level 1) |
3:00pm – 6:00pm | Quantum Thermalization: Understanding the Dynamical Foundation of Quantum Thermodynamics | Anaheim Hilton, California A (Ballroom Level) |
Time | Session Title | Location |
8:00am – 11:00am | Structures in Quantum Systems | Anaheim Convention Center, 258A (Level 2) |
8:00am – 10:24am | Science Communication in an Age of Misinformation and Disinformation | Anaheim Convention Center, 156 (Level 1) |
It is worthwhile to spend some time in the exhibit hall. It features a Careers Fair and a Grad School Fair, which will be larger and more relevant to physics students than other such fairs you might attend in the area.
But, of course, the main purpose of going to the exhibit hall is to acquire SWAG. Some free items I have obtained from past APS exhibit halls include:
I recommend going when the hall first opens to get the highest quality SWAG.
Other fun stuff to do at this year’s meeting includes:
This week’s lectures on instantons in my gauge theory class (a very important kind of theory for understanding many phenomena in nature – light is an example of a phenomenon described by a gauge theory) were a lot of fun to do, and mark the culmination of a month-long … Click to continue reading this post
The post Valuable Instants appeared first on Asymptotia.
I had this neat calculation in my drawer, and on the occasion of quantum mechanics’ 100th birthday in 2025 I decided to submit a talk about it to the March Meeting of the DPG, the German Physical Society, in Göttingen. And to have something to show, I put it out on the arXiv today. The idea is as follows:
The GHZ experiment is a beautiful version of Bell's inequality that demonstrates that you reach wrong conclusions when you assume that a property of a quantum system has to have some (unknown) value even when you don't measure it. I would say it shows that quantum theory is not realistic, in the sense that unmeasured properties do not have secret values (different, for example, from classical statistical mechanics, where you could imagine actually measuring the exact position of molecule number 2342 in your container of gas). For details, see the paper or this beautiful explanation by Coleman. I should mention here that there is another way out: assuming some non-local forces that conspire to make the result come out right nevertheless.
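For concreteness, here is the textbook algebra behind that contradiction (a standard derivation, not anything specific to the paper). With

\[
|\mathrm{GHZ}\rangle = \tfrac{1}{\sqrt{2}}\big(|000\rangle + |111\rangle\big),
\]

one checks that

\[
\sigma_x\otimes\sigma_x\otimes\sigma_x\,|\mathrm{GHZ}\rangle = +|\mathrm{GHZ}\rangle,
\qquad
\sigma_x\otimes\sigma_y\otimes\sigma_y\,|\mathrm{GHZ}\rangle = -|\mathrm{GHZ}\rangle,
\]

and likewise for the other two permutations of one \(\sigma_x\) and two \(\sigma_y\)'s. If each spin carried pre-existing values \(x_i, y_i \in \{\pm 1\}\), then

\[
(x_1 y_2 y_3)(y_1 x_2 y_3)(y_1 y_2 x_3) = x_1 x_2 x_3,
\]

so the three certain \(-1\) outcomes would force \(x_1 x_2 x_3 = -1\), contradicting the certain \(+1\) outcome for \(\sigma_x\otimes\sigma_x\otimes\sigma_x\).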
On the other hand, there is Bohmian mechanics. This is well known to be a non-local theory (the time evolution of its particles depends on the positions of all the other particles in the system, or even the universe), but what I find more interesting is that it is also realistic: there, it is claimed that all that matters are particle positions (including the positions of pointers on your measurement devices, which you might interpret as showing something other than positions, for example velocities or field strengths or whatever), and those all have (possibly unknown) values at all times, even if you don't measure them.
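To make the non-locality explicit, here is the standard guidance equation for \(N\) spinless particles (the usual textbook form): the velocity of particle \(k\) at time \(t\) depends on the simultaneous positions of all the other particles through the wave function,

\[
\frac{dQ_k}{dt} = \frac{\hbar}{m_k}\,\operatorname{Im}\frac{\nabla_k \Psi(Q_1,\dots,Q_N,t)}{\Psi(Q_1,\dots,Q_N,t)},
\qquad
i\hbar\,\partial_t\Psi = H\Psi .
\]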
So how can the two be brought together? There might be an obstacle in the fact that GHZ is usually presented as a correlation of spins, and in the Bohmian literature spins are not really positions; you always have to use some Stern-Gerlach experiment to translate them into actual positions. But we can circumvent this from the other direction: we don't really need spins, we just need observables with the commutation relations of the Pauli matrices. You might think those cannot be realised with position measurements, since positions always commute, but this is only true if you do the position measurements at equal times. If you wait between them, you can in fact obtain almost Pauli-type operators.
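A quick way to see that positions at different times need not commute is the free particle in the Heisenberg picture (simpler than, but in the same spirit as, the boxed particles used below):

\[
x(t) = x(0) + \frac{p(0)\,t}{m}
\quad\Longrightarrow\quad
[x(0),\,x(t)] = \frac{t}{m}\,[x(0),\,p(0)] = \frac{i\hbar t}{m} \neq 0 .
\]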
So we can set up a GHZ experiment in terms of three particles in three boxes: for each particle you measure whether it is in the left or the right half of its box, but for each particle you decide whether you do this at time 0 or at a later moment. You can look at the correlation of the three measurement outcomes as a function of that later time (of course, since you measure different particles, the actual measurements still commute, independent of time), and what you find is the blue line in the figure.
You can also (numerically) solve the Bohmian equation of motion and compute the expectation of the correlation of the positions of the three particles at different times, which gives the orange line: clearly something else. No surprise: the realistic theory cannot predict the outcome of an experiment that demonstrates that quantum theory is not realistic. And the non-local character of the evolution equation does not help either.
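For readers who want to play with this, here is a minimal toy sketch of the Bohmian side of such a computation: one particle in a 1D box with \(\hbar = m = 1\), prepared in an equal superposition of the two lowest eigenstates, with trajectories obtained by integrating the guidance equation and the position correlation between time 0 and a later time estimated along those trajectories. This only illustrates the type of calculation; the actual computation in the paper involves three entangled particles in three boxes.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy sketch: Bohmian trajectories for ONE particle in a 1D box of width L = 1
# (units hbar = m = 1), prepared in an equal superposition of the two lowest
# eigenstates. Not the paper's three-particle GHZ setup.

L = 1.0
E = lambda n: (n * np.pi) ** 2 / 2.0                  # box eigenenergies

def psi(x, t):
    phi = lambda n: np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)
    return (phi(1) * np.exp(-1j * E(1) * t)
            + phi(2) * np.exp(-1j * E(2) * t)) / np.sqrt(2.0)

def velocity(t, x, h=1e-6):
    # Guidance equation v = Im(psi'/psi), with psi' from a central difference.
    dpsi = (psi(x + h, t) - psi(x - h, t)) / (2.0 * h)
    return np.imag(dpsi / psi(x, t))

# Draw initial positions from |psi(x, 0)|^2 by rejection sampling.
rng = np.random.default_rng(0)
def sample_x0(n, envelope=4.0):                        # |psi|^2 <= 4 on [0, L]
    out = []
    while len(out) < n:
        x = rng.uniform(0.0, L)
        if rng.uniform(0.0, envelope) < np.abs(psi(x, 0.0)) ** 2:
            out.append(x)
    return np.array(out)

x0 = sample_x0(500)
t_final = 0.2
xt = np.array([
    solve_ivp(velocity, (0.0, t_final), [x], rtol=1e-8, atol=1e-10).y[0, -1]
    for x in x0
])

# Multi-time position correlation along the Bohmian trajectories.
print("Bohmian <x(0) x(t)> ≈", np.mean(x0 * xt))
```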
To save the Bohmian theory, one can in fact argue that I have computed the wrong thing: after measuring the position of one particle at time 0, that is, by letting it interact with a measuring device, the future time evolution of all particles is affected, and one should compute the correlation with the corrected (effectively collapsed) wave function. That, however, I cannot do, and I claim it is impossible, since it would depend on the details of how the first particle's position is actually measured (whereas the orthodox prediction above is independent of those details, as those interactions commute with the later observations). In any case, at least my interpretation is that if you don't want to predict the correlation wrongly, the best you can do is to say that you cannot do the calculation, as it depends on unknown details (even though the result, of course, shouldn't).
Finally, the standard argument for why Bohmian mechanics is indistinguishable from more conventional treatments is that all that matters are position correlations, and since those are given by psi-squared they are the same in all approaches. But I show that this is not the case for these multi-time correlations.
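The precise statement behind that standard argument is equivariance: if the initial configuration is distributed according to \(|\Psi(\cdot,0)|^2\), then at every single time \(t\)

\[
\operatorname{Prob}\big(Q_t \in A\big) = \int_A |\Psi(x,t)|^2\,dx .
\]

This fixes the one-time marginal distributions, but by itself it says nothing about joint distributions involving two different times, such as \(E\big[f(Q_0)\,g(Q_t)\big]\), which is exactly where the multi-time correlations above can deviate from the quantum prediction for sequential measurements.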
Postscript: What happens when you try to discuss physics with a philosopher: