Planet Musings

February 17, 2019

John Baez: Climeworks

This article describes some recent work on ‘direct air capture’ of carbon dioxide—essentially, sucking it out of the air:

• Jon Gertner, The tiny Swiss company that thinks it can help stop climate change, New York Times Magazine, 12 February 2019.

There’s a Swiss company called Climeworks that’s built machines that do this—shown in the picture above. So far they are using these machines for purposes other than reducing atmospheric CO2 concentrations: namely, making carbonated water for soft drinks, and getting greenhouses to have lots of carbon dioxide in the air, for tastier vegetables. And they’re just experimental, not economically viable yet:

The company is not turning a profit. To build and install the 18 units at Hinwil, hand-assembled in a second-floor workshop in Zurich, cost between $3 million and $4 million, which is the primary reason it costs the firm between $500 and $600 to remove a metric ton of CO₂ from the air. Even as the company has attracted about $50 million in private investments and grants, it faces the same daunting task that confronted Carl Bosch a century ago: How much can it bring costs down? And how fast can it scale up?

If they ever make it in these markets, greenhouses and carbonation might want 6 megatonnes of CO₂ annually. This is nothing compared to the 37 gigatonnes of CO₂ that we put into the atmosphere in 2018. In principle the technology Climeworks is using could be massively scaled up. After all, Napoleon III used aluminum silverware, back when aluminum was more precious than gold… and only later did the technology for making aluminum improve to the point where the metal gained a mass market.
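For scale, here is a quick back-of-envelope calculation with the figures quoted in this post (Python used purely as a calculator):

```python
# Figures quoted above: ~6 Mt/yr of potential demand from greenhouses
# and carbonation, 37 Gt of CO2 emitted in 2018, and $500-600 per
# tonne of capture (midpoint used below).
market_demand = 6e6          # tonnes CO2 per year
annual_emissions = 37e9      # tonnes CO2 emitted in 2018
cost_per_tonne = 550.0       # USD, midpoint of the quoted $500-600

share = market_demand / annual_emissions
total_cost = cost_per_tonne * annual_emissions

print(f"market demand is {share:.4%} of annual emissions")
print(f"capturing 2018's emissions at current cost: ${total_cost/1e12:.1f} trillion")
```

So the soda-and-greenhouse market is under two hundredths of a percent of annual emissions, and capturing everything at today's cost would run to roughly twenty trillion dollars a year.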

But can Climeworks’ technology actually be scaled up? Some are dubious:

M.I.T.’s Howard Herzog, for instance, an engineer who has spent years looking at the potential for these machines, told me that he thinks the costs will remain between $600 and $1,000 per metric ton. Some of Herzog’s reasons for skepticism are highly technical and relate to the physics of separating gases. Some are more easily grasped. He points out that because direct-air-capture machines have to move tremendous amounts of air through a filter or solution to glean a ton of CO₂ — the gas, for all its global impact, makes up only about 0.04 percent of our atmosphere — the process necessitates large expenditures for energy and big equipment. What he has likewise observed, in analyzing similar industries that separate gases, suggests that translating spreadsheet projections for capturing CO₂ into real-world applications will reveal hidden costs. “I think there has been a lot of hype about this, and it’s not going to revolutionize anything,” he told me, adding that he thinks other negative-emissions technologies will prove cheaper. “At best it’s going to be a bit player.”

What actually is the technology Climeworks is using? And what other technologies are available for sucking carbon dioxide out of the air—or out of the exhaust from fossil-fuel-burning power plants, or out of water?
I’ll have a lot more to say about the latter question in future articles. As for Climeworks, they describe their technology rather briefly here:

• Climeworks, Our technology.

They write:

Our plants capture atmospheric carbon with a filter. Air is drawn into the plant and the CO2 within the air is chemically bound to the filter.

Once the filter is saturated with CO2 it is heated (using mainly low-grade heat as an energy source) to around 100 °C (212 °F). The CO2 is then released from the filter and collected as concentrated CO2 gas to supply to customers or for negative emissions technologies.

CO2-free air is released back into the atmosphere. This continuous cycle is then ready to start again. The filter is reused many times and lasts for several thousand cycles.

What is the filter material?

The filter material is made of porous granulates modified with amines, which bind the CO2 in conjunction with the moisture in the air. This bond is dissolved at temperatures of 100 °C.

So, it seems their technology is an example of ‘amine gas treating’:

• Wikipedia, Amine gas treating.

In future posts I’ll talk a bit more about amine gas treating, but also other methods for absorbing carbon dioxide from air or from solution in water. Maybe you can help me figure out what’s the best method!

February 16, 2019

David Hogg: candidate Williamson

Today Marc Williamson (NYU) passed (beautifully, I might say) his PhD Candidacy exam. He is working on the progenitors of core-collapse supernovae, making inferences from post-peak-brightness spectroscopy. He has a number of absolutely excellent results. One is (duh!) that the supernovae types seem to form a continuum, which makes perfect sense, given that we think they come from a continuous process of envelope loss. Another is that the best time to type a supernova with spectroscopy is 10-15 days after maximum light. That's new! His work is based on the kind of machine-learning I love: Linear models and linear support vector machines. I love them because they are convex, (relatively) interpretable, and easy to visualize and check.
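As a toy illustration of that appeal (synthetic data, with an ordinary least-squares classifier standing in for a linear SVM; this is not Williamson's data or pipeline): the fit is one convex problem with a unique solution, and the learned weight vector can be read off directly.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for two supernova "types" in a 5-feature space.
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)),
               rng.normal(2.0, 1.0, (50, 5))])
y = np.array([-1.0] * 50 + [1.0] * 50)

# Least-squares linear classifier: convex, closed-form, and the weight
# vector w is directly interpretable (one weight per feature).
Xb = np.hstack([X, np.ones((100, 1))])     # append a bias column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
accuracy = (np.sign(Xb @ w) == y).mean()
print(accuracy)   # well-separated classes -> high training accuracy
```

Nothing here is hidden inside an optimizer: checking the model means looking at six numbers.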

One amusing idea that came up is that if the stripped supernova types were not in a continuum, but really distinct types, then it might get really hard to explain. Like really hard. So I proposed that it could be a technosignature! That's a NASA neologism, but you can guess what it means. I discussed this more late in the day with Soledad Villar (NYU) and Adrian Price-Whelan (NYU), and together we came up with ideas about wisdom signatures and foolishness signatures. See twitter for more.

Also with Villar I worked out a very simple toy problem to think about GANs: Have the data be two-d vectors drawn from a trivial distribution (like a 2-d Gaussian) and have the generator take a one-d gaussian draw and transform it into fake data. We were able to make a strong prediction about how the transform from the one-d to the two-d should look in the generator.
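That toy problem might be set up like the following sketch (my own illustration of the setup described, with an arbitrary linear generator; not Hogg and Villar's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: draws from a trivial 2-d Gaussian.
real = rng.standard_normal((1000, 2))

# Generator: a 1-d Gaussian draw pushed through a linear map into 2-d.
# (The weights here are arbitrary placeholders.)
A = np.array([[1.0], [0.5]])          # hypothetical 2x1 generator matrix
z = rng.standard_normal((1000, 1))    # latent draws
fake = z @ A.T                        # fake 2-d data

# With a 1-d latent space and a linear generator, the fakes are confined
# to a line: their 2x2 covariance is rank one, while the real data's
# covariance is full rank. A working generator must therefore transform
# the 1-d input nonlinearly to fill out the 2-d target distribution.
print(np.linalg.matrix_rank(np.cov(fake.T)),
      np.linalg.matrix_rank(np.cov(real.T)))
```

The rank gap is the simplest version of the prediction about how the 1-d-to-2-d transform must look.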

February 15, 2019

Matt von Hippel: Valentine’s Day Physics Poem 2019

It’s that time of year again! Time for me to dig in to my files and bring you yet another of my old physics poems.

Plagued with Divergences

“The whole scheme of local field theory is plagued with divergences”

Is divergence ever really unexpected?

If you asked a computer, what would it tell you?

You’d hear a whirring first, lungs and heart of the machine beating faster and faster.

And you’d dismiss it.
You knew this wasn’t going to be an easy interaction.
It doesn’t mean you’re going to diverge.

And perhaps it would try to warn you, write it there on the page.
It might even notice, its built-in instincts telling you, by the book,
“This will diverge.”

But instincts lie, and builders cheat.
And it doesn’t mean you’re going to diverge.

Now, you do everything the slow way,
You need a different answer.
Dismiss your instincts and force yourself through
Piece by piece.

And now, you can’t stop hearing the whir
The machine’s beating heart
Even when it should be at rest

And step by step, it tries to minimize its errors
And step by step, the errors grow

And exhausted, in the end, you see splashed across the screen
Something bigger than it should ever have been.

But sometimes things feel big and strange.
That’s just the way of the big wide world.
And it doesn’t mean you’re going to diverge.

You could have seen the signs,
Power-counted, seen what could overwhelm.
And you could have regulated, with an epsilon of flexibility.

But this one, this time, was supposed to be
Needed to be
Physical Truth
And truth doesn’t diverge

So you keep going,
Wheezing breath and painstaking calculation,
And every little thing blowing up

It’s not like there’s a better way to live.

Jacques Distler: Brotli

I finally got around to enabling Brotli compression on Golem. Reading the manual, I came across the BrotliAlterETag directive:

Description: How the outgoing ETag header should be modified during compression
Syntax: BrotliAlterETag AddSuffix|NoChange|Remove

with the descriptions of the three options:

AddSuffix: Append the compression method onto the end of the ETag, causing compressed and uncompressed representations to have unique ETags. In another dynamic compression module, mod_deflate, this has been the default since 2.4.0. This setting prevents serving “HTTP Not Modified (304)” responses to conditional requests for compressed content.

NoChange: Don’t change the ETag on a compressed response. In another dynamic compression module, mod_deflate, this was the default prior to 2.4.0. This setting does not satisfy the HTTP/1.1 property that all representations of the same resource have unique ETags.

Remove: Remove the ETag header from compressed responses. This prevents some conditional requests from being possible, but avoids the shortcomings of the preceding options.

Sure enough, it turns out that ETags+compression have been completely broken in Apache 2.4.x. Two methods for saving bandwidth, and delivering pages faster, cancel each other out and chew up more bandwidth than if one or the other were disabled.

To unpack this a little further, the first time your browser requests a page, Apache computes a hash of the page and sends that along as a header in the response

etag: "38f7-56d65f4a2fcc0"

When your browser requests the page again, it sends an

If-None-Match: "38f7-56d65f4a2fcc0"

header in the request. If that matches the hash of the page, Apache sends a “HTTP Not Modified (304)” response, telling your browser the page is unchanged from the last time it requested it.

If the page is compressed, using mod_deflate, then the header Apache sends is slightly different

etag: "38f7-56d65f4a2fcc0-gzip"

So, when your browser sends its request with an

If-None-Match: "38f7-56d65f4a2fcc0-gzip"

header, Apache compares “38f7-56d65f4a2fcc0-gzip” with the hash of the page, concludes that they don’t match, and sends the whole page again (thus wasting all the bandwidth you originally saved by sending the page compressed).

This is completely brain-dead. And, even though the problem has been around for years, the Apache folks don’t seem to have gotten around to fixing it. Instead, they just replicated the problem in mod_brotli (with a “-br” suffix replacing “-gzip”).
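Here is a toy model of that broken comparison (my sketch, not Apache's actual logic):

```python
# The server's conditional-request check, reduced to its essence:
# compare the If-None-Match header against the hash-derived ETag.
def serves_304(page_hash: str, if_none_match: str) -> bool:
    return if_none_match == f'"{page_hash}"'

# The browser faithfully echoes back the *suffixed* ETag it was given,
# so the naive string comparison never matches and the full page is
# re-sent (a 200 instead of a 304).
print(serves_304("38f7-56d65f4a2fcc0", '"38f7-56d65f4a2fcc0-gzip"'))  # False
print(serves_304("38f7-56d65f4a2fcc0", '"38f7-56d65f4a2fcc0"'))      # True
```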

The solution is drop-dead simple. Add the line

RequestHeader edit "If-None-Match" '^"((.*)-(gzip|br))"$' '"$1", "$2"'

to your Apache configuration file. This gives Apache two ETags to compare with: the one with the suffix and the original unmodified one. The latter will match the hash of the file and Apache will return a “HTTP Not Modified (304)” as expected.
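You can sanity-check what that directive does by running the same regex in Python (a sketch of the header rewrite, not Apache internals):

```python
import re

# Same pattern as the RequestHeader edit line above: group 1 is the
# suffixed ETag, group 2 the original unsuffixed one.
pattern = re.compile(r'^"((.*)-(gzip|br))"$')

def rewrite_if_none_match(value: str) -> str:
    m = pattern.match(value)
    if m is None:
        return value                      # no compression suffix: untouched
    return f'"{m.group(1)}", "{m.group(2)}"'

print(rewrite_if_none_match('"38f7-56d65f4a2fcc0-gzip"'))
# "38f7-56d65f4a2fcc0-gzip", "38f7-56d65f4a2fcc0"
```

The rewritten header now carries both ETags, so whichever one Apache compares against the page hash, the conditional request succeeds.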

Why Apache didn’t just implement this in their code is beyond me.

Backreaction: Dark Matter – Or What?

Yesterday I gave a colloq about my work with Tobias Mistele on superfluid dark matter. Since several people asked for the slides, I have uploaded them to slideshare. You can also find the pdf here. I previously wrote about our research here and here. All my papers are openly available on the arXiv.

February 14, 2019

Backreaction: When gravity breaks down

Einstein’s theory of general relativity is more than a hundred years old, but it still gives physicists headaches. Not only are Einstein’s equations hideously difficult to solve, they also clash with physicists’ other most-cherished achievement, quantum theory. Problem is, particles have quantum properties. They can, for example, be in two places at once. These particles also…

David Hogg: truly theoretical work on growth of structure

The highlight of my day was a great NYU CCPP Brown-Bag talk by Mikhail Ivanov (NYU) about the one-point pdf of dark-matter density in the Universe, using a modified spherical-collapse model, based on things in this paper. It turns out that you can do a very good job of predicting counts-in-cells or equivalent one-point functions for the dark-matter density by considering the relationship between the linear theory and a non-linearity related to the calculable non-linearity you can work out in spherical collapse. More specifically, his approach is to expand the perturbations in the neighborhood of a point into a monopole term and a sum of radial functions times spherical harmonics. The monopole term acts like spherical collapse and the higher harmonics lead to a multiplicative correction. The whole framework depends on some mathematical properties of gravitational collapse that Ivanov can't prove but seem to be true in simulations. The theory is non-perturbative in the sense that it goes well into non-linear scales, and does well. That's some impressive theory, and it was a beautiful talk.

David Hogg: How does the disk work?

Jason Hunt (Toronto) was in town today to discuss things dynamical. We discussed various things. I described to him the MySpace project in which Price-Whelan (Princeton) and I are trying to make a data-driven classification of kinematic structures in the thin disk. He described a project in which he is trying to build consistent dynamical models of these structures. He finds that there is no trivial explanation of all the visible structure; probably multiple things are at work. But his models do look very similar to the data qualitatively, so it sure is promising.

John Preskill: Why care about physics that doesn’t care about us?

A polar vortex had descended on Chicago.

I was preparing to fly in, scheduled to present a seminar at the University of Chicago. My boyfriend warned, from Massachusetts, that the wind chill effectively lowered the temperature to -50 degrees F. I’d last encountered -50 degrees F in the short story “To Build a Fire,” by Jack London. Spoiler alert: The protagonist fails to build a fire and freezes to death.

The story exemplifies naturalism, according to my 11th-grade English class. The naturalist movement infiltrated American literature and art during the late 19th century. Naturalists portrayed nature as harsh and indifferent: The winter doesn’t care if Jack London’s protagonist dies.

The protagonist lingered in my mind as my plane took off. I was flying into a polar vortex for physics, the study of nature. Physics doesn’t care about me. How can I care so much about physics? How can humans generally?

Peeling apart that question, I found more layers than I’d packed for protection against the polar vortex.


Intellectualism formed the parka of the answer: You can’t hug space, time, information, energy, and the nature of reality. You can’t smile at them and watch them smile back. But their abstractness doesn’t block me from engaging with them; it attracts me. Ideas attract me; their purity does. Physics consists partially of a framework of ideas—of mathematical models, of theorems and examples, of the hypotheses and plots and revisions that underlie a theory.

The framework of physics needs construction. Some people compose songs; some build businesses; others bake soufflés; I used to do arts and crafts. Many humans create—envision, shape, mold, and coordinate—with many different materials. Theoretical physics overflows with materials and with opportunities to create. As humans love to create, we can love physics. Theoretical-physics materials consist of ideas, which might sound less suited to construction than paint does. But painters glob mixtures of water, resin, acrylic, and pigment onto woven fabric. Why shouldn’t ideas appeal as much as resin does? I build worlds in my head for a living. Doesn’t that sound romantic?

Painters derive joy from painting; dancers derive joy from moving; physics offers outlets for many skills. Doing physics, I use math. I learn history: What paradoxes about quantum theory did Albert Einstein pose to Niels Bohr? I write papers and blog posts, and I present seminars and colloquia. I’ve moonlighted as a chemist, studied biology, dipped into computer science, and sought to improve engineering. Beyond these disciplines, physics requires uniquely physical skills: the identification of questions about the natural world, the translation of those questions into math, and the translation of mathematical results into statements about the natural world. In college, I hated having to choose a major because I wanted to study everything. Physics lets me.


My attraction to physics worried me in college. Jim Yong Kim became Dartmouth’s president in my junior year. Jim, who left to helm the World Bank, specializes in global health. He insisted that “the world’s troubles are your troubles,” quoting former Dartmouth president John Sloan Dickey. I was developing a specialization in quantum information theory. I wasn’t trying to contain ebola, mitigate droughts, or eradicate Alzheimer’s disease. Should I not have been trying to save the world?

I could help save the world, a mentor said, through theoretical physics.1 Society needs a few people to develop art, a few to write music, a few to curate history, and a few to study the nature of the universe. Such outliers help keep us human, and the reinforcement of humanity helps save the world. You may indulge in physics, my mentor said, because physics affords the opportunity to do good. If I appreciate that opportunity, how can I not appreciate physics?

The opportunity to do good has endeared physics to me more as I’ve advanced. The more I advance, the fewer women I see. According to the American Physical Society (APS), in 2017, women received about 21% of the physics Bachelor’s degrees awarded in the U.S. Women received about 18% of the doctorates. In 2010, women numbered 8% of the full professors in U.S. departments that offered Bachelor’s or higher degrees in physics. The APS is conducting studies, coordinating workshops, and offering grants to improve the gender ratio. Departments, teachers, and mentors are helping. They have my gratitude. Yet they can accomplish only so much, especially since many are men. They can encourage women to change the gender ratio; they can’t change the ratio directly. Only women can, and few women are undertaking the task. Physics affords an opportunity to do good—to improve a field’s climate, to mentor, and to combat stereotypes—that few people can tackle. For that opportunity, I’m grateful to physics.


Physics lifts us beyond the Platonic realm of ideas in two other ways. At Caltech, I once ate lunch with Charlie Marcus. Marcus is a Microsoft researcher and a professor of physics at the University of Copenhagen’s Niels Bohr Institute. His lab is developing topological quantum computers, in which calculations manifest as braids. Why, I asked, does quantum computing deserve a large chunk of Marcus’s life?

Two reasons, he replied. First, quantum computing straddles the border between foundational physics and applications. Quantum science satisfies the intellect but doesn’t tether us to esoterica. Our science could impact technology, industry and society. Second, the people. Quantum computing has a community steeped in congeniality.

Marcus’s response delighted me: His reasons for caring about quantum computing coincided with two of mine. Reason two has expanded, in my mind, to opportunities for engagement with people. Abstractions attract me partially because intellectualism runs in my family. I grew up surrounded by readers, encouraged to ask questions. Physics enables me to participate in a family tradition and to extend that tradition to the cosmos. My parents now ask me the questions—about black holes and about whether I’m staying warm in Chicago.

Beyond family, physics enables me to engage with you. This blog has connected me to undergraduates, artists, authors, computer programmers, science teachers, and museum directors across the world. Scientific outreach inspires reading, research, art, and the joy of learning. I love those outcomes, participating in them, and engaging with you.


Why fly into a polar vortex for the study of nature—why care about physics that can’t care about us? In my case, primarily because of the ideas, the abstraction, and the chances to create and learn. Partially for the chance to help save the world through humanness, outreach, and a gender balance. Partially for the chance to impact technology, and partially to connect with people: Physics can strengthen ties to family and can introduce you to individuals across the globe. And physics can—heck, tomorrow is February 14th—lead you to someone who cares enough to track Chicago’s weather from Cambridge.


1I’m grateful that Jim Kim, too, encouraged me to pursue theoretical physics.

February 13, 2019

John Baez: Exploring New Technologies

I’ve got some good news! I’ve been hired by Bryan Johnson to help evaluate and explain the potential of various technologies to address the problem of climate change.

Johnson is an entrepreneur who sold his company Braintree for $800M and started the OS Fund in 2014, seeding it with $100M to invest in the hard sciences so that we can move closer towards becoming proficient system administrators of our planet: engineering atoms, molecules, organisms and complex systems. The fund has invested in many companies working on synthetic biology, genetics, new materials, and so on. Here are some writeups he’s done on these companies.

As part of my research I’ll be blogging about some new technologies, asking questions and hoping experts can help me out. Stay tuned!

Terence Tao: 255B, Notes 2: Onsager’s conjecture

We consider the incompressible Euler equations on the (Eulerian) torus {\mathbf{T}_E := ({\bf R}/{\bf Z})^d}, which we write in divergence form as

\displaystyle  \partial_t u^i + \partial_j(u^j u^i) = - \eta^{ij} \partial_j p \ \ \ \ \ (1)

\displaystyle  \partial_i u^i = 0, \ \ \ \ \ (2)

where {\eta^{ij}} is the (inverse) Euclidean metric. Here we use the summation conventions for indices such as {i,j,l} (reserving the symbol {k} for other purposes), and are retaining the convention from Notes 1 of denoting vector fields using superscripted indices rather than subscripted indices, as we will eventually need to change variables to Lagrangian coordinates at some point. In principle, much of the discussion in this set of notes (particularly regarding the positive direction of Onsager’s conjecture) could also be modified to treat non-periodic solutions that decay at infinity if desired, but some non-trivial technical issues do arise in non-periodic settings for the negative direction.

As noted previously, the kinetic energy

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} |u(t,x)|^2\ dx = \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} u^i(t,x) u^j(t,x)\ dx

is formally conserved by the flow, where {\eta_{ij}} is the Euclidean metric. Indeed, if one assumes that {u,p} are continuously differentiable in both space and time on {[0,T] \times \mathbf{T}_E}, then one can multiply the equation (1) by {u^l} and contract against {\eta_{il}} to obtain

\displaystyle  \eta_{il} u^l \partial_t u^i + \eta_{il} u^l \partial_j (u^j u^i) = - \eta_{il} u^l \eta^{ij} \partial_j p = - u^j \partial_j p

which rearranges using (2) and the product rule to

\displaystyle  \partial_t (\frac{1}{2} \eta_{ij} u^i u^j) + \partial_j( \frac{1}{2} \eta_{il} u^i u^j u^l ) + \partial_j (u^j p) = 0

and then if one integrates this identity on {[0,T] \times \mathbf{T}_E} and uses Stokes’ theorem, one obtains the required energy conservation law

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} u^i(T,x) u^j(T,x)\ dx = \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} u^i(0,x) u^j(0,x)\ dx. \ \ \ \ \ (3)
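To spell out the rearrangement invoked above (a detail added here for convenience): since {\partial_j u^j = 0}, the product rule gives

\displaystyle  \eta_{il} u^l \partial_j (u^j u^i) = \eta_{il} u^l u^j \partial_j u^i = \partial_j( \frac{1}{2} \eta_{il} u^i u^j u^l )

and similarly {u^j \partial_j p = \partial_j (u^j p)}, so the flux terms are total spatial derivatives that integrate to zero over the torus by Stokes’ theorem, leaving only the time-boundary terms in (3).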

It is then natural to ask whether the energy conservation law continues to hold for lower regularity solutions, in particular weak solutions that only obey (1), (2) in a distributional sense. The above argument no longer works as stated, because {u^i} is not a test function and so one cannot immediately integrate (1) against {u^i}. And indeed, as we shall soon see, it is now known that once the regularity of {u} is low enough, energy can “escape to frequency infinity”, leading to failure of the energy conservation law, a phenomenon known in physics as anomalous energy dissipation.

But what is the precise level of regularity needed in order for this anomalous energy dissipation to occur? To make this question precise, we need a quantitative notion of regularity. One such measure is given by the Hölder space {C^{0,\alpha}(\mathbf{T}_E \rightarrow {\bf R})} for {0 < \alpha < 1}, defined as the space of continuous functions {f: \mathbf{T}_E \rightarrow {\bf R}} whose norm

\displaystyle  \| f \|_{C^{0,\alpha}(\mathbf{T}_E \rightarrow {\bf R})} := \sup_{x \in \mathbf{T}_E} |f(x)| + \sup_{x,y \in \mathbf{T}_E: x \neq y} \frac{|f(x)-f(y)|}{|x-y|^\alpha}

is finite. The space {C^{0,\alpha}} lies between the space {C^0} of continuous functions and the space {C^1} of continuously differentiable functions, and informally describes a space of functions that is “{\alpha} times differentiable” in some sense. The above derivation of the energy conservation law involved the integral

\displaystyle  \int_{\mathbf{T}_E} \eta_{il} u^l \partial_j (u^j u^i)\ dx

that roughly speaking measures the fluctuation in energy. Informally, if we could take the derivative in this integrand and somehow “integrate by parts” to split the derivative “equally” amongst the three factors, one would morally arrive at an expression that resembles

\displaystyle  \int_{\mathbf{T}} \nabla^{1/3} u \nabla^{1/3} u \nabla^{1/3} u\ dx

which suggests that the integral can be made sense of for {u \in C^0_t C^{0,\alpha}_x} once {\alpha > 1/3}. More precisely, one can make the following conjecture.

Conjecture 1 (Onsager’s conjecture) Let {0 < \alpha < 1} and {d \geq 2}, and let {0 < T < \infty}.

  • (i) If {\alpha > 1/3}, then any weak solution {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T} \rightarrow {\bf R})} to the Euler equations (in the Leray form {\partial_t u + \partial_j {\mathbb P} (u^j u) = u_0(x) \delta_0(t)}) obeys the energy conservation law (3).
  • (ii) If {\alpha \leq 1/3}, then there exist weak solutions {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T} \rightarrow {\bf R})} to the Euler equations (in Leray form) which do not obey energy conservation.

This conjecture was originally arrived at by Onsager by a somewhat different heuristic derivation; see Remark 7. The numerology is also compatible with that arising from the Kolmogorov theory of turbulence (discussed in this previous post), but we will not discuss this interesting connection further here.

The positive part (i) of the Onsager conjecture was established by Constantin, E, and Titi, building upon earlier partial results by Eyink; the proof is a relatively straightforward application of Littlewood-Paley theory, and they were also able to work in larger function spaces than {C^0_t C^{0,\alpha}_x} (using {L^3_x}-based Besov spaces instead of Hölder spaces, see Exercise 3 below). The negative part (ii) is harder. Discontinuous weak solutions to the Euler equations that did not conserve energy were first constructed by Scheffer, with an alternate construction later given by Shnirelman. De Lellis and Szekelyhidi noticed the resemblance of this problem to that of the Nash-Kuiper theorem in the isometric embedding problem, and began adapting the convex integration technique used in that theorem to construct weak solutions of the Euler equations. This began a long series of papers in which increasingly regular weak solutions that failed to conserve energy were constructed, culminating in a recent paper of Isett establishing part (ii) of the Onsager conjecture in the non-endpoint case {\alpha < 1/3} in three and higher dimensions {d \geq 3}; the endpoint {\alpha = 1/3} remains open. (In two dimensions it may be the case that the positive results extend to a larger range than Onsager’s conjecture predicts; see this paper of Cheskidov, Lopes Filho, Nussenzveig Lopes, and Shvydkoy for more discussion.) Further work continues into several variations of the Onsager conjecture, in which one looks at other differential equations, other function spaces, or other criteria for bad behavior than breakdown of energy conservation. See this recent survey of de Lellis and Szekelyhidi for more discussion.

In these notes we will first establish (i), then discuss the convex integration method in the original context of the Nash-Kuiper embedding theorem. Before tackling the Onsager conjecture (ii) directly, we discuss a related construction of high-dimensional weak solutions in the Sobolev space {L^2_t H^s_x} for {s} close to {1/2}, which is slightly easier to establish, though still rather intricate. Finally, we discuss the modifications of that construction needed to establish (ii), though we shall stop short of a full proof of that part of the conjecture.

We thank Phil Isett for some comments and corrections.

— 1. Energy conservation for sufficiently regular weak solutions —

We now prove the positive part (i) of Onsager’s conjecture, which turns out to be a straightforward application of Littlewood-Paley theory. We need the following relation between Hölder spaces and Littlewood-Paley projections:

Exercise 2 Let {u \in C^{0,\alpha}(\mathbf{T}_E \rightarrow {\bf R})} for some {0 < \alpha < 1} and {d \geq 1}. Establish the bounds

\displaystyle  \| u \|_{C^{0,\alpha}(\mathbf{T}_E \rightarrow {\bf R})} \sim_{\alpha,d} \| P_{\leq 1} u \|_{C^0(\mathbf{T}_E \rightarrow {\bf R})} + \sup_{N > 1} N^\alpha \| P_N u \|_{C^0(\mathbf{T}_E \rightarrow {\bf R})}.
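For numerical intuition, here is a crude 1-d analogue of {P_{\leq N}} built from a sharp Fourier cutoff (an assumption standing in for the smooth Littlewood-Paley multiplier used in these notes):

```python
import numpy as np

def project_leq(u, N):
    """Sharp Fourier truncation on the 1-d torus: zero out all modes
    with |k| > N. (A sketch standing in for the smooth P_{<=N}.)"""
    uhat = np.fft.fft(u)
    k = np.fft.fftfreq(len(u), d=1.0 / len(u))   # integer frequencies
    uhat[np.abs(k) > N] = 0.0
    return np.fft.ifft(uhat).real

x = np.linspace(0.0, 1.0, 256, endpoint=False)
u = np.sin(2 * np.pi * x) + 0.3 * np.sin(2 * np.pi * 40 * x)

low = project_leq(u, 4)   # keeps only the |k| <= 4 content
print(np.allclose(low, np.sin(2 * np.pi * x)))   # True: mode 40 removed
```

The projection strips the high-frequency mode exactly, which is the mechanism behind bounds like the one in Exercise 2: each dyadic block isolates a single range of frequencies.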

Let {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T}_E \rightarrow {\bf R})} be a weak solution to the Euler equations for some {1/3 < \alpha < 1}, thus

\displaystyle  \partial_t u + \partial_j {\mathbb P} (u^j u) = u_0(x) \delta_0(t). \ \ \ \ \ (4)

To show (3), it will suffice from dominated convergence to show that

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} P_{\leq N} u^i(T,x) P_{\leq N} u^j(T,x)\ dx

\displaystyle = \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} P_{\leq N} u^i(0,x) P_{\leq N} u^j(0,x)\ dx + o(1)

as {N \rightarrow \infty}. Applying {P_{\leq N}} to (4), we have

\displaystyle  \partial_t P_{\leq N} u + \partial_j P_{\leq N} {\mathbb P} (u^j u) = P_{\leq N} u_0(x) \delta_0(t).

From Bernstein’s inequality we conclude that {\partial_t P_{\leq N} u \in C^0_t C^\infty_x( \mathbf{T}_E \rightarrow {\bf R}^d)}, and thus {P_{\leq N} u \in C^1_t C^\infty_x( \mathbf{T}_E \rightarrow {\bf R}^d)}. Thus {P_{\leq N} u} solves the PDE

\displaystyle  \partial_t P_{\leq N} u + \partial_j P_{\leq N} {\mathbb P} (u^j u) = 0

\displaystyle  P_{\leq N} u(0,x) = P_{\leq N} u_0(x)

in the classical sense. We can then apply the fundamental theorem of calculus to write

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} P_{\leq N} u^i(T,x) P_{\leq N} u^j(T,x)\ dx

\displaystyle = \frac{1}{2} \int_{\mathbf{T}_E} \eta_{ij} P_{\leq N} u^i(0,x) P_{\leq N} u^j(0,x)\ dx

\displaystyle  + \int_0^T \int_{\mathbf{T}} P_{\leq N} u \cdot \partial_t P_{\leq N} u\ dx dt

and so it will suffice to show that

\displaystyle \int_0^T \int_{\mathbf{T}_E} P_{\leq N} u \cdot \partial_j P_{\leq N} {\mathbb P} (u^j u)\ dx dt = o(1).

We can integrate by parts to place the Leray projection {{\mathbb P}} onto the divergence-free factor {u}, at which point it may be removed. Moving the derivative {\partial_j} over there as well, we now reduce to showing that

\displaystyle \int_0^T \int_{\mathbf{T}_E} \partial_j P_{\leq N} u \cdot P_{\leq N} (u^j u)\ dx dt = o(1).

On the other hand, the expression {\partial_j P_{\leq N} u \cdot P_{\leq N} u^j P_{\leq N} u} is a total derivative (as {u} is divergence-free), and thus has vanishing integral. Thus it remains to show that

\displaystyle \int_0^T \int_{\mathbf{T}_E} \partial_j P_{\leq N} u \cdot [ P_{\leq N} (u^j u) - P_{\leq N} u^j P_{\leq N} u] dx dt = o(1).

From Bernstein’s inequality, Exercise 2, and the triangle inequality one has for any time {t} that

\displaystyle  \| \partial_j P_{\leq N} u(t) \|_{L^\infty_x} \lesssim_d \| P_{\leq 1} u(t) \|_{L^\infty_x} + \sum_{1 < N' \leq N} N' \| P_{N'} u(t) \|_{L^\infty_x}

\displaystyle  \lesssim_{d,\alpha} \| u(t) \|_{C^{0,\alpha}_x} + \sum_{1 < N' \leq N} (N')^{1-\alpha} \| u(t) \|_{C^{0,\alpha}_x}

\displaystyle  \lesssim_{d,\alpha} N^{1-\alpha} \| u \|_{C^0_t C^{0,\alpha}_x};

as { \| u \|_{C^0_t C^{0,\alpha}_x}} is finite, it thus suffices to establish the pointwise bound

\displaystyle  P_{\leq N} (u^j u) - P_{\leq N} u^j P_{\leq N} u = o( N^{\alpha-1} ).

We split the left-hand side into the sum of

\displaystyle  P_{\leq N} (P_{\leq N/4} u^j u) - P_{\leq N/4} u^j P_{\leq N} u \ \ \ \ \ (5)

\displaystyle  P_{\leq N} (P_{> N/4} u^j P_{\leq N/4} u) - P_{\leq N} P_{> N/4} u^j P_{\leq N/4} u \ \ \ \ \ (6)


\displaystyle  P_{\leq N} (P_{> N/4} u^j P_{> N/4} u) - P_{\leq N} P_{> N/4} u^j P_{\leq N} P_{> N/4} u \ \ \ \ \ (7)

where we use the fact that {P_{\leq N} P_{\leq N/4} = P_{\leq N/4}}.
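As a quick sanity check on this splitting, the identity expressing {P_{\leq N}(u^j u) - P_{\leq N} u^j P_{\leq N} u} as the sum of (5), (6), (7) can be verified numerically with sharp Fourier cutoffs on a one-dimensional torus. The discretization below (grid size, band limits, function names) is an illustrative stand-in for the projections in the text, not part of the argument:

```python
import numpy as np

def P_le(u, N):
    """Sharp projection onto Fourier modes |k| <= N on the unit torus."""
    u_hat = np.fft.fft(u)
    k = np.fft.fftfreq(len(u), d=1/len(u))   # integer frequencies
    u_hat[np.abs(k) > N] = 0
    return np.real(np.fft.ifft(u_hat))

rng = np.random.default_rng(0)
M = 256
x = np.arange(M) / M
# band-limited random real fields standing in for u^j and u
f = sum(rng.normal() * np.cos(2*np.pi*k*x + rng.uniform(0, 2*np.pi)) / (1 + k)**2
        for k in range(1, 40))
g = sum(rng.normal() * np.cos(2*np.pi*k*x + rng.uniform(0, 2*np.pi)) / (1 + k)**2
        for k in range(1, 40))

N = 32
lhs = P_le(f*g, N) - P_le(f, N) * P_le(g, N)      # the commutator to be bounded
f_lo, f_hi = P_le(f, N//4), f - P_le(f, N//4)     # P_{<=N/4} and P_{>N/4} pieces
g_lo, g_hi = P_le(g, N//4), g - P_le(g, N//4)
term5 = P_le(f_lo*g, N) - f_lo * P_le(g, N)                 # analogue of (5)
term6 = P_le(f_hi*g_lo, N) - P_le(f_hi, N) * g_lo           # analogue of (6)
term7 = P_le(f_hi*g_hi, N) - P_le(f_hi, N) * P_le(g_hi, N)  # analogue of (7)
assert np.allclose(lhs, term5 + term6 + term7)
```

The identity is exact here because the sharp cutoffs obey {P_{\leq N} P_{\leq N/4} = P_{\leq N/4}}, which is the only property of the projections used in the splitting.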

To treat (7), we use Exercise 2 to conclude that

\displaystyle  \| P_{> N/4} u(t) \|_{L^\infty_x} \lesssim_{d,\alpha} N^{-\alpha} \| u \|_{C^0_t C^{0,\alpha}_x}

and so the quantity (7) is {O_{\alpha,d,u}(N^{-2\alpha})}, which is acceptable since {\alpha > 1/3}. Now we turn to (5). This is a commutator of the form

\displaystyle  P_{\leq N} (f u) - f P_{\leq N} u

where {f := P_{\leq N/4} u^j}. Observe that this commutator would vanish if {u} were replaced by {P_{\leq N/4} u}, thus we may write this commutator as

\displaystyle  P_{\leq N} (f g) - f P_{\leq N} g

where {g := P_{> N/4} u}. If we write

\displaystyle  P_{\leq N} g(x) = \int_{{\bf R}^d} \psi(y) g( x + y/N )\ dy

for a suitable Schwartz function {\psi} of total mass one, we have

\displaystyle  P_{\leq N} (f g)(x) - f P_{\leq N} g(x)

\displaystyle = \int_{{\bf R}^d} \psi(y) g( x + y/N ) (f(x+y/N) - f(x))\ dy;

writing {f(x+y/N) - f(x) = O( \frac{|y|}{N} \|\nabla f \|_{C^0})}, we thus have the bound

\displaystyle  P_{\leq N} (f g) - f P_{\leq N} g = O_d( N^{-1} \| g \|_{C^0} \|\nabla f \|_{C^0} ).

But from Bernstein’s inequality and Exercise 2 we have

\displaystyle  \|\nabla f \|_{C^0} \lesssim_{d,\alpha} N^{1-\alpha} \| u \|_{C^0_t C^{0,\alpha}_x}


\displaystyle  \|g \|_{C^0} \lesssim_{d,\alpha} N^{-\alpha} \| u \|_{C^0_t C^{0,\alpha}_x}

and so we see that (5) is also of size {O_{\alpha,d,u}(N^{-2\alpha})}, which is acceptable since {\alpha > 1/3}. A similar argument gives (6), and the claim follows.

As shown by Constantin, E, and Titi, the Hölder regularity in the above result can be relaxed to Besov regularity, at least in non-endpoint cases:

Exercise 3 Let {\alpha > 1/3}. Define the Besov space {B^3_{\alpha,\infty}(\mathbf{T}_E \rightarrow {\bf R}^d)} to be the space of functions {u \in L^3(\mathbf{T}_E \rightarrow {\bf R}^d)} such that the Besov space norm

\displaystyle  \| u \|_{B^3_{\alpha,\infty}(\mathbf{T}_E \rightarrow {\bf R}^d)} := \| P_{\leq 1} u \|_{L^3(\mathbf{T}_E \rightarrow {\bf R}^d)} + \sup_{N>1} N^{\alpha} \| P_N u \|_{L^3(\mathbf{T}_E \rightarrow {\bf R}^d)}

is finite.

Show that if {u \in L^3_t B^3_{\alpha,\infty}([0,T] \times \mathbf{T}_E \rightarrow {\bf R}^d)} is a weak solution to the Euler equations, the energy {t \mapsto \frac{1}{2} \int_{\mathbf{T}_E} |u(t,x)|^2\ dx} is conserved in time.

The endpoint case {\alpha=1/3} of the above exercise is still open; however energy conservation in the slightly smaller space {B^3_{1/3,c({\bf N})}} is known thanks to the work of Cheskidov, Constantin, Friedlander, and Shvydkoy (see also this paper of Isett for further discussion, and this paper of Isett and Oh for an alternate argument that also works on Riemannian manifolds).

As observed by Isett (see also the recent paper of Colombo and de Rosa), the above arguments also give some partial information about the energy in the low regularity regime:

Exercise 4 Let {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T}_E \rightarrow {\bf R}^d)} be a weak solution to the Euler equations for {0 < \alpha < 1}.

  • (i) If {\alpha=1/3}, show that the energy {t \mapsto \frac{1}{2} \int_{\mathbf{T}_E} |u(t,x)|^2\ dx} is a {C^1} function of time. (Hint: express the energy as the uniform limit of {t \mapsto \frac{1}{2} \int_{\mathbf{T}_E} |P_{\leq N} u(t,x)|^2\ dx}.)
  • (ii) If {0 < \alpha < 1/3}, show that the energy {t \mapsto \frac{1}{2} \int_{\mathbf{T}_E} |u(t,x)|^2\ dx} is a {C^{0,\frac{2\alpha}{1-\alpha}}} function of time.

Exercise 5 Let {u \in C^0_t C^{0,\alpha}([0,T] \times \mathbf{T}_E \rightarrow {\bf R}^d)} be a weak solution to the Navier-Stokes equations for some {1/3 < \alpha < 1} with initial data {u_0}. Establish the energy identity

\displaystyle  \frac{1}{2} \int_{\mathbf{T}_E} |u(T,x)|^2\ dx + \nu \int_0^T \int_{\mathbf{T}_E} |\nabla u(t,x)|^2\ dx dt = \frac{1}{2} \int_{\mathbf{T}_E} |u_0(x)|^2\ dx.

Remark 6 An alternate heuristic derivation of the {\alpha = 1/3} threshold for the Onsager conjecture is as follows. If {u \in C^0_t C^{0,\alpha}}, then from Exercise 2 we see that the portion of {u} that fluctuates at frequency {N} has amplitude {A} at most {O(N^{-\alpha})}; in particular, the amount of energy at frequencies {\sim N} is at most {O(N^{-2\alpha})}. On the other hand, by the heuristics in Remark 11 of 254A Notes 3, the time {T} needed for the portion of the solution at frequency {N} to evolve to a higher frequency scale such as {2N} is of order {T \sim \frac{1}{AN} \gtrsim N^{\alpha-1}}. Thus the rate of energy flux at frequency {N} should be {O( N^{-2\alpha}/T ) = O(N^{1-3\alpha})}. For {\alpha>1/3}, the energy flux goes to zero as {N \rightarrow \infty}, and so energy cannot escape to frequency infinity in finite time.

Remark 7 Yet another alternate heuristic derivation of the {\alpha = 1/3} threshold arises by considering the dynamics of individual Fourier coefficients. Using a Fourier expansion

\displaystyle  u(t,x) = \sum_{k \in {\bf Z}^d} \hat u(t,k) e^{2\pi i k \cdot x},

the Euler equations may be written as

\displaystyle  \partial_t \hat u^i(t,k) + 2\pi i k_j \sum_{k = k' + k''} \hat u^i(t,k') \hat u^j(t,k'') = - 2\pi i k_i \hat p(t,k)

\displaystyle  k_i \hat u^i(t,k) = 0.

In particular, the energy {|\hat u(t,k)|^2} at a single Fourier mode {k \in {\bf Z}^d} evolves according to the equation

\displaystyle  \partial_t |\hat u(t,k)|^2 = - 4 \mathrm{Re}( \pi i k_j \sum_{k = k' + k''} \eta_{il} \hat u^i(t,-k)\hat u^l(t,k') \hat u^j(t,k'') ). \ \ \ \ \ (8)

If {u \in C^0_t C^{0,\alpha}}, then we have {P_N u = O( N^{-\alpha})} for any {N > 1}, hence by Plancherel’s theorem

\displaystyle  \sum_{k \in {\bf Z}^d: |k| \sim N} |\hat u(t,k)|^2 \lesssim N^{-2\alpha}

which suggests that (up to logarithmic factors) one would expect {\hat u(t,k)} to be of magnitude about {N^{-\alpha-d/2}}. Onsager posited that for typically “turbulent” or “chaotic” flows, the main contributions to (8) come when {k',k''} have magnitude roughly comparable to that of {k}, and that the various summands should not be correlated strongly to each other. For {k \sim N}, one might expect about {N^d} significant terms in the sum, which according to standard “square root cancellation heuristics” (cf. the central limit theorem) suggests that the sum is about as large as {N^{-d/2}} times the main term. Thus the total flux of energy in or out of a single mode {k} would be expected to be of size {O( N^{d/2} \cdot N \cdot N^{-3(\alpha+d/2)} ) = O( N^{1-3\alpha} N^{-d} )}, and so the total flux in or out of the frequency range {|k| \sim N} (which consists of {\sim N^d} modes {k}) should be about {O(N^{1-3\alpha})}. As such, for {\alpha>1/3} the energy flux should decay in {N} and so there is no escape of energy to frequency infinity, whereas for {\alpha < 1/3} such an escape should be possible. Related heuristics can also support Kolmogorov’s 1941 model of the distribution of energy in the vanishing viscosity limit; see this blog post for more discussion. On the other hand, while Onsager did discuss the dynamics of individual Fourier coefficients in his paper, it appears that he arrived at the {1/3} threshold by a more physical space based approach, a rigorous version of which was eventually established by Duchon and Robert; see this survey of Eyink and Sreenivasan for more discussion.
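The square root cancellation heuristic invoked above can be illustrated with a quick Monte Carlo experiment; this only demonstrates the generic size of random-phase sums (cf. the central limit theorem), not anything about actual Euler flows:

```python
import numpy as np

# a sum of M terms with uncorrelated random phases has typical size ~ sqrt(M), not M
rng = np.random.default_rng(1)
ratios = {}
for M in [100, 1000, 10000]:
    phases = rng.uniform(0, 2*np.pi, size=(500, M))  # 500 trials of M random phases
    sums = np.abs(np.exp(1j*phases).sum(axis=1))
    ratios[M] = sums.mean() / np.sqrt(M)
# for large M the mean of |sum| is sqrt(pi M / 4) ~ 0.886 sqrt(M)
assert all(0.75 < r < 1.0 for r in ratios.values())
```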

— 2. The (local) isometric embedding problem —

Before we develop the convex integration method for fluid equations, we first discuss the simpler (and historically earlier) instance of this technique for the isometric embedding problem for Riemannian manifolds. To avoid some topological technicalities that are not the focus of the present set of notes, we only consider the local problem of embedding a small neighbourhood {U} of the origin {0} in {{\bf R}^d} into Euclidean space {{\bf R}^n}.

Let {U} be an open neighbourhood of {0} in {{\bf R}^d}. A (smooth) Riemannian metric on {U}, when expressed in coordinates, is a family {g = (g_{ij})_{1 \leq i,j \leq d}} of smooth maps {g_{ij}: U \rightarrow {\bf R}} for {i,j=1,\dots,d}, such that for each point {x \in U}, the matrix {(g_{ij}(x))_{1 \leq i,j \leq d}} is symmetric and positive definite. Any such metric {g} gives {U} the structure of an (incomplete) Riemannian manifold {(U, g)}. An isometric embedding of this manifold into a Euclidean space {{\bf R}^n} is a map {\Phi: U \rightarrow {\bf R}^n} which is continuously differentiable, injective, and obeys the equation

\displaystyle  \langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} = g_{ij} \ \ \ \ \ (9)

pointwise on {U}, where {\langle, \rangle_{{\bf R}^n}} is the usual inner product (or dot product) on {{\bf R}^n}. In the differential geometry language from Notes 1, we are looking for an injective map {\Phi} such that the Euclidean metric {\eta} on {{\bf R}^n} pulls back to {g} via {\Phi}: {\Phi^* \eta = g}.

The isometric embedding problem asks, given a Riemannian manifold such as {(U,g)}, whether there is an isometric embedding from this manifold to a Euclidean space {{\bf R}^n}; for simplicity we only discuss the simpler local isometric embedding problem of constructing an isometric immersion of {(U', g)} into {{\bf R}^n} for some sufficiently small neighbourhood {U' \subset U} of the origin. In particular for the local problem we do not need to worry about injectivity since (9) ensures that the derivative map {D\Phi} is injective at the origin, and hence {\Phi} is injective near the origin by the inverse function theorem (indeed it is an immersion near the origin).

It is a celebrated theorem of Nash (discussed in this previous blog post) that the isometric embedding problem is possible in the smooth category if the dimension {n} is large enough. For sake of discussion we just present the local version:

Theorem 8 (Nash embedding theorem) Suppose that {n} is sufficiently large depending on {d}. Then for any smooth metric {g} on a neighbourhood {U} of the origin, there is a smooth local isometric embedding {\Phi: U' \rightarrow {\bf R}^n} on some smaller neighbourhood {U'} of the origin.

The optimal value of {n} depending on {d} is not completely known, but it grows roughly quadratically in {d}. Indeed, in this paper of Günther it is shown that one can take

\displaystyle  n = \frac{d(d+3)}{2} + 5.
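For concreteness, one can tabulate Günther’s bound against the {\frac{d(d+1)}{2}} count of independent metric components that appears in Proposition 9 below (illustrative arithmetic only):

```python
# Günther's sufficient embedding dimension versus the d(d+1)/2 metric component count
def gunther_dim(d):
    return d*(d + 3)//2 + 5

def metric_components(d):
    return d*(d + 1)//2

for d in range(1, 6):
    print(d, metric_components(d), gunther_dim(d))

# the gap is exactly d + 5, so both bounds grow quadratically
# with the same leading coefficient 1/2
assert all(gunther_dim(d) - metric_components(d) == d + 5 for d in range(1, 100))
```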

In the other direction, one cannot take {n} below {\frac{d(d+1)}{2}}:

Proposition 9 Suppose that {n < \frac{d(d+1)}{2}}. Then there exists a smooth Riemannian metric {g} on an open neighbourhood {U} of the origin in {{\bf R}^d} such that there is no smooth isometric embedding {\Phi} from any smaller neighbourhood {U'} of the origin to {{\bf R}^n}.

Proof: Informally, the reason for this is that the given field {g_{ij}} has {\frac{d(d+1)}{2}} degrees of freedom (which is the number of independent fields after accounting for the symmetry {g_{ij}=g_{ji}}), but there are only {n} degrees of freedom for the unknown {\Phi}. To make this rigorous, we perform a Taylor expansion of both {g_{ij}} and {\Phi} around the origin up to some large order {N}, valid for a sufficiently small neighbourhood {U'}:

\displaystyle  g_{ij}(x) = \sum_{r_1,\dots,r_d \geq 0: r_1 + \dots + r_d \leq N} g_{ij, r_1,\dots, r_d} x_1^{r_1} \dots x_d^{r_d} + O( |x|^{N+1})

\displaystyle  \Phi(x) = \sum_{r_1,\dots,r_d \geq 0: r_1 + \dots + r_d \leq N+1} \Phi_{r_1,\dots, r_d} x_1^{r_1} \dots x_d^{r_d} + O( |x|^{N+2}).

Equating coefficients, we see that the coefficients

\displaystyle  (g_{ij,r_1,\dots,r_d})_{1 \leq i \leq j \leq d; r_1+\dots+r_d \leq N} \in {\bf R}^{\frac{d(d+1)}{2} \binom{N+d}{d}} \ \ \ \ \ (10)

are a polynomial function of the coefficients

\displaystyle  (\Phi_{r_1,\dots, r_d})_{r_1+\dots+r_d \leq N+1} \in {\bf R}^{n \binom{N+1+d}{d}};

this polynomial can be written down explicitly if desired, but its precise form will not be relevant for the argument. Observe that the space of possible coefficients (10) contains an open ball, as can be seen by considering arbitrary small perturbations {g_{ij}} of the Euclidean metric {\delta_{ij}} on {U} (here it is important to restrict to {i \leq j} in order to avoid the symmetry constraint {g_{ji} = g_{ij}}; also, the positive definiteness of the metric will be automatic as long as one restricts to sufficiently small perturbations). A polynomial map whose image contains an open ball must have domain dimension at least as large as that of its target; comparing dimensions, we conclude that if every smooth metric {g} had a smooth isometric embedding {\Phi}, one must have the inequality

\displaystyle  n \binom{N+1+d}{d} \geq \frac{d(d+1)}{2} \binom{N+d}{d}.

Dividing by {N^{d}} and sending {N \rightarrow \infty}, we conclude that {n \geq \frac{d(d+1)}{2}}. Taking contrapositives, the claim follows. \Box
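A small computation confirms the monomial counting behind this dimension comparison: the number of monomials of degree at most {N} in {d} variables is {\binom{N+d}{d}}, and the ratio of the counts on the {\Phi} side (degree {\leq N+1}) and the {g} side (degree {\leq N}) tends to {1} as {N \rightarrow \infty}, so only the comparison of {n} with {\frac{d(d+1)}{2}} survives the limit:

```python
from math import comb

def num_monomials(d, N):
    # monomials x_1^{r_1} ... x_d^{r_d} of total degree at most N
    return comb(N + d, d)

assert num_monomials(2, 2) == 6  # 1, x1, x2, x1^2, x1*x2, x2^2

d = 3
for N in [10, 100, 1000]:
    # ratio of Phi-side to g-side coefficient counts; tends to 1 as N grows
    print(N, num_monomials(d, N + 1) / num_monomials(d, N))

assert abs(num_monomials(d, 10**6 + 1) / num_monomials(d, 10**6) - 1) < 1e-5
```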

Remark 10 If one replaces “smooth” with “analytic”, one can reverse the arguments here using the Cauchy–Kovalevskaya theorem and show that any analytic metric on {{\bf R}^d} can be locally analytically embedded into {{\bf R}^{\frac{d(d+1)}{2}}}; this is a classical result of Cartan and Janet.

Apart from the slight gap in dimensions, this would seem to settle the question of when a {d}-dimensional metric may be locally isometrically embedded in {{\bf R}^n}. However, all of the above arguments required the immersion map {\Phi} to be smooth (i.e., {C^\infty}), whereas the definition of an isometric embedding only requires {C^1} regularity.

It is a remarkable (and somewhat counter-intuitive) result of Nash and Kuiper that if one only requires the embedding to be in {C^1}, then one can embed into a much lower dimensional space:

Theorem 11 (Nash-Kuiper embedding theorem) Let {n \geq d+1}. Then for any smooth metric {g} on a neighbourhood {U} of the origin, there is a {C^1} local isometric embedding {\Phi: U' \rightarrow {\bf R}^n} on some smaller neighbourhood {U'} of the origin.

Nash originally proved this theorem with the slightly weaker condition {n \geq d+2}; Kuiper then obtained the optimal condition {n \geq d+1}. The case {n=d} fails due to curvature obstructions; for instance, if the Riemannian metric {g} has positive scalar curvature, then small Riemannian balls of radius {r} will have (Riemannian) volume slightly less than their Euclidean counterparts, whereas any {C^1} embedding into {{\bf R}^d} will preserve both Riemannian length and volume, preventing such an isometric embedding from existing.

Remark 12 One striking illustration of the distinction between the {C^1} and smooth categories comes when considering isometric embeddings of the round sphere {S^2} (with the usual metric) into Euclidean space {{\bf R}^3}. It is a classical result (see e.g., Spivak’s book) that the only {C^2} isometric embeddings of {S^2} in {{\bf R}^3} are the obvious ones coming from composing the inclusion map {\iota: S^2 \rightarrow {\bf R}^3} with an isometry of the Euclidean space; however, the Nash-Kuiper construction allows one to create a {C^1} embedding of {S^2} into an arbitrarily small ball! Thus the {C^1} embedding problem lacks the “rigidity” of the {C^2} embedding problem. This is an instance of a more general principle that nonlinear differential equations such as (9) can become much less rigid when one weakens the regularity hypotheses demanded on the solution.

To prove this theorem we work with a relaxation of the isometric embedding problem. We say that {(\Phi,R)} is a short isometric embedding on {U'} if {\Phi: U' \rightarrow {\bf R}^n}, {R: U' \rightarrow {\bf R}^{d^2}} solve the equation

\displaystyle  \langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} + R_{ij} = g_{ij}  \ \ \ \ \ (11)

on {U'} with the matrix {R(x) = (R_{ij}(x))_{1 \leq i,j \leq d}} symmetric and positive definite for all {x \in U'}. With the additional unknown field {R_{ij}}, it is much easier to solve the short isometric embedding problem than the true problem. For instance:

Proposition 13 Let {n \geq d}, and let {g} be a smooth Riemannian metric on a neighbourhood {U} of the origin in {{\bf R}^d}. Then there is at least one short isometric embedding {(\Phi, R)} on some neighbourhood of the origin.

Proof: Set {\Phi(x) := \varepsilon \iota(x)} and {R_{ij} := g_{ij} - \varepsilon^2 \eta_{ij}} for a sufficiently small {\varepsilon>0}, where {\iota: {\bf R}^d \rightarrow {\bf R}^n} is the standard embedding, and {\eta} the Euclidean metric on {U}; since {\langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} = \varepsilon^2 \eta_{ij}}, this will be a short isometric embedding on some neighbourhood of the origin. \Box
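Here is a minimal numerical check of this construction for a sample metric on a neighbourhood of the origin in {{\bf R}^2}; the metric {g} below is a hypothetical example, and the point is that {\langle \partial_i \Phi, \partial_j \Phi \rangle = \varepsilon^2 \delta_{ij}} leaves a positive definite remainder {R}:

```python
import numpy as np

# a sample smooth metric near the origin in R^2 (hypothetical choice):
# a small perturbation of the Euclidean metric, positive definite near 0
def g(x):
    return np.eye(2) + 0.1*np.array([[x[0], x[1]], [x[1], -x[0]]])

eps = 0.1
for x in [np.array([0.0, 0.0]), np.array([0.05, -0.03])]:
    # Phi(x) = eps * iota(x) has <d_i Phi, d_j Phi> = eps^2 * delta_ij,
    # so the remainder R = g - eps^2 * eta must absorb the rest
    R = g(x) - eps**2 * np.eye(2)
    # (Phi, R) is a short isometric embedding iff R is symmetric positive definite
    assert np.allclose(R, R.T)
    assert np.all(np.linalg.eigvalsh(R) > 0)
```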

To create a true isometric embedding {\Phi}, we will first construct a sequence {(\Phi^{(N)},R^{(N)})} of short embeddings with {R^{(N)}} converging to zero in a suitable sense, and then pass to a limit. The key observation is that, since the positive semi-definite matrices lie in the convex cone spanned by the rank one matrices, one can add a high frequency perturbation to the first component {\Phi} of a short embedding {(\Phi,R)} to largely erase the error term {R}, replacing it instead with a much higher frequency error.

We now prove Theorem 11. The key iterative step is

Theorem 14 (Iterative step) Let {n \geq d+1}, let {B} be a closed ball in {{\bf R}^d}, let {g} be a smooth Riemannian metric on {B}, and let {(\Phi,R)} be a short isometric embedding on {B}. Then for any {\varepsilon > 0}, one can find a short isometric embedding {(\Phi',R')} solving (11) on {B} with

\displaystyle  \| R' \|_{C^0(B \rightarrow {\bf R}^{d^2})} \lesssim_{d,n} \varepsilon

\displaystyle  \| \Phi' - \Phi \|_{C^0(B \rightarrow {\bf R}^{n})} \lesssim_{d,n} \varepsilon

\displaystyle  \| \Phi' - \Phi \|_{C^1(B \rightarrow {\bf R}^{n})} \lesssim_{d,n} \| R \|_{C^0(B \rightarrow {\bf R}^{d^2})}^{1/2} + \varepsilon.

Suppose for the moment that we had Theorem 14. Starting with the short isometric embedding {(\Phi^{(0)}, R^{(0)})} on a ball {B} provided by Proposition 13, we can iteratively apply the above theorem to obtain a sequence of short isometric embeddings {(\Phi^{(m)}, R^{(m)})} on {B} with

\displaystyle  \| R^{(m)} \|_{C^0(B \rightarrow {\bf R}^{d^2})} \lesssim_{d,n} 2^{-m}

\displaystyle  \| \Phi^{(m)} - \Phi^{(m-1)} \|_{C^0(B \rightarrow {\bf R}^{n})} \lesssim_{d,n} 2^{-m}

\displaystyle  \| \Phi^{(m)} - \Phi^{(m-1)} \|_{C^1(B \rightarrow {\bf R}^{n})} \lesssim_{d,n} \| R^{(m-1)} \|_{C^0(B \rightarrow {\bf R}^{d^2})}^{1/2} + 2^{-m}.

for {m \geq 1}. From this we see that {R^{(m)}} converges uniformly to zero, while {\Phi^{(m)}} converges in {C^1} norm to a {C^1} limit {\Phi}, which then solves (9) on {B}, giving Theorem 11. (Indeed, this shows that the space of {C^1} isometric embeddings is dense in the space of {C^1} short maps in the {C^0} topology.)

We prove Theorem 14 through a sequence of reductions. Firstly, we can rearrange it slightly:

Theorem 15 (Iterative step, again) Let {n \geq d+1}, let {B} be a closed ball in {{\bf R}^d}, let {\Phi: B \rightarrow {\bf R}^n} be a smooth immersion, and let {R = (R_{ij})_{1 \leq i,j \leq d}} be a smooth Riemannian metric on {B}. Then there exists a sequence {\Phi^{(N)}: B \rightarrow {\bf R}^n} of smooth immersions for {N=1,2,\dots} obeying the bounds

\displaystyle  \langle \partial_i \Phi^{(N)}, \partial_j \Phi^{(N)} \rangle_{{\bf R}^n} = \langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} + R_{ij} + o(1)

\displaystyle  \Phi^{(N)} = \Phi + o(1)

\displaystyle  \nabla \Phi^{(N)} = \nabla \Phi + O_{d,n}( \| R \|_{C^0(B \rightarrow {\bf R}^{d^2})}^{1/2} ) + o(1) \ \ \ \ \ (12)

uniformly on {B} for {i,j=1,\dots,d}, where {o(1)} denotes a quantity that goes to zero as {N \rightarrow \infty} (for fixed choices of {n,d,B,\Phi,R}).

Let us see how Theorem 15 implies Theorem 14. Let the notation and hypotheses be as in Theorem 14. We may assume {\varepsilon>0} to be small. Applying Theorem 15 with {R} replaced by {R - \varepsilon g} (which will be positive definite for {\varepsilon} small enough), we obtain a sequence {\Phi^{(N)}: B \rightarrow {\bf R}^n} of smooth immersions obeying the estimates

\displaystyle  \| \langle \partial_i \Phi^{(N)}, \partial_j \Phi^{(N)} \rangle_{{\bf R}^n} - \langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} - R_{ij} + \varepsilon g_{ij} \|_{C^0( B \rightarrow {\bf R} )} = o(1) \ \ \ \ \ (13)

\displaystyle  \| \Phi^{(N)} - \Phi \|_{C^0(B \rightarrow {\bf R}^n)} = o(1)

\displaystyle  \| \Phi^{(N)} - \Phi \|_{C^1(B \rightarrow {\bf R}^n)} \lesssim_{d,n} \| R \|_{C^0(B \rightarrow {\bf R}^{d^2})}^{1/2} + O(\varepsilon) + o(1).

If we set

\displaystyle  R^{(N)}_{ij} := \langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} + R_{ij} - \langle \partial_i \Phi^{(N)}, \partial_j \Phi^{(N)} \rangle_{{\bf R}^n}

then {R^{(N)}} is smooth, symmetric, and (from (13)) will be positive definite for {N} large enough. By construction, we thus have {(\Phi^{(N)}, R^{(N)})} solving (11) for such {N}, and Theorem 14 follows.

To prove Theorem 15, it is convenient to break up the metric {R} into more “primitive” pieces that are rank one matrices:

Lemma 16 (Rank one decomposition) Let {B} be a closed ball in {{\bf R}^d}, and let {R = (R_{ij})_{1 \leq i,j \leq d}} be a smooth Riemannian metric on {B}. Then there exists a finite collection {v^1,\dots,v^K} of unit vectors {v^k = (v^k_i)_{1 \leq i \leq d}} in {{\bf R}^d}, and smooth functions {a^1,\dots,a^K: B \rightarrow {\bf R}}, such that

\displaystyle  R_{ij}(x) = \sum_{k=1}^K a^k(x)^2 v^k_i v^k_j

for all {x \in B}. Furthermore, for each {x \in B}, at most {O_d(1)} of the {a^k(x)} are non-zero.

Remark 17 Strictly speaking, the unit vectors {v^k} should belong to the dual space {({\bf R}^d)^*} of {{\bf R}^d} rather than {{\bf R}^d} itself, in order to have the index {i} appear as subscripts instead of superscripts. A similar consideration occurs for the frequency vectors {k = (k_i)_{1 \leq i \leq d}} from Remark 7. However, we will not bother to distinguish between {{\bf R}^d} and {({\bf R}^d)^*} here (since they are identified using the Euclidean metric).

Proof: Fix a point {x_0} in {B}. Then the matrix {R = (R_{ij}(x_0))_{1 \leq i,j \leq d}} is symmetric and positive definite; one can thus write {R = \sum_{m=1}^d \lambda_m (u^m)^T u^m}, where {u^1,\dots,u^d} is an orthonormal basis of row eigenvectors of {R} (so that each {(u^m)^T u^m} is a rank one matrix) and {\lambda_m>0} are the eigenvalues (we suppress for now the dependence of these objects on {x_0}). Using the parallelogram identity

\displaystyle  (u^m)^T (u^m) + (u^{m'})^T (u^{m'})

\displaystyle = \frac{1}{2} (u^m + u^{m'})^T (u^m + u^{m'}) + \frac{1}{2} (u^m - u^{m'})^T (u^m - u^{m'})

we can then write

\displaystyle  R(x_0) = \sum_{k=1}^{2d^2} (a^k)^2 (v^k)^T v^k \ \ \ \ \ (14)

for some positive real numbers {a^k>0}, where {v^k} are the {2d^2} unit vectors of the form {\frac{1}{\sqrt{2}} (u^m \pm u^{m'})} for {1 \leq m,m' \leq d}, enumerated in an arbitrary order. From the further parallelogram identity

\displaystyle  (u^m)^T (u^{m'}) + (u^{m'})^T u^m

\displaystyle = \frac{1}{2} (u^m + u^{m'})^T (u^m + u^{m'}) - \frac{1}{2} (u^m - u^{m'})^T (u^m - u^{m'})

we see that every sufficiently small symmetric perturbation of {R(x_0)} also has a representation of the form (14) with slightly different coefficients {a^k} that depend smoothly on the perturbation. As {R} is smooth, we thus see that for {x} sufficiently close to {x_0} we have the decomposition

\displaystyle  R(x) = \sum_{k=1}^{2d^2} a^k(x)^2 (v^k)^T (v^k)

for some positive quantities {a^k(x)} varying smoothly with {x}. This gives the lemma in a small ball {B(x_0,r_0)} centred at {x_0}; the claim then follows by covering {B} by a finite number of balls of the form {B(x_0,r_0/4)} (say), covering these balls by balls {B(y_\alpha,\delta)} of a fixed radius {\delta} smaller than all the {r_0/4} in the finite cover, in such a way that any point lies in at most {O_d(1)} of the balls {B(y_\alpha,2\delta)}, constructing a smooth partition of unity {1 = \sum_\alpha \psi_\alpha(x)^2} adapted to the {B(y_\alpha,2\delta)}, multiplying each of the decompositions of {R(x)} previously obtained on {B(y_\alpha,2\delta)} (which each lie in one of the {B(x_0,r_0)}) by {\psi_\alpha(x)^2}, and summing to obtain the required decomposition on {B}. \Box
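The pointwise step of this construction can be sketched in code: the snippet below decomposes a symmetric matrix close to the identity into positively weighted rank one pieces along the fixed unit directions {e_m} and {(e_m \pm e_{m'})/\sqrt{2}}, in the spirit of the parallelogram identities above. The particular splitting via the base weight {c} is an illustrative device, not taken from the proof:

```python
import numpy as np

def rank_one_decomposition(R, c=None):
    """Write a symmetric matrix R close to the identity as sum_k a_k^2 v_k v_k^T
    with fixed unit vectors v_k, via the parallelogram-identity splitting."""
    d = R.shape[0]
    if c is None:
        # base weight; it must dominate the off-diagonal entries for positivity
        c = 2*np.abs(R - np.diag(np.diag(R))).max() + 1e-9
    terms = []
    for m in range(d):
        for mp in range(m + 1, d):
            # w_+ w_+^T - w_- w_-^T = e_m e_{m'}^T + e_{m'} e_m^T
            wp = (np.eye(d)[m] + np.eye(d)[mp]) / np.sqrt(2)
            wm = (np.eye(d)[m] - np.eye(d)[mp]) / np.sqrt(2)
            terms.append((c + R[m, mp], wp))
            terms.append((c - R[m, mp], wm))
    used = sum((a*np.outer(v, v) for a, v in terms), np.zeros((d, d)))
    for m in range(d):
        # leftover diagonal weight; positive as long as R is close to the identity
        terms.append((R[m, m] - used[m, m], np.eye(d)[m]))
    assert all(a > 0 for a, _ in terms)  # these play the role of the a^k(x)^2
    return terms

R = np.array([[1.0, 0.1, -0.05],
              [0.1, 1.1, 0.02],
              [-0.05, 0.02, 0.9]])
terms = rank_one_decomposition(R)
recon = sum(a*np.outer(v, v) for a, v in terms)
assert np.allclose(recon, R)
```

The full lemma then patches such pointwise decompositions together with a partition of unity, as in the proof above.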

Remark 18 Informally, Lemma 16 lets one synthesize a metric {R_{ij}} as a “convex integral” of rank one pieces, so that if the problem at hand has the freedom to “move” in the direction of each of these rank one pieces, then it also has the freedom to move in the direction {R_{ij}}, at least if one is working in low enough regularities that one can afford to rapidly change direction from one rank one perturbation to another. This convex integration technique was formalised by Gromov in his interpretation of the Nash-Kuiper method as part of his “{h}-principle”, which we will not discuss further here.

One can now deduce Theorem 15 from

Theorem 19 (Iterative step, rank one version) Let {n \geq d+1}, let {B} be a closed ball in {{\bf R}^d}, let {\Phi: B \rightarrow {\bf R}^n} be a smooth immersion, let {v = (v_i)_{i=1,\dots,d} \in {\bf R}^d} be a unit vector, and let {a: B \rightarrow {\bf R}} be smooth. Then there exists a sequence {\Phi^{(N)}: B \rightarrow {\bf R}^n} of smooth immersions for {N=1,2,\dots} obeying the bounds

\displaystyle  \langle \partial_i \Phi^{(N)}, \partial_j \Phi^{(N)} \rangle_{{\bf R}^n} = \langle \partial_i \Phi, \partial_j \Phi \rangle_{{\bf R}^n} + a^2 v_i v_j + o(1) \ \ \ \ \ (15)

\displaystyle  \Phi^{(N)} = \Phi + o(1)

\displaystyle  \nabla \Phi^{(N)} = \nabla \Phi + O_{d,n}(\| a \|_{C^0(B \rightarrow {\bf R})}) + o(1)

uniformly on {B} for {i,j=1,\dots,d}. Furthermore, the support of {\Phi^{(N)}-\Phi} is contained in the support of {a}.

Indeed, suppose that Theorem 19 holds, and we are in the situation of Theorem 15. We apply Lemma 16 to obtain the decomposition

\displaystyle  R_{ij}(x) = \sum_{k=1}^K a^k(x)^2 v^k_i v^k_j

with the stated properties. On taking traces we see that

\displaystyle \|a^k\|_{C^0(B \rightarrow {\bf R})} \lesssim_d \| R \|_{C^0(B \rightarrow {\bf R}^{d^2})}^{1/2}

for all {k}. Applying Theorem 19 {K} times (and diagonalising the sequences as necessary), we obtain sequences {\Phi^{(N,k)}: B \rightarrow {\bf R}^n} of smooth immersions for {k=0,1,\dots,K} such that {\Phi^{(N,0)} = \Phi} and one has

\displaystyle  \langle \partial_i \Phi^{(N,k)}, \partial_j \Phi^{(N,k)} \rangle_{{\bf R}^n} = \langle \partial_i \Phi^{(N,k-1)}, \partial_j \Phi^{(N,k-1)} \rangle_{{\bf R}^n} + (a^k)^2 v^k_i v^k_j + o(1)

\displaystyle  \Phi^{(N,k)} = \Phi^{(N,k-1)} + o(1)

\displaystyle  \nabla \Phi^{(N,k)} = \nabla \Phi^{(N,k-1)} + O_{d,n}(\| R \|_{C^0(B \rightarrow {\bf R}^{d^2})}^{1/2}) + o(1),

and such that the support of {\Phi^{(N,k)}-\Phi^{(N,k-1)}} is contained in that of {a^k}. The claim then follows from the triangle inequality, noting that the implied constant in (12) will not depend on {K} because of the bounded overlap in the supports of the {\Phi^{(N,k)}-\Phi^{(N,k-1)}}.

It remains to prove Theorem 19. We note that the requirement that {\Phi^{(N)}} be an immersion will be automatic from (15) for {N} large enough, since {\Phi} was already an immersion, making the matrix {(\langle \partial_i \Phi(x), \partial_j \Phi(x) \rangle_{{\bf R}^n} )_{1 \leq i,j \leq d}} positive definite uniformly for {x \in B}; this property is unaffected by the addition of the {o(1)} perturbation and the positive semi-definite rank one matrix {(a^2 v_i v_j)_{1 \leq i,j \leq d}}.

Writing {\Phi^{(N)} = \Phi + \Psi^{(N)}}, it will suffice to find a sequence of smooth maps {\Psi^{(N)}: B \rightarrow {\bf R}^n} supported in the support of {a} and obeying the approximate difference equation

\displaystyle  \langle \partial_i \Psi^{(N)}, \partial_j \Phi \rangle_{{\bf R}^n} + \langle \partial_i \Phi, \partial_j \Psi^{(N)} \rangle_{{\bf R}^n} + \langle \partial_i \Psi^{(N)}, \partial_j \Psi^{(N)} \rangle_{{\bf R}^n} = a^2 v_i v_j + o(1) \ \ \ \ \ (16)

and the bounds

\displaystyle  \Psi^{(N)} = o(1) \ \ \ \ \ (17)

\displaystyle  \nabla \Psi^{(N)} = O_{d,n}(\| a \|_{C^0(B \rightarrow {\bf R})}) + o(1) \ \ \ \ \ (18)

uniformly on {B}.

To locate these functions {\Psi^{(N)}}, we use the method of slow and fast variables. First we observe by applying a rotation that we may assume without loss of generality that {v} is the unit vector {e_1}, thus {v_i = 1_{i=1}}. We then use the ansatz

\displaystyle  \Psi^{(N)}(x) = \frac{1}{N} \mathbf{\Psi}( x, N x \hbox{ mod } {\bf Z}^d )

where {\mathbf{\Psi}: B \times \mathbf{T}_F \rightarrow {\bf R}^n} is a smooth function independent of {N} to be chosen later; thus {\Psi^{(N)}} is a function both of the “slow” variable {x \in B} and the “fast” variable {y := Nx \hbox{ mod } {\bf Z}^d} taking values in the “fast torus” {\mathbf{T}_F := ({\bf R}/{\bf Z})^d}. (We adopt the convention here of using boldface symbols to denote functions of both the fast and slow variables. The fast torus is isomorphic to the Eulerian torus {\mathbf{T}_E} from the introduction, but we denote them by slightly different notation as they play different roles.) Thus {\Psi^{(N)}} is a low amplitude but highly oscillating perturbation to {\Phi}. The fast variable oscillation means that {\Psi^{(N)}} will not be bounded in regularity norms higher than {C^1} (and so this ansatz is not available for use in the smooth embedding problem), but because we only wish to control {C^0} and {C^1} type quantities, we will still be able to get adequate bounds for the purposes of {C^1} embedding. Now that we have twice as many variables, the problem becomes more “underdetermined”, and we can arrive at a simpler PDE by decoupling the roles of the various variables (in particular, we will often work with PDE where the derivatives of the main terms are in the fast variables, but the coefficients only depend on the slow variables, and are thus effectively constant coefficient with respect to the fast variables).
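The division of labour between slow and fast derivatives can be checked symbolically on a simple one-dimensional model of this ansatz; the choice {\mathbf{\Psi}(x,y) = a(x) \sin(2\pi y)} below is purely illustrative:

```python
import sympy as sp

x, y, N = sp.symbols('x y N', positive=True)
a = sp.Function('a')
# one-dimensional model of the ansatz: bold-Psi(x, y) = a(x) sin(2 pi y)
Psi = a(x) * sp.sin(2*sp.pi*y)
ansatz = Psi.subs(y, N*x) / N                 # Psi^{(N)}(x) = (1/N) Psi(x, Nx)
deriv = sp.diff(ansatz, x)                    # full spatial derivative
fast_part = sp.diff(Psi, y).subs(y, N*x)      # fast derivative, size O(1)
slow_part = sp.diff(Psi, x).subs(y, N*x) / N  # slow derivative, suppressed by 1/N
assert sp.simplify(deriv - fast_part - slow_part) == 0
```

The fast derivative enters at unit size while the slow derivative carries the factor {1/N}, which is what allows terms with at least one power of {1/N} to be absorbed into the {o(1)} error below.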

Remark 20 Informally, one should think of functions {f: B \times \mathbf{T}_F \rightarrow {\bf R}^m} that are independent of the fast variable {y} as being of “low frequency”, and conversely functions that have mean zero in the fast variable {y} (thus {\int_{{\mathbf T}_F} f(x,y)\ dy = 0} for all {x}) as being of “high frequency”. Thus for instance any smooth function on {B \times \mathbf{T}_F} can be uniquely decomposed into a “low frequency” component and a “high frequency” component, with the two components orthogonal to each other. In later sections we will start inverting “fast derivatives” {N \nabla_y} on “high frequency” functions, which will effectively gain important factors of {N^{-1}} in the analysis. See also the table below for the dictionary between ordinary physical coordinates and fast-slow coordinates.

Physical coordinates | Fast-slow coordinates
Position {x \in U} | Slow variable {x \in U}
Fast variable {Nx \hbox{ mod } {\bf Z}^d} | Fast variable {y \in {\bf T}_F}
Function {f( x, Nx \hbox{ mod } {\bf Z}^d)} | Function {\mathbf{f}(x,y)}
{\partial_{x^i}} | {\partial_{x^i} + N \partial_{y^i}}
Low-frequency function {f(x)} | Function {f(x)} independent of {y}
High-frequency function {f(x)} | Function {\mathbf{f}(x,y)} mean zero in {y}
N/A | Slow derivative {\partial_{x^i}}
N/A | Fast derivative {\partial_{y^i}}

If we expand out using the chain rule, using {\partial_{x_i}} and {\partial_{y_i}} to denote partial differentiation in the coordinates of the slow and fast variables respectively, and noting that all terms with at least one power of {1/N} can be absorbed into the {o(1)} error, we see that we will be done as long as we can construct {\mathbf{\Psi}} to obey the bounds

\displaystyle  \partial_{y_i} \mathbf{\Psi} = O_{d,n}(\| a \|_{C^0(B \rightarrow {\bf R})}) \ \ \ \ \ (19)

and solve the exact equation

\displaystyle  \langle \partial_{y_i} \mathbf{\Psi}, \partial_{x_j} \Phi \rangle_{{\bf R}^n} + \langle \partial_{x_i} \Phi, \partial_{y_j} \mathbf{\Psi} \rangle_{{\bf R}^n} + \langle \partial_{y_i} \mathbf{\Psi}, \partial_{y_j} \mathbf{\Psi} \rangle_{{\bf R}^n} = a^2 1_{i=j=1} \ \ \ \ \ (20)

where {\Phi, a} are viewed as functions of the slow variable {x} only. The original approach of Nash to solve this equation was to use a function {\mathbf{\Psi}} that was orthogonal to the entire gradient of {\Phi}, thus

\displaystyle  \langle \partial_{x_i} \Phi, \mathbf{\Psi} \rangle_{{\bf R}^n} = 0 \ \ \ \ \ (21)

for {i=1,\dots,d}. Taking derivatives in {y_j} one would conclude that

\displaystyle  \langle \partial_{x_i} \Phi, \partial_{y_j} \mathbf{\Psi} \rangle_{{\bf R}^n} = 0

and similarly

\displaystyle  \langle \partial_{y_i} \mathbf{\Psi}, \partial_{x_j} \Phi \rangle_{{\bf R}^n} = 0,

and one now just had to solve the equation

\displaystyle  \langle \partial_{y_i} \mathbf{\Psi}, \partial_{y_j} \mathbf{\Psi} \rangle_{{\bf R}^n} = a^2 1_{i=j=1}. \ \ \ \ \ (22)

For this, Nash used a “spiral” construction

\displaystyle  \mathbf{\Psi}(x,y) = \frac{a(x)}{2\pi} ( u(x) \cos(2\pi y_1) + v(x) \sin(2\pi y_1) )

where {u,v: B \rightarrow {\bf R}^n} were unit vectors varying smoothly with respect to the slow variable; this obeys (22) and (19), and would also obey (21) if the vectors {u(x)} and {v(x)} were both always orthogonal to the entire gradient of {\Phi}. This is not possible in {n=d+1} (as {{\bf R}^n} cannot then support {d+2} linearly independent vectors), but there is no obstruction for {n \geq d+2}:

Lemma 21 (Constructing an orthogonal frame) Let {\Phi: B \rightarrow {\bf R}^n} be an immersion. If {n \geq d+2}, then there exist smooth vector fields {u,v: B \rightarrow {\bf R}^n} such that at every point {x}, {u(x), v(x)} are unit vectors orthogonal to each other and to {\partial_{x_i} \Phi(x)} for {i=1,\dots,d}.

Proof: Applying the Gram-Schmidt process to the linearly independent vectors {\partial_{x_i} \Phi(x)} for {i=1,\dots,d}, we can find an orthonormal system of vectors {w_1(x),\dots,w_d(x)}, depending smoothly on {x \in B}, whose span is the same as the span of the {\partial_{x_i} \Phi(x)}. Our task is now to find smooth functions {u,v: B \rightarrow {\bf R}^n} solving the system of equations

\displaystyle  \langle u, u \rangle_{{\bf R}^n} = \langle v,v \rangle_{{\bf R}^n} = 1 \ \ \ \ \ (23)

\displaystyle  \langle u, v \rangle_{{\bf R}^n} = 0 \ \ \ \ \ (24)

\displaystyle  \langle u, w_i \rangle_{{\bf R}^n} = \langle v, w_i \rangle_{{\bf R}^n} = 0 \ \ \ \ \ (25)

on {B}.

For {n \geq d+2} this is possible at the origin {x=0} from the Gram-Schmidt process. Now we extend in the {e_1} direction to the line segment {\{ x_1 e_1: |x_1| \leq 1 \}}. To do this we evolve the fields {u,v} by the parallel transport ODE

\displaystyle  \partial_{x_1} u := - \langle u, \partial_{x_1} w_i \rangle w_i

\displaystyle  \partial_{x_1} v := - \langle v, \partial_{x_1} w_i \rangle w_i

on this line segment. From the Picard existence and uniqueness theorem we can uniquely extend {u,v} smoothly to this segment with the specified initial data at {0}, and a simple calculation using Gronwall’s inequality shows that the system of equations (23), (24), (25) is preserved by this evolution. Then, one can extend to the disk {\{ x_1 e_1 + x_2 e_2: x_1^2+x_2^2 \leq 1 \}} by using the previous extension to the segment as initial data and solving the parallel transport ODE

\displaystyle  \partial_{x_2} u := - \langle u, \partial_{x_2} w_i \rangle w_i.

\displaystyle  \partial_{x_2} v := - \langle v, \partial_{x_2} w_i \rangle w_i.

Iterating this procedure we obtain the claim. \Box
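The conservation of the orthonormality conditions (23), (24), (25) along the parallel transport ODE can also be illustrated numerically. The sketch below takes {d=1}, {n=3}, and a toy frame {w_1(x) = (\cos x, \sin x, 0)} (a hypothetical choice, not arising from any particular immersion {\Phi}), and checks that the transported vectors stay orthonormal to the frame up to solver precision:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy frame in R^3 with d = 1 (an illustrative choice of w_1, not tied
# to any particular immersion Phi)
def w(x):
    return np.array([np.cos(x), np.sin(x), 0.0])

def dw(x):
    return np.array([-np.sin(x), np.cos(x), 0.0])

# parallel transport ODE: u' = -<u, w'> w,  v' = -<v, w'> w
def rhs(x, state):
    u, v = state[:3], state[3:]
    return np.concatenate([-np.dot(u, dw(x)) * w(x),
                           -np.dot(v, dw(x)) * w(x)])

u0 = np.array([0.0, 0.0, 1.0])   # unit, orthogonal to w(0) = e_1
v0 = np.array([0.0, 1.0, 0.0])   # unit, orthogonal to u0 and w(0)
sol = solve_ivp(rhs, (0.0, 1.0), np.concatenate([u0, v0]),
                rtol=1e-10, atol=1e-12)
u1, v1 = sol.y[:3, -1], sol.y[3:, -1]

# the Gronwall argument: (23), (24), (25) are preserved along the flow
assert abs(np.dot(u1, u1) - 1) < 1e-6 and abs(np.dot(v1, v1) - 1) < 1e-6
assert abs(np.dot(u1, v1)) < 1e-6
assert abs(np.dot(u1, w(1.0))) < 1e-6 and abs(np.dot(v1, w(1.0))) < 1e-6
```

(In this toy case {u} happens to stay constant, while {v(x) = (-\sin x, \cos x, 0)} rotates with the frame.)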

This concludes Nash’s proof of Theorem 11 when {n \geq d+2}. Now suppose that {n=d+1}. In this case we cannot locate two unit vector fields {u,v} orthogonal to each other and to the entire gradient of {\Phi}; however, we may still obtain one such vector field {u: B \rightarrow {\bf R}^n} by repeating the above arguments. By Gram-Schmidt, we can then locate a smooth unit vector field {v: B \rightarrow {\bf R}^n} which is orthogonal to {u} and to {\partial_i \Phi} for {i=2,\dots,d}, but for which the quantity {c := 2\langle v, \partial_1 \Phi \rangle_{{\bf R}^n}} is positive. If we use the “Kuiper corrugation” ansatz

\displaystyle  \mathbf{\Psi}(x,y) = u(x) f(x, y_1) + v(x) g(x, y_1)

for some smooth functions {f,g: B \times {\bf R}/{\bf Z} \rightarrow {\bf R}}, one is reduced to locating such functions {f,g} that obey the bounds

\displaystyle  \partial_{y_1} f, \partial_{y_1} g = O( a )

and the ODE

\displaystyle  (\partial_{y_1} f)^2 + (\partial_{y_1} g)^2 + c \partial_{y_1} g = a^2.

This can be done by an explicit construction:

Exercise 22 (One-dimensional corrugation) For any positive {c>0} and any {a \geq 0}, show that there exist smooth functions {f,g: {\bf R}/{\bf Z} \rightarrow {\bf R}} solving the ODE

\displaystyle  (f')^2 + (g')^2 + c g' = a^2

and which vary smoothly with {a,c} (even at the endpoint {a=0}), and obey the bounds

\displaystyle  f', g' = O(a).

(Hint: one can renormalise {c=1}. The problem is basically to locate a periodic function {t \mapsto (X_a(t), Y_a(t))} mapping {{\bf R}/{\bf Z}} to the circle {\{ (x,y): x^2 + y^2 + y = a^2 \}} of mean zero and Lipschitz norm {O(a)} that varies smoothly with {a}. Choose {X_a(t) = a^2 X(t)} for some smooth and small {X} that is even and compactly supported in {(0,1/2) \cup (1/2,1)} with mean zero on each interval, and then choose {Y_a} to be odd.)

This exercise supplies the required functions {f,g: B \times {\bf R}/{\bf Z} \rightarrow {\bf R}}, completing Kuiper’s proof of Theorem 11 when {n \geq d+1}.

Remark 23 For the sake of discussion let us restrict attention to the surface case {d=2}. For the local isometric embedding problem, we have seen that we have rigidity at regularities at or above {C^2}, but lack of rigidity at {C^1}. The precise threshold at which rigidity occurs is not completely known at present: a result of Borisov (also reproven here) gives rigidity at the {C^{1,\alpha}} level for {\alpha > 2/3}, while a result of de Lellis, Inauen, and Szekelyhidi (building upon a series of previous results) establishes non-rigidity when {\alpha < 1/5}. For recent results in higher dimensions, see this paper of Cao and Szekelyhidi.

— 3. Low regularity weak solutions to Navier-Stokes in high dimensions —

We now turn to constructing solutions (or near-solutions) to the Euler and Navier-Stokes equations. For minor technical reasons it is convenient to work with solutions that are periodic in both space and time, and normalised to have zero mean at every time (although the latter restriction is not essential for our arguments, since one can always reduce to this case after a Galilean transformation as in 254A Notes 1). Accordingly, let {\Omega} denote the periodic spacetime

\displaystyle  \Omega := ({\bf R}/{\bf Z}) \times \mathbf{T}_E,

and let {X^\infty} denote the space of smooth periodic functions {u: \Omega \rightarrow {\bf R}^d} that have mean zero and are divergence-free at every time {t \in {\bf R}/{\bf Z}}, thus

\displaystyle  \int_{\mathbf{T}_E} u(t,x)\ dx = 0

and

\displaystyle  \partial_i u^i = 0.

We use {L^p_{t,x}} as an abbreviation for {L^p_t L^p_x(\Omega \rightarrow {\bf R}^m)} for various vector spaces {{\bf R}^m} (the choice of which will be clear from context).

Let {\nu \geq 0} (for now, our discussion will apply both to the Navier-Stokes equations {\nu>0} and the Euler equations {\nu=0}). Smooth solutions to the Navier-Stokes equations then take the form

\displaystyle  \partial_t u^i + \partial_j (u^i u^j) = \nu \Delta u^i - \eta^{ij} \partial_j p

for some {u \in X^\infty} and smooth {p: \Omega \rightarrow {\bf R}}. Here of course {\Delta = \eta^{ij} \partial_i \partial_j} denotes the spatial Laplacian.

Much as we replaced the equation (10) in the previous section with (11), we will consider the relaxed version

\displaystyle  \partial_t u^i + \partial_j( u^i u^j ) = \nu \Delta u^i + \partial_j R^{ij} - \eta^{ij} \partial_j p \ \ \ \ \ (26)

\displaystyle  \partial_i u^i = 0 \ \ \ \ \ (27)

\displaystyle  R^{ij} = R^{ji} \ \ \ \ \ (28)

of the Navier-Stokes equations, where we have now introduced an additional field {R: \Omega \rightarrow {\bf R}^{d^2}}, known as the Reynolds stress (cf. the Cauchy stress tensor from 254A Notes 0). If {u: \Omega \rightarrow {\bf R}^d}, {p: \Omega \rightarrow {\bf R}}, {R: \Omega \rightarrow {\bf R}^{d^2}} are smooth solutions to (26), (27), (28), with {u} having mean zero at every time, then we call {(u,p,R)} a Navier-Stokes-Reynolds flow (or Euler-Reynolds flow, if {\nu=0}). Note that if {R=0} then we recover a solution to the true Navier-Stokes equations. Thus, heuristically, the smaller {R} is, the closer {u} and {p} should become to a solution to the true Navier-Stokes equations. (The Reynolds stress tensor {R^{ij}} here is a rank {(2,0)} tensor, as opposed to the rank {(0,2)} tensor {R_{ij}} used in the previous section to measure the failure of isometric embedding, but this will not be a particularly significant distinction.)

Note that if {(u,p,R)} is a Navier-Stokes-Reynolds flow, and {v: \Omega \rightarrow {\bf R}^d}, {q: \Omega \rightarrow {\bf R}}, {S: \Omega \rightarrow {\bf R}^{d^2}} are smooth functions, then {(u+v, p+q, S)} will also be a Navier-Stokes-Reynolds flow if and only if {v} has mean zero at every time, and {(v,q,S)} obeys the difference equation

\displaystyle  (\partial_t - \nu \Delta) v^i + \partial_j( u^i v^j + u^j v^i + v^i v^j + R^{ij} + q \eta^{ij} - S^{ij} ) = 0 \ \ \ \ \ (29)

\displaystyle  \partial_i v^i = 0 \ \ \ \ \ (30)

\displaystyle  S^{ij} = S^{ji}. \ \ \ \ \ (31)

When this occurs, we say that {(v,q,S)} is a difference Navier-Stokes-Reynolds flow at {(u,p,R)}.
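The derivation of (29) from (26) is a routine expansion of the quadratic term {\partial_j((u^i+v^i)(u^j+v^j))}. As a sanity check, here is a symbolic verification in two spatial dimensions (the computation is identical in any dimension; the field names below are of course just placeholders):

```python
import sympy as sp

t, x1, x2, nu = sp.symbols('t x1 x2 nu')
X = (x1, x2)

def fields(name, n):
    return [sp.Function(f'{name}{i}')(t, x1, x2) for i in range(n)]

u, v = fields('u', 2), fields('v', 2)
p, q = sp.Function('p')(t, x1, x2), sp.Function('q')(t, x1, x2)
Rf = [[sp.Function(f'R{i}{j}')(t, x1, x2) for j in range(2)] for i in range(2)]
Sf = [[sp.Function(f'S{i}{j}')(t, x1, x2) for j in range(2)] for i in range(2)]

def residual(w, pr, T, i):
    # residual of (26): d_t w^i + d_j(w^i w^j) - nu Lap w^i - d_j T^{ij} + d_i pr
    expr = sp.diff(w[i], t) + sp.diff(pr, X[i])
    expr -= nu * sum(sp.diff(w[i], xj, 2) for xj in X)
    expr += sum(sp.diff(w[i] * w[j], X[j]) - sp.diff(T[i][j], X[j])
                for j in range(2))
    return expr

for i in range(2):
    uv = [u[k] + v[k] for k in range(2)]
    diff_eq = residual(uv, p + q, Sf, i) - residual(u, p, Rf, i)
    # left-hand side of (29)
    target = sp.diff(v[i], t) - nu * sum(sp.diff(v[i], xj, 2) for xj in X)
    target += sum(sp.diff(u[i]*v[j] + u[j]*v[i] + v[i]*v[j]
                          + Rf[i][j] + q * sp.eye(2)[i, j] - Sf[i][j], X[j])
                  for j in range(2))
    assert sp.expand(diff_eq - target) == 0
```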

It will thus be of interest to find, for a given {(u,p,R)}, difference Navier-Stokes-Reynolds flows {(v,q,S)} at {(u,p,R)} with {S} small, as one could hopefully iterate this procedure and take a limit to construct weak solutions to the true Euler equations. The main strategy here will be to choose a highly oscillating (and divergence-free) correction velocity field {v^i} such that {v^i v^j} approximates {-R^{ij} - q \eta^{ij}} up to an error which is also highly oscillating (and somewhat divergence-free). The effect of this error can then eventually be absorbed efficiently into the new Reynolds stress tensor {S^{ij}}. Of course, one also has to manage the other terms {\partial_t v^i}, {\nu \Delta v^i}, {u^i v^j}, {u^j v^i} appearing in (29). In high dimensions it turns out that these terms can be made very small in {L^1_{t,x}} norm, and can thus be easily disposed of. In three dimensions the situation is considerably more delicate, particularly with regards to the {\nu \Delta v^i} and {u^i v^j} terms; in particular, the transport term {u^i v^j} is best handled by using a local version of Lagrangian coordinates. We will discuss these subtleties in later sections.

To execute the above strategy, it will be convenient to have an even more flexible notion of solution, in which {v} is no longer required to be perfectly divergence-free and mean zero, and is also allowed to be slightly inaccurate in solving (29). We say that {(v,q,S,f,F)} is an approximate difference Navier-Stokes-Reynolds flow at {(u,p,R)} if {v, F: \Omega \rightarrow {\bf R}^d}, {q, f: \Omega \rightarrow {\bf R}}, {S: \Omega \rightarrow {\bf R}^{d^2}} are smooth functions obeying the system

\displaystyle  (\partial_t - \nu \Delta) v^i + \partial_j( u^i v^j + u^j v^i + v^i v^j + R^{ij} + q \eta^{ij} - S^{ij} ) = F^i \ \ \ \ \ (32)

\displaystyle  \partial_i v^i = f \ \ \ \ \ (33)

\displaystyle  S^{ij} = S^{ji}. \ \ \ \ \ (34)

If the error terms {f,F}, as well as the mean of {v}, are all small, one can correct an approximate difference Navier-Stokes-Reynolds flow {(v,q,S,f,F)} to a true difference Navier-Stokes-Reynolds flow {(v',q',S')} with only small adjustments:

Exercise 24 (Removing the error terms) Let {(u,p,R)} be a Navier-Stokes-Reynolds flow, and let {(v,q,S,f,F)} be an approximate difference Navier-Stokes-Reynolds flow at {(u,p,R)}. Show that {(v',q',S')} is a difference Navier-Stokes-Reynolds flow at {(u,p,R)}, where

\displaystyle  (v')^i := v^i - c^i - \Delta^{-1} \eta^{ij} \partial_j f

\displaystyle  q' := q + \Delta^{-1} \partial_j \tilde F^j

\displaystyle  (S')^{ij} = S^{ij} - u^i ((v')^j-v^j) - u^j ((v')^i-v^i)

\displaystyle  - ((v')^i - v^i) v^j - v^i ((v')^j-v^j) - ((v')^i-v^i) ((v')^j - v^j)

\displaystyle  + \Delta^{-1} \eta^{jl} \partial_l \tilde F^i + \Delta^{-1} \eta^{il} \partial_l \tilde F^j

\displaystyle  \tilde F^i := F^i - (\partial_t - \nu \Delta) (v^i - (v')^i)

and {c: {\bf R}/{\bf Z} \rightarrow {\bf R}^d} is the mean of {v}, thus

\displaystyle c^i(t) := \int_{\mathbf{T}_E} v^i(t,x)\ dx.

(Hint: one will need at some point to show that {\tilde F^i} has mean zero in space at every time; this can be achieved by integrating (32) in space.)

Because of this exercise we will be able to tolerate the error terms {f,F} if they (and the mean {c}) are sufficiently small.

As a simple corollary of Exercise 24, we have the following analogue of Proposition 13:

Proposition 25 Let {u \in X^\infty}. Then there exist smooth fields {p: \Omega \rightarrow {\bf R}}, {R: \Omega \rightarrow {\bf R}^{d^2}} such that {(u,p,R)} is a Navier-Stokes-Reynolds flow. Furthermore, if {u} is supported in {I \times \mathbf{T}} for some compact time interval {I}, then {p,R} can be chosen to also be supported in this region.

Proof: Clearly {(u, 0, 0, 0, F)} is an approximate difference Navier-Stokes-Reynolds flow at {(0,0,0)}, where

\displaystyle  F^i := (\partial_t - \nu \Delta) u^i + \partial_j (u^i u^j).

Applying Exercise 24, we can construct a difference Navier-Stokes-Reynolds flow {(u,p,R)} at {(0,0,0)}, which then verifies the claimed properties. \Box

Now, we show that, in sufficiently high dimension, a Navier-Stokes-Reynolds flow {(u, p, R)} can be approximated (in an {L^1} sense) as the limit of Navier-Stokes-Reynolds flows {(u^{(N)}, p^{(N)}, R^{(N)})}, with the Reynolds stress {R^{(N)}} going to zero.

Proposition 26 (Weak improvement of Navier-Stokes-Reynolds flows) Let {\varepsilon>0}, and let {d} be sufficiently large depending on {\varepsilon}. Let {U = (u,p,R)} be a Navier-Stokes-Reynolds flow. Then for sufficiently large {N}, there exists a Navier-Stokes-Reynolds flow {\tilde U = (\tilde u, \tilde p, \tilde R)} obeying the estimates

\displaystyle  \| \partial_t^j \nabla_x^k \tilde u \|_{L^2_{t,x}} \lesssim_{U,\varepsilon,d,j,k} N^{k} \ \ \ \ \ (35)

\displaystyle  \| \tilde R \|_{L^1_{t,x}} \lesssim_{U,\varepsilon,d} N^{-1+\varepsilon}

for all {j,k \geq 0}, and such that

\displaystyle  \|\tilde u - u \|_{L^2_{t,x}} \lesssim_{\varepsilon,d} \| R \|_{L^1_{t,x}}^{1/2}. \ \ \ \ \ (36)

\displaystyle  \|\tilde u - u \|_{L^1_{t,x}} \lesssim_{U,\varepsilon,d} N^{-10}.

Furthermore, if {U} is supported in {I \times \mathbf{T}_E} for some interval {I \subset {\bf R}/{\bf Z}}, then one can arrange for {\tilde U} to be supported on {I' \times \mathbf{T}_E} for any interval {I'} containing {I} in its interior (at the cost of allowing the implied constants in the above to depend also on {I,I'}).

This proposition can be viewed as an analogue of Theorem 14. For an application at the end of this section it is important that the implied constant in (36) is uniform in the choice of initial flow {U}. The estimate (35) can be viewed as asserting that the new velocity field {\tilde u} is oscillating at frequencies {O(N)}, at least in an {L^2_{t,x}} sense. In the next section, we obtain a stronger version of this proposition with more quantitative estimates that can be iterated to obtain higher regularity weak solutions.

To simplify the notation we adopt the following conventions. Given an {n}-dimensional vector {(D_1,\dots,D_n)} of differential operators, we use {(D_1,\dots,D_n)^m} to denote the {n^m}-tuple of differential operators {D_{i_1} \dots D_{i_m}} with {i_1,\dots,i_m \in \{1,\dots,n\}}. We use {(D_1,\dots,D_n)^{\leq m}} to denote the {\sum_{0 \leq m' \leq m} n^{m'}}-tuple formed by concatenating {(D_1,\dots,D_n)^{m'}} for {0 \leq m' \leq m}. Thus for instance the estimate (35) can be abbreviated as

\displaystyle  \| (\partial_t, N^{-1} \nabla_x)^{\leq m} \tilde u \|_{L^2_{t,x}} \lesssim_{U,\varepsilon,d,m} 1

for all {m \geq 0}. Informally, one should read the above estimate as asserting that {\tilde u} is bounded in {L^2_{t,x}} with norm {O_{U,\varepsilon,d}(1)}, and oscillates with frequency {O(1)} in time and {O(N)} in space (or equivalently, with a temporal wavelength of {\gtrsim 1} and a spatial wavelength of {\gtrsim 1/N}).

Proof: We can assume that {R} is non-zero, since if {R=0} we can just take {\tilde U = U}. We may assume that {U} is supported in {I \times \mathbf{T}_E} for some interval {I \subset {\bf R}/{\bf Z}} (which may be all of {{\bf R}/{\bf Z}}), and let {I'} be an interval containing {I} in its interior. To abbreviate notation, we allow all implied constants to depend on {\varepsilon,d,I,I'}.

Assume {N} is sufficiently large. Using the ansatz

\displaystyle  (\tilde u, \tilde p, \tilde R) = (u + v, p + q, \tilde R),

and the triangle inequality, it suffices to construct a difference Navier-Stokes-Reynolds flow {(v, q, \tilde R)} at {U} supported on {I' \times \mathbf{T}} and obeying the bounds

\displaystyle  \| (\partial_t, N^{-1} \nabla_x)^{\leq m} v \|_{L^2_{t,x}} \lesssim_{U,m} 1

\displaystyle  \| \tilde R \|_{L^1_{t,x}} \lesssim_{U} N^{-1+\varepsilon}

\displaystyle  \| v \|_{L^2_{t,x}} \lesssim \| R \|_{L^1_{t,x}}^{1/2}

\displaystyle  \| v \|_{L^1_{t,x}} \lesssim_{U} N^{-10}

for all {m \geq 0}.

It will in fact suffice to construct an approximate difference Navier-Stokes-Reynolds flow {(v, q, \tilde R, f, F)} at {U} supported on {I' \times \mathbf{T}} and obeying the bounds

\displaystyle  \| (\partial_t, N^{-1} \nabla_x)^{\leq m} v \|_{L^2_{t,x}} \lesssim_{U,m} 1 \ \ \ \ \ (37)

\displaystyle  \| \tilde R\|_{L^1_{t,x}} \lesssim_{U} N^{-1+\varepsilon} \ \ \ \ \ (38)

\displaystyle  \| (\partial_t, N^{-1} \nabla_x)^{\leq m} v \|_{L^1_{t,x}} \lesssim_{U,m} N^{-20} \ \ \ \ \ (39)

\displaystyle  \| (\partial_t, N^{-1} \nabla_x)^{\leq m} f \|_{L^2_{t,x}} \lesssim_{U,m} N^{-20} \ \ \ \ \ (40)

\displaystyle  \| F \|_{L^1_{t,x}} \lesssim_{U} N^{-20} \ \ \ \ \ (41)

\displaystyle  \| v \|_{L^2_{t,x}} \lesssim \| R \|_{L^1_{t,x}}^{1/2} \ \ \ \ \ (42)

for {m \geq 0}, since an application of Exercise 24 and some simple estimation will then give a difference Navier-Stokes-Reynolds flow {(v', q', R')} obeying the desired estimates (using in particular the fact that {\Delta^{-1} \nabla_x} is bounded on {L^1_{t,x}} and {L^2_{t,x}}, as can be seen from Littlewood-Paley theory; also note that (39) can be used to ensure that the mean of {v} is very small).

To construct this approximate solution, we again use the method of fast and slow variables. Set {N_1 := \lfloor N^{1-\varepsilon} \rfloor}, and introduce the fast-slow spacetime {\mathbf{\Omega} := \Omega \times \mathbf{T}_F = {\bf R}/{\bf Z} \times \mathbf{T}_E \times \mathbf{T}_F}, which we coordinatise as {(t,x,y)}; we use {\partial_{x^i}} to denote partial differentiation in the coordinates of the slow variable {x \in \mathbf{T}_E}, and {\partial_{y^i}} to denote partial differentiation in the coordinates of the fast variable {y \in \mathbf{T}_F}. We also use {L^p_{t,x,y}} as shorthand for {L^p_t L^p_x L^p_y(\mathbf{\Omega})}. Define an approximate fast-slow solution to the difference Navier-Stokes-Reynolds equation at {(u,p,R)} (at the frequency scale {N_1}) to be a tuple {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} of smooth functions {\mathbf{v}, \mathbf{F}: \mathbf{\Omega} \rightarrow {\bf R}^d}, {\mathbf{q}, \mathbf{f}: \mathbf{\Omega} \rightarrow {\bf R}}, {\mathbf{R}: \mathbf{\Omega} \rightarrow {\bf R}^{d^2}} that obey the system of equations

\displaystyle  (\partial_t - \nu \Delta_x - \nu N_1^2 \Delta_y) \mathbf{v}^i

\displaystyle  + (\partial_{x^j} + N_1 \partial_{y^j}) ( u^i \mathbf{v}^j + u^j \mathbf{v}^i + \mathbf{v}^i \mathbf{v}^j + R^{ij} + \mathbf{q} \eta^{ij} - \mathbf{R}^{ij} ) = \mathbf{F}^i \ \ \ \ \ (43)

\displaystyle  (\partial_{x^i} + N_1 \partial_{y^i}) \mathbf{v}^i = \mathbf{f} \ \ \ \ \ (44)

\displaystyle  \mathbf{R}^{ij} = \mathbf{R}^{ji}. \ \ \ \ \ (45)

Here we think of {u} as a “low-frequency” function (in the sense of Remark 20) that only depends on {t} and the slow variable {x}, but not on the fast variable {y}.

Let {D} denote the tuple {D := (\partial_t, \nabla_x, N^{-\varepsilon} \nabla_y)}. Suppose that for any sufficiently large {N}, we can construct an approximate fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} to the difference equations at {(u,p,R)} supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F} that obeys the bounds

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{L^2_{t,x,y}} \lesssim_{U,m} 1 \ \ \ \ \ (46)

\displaystyle  \| \mathbf{R} \|_{L^1_{t,x,y}} \lesssim_{U} N^{-1+\varepsilon} \ \ \ \ \ (47)

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{L^1_{t,x,y}} \lesssim_{U,m} N^{-20} \ \ \ \ \ (48)

\displaystyle  \| D^{\leq m} \mathbf{f} \|_{L^2_{t,x,y}} \lesssim_{U,m} N^{-20} \ \ \ \ \ (49)

\displaystyle  \| \mathbf{F} \|_{L^1_{t,x,y}} \lesssim_{U} N^{-20} \ \ \ \ \ (50)

\displaystyle  \| \mathbf{v} \|_{L^2_{t,x,y}} \lesssim \| R \|_{L^1_{t,x}}^{1/2} \ \ \ \ \ (51)

for all {m \geq 0}. (Informally, the presence of the derivatives {D} means that the fields involved are allowed to oscillate in time at wavelength {\gtrsim 1}, in the slow variable {x} at wavelength {\gtrsim 1}, and in the fast variable {y} at wavelength {\gtrsim N^{-\varepsilon}}.) From (46) and the choice of {N_1} we then have

\displaystyle  \| (\partial_t, N^{-1}(\nabla_x + N_1 \nabla_y))^{\leq m} \mathbf{v} \|_{L^2_{t,x,y}} \lesssim_{U,m} 1

for all {m \geq 0}, and similarly for (48), (49). (Note here that there was considerable room in the estimates with regards to regularity in the {x} variable; this room will be exploited more in the next section.) For any shift {\theta \in \mathbf{T}_F}, we see from the chain rule that {(v^\theta, q^\theta, R^\theta, f^\theta, F^\theta)} is an approximate difference Navier-Stokes-Reynolds flow at {U} supported on {I' \times \mathbf{T}_E}, where

\displaystyle  v^\theta(t,x) := \mathbf{v}( t, x, N_1 x + \theta)

\displaystyle  q^\theta(t,x) := \mathbf{q}( t, x, N_1 x + \theta)

\displaystyle  R^\theta(t,x) := \mathbf{R}( t, x, N_1 x + \theta)

\displaystyle  f^\theta(t,x) := \mathbf{f}( t, x, N_1 x + \theta)

\displaystyle  F^\theta(t,x) := \mathbf{F}( t, x, N_1 x + \theta).

Also from (46) and Fubini’s theorem we have

\displaystyle  \int_{\mathbf{T}_F} \| (\partial_t, N^{-1} \nabla_x)^{\leq m} v^\theta \|_{L^2_{t,x}}^2\ d\theta \lesssim_{U,m} 1 \ \ \ \ \ (52)

and similarly

\displaystyle  \int_{\mathbf{T}_F} \| R^\theta \|_{L^1_{t,x}}\ d\theta \lesssim_{U} N^{-1+\varepsilon}

\displaystyle  \int_{\mathbf{T}_F} \| (\partial_t, N^{-1} \nabla_x)^{\leq m} v^\theta \|_{L^1_{t,x}}\ d\theta \lesssim_{U,m} N^{-20}

\displaystyle  \int_{\mathbf{T}_F} \| (\partial_t, N^{-1} \nabla_x)^{\leq m} f^\theta \|_{L^2_{t,x}}^2\ d\theta \lesssim_{U,m} N^{-40}

\displaystyle  \int_{\mathbf{T}_F} \| F^\theta \|_{L^1_{t,x}}\ d\theta \lesssim_{U} N^{-20}

\displaystyle  \int_{\mathbf{T}_F} \| v^\theta \|_{L^2_{t,x}}^2\ d\theta \lesssim \| R \|_{L^1_{t,x}}

for all {m \geq 0}. By Markov’s inequality and (52), we see that for each {m}, we have

\displaystyle \| (\partial_t, N^{-1} \nabla_x)^{\leq m} v^\theta \|_{L^2_{t,x}} \lesssim_{U,m} 1

for all {\theta} outside of an exceptional set of measure (say) {2^{-m-10}}. Similarly for the other equations above. Applying the union bound, we can then find a {\theta} such that {(v^\theta, q^\theta, R^\theta, f^\theta, F^\theta)} obeys all the required bounds (37)-(42) simultaneously for all {m}. (This is an example of the probabilistic method, originally developed in combinatorics; one can think of {\theta} probabilistically as a shift drawn uniformly at random from the torus {\mathbf{T}_F}, in order to relate the fast-slow Lebesgue norms {L^p_{t,x,y}} to the original Lebesgue norms {L^p_{t,x}}.)
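As a toy illustration of this Markov-plus-union-bound argument, one can set up stand-in “norm” functions {g_m(\theta)} (the family below is made up; only nonnegativity and bounded means matter) and confirm numerically that the exceptional sets have small total measure, leaving many good choices of {\theta}:

```python
import numpy as np

theta = np.linspace(0, 1, 200_000, endpoint=False)

# made-up nonnegative "norm" functions g_m(theta), one per estimate;
# these stand in for the integrands in (52) and its companions
def g(m, th):
    return (1 + 0.1 * np.sin(2**m * np.pi * th)) / (2 * np.sqrt(th + 1e-12))

good = np.ones_like(theta, dtype=bool)
for m in range(20):
    vals = g(m, theta)
    # Markov: the set where g_m exceeds 2^{m+10} times its mean
    # has measure at most 2^{-m-10}
    bad = vals > 2 ** (m + 10) * vals.mean()
    assert bad.mean() <= 2 ** -(m + 10)
    good &= ~bad

# union bound: total bad measure <= sum_m 2^{-m-10} < 1,
# so a common good theta survives
assert good.mean() > 0.99
```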

It remains to construct an approximate fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F} with the required bounds (46)-(51). Actually, in this high-dimensional setting we can afford to simplify the situation here by removing some of the terms (and in particular eliminating the role of the reference velocity field {u}). Define a simplified fast-slow solution at {U} to be a tuple {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} of smooth functions on {\mathbf{\Omega}} obeying the simplified equations

\displaystyle  (\partial_{x^j} + N_1 \partial_{y^j}) ( \mathbf{v}^i \mathbf{v}^j + R^{ij} + \mathbf{q} \eta^{ij} - \mathbf{R}^{ij} ) = \mathbf{F}^i \ \ \ \ \ (53)

\displaystyle  (\partial_{x^i} + N_1 \partial_{y^i}) \mathbf{v}^i = \mathbf{f} \ \ \ \ \ (54)

\displaystyle  \mathbf{R}^{ij} = \mathbf{R}^{ji}. \ \ \ \ \ (55)

If we can find a simplified fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} of smooth functions on {\mathbf{\Omega}} supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F} obeying the bounds

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{L^2_{t,x,y}} \lesssim_{U,m} 1 \ \ \ \ \ (56)

\displaystyle  \| \mathbf{R} \|_{L^1_{t,x,y}} \lesssim_{U} N^{-1+\varepsilon} \ \ \ \ \ (57)

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{L^1_{t,x,y}} \lesssim_{U,m} N^{-30} \ \ \ \ \ (58)

\displaystyle  \| D^{\leq m} \mathbf{f} \|_{L^2_{t,x,y}} \lesssim_{U,m} N^{-30} \ \ \ \ \ (59)

\displaystyle  \| \mathbf{F} \|_{L^1_{t,x,y}} \lesssim_{U} N^{-30} \ \ \ \ \ (60)

\displaystyle  \| \mathbf{v} \|_{L^2_{t,x,y}} \lesssim \| R \|_{L^1_{t,x}}^{1/2} \ \ \ \ \ (61)

for all {m \geq 0}, then {(\mathbf{v}, \mathbf{q}, \mathbf{R}', \mathbf{f}, \mathbf{F}')} will be an approximate fast-slow solution supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F} obeying the required bounds (46)-(51), where

\displaystyle  (\mathbf{R}')^{ij} := \mathbf{R}^{ij} + u^i \mathbf{v}^j + u^j \mathbf{v}^i

\displaystyle  (\mathbf{F}')^i := \mathbf{F}^i + (\partial_t - \nu \Delta_x - \nu N_1^2 \Delta_y) \mathbf{v}^i.

Now we need to construct a simplified fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F} obeying the bounds (56)-(60). We do this in stages, first finding a solution that cancels off the highest order terms {N_1 \partial_{y^j}(\mathbf{v}^i \mathbf{v}^j)} and {N_1 \partial_{y^i} \mathbf{v}^i}, and also such that {\mathbf{v}^i \mathbf{v}^j + R^{ij} + q \delta^{ij}} has mean zero in the fast variable {y} (so that it is “high frequency” in the sense of Remark 20). This still leads to fairly large values of {\mathbf{F}} and {\mathbf{f}}, but we will then apply a “divergence corrector” to almost completely eliminate {\mathbf{f}}, followed by a “stress corrector” that almost completely eliminates {\mathbf{F}}, at which point we will be done.

We turn to the details. Our preliminary construction of the velocity field {\mathbf{v}} will be a “Mikado flow”, consisting of flows along narrow tubes. (Earlier literature used other flows, such as Beltrami flows; however, Mikado flows have the advantage of being localisable to small subsets of spacetime, which is particularly useful in high dimensions.) We need the following modification of Lemma 16:

Exercise 27 Let {S} be a compact subset of the space of positive definite {d \times d} matrices {A = (A^{ij})_{1 \leq i,j \leq d}}. Show that there exist non-zero lattice vectors {e_1,\dots,e_K \in {\bf Z}^d} and smooth functions {a_1,\dots,a_K: S \rightarrow {\bf R}} for some {K \geq 0} such that

\displaystyle  A^{ij} = \sum_{k=1}^K a_k(A)^2 (e_k)^i (e_k)^j \ \ \ \ \ (62)

for all {A \in S}. (This decomposition is essentially due to de Lellis and Szekelyhidi. The subscripting and superscripting here is reversed from that in Lemma 16; this is because we are now trying to decompose a rank {(2,0)} tensor rather than a rank {(0,2)} tensor.)
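The flavour of the decomposition (62) can be seen in a toy two-dimensional special case (illustrative only, not the actual de Lellis-Szekelyhidi construction): a symmetric matrix {A} with {A^{11}, A^{22} \geq |A^{12}|} can be decomposed over the four lattice vectors {(1,0), (0,1), (1,1), (1,-1)}:

```python
import numpy as np

def toy_decomposition(A):
    """Decompose a symmetric 2x2 matrix A with A[0,0], A[1,1] >= |A[0,1]|
    as sum_k c_k e_k e_k^T with c_k >= 0, over four fixed lattice vectors."""
    b = A[0, 1]
    es = [np.array([1, 0]), np.array([0, 1]),
          np.array([1, 1]), np.array([1, -1])]
    cs = [A[0, 0] - abs(b), A[1, 1] - abs(b), max(b, 0.0), max(-b, 0.0)]
    return es, cs

# random diagonally dominant (hence positive semi-definite) example
rng = np.random.default_rng(1)
b = rng.uniform(-1, 1)
A = np.array([[abs(b) + rng.uniform(0, 1), b],
              [b, abs(b) + rng.uniform(0, 1)]])

es, cs = toy_decomposition(A)
assert min(cs) >= 0                    # the coefficients c_k = a_k(A)^2
recon = sum(c * np.outer(e, e) for c, e in zip(cs, es))
assert np.allclose(recon, A)
```

Note that these coefficients involve absolute values and so do not depend smoothly on {A}; producing smooth {a_k} on a compact set {S}, as the exercise demands, requires more care (for instance, perturbing around finitely many base points and gluing with a partition of unity).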

We would like to apply this exercise to the matrix with entries {-R^{ij} - q \eta^{ij}}. We thus need to select the pressure {q} so that this matrix is positive definite. There are many choices available for this pressure; we will take

\displaystyle  q := -( \| R \|_{L^1_{t,x}}^2 + 100 d |R|^2)^{1/2}

where {|R|} is the Frobenius norm of {R}. Then {q} is smooth and {-R^{ij} - q \eta^{ij}} is positive definite on all of the compact spacetime {\Omega} (recall that we can assume {R} to not be identically zero), and in particular ranges in a compact subset of positive definite matrices. Applying the previous exercise and composing with the function {(t,x) \mapsto -R(t,x)}, we conclude that there exist non-zero lattice vectors {e_1,\dots,e_K \in {\bf Z}^d} and smooth functions {a_1,\dots,a_K: \Omega \rightarrow {\bf R}} for some {K \geq 0} such that

\displaystyle  - R^{ij}(t,x) - q(t,x) \delta^{ij} = \sum_{k=1}^K a_k(t,x)^2 e_k^i e_k^j \ \ \ \ \ (63)

for all {(t,x) \in \Omega}. As {K, e_k, a_k} depend only on {d,R}, and {R} is a component of {U}, all norms of these quantities are bounded by {O_{U}(1)}; they are independent of {N}. Furthermore, on taking traces and integrating on {\Omega}, we obtain the estimate

\displaystyle  \sum_{k=1}^K \| a_k \|_{L^2_{t,x}}^2 |e_k|^2 \lesssim \| R \|_{L^1_{t,x}} \ \ \ \ \ (64)

(note here that the implied constant is uniform in {K}, {U}). By applying a smooth cutoff in time that equals {1} on {I} and vanishes outside of {I'}, we may assume that the {a_k} are supported in {I' \times \mathbf{T}}.
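To see why this choice of {q} works, note that the operator norm of {R(t,x)} is at most its Frobenius norm {|R|}, while {|q| \geq (100 d |R|^2)^{1/2} = 10 \sqrt{d} |R|}; so every eigenvalue of {-R^{ij} - q \eta^{ij}} is at least {|q| - |R| > 0} where {R(t,x) \neq 0}, and equals {|q| > 0} where {R(t,x) = 0}. A quick numerical confirmation, with arbitrary sample values standing in for {R(t,x)} and {\|R\|_{L^1_{t,x}}}:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5
R_l1 = 0.3   # stand-in for ||R||_{L^1_{t,x}} (positive since R is not identically zero)

for _ in range(100):
    # random symmetric "Reynolds stress" value R(t,x), sometimes zero
    M = rng.normal(size=(d, d)) * rng.choice([0.0, 1.0])
    R = (M + M.T) / 2
    frob = np.linalg.norm(R)                       # Frobenius norm |R|
    q = -np.sqrt(R_l1 ** 2 + 100 * d * frob ** 2)  # the pressure choice
    eigs = np.linalg.eigvalsh(-R - q * np.eye(d))
    # |q| >= 10 sqrt(d) |R| dominates the operator norm of R,
    # so -R - q*I is positive definite
    assert eigs.min() > 0
```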

Now for each {k=1,\dots,K}, the closed subgroup {\ell_k := \{ t e_k \hbox{ mod } {\bf Z}^d: t \in {\bf R}/{\bf Z} \}} is a one-dimensional subset of {\mathbf{T}_F}, so the {N^{-\varepsilon}}-neighbourhood of this subgroup has measure {O_{d,R}(N^{-\varepsilon(d-1)})}; crucially, this will be a large negative power of {N} when {d} is very large. Let {T_k \subset \mathbf{T}_F} be a translate of this {N^{-\varepsilon}}-neighbourhood such that all the {T_1,\dots,T_K} are disjoint; this is easily accomplished by the probabilistic method for {N} large enough, translating each of the {T_k} by an independent random shift and noting that the probability of a collision goes to zero as {N \rightarrow \infty} (here we need the fact that we are in at least three dimensions).

Let {M} be a large integer (depending on {\varepsilon}) to be chosen later. For any {k=1,\dots,K}, let {\mathbf{\psi}_k: \mathbf{T}_F \rightarrow {\bf R}} be a scalar test function supported on {T_k} that is constant in the {e_k} direction, thus

\displaystyle e_k^i \partial_{y^i} \mathbf{\psi}_k(y) = 0,

and is not identically zero, which implies that the iterated Laplacian {\Delta^M_y \mathbf{\psi}_k} of {\mathbf{\psi}_k} is also not identically zero (thanks to the unique continuation property of harmonic functions). We can normalise so that

\displaystyle  |e_k|^2 \| \Delta^M_y \mathbf{\psi}_k \|_{L^2_y} = 1

and we can also arrange to have the bounds

\displaystyle  |e_k|^2 \| (N^{-\varepsilon} \nabla_y)^{m} \mathbf{\psi}_k \|_{L^2_y} \lesssim_{m,M} N^{-2M\varepsilon}

for all {m \geq 0} (basically by first constructing a version of {\mathbf{\psi}_k} on a standard cylinder {B^{d-1}(0,1) \times {\bf R}} and then applying an affine transformation to map onto {T_k}).

Let {\mathbf{w}_k: \mathbf{T}_F \rightarrow {\bf R}^d} denote the function

\displaystyle  \mathbf{w}_k^i(y) := \Delta^M_y \mathbf{\psi}_k(y) e_k^i;

intuitively this represents the velocity field of a fluid traveling along the tube {T_k}, with the presence of the Laplacian {\Delta^M} ensuring that this function is extremely well balanced (for instance, it will have mean zero, and thus “high frequency” in the sense of Remark 20). Clearly {\mathbf{w}_k} is divergence free, and one also has the steady-state Euler equation

\displaystyle  \partial_{y_j}( \mathbf{w}_k^i \mathbf{w}_k^j ) = 0 \ \ \ \ \ (65)

and the normalisation

\displaystyle  \int_{{\mathbf T}_F} \mathbf{w}_k^i \mathbf{w}_k^j = e_k^i e_k^j \ \ \ \ \ (66)


as well as the bounds

\displaystyle  \| (N^{-\varepsilon} \nabla_y)^m (N^{-2\varepsilon} \Delta_y)^{-m'} \mathbf{w}_k \|_{L^2_y} \lesssim_{m,m'} 1

for all {m \geq 0} and {0 \leq m' \leq M}. If we then set

\displaystyle \mathbf{v}^i(t,x,y) := \sum_{k=1}^K a_k(t,x) \mathbf{w}_k^i(y)

\displaystyle \mathbf{F}^i(t,x,y) := \partial_{x^j}( \sum_{k=1}^K a_k(t,x)^2 (\mathbf{w}_k^i(y) \mathbf{w}_k^j(y) - e_k^i e_k^j) )

\displaystyle \mathbf{f}(t,x,y) := \sum_{k=1}^K \partial_{x^i} a_k(t,x) \mathbf{w}_k^i(y)

then one easily checks that {(\mathbf{v}, q, 0, \mathbf{f}, \mathbf{F})} is a simplified fast-slow solution supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F}. Direct calculation using the Leibniz rule then gives the bounds

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{L^2_{t,x,y}} \lesssim_{U,m} 1 \ \ \ \ \ (67)

\displaystyle  \| D^{\leq m} \mathbf{f} \|_{L^2_{t,x,y}} \lesssim_{U,m} 1 \ \ \ \ \ (68)

\displaystyle  \| \mathbf{F} \|_{L^1_{t,x,y}} \lesssim_{U} 1 \ \ \ \ \ (69)

for all {m \geq 0}, while from (64) one has

\displaystyle  \| \mathbf{v} \|_{L^2_{t,x,y}} \lesssim \| R \|_{L^1_{t,x}}^{1/2} \ \ \ \ \ (70)

(note here that the implied constant is uniform in {U}).

This looks worse than (56)–(60). However, observe that {\mathbf{v}} is supported on the set {\Omega \times \bigcup_{k=1}^K T_k}, which has measure {O_{U,d}( N^{-\varepsilon(d-1)})}, which for {d} large enough can be taken to be (say) {O_{U,d}(N^{-100})}. Thus by Cauchy-Schwarz one has

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{L^1_{t,x,y}} \lesssim_{U,m} N^{-30} \ \ \ \ \ (71)

for all {m \geq 0}. Also, from construction (particularly (66)) we see that {\mathbf{F}} is of mean zero in the {y} variable (thus it is “high frequency” in the sense of Remark 20).
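For the record, the divergence-free property of {\mathbf{w}_k}, the steady-state equation (65), and the mean-zero property of {\mathbf{F}} each follow in a line from the constancy of {\mathbf{\psi}_k} in the {e_k} direction (which is preserved by {\Delta_y}) together with (66):

\displaystyle  \partial_{y^i} \mathbf{w}_k^i = \Delta^M_y ( e_k^i \partial_{y^i} \mathbf{\psi}_k ) = 0

\displaystyle  \partial_{y^j} ( \mathbf{w}_k^i \mathbf{w}_k^j ) = 2 e_k^i (\Delta^M_y \mathbf{\psi}_k) \Delta^M_y ( e_k^j \partial_{y^j} \mathbf{\psi}_k ) = 0

\displaystyle  \int_{\mathbf{T}_F} \mathbf{F}^i\ dy = \partial_{x^j} \sum_{k=1}^K a_k(t,x)^2 \int_{\mathbf{T}_F} ( \mathbf{w}_k^i \mathbf{w}_k^j - e_k^i e_k^j )\ dy = 0.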

We are now a bit closer to (56)–(60), but our bounds on {\mathbf{f}, \mathbf{F}} are not yet strong enough. We now apply a “divergence corrector” to make {\mathbf{f}} much smaller. Observe from construction that {\mathbf{f} = \Delta_y^M \mathbf{g}} where

\displaystyle \mathbf{g}(t,x,y) := \sum_{k=1}^K \partial_{x^i} a_k(t,x) \mathbf{\psi}_k(y) e_k^i

and {\mathbf{g}} is supported on {\Omega \times \bigcup_{k=1}^K T_k} and obeys the estimates

\displaystyle  \| D^{\leq m} \mathbf{g} \|_{L^2_{t,x,y}} \lesssim_{U,m} N^{-2M \varepsilon} \ \ \ \ \ (72)

for all {m \geq 0}. Observe that

\displaystyle  \mathbf{f} = (\partial_{x^i} + N_1 \partial_{y^i}) ( N_1^{-1} \eta_{ij} \Delta_y^{-1} \partial_{y^j} \mathbf{f} ) - N_1^{-1} \Delta_y^{-1} \eta_{ij} \partial_{x^i} \partial_{y^j} \mathbf{f}.
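In the simplest setting (one slow and one fast variable on the 2-torus, with {\eta} taken to be the identity so that the contraction is a plain product — a simplification of mine for illustration), this identity can be confirmed on the Fourier side:

```python
import numpy as np

# Spectral check of the one-step divergence-corrector identity
#   f = (d_x + N1 d_y)(N1^{-1} Lap_y^{-1} d_y f) - N1^{-1} Lap_y^{-1} d_x d_y f
# for f(x,y) on the 2-torus with mean zero in y (eta = identity, a toy case).
n, N1 = 32, 100.0
grid = 2 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(grid, grid, indexing="ij")
f = np.sin(3 * X + 2 * Y) + np.cos(X) * np.sin(5 * Y)  # mean zero in y

kx = np.fft.fftfreq(n, d=1.0 / n)[:, None]   # integer wavenumbers in x
ky = np.fft.fftfreq(n, d=1.0 / n)[None, :]   # integer wavenumbers in y
fh = np.fft.fft2(f)

ky_safe = np.where(ky == 0, 1.0, ky)
inv_lap_y = np.where(ky == 0, 0.0, -1.0 / ky_safe ** 2)  # Lap_y^{-1} on mean-zero modes

g = inv_lap_y * (1j * ky) * fh / N1              # N1^{-1} Lap_y^{-1} d_y f
term1 = (1j * kx + N1 * 1j * ky) * g             # (d_x + N1 d_y) applied to g
term2 = -inv_lap_y * (1j * kx) * (1j * ky) * fh / N1
err = np.abs(np.fft.ifft2(term1 + term2) - f).max()
print(err)
```

On each Fourier mode with {k_y \neq 0} the two terms sum exactly to the original coefficient, so the error is at the level of floating-point roundoff.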

We abbreviate the differential operator {\eta_{ij} \partial_{x^i} \partial_{y^j}} as {\nabla_x \cdot \nabla_y}. Iterating the above identity {M} times, we obtain

\displaystyle  \mathbf{f} = (\partial_{x^i} + N_1 \partial_{y^i}) \mathbf{d}^i + \mathbf{f}'


where

\displaystyle  \mathbf{d}^i := \sum_{m=1}^{M} (-1)^{m-1} N_1^{-m} \Delta^{M-m}_y \eta_{ij} \partial_{y^j} (\nabla_x \cdot \nabla_y)^m \mathbf{g}


and

\displaystyle  \mathbf{f}' := (-1)^M N_1^{-M} (\nabla_x \cdot \nabla_y)^M \mathbf{g}.

In particular, {\mathbf{d}^i} is supported in {\Omega \times \bigcup_{k=1}^K T_k}. Observe that {(\mathbf{v}', 0, \mathbf{R}, \mathbf{f}', \mathbf{F})} is a simplified fast-slow solution supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F}, where

\displaystyle  (\mathbf{v}')^i := \mathbf{v}^i - \mathbf{d}^i

\displaystyle  \mathbf{R}^{ij} := - \mathbf{d}^i \mathbf{v}^j - \mathbf{v}^i \mathbf{d}^j - \mathbf{d}^i \mathbf{d}^j.

From (72) we have

\displaystyle  \| D^{\leq m} \mathbf{f}' \|_{L^2_{t,x,y}} \lesssim_{U,m} N_1^{-M} N^{-\varepsilon M}

so in particular for {M} large enough

\displaystyle  \| D^{\leq m} \mathbf{f}' \|_{L^2_{t,x,y}} \lesssim_{U,m} N^{-30}

for any {m \geq 0}. Meanwhile, another appeal to (72) yields

\displaystyle  \| D^{\leq m} \mathbf{d} \|_{L^2_{t,x,y}} \lesssim_{U,m} \sum_{m'=1}^M N_1^{-m'} N^{\varepsilon (-2m'+1)} \lesssim_U N^{-1} \ \ \ \ \ (73)

for any {m \geq 0}, and hence by (67) and the triangle inequality

\displaystyle  \| D^{\leq m} \mathbf{v}' \|_{L^2_{t,x,y}} \lesssim_{U,m} 1.

Similarly one has

\displaystyle  \| \mathbf{v}' \|_{L^2_{t,x,y}} \lesssim \| R \|_{L^1_{t,x}}^{1/2}.

Since {\mathbf{v}'} continues to be supported on the thin set {\Omega \times \bigcup_{k=1}^K T_k}, we can apply Hölder as before to conclude that

\displaystyle  \| D^{\leq m} \mathbf{v}' \|_{L^1_{t,x,y}} \lesssim_{U,m} N^{-30}

for any {m \geq 0}. Also, from (73) and Hölder we have

\displaystyle  \| \mathbf{R} \|_{L^1_{t,x,y}} \lesssim_{U} N^{-1}.

We have now achieved the bound (59); the remaining estimate that needs to be corrected for is (60). This we can do by a modification of the previous argument, where we now work to reduce the size of {\mathbf{F}} rather than {\mathbf{f}}. Observe that as {\mathbf{F}} is “high frequency” (mean zero in the {y} variable), one can write

\displaystyle  \mathbf{F}^i = (\partial_{x^j} + N_1 \partial_{y^j}) ( N_1^{-1} \Delta_y^{-1} \eta^{il} \partial_{y^l} \mathbf{F}^j + N_1^{-1} \Delta_y^{-1} \eta^{jl} \partial_{y^l} \mathbf{F}^i )

\displaystyle  - \partial_{y^i} ( \Delta_y^{-1} \partial_{y^j} \mathbf{F}^j )

\displaystyle  + N_1^{-1} (T \mathbf{F})^i

where {T} is the linear operator on smooth vector-valued functions on {\mathbf{\Omega}} of mean zero defined by the formula

\displaystyle  (T \mathbf{F})^i := -\eta^{il} \Delta_y^{-1} \partial_{x^j} \partial_{y^l} \mathbf{F}^j - \eta^{jl} \Delta_y^{-1} \partial_{x^j} \partial_{y^l} \mathbf{F}^i.

Note that {T\mathbf{F}} also has mean zero. We can thus iterate and obtain

\displaystyle  \mathbf{F}^i = (\partial_{x^j} + N_1 \partial_{y^j}) \mathbf{S}^{ij} - \partial_{y^i} \mathbf{q} + (\mathbf{F}')^i


where

\displaystyle  \mathbf{S}^{ij} := \sum_{m=1}^{M} N_1^{-m} ( \eta^{il} \Delta_y^{-1} \partial_{y^l} T^{m-1} \mathbf{F}^j + \eta^{jl} \Delta_y^{-1} \partial_{y^l} T^{m-1} \mathbf{F}^i )

and

\displaystyle  \mathbf{F}' := N_1^{-M} T^M \mathbf{F}

and {\mathbf{q}} is a smooth function whose exact form is explicit but irrelevant for our argument. We then see that {(\mathbf{v}', \mathbf{q}, \mathbf{R} + \mathbf{S}, \mathbf{f}', \mathbf{F}')} is a simplified fast-slow solution supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F}. Since {\Delta_y^{-1} \partial_{y^i}} is bounded on {L^1_y}, we see from (69) that

\displaystyle  \| \mathbf{S} \|_{L^1_{t,x,y}} \lesssim_{U,M} \sum_{m=1}^M N_1^{-m} \lesssim N^{-1+\varepsilon},


and similarly

\displaystyle  \| \mathbf{F}' \|_{L^1_{t,x,y}} \lesssim_{U,M} N_1^{-M} \lesssim N^{-30}

if {M} is large enough. Thus {(\mathbf{v}', \mathbf{q}, \mathbf{R} + \mathbf{S}, \mathbf{f}', \mathbf{F}')} obeys the required bounds (56)–(60), concluding the proof. \Box

As an application of this proposition we construct a low-regularity weak solution to high-dimensional Navier-Stokes that does not obey energy conservation. More precisely, for any {s \geq 0}, let {X^s} be the Banach space of periodic functions {u \in C^0_t H^s_x(\Omega \rightarrow {\bf R}^d)} which are divergence free, and of mean zero at every time. For {\nu \geq 0}, define a time-periodic weak {H^s} solution {u} of the Navier-Stokes (or Euler, if {\nu=0}) equations to be a function {u \in X^s} that solves the equation

\displaystyle  \partial_t u + \partial_j {\mathbb P} (u^j u) = \nu \Delta u

in the sense of distributions. (Note that one may easily define {{\mathbb P}} on {L^1_{t,x}} functions in a distributional sense, basically because the adjoint operator {{\mathbb P}^*} maps test functions to bounded functions.)

Corollary 28 (Low regularity non-trivial weak solutions) Assume that the dimension {d} is sufficiently large. Then for any {\nu \geq 0}, there exists a periodic weak {L^2} solution {u} to Navier-Stokes which equals zero at time {t=0}, but is not identically zero. In particular, periodic weak {L^2} solutions are not uniquely determined by their initial data, and do not necessarily obey the energy inequality

\displaystyle  \frac{1}{2} \int_{\mathbf{T}} |u(t,x)|^2\ dx \leq \frac{1}{2} \int_{\mathbf{T}} |u_0(x)|^2\ dx. \ \ \ \ \ (74)

Proof: Let {u^{(0)}} be an element of {X^0} that is supported on {[0.4, 0.6] \times \mathbf{T}_E} and is not identically zero (it is easy to show that such an element exists). By Proposition 25, we may then find a Navier-Stokes-Reynolds flow {(u^{(0)}, p^{(0)}, R^{(0)})} also supported on {[0.4, 0.6] \times \mathbf{T}_E}. Let {N_0} be sufficiently large. By applying Proposition 26 repeatedly (with say {\varepsilon=1/2}) and with a sufficiently rapidly increasing sequence {N_0 < N_1 < \dots}, we can find a sequence {(u^{(n)}, p^{(n)}, R^{(n)})} of Navier-Stokes-Reynolds flows supported on (say) {[0.3 + 2^{-n}/100, 0.7 - 2^{-n}/100] \times \mathbf{T}_E} obeying the bounds

\displaystyle  \| R^{(n+1)} \|_{L^1_{t,x}} \lesssim N_n^{-1/4}

\displaystyle  \|u^{(n+1)} - u^{(n)} \|_{L^2_{t,x}} \lesssim_d \| R^{(n)} \|_{L^1_{t,x}}^{1/2}

\displaystyle  \|u^{(n+1)} - u^{(n)} \|_{L^1_{t,x}} \lesssim N_n^{-9}

(say) for {n \geq 0}. For {N_n} sufficiently rapidly growing, this implies that {R^{(n)}} converges strongly in {L^1} to zero, while {u^{(n)}} converges strongly in {L^2} to some limit {u \in X^0} supported in {[0.3, 0.7] \times \mathbf{T}_E}. From the triangle inequality we have

\displaystyle  \|u - u^{(0)} \|_{L^1_{t,x}} \lesssim N_0^{-9}

(if {N_n} is sufficiently rapidly growing) and hence {u} is not identically zero if {N_0} is chosen large enough. Applying Leray projections to the Navier-Stokes-Reynolds equation we have

\displaystyle  \partial_t u^{(n)} + \partial_j {\mathbb P} ((u^{(n)})^j u^{(n)} ) = \partial_j {\mathbb P} (R^{(n)})^{\cdot j} + \nu \Delta u^{(n)}

in the sense of distributions (where {(R^{(n)})^{\cdot j}} is the vector field with components {(R^{(n)})^{ij}} for {i=1,\dots,d}); taking distributional limits as {n \rightarrow \infty}, we conclude that {u} is a periodic weak {L^2} solution to the Navier-Stokes equations, as required. \Box
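The role of the rapid growth of the {N_n} can be illustrated with some toy bookkeeping (the concrete schedule {N_{n+1} = N_n^2} below is an arbitrary choice of mine; the point is only the summability):

```python
# Numerical bookkeeping for the iteration just performed.  With the schedule
# N_{n+1} = N_n^2 standing in for "sufficiently rapidly increasing", the
# stress bound ||R^(n+1)||_{L^1} <~ N_n^{-1/4} decays doubly exponentially,
# so the L^2 increments ||u^(n+1) - u^(n)|| <~ ||R^(n)||^{1/2} form a
# convergent series: the velocities converge strongly in L^2 while the
# stresses tend to zero in L^1.
steps = 6
N = [1e2]
for _ in range(steps):
    N.append(N[-1] ** 2)               # N_{n+1} = N_n^2 (illustrative choice)

R = [1.0]                              # normalised ||R^(0)||_{L^1}
for n in range(steps):
    R.append(N[n] ** (-0.25))          # ||R^(n+1)||_{L^1} <~ N_n^{-1/4}

l2_increments = [r ** 0.5 for r in R]  # ||u^(n+1) - u^(n)||_{L^2} <~ ||R^(n)||^{1/2}
print(sum(l2_increments), R[-1])
```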

— 4. High regularity weak solutions to Navier-Stokes in high dimensions —

Now we refine the above arguments to give a higher regularity version of Corollary 28, in which we can give the weak solutions almost half a derivative of regularity in the Sobolev scale:

Theorem 29 (Non-trivial weak solutions) Let {0 < s < 1/2}, and assume that the dimension {d} is sufficiently large depending on {s}. Then for any {\nu \geq 0}, there exists a periodic weak {H^s} solution {u} which equals zero at time {t=0}, but is not identically zero. In particular, periodic weak {H^s} solutions are not uniquely determined by their initial data, and do not necessarily obey the energy inequality (74).

This result is inspired by a three-dimensional result of Buckmaster and Vicol (with a small value of {s>0}) and a higher dimensional result of Luo (taking {\alpha = 1/200}, and restricting attention to time-independent solutions). In high dimensions one can create fairly regular functions which are large in {L^2} type norms but tiny in {L^1} type norms; when using the Sobolev scale {H^s} to control the solution {u} (and {L^1} type norms to measure an associated stress tensor), this has the effect of allowing one to treat as negligible the linear terms {\partial_t u - \nu \Delta u} in (variants of) the Navier-Stokes equation, as well as interaction terms between low and high frequencies. As such, the analysis here is simpler than that required to establish the Onsager conjecture. The construction used to prove this theorem shows in fact that periodic weak {H^s} solutions are in some sense “dense” in {X^s}, but we will not attempt to quantify this fact here.

In the proof of Corollary 28, we took the frequency scales {N_n} to be extremely rapidly growing in {n}. This will no longer be good enough for proving Theorem 29, and in fact we need to take a fairly dense set of frequency scales in which {N_{n+1} = N_n^{1+\varepsilon}} for a small {\varepsilon}. In order to do so, we have to replace Proposition 26 with a more quantitative version in which the dependence of bounds on the size of the original Navier-Stokes-Reynolds flow {U} is made much more explicit.

We turn to the details. We select the following parameters:

  • A regularity {0 < s < 1/2};
  • A quantity {\varepsilon>0}, assumed to be sufficiently small depending on {s};
  • An integer {M \geq 1}, assumed to be sufficiently large depending on {s,\varepsilon}; and
  • A dimension {d}, assumed to be sufficiently large depending on {s,\varepsilon,M}.

Then we let {\nu\geq 0}. To simplify the notation we allow all implied constants to depend on {s,\varepsilon,M,d,\nu} unless otherwise specified. We recall from the previous section the notion of a Navier-Stokes-Reynolds flow {(u,p,R)}. The basic strategy is to start with a Navier-Stokes-Reynolds flow {(u,p,R)} and repeatedly adjust {u} by increasingly high frequency corrections in order to significantly reduce the size of the stress {R} (at the cost of making both of these expressions higher in frequency).

As before, we abbreviate {L^p_t L^p_x(\Omega \rightarrow {\bf R}^m)} as {L^p_{t,x}}. We write {\nabla_x} for the spatial gradient to distinguish it from the time derivative {\partial_t}.

The main iterative statement (analogous to Theorem 14) starts with a Navier-Stokes-Reynolds flow {(u_0,p_0,R_0)} oscillating at spatial scales up to some small wavelength {1/N_0}, and modifies it to a Navier-Stokes-Reynolds flow {(u_1,p_1,R_1)} oscillating at a slightly smaller wavelength {\sim 1/N_0^{1+\varepsilon}}, with a smaller Reynolds stress. It can be viewed as a more quantitative version of Proposition 26.

Theorem 30 (Iterative step) Let {N_0} be sufficiently large depending on the parameters {s,\varepsilon,M,d,\nu}. Set {s' := s + 10\varepsilon}. Suppose that one has a Navier-Stokes-Reynolds flow {(u^{(0)},p^{(0)},R^{(0)})} obeying the estimates

\displaystyle  \| (N_0^{-\varepsilon} \partial_t, N_0^{-1} \nabla_x)^{\leq M} \nabla_x u^{(0)} \|_{L^2_{t,x}} \leq A N_0^{1-s'} \ \ \ \ \ (75)

\displaystyle  \| R^{(0)} \|_{L^1_{t,x}} \leq A^2 N_0^{-2(1+\varepsilon)s' - 10\varepsilon^2}. \ \ \ \ \ (76)

for some {A \geq 1}. Set {N_1 := N_0^{1+\varepsilon}}. Then there exists a Navier-Stokes-Reynolds flow {(u^{(1)},p^{(1)},R^{(1)})} obeying the estimates

\displaystyle  \| (N_1^{-\varepsilon} \partial_t, N_1^{-1} \nabla_x)^{\leq M} \nabla_x u^{(1)} \|_{L^2_{t,x}} \leq A N_1^{1-s'}

\displaystyle  \| R^{(1)} \|_{L^1_{t,x}} \leq A^2 N_1^{-2(1+\varepsilon)s' - 10\varepsilon^2}

\displaystyle  \| (N_1^{-\varepsilon} \partial_t)^{\leq 1} (u^{(1)} - u^{(0)}) \|_{L^2_{t,x}} \lesssim A N_1^{-s'}

\displaystyle  \| u^{(1)} - u^{(0)} \|_{L^1_{t,x}} \lesssim A N_1^{-10}.

Furthermore, if {(u^{(0)},p^{(0)},R^{(0)})} is supported on {I \times \mathbf{T}_E} for some interval {I}, then one can ensure that {(u^{(1)}, p^{(1)},R^{(1)})} is supported in {I' \times \mathbf{T}_E}, where {I'} is the {N_0^{-\varepsilon^2}}-neighbourhood of {I}.

Let us assume Theorem 30 for the moment and establish Theorem 29. Let {u^{(0)} \in X^\infty} be chosen to be supported on (say) {[0.4,0.6] \times \mathbf{T}_E} and not be identically zero. By Proposition 25, we can then find a Navier-Stokes-Reynolds flow {U = (u^{(0)},p^{(0)},R^{(0)})} supported on {[0.4,0.6] \times \mathbf{T}_E}. Let {N_0 > 1} be a sufficiently large parameter, and set {A := N_0^{(1+\varepsilon)s' + 10 \varepsilon^2}}; then the hypotheses (75), (76) will be obeyed for {N_0} large enough. Set {N_{k+1} := N_k^{1+\varepsilon}} for all {k \geq 0}. By iteratively applying Theorem 30, we may find a sequence {(u^{(k)}, p^{(k)}, R^{(k)})} of Navier-Stokes-Reynolds flows, all supported on (say) {[0.3, 0.7] \times \mathbf{T}_E}, obeying the bounds

\displaystyle  \| (N_{k+1}^{-\varepsilon} \partial_t, N_{k+1}^{-1} \nabla_x)^{\leq M} \nabla_x u^{(k+1)} \|_{L^2_{t,x}} \leq A N_{k+1}^{1-s'}

\displaystyle  \| R^{(k+1)} \|_{L^1_{t,x}} \leq A^2 N_{k+1}^{-2(1+\varepsilon)s' - 10\varepsilon^2}

\displaystyle  \| (N_{k+1}^{-\varepsilon} \partial_t)^{\leq 1} (u^{(k+1)} - u^{(k)}) \|_{L^2_{t,x}} \lesssim A N_{k+1}^{-s'}

\displaystyle  \| u^{(k+1)} - u^{(k)} \|_{L^1_{t,x}} \lesssim A N_k^{-10}.

for {k \ge 0}. In particular, the {R^{(k)}} converge strongly to zero in {L^1_{t,x}}, and we have the bound

\displaystyle  \| u^{(k+1)} - u^{(k)}\|_{H^1_t H^s_x} \lesssim A N_{k+1}^{s-s'+\varepsilon}

from Plancherel’s theorem, and hence by Sobolev embedding in time

\displaystyle  \| u^{(k+1)} - u^{(k)}\|_{C^0_t H^s_x} \lesssim A N_{k+1}^{s-s'+\varepsilon}.

Thus {u^{(k)}} converges strongly in {C^0_t H^s_x} (and in particular also in {C^0_t L^p_x} for some {p>2}) to some limit {u^{(\infty)}}; as the {u^{(k)}} are all divergence-free, {u^{(\infty)}} is also. From applying Leray projections to (26) one has

\displaystyle  \partial_t u^{(k)} + \partial_j {\mathbb P} ( (u^{(k)})^j u^{(k)} ) = \nu \Delta u^{(k)} + \partial_j {\mathbb P} (R^{(k)})^{\cdot j}.

Taking weak limits we conclude that {u^{(\infty)}} is a weak solution to Navier-Stokes. Also, from construction one has

\displaystyle  \| u^{(0)} - u^{(\infty)} \|_{L^1_{t,x}} \lesssim A N_1^{-10} \lesssim N_0^{-1}

(say), and so for {N_0} large enough {u^{(\infty)}} is not identically zero. This proves Theorem 29.
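By contrast with the previous corollary, the frequency schedule here is much denser; the following sketch (all numerical parameters are illustrative choices of mine) checks that the increments nevertheless remain summable:

```python
import math

# Bookkeeping for the denser frequency schedule N_{k+1} = N_k^{1+eps} of this
# section.  With s' = s + 10*eps, the C^0_t H^s_x increments are
# ~ A N_{k+1}^{s-s'+eps} = A N_{k+1}^{-9 eps}; since log N_k grows
# geometrically, these still form a convergent series, giving strong
# convergence in C^0_t H^s_x.
eps, s, A = 0.01, 0.4, 1.0
sp = s + 10 * eps                        # s' = s + 10*eps
logN = [math.log(1e3)]                   # store log N_k to avoid overflow
for _ in range(2000):
    logN.append((1 + eps) * logN[-1])    # N_{k+1} = N_k^{1+eps}

increments = [A * math.exp((s - sp + eps) * L) for L in logN[1:]]
print(sum(increments), increments[0])
```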

It remains to establish Theorem 30. It will be convenient to introduce the intermediate frequency scales

\displaystyle  N_0 \leq \tilde N_0 \leq N'_1 \leq N_1


where

\displaystyle  \tilde N_0 := N_0^{1+\varepsilon^2}

is slightly larger than {N_0}, and

\displaystyle  N'_1 := \lfloor N_1^{1-\varepsilon^2} \rfloor

is slightly smaller than {N_1} (and constrained to be integer).

Before we begin the rigorous argument, we first give a heuristic explanation of the numerology. The initial solution {u^{(0)}} has about {M} degrees of regularity controlled at {N_0}. For technical reasons we will upgrade this to an infinite amount of regularity, at the cost of worsening the frequency bound slightly from {N_0} to {\tilde N_0}. Next, to cancel the Reynolds stress {R^{(0)}} up to a smaller error {R^{(1)}}, we will perturb {u^{(0)}} by some high frequency correction {v}, basically oscillating at spatial frequency {N_1} (and temporal frequency {N_1^\varepsilon}), so that {v^i v^j} is approximately equal to {-(R^{(0)})^{ij}} (minus a pressure term) after averaging at spatial scales {1/N'_1}. Given the size bound (76), one expects to achieve this with {v} of {L^2_{t,x}} norm about {A N_0^{-(1+\varepsilon)s' - 5 \varepsilon^2} = A N_1^{-s'} N_0^{-5\varepsilon^2}}. By exploiting the small gap between {N'_1} and {N_1}, we can make {v} concentrate on a fairly small measure set (of density something like {N_1^{-\varepsilon^2(d-1)}}), which in high dimension allows us to make linear terms such as {\partial_t v} and {\nu \Delta v} (as well as the correlation terms {\partial_j ((u^{(0)})^i v^j)} and {\partial_j (v^i (u^{(0)})^j)}) negligible in size (as measured using {L^1} type norms) when compared against quadratic terms such as {\partial_j (v^i v^j)} (cf. the proof of Proposition 26). The defect {\partial_j (v^i v^j) - \partial_j R^{ij}} will then oscillate at frequency {N'_1}, but can be selected to be of size about {A^2 \tilde N_0 (N_1^{-s'} N_0^{-5\varepsilon^2})^2} in {L^1_{t,x}} norm, because we can choose {v} to cancel off all the high-frequency (by which we mean {N'_1} or greater) contributions to this term, leaving only low frequency contributions (at frequencies {\tilde N_0} or below). Using the ellipticity of the Laplacian, we can then express this defect as {\partial_j \tilde R^{ij}} where the {L^1_{t,x}} norm of {\tilde R^{ij}} is of order

\displaystyle  A^2 (N'_1)^{-1} \tilde N_0 (N_1^{-s'} N_0^{-5\varepsilon^2})^2 = A^2 N_1^{-2s' - \varepsilon + O(\varepsilon^2)}.

When {s<1/2}, this is slightly less than {A^2 N_1^{-2(1+\varepsilon)s' - 10\varepsilon^2} }, allowing one to close the argument.
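The final comparison of exponents can be checked mechanically (the constant {10} standing in for the {O(\varepsilon^2)} term is an arbitrary choice of mine):

```python
# Exponent bookkeeping for the heuristic above.  The new stress is of size
# ~ A^2 N_1^{-2s' - eps + O(eps^2)} and must beat the target
# A^2 N_1^{-2(1+eps)s' - 10 eps^2}.  The constant c = 10 standing in for the
# O(eps^2) term is an arbitrary illustrative choice.
def margin(s, eps, c=10.0):
    sp = s + 10 * eps                          # s' = s + 10*eps
    achieved = -2 * sp - eps + c * eps ** 2    # exponent of the new stress
    target = -2 * (1 + eps) * sp - 10 * eps ** 2
    return target - achieved                   # > 0 iff the new stress is smaller

print(margin(0.45, 0.001))   # positive: the argument closes for s < 1/2
print(margin(0.50, 0.001))   # negative: the gain disappears at s = 1/2
```

Symbolically the margin is {\varepsilon(1-2s') - O(\varepsilon^2)}, which is positive precisely in the regime {s' < 1/2} with {\varepsilon} small.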

We now turn to the rigorous details. In a later part of the argument we will encounter a loss of derivatives, in that the new Navier-Stokes-Reynolds flow {(u^{(1)},p^{(1)},R^{(1)})} has lower amounts of controlled regularity (in both space and time) than the Navier-Stokes-Reynolds flow {(u^{(0)},p^{(0)},R^{(0)})} used to construct it. To counteract this loss of derivatives we need to perform an initial mollification step, which improves the amount of regularity from {M} derivatives in space and one in time to an unlimited number of derivatives in space and two derivatives in time, at the cost of worsening the estimates on {(u^{(0)},p^{(0)},R^{(0)})} slightly (basically by replacing {N_0} with {\tilde N_0}).

Proposition 31 (Mollification) Let the notation and hypotheses be as above. Then we can find a Navier-Stokes-Reynolds flow {(\tilde u^{(0)}, \tilde p^{(0)}, \tilde R^{(0)})} obeying the estimates

\displaystyle  \| (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla_x)^m \nabla_x \tilde u^{(0)} \|_{L^2_{t,x}} \lesssim_m A \tilde N_0^{1-s'} \ \ \ \ \ (77)


\displaystyle  \| (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla_x)^m \tilde R \|_{L^1_{t,x}} \lesssim_m A^2 N_0^{-9\varepsilon^2} N_1^{-2s'} \ \ \ \ \ (78)

for all {m \geq 0}, and such that

\displaystyle  \| (N_1^{-\varepsilon} \partial_t)^{\leq 1} (\tilde u^{(0)} - u^{(0)}) \|_{L^2_{t,x}} \lesssim A N_1^{-s'} \ \ \ \ \ (79)


\displaystyle  \| \tilde u^{(0)} - u^{(0)} \|_{L^1_{t,x}} \lesssim A N_1^{-10}. \ \ \ \ \ (80)

Furthermore, if {(u^{(0)},p^{(0)},R^{(0)})} is supported on {I \times \mathbf{T}_E} for some interval {I}, then one can ensure that {(\tilde u^{(0)}, \tilde p^{(0)},\tilde R^{(0)})} is supported in {I'' \times \mathbf{T}_E}, where {I''} is the {N_0^{-\varepsilon^2}/2}-neighbourhood of {I}.

We remark that this sort of mollification step is now a standard technique in any iteration scheme that involves loss of derivatives, including the Nash-Moser iteration scheme that was first used to prove Theorem 8.

Proof: Let {\phi: {\bf R} \times {\bf R}^d \rightarrow {\bf R}} be a bump function (depending only on {M}) supported on the region {\{ (s,y): |s| \leq 1/2M, |y| \leq 1/M\}} of total mass {1}, and define the averaging operator {P} on smooth functions {f: \Omega \rightarrow {\bf R}} by the formula

\displaystyle  Pf(t,x) := \int_{{\bf R} \times {\bf R}^d} \phi(s,y) f(t-\tilde N_0^{-\varepsilon} s,x- \tilde N_0^{-1} y)\ ds dy.

From the fundamental theorem of calculus we have

\displaystyle  P = I - Q

where {I} is the identity operator and

\displaystyle  Qf(t,x) := \int_0^1 \int_{{\bf R} \times {\bf R}^d} \phi(s,y)

\displaystyle  (s \tilde N_0^{-\varepsilon} \partial_t + y \cdot \tilde N_0^{-1} \nabla_x) f(t-\theta \tilde N_0^{-\varepsilon} s, x- \theta \tilde N_0^{-1} y)\ d\theta\ ds\ dy.

The operators {P} and {Q} will behave like low and high frequency Littlewood-Paley projections. (We cannot directly use these projections here because their convolution kernels are not localised in time.)

Observe that {P,Q} are convolution operators and thus commute with each other and with the partial derivatives {\partial_t, \nabla_x}. If we apply the operator {I - Q^M} to (26), (27), (28), we see that {(\tilde u^{(0)}, \tilde p^{(0)}, \tilde R^{(0)})} is Navier-Stokes-Reynolds flow, where

\displaystyle  \tilde u^{(0)} := (I-Q^M) u^{(0)}

\displaystyle  \tilde p^{(0)} := (I-Q^M) p^{(0)}

\displaystyle  \tilde R_{ij}^{(0)} := (I-Q^M) R_{ij}^{(0)}

\displaystyle + (I-Q^M)(u_i^{(0)} u_j^{(0)}) - ((I-Q^M) u_i^{(0)}) ((I-Q^M) u_j^{(0)}).

Since {Q=I-P}, {I-Q^M} is a linear combination of the operators {P, P^2, \dots, P^M}. In particular, we see that {(\tilde u^{(0)}, \tilde p^{(0)}, \tilde R^{(0)})} is supported on {I'' \times \mathbf{T}_E}.
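The fact, used repeatedly below, that {I-Q^M} is a linear combination of {P,\dots,P^M} with no identity component is just the binomial theorem applied to {Q = I-P}; a quick sanity check, representing an operator polynomial in {P} by its coefficient list:

```python
from math import comb

# I - Q^M = I - (I-P)^M expands, by the binomial theorem, into a linear
# combination of P, P^2, ..., P^M with no identity component.  Represent an
# operator polynomial in P by its coefficient list [P^0, P^1, ..., P^M].
M = 5
coeffs = [0] * (M + 1)
for k in range(M + 1):
    coeffs[k] -= comb(M, k) * (-1) ** k    # subtract the expansion of (I-P)^M
coeffs[0] += 1                             # add back the identity

print(coeffs[0])                           # no identity component survives
print(coeffs[1:])                          # the coefficients of P, ..., P^M
```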

We abbreviate {D := (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla x)}. For any {m \geq 0}, we have

\displaystyle  D^m P f(t,x) = \int_{{\bf R} \times {\bf R}^d} (\nabla_s, \nabla_y)^m \phi(s,y) f(t-\tilde N_0^{-\varepsilon} s,x- \tilde N_0^{-1} y)\ ds dy

and therefore deduce the bounds

\displaystyle  \| D^m P f \|_{L^p_{t,x}} \lesssim_{m} \| f \|_{L^p_{t,x}} \ \ \ \ \ (81)

for any {1 \leq p \leq \infty}, thanks to Young’s inequality. A similar application of Young’s inequality gives

\displaystyle  \| Q^m f \|_{L^p_{t,x}} \lesssim_{m} \| D^m f \|_{L^p_{t,x}} \ \ \ \ \ (82)

for all {m \geq 0} and {1 \leq p \leq \infty}.
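A one-dimensional caricature of these operators (with a periodic Gaussian standing in for the bump {\phi}, and illustrative numerical parameters of my choosing) shows the two estimates in action: {Q} nearly annihilates frequencies far below {\tilde N_0} while leaving much higher frequencies essentially untouched:

```python
import numpy as np

# 1-D caricature of the averaging operators P and Q = I - P from this proof.
# P is bounded on every L^p as in (81), while each factor of Q costs one
# derivative at scale 1/Ntilde as in (82): Q nearly annihilates functions
# oscillating at frequencies << Ntilde and barely affects frequencies
# >> Ntilde.  All specific numbers are illustrative.
n, Ntilde = 1024, 64.0
x = np.arange(n) / n
width = 1.0 / Ntilde

kernel = np.exp(-(np.minimum(x, 1 - x) / width) ** 2)  # periodic bump at 0
kernel /= kernel.sum()                                 # total mass 1

def P(f):
    # convolution with the bump, computed spectrally
    return np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(kernel)))

def Q(f):
    return f - P(f)

low = np.sin(2 * np.pi * 2 * x)      # frequency 2 << Ntilde
high = np.sin(2 * np.pi * 300 * x)   # frequency 300 >> Ntilde
print(np.abs(Q(low)).max(), np.abs(Q(high)).max())
```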

From (81) and decomposing {I-Q^M} as a linear combination of {P,P^2,\dots,P^M}, we have

\displaystyle  \| D^m \nabla_x \tilde u^{(0)} \|_{L^2_{t,x}} \lesssim_{m} \| \nabla_x u^{(0)} \|_{L^2_{t,x}}

for any {m \geq 0}, and hence (77) follows from (75). In a similar spirit, from (82), (75) one has

\displaystyle  \| (N_1^{-\varepsilon} \partial_t)^{\leq 1} (\tilde u^{(0)} - u^{(0)}) \|_{L^2_{t,x}} \lesssim \| D Q^{M-1} (1-P) u^{(0)} \|_{L^2_{t,x}}

\displaystyle  \lesssim \| D^{M} u^{(0)} \|_{L^2_{t,x}}

\displaystyle  \leq \| N_0^{-\varepsilon^3 M} (N_0^{-1} \nabla_x, N_0^{-\varepsilon} \partial_t)^{M} u^{(0)} \|_{L^2_{t,x}}

\displaystyle  \lesssim_{s,\varepsilon,M,d,\nu} A N_1^{-10}

if {M} is large enough, and this gives (79), (80).

Finally we prove (78). By the triangle inequality it suffices to show that

\displaystyle  \| D^m (I-Q^M) R^{(0)} \|_{L^1_{t,x}} \lesssim_m A^2 N_0^{-9\varepsilon^2} N_1^{-2s'} \ \ \ \ \ (83)


\displaystyle  \| D^m ((I-Q^M)(u_i^{(0)} u_j^{(0)}) - ((I-Q^M) u_i^{(0)}) ((I-Q^M) u_j^{(0)})) \|_{L^1_{t,x}} \lesssim_m A^2 N_0^{-9\varepsilon^2} N_1^{-2s'} \ \ \ \ \ (84)

for any {m \geq 0}. The claim (83) follows from (81), (76), after writing {I-Q^M} as a linear combination of {P,\dots,P^M} and noting that {N_0^{-2(1+\varepsilon)s' - 10\varepsilon^2} = N_0^{-10\varepsilon^2} N_1^{-2s'}}. For (84), if we again write {I-Q^M} as a linear combination of {P,\dots,P^M} and use (81) and the Leibniz rule, one can bound the left-hand side of (84) by

\displaystyle  \lesssim_m \sum_{m_1+m_2=m} \| D^{m_1} u^{(0)} \|_{L^2_{t,x}} \| D^{m_2} u^{(0)} \|_{L^2_{t,x}}

and hence by (75) (bounding {D} by {N_0^{-\varepsilon^2} (N_0^{-1} \nabla_x, N_0^{-\varepsilon} \partial_t)}) this is bounded by

\displaystyle  \lesssim_m N_0^{-\varepsilon^2 m} A^2 N_0^{2(1-s')}.

This gives (84) when {m \geq M/2}. For {m<M/2}, we rewrite the expression

\displaystyle  (I-Q^M)(u_i^{(0)} u_j^{(0)}) - ((I-Q^M) u_i^{(0)}) ((I-Q^M) u_j^{(0)})


as

\displaystyle  -Q^M(u_i^{(0)} u_j^{(0)}) + (Q^M u_i^{(0)}) ((I-Q^M) u_j^{(0)}) + u_i^{(0)} Q^M u_j^{(0)}.

The contribution of the first term to (84) can be bounded using (82), (81) (splitting {Q^M = Q^{M-k} (I-P)^k}) by

\displaystyle  \lesssim \| D^M (u_i^{(0)} u_j^{(0)}) \|_{L^1_{t,x}}

which by the Leibniz rule, bounding {D} by {N_0^{-\varepsilon^2} (N_0^{-1} \nabla_x, N_0^{-\varepsilon} \partial_t)}, and (75) is bounded by

\displaystyle  \lesssim N_0^{-\varepsilon^2 M} A^2 N_0^{2(1-s')}

which is again an acceptable contribution to (84) since {M} is large. The other terms are treated similarly. \Box

We return to the proof of Theorem 30. We abbreviate {D_1 := (N_1^{-\varepsilon} \partial_t, N_1^{-1} \nabla_x)}. Let {\tilde U := (\tilde u^{(0)}, \tilde p^{(0)}, \tilde R^{(0)})} be the Navier-Stokes-Reynolds flow constructed by Proposition 31. By using the ansatz

\displaystyle  (u^{(1)}, p^{(1)}, R^{(1)}) = (\tilde u^{(0)} + v, \tilde p^{(0)} + q, R^{(1)} ),

and the triangle inequality, it will suffice to locate a difference Navier-Stokes-Reynolds flow {(v, q, R^{(1)})} at {\tilde U} supported on {I' \times \mathbf{T}_E}, obeying the estimates

\displaystyle  \| D_1^{\leq M} \nabla_x (\tilde u^{(0)} + v) \|_{L^2_{t,x}} \leq A N_1^{1-s'} \ \ \ \ \ (85)

\displaystyle  \| R^{(1)} \|_{L^1_{t,x}} \lesssim A^2 N_1^{-2(1+\varepsilon)s' - 11\varepsilon^2} \ \ \ \ \ (86)

\displaystyle  \| D_1^{\leq 1} v \|_{L^2_{t,x}} \lesssim A N_1^{-s'} \ \ \ \ \ (87)

\displaystyle  \| v \|_{L^1_{t,x}} \lesssim A N_1^{-10}. \ \ \ \ \ (88)

From (77) we have

\displaystyle  \| D_1^{\leq M} \nabla_x \tilde u^{(0)} \|_{L^2_{t,x}} \leq A N_1^{1-s'}/2

so by the triangle inequality we can replace (85) by

\displaystyle  \| D_1^{\leq M} \nabla_x v \|_{L^2_{t,x}} \leq A N_1^{1-s'}/2

and then (85), (87) may be replaced by the single estimate

\displaystyle  \| D_1^{\leq M+1} v \|_{L^2_{t,x}} \lesssim A N_1^{-s'-\varepsilon^2} \ \ \ \ \ (89)

(say). By using Exercise 24 as in the previous section, it then suffices to construct an approximate difference Navier-Stokes-Reynolds flow {(v, q, R^{(1)}, f, F)} to the difference equation at {\tilde U} supported in {I' \times \mathbf{T}_E} obeying the bounds (86), (89),

\displaystyle  \| D_1^{\leq M} f \|_{L^2_{t,x}} \lesssim A N_1^{-20} \ \ \ \ \ (90)


\displaystyle  \| F \|_{L^1_{t,x}} \lesssim A N_1^{-20}. \ \ \ \ \ (91)

Now, we pass to fast and slow variables. Let {\mathbf{D}} denote the tuple

\displaystyle  \mathbf{D} := (N_1^{-\varepsilon} \partial_t, N_1^{-1} \nabla_x, N_1^{-\varepsilon^2} \nabla_y);

informally, the use of {\mathbf{D}} is consistent with oscillations in time of wavelength {\gtrsim N_1^{-\varepsilon}}, in the slow variable {x} of wavelength {\gtrsim N_1^{-1}}, and in the fast variable {y} of wavelength {\gtrsim N_1^{-\varepsilon^2} \sim N'_1/N_1}.

Exercise 32 By using the method of fast and slow variables as in the previous section, show that to construct the approximate Navier-Stokes-Reynolds flow {(v, q, R^{(1)}, f, F)} at {\tilde U} obeying the bounds (86), (89), (90), (91), it suffices to locate an approximate fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} to the difference Navier-Stokes-Reynolds equation at {\tilde U} (at frequency scale {N'_1 \sim N_1^{1-\varepsilon^2}} rather than {N_1}) and supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F} that obeys the bounds

\displaystyle  \| \mathbf{D}^{\leq M+1} \mathbf{v} \|_{L^2_{t,x,y}} \lesssim A N_1^{-s'-\varepsilon^2}. \ \ \ \ \ (92)

\displaystyle  \| \mathbf{R} \|_{L^1_{t,x,y}} \lesssim A^2 N_1^{-2(1+\varepsilon)s' - 11\varepsilon^2} \ \ \ \ \ (93)

\displaystyle  \| \mathbf{v} \|_{L^1_{t,x,y}} \lesssim A N_1^{-20} \ \ \ \ \ (94)

\displaystyle  \| \mathbf{D}^{\leq M} \mathbf{f} \|_{L^2_{t,x,y}} \lesssim A N_1^{-20} \ \ \ \ \ (95)

\displaystyle  \| \mathbf{F} \|_{L^1_{t,x,y}} \lesssim A N_1^{-20}. \ \ \ \ \ (96)

As in the previous section, we can then pass to simplified fast-slow solutions:

Exercise 33 Show that to construct the approximate fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} to the difference equation at {\tilde U} obeying the estimates of the previous exercise, it will in fact suffice to locate a simplified fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} at {\tilde U} (again at frequency scale {N'_1}) supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F}, obeying the bounds (92), (93), (95), (96) and

\displaystyle  \| \mathbf{D}^{\leq 2} \mathbf{v} \|_{L^2_t L^2_x L^1_y} \lesssim A N_1^{-30}. \ \ \ \ \ (97)

(Hint: one will need the estimate

\displaystyle  \| \tilde u^{(0)} \|_{L^2_t L^2_x L^\infty_y} = \| \tilde u^{(0)} \|_{L^2_{t,x}} \lesssim A \tilde N_0^{1-s'}

from Proposition 31.)

Now we need to construct a simplified fast-slow solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F})} at {\tilde U} supported on {I' \times \mathbf{T}_E \times \mathbf{T}_F} obeying the bounds (92), (93), (95), (96), (97). As in the previous section, we do this in stages, first finding a solution that cancels off the top order terms {N'_1 \partial_{y^j}(\mathbf{v}^i \mathbf{v}^j)} and {N'_1 \partial_{y^i} \mathbf{v}^i}, and also such that {\mathbf{v}^i \mathbf{v}^j + \tilde R^{ij} + q \eta^{ij}} is “high frequency” (mean zero in {y}). Then we apply a divergence corrector to completely eliminate {\mathbf{f}}, followed by a stress corrector that almost completely eliminates {\mathbf{F}}.

As before, we need to select {q} so that {-\tilde R^{ij} + q \eta^{ij}} is positive definite. In the previous section we essentially took {q} to be a large multiple of {|\tilde R|}, but now we will need good control on the derivatives of {q}, which requires a little more care. Namely, we will need the following technical lemma:

Lemma 34 (Smooth polar-type decomposition) There exists a factorisation {\tilde R = w^2 S}, where {w: \Omega \rightarrow {\bf R}}, {S: \Omega \rightarrow {\bf R}^{d^2}} are smooth, supported on {I' \times \mathbf{T}_E}, and obey the estimates

\displaystyle  \| (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla_x)^{\leq 10M} w \|_{L^2_{t,x}} \lesssim A N_0^{-9\varepsilon^2/2} N_1^{-s'} \ \ \ \ \ (98)

\displaystyle  \| (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla_x)^{\leq 10M} S \|_{L^\infty_{t,x}} \lesssim 1. \ \ \ \ \ (99)

Proof: We may assume that {\tilde R} is not identically zero, since otherwise the claim is trivial. For brevity we write {D := (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla_x)} and {Q := A^2 N_0^{-9\varepsilon^2} N_1^{-2s'}}. From (78) we have

\displaystyle  \| D^{\leq 20M} \tilde R \|_{L^1_{t,x}} \lesssim Q. \ \ \ \ \ (100)

Let {B} denote the spacetime cylinder {B := \{ (s,y) \in {\bf R} \times {\bf R}^d: |s| \leq 1, |y| \leq 1 \}}, and let {F: \Omega \rightarrow {\bf R}^+} denote the maximal function

\displaystyle  F(t,x) := \sup_{(s,y) \in B} |D^{\leq 10M} \tilde R(t+\tilde N_0^{-\varepsilon} s,x+ \tilde N_0^{-1} y)|.

From the fundamental theorem of calculus (or Sobolev embedding) one has the pointwise estimate

\displaystyle  F(t,x) \lesssim \int_B |D^{\leq 20M} \tilde R(t+\tilde N_0^{-\varepsilon} s,x+ \tilde N_0^{-1} y)|\ ds dy,

thus by Fubini’s theorem and (100)

\displaystyle  \| F \|_{L^1_{t,x}} \lesssim Q. \ \ \ \ \ (101)
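(The pointwise bound used above is a rescaled version of the elementary fact that the supremum of a function over a unit cube is controlled by the integral of the function together with its derivatives over that cube. Here is a quick one-dimensional numerical illustration of this fact; the test function {f}, the base point and the grid resolution are illustrative choices, not part of the argument.)

```python
import math

# sup_{|s| <= 1} |f(t+s)| is bounded by the integral of |f| + |f'| over
# [t-1, t+1]: by the fundamental theorem of calculus, the value at any point
# equals an average plus an integral of the derivative.
f = lambda x: math.sin(3 * x)
fp = lambda x: 3 * math.cos(3 * x)

t, n = 0.3, 20000
h = 2.0 / n
grid = [t - 1 + i * h for i in range(n + 1)]

sup_f = max(abs(f(x)) for x in grid)
integral = sum((abs(f(x)) + abs(fp(x))) * h for x in grid[:-1])

# the sup is dominated by the integral (constant 1 suffices on a length-2 interval)
assert sup_f <= integral
```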

We do not have good control on the derivatives of {F}, so we apply a smoothing operator. Let {G: \Omega \rightarrow {\bf R}^+} denote the function

\displaystyle  G(t,x) := \int_{{\bf R} \times {\bf R}^d} F( t + \tilde N_0^{-\varepsilon} s, x+ \tilde N_0^{-1} y) \langle (s,y) \rangle^{-10d}\ ds dy

where {\langle (s,y) \rangle := (1+|s|^2+|y|^2)^{1/2}}, then by Fubini’s theorem (or Young’s inequality)

\displaystyle  \| G \|_{L^1_{t,x}} \lesssim Q. \ \ \ \ \ (102)

Also, {G} is smooth and strictly positive everywhere, and from differentiation under the integral sign and integration by parts we have

\displaystyle  D^m G(t,x) \lesssim_{m} \int_{{\bf R} \times {\bf R}^d} F( t + \tilde N_0^{-\varepsilon} s, x+ \tilde N_0^{-1} y) |\nabla_{s,y}^{\leq m} \langle (s,y) \rangle^{-10d}|\ ds dy

and hence

\displaystyle  D^m G \lesssim_{m} G \ \ \ \ \ (103)

for any {m \geq 0}. Also, from construction one has

\displaystyle  |D^{\leq 10M} \tilde R(t,x)| \lesssim \int_B F( t + \tilde N_0^{-\varepsilon} s, x + \tilde N_0^{-1} y)\ ds dy \ \ \ \ \ (104)

\displaystyle  \lesssim G(t,x).

Write {w := G^{1/2}}. From many applications of the chain rule (or the Faà di Bruno formula), we see that for any {m \geq 0}, {D^m w} is a linear combination of {O_m(1)} terms of the form

\displaystyle  (D^{m_1} G) \dots (D^{m_k} G) G^{\frac{1}{2} - k}

where {m_1,\dots,m_k \geq 1} sum up to {m} (more precisely, each component of {D^m w} is a linear combination of expressions of the above form in which one works with individual components of each factor {D^{m_i} G} rather than the full tuple {D^{m_i} G}). From (103) we thus have the pointwise estimate

\displaystyle  D^m w \lesssim_{m} G^{1/2}

for any {m}, and (98) now follows from (102). A similar argument gives

\displaystyle  D^m(G^{-1}) \lesssim_{m} G^{-1}

for any {m \geq 0}, hence if we set {S := G^{-1} \tilde R}, then by the product rule

\displaystyle  D^{\leq 10M} S \lesssim G^{-1} D^{\leq 10M} \tilde R

and (99) now follows from (104).

Strictly speaking we are not quite done because {w} is not supported in {I' \times \mathbf{T}_E}, but if one applies a smooth cutoff function in time that equals {1} on {I''} (where {\tilde R} is supported in time) and vanishes outside of {I'}, we obtain the required support property without significantly affecting the estimates. \Box
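The key trick in this proof is that convolving a nonnegative function against a kernel with polynomial decay produces a smoothed function whose derivatives are pointwise dominated by the function itself, as in (103), because the derivatives of the kernel are pointwise dominated by the kernel. A minimal one-dimensional numerical sketch of this phenomenon (the kernel exponent and the spiky profile {F} below are illustrative choices, not the ones in the text):

```python
# Smoothing a rough nonnegative F against the polynomially-decaying kernel
# K(u) = (1+u^2)^{-5} gives G = F * K with |G'| <= 5 G pointwise, since
# |K'(u)| = 10|u|(1+u^2)^{-6} <= 5 K(u) for all u (by AM-GM, 2|u| <= 1+u^2).
K = lambda u: (1 + u * u) ** -5
Kp = lambda u: -10 * u * (1 + u * u) ** -6

# F: two narrow spikes of different heights (an illustrative rough profile)
F = lambda s: (1.0 if abs(s - 2) < 0.05 else 0.0) + (3.0 if abs(s + 1) < 0.02 else 0.0)

h = 0.01
src = [-5 + j * h for j in range(1001)]           # discretised source interval
def G(t):  return sum(F(s) * K(t - s) * h for s in src)
def Gp(t): return sum(F(s) * Kp(t - s) * h for s in src)

# check the pointwise derivative bound |G'| <= 5 G on a sample of points
ratio = max(abs(Gp(t)) / G(t) for t in [-4 + 0.5 * i for i in range(17)])
assert ratio <= 5.0 + 1e-9
```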

Let {\tilde R = w^2 S} be the factorisation given by the above lemma. If we set {q := C w^2} for a sufficiently large constant {C} depending only on {s,\varepsilon,M,d}, then

\displaystyle  -\tilde R^{ij} + q \eta^{ij} = w^2 ( C \eta^{ij} - S^{ij} ).

For {C} large enough, we see from (99) that the matrix with entries {C \eta^{ij} - S^{ij}} takes values in a compact subset of the positive definite {d \times d} matrices that depends only on {s,\varepsilon,M,d}. Applying Exercise 27, we conclude that there exist non-zero lattice vectors {e_1,\dots,e_K \in {\bf Z}^d} and smooth functions {b_1,\dots,b_K: \Omega \rightarrow {\bf R}} for some {K \geq 0} such that

\displaystyle  C \eta^{ij} - S^{ij}(t,x) = \sum_{k=1}^K b_k(t,x)^2 e_k^i e_k^j

for all {(t,x) \in \Omega}, and furthermore (from (99) and the chain rule) we have the derivative estimates

\displaystyle  \| (\tilde N_0^{-1} \nabla_x, \tilde N_0^{-\varepsilon} \partial_t)^{\leq 2M} b_k \|_{L^\infty_{t,x}} \lesssim 1

for {k=1,\dots,K}. Setting {a_k := w b_k}, we thus have

\displaystyle  -\tilde R^{ij} + q \eta^{ij} = \sum_{k=1}^K a_k(t,x)^2 e_k^i e_k^j

and from the Leibniz rule and (98) we have

\displaystyle  \| (\tilde N_0^{-\varepsilon} \partial_t, \tilde N_0^{-1} \nabla_x)^{\leq 10M} a_k \|_{L^2_{t,x}} \lesssim A N_0^{-9\varepsilon^2/2} N_1^{-s'} \ \ \ \ \ (105)

for {k=1,\dots,K}.
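The decomposition of a positive definite matrix into nonnegative combinations of rank one matrices {e_k^i e_k^j} with lattice directions {e_k} (the content of Exercise 27, whose precise statement is not reproduced in this section) can be illustrated in a toy two-dimensional setting, assuming the matrix is diagonally dominant; the particular directions {(1,0), (0,1), (1,\pm 1)} below are one simple illustrative choice:

```python
# Toy 2x2 analogue of the lattice rank-one decomposition: write a diagonally
# dominant positive definite M as sum_k c_k e_k e_k^T with c_k >= 0 and
# e_k in {(1,0), (0,1), (1,1), (1,-1)}.
def decompose(M):
    """Return {e_k: c_k} with c_k >= 0 and M = sum_k c_k e_k e_k^T."""
    m11, m12, m22 = M[0][0], M[0][1], M[1][1]
    c_diag = abs(m12)  # the off-diagonal entry is carried by (1,1) or (1,-1)
    coeffs = {(1, 0): m11 - c_diag, (0, 1): m22 - c_diag,
              ((1, 1) if m12 >= 0 else (1, -1)): c_diag}
    assert all(c >= 0 for c in coeffs.values()), "M not diagonally dominant"
    return coeffs

M = [[2.0, -1.0], [-1.0, 3.0]]
coeffs = decompose(M)

# reconstruct M from the rank-one pieces and check it matches
recon = [[0.0, 0.0], [0.0, 0.0]]
for (e1, e2), c in coeffs.items():
    recon[0][0] += c * e1 * e1
    recon[0][1] += c * e1 * e2
    recon[1][1] += c * e2 * e2
assert (recon[0][0], recon[0][1], recon[1][1]) == (M[0][0], M[0][1], M[1][1])
```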

Let {T_1,\dots,T_K} be the disjoint tubes in {\mathbf{T}_F} from the previous section, with width {N_1^{-\varepsilon^2}} rather than {N^{-\varepsilon}}. Construct the functions {\mathbf{\psi}_k: \mathbf{T}_F \rightarrow {\bf R}} as in the previous section, and again set

\displaystyle  \mathbf{w}_k^i(y) := \Delta^M_y \mathbf{\psi}_k(y) e_k^i.

Then as before, each {\mathbf{w}_k} is divergence free, obeys the identities (65), (66), the identity

\displaystyle  \partial_{y^j}( \mathbf{w}_k^i \mathbf{w}_k^j ) = 0

and the normalisation

\displaystyle  \int_{\mathbf{T}_F} \mathbf{w}_k^i \mathbf{w}_k^j = e_k^i e_k^j \ \ \ \ \ (106)

and the bounds
\displaystyle  \| (N_1^{-\varepsilon^2} \nabla_y)^m (N_1^{-2\varepsilon^2} \Delta_y)^{-m'} \mathbf{w}_k \|_{L^2_y} \lesssim_{U,m} 1 \ \ \ \ \ (107)

for all {m \geq 0} and {0 \leq m' \leq M}. As in the preceding section, we then set

\displaystyle \mathbf{v}^i(t,x,y) := \sum_{k=1}^K a_k(t,x) \mathbf{w}_k^i

\displaystyle \mathbf{F}^i(t,x,y) := \partial_{x^j}( \sum_{k=1}^K a_k(t,x)^2 (\mathbf{w}_k^i(y) \mathbf{w}_k^j(y) - e_k^i e_k^j) )

\displaystyle \mathbf{f}(t,x,y) := \sum_{k=1}^K \partial_{x^i} a_k(t,x) \mathbf{w}_k^i(y)

and one easily checks that {(\mathbf{v}, q, 0, \mathbf{f}, \mathbf{F})} is a simplified fast-slow solution supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F}. Direct calculation using the Leibniz rule and (105), (107) then gives the bounds

\displaystyle  \| \mathbf{D}^{\leq 10M} \mathbf{v} \|_{L^2_{t,x,y}} \lesssim_{M} A N_0^{-9\varepsilon^2/2} N_1^{-s'} \ \ \ \ \ (108)

\displaystyle  \| \mathbf{D}^{\leq 10M-1} \mathbf{f} \|_{L^2_{t,x,y}} \lesssim_{M} A \tilde N_0 N_0^{-9\varepsilon^2/2} N_1^{-s'} \ \ \ \ \ (109)

\displaystyle  \| \mathbf{D}^{\leq 10M-1} \mathbf{F} \|_{L^1_{t,x,y}} \lesssim_{M} A^2 \tilde N_0 N_0^{-9\varepsilon^2} N_1^{-2s'}. \ \ \ \ \ (110)

As before, {\mathbf{F}} is “high frequency” (mean zero in the {y} variable). Also, {\mathbf{v}} is supported on the set {\Omega \times \bigcup_{k=1}^K T_k}, and for {d} large enough the latter set {\bigcup_{k=1}^K T_k} has measure (say) {O_{s,\varepsilon,M,d,\nu}(N_1^{-100})}. Thus by Cauchy-Schwarz (in just the {y} variable) one has

\displaystyle  \| \mathbf{D}^{\leq 10M} \mathbf{v} \|_{L^2_t L^2_x L^1_y} \lesssim A N_1^{-30}. \ \ \ \ \ (111)
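The Cauchy-Schwarz step here is just the inequality {\|g\|_{L^1} \leq |\mathrm{supp}(g)|^{1/2} \|g\|_{L^2}}: a function supported on a set of tiny measure has much smaller {L^1} norm than {L^2} norm. A discrete numerical sketch (the random profile and the 1% support fraction are illustrative choices):

```python
import math, random

# ||f||_{L^1} <= |supp f|^{1/2} ||f||_{L^2} on the unit interval, discretised:
# f is supported on the first 1% of a uniform grid.
random.seed(0)
n = 10000
h = 1.0 / n
f = [random.uniform(-1, 1) if i < n // 100 else 0.0 for i in range(n)]

l1 = sum(abs(v) * h for v in f)
l2 = math.sqrt(sum(v * v * h for v in f))
measure = (n // 100) * h                     # measure of the support

# Cauchy-Schwarz against the indicator of the support
assert l1 <= math.sqrt(measure) * l2 + 1e-12
```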

The divergence corrector can be applied without difficulty:

Exercise 35 Show that there is a simplified fast-slow solution {(\mathbf{v}', q, \mathbf{R}, \mathbf{f}', \mathbf{F})} supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F} obeying the estimates

\displaystyle  \| \mathbf{D}^{\leq 5M} \mathbf{v}' \|_{L^2_{t,x,y}} \lesssim A N_0^{-9\varepsilon^2/2} N_1^{-s'}

\displaystyle  \| \mathbf{D}^{\leq 5M} \mathbf{v}' \|_{L^2_t L^2_x L^1_y} \lesssim A N_1^{-30}

\displaystyle  \| \mathbf{D}^{\leq 5M} \mathbf{f}' \|_{L^2_{t,x,y}} \lesssim A N_1^{-30}

\displaystyle  \| \mathbf{D}^{\leq 5M} \mathbf{R} \|_{L^1_{t,x,y}} \lesssim A^2 \tilde N_0 N_1^{-1} N_0^{-9\varepsilon^2} N_1^{-2s'}.

The crucial thing here is the tiny gain {\tilde N_0 N_1^{-1}} in the final estimate, with the first factor {\tilde N_0} coming from a “slow” derivative {\nabla_x} and the second factor {N_1^{-1}} coming from essentially inverting a “fast” derivative {\nabla_y}.

Finally, we apply a stress corrector:

Exercise 36 Show that there is a simplified fast-slow solution {(\mathbf{v}', \mathbf{q}, \mathbf{R}', \mathbf{f}', \mathbf{F}')} supported in {I' \times \mathbf{T}_E \times \mathbf{T}_F} obeying the estimates

\displaystyle  \| \mathbf{F}' \|_{L^1_{t,x,y}} \lesssim A^2 N_1^{-30}

\displaystyle  \| \mathbf{D}^{\leq M} \mathbf{R}' \|_{L^1_{t,x,y}} \lesssim A^2 \tilde N_0 (N'_1)^{-1} N_0^{-9\varepsilon^2} N_1^{-2s'}.

Again, we have a crucial gain of {\tilde N_0 (N'_1)^{-1}} coming from applying a slow derivative and inverting a fast one.


Since

\displaystyle  \tilde N_0 (N'_1)^{-1} N_0^{-9\varepsilon^2} N_1^{-2s'} \lesssim N_1^{-2s' - \varepsilon + O( \varepsilon^2 )}

(with implied constant in the exponent uniform in {\varepsilon}) and {s' < s < 1/2}, we see (for {\varepsilon} small enough) that

\displaystyle \tilde N_0 (N'_1)^{-1} N_0^{-9\varepsilon^2} N_1^{-2s'} \leq N_1^{-2(1+\varepsilon)s' - 11\varepsilon^2}

and the desired estimates (92), (93), (95), (96), (97) now follow.

— 5. Constructing low regularity weak solutions to Euler —

Throughout this section, we specialise to the Euler equations ({\nu=0}) in the three-dimensional case {d=3} (although all of the arguments here extend without much modification to {d>3}). In this section we establish an analogue of Corollary 28:

Proposition 37 (Low regularity non-trivial weak solutions) There exists a periodic weak {C^0} solution {u} to Euler which equals zero at time {t=0}, but is not identically zero.

This result was first established by de Lellis and Szekelyhidi. Our approach will deviate from the one in that paper in a number of technical respects (for instance, we use Mikado flows in place of Beltrami flows, and we place more emphasis on the method of fast and slow variables). A key new feature, which was not present in the high-dimensional Sobolev-scale setting, is that the material derivative term {(\partial_t + u_j \partial_j) v_i} in the difference Euler-Reynolds equations is no longer negligible, and needs to be treated by working with an ansatz in Lagrangian coordinates (or equivalently, an ansatz transported by the flow). (This use of Lagrangian coordinates is implicit in the thesis of Isett, this paper of de Lellis and Szekelyhidi, and in the later work of Isett.)

Just as Corollary 28 was derived from Proposition 26, the above proposition may be derived from

Proposition 38 (Weak improvement of Euler-Reynolds flows) Let {U = (u,p,R)} be an Euler-Reynolds flow supported on a strict compact subinterval {I \subsetneq {\bf R}/{\bf Z}}. Let {I'} be another interval in {{\bf R}/{\bf Z}} containing {I} in its interior. Then for sufficiently large {N}, there exists an Euler-Reynolds flow {\tilde U = (\tilde u, \tilde p, \tilde R)} supported in {I' \times \mathbf{T}_E} obeying the estimates

\displaystyle  \| (N^{-1} \partial_t, N^{-1} \nabla_x)^{\leq m} \tilde u \|_{C^0_{t,x}} \lesssim_{U,m,I,I'} 1 \ \ \ \ \ (112)

\displaystyle  \| \tilde R \|_{C^0_{t,x}} \lesssim_{U,I,I'} N^{-1} \ \ \ \ \ (113)

for all {m \geq 0}, and such that

\displaystyle  \|\tilde u - u \|_{C^0_{t,x}} \lesssim_{I,I'} \| R \|_{C^0_{t,x}}^{1/2}; \ \ \ \ \ (114)

also, we have a decomposition

\displaystyle  \tilde u^i - u^i = E^i + \partial_j (E')^{ij} \ \ \ \ \ (115)

where {E^i, (E')^{ij}: \Omega \rightarrow {\bf R}} are smooth functions obeying the bounds

\displaystyle  \| E^i \|_{C^0_{t,x}}, \|(E')^{ij} \|_{C^0_{t,x}} \lesssim_{U,I,I'} N^{-1}. \ \ \ \ \ (116)

The point of the decomposition (115) is that it (together with the smallness bounds (116)) asserts that the velocity correction {\tilde u^i - u^i} is mostly “high frequency” in nature, in that its low frequency components are small. Together with (112), these bounds roughly speaking assert that only the frequency {\sim N} components of {\tilde u^i - u^i} can be large in {C^0_{t,x}} norm. Unlike in the previous estimates, it will be important for our arguments that {u} is supported in a strict subinterval {I} of {{\bf R}/{\bf Z}}, because we will not be able to extend Lagrangian coordinates periodically around the circle. In fact, the long-time instability of Lagrangian coordinates causes significant technical difficulties when one wants to construct solutions in higher regularity Hölder spaces {C^{0,\alpha}}, in particular for {\alpha} close to {1/3}; we discuss this in the next section.
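The mechanism behind (115) and (116) is that a frequency {\sim N} function of unit size is the derivative of a function of size {O(N^{-1})}, and hence is small in negative regularity norms even though it is {O(1)} in {C^0}. A one-dimensional numerical sketch, with an illustrative plane wave standing in for the high frequency correction:

```python
import math

# v = cos(Nx) has unit sup norm, but v = d/dx E' with ||E'|| = 1/N, so v is
# of size O(1/N) in a C^{-1}-type norm.
N = 100
v  = lambda x: math.cos(N * x)
Ep = lambda x: math.sin(N * x) / N          # antiderivative E', size 1/N

# check d/dx E' = v numerically at a sample point
h = 1e-6
x = 0.7321
deriv = (Ep(x + h) - Ep(x - h)) / (2 * h)
assert abs(deriv - v(x)) < 1e-4

# E' is uniformly of size 1/N on a sample of the period
assert max(abs(Ep(0.01 * i)) for i in range(629)) <= 1.0 / N
```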

Exercise 39 Deduce Proposition 37 from Proposition 38. (The decomposition (115) is needed to keep {\tilde u} close to {u} in a very weak topology – basically the {C^0_t C^{-1}_{x}} topology – but one which is still sufficient to ensure that the limiting solution constructed is not identically zero.)

We now begin the proof of Proposition 38, repeating many of the steps used to prove Proposition 26. As before we may assume that {R} is non-zero, and that {U} is supported in {I \times \mathbf{T}_E}. We can assume that {I'} is also a strict subinterval of {{\bf R}/{\bf Z}}.

Assume {N} is sufficiently large; by rounding we may assume that {N} is a natural number. Using the ansatz

\displaystyle  (\tilde u, \tilde p, \tilde R) = (u + v, p + q, \tilde R),

and the triangle inequality, it suffices to construct a difference Euler-Reynolds flow {(v, q, \tilde R)} at {U} supported on {I' \times \mathbf{T}_E} and obeying the bounds

\displaystyle  \| (N^{-1} \partial_t, N^{-1} \nabla)^{\leq k} v \|_{C^0_{t,x}} \lesssim_{U,k} 1

\displaystyle  \| \tilde R \|_{C^0_{t,x}} \lesssim_{U} N^{-1}

\displaystyle  \| v \|_{C^0_{t,x}} \lesssim \| R \|_{C^0_{t,x}}^{1/2}

for all {k \geq 0}, and for which we have a decomposition {v^i = E^i + \partial_j (E')^{ij}} obeying (116).

As before, we permit ourselves some error:

Exercise 40 Show that it suffices to construct an approximate difference Euler-Reynolds flow {(v, q, \tilde R, f, F)} at {U} supported on {I' \times \mathbf{T}_E} and obeying the bounds

\displaystyle  \| (N^{-1} \partial_t, N^{-1} \nabla_x)^{\leq m} v \|_{C^0_{t,x}} \lesssim_{U,m} 1 \ \ \ \ \ (117)

\displaystyle  \| \tilde R\|_{C^0_{t,x}} \lesssim_{U} N^{-1} \ \ \ \ \ (118)

\displaystyle  \| (N^{-1} \partial_t, N^{-1} \nabla_x)^{\leq m} f \|_{C^0_{t,x}} \lesssim_{U} N^{-20} \ \ \ \ \ (119)

\displaystyle  \| F \|_{C^0_{t,x}} \lesssim_{U} N^{-20} \ \ \ \ \ (120)

\displaystyle  \| v \|_{C^0_{t,x}} \lesssim \| R \|_{C^0_{t,x}}^{1/2} \ \ \ \ \ (121)

for {m \geq 0}, and for which we have a decomposition {v^i = E^i + \partial_j (E')^{ij}} obeying (116).

It still remains to construct the approximate difference Euler-Reynolds flow obeying the claimed estimates. By definition, {(v,q,\tilde R, f, F)} has to obey the system of equations

\displaystyle  \partial_t v^i + \partial_j( u^i v^j + u^j v^i + v^i v^j + R^{ij} + q \eta^{ij} - \tilde R^{ij} ) = F^i \ \ \ \ \ (122)

\displaystyle  \partial_i v^i = f \ \ \ \ \ (123)

\displaystyle  \tilde R^{ij} = \tilde R^{ji} \ \ \ \ \ (124)

with a decomposition

\displaystyle  v^i = E^i + \partial_j (E')^{ij}. \ \ \ \ \ (125)

As {u} is divergence-free, the first equation (122) may be rewritten as

\displaystyle  {\mathcal D}_t v^i + u^i f + 2 v^j \partial_j u^i + \partial_j( v^i v^j + R^{ij} + q \eta^{ij} - \tilde R^{ij} ) = F^i \ \ \ \ \ (126)

where {{\mathcal D}_t v} is the material Lie derivative of {v}, thus

\displaystyle  {\mathcal D}_t v^i := \partial_t v^i + {\mathcal L}_u v^i = \partial_t v^i + u^j \partial_j v^i - v^j \partial_j u^i. \ \ \ \ \ (127)

The lower order terms {u^i f + 2 v^j \partial_j u^i} in (126) will turn out to be rather harmless; the main new difficulty is dealing with the material Lie derivative term {{\mathcal D}_t v^i}. We will therefore invoke Lagrangian coordinates in order to convert the material Lie derivative {{\mathcal D}_t} into the more tractable time derivative {\partial_t} (at the cost of mildly complicating all the other terms in the system).

We introduce a “Lagrangian torus” {{\mathbf T}_L := ({\bf R}/{\bf Z})^d} that is an isomorphic copy of the Eulerian torus {{\mathbf T}_E}; as in the previous section, we parameterise this torus by {a = (a^\alpha)_{\alpha=1,\dots,d}}, and adopt the usual summation conventions for the indices {\alpha,\beta,\gamma}. Let {X: I' \times {\mathbf T}_L \rightarrow {\mathbf T}_E} be a trajectory map for {u}, that is to say a smooth map such that for every time {t \in I'}, the map {X(t): {\mathbf T}_L \rightarrow {\mathbf T}_E} is a diffeomorphism and one obeys the ODE

\displaystyle  \partial_t X(t,a) = u(t,X(t,a))

for all {(t,a) \in I' \times {\mathbf T}_L}. The existence of such a trajectory map is guaranteed by the Picard existence theorem (it is important here that {I'} is not all of the torus {{\bf R}/{\bf Z}}); see also Exercise 1 from Notes 1. From (the periodic version of) Lemma 3 of Notes 1, we can ensure that the map {X} is volume-preserving, thus

\displaystyle  \det(\nabla X) = 1.
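The volume-preserving property of trajectory maps of divergence-free fields can be checked numerically: integrating the ODE with a Runge-Kutta scheme for a sample divergence-free field and differentiating the flow map in the initial position, the Jacobian determinant stays equal to {1} up to discretisation error. (The field, final time, step sizes and base point below are all illustrative choices.)

```python
import math

# u = (sin x cos y, -cos x sin y) is divergence free (it is the perp-gradient
# of the stream function sin x sin y).
def u(p):
    x, y = p
    return (math.sin(x) * math.cos(y), -math.cos(x) * math.sin(y))

def flow(p, t=1.0, dt=0.01):
    """RK4 integration of dX/dt = u(X) from initial position p to time t."""
    x = list(p)
    for _ in range(round(t / dt)):
        k1 = u(x)
        k2 = u((x[0] + dt/2 * k1[0], x[1] + dt/2 * k1[1]))
        k3 = u((x[0] + dt/2 * k2[0], x[1] + dt/2 * k2[1]))
        k4 = u((x[0] + dt * k3[0], x[1] + dt * k3[1]))
        x = [x[i] + dt/6 * (k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(2)]
    return x

# Jacobian of the time-1 flow map at a = (0.4, 1.1) by central differences
a, h = (0.4, 1.1), 1e-4
Xa, Xa_ = flow((a[0] + h, a[1])), flow((a[0] - h, a[1]))
Xb, Xb_ = flow((a[0], a[1] + h)), flow((a[0], a[1] - h))
J = [[(Xa[i] - Xa_[i]) / (2*h) for i in (0, 1)],
     [(Xb[i] - Xb_[i]) / (2*h) for i in (0, 1)]]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
assert abs(det - 1.0) < 1e-3     # det(grad X) = 1 up to discretisation error
```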

Recall from Notes 1 that

  • (i) Any Eulerian scalar field {f} on {I' \times {\mathbf T}_E} can be pulled back to a Lagrangian scalar field {X^* f} on {I' \times {\mathbf T}_L} by the formula

    \displaystyle  X^* f(t, a) := f(t, X(t,a));

  • (ii) Any Eulerian vector field {v^i} on {I' \times {\mathbf T}_E} can be pulled back to a Lagrangian vector field {(X^* v)^\alpha} on {I' \times {\mathbf T}_L} by the formula

    \displaystyle (X^* v)^\alpha(t, a) := (\nabla X(t,a)^{-1})^\alpha_i v^i(t, X(t,a))

    where {\nabla X(t,a)^{-1} = ((\nabla X(t,a)^{-1})^\alpha_i)_{i,\alpha=1,\dots,d}} is the inverse of the matrix {\nabla X(t,a) = (\nabla X(t,a)^i_\alpha)_{i,\alpha=1,\dots,d}}, defined by

    \displaystyle  \nabla X(t,a)^i_\alpha := \partial_\alpha X^i(t,a);


  • (iii) Any Eulerian rank {(2,0)} tensor {T^{ij}} on {I' \times {\mathbf T}_E}, can be pulled back to a Lagrangian rank {2} tensor {(X^* T)^{\alpha \beta}} on {I' \times {\mathbf T}_L} by the formula

    \displaystyle (X^* T)^{\alpha \beta}(t, a) := (\nabla X(t,a)^{-1})^\alpha_i (\nabla X(t,a)^{-1})^\beta_j T^{ij}(t, X(t,a)).

(One can pull back other tensors also, but these are the only ones we will need here.) Each of these pullback operations may be inverted by the corresponding pullback operation for the labels map {X^{-1}: (t,X(t,a)) \mapsto (t,a)} (also known as pushforward by {X}). One can compute how these pullbacks interact with divergences:

Exercise 41 (Pullback and divergence)

  • (i) If {v^i} is a smooth Eulerian vector field, show that the pullback of the divergence of {v} equals the divergence of the pullback of {v}:

    \displaystyle  X^* (\partial_i v^i) = \partial_\alpha (X^* v)^\alpha.

    (Hint: you will need to use the fact that {X} is volume-preserving. Similarly to Lemma 3 and Exercise 4 of Notes 1, one can establish this either using the laws of integration or the laws of differentiation.)

  • (ii) Show that there exist smooth functions {\tilde \Gamma^\alpha_{\beta \gamma}: I' \times {\mathbf T}_L \rightarrow {\bf R}} for {\alpha,\beta,\gamma =1,\dots,d} with the following property: for any smooth Eulerian rank {2} tensor {T^{ij}} on {I' \times {\mathbf T}_E}, with divergence {Y^i := \partial_j T^{ij}}, one has

    \displaystyle  (X^* Y)^\alpha = \partial_\beta (X^* T)^{\alpha \beta} + \tilde \Gamma^\alpha_{\beta \gamma} (X^* T)^{\gamma \beta}.

    (In fact, {\tilde \Gamma^\alpha_{\beta \gamma}} can be given explicitly as {\Gamma^\alpha_{\beta \gamma} + \Gamma^\sigma_{\sigma \beta} \delta^\alpha_\gamma}, where {\delta} is the Kronecker delta and {\Gamma^\alpha_{\beta \gamma}} is the Christoffel symbol associated with the pullback {X^* \eta} of the Euclidean metric – but we will not need this precise formula. The right-hand side may also be written (in Penrose abstract index notation) as {\nabla_\beta (X^* T)^{\alpha \beta}}, where {\nabla_\beta} is the covariant derivative associated to {X^* \eta}.)

As remarked upon in the exercise, these calculations can be streamlined using the theory of the covariant derivative in Riemannian geometry; we will not develop this theory further here, but see for instance these two blog posts.
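The identity in Exercise 41(i) is easy to test numerically for a concrete volume-preserving map, say the shear {X(a^1,a^2) = (a^1 + \sin a^2, a^2)}, whose Jacobian determinant is {1}; the Eulerian field {v} below is an arbitrary illustrative choice.

```python
import math

# Volume-preserving shear map and an arbitrary smooth Eulerian field v
def X(a):      return (a[0] + math.sin(a[1]), a[1])
def v(p):      return (math.cos(p[0]) * math.sin(p[1]), p[0] * p[0] + p[1])
def div_v(p):  return -math.sin(p[0]) * math.sin(p[1]) + 1.0

def pull(a):
    # (X^* v)^alpha = (grad X)^{-1} v(X(a)); here grad X = [[1, cos a2],[0,1]]
    w = v(X(a))
    return (w[0] - math.cos(a[1]) * w[1], w[1])

# divergence of the pullback at a, by central differences
a, h = (0.3, 0.8), 1e-5
div_pull = ((pull((a[0] + h, a[1]))[0] - pull((a[0] - h, a[1]))[0])
            + (pull((a[0], a[1] + h))[1] - pull((a[0], a[1] - h))[1])) / (2 * h)

# Exercise 41(i): divergence of pullback = pullback of divergence
assert abs(div_pull - div_v(X(a))) < 1e-6
```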

If one now applies the pullback operation {X^*} to the system (126), (123), (124), (125) (and uses Exercise 16 from Notes 1 to convert the material Lie derivative into the ordinary time derivative) one obtains the equivalent system

\displaystyle  \partial_t (X^* v)^\alpha + (X^* u)^\alpha (X^* f) + 2 (X^* v)^\beta \partial_\beta (X^* u)^\alpha + \partial_\beta \underline{T}^{\alpha \beta} + \tilde \Gamma^\alpha_{\beta \gamma} \underline{T}^{\gamma \beta}

\displaystyle = (X^* F)^\alpha

\displaystyle  \partial_\alpha (X^* v)^\alpha = X^* f

\displaystyle  (X^* \tilde R)^{\alpha \beta} = (X^* \tilde R)^{\beta \alpha}

\displaystyle  (X^* v)^\alpha = (X^* E)^\alpha + \partial_\beta (X^* E')^{\alpha \beta} + \tilde \Gamma^\alpha_{\beta \gamma} (X^* E')^{\gamma \beta}

where {\underline{T}} denotes the rank {(2,0)} tensor

\displaystyle  \underline{T}^{\alpha \beta} := (X^* v)^\alpha (X^* v)^\beta + (X^* R)^{\alpha \beta} + (X^* q) (X^* \eta)^{\alpha \beta} - (X^* \tilde R)^{\alpha \beta} .

Thus, if one introduces the Lagrangian fields

\displaystyle  \underline{v} := X^* v; \quad \underline{u} := X^* u; \quad \underline{F} := X^* F; \quad \underline{f} := X^* f;

\displaystyle  \underline{\tilde R} := X^* \tilde R; \quad \underline{R} := X^* R; \quad \underline{q} := X^* q; \quad \underline{\eta} := X^* \eta; \quad \underline{E'} := X^* E'

and also

\displaystyle  \underline{E}^\alpha := (X^* E)^\alpha + \tilde \Gamma^\alpha_{\beta \gamma} (X^* E')^{\gamma \beta}

then (from many applications of the chain rule) we see that our task has now transformed to that of obtaining a {(\underline{v},\underline{q},\underline{\tilde R}, \underline{f}, \underline{F})} supported on {I' \times \mathbf{T}_L} obeying the equations

\displaystyle  \partial_t \underline{v}^\alpha + \underline{u}^\alpha \underline{f} + 2 \underline{v}^\beta \partial_\beta \underline{u}^\alpha + \partial_\beta \underline{T}^{\alpha \beta} + \tilde \Gamma^\alpha_{\beta \gamma} \underline{T}^{\gamma \beta} = \underline{F}^\alpha \ \ \ \ \ (128)

\displaystyle  \partial_\alpha \underline{v}^\alpha = \underline{f} \ \ \ \ \ (129)

\displaystyle  \underline{\tilde R}^{\alpha \beta} = \underline{\tilde R}^{\beta \alpha} \ \ \ \ \ (130)

\displaystyle  \underline{v}^\alpha = \underline{E}^\alpha + \partial_\beta \underline{E'}^{\alpha \beta} \ \ \ \ \ (131)

where {\underline{T}} denotes the rank {(2,0)} tensor

\displaystyle  \underline{T}^{\alpha \beta} := \underline{v}^\alpha \underline{v}^\beta + \underline{R}^{\alpha \beta} + \underline{q} \underline{\eta}^{\alpha \beta} - \underline{\tilde R}^{\alpha \beta} \ \ \ \ \ (132)

and obeying the estimates

\displaystyle  \| (N^{-1} \partial_t, N^{-1} \nabla_a)^{\leq m} \underline{v} \|_{C^0_{t,a}} \lesssim_{U,m} 1 \ \ \ \ \ (133)

\displaystyle  \| \underline{\tilde R}\|_{C^0_{t,a}} \lesssim_{U} N^{-1} \ \ \ \ \ (134)

\displaystyle  \| (N^{-1} \partial_t, N^{-1} \nabla_a)^{\leq m} \underline{f} \|_{C^0_{t,a}} \lesssim_{U,m} N^{-20} \ \ \ \ \ (135)

\displaystyle  \| \underline{F} \|_{C^0_{t,a}} \lesssim_{U} N^{-20} \ \ \ \ \ (136)

\displaystyle  \| X_* \underline{v} \|_{C^0_{t,x}} \lesssim \| R \|_{C^0_{t,x}}^{1/2} \ \ \ \ \ (137)

\displaystyle  \| E \|_{C^0_{t,a}}, \|E' \|_{C^0_{t,a}} \lesssim_{U} N^{-1} \ \ \ \ \ (138)

for {m \geq 0}, where {C^0_{t,a}} denotes the supremum on the Lagrangian spacetime {I' \times \mathbf{T}_L}. (In (137) we have to move back to Eulerian coordinates because the coefficients in the pushforward {X_*} depend on {U}, and we want an estimate here uniform in {U}.)

This problem looks complicated, but the net effect of moving to the Lagrangian formulation is to arrive at a problem that is nearly identical to the Eulerian one, but in which the material Lie derivative {{\mathcal D}_t} has been replaced by the ordinary time derivative {\partial_t}, and several lower order terms with smooth variable coefficients have been added to the system.

Now that the dangerous transport term in the material Lie derivative has been eliminated, it is safe to use the method of fast-slow variables, but now on the Lagrangian torus {\mathbf{T}_L} rather than the Eulerian torus {\mathbf{T}_E}. We now parameterise the fast torus {\mathbf{T}_F} by {b = (b^\alpha)_{\alpha=1,\dots,d}} (thus we think of {\mathbf{T}_F} now as a “Lagrangian fast torus” rather than an “Eulerian fast torus”) and use the ansatz

\displaystyle  \underline{v}(t,a) := \mathbf{v}( t, a, N a \hbox{ mod } 1)

\displaystyle  \underline{q}(t,a) := \mathbf{q}( t, a, N a \hbox{ mod } 1)

\displaystyle  \underline{\tilde R}(t,a) := \mathbf{R}( t, a, N a \hbox{ mod } 1)

\displaystyle  \underline{f}(t,a) := \mathbf{f}( t, a, N a \hbox{ mod } 1)

\displaystyle  \underline{F}(t,a) := \mathbf{F}( t, a, N a \hbox{ mod } 1)

\displaystyle  \underline{E}(t,a) := \mathbf{E}( t, a, N a \hbox{ mod } 1)

\displaystyle  \underline{E'}(t,a) := \mathbf{E}'( t, a, N a \hbox{ mod } 1)

so that the equations of motion now become

\displaystyle  \partial_t \mathbf{v}^\alpha + \underline{u}^\alpha \mathbf{f} + 2 \mathbf{v}^\beta \partial_{a^\beta} \underline{u}^\alpha + (\partial_{a^\beta} + N \partial_{b^\beta}) \mathbf{T}^{\alpha \beta} + \tilde \Gamma^\alpha_{\beta \gamma} \mathbf{T}^{\gamma \beta} = \mathbf{F}^\alpha \ \ \ \ \ (139)

\displaystyle  (\partial_{a^\alpha} + N \partial_{b^\alpha}) \mathbf{v}^\alpha = \mathbf{f} \ \ \ \ \ (140)

\displaystyle  \mathbf{R}^{\alpha \beta} = \mathbf{R}^{\beta \alpha} \ \ \ \ \ (141)

\displaystyle  \mathbf{v}^\alpha = \mathbf{E}^\alpha + (\partial_{a^\beta}+N\partial_{b^\beta}) \mathbf{E'}^{\alpha \beta} \ \ \ \ \ (142)

where
\displaystyle  \mathbf{T}^{\alpha \beta} := \mathbf{v}^\alpha \mathbf{v}^\beta + \underline{R}^{\alpha \beta} + \mathbf{q} \underline{\eta}^{\alpha \beta} - \mathbf{R}^{\alpha \beta} \ \ \ \ \ (143)

where we think of {\underline{u}, \tilde \Gamma, \underline{\eta}} as “low frequency” functions of time {t} and the slow Lagrangian variable {a} only (thus they are independent of the fast Lagrangian variable {b}). Set

\displaystyle  D := (N^{-1} \partial_t, N^{-1} \nabla_a, \nabla_b).

It will now suffice to find a smooth solution {(\mathbf{v}, \mathbf{q}, \mathbf{R}, \mathbf{f}, \mathbf{F}, \mathbf{E}, \mathbf{E'})} to the above system supported on {I' \times \mathbf{T}_L \times \mathbf{T}_F} obeying the estimates

\displaystyle  \| D^{\leq m} \mathbf{v} \|_{C^0_{t,a,b}} \lesssim_{U,m} 1 \ \ \ \ \ (144)

\displaystyle  \| \mathbf{R} \|_{C^0_{t,a,b}} \lesssim_{U} N^{-1} \ \ \ \ \ (145)

\displaystyle  \| D^{\leq m} \mathbf{f} \|_{C^0_{t,a,b}} \lesssim_{U,m} N^{-20} \ \ \ \ \ (146)

\displaystyle  \| \mathbf{F} \|_{C^0_{t,a,b}} \lesssim_{U} N^{-20} \ \ \ \ \ (147)

\displaystyle  \| \mathbf{E} \|_{C^0_{t,a,b}}, \|\mathbf{E'} \|_{C^0_{t,a,b}} \lesssim_{U} N^{-1} \ \ \ \ \ (148)

and obeying the pointwise estimate

\displaystyle  \partial_\alpha X^i(t,a) \mathbf{v}^\alpha(t,a,b) = O( \| R \|_{C^0_{t,x}}^{1/2} ) \ \ \ \ \ (149)

for {(t,a,b) \in I' \times \mathbf{T}_L \times \mathbf{T}_F}.
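The way slow derivatives interact with the fast-slow ansatz, turning {\partial_{a^\alpha}} into {\partial_{a^\alpha} + N \partial_{b^\alpha}} as in (139), (140), can be sanity-checked numerically in one dimension (the profile {V} and the values of {N} and the base point are illustrative choices):

```python
import math

# Under the ansatz v(a) = V(a, N a mod 1), the chain rule gives
# v'(a) = (partial_a V + N partial_b V)(a, N a mod 1).
N = 7
V   = lambda a, b: math.sin(a) * math.cos(2 * math.pi * b)
V_a = lambda a, b: math.cos(a) * math.cos(2 * math.pi * b)
V_b = lambda a, b: -2 * math.pi * math.sin(a) * math.sin(2 * math.pi * b)

def vv(a):  # the composed fast-slow function
    return V(a, (N * a) % 1.0)

a, h = 0.37, 1e-6
numeric = (vv(a + h) - vv(a - h)) / (2 * h)          # direct derivative
exact = V_a(a, (N * a) % 1.0) + N * V_b(a, (N * a) % 1.0)  # (d_a + N d_b) V
assert abs(numeric - exact) < 1e-3
```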

If {\mathbf{v}} is chosen to be “high frequency” (mean zero in the fast variable {b}), then we can automatically obtain the estimate (148), as one may obtain the decomposition (142) with

\displaystyle  \mathbf{E'}^{\alpha \beta} := N^{-1} \eta^{\beta \gamma} \Delta_b^{-1} \partial_{b^\gamma} \mathbf{v}^\alpha


\displaystyle  \mathbf{E}^\alpha := - \partial_{a^\beta} (\mathbf{E'})^{\alpha \beta}

at which point the estimates (148) follow from (144). Thus we may drop (148) and (142) from our requirements as long as we instead require {\mathbf{v}} to be high frequency.

As in previous sections, we can relax the conditions on {\mathbf{f}} and {\mathbf{F}}:

Exercise 42

  • (i) (Stress corrector) Show that we may replace the condition (147) with the condition that

    \displaystyle  \| D^{\leq m} \mathbf{F} \|_{C^0_{t,a,b}} \lesssim_{U,m} 1

    for all {m \geq 0}, and that {\mathbf{F}} is of high frequency (mean zero in {b}). (Hint: add a corrector of size {O_U(N^{-1})} to {\mathbf{R}}, {\mathbf{q}} so that the main term {N \partial_{b^\beta} ( \mathbf{q} \underline{\eta}^{\alpha \beta} - \mathbf{R}^{\alpha \beta} )} now cancels off {\mathbf{F}}, and the other terms created by the correction are of size {O_U(N^{-1})} and of mean zero in {b}. Then iterate.)

  • (ii) (Divergence corrector) After performing the modifications in (i), show that we may replace the condition (146) with the condition that

    \displaystyle  \| D^{\leq m} \mathbf{f} \|_{C^0_{t,a,b}} \lesssim_{U,m} 1

    for all {m}, and that {\mathbf{f}} has mean zero in {b}. (Note that correcting for {\mathbf{f}} will modify {\mathbf{v}} by {O_U(N^{-1})}, but this will make a negligible contribution to (149) for {N} large enough.)

We can now repeat the Mikado flow construction from previous sections:

Exercise 43 Set {\mathbf{q} := 100 \| R \|_{C^0_{t,x}}}. Construct a smooth vector field {\mathbf{v}} supported on {I' \times \mathbf{T}_L \times \mathbf{T}_F}, with {\mathbf{v}} of mean zero in {b}, obeying the equations

\displaystyle  \partial_{b^\beta} (\mathbf{v}^\alpha \mathbf{v}^\beta) = 0

\displaystyle  \partial_{b^\beta} \mathbf{v}^\beta = 0

and such that

\displaystyle  \mathbf{v}^\alpha \mathbf{v}^\beta + \underline{R}^{\alpha \beta} + \mathbf{q} \underline{\eta}^{\alpha \beta}

is of mean zero in {b}, obeying the bounds (144) for all {m \geq 0}. Also show that for a sufficiently large absolute constant {C= O(1)} not depending on {U}, one can ensure that the matrix with entries

\displaystyle  C (\underline{R}^{\alpha \beta} + \mathbf{q} \underline{\eta}^{\alpha \beta}) - \mathbf{v}^\alpha \mathbf{v}^\beta \ \ \ \ \ (150)

is positive semi-definite at every point {(t,a,b) \in I' \times \mathbf{T}_L \times \mathbf{T}_F}.
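The flows constructed in Exercise 43 are Mikado (pipe) flows: profiles supported in narrow tubes and constant along a lattice direction {e}. The two key algebraic identities – vanishing divergence and vanishing divergence of the self-stress – can be checked numerically for a model profile (the Gaussian profile below is an illustrative stand-in; the {\Delta^M} factor and the periodisation into {\mathbf{T}_F} are omitted):

```python
import math

# A pipe flow along e = (1,0,0): w(y) = phi(y2,y3) e is constant in y1, so
# both div w and the divergence of the stress w^i w^j vanish identically.
phi = lambda y: math.exp(-10 * ((y[1] - 0.5) ** 2 + (y[2] - 0.5) ** 2))
w = lambda y: (phi(y), 0.0, 0.0)

def partial(f, y, j, h=1e-5):
    """Central finite difference of f in the j-th coordinate at y."""
    yp, ym = list(y), list(y)
    yp[j] += h
    ym[j] -= h
    return (f(yp) - f(ym)) / (2 * h)

y = (0.2, 0.45, 0.6)
div_w = sum(partial(lambda z: w(z)[j], y, j) for j in range(3))
# divergence of the stress tensor w^i w^j, one component for each i
stress_div = [sum(partial(lambda z: w(z)[i] * w(z)[j], y, j) for j in range(3))
              for i in range(3)]
assert abs(div_w) < 1e-9
assert all(abs(s) < 1e-9 for s in stress_div)
```

The same identities hold for profiles along a general lattice direction {e_k}, with {\phi} depending only on the coordinates transverse to {e_k}.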

If we now set

\displaystyle  \mathbf{R}^{\alpha \beta} := 0

\displaystyle  \mathbf{F}^\alpha := \partial_t \mathbf{v}^\alpha + \underline{u}^\alpha \mathbf{f} + 2 \mathbf{v}^\beta \partial_{a^\beta} \underline{u}^\alpha + \partial_{a^\beta}\mathbf{T}^{\alpha \beta} + \tilde \Gamma^\alpha_{\beta \gamma} \mathbf{T}^{\gamma \beta}

\displaystyle  \mathbf{f} := \partial_{a^\alpha} \mathbf{v}^\alpha

one can verify that the equations (139), (140), (141) hold, and that {\mathbf{F}, \mathbf{f}} have mean zero in {b}. Furthermore, by pushing forward (150) by {X(t)}, we conclude that the matrix with entries

\displaystyle  C (R^{ij}(t,X(t,a)) + 100 \| R \|_{C^0_{t,x}} \eta^{ij})

\displaystyle - \partial_\alpha X^i(t,a) \mathbf{v}^\alpha(t,a,b) \partial_\beta X^j(t,a) \mathbf{v}^\beta(t,a,b)

is positive semi-definite for all {(t,a,b) \in I' \times \mathbf{T}_L \times \mathbf{T}_F}; taking traces one concludes (149). Thus we have obtained all the required properties, concluding the proof of Proposition 38.

— 6. Constructing high regularity weak solutions to Euler —

We now informally discuss how to modify the arguments above to establish the negative direction (ii) of Onsager’s conjecture. The full details are rather complicated (and arranged slightly differently from the presentation here), and we refer to Isett’s original paper for details. See also a subsequent paper of Buckmaster, De Lellis, Szekelyhidi, and Vicol for a simplified argument establishing this statement (as well as some additional strengthenings of it).

Let {0 < \alpha < 1/3}, let {\varepsilon > 0} be a small quantity, and let {M} be a large integer. The main iterative step, analogous to Theorem 30, roughly speaking (ignoring some technical logarithmic factors) takes an Euler-Reynolds flow {(u^{(0)},p^{(0)},R^{(0)})} obeying estimates that look like

\displaystyle  \| (N_0^{-1} \nabla_x)^{\leq M} \nabla_x u^{(0)} \|_{C^0_{t,x}} \lesssim N_0^{1-\alpha} \ \ \ \ \ (151)


\displaystyle  \| (N_0^{-1} \nabla_x)^{\leq M} R^{(0)} \|_{C^0_{t,x}} \lesssim N_0^{-2(1+\varepsilon)\alpha}. \ \ \ \ \ (152)

for some sufficiently large {N_0>1}, and obtains a new Euler-Reynolds flow {(u^{(1)},p^{(1)},R^{(1)})} close to {(u^{(0)},p^{(0)},R^{(0)})} that obeys the estimates

\displaystyle  \| (N_1^{-1} \nabla_x)^{\leq M} \nabla_x u^{(1)} \|_{C^0_{t,x}} \lesssim N_1^{1-\alpha} \ \ \ \ \ (153)


\displaystyle  \| (N_1^{-1} \nabla_x)^{\leq M} R^{(1)} \|_{C^0_{t,x}} \lesssim N_1^{-2(1+\varepsilon)\alpha}, \ \ \ \ \ (154)

where {N_1 := N_0^{1+\varepsilon}}; see Lemma 2.1 of Isett’s original paper for a more precise statement (in a slightly different notation), which also includes various estimates on the difference {u^{(1)}-u^{(0)}} that we omit here. In contrast to previous arguments, it is useful for technical reasons to not impose time regularity in the estimates. Once this claim is formalised and proved, conclusions such as Onsager’s conjecture follow from the usual iteration arguments.

To achieve this iteration step, the first step is a mollification step analogous to Proposition 31 in which one perturbs the initial flow to obtain additional spatial regularity on the solution. Roughly speaking, this mollification allows one to replace the purely spatial differential operator {(N_0^{-1} \nabla_x)^{\leq M}} appearing in (151), (153) with {(N_0^{-1} \nabla_x)^{\leq m}} for {m} much larger than {M} (in practice there are some slight additional losses, which we will ignore here).

Now one has to solve the difference equation. We focus on the equation (126) and omit the small errors involving {f, F}. Suppose for the time being that we could magically replace the material Lie derivative {{\mathcal D}_t} here by an ordinary time derivative, thus we would be trying to construct {v} solving an equation such as

\displaystyle  \partial_t v^i + 2 v^j \partial_j u^i + \partial_j( v^i v^j + R^{ij} + q \eta^{ij} - \tilde R^{ij} ) \approx 0.

As before, we can use the method of fast and slow variables to construct a {v} of amplitude roughly {\| R \|_{C^0}^{1/2} = O( N_1^{-\alpha})}, oscillating at frequency {N_1}, such that {v^i v^j + R^{ij} + q \eta^{ij}} is high frequency (mean zero in the fast variable) and has amplitude about {N_1^{-2\alpha}}. We can also arrange matters (using something like (65)) so that the fast derivative component of {\partial_j( v^i v^j + R^{ij} + q \eta^{ij})} vanishes, leaving only a slower derivative of size about {N_0}. This makes the expression {\partial_j( v^i v^j + R^{ij} + q \eta^{ij})} of magnitude about {O(N_0 N_1^{-2\alpha})} and oscillating at frequency about {N_1}, which can be cancelled by a stress corrector in {\tilde R} of magnitude about {O(N^{-1}_1 N_0 N_1^{-2\alpha}) \approx N_1^{-2\alpha - \varepsilon}}. Such a term would be acceptable (smaller than {N_1^{-2(1+\varepsilon) \alpha}}) for {\alpha} as large as {1/2}, in the spirit of Theorem 29.

However, one also has the terms {\partial_t v^i} and {v^j \partial_j u^i}, which (in contrast to the high-dimensional Sobolev scale setting) cannot be ignored in this low-dimensional Hölder scale problem. The natural time scale of oscillation here is {O(N_0^{\alpha-1})}, coming from the usual heuristics concerning the Euler equation (see Remark 11 from 254A Notes 3). With this heuristic, {\nabla u} and {\partial_t} should both behave like {O(N_0^{1-\alpha})}, so these expressions would be expected to have amplitude {O(N_0^{1-\alpha} N_1^{-\alpha})}. They still oscillate at the high frequency {N_1}, though, and lead to a stress corrector of magnitude about {O( N_1^{-1} N_0^{1-\alpha} N_1^{-\alpha}) \approx N_1^{-2\alpha - \varepsilon + \alpha \varepsilon}}. This however remains acceptable for {\alpha} up to {1/3}, which in principle resolves Onsager’s conjecture.
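As a sanity check on this exponent bookkeeping, the two thresholds can be verified numerically. The sketch below (my own illustration, not part of the original argument) measures all exponents in powers of {N_0}, with {N_1 = N_0^{1+\varepsilon}}, and compares each corrector against the target {N_1^{-2(1+\varepsilon)\alpha}}:

```python
# Sanity check of the exponent bookkeeping (illustrative only).
# With N1 = N0**(1 + eps), we compare, in powers of N0:
#   target:  N1**(-2*(1+eps)*alpha)                    (acceptable stress size)
#   slow:    N1**(-1) * N0 * N1**(-2*alpha)            (slow-derivative corrector)
#   time:    N1**(-1) * N0**(1-alpha) * N1**(-alpha)   (time-derivative corrector)
# A corrector is acceptable when its exponent is <= the target's exponent.

def exponents(alpha, eps):
    target = -2 * (1 + eps) * alpha * (1 + eps)
    slow = -(1 + eps) + 1 - 2 * alpha * (1 + eps)
    time = -(1 + eps) + (1 - alpha) - alpha * (1 + eps)
    return target, slow, time

eps = 0.001
t, s, td = exponents(0.33, eps)  # alpha just below 1/3
assert s <= t and td <= t        # both correctors acceptable
t, s, td = exponents(0.34, eps)  # alpha just above 1/3
assert s <= t and td > t         # time-derivative terms now too large
t, s, td = exponents(0.51, eps)  # alpha above 1/2
assert s > t                     # even the slow corrector fails
```

Working out the boundary cases exactly gives {\alpha \leq 1/(2(1+\varepsilon))} for the slow corrector and {\alpha \leq 1/(3+2\varepsilon)} for the time-derivative terms, matching the {1/2} and {1/3} thresholds above as {\varepsilon \rightarrow 0}.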

Now we have to remove the “cheat” of replacing the material Lie derivative by the ordinary time derivative. As we saw in the previous section, the natural way to fix this is to work in Lagrangian coordinates. However, we encounter a new problem: if one initialises the trajectory flow map {X} to be the identity at some given time {t_0}, then it turns out that one only gets good control on the flow map and its derivatives for times {t = t_0 + O(N_0^{\alpha-1})} within the natural time scale {N_0^{\alpha-1}} of that initial time {t_0}; beyond this, the Gronwall-type arguments used to obtain bounds start to deteriorate exponentially. Because of this, one cannot rely on a “global” Lagrangian coordinate system as in the previous section. To get around this, one needs to partition the time domain {{\bf R}/{\bf Z}} into intervals {I_k} of length about {N_0^{\alpha-1}}, and construct a separate trajectory map adapted to each such interval. One can then use these “local Lagrangian coordinates” to construct local components {v_k} of the velocity perturbation {v} that obey the required properties on each such interval. This construction is essentially the content of the “convex integration lemma” in Lemma 3.3 of Isett’s paper.

However, a new problem arises when trying to “glue” these local corrections {v_k} together: two consecutive time intervals {I_k, I_{k+1}} will overlap, and their corresponding local corrections {v_k, v_{k+1}} will also overlap. This leads to some highly undesirable interaction terms between {v_k} and {v_{k+1}} (such as the fast derivative of {v_{k}^i v_{k+1}^j}) which are very difficult to make small (for instance, one cannot simply ensure that {v_k, v_{k+1}} have disjoint spatial supports as they are constructed using different local Lagrangian coordinate systems). On the other hand, if the original Reynolds stress {R} had a special structure, namely that it was only supported on every other interval {I_k} (i.e., on the {I_k} for all {k} even, or the {I_k} for all {k} odd), then these interactions no longer occur and the iteration step can proceed.

One could try to then resolve the problem by correcting the odd and even interval components of the stress {R} in separate stages (cf. how Theorem 19 can be iterated to establish Theorem 15), but this is inefficient with regards to the {\alpha} parameter, in particular this makes the argument stop well short of the optimal {1/3} threshold. To attain this threshold one needs the final ingredient of Isett’s argument, namely a “gluing approximation” (see Lemma 3.2 of Isett’s paper), in which one takes the (mollified) initial Euler-Reynolds flow {(u^{(0)},p^{(0)},R^{(0)})} and replaces it with a nearby flow {(\tilde u^{(0)},\tilde p^{(0)},\tilde R^{(0)})} in which the new Reynolds stress {\tilde R^{(0)}} is only supported on every other interval {I_k}. Combining this gluing approximation lemma with the mollification lemma and convex integration lemma gives the required iteration step. (One technical point is that this gluing has to create some useful additional regularity along the material derivative, in the spirit of Remark 38 of Notes 1, as such regularity will be needed in order to justify the convex integration step.)

To obtain this gluing approximation, one takes an {N_0^{\alpha-1}}-separated sequence of times {t_k}, and for each such time {t_k}, one solves the true Euler equations with initial data {u^{(0)}(t_k)} at {t_k} to obtain smooth Euler solutions {(u_k, p_k)} on a time interval centred at {t_k} of lifespan {\sim N_0^{\alpha-1}}, that agree with {u^{(0)}} at time {t_k}. (It is non-trivial to check that these solutions even exist on such an interval, let alone obey good estimates, but this can be done if the initial data was suitably mollified, as is consistent with the heuristics in Remark 11 from 254A Notes 3.) One can then glue these solutions together around the reference solution {(u^{(0)},p^{(0)},R^{(0)})} by defining

\displaystyle  \tilde u^{(0)} := u^{(0)} + \sum_k \psi_k (u_k - u^{(0)})

\displaystyle  \tilde p^{(0)} := p^{(0)} + \sum_k \psi_k (p_k - p^{(0)})

for a suitable partition of unity {1 = \sum_k \psi_k} in time. This gives fields {(\tilde u^{(0)}, \tilde p^{(0)})} that solve the Euler equation near each time {t_k}. To finish the proof of the gluing approximation lemma, one needs to then find a matching Reynolds stress {\tilde R^{(0)}} for the intervals at which the Euler equation is not solved exactly. Isett’s original construction of this stress was rather intricate; see Sections 7-10 of Isett’s paper for the technical details. However, with improved estimates, a simpler construction was used in a subsequent paper of Buckmaster, De Lellis, Szekelyhidi, and Vicol, leading to a simplified proof of (the non-endpoint version of) this direction of Onsager’s conjecture.

Matt StrasslerBreaking a Little New Ground at the Large Hadron Collider

Today, a small but intrepid band of theoretical particle physicists (professor Jesse Thaler of MIT, postdocs Yotam Soreq and Wei Xue of CERN, Harvard Ph.D. student Cari Cesarotti, and myself) put out a paper that is unconventional in two senses. First, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public. And second, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public.

And no, there’s no error in the previous paragraph.

1) We used a small amount of actual data from the CMS experiment, even though we’re not ourselves members of the CMS experiment, to do a search for a new particle. Both ATLAS and CMS, the two large multipurpose experimental detectors at the Large Hadron Collider [LHC], have made a small fraction of their proton-proton collision data public, through a website called the CERN Open Data Portal. Some experts, including my co-authors Thaler, Xue and their colleagues, have used this data (and the simulations that accompany it) to do a variety of important studies involving known particles and their properties. [Here’s a blog post by Thaler concerning Open Data and its importance from his perspective.] But our new study is the first to look for signs of a new particle in this public data. While our chances of finding anything were low, we had a larger goal: to see whether Open Data could be used for such searches. We hope our paper provides some evidence that Open Data offers a reasonable path for preserving priceless LHC data, allowing it to be used as an archive by physicists of the post-LHC era.

2) Since only a tiny fraction of CMS’s data was available to us, about 1% by some counts, how could we have done anything useful compared to what the LHC experts have already done? Well, that’s why we examined the data in a slightly unconventional way (one of several methods that I’ve advocated for many years, but that has not been used in any public study). This allowed us to explore some ground that no one had yet swept clean, and even gave us a tiny chance of an actual discovery! But the larger scientific goal, absent a discovery, was to prove the value of this unconventional strategy, in hopes that the experts at CMS and ATLAS will use it (and others like it) in future. Their chance of discovering something new, using their full data set, is vastly greater than ours ever was.

Now don’t all go rushing off to download and analyze terabytes of CMS Open Data; you’d better know what you’re getting into first. It’s worthwhile, but it’s not easy going. LHC data is extremely complicated, and until this project I’ve always been skeptical that it could be released in a form that anyone outside the experimental collaborations could use. Downloading the data and turning it into a manageable form is itself a major task. Then, while studying it, there are an enormous number of mistakes that you can make (and we made quite a few of them) and you’d better know how to make lots of cross-checks to find your mistakes (which, fortunately, we did know; we hope we found all of them!) The CMS personnel in charge of the Open Data project were enormously helpful to us, and we’re very grateful to them; but since the project is new, there were inevitable wrinkles which had to be worked around. And you’d better have some friends among the experimentalists who can give you advice when you get stuck, or point out aspects of your results that don’t look quite right. [Our thanks to them!]

All in all, this project took us two years! Well, honestly, it should have taken half that time — but it couldn’t have taken much less than that, with all we had to learn. So trying to use Open Data from an LHC experiment is not something you do in your idle free time.

Nevertheless, I feel it was worth it. At a personal level, I learned a great deal more about how experimental analyses are carried out at CMS, and by extension, at the LHC more generally. And more importantly, we were able to show what we’d hoped to show: that there are still tremendous opportunities for discovery at the LHC, through the use of (even slightly) unconventional model-independent analyses. It’s a big world to explore, and we took only a small step in the easiest direction, but perhaps our efforts will encourage others to take bigger and more challenging ones.

For those readers with greater interest in our work, I’ll put out more details in two blog posts over the next few days: one about what we looked for and how, and one about our views regarding the value of open data from the LHC, not only for our project but for the field of particle physics as a whole.

Scott AaronsonFour updates

A few weeks ago, I was at QIP’2019 in Boulder, CO. This week I was at SQuInT’2019 in Albuquerque, NM. There were lots of amazing talks—feel free to ask about them in the comments section.

There’s an interview with me at the website “GigaOm,” conducted by Byron Reese and entitled Quantum Computing: Capabilities and Limits. I didn’t proofread the transcript and it has some errors in it, but hopefully the meaning comes through. In other interview news, if you were interested in my podcast with Adam Ford in Melbourne but don’t like YouTube, Adam has helpfully prepared transcripts of the two longest segments: The Ghost in the Quantum Turing Machine and The Winding Road to Quantum Supremacy.

The New York Times ran an article entitled The Hard Part of Computer Science? Getting Into Class, about the surge in computer science majors all over the US, and the shortage of professors to teach them. The article’s go-to example of a university where this is happening is UT Austin, and there’s extensive commentary from my department chair, Don Fussell.

The STOC’2019 accepted papers list is finally out. Lots of cool stuff!

February 12, 2019

Mark Chu-CarrollThe Math of Vaccinations, Infection Rates, and Herd Immunity

Here in the US, we are, horribly, in the middle of a measles outbreak. And, as usual, anti-vaccine people are arguing that:

  • Measles isn’t really that serious;
  • Unvaccinated children have nothing to do with the outbreak; and
  • More vaccinated people are being infected than unvaccinated, which shows that vaccines don’t help.

A few years back, I wrote a post about the math of vaccines; it seems like this is a good time to update it.

When it comes to vaccines, there are two things that a lot of people don’t understand. One is herd immunity; the other is probability of infection.

Herd immunity is the fundamental concept behind vaccines.

In an ideal world, a person who’s been vaccinated against a disease would have no chance of catching it. But the real world isn’t ideal, and vaccines aren’t perfect. What a vaccine does is prime the recipient’s immune system in a way that reduces the probability that they’ll be infected.

But even if a vaccine for an illness were perfect, and everyone was vaccinated, that wouldn’t mean that it was impossible for anyone to catch the illness. There are many people whose immune systems are compromised – people with diseases like AIDS, or people with cancer receiving chemotherapy. (Or people who’ve had the measles within the previous two years!) And that’s not considering the fact that there are people who, for legitimate medical reasons, cannot be vaccinated!

So individual immunity, provided by vaccines, isn’t enough to completely eliminate the spread of a contagious illness. To prevent outbreaks, we rely on an emergent property of a vaccinated population. If enough people are immune to the disease, then even if one person gets infected with it, the disease won’t be able to spread enough to produce a significant outbreak.

We can demonstrate this with some relatively simple math.

Let’s imagine a case of an infectious disease. For illustration purposes, we’ll simplify things in a way that makes the outbreak more likely to spread than in reality. (This also makes herd immunity harder to attain than it is in reality.)

  • There’s a vaccine that’s 95% effective: out of every 100 people vaccinated against the disease, 95 are perfectly immune; the remaining 5 have no immunity at all.
  • The disease is highly contagious: out of every 100 non-immune people who are exposed to the disease, 95 will be infected.

If everyone is immunized, but one person becomes ill with the disease, how many people do they need to expose to the disease for the disease to spread?

Keeping things simple: an outbreak, by definition, is a situation where the number of infected people is steadily increasing. That can only happen if every sick person, on average, infects more than 1 other person with the illness. If that happens, then the rate of infection can grow exponentially, turning into an outbreak.

In our scheme here, only one out of 20 people is infectable – so, on average, if our infected person has enough contact with 20 people to pass an infection, then there’s a 95% chance that they’d pass the infection on to one other person. (19 of 20 are immune; the one remaining person has a 95% chance of getting infected). To get to an outbreak level – that is, a level where they’re probably going to infect more than one other person – they’d need to expose somewhere around 25 people (which would mean that each infected person, on average, could infect roughly 1.2 people). If they’re exposed to 20 other people on average, then on average, each infected person will infect roughly 0.9 other people – so the number of infected will decrease without turning into a significant outbreak.

But what will happen if just 5% of the population doesn’t get vaccinated? Then we’ve got 95% of the population getting vaccinated, with a 95% immunity rate – so roughly 90% of the population has vaccine immunity. Our pool of non-immune people has doubled. In our example scenario, if each person is exposed to 20 other people during their illness, then they will, on average, cause 1.8 people to get sick. And so we have a major outbreak on our hands!
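The arithmetic above is easy to reproduce. Here is a minimal sketch (my own illustration, using the post’s simplified numbers) of the effective reproduction number – the average number of people each sick person goes on to infect:

```python
# Average number of new infections per sick person, in the simplified model:
# contacts * (fraction of the population not immune) * (chance of infection).

def r_effective(coverage, efficacy=0.95, contacts=20, p_infect=0.95):
    immune_fraction = coverage * efficacy
    return contacts * (1 - immune_fraction) * p_infect

print(r_effective(1.00))  # 0.95  - everyone vaccinated: infections die out
print(r_effective(0.95))  # ~1.85 - 5% unvaccinated: a major outbreak
```

Any value below 1 means the infection fizzles out; any value above 1 means exponential growth.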

This illustrates the basic idea behind herd immunity. If you can successfully make a large enough portion of the population non-infectable by a disease, then the disease can’t spread through the population, even though the population contains a large number of infectable people. When the population’s immunity rate (either through vaccine, or through prior infection) gets to be high enough that an infection can no longer spread, the population is said to have herd immunity: even individuals who can’t be immunized no longer need to worry about catching it, because the population doesn’t have the capacity to spread it around in a major outbreak.

(In reality, the effectiveness of the measles vaccine really is in the 95 percent range – actually slightly higher than that; various sources estimate it somewhere between 95 and 97 percent effective! And the success rate of the vaccine isn’t binary: 95% of people will be fully immune; the remaining 5% will have a varying degree of immunity. And the infectivity of most diseases is lower than the example above. Measles (which is a highly, highly contagious disease, far more contagious than most!) is estimated to infect between 80 and 90 percent of exposed non-immune people. So if enough people are immunized, herd immunity will take hold even if more than 20 people are exposed by every sick person.)

Moving past herd immunity to my second point: there’s a paradox that some antivaccine people (including, recently, Sharyl Attkisson) use in their arguments. If you look at an outbreak of an illness that we vaccinate for, you’ll frequently find that more vaccinated people become ill than unvaccinated. And that, the antivaccine people say, shows that the vaccines don’t work, and the outbreak can’t be the fault of the unvaccinated folks.

Let’s look at the math to see the problem with that.

Let’s use the same numbers as above: 95% vaccine effectiveness, 95% contagion. In addition, let’s say that 2% of people choose to go unvaccinated.

That means that 98% of the population has been immunized, and 95% of those are immune. So roughly 93% of the population has immunity.

If each infected person has contact with 20 other people, then we can expect about 7% of those 20 to be infectable – or 1.4; and of those, 95% will become ill – or about 1.3. So on average, each sick person will infect more than one other person. That’s enough to cause a significant outbreak. Without the non-immunized people, the infection rate is less than 1 – not enough to cause an outbreak.

The non-immunized population reduced the herd immunity enough to cause an outbreak.

Within the population, how many immunized versus non-immunized people will get sick?

Out of every 100 people, there are about 5 who got vaccinated but aren’t immune. Out of that same 100 people, there are 2 (2% of 100) who didn’t get vaccinated. If every non-immune person is equally likely to become ill, then we’d expect that out of 100 cases of the disease, about 70 will be vaccinated people, and 30 unvaccinated.

The vaccinated population is much, much larger – 50 times larger! – than the unvaccinated.
Since that population is so much larger, we’d expect more vaccinated people to become ill, even though it’s the smaller unvaccinated group that broke the herd immunity!

The easiest way to see that is to take those numbers, and normalize them into probabilities – that is, figure out, within the pool of all vaccinated people, what their likelihood of getting ill after exposure is, and compare that to the likelihood of a non-vaccinated person becoming ill after exposure.

So, let’s start with the vaccinated people. Let’s say that we’re looking at a population of 10,000 people total. 98% were vaccinated; 2% were not.

  • The total pool of vaccinated people is 9800, and the total pool of unvaccinated is 200.
  • Of the 9800 who were vaccinated, 95% of them are immune, leaving 5% who are not – so
    490 infectable people.
  • Of the 200 people who weren’t vaccinated, all of them are infectable.
  • If everyone is exposed to the illness, then we would expect about 466 of the vaccinated, and 190 of the unvaccinated to become ill.

So more than twice the number of vaccinated people became ill. But:

  • The odds of a vaccinated person becoming ill are 466/9800, or about 1 out of every 21
  • The odds of an unvaccinated person becoming ill are 190/200 or 19 out of every 20 people! (Note: there was originally a typo in this line, which was corrected after it was pointed out in the comments.)
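These counts and odds are straightforward to verify. Here is a short sketch (my own illustration) of the computation for the 10,000-person example:

```python
# Population of 10,000: 98% vaccinated with a 95%-effective vaccine, 2% not.
pop = 10_000
vaccinated = int(0.98 * pop)        # 9800
unvaccinated = pop - vaccinated     # 200

p_ill = 0.95                        # chance a non-immune person becomes ill
infectable_vax = vaccinated * 0.05  # 490 vaccinated but not immune

ill_vax = infectable_vax * p_ill    # ~466 vaccinated cases
ill_unvax = unvaccinated * p_ill    # 190 unvaccinated cases

print(round(ill_vax), round(ill_unvax))  # 466 190: more vaccinated cases in total
print(round(vaccinated / ill_vax))       # 21: about 1 in 21 vaccinated fall ill
print(ill_unvax / unvaccinated)          # 0.95: 19 out of 20 unvaccinated fall ill
```

The absolute counts favour the antivaccine talking point, but the per-person odds tell the real story.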

The numbers can, if you look at them without considering the context, appear to be deceiving. The population of vaccinated people is so much larger than the population of unvaccinated that the total number of infected can give the wrong impression. But the facts are very clear: vaccination drastically reduces an individual’s chance of getting ill; and vaccinating the entire population dramatically reduces the chances of an outbreak.

The reality of vaccines is pretty simple.

  • Vaccines are highly effective.
  • The diseases that vaccines prevent are not benign.
  • Vaccines are really, really safe. None of the horror stories told by anti-vaccine people have any basis in fact. Vaccines don’t damage your immune system, they don’t cause autism, and they don’t cause cancer.
  • Not vaccinating your children (or yourself!) doesn’t just put you at risk for illness; it dramatically increases the chances of other people becoming ill. Even when more vaccinated people than unvaccinated become ill, that’s largely caused by the unvaccinated population.

In short: everyone who is healthy enough to be vaccinated should get vaccinated. If you don’t, you’re a despicable free-riding asshole who’s deliberately choosing to put not just yourself but other people at risk.

Robert HellingBohmian Rapsody

Visits to a Bohmian village

Over all of my physics life, I have been under the local influence of some Gaulish villages that have ideas about physics that are not 100% aligned with the mainstream views: When I was a student in Hamburg, I was good friends with people working on algebraic quantum field theory. Of course there were opinions that they were the only people seriously working on QFT, as they were proving theorems while others dealt only with perturbative series that are known to diverge and are thus obviously worthless. Funnily enough, they were literally sitting above the HERA tunnel, where electron-proton collisions took place that were very well described by exactly those divergent series. Still, I learned a lot from these people and would say there are few who have thought more deeply about structural properties of quantum physics. These days, I use more and more of these things in my own teaching (in particular in our Mathematical Quantum Mechanics and Mathematical Statistical Physics classes, as well as when thinking about foundations, see below) and even some other physicists are starting to use their language.

Later, as a PhD student at the Albert Einstein Institute in Potsdam, there was an accumulation point of people from the Loop Quantum Gravity community, with Thomas Thiemann and Renate Loll having long term positions and many others frequently visiting. As you probably know, a bit later, I decided (together with Giuseppe Policastro) to look into this more deeply, resulting in a series of papers that were well received at least amongst our peers and about which I am still a bit proud.

Now, I have been in Munich for over ten years. And here at the LMU math department there is a group calling themselves the Workgroup Mathematical Foundations of Physics. And let's be honest, I call them the Bohmians (and sometimes the Bohemians). And once more, most people believe that the Bohmian interpretation of quantum mechanics is just a fringe approach that is not worth wasting any time on. You will have already guessed it: I did so none the less. So here is a condensed report of what I learned and what I think should be the official opinion on this approach. This is an informal write up of a notes paper that I put on the arXiv today.

What Bohmians don't like about the usual (termed Copenhagen, lacking a better word) approach to quantum mechanics is that you are not allowed to talk about so many things, and that the observer plays such a prominent role by determining via a measurement what aspect is real and what is not. They think this is far too subjective. So rather, they want quantum mechanics to be about particles that are then allowed to follow trajectories.

"But we know this is impossible!" I hear you cry. So, let's see how this works. The key observation is that the Schrödinger equation for a Hamilton operator of the form kinetic term (possibly with magnetic field) plus potential term has a conserved current

$$j = \frac{1}{i}\left(\bar\psi\nabla\psi - (\nabla\bar\psi)\psi\right).$$

So as your probability density is $\rho=\bar\psi\psi$, you can think of that being made up of particles moving with a velocity field

$$v = j/\rho = 2\Im(\nabla \psi/\psi).$$

What this buys you is that if you have a bunch of particles that are initially distributed like the probability density and follow the flow of the velocity field, they will also later be distributed like $|\psi |^2$.
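To make the velocity field concrete, here is a tiny 1-D sketch (my own toy illustration in units $\hbar = 2m = 1$, not the entangled two-particle state of the paper): for a superposition of the two lowest particle-in-a-box modes, the Bohmian velocity vanishes while the wavefunction is real, and becomes nonzero once a relative phase develops.

```python
import cmath
import math

# Bohmian velocity v = 2*Im(psi'(x)/psi(x)) for a particle in a box on [0,1],
# in units hbar = 2m = 1, so the mode energies are E_n = (n*pi)**2.
# Toy state: equal superposition of the two lowest modes.

def psi(x, t):
    E1, E2 = math.pi ** 2, 4 * math.pi ** 2
    return (math.sin(math.pi * x) * cmath.exp(-1j * E1 * t)
            + math.sin(2 * math.pi * x) * cmath.exp(-1j * E2 * t))

def v(x, t, h=1e-6):
    dpsi = (psi(x + h, t) - psi(x - h, t)) / (2 * h)  # central difference
    return 2 * (dpsi / psi(x, t)).imag

print(v(0.3, 0.0))  # ~0: at t=0 the wavefunction is real, so nothing moves
print(v(0.3, 0.1))  # nonzero once the two modes have dephased
```

The same mechanism, in configuration space, drives the oscillation of the second particle in the two-particle example below.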

What is important is that they keep the Schrödinger equation intact. So everything that you can do with the original Schrödinger equation (i.e. everything) can be done in the Bohmian approach as well. If you set up your Hamiltonian to describe a double slit experiment, the Bohmian particles will flow nicely to the screen and arrange themselves in interference fringes (as the probability density does). So you will never come to a situation where any experimental outcome will differ from what the Copenhagen prescription predicts.

The price you have to pay, however, is that you end up with a very non-local theory: The velocity field lives in configuration space, so the velocity of every particle depends on the position of all other particles in the universe. I would say, this is already a show stopper (given what we know about quantum field theory whose raison d'être is locality) but let's ignore this aesthetic concern.

What got me into this business was the attempt to understand how set-ups like Bell's inequality and GHZ and the like work out, which are supposed to show that quantum mechanics cannot be classical (technically, that the state space cannot be described as local probability densities). The problem with those is that they are often phrased in terms of spin degrees of freedom, which have Hamiltonians that are not directly of the form above. You can use a Stern-Gerlach-type apparatus to translate the spin degree of freedom to a positional one, but at the price of a Hamiltonian that is not explicitly known, let alone one for which you can analytically solve the Schrödinger equation. So you don't see much.

But from Reinhard Werner and collaborators I learned how to set up qubit-like algebras from positional observables of free particles (at different times, so you get something non-commuting, which you need to make use of entanglement as a specific quantum resource). So here is my favourite example:

You start with two particles each following a free time evolution but confined to an interval. You set those up in a particular entangled state (stationary as it is an eigenstate of the Hamiltonian) built from the two lowest levels of the particle in the box. And then you observe for each particle if it is in the left or the right half of the interval.

From symmetry considerations (details in my paper) you can see that each particle is with the same probability on the left and the right. But they are anti-correlated when measured at the same time. But when measured at different times, the correlation oscillates like the cosine of the time difference.

From the Bohmian perspective, for the static initial state, the velocity field vanishes everywhere, nothing moves. But in order to capture the time dependent correlations, as soon as one particle has been measured, the position of the second particle has to oscillate in the box (how the measurement works in detail is not specified in the Bohmian approach since it involves other degrees of freedom and remember, everything depends on everything but somehow it has to work since you want to produce the correlations that are predicted by the Copenhagen approach).

The trajectory of the second particle depending on its initial position

This is somehow the Bohmian version of the collapse of the wave function but they would never phrase it that way.

And here is where it becomes problematic: If you could see the Bohmian particle moving you could decide if the other particle has been measured (it would oscillate) or not (it would stand still). No matter where the other particle is located. With this observation you could build a telephone that transmits information instantaneously, something that should not exist. So you have to conclude you must not be able to look at the second particle and see if it oscillates or not.

Bohmians tell you that you cannot, because all you are supposed to observe about the particles is their positions (and not their velocities). And if you try to measure the velocity by measuring the position at two instants in time, you fail, because the first observation disturbs the particle so much that it invalidates the original state.

As it turns out, you are not allowed to observe anything about the particles beyond the fact that they are distributed like $|\psi |^2$, because if you could, you could build a similar (at least statistical) telephone, as I explain in the paper. (This fact is known in the Bohmian literature, but I have found it nowhere so clearly demonstrated as in this two-particle system.)

My conclusion is that the Bohmian approach adds something (the particle positions) to the wave function, but then in the end tells you that you are not allowed to observe it, or to have any knowledge of it beyond what is already encoded in the wave function. It's like making up an invisible friend.

PS: If you haven't seen "Bohemian Rhapsody", yet, you should, even if there are good reasons to criticise the dramatisation of real events.

February 11, 2019

BackreactionA philosopher of science reviews “Lost in Math”

Jeremy Butterfield is a philosopher of science in Cambridge. I previously wrote about some of his work here, and have met him on various occasions. Butterfield recently reviewed my book “Lost in Math,” and you can now find this review online here. (I believe it was solicited for a journal named Physics in Perspective.) His is a very detailed review that focuses, unsurprisingly, on the

Doug NatelsonMore brief items

Some additional interesting links:

  • Another example of emergent universal behavior, as it is demonstrated that runners at the start of a marathon seem to collectively obey hydrodynamics, like a fluid.
  • The Voices of the Manhattan Project oral histories effort has a large number of interviews online.  It’s important for posterity that these were recorded before everyone involved is gone.
  • Maybe massive open online courses were not, in fact, the end of the traditional model of higher education.  Who could have foreseen this?
  • There are people arguing that the preprint arxiv model is a good path toward open access.  This is definitely something I like, especially compared with models in which authors pay thousands of dollars to for-profit publishers for open-access journals.

February 10, 2019

Terence TaoTwo events

Just a quick post to advertise two upcoming events sponsored by institutions I am affiliated with:

  1. The 2019 National Math Festival will be held in Washington D.C. on May 4 (together with some satellite events at other US cities).  This festival will have numerous games, events, films, and other activities, which are all free and open to the public.  (I am on the board of trustees of MSRI, which is one of the sponsors of the festival.)
  2. The Institute for Pure and Applied Mathematics (IPAM) is now accepting applications for its second Industrial Short Course for May 16-17 2019, with the topic of “Deep Learning and the Latest AI Algorithms“.  (I serve on the Scientific Advisory Board of this institute.)  This is an intensive course (in particular requiring active participation) aimed at industrial mathematicians involving both the theory and practice of deep learning and neural networks, taught by Xavier Bresson.   (Note: space is very limited, and there is also a registration fee of $2,000 for this course, which is expected to be in high demand.)

David Hoggstellar activity cycles

Ben Montet (Chicago) was visiting Flatiron today and gave a very nice talk. He gave a full review of exoplanet discovery science but focused on a few specific things. One of them was stellar activity: He has been using the NASA Kepler full-frame images (which were taken once per month, roughly) to look at precise stellar photometric variations over long timescales (because standard transit techniques filter out the long time scales, and they are hard to recover without full-frame images). He can see stellar activity cycles in many stars, and look at their relationships with things like stellar rotation periods (and hence ages) and so on. He does find relationships! The nice thing is that the NASA TESS Mission produces full-frame images every 30 minutes, so it has way more data relevant to these questions, although it doesn't observe (most of) the sky for very long. All these things are highly relevant to the things I have been thinking about for Terra Hunting Experiment and related projects, a point he made clearly in his talk.

February 09, 2019

BackreactionWhy a larger particle collider is not currently a good investment

LHC tunnel. Credits: CERN. That a larger particle collider is not currently a good investment is hardly a controversial position. While the costs per units of collision-energy have decreased over the decades thanks to better technology, the absolute cost of new machines has shot up. That the costs of larger particle colliders would at some point become economically prohibitive has been known

February 08, 2019

Matt von HippelThe Particle Physics Curse of Knowledge

There’s a debate raging right now in particle physics, about whether and how to build the next big collider. CERN’s Future Circular Collider group has been studying different options, some more expensive and some less (Peter Woit has a nice summary of these here). This year, the European particle physics community will debate these proposals, deciding whether to include them in an updated European Strategy for Particle Physics. After that, it will be up to the various countries that are members of CERN to decide whether to fund the proposal. With the costs of the more expensive options hovering around $20 billion, this has led to substantial controversy.

I’m not going to offer an opinion here one way or another. Weighing this kind of thing requires knowing the alternatives: what else the European particle physics community might lobby for in the next few years, and once they decide, what other budget priorities each individual country has. I know almost nothing about either.

Instead of an opinion, I have an observation:

Imagine that primatologists had proposed a $20 billion primate center, able to observe gorillas in greater detail than ever before. The proposal might be criticized in any number of ways: there could be much cheaper ways to accomplish the same thing, the project might fail, it might be that we simply don’t care enough about primate behavior to spend $20 billion on it.

What you wouldn’t expect is the claim that a $20 billion primate center would teach us nothing new.

It probably wouldn’t teach us “$20 billion worth of science”, whatever that means. But a center like that would be guaranteed to discover something. That’s because we don’t expect primatologists’ theories to be exact. Even if gorillas behaved roughly as primatologists expected, the center would still see new behaviors, just as a consequence of looking at a new level of detail.

To pick a physics example, consider the gravitational wave telescope LIGO. Before their 2016 observation of two black holes merging, LIGO faced substantial criticism. After their initial experiments didn’t detect anything, many physicists thought that the project was doomed to fail: that it would never be sensitive enough to detect the faint signals of gravitational waves past the messy vibrations of everyday life on Earth.

When it finally worked, though, LIGO did teach us something new. Not the existence of gravitational waves, we already knew about them. Rather, LIGO taught us new things about the kinds of black holes that exist. LIGO observed much bigger black holes than astronomers expected, a surprise big enough that it left some people skeptical. Even if it hadn’t, though, we still would almost certainly observe something new: there’s no reason to expect astronomers to perfectly predict the size of the universe’s black holes.

Particle physics is different.

I don’t want to dismiss the work that goes into collider physics (far too many people have dismissed it recently). Much, perhaps most, of the work on the LHC is dedicated not to detecting new particles, but to confirming and measuring the Standard Model. A new collider would bring heroic scientific effort. We’d learn revolutionary new things about how to build colliders, how to analyze data from colliders, and how to use the Standard Model to make predictions for colliders.

In the end, though, we expect those predictions to work. And not just to work reasonably well, but to work perfectly. While we might see something beyond the Standard Model, the default expectation is that we won’t, that after doing the experiments and analyzing the data and comparing to predictions we’ll get results that are statistically indistinguishable from an equation we can fit on a T-shirt. We’ll fix the constants on that T-shirt to an unprecedented level of precision, yes, but the form of the equation may well stay completely the same.

I don’t think there’s another field where that’s even an option. Nowhere else in all of science could we observe the world in unprecedented detail, capturing phenomena that had never been seen before…and end up perfectly matching our existing theory. There’s no other science where anyone would even expect that to happen.

That makes the argument here different from any argument we’ve faced before. It forces people to consider their deep priorities, to think not just about the best way to carry out this test or that but about what science is supposed to be for. I don’t think there are any easy answers. We’re in what may well be a genuinely new situation, and we have to figure out how to navigate it together.

Postscript: I still don’t want to give an opinion, but given that I didn’t have room for this above let me give a fragment of an opinion: Higgs triple couplings!!!

David Hoggmeasuring the Galaxy with HARPS? de-noising

Megan Bedell (Flatiron) was at Yale yesterday; people there pointed out that some of the time-variable telluric lines we see in our wobble model of the HARPS data are not telluric at all; they are in fact interstellar medium lines. That got her thinking: Could we measure our velocity with respect to the local ISM using HARPS? The answer is obviously yes, and this could have strong implications for the Milky Way rotation curve! The signal should be a dipolar pattern of RV shifts in interstellar lines as you look around the Sun in celestial coordinates. In the barycentric reference frame, of course.

I also got great news first thing this morning: The idea that Soledad Villar (NYU) and I discussed yesterday about using a generative adversarial network trained on noisy data to de-noise noisy data was a success: It works! Of course, being a mathematician, her reaction was “I think I can prove something!” Mine was: Let's start using it! Probably the mathematical reaction is the better one. If we move on this it will be my first ever real foray into deep learning.

February 07, 2019

Tommaso DorigoCMS Discovers New Excited Bc Hadron

I am very happy to report today that the CMS experiment has just proved itself an excellent spectrometer - as good as they get, I would say - by discovering two new excited B hadrons. The field of heavy-meson spectroscopy proves once again to be rich with new gems ready to be unearthed as we collect more data and dig deeper. For such discoveries to be made, collecting as many proton-proton collisions as possible is in fact the decisive factor, along with following up good ideas and leaving no stone unturned.


Terence TaoRequest for comments from the ICM Structure Committee

[This post is collectively authored by the ICM structure committee, whose membership includes myself, and is listed in full in the post below – T.]

The International Congress of Mathematicians (ICM) is widely considered to be the premier conference for mathematicians.  It is held every four years; for instance, the 2018 ICM was held in Rio de Janeiro, Brazil, and the 2022 ICM is to be held in Saint Petersburg, Russia.  The most high-profile event at the ICM is the awarding of the 10 or so prizes of the International Mathematical Union (IMU) such as the Fields Medal, and the lectures by the prize laureates; but there are also approximately twenty plenary lectures from leading experts across all mathematical disciplines, several public lectures of a less technical nature, about 180 more specialised invited lectures divided into about twenty section panels, each corresponding to a mathematical field (or range of fields), as well as various outreach and social activities, exhibits and satellite programs, and meetings of the IMU General Assembly; see for instance the program for the 2018 ICM for a sample schedule.  In addition to these official events, the ICM also provides more informal networking opportunities, in particular allowing mathematicians at all stages of career, and from all backgrounds and nationalities, to interact with each other.

For each Congress, a Program Committee (together with subcommittees for each section) is entrusted with the task of selecting who will give the lectures of the ICM (excluding the lectures by prize laureates, which are selected by separate prize committees); they also have decided how to appropriately subdivide the entire field of mathematics into sections.   Given the prestigious nature of invitations from the ICM to present a lecture, this has been an important and challenging task, but one that past Program Committees have managed to fulfill in a largely satisfactory fashion.

Nevertheless, in the last few years there has been substantial discussion regarding ways in which the process for structuring the ICM and inviting lecturers could be further improved, for instance to reflect the fact that the distribution of mathematics across various fields has evolved over time.   At the 2018 ICM General Assembly meeting in Rio de Janeiro, a resolution was adopted to create a new Structure Committee to take on some of the responsibilities previously delegated to the Program Committee, focusing specifically on the structure of the scientific program.  On the other hand, the Structure Committee is not involved with the format for prize lectures, the selection of prize laureates, or the selection of plenary and sectional lecturers; these tasks are instead the responsibilities of other committees (the local Organizing Committee, the prize committees, and the Program Committee respectively).

The first Structure Committee was constituted on 1 Jan 2019, with the following members:

As one of our first actions, we on the committee are using this blog post to solicit input from the mathematical community regarding the topics within our remit.  Among the specific questions (in no particular order) for which we seek comments are the following:

  1. Are there suggestions to change the format of the ICM that would increase its value to the mathematical community?
  2. Are there suggestions to change the format of the ICM that would encourage greater participation and interest in attending, particularly with regards to junior researchers and mathematicians from developing countries?
  3. What is the correct balance between research and exposition in the lectures?  For instance, how strongly should one emphasize the importance of good exposition when selecting plenary and sectional speakers?  Should there be “Bourbaki style” expository talks presenting work not necessarily authored by the speaker?
  4. Is the balance between plenary talks, sectional talks, and public talks at an optimal level?  There is only a finite amount of space in the calendar, so any increase in the number or length of one of these types of talks will come at the expense of another.
  5. The ICM is generally perceived to be more important to pure mathematics than to applied mathematics.  In what ways can the ICM be made more relevant and attractive to applied mathematicians, or should one not try to do so?
  6. Are there structural barriers that cause certain areas or styles of mathematics (such as applied or interdisciplinary mathematics) or certain groups of mathematicians to be under-represented at the ICM?  What, if anything, can be done to mitigate these barriers?

Of course, we do not expect these complex and difficult questions to be resolved within this blog post, and debating these and other issues would likely be a major component of our internal committee discussions.  Nevertheless, we would value constructive comments towards the above questions (or on other topics within the scope of our committee) to help inform these subsequent discussions.  We therefore welcome and invite such commentary, either as responses to this blog post, or sent privately to one of the members of our committee.  We would also be interested in having readers share their personal experiences at past congresses, and how it compares with other major conferences of this type.   (But in order to keep the discussion focused and constructive, we request that comments here refrain from discussing topics that are out of the scope of this committee, such as suggesting specific potential speakers for the next congress, which is a task instead for the 2022 ICM Program Committee.)

John BaezApplied Category Theory 2019

I hope to see you at this conference, which will occur right before the associated school meets in Oxford:

Applied Category Theory 2019, July 15-19, 2019, Oxford, UK.

Applied category theory is a topic of interest for a growing community of researchers, interested in studying systems of all sorts using category-theoretic tools. These systems are found in the natural sciences and social sciences, as well as in computer science, linguistics, and engineering. The background and experience of our members is as varied as the systems being studied. The goal of the ACT2019 Conference is to bring the majority of researchers in the field together and provide a platform for exposing the progress in the area. Both original research papers and extended abstracts of work submitted/accepted/published elsewhere will be considered.

There will be best paper award(s) and selected contributions will be awarded extended keynote slots.

The conference will include a business showcase and tutorials, and there also will be an adjoint school, the following week (see webpage).

Important dates

Submission of contributed papers: 3 May
Acceptance/Rejection notification: 7 June


Prospective speakers are invited to submit one (or more) of the following:

• Original contributions of high quality work consisting of a 5-12 page extended abstract that provides sufficient evidence of results of genuine interest and enough detail to allow the program committee to assess the merits of the work. Submissions of works in progress are encouraged but must be more substantial than a research proposal.

• Extended abstracts describing high quality work submitted/published elsewhere will also be considered, provided the work is recent and relevant to the conference. These consist of a maximum 3 page description and should include a link to a separate published paper or preprint.

The conference proceedings will be published in a dedicated Proceedings issue of the new Compositionality journal:

Only original contributions are eligible to be published in the proceedings.

Submissions should be prepared using LaTeX, and must be submitted in PDF format. Use of the Compositionality style is encouraged. Submission is done via EasyChair:

Program chairs

John Baez (U.C. Riverside)
Bob Coecke (University of Oxford)

Program committee

Bob Coecke (chair)
John Baez (chair)
Christina Vasilakopoulou
David Moore
Josh Tan
Stefano Gogioso
Brendan Fong
Steve Lack
Simona Paoli
Joachim Kock
Kathryn Hess Bellwald
Tobias Fritz
David I. Spivak
Ross Duncan
Dan Ghica
Valeria de Paiva
Jeremy Gibbons
Samuel Mimram
Aleks Kissinger
Jamie Vicary
Martha Lewis
Nick Gurski
Dusko Pavlovic
Chris Heunen
Corina Cirstea
Helle Hvid Hansen
Dan Marsden
Simon Willerton
Pawel Sobocinski
Dominic Horsman
Nina Otter
Miriam Backens

Steering committee

John Baez (U.C. Riverside)
Bob Coecke (University of Oxford)
David Spivak (M.I.T.)
Christina Vasilakopoulou (U.C. Riverside)

n-Category Café Applied Category Theory 2019

I hope to see you at this conference!


February 06, 2019

Clifford JohnsonAt the Perimeter

In case you were putting the kettle on to make tea for watching the live cast, or putting on your boots to head out to see it in person: my public talk at the Perimeter Institute has been postponed to tomorrow! It'll be just as graphic! Here's a link to the event's details.

-cvj

The post At the Perimeter appeared first on Asymptotia.

Jordan EllenbergLarge-scale Pareto-optimal topologies, or: how to describe a hexahedron

I got to meet Karen Caswelch, the CEO of Madison startup SciArtSoft last week. The company is based on tech developed by my colleague Krishnan Suresh. When I looked at one of his papers about this stuff I was happy to find there was a lovely piece of classical solid geometry hidden in it!

Here’s the deal. You want to build some component out of metal, which metal is to be contained in a solid block. So you can think of the problem as: you start with a region V in R^3, and your component is going to be some subregion W in R^3. For each choice of W there’s some measure of “compliance” which you want to minimize; maybe it’s fragility, maybe it’s flexibility, I dunno, depends on the problem. (Sidenote: I think lay English speakers would want “compliance” to refer to something you’d like to maximize, but I’m told this usage is standard in engineering.) (Subsidenote: I looked into this and now I get it — compliance literally refers to flexibility; it is the inverse of stiffness, just like in the lay sense. If you’re a doctor you want your patient to comply to their medication schedule, thus bending to outside pressure, but bending to outside pressure is precisely what you do not want your metal widget to do.)

So you want to minimize compliance, but you also want to minimize the weight of your component, which means you want vol(W) to be as small as possible. These goals are in conflict. Little lacy structures are highly compliant.

It turns out you can estimate compliance by breaking W up into a bunch of little hexahedral regions, computing compliance on each one, and summing. For reasons beyond my knowledge you definitely don’t want to restrict to chopping uniformly into cubes. So a priori you have millions and millions of differently shaped hexahedra. And part of the source of Suresh’s speedup is to gather these into approximate congruence classes so you can do a compliance computation for a whole bunch of nearly congruent hexahedra at once. And here’s where the solid geometry comes in; an old theorem of Cauchy tells you that if you know what a convex polyhedron’s 1-skeleton looks like as a graph, and you know the congruence classes of all the faces, you know the polyhedron up to rigid motion. In particular, you can just triangulate each face of the hexahedron with a diagonal, and record the congruence class by 18 numbers, which you can then record in a hash table. You sort the hashes and then you can instantly see your equivalence classes of hexahedra.

(Related: the edge lengths of a tetrahedron determine its volume but the areas of the faces don’t.)
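
The hashing step can be sketched in a few lines (a minimal illustration, not Suresh's actual code; the vertex ordering, choice of face diagonals, and tolerance are all my assumptions):

```python
import numpy as np

# Vertex ordering assumption: 0-3 form the bottom face in cyclic order,
# 4-7 the top face directly above. Each quadrilateral face is triangulated
# by one fixed diagonal, giving 12 edges + 6 diagonals = 18 lengths.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0),      # bottom face
         (4, 5), (5, 6), (6, 7), (7, 4),      # top face
         (0, 4), (1, 5), (2, 6), (3, 7)]      # vertical edges
FACE_DIAGONALS = [(0, 2), (4, 6), (0, 5), (1, 6), (2, 7), (3, 4)]

def congruence_key(vertices, tol=1e-6):
    """18 rounded lengths. By Cauchy's rigidity theorem, convex hexahedra with
    the same combinatorics and the same key are congruent, so they can share a
    single compliance computation."""
    v = np.asarray(vertices, dtype=float)
    lengths = (np.linalg.norm(v[i] - v[j]) for i, j in EDGES + FACE_DIAGONALS)
    return tuple(round(length / tol) for length in lengths)

# Nearly congruent hexahedra hash to the same key, so a dict buckets them.
cube = np.array([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                 (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)], dtype=float)
```

A translated or rotated copy of `cube` produces the same key, while a rescaled copy does not, which is exactly the bucketing behavior described above.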

Doug NatelsonBrief items

This is the absolute most dense time of the year in terms of administrative obligations, so posting is going to be a bit sparse.  In the meantime, here is a bit of interesting reading:

Scientific American has an interesting article about the fact that two independent means of assessing the Hubble constant (analysis of the cosmic microwave background on one hand; analysis of "standard candles" on the other) disagree well outside the estimated systematic uncertainties.

Kip Thorne posted a biographical reminiscence about John Wheeler on the arxiv.  I haven't read it yet, but it's in my queue.

Quanta Magazine has put up a very well done article about turbulence.  Good stuff.  I liked the animation.

John BaezFermat Primes and Pascal’s Triangle

If you take the entries of Pascal’s triangle mod 2 and draw black for 1 and white for 0, you get a pleasing pattern:

The 2^nth row consists of all 1’s. If you look at the triangle consisting of the first 2^n rows, and take the limit as n \to \infty, you get a fractal called the Sierpinski gasket. This can also be formed by repeatedly cutting triangular holes out of an equilateral triangle:

Something nice happens if you interpret the rows of Pascal’s triangle mod 2 as numbers written in binary:

1 = 1
11 = 3
101 = 5
1111 = 15
10001 = 17
110011 = 51
1010101 = 85
11111111 = 255
100000001 = 257

Notice that some of these rows consist of two 1’s separated by a row of 0’s. These give the famous ‘Fermat numbers‘:

11 = 3 = 2^{2^0} + 1
101 = 5 = 2^{2^1} + 1
10001 = 17 = 2^{2^2} + 1
10000001 = 257 = 2^{2^3} + 1
1000000000000001 = 65537 = 2^{2^4} + 1

The numbers listed above are all prime. Based on this evidence Fermat conjectured that all numbers of the form 2^{2^n} + 1 are prime. But Euler crushed this dream by showing that the next Fermat number, 2^{2^5} + 1, is not prime.

Indeed, even today, no other Fermat numbers are known to be prime! People have checked all of them up to 2^{2^{32}} + 1. They’ve even checked a few bigger ones, the largest being

2^{2^{3329780}} + 1

which turns out to be divisible by

193 \times 2^{3329782} + 1

Here are some much easier challenges:

Puzzle 1. Show that every row of Pascal’s triangle mod 2 corresponds to a product of distinct Fermat numbers:

1 = 1
11 = 3
101 = 5
1111 = 15 = 3 × 5
10001 = 17
110011 = 51 = 3 × 17
1010101 = 85 = 5 × 17
11111111 = 255 = 3 × 5 × 17
100000001 = 257

and so on. Also show that every product of distinct Fermat numbers corresponds to a row of Pascal’s triangle mod 2. What is the pattern?

By the way: the first row, 1, corresponds to the empty product.

Puzzle 2. Show that the product of the first n Fermat numbers is 2 less than the next Fermat number:

3 + 2 = 5
3 × 5 + 2 = 17
3 × 5 × 17 + 2 = 257
3 × 5 × 17 × 257 + 2 = 65537

and so on.
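
This identity is quick to check numerically (a throwaway sketch; the helper name `fermat` is mine):

```python
def fermat(n):
    # the nth Fermat number, 2^(2^n) + 1
    return 2 ** (2 ** n) + 1

product = 1                               # empty product, for n = 0
for n in range(8):
    assert product + 2 == fermat(n)       # F_0 * ... * F_{n-1} is 2 less than F_n
    product *= fermat(n)

assert fermat(5) % 641 == 0               # Euler's factor of F_5
```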

Now, Gauss showed that we can construct a regular n-gon using straight-edge and compass if n is a prime Fermat number. Wantzel went further and showed that if n is odd, we can construct a regular n-gon using straight-edge and compass if and only if n is a product of distinct Fermat primes.

We can construct other regular polygons from these by repeatedly bisecting the angles. And it turns out that’s all:

Gauss–Wantzel Theorem. We can construct a regular n-gon using straight-edge and compass if and only if n is a power of 2 times a product of distinct Fermat primes.

There are only 5 known Fermat primes: 3, 5, 17, 257 and 65537. So, our options for constructing regular polygons with an odd number of sides are extremely limited! There are only 2^5 = 32 options, if we include the regular 1-gon.

Puzzle 3. What is a regular 1-gon? What is a regular 2-gon?

And, as noted in The Book of Numbers by Conway and Guy, the 32 constructible regular polygons with an odd number of sides correspond to the first 32 rows of Pascal’s triangle!

1 = 1
11 = 3
101 = 5
1111 = 15 = 3 × 5
10001 = 17
110011 = 51 = 3 × 17
1010101 = 85 = 5 × 17
11111111 = 255 = 3 × 5 × 17
100000001 = 257
1100000011 = 771 = 3 × 257
10100000101 = 1285 = 5 × 257
101010010101 = 3855 = 3 × 5 × 257

and so on. Here are all 32 rows, borrowed from the Online Encyclopedia of Integer Sequences:

And here are all 32 odd numbers n for which we know that a regular n-gon is constructible by straight-edge and compass:

1, 3, 5, 15, 17, 51, 85, 255, 257, 771, 1285, 3855, 4369, 13107, 21845, 65535, 65537, 196611, 327685, 983055, 1114129, 3342387, 5570645, 16711935, 16843009, 50529027, 84215045, 252645135, 286331153, 858993459, 1431655765, 4294967295
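
As a sanity check, a few lines of Python (a sketch using the standard-library `math.comb`) confirm that the first 32 rows of Pascal’s triangle mod 2, read as binary numerals, are exactly the 32 products of distinct known Fermat primes:

```python
from math import comb, prod
from itertools import combinations

def row_value(n):
    # read row n of Pascal's triangle mod 2 as a binary numeral
    return int(''.join(str(comb(n, k) % 2) for k in range(n + 1)), 2)

fermat_primes = [3, 5, 17, 257, 65537]
subset_products = {prod(s) for r in range(6)
                   for s in combinations(fermat_primes, r)}   # all 32 subsets

rows = {row_value(n) for n in range(32)}
assert rows == subset_products    # every row is a product of distinct Fermat primes
assert max(rows) == 4294967295    # 3 * 5 * 17 * 257 * 65537
```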

So, the largest known odd n for which a regular n-gon is constructible is 4294967295. This is the product of all 5 known Fermat primes:

4294967295 = 3 × 5 × 17 × 257 × 65537

Thanks to Puzzle 2, this is 2 less than the next Fermat number:

4294967295 = 2^{2^5} - 1

We can construct a regular polygon with one more side, namely

4294967296 = 2^{2^5}

sides, because this is a power of 2. But we can’t construct a regular polygon with one more side than that, namely

4294967297 = 2^{2^5} + 1

because Euler showed this Fermat number is not prime.

So, we’ve hit the end of the road… unless someone discovers another Fermat prime.

February 05, 2019

Terence TaoWord maps close to the identity

While talking mathematics with a postdoc here at UCLA (March Boedihardjo) we came across the following matrix problem which we managed to solve, but the proof was cute and the process of discovering it was fun, so I thought I would present the problem here as a puzzle without revealing the solution for now.

The problem involves word maps on a matrix group, which for sake of discussion we will take to be the special orthogonal group SO(3) of real 3 \times 3 matrices (one of the smallest matrix groups that contains a copy of the free group, which incidentally is the key observation powering the Banach-Tarski paradox).  Given any abstract word w of two generators x,y and their inverses (i.e., an element of the free group {\bf F}_2), one can define the word map w: SO(3) \times SO(3) \to SO(3) simply by substituting a pair of matrices in SO(3) into these generators.  For instance, if one has the word w = x y x^{-2} y^2 x, then the corresponding word map w: SO(3) \times SO(3) \to SO(3) is given by

\displaystyle w(A,B) := ABA^{-2} B^2 A

for A,B \in SO(3).  Because SO(3) contains a copy of the free group, we see the word map is non-trivial (not equal to the identity) if and only if the word itself is nontrivial.

Anyway, here is the problem:

Problem. Does there exist a sequence w_1, w_2, \dots of non-trivial word maps w_n: SO(3) \times SO(3) \to SO(3) that converge uniformly to the identity map?

To put it another way, given any \varepsilon > 0, does there exist a non-trivial word w such that \|w(A,B) - 1 \| \leq \varepsilon for all A,B \in SO(3), where \|\cdot\| denotes (say) the operator norm, and 1 denotes the identity matrix in SO(3)?

As I said, I don’t want to spoil the fun of working out this problem, so I will leave it as a challenge. Readers are welcome to share their thoughts, partial solutions, or full solutions in the comments below.
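For readers who want to experiment numerically before attacking the proof, here is a small sketch (mine, with ad hoc helper functions, not part of the puzzle) that evaluates the example word map w(A,B) = A B A^{-2} B^2 A on a grid of rotation pairs and measures how far from the identity it gets; this particular word strays far from the identity on some inputs.

```python
import math

def rot_z(t):  # rotation by angle t about the z-axis
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):  # rotation by angle t about the x-axis
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):  # for rotation matrices, the transpose is the inverse
    return [[A[j][i] for j in range(3)] for i in range(3)]

def word_map(A, B):
    # the example word w = x y x^{-2} y^2 x, i.e. w(A,B) = A B A^{-2} B^2 A
    Ainv = transpose(A)
    M = A
    for F in (B, Ainv, Ainv, B, B, A):
        M = matmul(M, F)
    return M

def dist_to_identity(M):
    # Frobenius norm of M - I (comparable to the operator norm in 3x3)
    return math.sqrt(sum((M[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(3) for j in range(3)))

angles = [0.1 * k for k in range(63)]  # a grid covering a full circle
worst = max(dist_to_identity(word_map(rot_z(s), rot_x(t)))
            for s in angles for t in angles)
print(worst)  # well above zero: this word is not uniformly close to the identity
```

Of course, the puzzle asks whether *some* cleverly chosen long word can be made uniformly small in this sense, which a finite experiment like this cannot settle.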

n-Category Café Jacobi Manifolds

Here at the conference Foundations of Geometric Structures of Information 2019, Aïssa Wade of Penn State gave a talk about Jacobi manifolds. She got my attention with these words: “Poisson geometry is a good framework for classical mechanics, while contact geometry is the right framework for classical thermodynamics. Jacobi manifolds are a natural bridge between these.”

So what’s a Jacobi manifold?

It’s really simple: a Jacobi manifold is a smooth manifold M such that the vector space C^\infty(M) is equipped with the structure of a Lie algebra

\{\cdot, \cdot \} \colon C^\infty(M) \times C^\infty(M) \to C^\infty(M)

and the bracket is ‘local’ in the following sense:

\mathrm{supp} \{f,g\} \subseteq \mathrm{supp}\, f \cap \mathrm{supp}\, g

The most famous Jacobi manifolds are the Poisson manifolds, where the Lie bracket obeys this extra rule:

\{f,g h\} = \{f,g\} h + g \{f, h\}

For any Jacobi manifold, the bracket can be written as

\{f, g\} = (f \, dg - g \, df)(v) + (df \wedge dg)(\Pi)

for some unique vector field v and bivector field \Pi. The fields v and \Pi need to obey some identities to ensure that the bracket obeys the Jacobi identity. If we’ve got a Jacobi manifold with v = 0, so that

\{f, g\} = (df \wedge dg)(\Pi)

then our Jacobi manifold is a Poisson manifold, and \Pi is called the Poisson bivector or Poisson tensor. Conversely, any Poisson manifold has

\{f, g\} = (df \wedge dg)(\Pi)

for some bivector field \Pi.

So, generalizing from Poisson manifolds to Jacobi manifolds amounts to allowing a nonzero vector field v in our formula for the bracket.
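To see the difference concretely, here is a small symbolic check of mine (not from the talk), using SymPy. On M = \mathbb{R}, take \Pi = 0 and v = d/dx, so the bracket is \{f,g\} = f g' - g f'. It satisfies antisymmetry and the Jacobi identity, but violates the Leibniz rule, so it is a Jacobi structure that is not Poisson.

```python
from sympy import Function, symbols, expand

x = symbols('x')
f, g, h = (Function(name)(x) for name in ('f', 'g', 'h'))

def bracket(a, b):
    # Jacobi bracket on M = R with v = d/dx and Pi = 0:
    # {a, b} = (a db - b da)(v) = a b' - b a'
    return a * b.diff(x) - b * a.diff(x)

# antisymmetry and the Jacobi identity hold...
antisym = expand(bracket(f, g) + bracket(g, f))
jacobi = expand(bracket(f, bracket(g, h))
                + bracket(g, bracket(h, f))
                + bracket(h, bracket(f, g)))
print(antisym, jacobi)  # both 0

# ...but the Leibniz rule fails, so this bracket is Jacobi, not Poisson:
leibniz_defect = expand(bracket(f, g * h)
                        - (bracket(f, g) * h + g * bracket(f, h)))
print(leibniz_defect)  # nonzero in general (it equals g h f')
```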

But Aïssa favors another way of thinking about Jacobi manifolds. Apparently a Jacobi manifold structure on M gives a way of making some principal \mathbb{R}^\ast-bundle over M into a Poisson manifold, where \mathbb{R}^\ast is the multiplicative group of the reals. The Poisson structures we get this way are homogeneous of degree -1 with respect to the \mathbb{R}^\ast action. I’m not sure I’ve got all the details right here, but she stated an equivalence of categories between Jacobi manifolds and something like “manifolds equipped with a principal \mathbb{R}^\ast-bundle equipped with a Poisson bracket that’s homogeneous of degree -1”.

This viewpoint sets up the connection to contact geometry. I understand how contact geometry is the right geometry for classical thermodynamics, and I understand how Poisson geometry is the right geometry for classical mechanics… so I should be almost ready to understand how Jacobi manifolds are a nice home for both these branches of physics!

But my insight quits right around here, so if you want more, I’m afraid you’ll have to try these papers, and the references therein:

By the way, Vitagliano and Wade are doing something a lot deeper than the simple stuff I’m discussing here. They are talking about holomorphic Jacobi manifolds, and answering the question “what is the global object whose infinitesimal counterpart is a holomorphic Jacobi manifold?” To understand this question you have to first realize that just as Lie algebras are infinitesimal versions of Lie groups, Poisson manifolds can be seen as infinitesimal versions of ‘symplectic groupoids’. This is a fascinating story! We can then generalize this idea to Jacobi manifolds… and then adapt it to holomorphic Jacobi manifolds. But I’m still interested in much more basic stuff, like: what are we doing if we use a Jacobi manifold rather than a Poisson manifold as the ‘phase space’ for a classical system?

BackreactionString theory landscape predicts no new particles at the LHC

In a paper that appeared on the arXiv last week, Howard Baer and collaborators predict masses of new particles using the string theory landscape. They argue that the Large Hadron Collider should not have seen them so far, and likely will not see them in the upcoming run. Instead, it would at least take an upgrade of the LHC to higher collision energy to see any. The idea underlying their

February 04, 2019

Scott AaronsonSabineblogging

I’ve of course been following the recent public debate about whether to build a circular collider to succeed the LHC—notably including Sabine Hossenfelder’s New York Times column arguing that we shouldn’t.  (See also the responses by Jeremy Bernstein and Lisa Randall, and the discussion on Peter Woit’s blog, and Daniel Harlow’s Facebook thread, and this Vox piece by Kelsey Piper.)  Let me blog about this as a way of cracking my knuckles or tuning my violin, just getting back into blog-shape after a long hiatus for travel and family and the beginning of the semester.

Regardless of whether this opinion is widely shared among my colleagues, I like Sabine.  I’ve often found her blogging funny and insightful, and I wish more non-Lubos physicists would articulate their thoughts for the public the way she does, rather than just standing on the sidelines and criticizing the ones who do. I find it unfortunate that some of the replies to Sabine’s arguments dwelled on her competence and “standing” in physics (even if we set aside—as we should—Lubos’s misogynistic rants, whose predictability could be used to calibrate atomic clocks). It’s like this: if high-energy physics had reached a pathological state of building bigger and bigger colliders for no good reason, then we’d expect that it would take a semi-outsider to say so in public, so then it wouldn’t be a further surprise to find precisely such a person doing it.

Not for the first time, though, I find myself coming down on the opposite side as Sabine. Basically, if civilization could get its act together and find the money, I think it would be pretty awesome to build a new collider to push forward the energy frontier in our understanding of the universe.

Note that I’m not making the much stronger claim that this is the best possible use of $20 billion for science. Plausibly a thousand $20-million projects could be found that would advance our understanding of reality by more than a new collider would. But it’s also important to realize that that’s not the question at stake here. When, for example, the US Congress cancelled the Superconducting Supercollider midway through construction—partly, it’s believed, on the basis of opposition from eminent physicists in other subfields, who argued that they could do equally important science for much cheaper—none of the SSC budget, as in 0% of it, ever did end up redirected to those other subfields. In practice, then, the question of “whether a new collider is worth it” is probably best considered in absolute terms, rather than relative to other science projects.

What I found most puzzling, in Sabine’s writings on this subject, was the leap in logic from

  1. many theorists expected that superpartners, or other new particles besides the Higgs boson, had a good chance of being discovered at the LHC, based on statistical arguments about “natural” parameter values, and
  2. the basic soundness of naturalness arguments was always open to doubt, and indeed the LHC results to date offer zero support for them, and
  3. many of the same theorists now want an even bigger collider, and continue to expect new particles to be found, and haven’t sufficiently reckoned with their previous failed predictions, to …
  4. therefore we shouldn’t build the bigger collider.

How do we get from 1-3 to 4: is the idea that we should punish the errant theorists, by withholding an experiment that they want, in order to deter future wrong predictions? After step 3, it seems to me that Sabine could equally well have gone to: and therefore it’s all the more important that we do build a new collider, in order to establish all the more conclusively that there’s just an energy desert up there—and that I, Sabine, was right to emphasize that possibility, and those other theorists were wrong to downplay it!

Like, I gather that there are independently motivated scenarios where there would be only the Higgs at the LHC scale, and then new stuff at the next energy scale beyond it. And as an unqualified outsider who enjoys talking to friends in particle physics and binge-reading about it, I’d find it hard to assign the totality of those scenarios less than ~20% credence or more than ~80%—certainly if the actual experts don’t either.

And crucially, it’s not as if raising the collision energy is just one arbitrary direction in which to look for new fundamental physics, among a hundred a-priori equally promising directions. Basically, there’s raising the collision energy and then there’s everything else. By raising the energy, you’re not testing one specific idea for physics beyond Standard Model, but a hundred or a thousand ideas in one swoop.

The situation reminds me a little of the quantum computing skeptics who say: scalable QC can never work, in practice and probably even in principle; the mainstream physics community only thinks it can work because of groupthink and hype; therefore, we shouldn’t waste more funds trying to make it work. With the sole, very interesting exception of Gil Kalai, none of the skeptics ever seem to draw what strikes me as an equally logical conclusion: whoa, let’s go full speed ahead with trying to build a scalable QC, because there’s an epochal revolution in physics to be had here—once the experimenters finally see that I was right and the mainstream was wrong, and they start to unravel the reasons why!

Of course, $20 billion is a significant chunk of change, by the standards of science even if not by the standards of random government wastages (like our recent $11 billion shutdown). And ultimately, decisions do need to be made about which experiments are most interesting to pursue with limited resources. And if a future circular collider were built, and if it indeed just found a desert, I think the balance would tilt pretty strongly toward Sabine’s position—that is, toward declining to build an even bigger and more expensive collider after that. If the Patriots drearily won every Superbowl 13-3, year after year after year, eventually no one would watch anymore and the Superbowl would get cancelled (well, maybe that will happen for other reasons…).

But it’s worth remembering that—correct me if I’m wrong—so far there have been no cases in the history of particle physics of massively expanding the energy frontier and finding absolutely nothing new there (i.e., nothing that at least conveyed multiple bits of information, as the Higgs mass did). And while my opinion should count for less than a neutrino mass, just thinking it over a-priori, I keep coming back to the question: before we close the energy frontier for good, shouldn’t there have been at least one unmitigated null result, rather than zero?

February 02, 2019

Jordan Ellenberg-22F

I mentioned last week that -3 Fahrenheit doesn’t seem that cold to me anymore. Well, this week it got colder. The coldest it’s been in Wisconsin in more than two decades, the coldest temperatures I’ve ever experienced. When I walked home from the gym on Wednesday it was -22F. That, reader, is cold. You don’t notice it for the first few minutes because you still have residual heat in your body from being inside. But at -22F your fingers start to get cold and numb inside your gloves and your toes inside your boots. My walk was about ten minutes. I could have handled twenty. But probably not thirty.

February 01, 2019

Clifford JohnsonBlack Holes and Time Travel in your Everyday Life

Oh, look what I found! It is my talk "Black Holes and Time Travel in your Everyday Life", which I gave as the Klopsteg Award lecture at AAPT back in July. Someone put it on YouTube. I hope you enjoy it!

Two warnings: (1) Skip to about 6 minutes to start, to avoid all the embarrassing handshaking and awarding and stuff. (2) There's a bit of early morning slowness + jet lag in my delivery here and there, so sorry about that. :)


Abstract: [...] Click to continue reading this post

The post Black Holes and Time Travel in your Everyday Life appeared first on Asymptotia.

Clifford JohnsonBlack Market of Ideas

As a reminder, today I'll be at the natural history museum (LA) as part of the "Night of Ideas" event! I'll have a number of physics demos with me and will be at a booth/table (in the Black Market of Ideas section) talking about physics ideas underlying our energy future as a species. I'll sign some books too! Come along!

Here's a link to the event:

Click to continue reading this post

The post Black Market of Ideas appeared first on Asymptotia.

Terence Tao255B, Notes 1: The Lagrangian formulation of the Euler equations

These lecture notes are a continuation of the 254A lecture notes from the previous quarter.

We consider the Euler equations for incompressible fluid flow on a Euclidean space {{\bf R}^d}; we will label {{\bf R}^d} as the “Eulerian space” {{\bf R}^d_E} (or “Euclidean space”, or “physical space”) to distinguish it from the “Lagrangian space” {{\bf R}^d_L} (or “labels space”) that we will introduce shortly (but the reader is free to also ignore the {E} or {L} subscripts if he or she wishes). Elements of Eulerian space {{\bf R}^d_E} will be referred to by symbols such as {x}; we use {dx} to denote Lebesgue measure on {{\bf R}^d_E}, and we will use {x^1,\dots,x^d} for the {d} coordinates of {x}, and use indices such as {i,j,k} to index these coordinates (with the usual summation conventions), for instance {\partial_i} denotes partial differentiation along the {x^i} coordinate. (We use superscripts for coordinates {x^i} instead of subscripts {x_i} to be compatible with some differential geometry notation that we will use shortly; in particular, when using the summation notation, we will now be matching subscripts with superscripts for the pair of indices being summed.)

In Eulerian coordinates, the Euler equations read

\displaystyle  \partial_t u + u \cdot \nabla u = - \nabla p \ \ \ \ \ (1)

\displaystyle  \nabla \cdot u = 0

where {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} is the velocity field and {p: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}} is the pressure field. These are functions of time {t \in [0,T)} and of the spatial location variable {x \in {\bf R}^d_E}. We will refer to the coordinates {(t,x) = (t,x^1,\dots,x^d)} as Eulerian coordinates. However, if one reviews the physical derivation of the Euler equations from 254A Notes 0, before one takes the continuum limit, the fundamental unknowns were not the velocity field {u} or the pressure field {p}, but rather the trajectories {(x^{(a)}(t))_{a \in A}}, which can be thought of as a single function {x: [0,T) \times A \rightarrow {\bf R}^d_E} from the coordinates {(t,a)} (where {t} is a time and {a} is an element of the label set {A}) to {{\bf R}^d}. The relationship between the trajectories {x^{(a)}(t) = x(t,a)} and the velocity field was given by the informal relationship

\displaystyle  \partial_t x(t,a) \approx u( t, x(t,a) ). \ \ \ \ \ (2)

We will refer to the coordinates {(t,a)} as (discrete) Lagrangian coordinates for describing the fluid.

In view of this, it is natural to ask whether there is an alternate way to formulate the continuum limit of incompressible inviscid fluids, by using a continuous version {(t,a)} of the Lagrangian coordinates, rather than Eulerian coordinates. This is indeed the case. Suppose for instance one has a smooth solution {u, p} to the Euler equations on a spacetime slab {[0,T) \times {\bf R}^d_E} in Eulerian coordinates; assume furthermore that the velocity field {u} is uniformly bounded. We introduce another copy {{\bf R}^d_L} of {{\bf R}^d}, which we call Lagrangian space or labels space; we use symbols such as {a} to refer to elements of this space, {da} to denote Lebesgue measure on {{\bf R}^d_L}, and {a^1,\dots,a^d} to refer to the {d} coordinates of {a}. We use indices such as {\alpha,\beta,\gamma} to index these coordinates, thus for instance {\partial_\alpha} denotes partial differentiation along the {a^\alpha} coordinate. We will use summation conventions for both the Eulerian coordinates {i,j,k} and the Lagrangian coordinates {\alpha,\beta,\gamma}, with an index being summed if it appears as both a subscript and a superscript in the same term. While {{\bf R}^d_L} and {{\bf R}^d_E} are of course isomorphic, we will try to refrain from identifying them, except perhaps at the initial time {t=0} in order to fix the initialisation of Lagrangian coordinates.

Given a smooth and bounded velocity field {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E}, define a trajectory map for this velocity to be any smooth map {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} that obeys the ODE

\displaystyle  \partial_t X(t,a) = u( t, X(t,a) ); \ \ \ \ \ (3)

in view of (2), this describes the trajectory (in {{\bf R}^d_E}) of a particle labeled by an element {a} of {{\bf R}^d_L}. From the Picard existence theorem and the hypothesis that {u} is smooth and bounded, such a map exists and is unique as long as one specifies the initial location {X(0,a)} assigned to each label {a}. Traditionally, one chooses the initial condition

\displaystyle  X(0,a) = a \ \ \ \ \ (4)

for {a \in {\bf R}^d_L}, so that we label each particle by its initial location at time {t=0}; we are also free to specify other initial conditions for the trajectory map if we please. Indeed, we have the freedom to “permute” the labels {a \in {\bf R}^d_L} by an arbitrary diffeomorphism: if {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} is a trajectory map, and {\pi: {\bf R}^d_L \rightarrow{\bf R}^d_L} is any diffeomorphism (a smooth map whose inverse exists and is also smooth), then the map {X \circ \pi: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} is also a trajectory map, albeit one with different initial conditions {X(0,a)}.
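As a concrete illustration (my own, not part of the notes): for the rigid-rotation velocity field u(x^1, x^2) = (-x^2, x^1) on {\bf R}^2, the trajectory map with the initial condition (4) is just rotation by angle t, and a numerical integration of the ODE (3) recovers it, together with volume preservation of the map X(t).

```python
import math

def u(t, x):
    # rigid rotation: a smooth, divergence-free velocity field on R^2
    return (-x[1], x[0])

def rk4_step(t, x, dt):
    # one classical Runge-Kutta step for the trajectory ODE dX/dt = u(t, X)
    k1 = u(t, x)
    k2 = u(t + dt / 2, (x[0] + dt / 2 * k1[0], x[1] + dt / 2 * k1[1]))
    k3 = u(t + dt / 2, (x[0] + dt / 2 * k2[0], x[1] + dt / 2 * k2[1]))
    k4 = u(t + dt, (x[0] + dt * k3[0], x[1] + dt * k3[1]))
    return (x[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def X(T, a, steps=1000):
    # trajectory map with initial condition X(0, a) = a
    x, dt = a, T / steps
    for n in range(steps):
        x = rk4_step(n * dt, x, dt)
    return x

a = (1.0, 0.0)
xT = X(1.0, a)
exact = (math.cos(1.0), math.sin(1.0))  # rotation of a by angle t = 1
err = math.hypot(xT[0] - exact[0], xT[1] - exact[1])
print(err)  # tiny: RK4 reproduces the exact rotation

# finite-difference Jacobian of a -> X(1, a): det should be 1 (incompressibility)
h = 1e-5
dXda1 = [(p - q) / h for p, q in zip(X(1.0, (a[0] + h, a[1])), xT)]
dXda2 = [(p - q) / h for p, q in zip(X(1.0, (a[0], a[1] + h)), xT)]
jac_det = dXda1[0] * dXda2[1] - dXda1[1] * dXda2[0]
print(jac_det)  # approximately 1
```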

Despite the popularity of the initial condition (4), we will try to keep conceptually separate the Eulerian space {{\bf R}^d_E} from the Lagrangian space {{\bf R}^d_L}, as they play different physical roles in the interpretation of the fluid; for instance, while the Euclidean metric {d\eta^2 = dx^1 dx^1 + \dots + dx^d dx^d} is an important feature of Eulerian space {{\bf R}^d_E}, it is not a geometrically natural structure to use in Lagrangian space {{\bf R}^d_L}. We have the following more general version of Exercise 8 from 254A Notes 2:

Exercise 1 Let {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} be smooth and bounded.

  • If {X_0: {\bf R}^d_L \rightarrow {\bf R}^d_E} is a smooth map, show that there exists a unique smooth trajectory map {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} with initial condition {X(0,a) = X_0(a)} for all {a \in {\bf R}^d_L}.
  • Show that if {X_0} is a diffeomorphism and {t \in [0,T)}, then the map {X(t): a \mapsto X(t,a)} is also a diffeomorphism.

Remark 2 The first of the Euler equations (1) can now be written in the form

\displaystyle  \frac{d^2}{dt^2} X(t,a) = - (\nabla p)( t, X(t,a) ) \ \ \ \ \ (5)

which can be viewed as a continuous limit of Newton’s second law {m^{(a)} \frac{d^2}{dt^2} x^{(a)}(t) = F^{(a)}(t)}.

Call a diffeomorphism {Y: {\bf R}^d_L \rightarrow {\bf R}^d_E} (oriented) volume preserving if one has the equation

\displaystyle  \mathrm{det}( \nabla Y )(a) = 1 \ \ \ \ \ (6)

for all {a \in {\bf R}^d_L}, where the total differential {\nabla Y} is the {d \times d} matrix with entries {\partial_\alpha Y^i} for {\alpha = 1,\dots,d} and {i=1,\dots,d}, where {Y^1,\dots,Y^d:{\bf R}^d_L \rightarrow {\bf R}} are the components of {Y}. (If one wishes, one can also view {\nabla Y} as a linear transformation from the tangent space {T_a {\bf R}^d_L} of Lagrangian space at {a} to the tangent space {T_{Y(a)} {\bf R}^d_E} of Eulerian space at {Y(a)}.) Equivalently, {Y} is orientation preserving and one has a Jacobian-free change of variables formula

\displaystyle  \int_{{\bf R}^d_L} f( Y(a) )\ da = \int_{{\bf R}^d_E} f(x)\ dx

for all {f \in C_c({\bf R}^d_E \rightarrow {\bf R})}, which is in turn equivalent to {Y(E) \subset {\bf R}^d_E} having the same Lebesgue measure as {E} for any measurable set {E \subset {\bf R}^d_L}.

The divergence-free condition {\nabla \cdot u = 0} then can be nicely expressed in terms of volume-preserving properties of the trajectory maps {X}, in a manner which confirms the interpretation of this condition as an incompressibility condition on the fluid:

Lemma 3 Let {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} be smooth and bounded, let {X_0: {\bf R}^d_L \rightarrow {\bf R}^d_E} be a volume-preserving diffeomorphism, and let {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} be the trajectory map. Then the following are equivalent:

  • {\nabla \cdot u = 0} on {[0,T) \times {\bf R}^d_E}.
  • {X(t): {\bf R}^d_L \rightarrow {\bf R}^d_E} is volume-preserving for all {t \in [0,T)}.

Proof: Since {X_0} is orientation-preserving, we see from continuity that {X(t)} is also orientation-preserving. Suppose first that {X(t)} is volume-preserving; then for any {f \in C^\infty_c({\bf R}^d_E \rightarrow {\bf R})} we have the conservation law

\displaystyle  \int_{{\bf R}^d_L} f( X(t,a) )\ da = \int_{{\bf R}^d_E} f(x)\ dx

for all {t \in [0,T)}. Differentiating in time using the chain rule and (3) we conclude that

\displaystyle  \int_{{\bf R}^d_L} (u(t) \cdot \nabla f)( X(t,a)) \ da = 0

for all {t \in [0,T)}, and hence by change of variables

\displaystyle  \int_{{\bf R}^d_E} (u(t) \cdot \nabla f)(x) \ dx = 0

which by integration by parts gives

\displaystyle  \int_{{\bf R}^d_E} (\nabla \cdot u(t,x)) f(x)\ dx = 0

for all {f \in C^\infty_c({\bf R}^d_E \rightarrow {\bf R})} and {t \in [0,T)}, so {u} is divergence-free.

To prove the converse implication, it is convenient to introduce the labels map {A:[0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_L}, defined by setting {A(t): {\bf R}^d_E \rightarrow {\bf R}^d_L} to be the inverse of the diffeomorphism {X(t): {\bf R}^d_L \rightarrow {\bf R}^d_E}, thus

\displaystyle A(t, X(t,a)) = a

for all {(t,a) \in [0,T) \times {\bf R}^d_L}. By the implicit function theorem, {A} is smooth, and by differentiating the above equation in time using (3) we see that

\displaystyle  D_t A(t,x) = 0

where {D_t} is the usual material derivative

\displaystyle  D_t := \partial_t + u \cdot \nabla \ \ \ \ \ (7)

acting on functions on {[0,T) \times {\bf R}^d_E}. If {u} is divergence-free, we have from integration by parts that

\displaystyle  \partial_t \int_{{\bf R}^d_E} \phi(t,x)\ dx = \int_{{\bf R}^d_E} D_t \phi(t,x)\ dx

for any test function {\phi: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}}. In particular, for any {g \in C^\infty_c({\bf R}^d_L \rightarrow {\bf R})}, we can calculate

\displaystyle \partial_t \int_{{\bf R}^d_E} g( A(t,x) )\ dx = \int_{{\bf R}^d_E} D_t (g(A(t,x)))\ dx

\displaystyle  = \int_{{\bf R}^d_E} 0\ dx

and hence

\displaystyle  \int_{{\bf R}^d_E} g(A(t,x))\ dx = \int_{{\bf R}^d_E} g(A(0,x))\ dx

for any {t \in [0,T)}. Since {X_0} is volume-preserving, so is {A(0)}, thus

\displaystyle  \int_{{\bf R}^d_E} g \circ A(t)\ dx = \int_{{\bf R}^d_L} g\ da.

Thus {A(t)} is volume-preserving, and hence {X(t)} is also. \Box

Exercise 4 Let {M: [0,T) \rightarrow \mathrm{GL}_d({\bf R})} be a continuously differentiable map from the time interval {[0,T)} to the general linear group {\mathrm{GL}_d({\bf R})} of invertible {d \times d} matrices. Establish Jacobi’s formula

\displaystyle  \partial_t \det(M(t)) = \det(M(t)) \mathrm{tr}( M(t)^{-1} \partial_t M(t) )

and use this and (6) to give an alternate proof of Lemma 3 that does not involve any integration in space.
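Jacobi’s formula is easy to check numerically before proving it; here is a quick finite-difference sketch of mine (with an arbitrary sample matrix path, not from the notes) in the {2 \times 2} case.

```python
import math

def M(t):
    # a sample smooth path of invertible 2x2 matrices; det M(t) = 2 e^t
    return [[math.exp(t), t], [0.0, 2.0]]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def mat2mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace2(A):
    return A[0][0] + A[1][1]

t, h = 0.7, 1e-6
# left side of Jacobi's formula: d/dt det M(t), by central differences
lhs = (det2(M(t + h)) - det2(M(t - h))) / (2 * h)
# right side: det M(t) * tr(M(t)^{-1} M'(t)), with M' by central differences
Mdot = [[(M(t + h)[i][j] - M(t - h)[i][j]) / (2 * h) for j in range(2)]
        for i in range(2)]
rhs = det2(M(t)) * trace2(mat2mul(inv2(M(t)), Mdot))
print(abs(lhs - rhs))  # small: the two sides agree
```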

Remark 5 One can view the use of Lagrangian coordinates as an extension of the method of characteristics. Indeed, from the chain rule we see that for any smooth function {f: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}} of Eulerian spacetime, one has

\displaystyle  \frac{d}{dt} f(t,X(t,a)) = (D_t f)(t,X(t,a))

and hence any transport equation that in Eulerian coordinates takes the form

\displaystyle  D_t f = g

for smooth functions {f,g: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}} of Eulerian spacetime is equivalent to the ODE

\displaystyle  \frac{d}{dt} F = G

where {F,G: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}} are the smooth functions of Lagrangian spacetime defined by

\displaystyle  F(t,a) := f(t,X(t,a)); \quad G(t,a) := g(t,X(t,a)).

In this set of notes we recall some basic differential geometry notation, particularly with regards to pullbacks and Lie derivatives of differential forms and other tensor fields on manifolds such as {{\bf R}^d_E} and {{\bf R}^d_L}, and explore how the Euler equations look in this notation. Our discussion will be entirely formal in nature; we will assume that all functions have enough smoothness and decay at infinity to justify the relevant calculations. (It is possible to work rigorously in Lagrangian coordinates – see for instance the work of Ebin and Marsden – but we will not do so here.) As a general rule, Lagrangian coordinates tend to be somewhat less convenient to use than Eulerian coordinates for establishing the basic analytic properties of the Euler equations, such as local existence, uniqueness, and continuous dependence on the data; however, they are quite good at clarifying the more algebraic properties of these equations, such as conservation laws and the variational nature of the equations. It may well be that in the future we will be able to use the Lagrangian formalism more effectively on the analytic side of the subject also.

Remark 6 One can also write the Navier-Stokes equations in Lagrangian coordinates, but the equations are not expressed in a favourable form in these coordinates, as the Laplacian {\Delta} appearing in the viscosity term becomes replaced with a time-varying Laplace-Beltrami operator. As such, we will not discuss the Lagrangian coordinate formulation of Navier-Stokes here.

— 1. Pullbacks and Lie derivatives —

In order to efficiently change coordinates, it is convenient to use the language of differential geometry, which is designed to be almost entirely independent of the choice of coordinates. We therefore spend some time recalling the basic concepts of differential geometry that we will need. Our presentation will be based on explicitly working in coordinates; there are of course more coordinate-free approaches to the subject (for instance setting up the machinery of vector bundles, or of derivations), but we will not adopt these approaches here.

Throughout this section, we fix a diffeomorphism {Y: {\bf R}^d_L \rightarrow {\bf R}^d_E} from Lagrangian space {{\bf R}^d_L} to Eulerian space {{\bf R}^d_E}; one can for instance take {Y = X(t)} where {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} is a diffeomorphic trajectory map and {t \in [0,T)} is some time. Then all the differential geometry structures on Eulerian space {{\bf R}^d_E} can be pulled back via {Y} to Lagrangian space {{\bf R}^d_L}. For instance, a physical point {x \in {\bf R}^d_E} can be pulled back to a label {Y^* x := Y^{-1}(x) \in {\bf R}^d_L}, and similarly a subset {E \subset {\bf R}^d_E} of physical space can be pulled back to a subset {Y^* E := Y^{-1}(E) \subset {\bf R}^d_L} of label space. A scalar field {f: {\bf R}^d_E \rightarrow {\bf R}} can be pulled back to a scalar field {Y^* f: {\bf R}^d_L \rightarrow {\bf R}}, defined by pre-composition:

\displaystyle  Y^* f(a) := f(Y(a)).

These operations are all compatible with each other in various ways; for instance, if {x \in {\bf R}^d_E}, {E \subset {\bf R}^d_E}, and {f: {\bf R}^d_L \rightarrow {\bf R}}, and {c \in {\bf R}} then

  • {x \in E} if and only if {Y^* x \in Y^* E}.
  • {f(x) = c} if and only if {Y^* f( Y^* x ) = c}.
  • The map {E \mapsto Y^* E} is an isomorphism of {\sigma}-algebras.
  • The map {f \mapsto Y^* f} is an algebra isomorphism.

Differential forms. The next family of structures we will pull back are that of differential forms, which we will define using coordinates. (See also my previous notes on this topic for more discussion on differential forms.) For any {k \geq 0}, a {k}-form {\omega} on {{\bf R}^d_E} will be defined as a family of functions {\omega_{i_1 \dots i_k}: {\bf R}^d_E \rightarrow {\bf R}} for {i_1,\dots,i_k \in \{1,\dots,d\}} which is totally antisymmetric with respect to permutations of the indices {i_1,\dots,i_k}, thus if one interchanges {i_j} and {i_{j'}} for any {1 \leq j < j' \leq k}, then {\omega_{i_1 \dots i_k}} flips to {-\omega_{i_1 \dots i_k}}. Thus for instance

  • A {0}-form is just a scalar field {\omega: {\bf R}^d_E \rightarrow {\bf R}};
  • A {1}-form, when viewed in coordinates, is a collection {\omega_i: {\bf R}^d_E \rightarrow {\bf R}} of {d} scalar functions;
  • A {2}-form, when viewed in coordinates, is a collection {\omega_{ij}: {\bf R}^d_E \rightarrow {\bf R}} of {d^2} scalar functions with {\omega_{ji} = -\omega_{ij}} (so in particular {\omega_{ii}=0});
  • A {3}-form, when viewed in coordinates, is a collection {\omega_{ijk}: {\bf R}^d_E \rightarrow {\bf R}} of {d^3} scalar functions with {\omega_{jik} = -\omega_{ijk}}, {\omega_{ikj} = -\omega_{ijk}}, and {\omega_{kji} = -\omega_{ijk}}.

The antisymmetry makes the component {\omega_{i_1 \dots i_k}} of a {k}-form vanish whenever two of the indices agree. In particular, if {k>d}, then the only {k}-form that exists is the zero {k}-form {0}. A {d}-form is also known as a volume form; amongst all such forms we isolate the standard volume form {d\mathrm{vol}_E}, defined by setting {(d\mathrm{vol}_E)_{\sigma(1) \dots \sigma(d)} := \mathrm{sgn}(\sigma)} for any permutation {\sigma: \{1,\dots,d\} \rightarrow \{1,\dots,d\}} (with {\mathrm{sgn}(\sigma)\in \{-1,+1\}} being the sign of the permutation), and setting all other components of {d\mathrm{vol}_E} equal to zero. For instance, in three dimensions one has {(d\mathrm{vol}_E)_{i_1 \dots i_3}} equal to {+1} when {(i_1,i_2,i_3) = (1,2,3), (2,3,1), (3,1,2)}, {-1} when {(i_1,i_2,i_3) = (1,3,2), (3,2,1), (2,1,3)}, and {0} otherwise. We use {\Omega^k({\bf R}^d_E)} to denote the space of {k}-forms on {{\bf R}^d_E}.

If {f: {\bf R}^d_E \rightarrow {\bf R}} is a scalar field and {\omega \in \Omega^k({\bf R}^d_E)}, we can define the product {f\omega} by pointwise multiplication of components:

\displaystyle  (f\omega)_{i_1 \dots i_k}(x) := f(x) \omega_{i_1 \dots i_k}(x).

More generally, given two forms {\omega \in \Omega^k({\bf R}^d_E)}, {\theta \in \Omega^l({\bf R}^d_E)}, we define the wedge product {\omega \wedge \theta \in \Omega^{k+l}({\bf R}^d_E)} to be the {k+l}-form given by the formula

\displaystyle  (\omega \wedge \theta)_{i_1 \dots i_{k+l}}(x) := \frac{1}{k! l!} \sum_{\sigma \in S_{k+l}} \mathrm{sgn}(\sigma) \omega_{i_{\sigma(1)} \dots i_{\sigma(k)}}(x) \theta_{i_{\sigma(k+1)} \dots i_{\sigma(k+l)}}(x)

where {S_{k+l}} is the symmetric group of permutations on {\{1,\dots,k+l\}}. For instance, for a scalar field {f: {\bf R}^d_E \rightarrow {\bf R}} (so {f \in \Omega^0({\bf R}^d_E)}), {f \wedge \omega = \omega \wedge f = f \omega}. Similarly, if {\theta,\eta \in \Omega^1({\bf R}^d_E)} and {\omega \in \Omega^2({\bf R}^d_E)}, we have the pointwise identities

\displaystyle  (\theta \wedge \eta)_{ij} = \theta_i \eta_j - \theta_j \eta_i

\displaystyle  (\theta \wedge \omega)_{ijk} = \theta_i \omega_{jk} - \theta_j \omega_{ik} + \theta_k \omega_{ij}

\displaystyle  (\omega \wedge \theta)_{ijk} = \omega_{ij} \theta_k - \omega_{ik} \theta_j + \omega_{jk} \theta_i.
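As a sanity check on the normalising factor {\frac{1}{k! l!}}, one can implement the wedge product formula directly and compare against the displayed identity for two {1}-forms. The Python sketch below (an illustration of ours, not part of the notes) stores a form as a dict over index tuples:

```python
from itertools import permutations, product
from math import factorial

def perm_sign(p):
    sign, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]; p[i], p[j] = p[j], p[i]; sign = -sign
    return sign

def wedge(omega, k, theta, l, d):
    """Wedge product of a k-form and an l-form, stored as dicts over index tuples."""
    result = {}
    for idx in product(range(d), repeat=k + l):
        s = 0
        for sigma in permutations(range(k + l)):
            s += (perm_sign(sigma)
                  * omega[tuple(idx[sigma[a]] for a in range(k))]
                  * theta[tuple(idx[sigma[k + a]] for a in range(l))])
        result[idx] = s / (factorial(k) * factorial(l))
    return result

d = 3
theta = {(i,): [2.0, -1.0, 5.0][i] for i in range(d)}   # an arbitrary 1-form
eta   = {(i,): [0.5,  3.0, 7.0][i] for i in range(d)}   # another 1-form
w = wedge(theta, 1, eta, 1, d)
# check (theta ^ eta)_{ij} = theta_i eta_j - theta_j eta_i componentwise
for i in range(d):
    for j in range(d):
        assert abs(w[(i, j)] - (theta[(i,)] * eta[(j,)] - theta[(j,)] * eta[(i,)])) < 1e-12
```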

Exercise 7 Show that the wedge product is a bilinear map from {\Omega^k({\bf R}^d_E) \times \Omega^l({\bf R}^d_E)} to {\Omega^{k+l}({\bf R}^d_E)} that obeys the supercommutative property

\displaystyle  \omega \wedge \theta = (-1)^{kl} \theta \wedge \omega

for {\omega \in \Omega^k({\bf R}^d_E)} and {\theta \in \Omega^l({\bf R}^d_E)}, and the associative property

\displaystyle  (\omega \wedge \theta) \wedge \eta = \omega \wedge (\theta \wedge \eta)

for {\omega \in \Omega^k({\bf R}^d_E)}, {\theta \in \Omega^l({\bf R}^d_E)}, {\eta \in \Omega^m({\bf R}^d_E)}. (In other words, the space of formal linear combinations of forms, graded by the parity of the order of the forms, is a supercommutative algebra. Very roughly speaking, the prefix “super” means that “odd order objects anticommute with each other rather than commute”.)

If {\omega \in \Omega^k({\bf R}^d_E)} is continuously differentiable, we define the exterior derivative {d\omega \in \Omega^{k+1}({\bf R}^d_E)} in coordinates as

\displaystyle  (d\omega)_{i_1 \dots i_{k+1}} := \sum_{j=1}^{k+1} (-1)^{j-1} \partial_{i_j} \omega_{i_1 \dots i_{j-1} i_{j+1} \dots i_{k+1}}. \ \ \ \ \ (8)

It is easy to verify that this is indeed a {k+1}-form. Thus for instance:

  • If {f \in \Omega^0({\bf R}^d_E)} is a continuously differentiable scalar field, then {(df)_i = \partial_i f}.
  • If {\theta \in \Omega^1({\bf R}^d_E)} is a continuously differentiable {1}-form, then {(d\theta)_{ij} = \partial_i \theta_j - \partial_j \theta_i}.
  • If {\omega \in \Omega^2({\bf R}^d_E)} is a continuously differentiable {2}-form, then {(d\omega)_{ijk} = \partial_i \omega_{jk} - \partial_j \omega_{ik} + \partial_k \omega_{ij}}.

Exercise 8 If {\omega \in \Omega^k({\bf R}^d_E)} and {\theta \in \Omega^l({\bf R}^d_E)} are continuously differentiable, establish the antiderivation (or super-Leibniz) law

\displaystyle  d( \omega \wedge \theta ) = (d\omega) \wedge \theta + (-1)^k \omega \wedge d\theta \ \ \ \ \ (9)

and if {\omega} is twice continuously differentiable, establish the chain complex law

\displaystyle  d d \omega = 0. \ \ \ \ \ (10)
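For {0}-forms, the chain complex law (10) boils down to the symmetry of mixed partial derivatives, which can be observed numerically: symmetric central differences commute, so the following Python sketch (ours, not from the notes) sees {d(df)} vanish up to floating-point noise:

```python
from itertools import product

H = 1e-3

def partial(g, i, x, h=H):
    """Central finite-difference approximation to the i-th partial derivative."""
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    return (g(xp) - g(xm)) / (2 * h)

def f(x):                      # a sample smooth scalar field on R^3
    return x[0] ** 2 * x[1] + x[2] * x[0]

x0 = [0.3, -0.7, 1.1]
d = 3
# (df)_i = partial_i f;  (ddf)_{ij} = partial_i (df)_j - partial_j (df)_i
for i, j in product(range(d), repeat=2):
    ddf_ij = (partial(lambda y: partial(f, j, y), i, x0)
              - partial(lambda y: partial(f, i, y), j, x0))
    assert abs(ddf_ij) < 1e-6   # d(df) = 0 up to floating-point noise
```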

Each of the coordinates {x^i}, {i=1,\dots,d} can be viewed as scalar fields {(x^1,\dots,x^d) \mapsto x^i}. In particular, the exterior derivatives {dx^i}, {i=1,\dots,d} are {1}-forms. It is easy to verify the identity

\displaystyle  \omega = \frac{1}{k!} \omega_{i_1 \dots i_k} dx^{i_1} \wedge \dots \wedge dx^{i_k}

for any {\omega \in \Omega^k({\bf R}^d_E)} with the usual summation conventions (which, in this differential geometry formalism, assert that we sum indices whenever they appear as a subscript-superscript pair). In particular the volume form {d\mathrm{vol}_E} can be written as

\displaystyle  d\mathrm{vol}_E = dx^1 \wedge \dots \wedge dx^d.

One can of course define differential forms on Lagrangian space {{\bf R}^d_L} as well, changing the indices from Roman to Greek. For instance, if {\theta \in \Omega^1({\bf R}^d_L)} is continuously differentiable, then {d\theta \in \Omega^2({\bf R}^d_L)} is given in coordinates as

\displaystyle  (d\theta)_{\alpha \beta} = \partial_\alpha \theta_\beta - \partial_\beta \theta_\alpha.

If {\omega \in \Omega^k({\bf R}^d_E)}, we define the pullback form {Y^* \omega \in \Omega^k({\bf R}^d_L)} by the formula

\displaystyle  (Y^* \omega)_{\alpha_1 \dots \alpha_k}(a) := \omega_{i_1 \dots i_k}(Y(a)) \partial_{\alpha_1} Y^{i_1}(a) \dots \partial_{\alpha_k} Y^{i_k}(a) \ \ \ \ \ (11)

with the usual summation conventions. Thus for instance

It is easy to see that pullback {Y^*} is a linear map from {\Omega^k({\bf R}^d_E)} to {\Omega^k({\bf R}^d_L)}. It also preserves the exterior algebra and exterior derivative:

Exercise 9 Let {\omega, \theta \in \Omega^k({\bf R}^d_E)}. Show that

\displaystyle  Y^* (\omega \wedge \theta) = (Y^* \omega) \wedge (Y^* \theta),

and if {\omega} is continuously differentiable, show that

\displaystyle  Y^* d \omega = d Y^* \omega.
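The identity {Y^* d\omega = d Y^* \omega} can be tested numerically for a {1}-form and a concrete map {Y}. The Python sketch below (our own illustration, using finite-difference derivatives; all names ours) compares both sides at a sample point in two dimensions:

```python
H = 1e-3

def partial(g, i, x, h=H):
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    return (g(xp) - g(xm)) / (2 * h)

def Y(a):                                  # a sample (local) diffeomorphism R^2_L -> R^2_E
    return [a[0] + a[1] ** 2, a[0] * a[1]]

def theta(x):                              # a sample 1-form on R^2_E, as its component list
    return [x[1], x[0] ** 2]

def pull_theta(a):                         # (Y* theta)_alpha = theta_i(Y(a)) d_alpha Y^i
    return [sum(theta(Y(a))[i] * partial(lambda b: Y(b)[i], alpha, a)
                for i in range(2))
            for alpha in range(2)]

a0 = [0.4, 0.9]
# d(Y* theta)_{01} by finite differences
lhs = (partial(lambda b: pull_theta(b)[1], 0, a0)
       - partial(lambda b: pull_theta(b)[0], 1, a0))
# Y*(d theta)_{01} = (d theta)_{ij}(Y(a)) d_0 Y^i d_1 Y^j
dtheta = lambda x, i, j: (partial(lambda y: theta(y)[j], i, x)
                          - partial(lambda y: theta(y)[i], j, x))
rhs = sum(dtheta(Y(a0), i, j)
          * partial(lambda b: Y(b)[i], 0, a0)
          * partial(lambda b: Y(b)[j], 1, a0)
          for i in range(2) for j in range(2))
assert abs(lhs - rhs) < 1e-4
```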

One can integrate {k}-forms on oriented {k}-manifolds. Suppose for instance that an oriented {k}-manifold {M \subset {\bf R}^d_E} has a parameterisation {\{ \phi(a): a \in U \}}, where {U} is an open subset of {{\bf R}^k_L} and {\phi: U \rightarrow {\bf R}^d_E} is an injective immersion. Then any continuous compactly supported {k}-form {\omega \in \Omega^k({\bf R}^d_E)} can be integrated on {M} by the formula

\displaystyle  \int_M \omega := \frac{1}{k!} \int_U \omega_{i_1 \dots i_k}(\phi(a)) \partial_1 \phi^{i_1}(a) \dots \partial_k \phi^{i_k}(a)\ da

with the usual summation conventions. It can be shown that this definition is independent of the choice of parameterisation. For a more general manifold {M}, one can use a partition of unity to decompose {M} into parameterised pieces, and define {\int_M \omega} to be the sum of the integrals over these pieces; again, one can show (after some tedious calculation) that this is independent of the choices made. If {M} is all of {{\bf R}^d_E} (with the standard orientation), and {f \in C_c({\bf R}^d_E \rightarrow {\bf R})}, then we have the identity

\displaystyle  \int_{{\bf R}^d_E} f\ d\mathrm{vol}_E = \int_{{\bf R}^d_E} f(x)\ dx \ \ \ \ \ (13)

linking integration on differential forms with the Lebesgue (or Riemann) integral. We also record Stokes’ theorem

\displaystyle  \int_{\partial \Omega} \omega = \int_{\Omega} d \omega \ \ \ \ \ (14)

whenever {\Omega} is a smooth orientable {k+1}-manifold with smooth boundary {\partial \Omega}, and {\omega} is a continuous, compactly supported {k}-form. The regularity conditions on {\Omega,\omega} here can often be relaxed by the usual limiting arguments; for the purposes of this set of notes, we shall proceed formally and assume that identities such as (14) hold for all manifolds {\Omega} and forms {\omega} under consideration.

From the change of variables formula we see that pullback also respects integration on manifolds, in that

\displaystyle  \int_{Y^* \Omega} Y^* \omega = \int_\Omega \omega \ \ \ \ \ (15)

whenever {\Omega} is a smooth orientable {k}-manifold, and {\omega} a continuous compactly supported {k}-form.

Exercise 10 Establish the identity

\displaystyle  Y^* d\mathrm{vol}_E = \mathrm{det}(\nabla Y) d\mathrm{vol}_L.

Conclude in particular that {Y} is volume-preserving if and only if

\displaystyle  Y^* d\mathrm{vol}_E = d\mathrm{vol}_L.
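For a concrete map {Y} in two dimensions, the identity of Exercise 10 can be checked componentwise: the following Python sketch (ours, not part of the notes) computes the pullback of {d\mathrm{vol}_E} via the general formula (11) and compares it against {\det(\nabla Y)}:

```python
H = 1e-3

def partial(g, i, x, h=H):
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    return (g(xp) - g(xm)) / (2 * h)

def Y(a):                         # a sample map R^2_L -> R^2_E
    return [a[0] * a[1], a[0] + a[1] ** 3]

a0 = [0.8, -0.5]
J = [[partial(lambda b: Y(b)[i], alpha, a0) for alpha in range(2)]
     for i in range(2)]           # Jacobian (nabla Y)^i_alpha
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]

vol = {(0, 1): 1.0, (1, 0): -1.0, (0, 0): 0.0, (1, 1): 0.0}
# (Y* dvol)_{01} = vol_{ij} d_0 Y^i d_1 Y^j   (formula (11))
pulled = sum(vol[(i, j)] * J[i][0] * J[j][1] for i in range(2) for j in range(2))
assert abs(pulled - det) < 1e-9   # matches det(nabla Y) (dvol_L)_{01}
```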

Vector fields. Having pulled back differential forms, we now pull back vector fields. A vector field {Z} on {{\bf R}^d_E}, when viewed in coordinates, is a collection {Z^i: {\bf R}^d_E \rightarrow {\bf R}}, {i=1,\dots,d} of scalar functions; superficially, this resembles a {1}-form {\theta \in \Omega^1({\bf R}^d_E)}, except that we use superscripts {Z^i} instead of subscripts {\theta_i} to denote the components. On the other hand, we will transform vector fields under pullback in a different manner from {1}-forms. For each {i}, a basic example of a vector field is the coordinate vector field {\frac{d}{dx^i}}, defined by setting {(\frac{d}{dx^i})^j} to equal {1} when {i=j} and {0} otherwise. Then every vector field {Z} may be written as

\displaystyle  Z = Z^i \frac{d}{dx^i}

where we multiply scalar functions against vector fields in the obvious fashion; compare this with the expansion {\theta = \theta_i dx^i} of a {1}-form {\theta \in \Omega^1({\bf R}^d_E)} into its components {\theta_i}. The space of all vector fields will be denoted {\Gamma(T {\bf R}^d_E)}. One can of course define vector fields on {{\bf R}^d_L} similarly.

The pullback {Y^* Z} of {Z} is defined to be the unique vector field {Y^* Z \in \Gamma(T {\bf R}^d_L)} such that

\displaystyle  Z^i(Y(a)) = (Y^* Z)^\alpha(a) \partial_\alpha Y^i(a) \ \ \ \ \ (16)

for all {a \in {\bf R}^d_L} (so that {Z} is the pushforward of {Y^* Z}). Equivalently, if {(\nabla Y)^{-1}} is the inverse matrix to the total differential {\nabla Y} (which we recall in coordinates is {(\nabla Y)^i_\alpha := \partial_\alpha Y^i}), so that

\displaystyle  ((\nabla Y)^{-1})^\alpha_i (\nabla Y)^i_\beta = \delta^\alpha_\beta, \quad (\nabla Y)^i_\alpha ((\nabla Y)^{-1})^\alpha_j = \delta^i_j

with {\delta} denoting the Kronecker delta, then

\displaystyle  (Y^* Z)^\alpha(a) = Z^i(Y(a)) ((\nabla Y)^{-1})^\alpha_i(a).

From the inverse function theorem one can also write

\displaystyle  ((\nabla Y)^{-1})^\alpha_i(a) = (\partial_i Y^{-1})( Y(a) ),

thus {Z} is also the pullback of {Y^* Z} by {Y^{-1}}.
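The defining property (16) can be verified numerically for a sample map and vector field: compute {Y^* Z} via the inverse Jacobian, then push it forward again and check that {Z} is recovered. A Python sketch of ours (finite-difference Jacobians, names not from the notes):

```python
H = 1e-3

def partial(g, i, x, h=H):
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    return (g(xp) - g(xm)) / (2 * h)

def Y(a):                              # sample map R^2_L -> R^2_E
    return [a[0] + a[1] ** 2, a[0] * a[1] + a[1]]

def Z(x):                              # sample vector field on R^2_E, component list
    return [x[1], x[0] ** 2 + 1.0]

a0 = [0.6, 0.3]
J = [[partial(lambda b: Y(b)[i], al, a0) for al in range(2)] for i in range(2)]
detJ = J[0][0] * J[1][1] - J[0][1] * J[1][0]
Jinv = [[J[1][1] / detJ, -J[0][1] / detJ],     # (nabla Y)^{-1}, rows indexed by alpha
        [-J[1][0] / detJ, J[0][0] / detJ]]

Zx = Z(Y(a0))
pullZ = [sum(Zx[i] * Jinv[al][i] for i in range(2)) for al in range(2)]  # (Y* Z)^alpha
# pushing forward again recovers Z (formula (16)): Z^i(Y(a)) = (Y*Z)^alpha d_alpha Y^i
for i in range(2):
    assert abs(sum(pullZ[al] * J[i][al] for al in range(2)) - Zx[i]) < 1e-9
```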

If {\omega \in \Omega^k({\bf R}^d_E)} is a {k}-form and {Z_1,\dots,Z_k \in \Gamma(T{\bf R}^d_E)} are vector fields, one can form the scalar field {\omega(Z_1,\dots,Z_k): {\bf R}^d_E \rightarrow {\bf R}} by the formula

\displaystyle  \omega(Z_1,\dots,Z_k) := \omega_{i_1 \dots i_k} Z^{i_1}_1 \dots Z^{i_k}_k.

Thus for instance if {\omega \in \Omega^2({\bf R}^d_E)} is a {2}-form and {Z,W \in \Gamma(T{\bf R}^d_E)} are vector fields, then

\displaystyle  \omega(Z,W) = \omega_{ij} Z^i W^j.

It is clear that {\omega(Z_1,\dots,Z_k)} is a totally antisymmetric form in the {Z_1,\dots,Z_k}. If {\omega \in \Omega^k({\bf R}^d_E)} is a {k}-form for some {k \geq 1} and {Z \in \Gamma(T{\bf R}^d_E)} is a vector field, we define the contraction (or interior product) {Z \neg \omega \in \Omega^{k-1}({\bf R}^d_E)} in coordinates by the formula

\displaystyle  (Z \neg \omega)_{i_2 \dots i_k} := Z^{i_1} \omega_{i_1 \dots i_k}

or equivalently that

\displaystyle  (Z_1 \neg \omega)(Z_2,\dots,Z_k) = \omega(Z_1,\dots,Z_k)

for {Z_1,\dots,Z_k \in \Gamma(T{\bf R}^d_E)}. Thus for instance if {\omega \in \Omega^2({\bf R}^d_E)} is a {2}-form, and {Z \in \Gamma(T{\bf R}^d_E)} is a vector field, then {Z \neg \omega \in \Omega^1({\bf R}^d_E)} is the {1}-form

\displaystyle  (Z \neg \omega)_i = Z^j \omega_{ji}.

If {Z \in \Gamma(T{\bf R}^d_E)} is a vector field and {f \in \Omega^0({\bf R}^d_E)} is a continuously differentiable scalar field, then {Z \neg df = df(Z)} is just the directional derivative of {f} along the vector field {Z}:

\displaystyle  Z \neg df = Z^i \partial_i f.

The contraction {Z \neg \omega} is also denoted {\iota_Z \omega} in the literature. If one contracts a vector field {Z} against the standard volume form {d\mathrm{vol}_E}, one obtains a {d-1}-form which we will call (by slight abuse of notation) the Hodge dual {*Z} of {Z}:

\displaystyle  *Z := Z \neg d\mathrm{vol}_E.

This can easily be seen to be a bijection between vector fields and {d-1}-forms. The inverse of this operation will also be denoted by the Hodge star {*}:

\displaystyle  *(Z \neg d\mathrm{vol}_E) := Z.

In a similar spirit, the Hodge dual of a scalar field {f: {\bf R}^d_E \rightarrow {\bf R}} will be defined as the volume form

\displaystyle  *f := f d\mathrm{vol}_E

and conversely the Hodge dual of a volume form is a scalar field:

\displaystyle  *(f d\mathrm{vol}_E) = f.

More generally one can form a Hodge duality relationship between {k}-vector fields and {d-k}-forms for any {0 \leq k \leq d}, but we will not do so here as we will not have much use for the notion of a {k}-vector field for any {k>1}.

These operations behave well under pullback (if one assumes volume preservation in the case of the Hodge star):

Exercise 11

  • (i) If {\omega \in \Omega^k({\bf R}^d_E)} and {Z_1,\dots,Z_k \in \Gamma(T{\bf R}^d_E)}, show that

    \displaystyle  Y^* ( \omega(Z_1,\dots,Z_k) ) = (Y^* \omega)(Y^* Z_1, \dots, Y^* Z_k).

  • (ii) If {\omega \in \Omega^k({\bf R}^d_E)} for some {k \geq 1} and {Z \in \Gamma(T{\bf R}^d_E)}, show that

    \displaystyle  Y^*( Z \neg \omega ) = (Y^* Z) \neg (Y^* \omega).

  • (iii) If {Y} is volume-preserving, show that

    \displaystyle  Y^*( * T ) = * Y^* T

    whenever {T} is a scalar field, vector field, {d-1}-form, or {d}-form on {{\bf R}^d_E}.

Riemannian metrics. A Riemannian metric {g} on {{\bf R}^d_E}, when expressed in coordinates is a collection of scalar functions {g_{ij}: {\bf R}^d_E \rightarrow {\bf R}} such that for each point {x \in {\bf R}^d_E}, the matrix {(g_{ij}(x))_{1 \leq i,j \leq d}} is symmetric and strictly positive definite. In particular it has an inverse metric {g^{-1}}, which is a collection of scalar functions {(g^{-1})^{ij}(x) = g^{ij}(x)} such that

\displaystyle  g^{ij} g_{jk} = \delta^i_k

where {\delta} denotes the Kronecker delta; here we have abused notation (and followed the conventions of general relativity) by allowing the inverse on the metric to be omitted when expressed in coordinates (relying instead on the superscripting of the indices, as opposed to subscripting, to indicate the metric inversion). The Euclidean metric {\eta} is an example of a metric tensor, with {\eta_{ij}} equal to {1} when {i=j} and zero otherwise; the coefficients {\eta^{ij}} of the inverse Euclidean metric {\eta^{-1}} are similarly equal to {1} when {i=j} and {0} otherwise. Given two vector fields {Z,W \in \Gamma(T{\bf R}^d_E)} and a Riemannian metric {g}, we can form the scalar field {g(Z,W)} by

\displaystyle  g(Z,W) := g_{ij} Z^i W^j;

this is a symmetric bilinear form in {Z,W}.

We can define the pullback metric {Y^* g} by the formula

\displaystyle  (Y^* g)_{\alpha \beta}(a) := g_{ij}(Y(a)) \partial_\alpha Y^i(a) \partial_\beta Y^j(a); \ \ \ \ \ (17)

this is easily seen to be a Riemannian metric on {{\bf R}^d_L}, and one has the compatibility property

\displaystyle  Y^*( g(Z,W) ) = (Y^* g)(Y^* Z, Y^* W)

for all {Z,W \in \Gamma(T{\bf R}^d_E)}. It is then not difficult to check that if we pull back the inverse metric {g^{-1}} by the formula

\displaystyle  (Y^*(g^{-1}))^{\alpha \beta}(a) := g^{ij}(Y(a)) ((\nabla Y)^{-1})^\alpha_i(a) ((\nabla Y)^{-1})^\beta_j(a)

then we have the expected relationship

\displaystyle  Y^*(g^{-1}) = (Y^* g)^{-1}.

Exercise 12 If {\pi: {\bf R}^d_L \rightarrow {\bf R}^d_L} is a diffeomorphism, show that

\displaystyle  (Y \circ \pi)^* \omega = \pi^* Y^* \omega

for any {\omega \in \Omega^k({\bf R}^d_E)}, and similarly

\displaystyle  (Y \circ \pi)^* Z = \pi^* Y^* Z

for any {Z \in \Gamma(T{\bf R}^d_E)}, and

\displaystyle  (Y \circ \pi)^* g = \pi^* Y^* g

for any Riemannian metric {g}.

Exercise 13 Show that {Y: {\bf R}^d_L \rightarrow {\bf R}^d_E} is an isometry (with respect to the Euclidean metric on both {{\bf R}^d_L} and {{\bf R}^d_E}) if and only if {Y^* \eta = \eta}.

Every Riemannian metric {g} induces a musical isomorphism between vector fields on {{\bf R}^d_E} with {1}-forms: if {Z \in \Gamma(T {\bf R}^d_E)} is a vector field, the associated {1}-form {g \cdot Z \in \Omega^1({\bf R}^d_E)} (also denoted {Z^\flat_g} or simply {Z^\flat}) is defined in coordinates as

\displaystyle  (g \cdot Z)_i := g_{ij} Z^j

and similarly if {\theta \in \Omega^1({\bf R}^d_E)}, the associated vector field {g^{-1} \cdot \theta \in \Gamma(T{\bf R}^d_E)} (also denoted {\theta^\sharp_g} or {\theta^\sharp}) is defined in coordinates as

\displaystyle  (g^{-1} \cdot \theta)^i := g^{ij} \theta_j.

These operations clearly invert each other: {g^{-1} \cdot g \cdot Z = Z} and {g \cdot g^{-1} \cdot \theta = \theta}. Note that {g \cdot Z} can still be defined if {g} is not positive definite, though it might not be an isomorphism in this case. Observe the identities

\displaystyle  g(W,Z) = W \neg (g \cdot Z) = Z \neg (g \cdot W) = (g \cdot Z)(W) = (g \cdot W)(Z). \ \ \ \ \ (18)

The musical isomorphism interacts well with pullback, provided that one also pulls back the metric {g}:

Exercise 14 If {g} is a Riemannian metric, show that

\displaystyle  Y^*( g \cdot Z ) = Y^* g \cdot Y^* Z

for all {Z \in \Gamma(T {\bf R}^d_E)}, and

\displaystyle  Y^*( g^{-1} \cdot \theta ) = (Y^* g)^{-1} \cdot Y^* \theta.

for all {\theta \in \Omega^1({\bf R}^d_E)}.

We can now interpret some classical operations on vector fields in this differential geometry notation. For instance, if {Z,W \in \Gamma(T{\bf R}^d_E)} are vector fields, the dot product {Z \cdot W: {\bf R}^d_E \rightarrow {\bf R}} can be written as

\displaystyle  Z \cdot W = \eta(Z,W) = Z \neg (\eta \cdot W) = (\eta \cdot W)(Z)

and also

\displaystyle  Z \cdot W = *( (\eta \cdot Z) \wedge *W ),

and for {d=3}, the cross product {Z \times W: {\bf R}^3_E \rightarrow {\bf R}^3_E} can be written in differential geometry notation as

\displaystyle  Z \times W = *((\eta \cdot Z) \wedge (\eta \cdot W)).
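The Hodge-dual formula for the cross product can be checked directly against the classical componentwise definition; the following Python sketch (ours, not part of the notes) does so for two sample vectors:

```python
def cross(Z, W):
    """Classical cross product in R^3."""
    return [Z[1] * W[2] - Z[2] * W[1],
            Z[2] * W[0] - Z[0] * W[2],
            Z[0] * W[1] - Z[1] * W[0]]

Z, W = [1.0, -2.0, 0.5], [3.0, 0.25, -1.0]
# (eta.Z) ^ (eta.W) is the 2-form omega_{jk} = Z_j W_k - Z_k W_j  (here Z_i = Z^i)
omega = {(j, k): Z[j] * W[k] - Z[k] * W[j] for j in range(3) for k in range(3)}
# Hodge star: recover the vector c with c^i eps_{ijk} = omega_{jk}
star = [omega[(1, 2)], omega[(2, 0)], omega[(0, 1)]]
for i in range(3):
    assert abs(star[i] - cross(Z, W)[i]) < 1e-12
```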

Exercise 15 Formulate a definition for the pullback {Y^* T} of a rank {(k,l)} tensor field {T} (which in coordinates would be given by {T^{i_1 \dots i_k}_{j_1 \dots j_l}} for {i_1,\dots,i_k,j_1,\dots,j_l \in \{1,\dots,d\}}) that generalises the pullback of differential forms, vector fields, and Riemannian metrics. Argue why your definition is the natural one.

Lie derivatives. If {Z \in \Gamma(T {\bf R}^d_E)} is a continuously differentiable vector field and {\omega \in \Omega^k( {\bf R}^d_E )} is a continuously differentiable {k}-form, we define the Lie derivative {{\mathcal L}_Z \omega \in \Omega^k({\bf R}^d_E)} of {\omega} along {Z} by the Cartan formula

\displaystyle  {\mathcal L}_Z \omega := Z \neg d\omega + d(Z \neg \omega) \ \ \ \ \ (19)

with the convention that {d(Z \neg \omega)} vanishes if {\omega} is a {0}-form. Thus for instance, if {f \in \Omega^0({\bf R}^d_E)} then {{\mathcal L}_Z f = Z \neg df = Z^i \partial_i f}, while if {\theta \in \Omega^1({\bf R}^d_E)} then in coordinates

\displaystyle  ({\mathcal L}_Z \theta)_i = Z^j \partial_j \theta_i + (\partial_i Z^j) \theta_j. \ \ \ \ \ (20)

One can interpret the Lie derivative as the infinitesimal version of pullback:

Exercise 16 Let {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} be smooth and bounded (so that {u(t)} can be viewed as a smooth vector field on {{\bf R}^d_E} for each {t}), and let {X: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E} be a trajectory map. If {\omega \in \Omega^k({\bf R}^d_E)} is a smooth {k}-form, show that

\displaystyle  \partial_t (X(t)^* \omega) = X(t)^*( {\mathcal L}_{u(t)} \omega ).

More generally, if {\omega(t) \in \Omega^k({\bf R}^d_E)} is a smooth {k}-form that varies smoothly in {t}, show that

\displaystyle  \partial_t (X(t)^* \omega(t)) = X(t)^*( {\mathcal D}_t \omega(t) )

where {{\mathcal D}_t} denotes the material Lie derivative

\displaystyle  {\mathcal D}_t := \partial_t + {\mathcal L}_{u(t)}.

Note that the material Lie derivative specialises to the material derivative when applied to scalar fields. The above exercise shows that the trajectory map intertwines the ordinary time derivative {\partial_t} with the material (Lie) derivative.

Remark 17 If one lets {(t,x) \mapsto \exp(tZ) x} be the trajectory map associated to a time-independent vector field {Z} with initial condition (4) (thus {\exp(0 Z) x = x} and {\frac{d}{dt} \exp(tZ) x = Z( \exp(tZ) x)}), then the above exercise shows that {{\mathcal L}_Z \omega = \frac{d}{dt} \exp(tZ)^* \omega|_{t=0}} for any differential form {\omega}. This can be used as an alternate definition of the Lie derivative {{\mathcal L}_Z} (and has the advantage of readily extending to tensors other than differential forms, for which the Cartan formula is not available).

The Lie derivative behaves very well with respect to exterior product and exterior derivative:

Exercise 18 Let {Z \in \Gamma(T {\bf R}^d_E)} be continuously differentiable, and let {\omega \in \Omega^k({\bf R}^d_E), \alpha \in \Omega^l({\bf R}^d_E)} also be continuously differentiable. Establish the Leibniz rule

\displaystyle  {\mathcal L}_Z( \omega \wedge \alpha ) = ({\mathcal L}_Z \omega) \wedge \alpha + \omega \wedge {\mathcal L}_Z \alpha.

If {\omega} is twice continuously differentiable, also establish the commutativity

\displaystyle  {\mathcal L}_Z d \omega = d {\mathcal L}_Z \omega

of exterior derivative and Lie derivative.

Exercise 19 Let {Z \in \Gamma(T {\bf R}^d_E)} be continuously differentiable. Show that

\displaystyle  {\mathcal L}_Z d\mathrm{vol}_E = \mathrm{div}(Z) d\mathrm{vol}_E

where {\mathrm{div}(Z) := \partial_i Z^i} is the divergence of {Z}. Use this and Exercise 16 to give an alternate proof of Lemma 3.

Exercise 20 Let {Z \in \Gamma(T {\bf R}^d_E)} be continuously differentiable. For any smooth compactly supported volume form {\omega}, show that

\displaystyle  \int_{{\bf R}^d_E} {\mathcal L}_Z \omega = 0.

Conclude in particular that if {Z} is divergence-free then

\displaystyle  \int_{{\bf R}^d_E} ({\mathcal L}_Z f) d\mathrm{vol}_E = 0

for any {f\in C^\infty_c({\bf R}^d_E \rightarrow {\bf R})}.

The Lie derivative {{\mathcal L}_Z W \in \Gamma(T {\bf R}^d_E) } of a continuously differentiable vector field {W \in\Gamma(T {\bf R}^d_E)} is defined in coordinates as

\displaystyle  ({\mathcal L}_Z W)^i := Z^j \partial_j W^i - W^j \partial_j Z^i

and the Lie derivative {{\mathcal L}_Z g} of a continuously differentiable rank {(0,2)} tensor {g} is defined in coordinates as

\displaystyle  ({\mathcal L}_Z g)_{ij} := Z^k \partial_k g_{ij} + (\partial_i Z^k) g_{kj} + (\partial_j Z^k) g_{ik}.

Thus for instance the Lie derivative of the Euclidean metric {\eta_{ij}} is expressible in coordinates as

\displaystyle  ({\mathcal L}_Z \eta)_{ij} = \partial_i Z^j + \partial_j Z^i \ \ \ \ \ (21)

(compare with the deformation tensor used in Notes 0).

We have similar properties to Exercise 18:

Exercise 21 Let {Z \in \Gamma(T {\bf R}^d_E)} be continuously differentiable.

Exercise 22 If {Z \in \Gamma(T {\bf R}^d_E)} is continuously differentiable, establish the identity

\displaystyle  Y^* {\mathcal L}_Z \phi = {\mathcal L}_{Y^* Z} Y^* \phi

whenever {\phi} is a continuously differentiable differential form, vector field, or metric tensor.

Exercise 23 If {Z,W \in \Gamma(T {\bf R}^d_E)} are smooth, define the Lie bracket {[Z,W] \in \Gamma(T {\bf R}^d_E)} by the formula

\displaystyle  [Z,W] := {\mathcal L}_Z W.

Establish the anti-symmetry {[Z,W] = -[W,Z]} (so in particular {[Z,Z]=0}) and the Jacobi identity

\displaystyle  [[Z_1,Z_2],Z_3] + [[Z_2,Z_3],Z_1] + [[Z_3,Z_1],Z_2] = 0,

and also

\displaystyle  {\mathcal L}_{[Z,W]} \phi = {\mathcal L}_Z {\mathcal L}_W \phi - {\mathcal L}_W {\mathcal L}_Z \phi

whenever {Z,W,Z_1,Z_2,Z_3 \in \Gamma(T {\bf R}^d_E)} are smooth, and {\phi} is a smooth differential form, vector field, or metric tensor.
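For linear vector fields {Z(x) = Ax}, {W(x) = Bx}, the bracket is again linear with matrix {BA - AB} (from the coordinate formula {[Z,W]^i = Z^j \partial_j W^i - W^j \partial_j Z^i}), so the identities of Exercise 23 reduce to exact matrix algebra. The following Python sketch (our own illustration, not part of the notes) verifies antisymmetry and the Jacobi identity on sample integer matrices:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(A, B):
    return [[A[i][j] - B[i][j] for j in range(len(A))] for i in range(len(A))]

def bracket(A, B):
    """Lie bracket of the linear vector fields Z(x)=Ax, W(x)=Bx.
    [Z,W]^i = Z^j d_j W^i - W^j d_j Z^i = ((BA - AB) x)^i, so the bracket
    is again linear, with matrix BA - AB."""
    return matsub(matmul(B, A), matmul(A, B))

A = [[0, 1, 0], [0, 0, 2], [1, 0, 0]]
B = [[1, 0, 3], [0, -1, 0], [0, 2, 0]]
C = [[2, 0, 0], [1, 1, 0], [0, 0, -1]]

# antisymmetry: [Z,W] = -[W,Z]
assert bracket(A, B) == [[-x for x in row] for row in bracket(B, A)]
# Jacobi identity: [[Z1,Z2],Z3] + [[Z2,Z3],Z1] + [[Z3,Z1],Z2] = 0
jac = [[bracket(bracket(A, B), C)[i][j]
        + bracket(bracket(B, C), A)[i][j]
        + bracket(bracket(C, A), B)[i][j] for j in range(3)] for i in range(3)]
assert jac == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```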

Exercise 24 Formulate a definition for the Lie derivative {{\mathcal L}_Z T} of a (continuously differentiable) rank {(k,l)} tensor field {T} along a vector field {Z} that generalises the Lie derivative of differential forms, vector fields, and Riemannian metrics. Argue why your definition is the natural one.

— 2. The Euler equations in differential geometry notation —

Now we write the Euler equations (1) in the differential geometry language developed in the previous section. This will make it relatively painless to change coordinates. As in the rest of this set of notes, we work formally, assuming that all fields are smooth enough to justify the manipulations below.

The Euler equations involve a time-dependent scalar field {p(t)}, which can be viewed as an element of {\Omega^0({\bf R}^d_E)}, and a time-dependent velocity field {u(t)}, which can be viewed as an element of {\Gamma(T {\bf R}^d_E)}. The second of the Euler equations simply asserts that this vector field is divergence-free:

\displaystyle  \mathrm{div} u(t) = 0

or equivalently (by Exercise 19 and the definition of material Lie derivative {{\mathcal D}_t = \partial_t + {\mathcal L}_u})

\displaystyle  {\mathcal D}_t d\mathrm{vol}_E = 0.

For the first equation, it is convenient to work instead with the covelocity field {v(t) \in \Omega^1({\bf R}^d_E)}, formed by applying the Euclidean musical isomorphism to {u(t)}:

\displaystyle  v(t) = \eta \cdot u(t).

In coordinates, we have {v_i = \eta_{ij} u^j}, thus {v_i = u^i} for {i=1,\dots,d}. The Euler equations can then be written in coordinates as

\displaystyle  \partial_t v_i + u^j \partial_j v_i = -\partial_i p.

The left-hand side is close to the {i} component of the material Lie derivative {{\mathcal D}_t v = \partial_t v + {\mathcal L}_u v} of {v}. Indeed, from (20) we have

\displaystyle  ({\mathcal D}_t v)_i = \partial_t v_i + u^j \partial_j v_i + (\partial_i u^j) v_j

and so the first Euler equation becomes

\displaystyle  ({\mathcal D}_t v)_i = - \partial_i p + (\partial_i u^j) v_j.

Since {u^j = v_j}, we can express the right-hand side as a total derivative {- \partial_i \tilde p}, where {\tilde p} is the modified pressure

\displaystyle  \tilde p := p - \frac{1}{2} u^j v_j = p - \frac{1}{2} \eta(u,u).

We thus see that the Euler equations can be transformed to the system

\displaystyle  {\mathcal D}_t v = - d \tilde p \ \ \ \ \ (22)

\displaystyle  v = \eta \cdot u \ \ \ \ \ (23)

\displaystyle  {\mathcal D}_t d\mathrm{vol}_E = 0. \ \ \ \ \ (24)

Using the Cartan formula (19), one can also write (22) as

\displaystyle  \partial_t v + u \neg dv = - d p' \ \ \ \ \ (25)

where {p'} is another modification of the pressure:

\displaystyle  p' = \tilde p + u \neg v = p + \frac{1}{2} \eta(u,u).

In coordinates, (25) becomes

\displaystyle  \partial_t v_j + u^i (\partial_i v_j - \partial_j v_i) = - \partial_j p'. \ \ \ \ \ (26)

One advantage of the formulation (22)–(24) is that one can pull back by an arbitrary diffeomorphic change of coordinates (both time-dependent and time-independent), with the only things potentially changing being the material Lie derivative {{\mathcal D}_t}, the metric {\eta}, and the volume form {d\mathrm{vol}_E}. (Another, related, advantage is that this formulation readily suggests an extension to more general Riemannian manifolds, by replacing {\eta} with a general Riemannian metric and {d\mathrm{vol}_E} with the associated volume form, without the need to explicitly introduce other Riemannian geometry concepts such as covariant derivatives or Christoffel symbols.)

For instance, suppose {d=3}, and we wish to view the Euler equations in cylindrical coordinates {(r,\theta,z) \in [0,+\infty) \times {\bf R}/2\pi {\bf Z} \times {\bf R}}, thus pulling back under the time-independent map {Y: [0,+\infty) \times {\bf R}/2\pi {\bf Z} \times {\bf R} \rightarrow {\bf R}^3_E} defined by

\displaystyle  Y(r, \theta,z) := (r \cos \theta, r \sin \theta, z ).

Strictly speaking, this is not a diffeomorphism due to singularities at {r=0}, but we ignore this issue for now by only working away from the {z} axis {r=0}. As is well known, the metric {d\eta^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2} pulls back under this change of coordinates {x^1 = r \cos \theta, x^2 = r \sin \theta, z = x^3} as

\displaystyle  d(Y^* \eta)^2 = dr^2 + r^2 d\theta^2 + dz^2,

thus the pullback metric {Y^* \eta} is diagonal in {r,\theta,z} coordinates with entries

\displaystyle  (Y^* \eta)_{rr} = 1; \quad (Y^* \eta)_{\theta \theta} = r^2; \quad (Y^* \eta)_{zz} = 1.
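One can recover these metric components numerically by evaluating the pullback formula (17) with a finite-difference Jacobian of the cylindrical coordinate map; a Python sketch of ours (not part of the notes):

```python
from math import cos, sin

H = 1e-5

def partial(g, i, x, h=H):
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    return (g(xp) - g(xm)) / (2 * h)

def Y(a):                       # cylindrical coordinates (r, theta, z) -> Cartesian
    r, th, z = a
    return [r * cos(th), r * sin(th), z]

a0 = [1.7, 0.9, -0.4]           # a sample point away from the axis r = 0
J = [[partial(lambda b: Y(b)[i], al, a0) for al in range(3)] for i in range(3)]
# (Y* eta)_{alpha beta} = eta_{ij} d_alpha Y^i d_beta Y^j = sum_i J[i][al] J[i][be]
g = [[sum(J[i][al] * J[i][be] for i in range(3)) for be in range(3)]
     for al in range(3)]
expected = [[1, 0, 0], [0, a0[0] ** 2, 0], [0, 0, 1]]   # diag(1, r^2, 1)
for al in range(3):
    for be in range(3):
        assert abs(g[al][be] - expected[al][be]) < 1e-6
```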

The volume form {d\mathrm{vol}_E = dx \wedge dy \wedge dz} similarly pulls back to the familiar cylindrical coordinate volume form

\displaystyle  Y^* d\mathrm{vol}_E = r dr \wedge d \theta \wedge dz.

If (by slight abuse of notation) we write the components of {Y^* u} as {u^r, u^z, u^\theta}, and the components of {Y^* v} as {v_r, v_z, v_\theta}, then the second equation (23) in our current formulation of the Euler equations now becomes

\displaystyle  v_r = u^r; \quad v_\theta = r^2 u^\theta; \quad v_z = u^z \ \ \ \ \ (27)

and the third equation (24) is

\displaystyle  {\mathcal L}_u ( r dr \wedge d \theta \wedge dz ) = 0

which by the product rule and Exercise 19 becomes

\displaystyle  {\mathcal L}_u(r) + r \mathrm{div} u = 0

or after expanding in coordinates

\displaystyle  u^r + r (\partial_r u^r + \partial_\theta u^\theta + \partial_z u^z) = 0.

If one substitutes (27) into (26) in the {r,\theta,z} coordinates to eliminate the {v} variables, we thus see that the cylindrical coordinate form of the Euler equations is

\displaystyle  \partial_t u^r + u^\theta (\partial_\theta u^r - \partial_r(r^2 u^\theta)) + u^z (\partial_z u^r - \partial_r u^z) = - \partial_r p' \ \ \ \ \ (28)

\displaystyle  \partial_t (r^2 u^\theta) + u^r (\partial_r (r^2 u^\theta) - \partial_\theta u^r) + u^z (\partial_z (r^2 u^\theta) - \partial_\theta u^z) = - \partial_\theta p' \ \ \ \ \ (29)

\displaystyle  \partial_t u^z + u^r (\partial_r u^z - \partial_z u^r) + u^\theta (\partial_\theta u^z - \partial_z (r^2 u^\theta)) = - \partial_z p' \ \ \ \ \ (30)

\displaystyle  u^r + r (\partial_r u^r + \partial_\theta u^\theta + \partial_z u^z) = 0. \ \ \ \ \ (31)

One should compare how readily one can derive these equations using the differential geometry formalism with the more pedestrian approach using the chain rule:

Exercise 25 Starting with a smooth solution {(u,p)} to the Euler equations (1) in {{\bf R}^3_E}, and transforming to cylindrical coordinates {(r,\theta,z)}, establish the chain rule formulae

\displaystyle  u^1 = u^r \cos \theta - r u^\theta \sin \theta

\displaystyle  u^2 = u^r \sin \theta + r u^\theta \cos \theta

\displaystyle  u^3 = u^z

\displaystyle  \partial_1 = \cos \theta \partial_r - \frac{\sin \theta}{r} \partial_\theta

\displaystyle  \partial_2 = \sin \theta \partial_r + \frac{\cos \theta}{r} \partial_\theta

\displaystyle  \partial_3 = \partial_z

and use this and the identity

\displaystyle  p' := p + \frac{1}{2} (u^1 u^1 + u^2 u^2 + u^3 u^3)

to rederive the system (28)–(31) (away from the {z} axis) without using the language of differential geometry.

Exercise 26 Turkington coordinates {(x,y,\zeta) \in {\bf R} \times [0,+\infty) \times {\bf R}/2\pi {\bf Z}} are a variant of cylindrical coordinates {(r,\theta,z) \in [0,+\infty) \times {\bf R}/2\pi{\bf Z} \times {\bf R}}, defined by the formulae

\displaystyle  (x,y,\zeta) := (z,r^2/2, \theta);

the advantage of these coordinates is that the map from Cartesian coordinates {(x^1,x^2,x^3)} to Turkington coordinates {(x,y,\zeta)} is volume preserving. Show that in these coordinates, the Euler equations become

\displaystyle  \partial_t u^x + u^y (\partial_y u^x - \partial_x(\frac{u^y}{2y})) + u^\zeta (\partial_\zeta u^x - \partial_x(2y u^\zeta)) = - \partial_x p'

\displaystyle  \partial_t (\frac{u^y}{2y}) + u^x (\partial_x (\frac{u^y}{2y}) - \partial_y u^x) + u^\zeta (\partial_\zeta (\frac{u^y}{2y}) - \partial_y (2y u^\zeta)) = - \partial_y p'

\displaystyle  \partial_t (2yu^\zeta) + u^x (\partial_x (2yu^\zeta) - \partial_\zeta u^x) + u^y (\partial_y (2yu^\zeta) - \partial_\zeta (\frac{u^y}{2y})) = - \partial_\zeta p'

\displaystyle  \partial_x u^x + \partial_y u^y + \partial_\zeta u^\zeta = 0.

(These coordinates are particularly useful for studying solutions to Euler that are “axisymmetric with swirl”, in the sense that the fields {u^x, u^y, u^\zeta, p'} do not depend on the {\zeta} variable, so that all the terms involving {\partial_\zeta} vanish; one can specialise further to the case of solutions that are “axisymmetric without swirl”, in which case {u^\zeta} also vanishes.)
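The volume-preserving claim for Turkington coordinates amounts to the map from Cartesian coordinates having unit Jacobian determinant, which a quick numerical check confirms. The following Python sketch is our own illustration (it uses {\mathrm{atan2}(x^2,x^1)} as a local choice of the angle variable):

```python
from math import atan2

H = 1e-6

def partial(g, i, x, h=H):
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    return (g(xp) - g(xm)) / (2 * h)

def T(x):
    """Cartesian (x^1, x^2, x^3) -> Turkington (x, y, zeta) = (z, r^2/2, theta)."""
    return [x[2], (x[0] ** 2 + x[1] ** 2) / 2, atan2(x[1], x[0])]

x0 = [0.9, 0.5, -1.2]             # a sample point away from the axis
J = [[partial(lambda p: T(p)[i], j, x0) for j in range(3)] for i in range(3)]
det = (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
       - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
       + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))
assert abs(det - 1.0) < 1e-6      # unit Jacobian: the change of variables is volume-preserving
```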

We can use the differential geometry formalism to formally verify the conservation laws of the Euler equation. We begin with conservation of energy

\displaystyle  E(t) := \frac{1}{2} \int_{{\bf R}^d_E} |u|^2\ d\mathrm{vol}_E = \frac{1}{2} \int_{{\bf R}^d_E} (g^{-1} \cdot v) \neg v\ d\mathrm{vol}_E.

Formally differentiating this in time (and noting that the form {(g^{-1} \cdot w) \neg v= g^{ij} v_i w_j} is symmetric in {v,w}) we have

\displaystyle  \partial_t E(t) = \int_{{\bf R}^d_E} (g^{-1} \cdot v) \neg \partial_t v \ d\mathrm{vol}_E = \int_{{\bf R}^d_E} u \neg \partial_t v\ d\mathrm{vol}_E.

Using (22), we can write

\displaystyle  u \neg \partial_t v = - u \neg {\mathcal L}_u v - u \neg dp'.

From the Cartan formula (19) one has {u \neg dp' = {\mathcal L}_u p'}; from Exercise 23 one has {{\mathcal L}_u u = 0}, and hence by the Leibniz rule (Exercise 18(i)) we thus can write {u \neg \partial_t v} as a total derivative:

\displaystyle  u \neg \partial_t v = {\mathcal L}_u ( - u \neg v - p' ).

From Exercise 20 we thus formally obtain the conservation law {\partial_t E = 0}.

Now suppose that {Z \in \Gamma(T {\bf R}^d_E)} is a time-independent vector field that is a Killing vector field for the Euclidean metric {\eta}, by which we mean that

\displaystyle  {\mathcal L}_Z \eta = 0.

Taking traces in (21), this implies in particular that {Z} is divergence-free, or equivalently

\displaystyle  {\mathcal L}_Z d\mathrm{vol}_E = 0.

(Geometrically, this implication arises because the volume form {d\mathrm{vol}_E} can be constructed from the Euclidean metric {\eta} (up to a choice of orientation).) Consider the formal quantity

\displaystyle  M(t) := \int_{{\bf R}^d_E} (Z \neg v)\ d\mathrm{vol}_E.

As {v} is the only time-dependent quantity here, we may formally differentiate to obtain

\displaystyle  \partial_t M(t) = \int_{{\bf R}^d_E} (Z \neg \partial_t v)\ d\mathrm{vol}_E

Using (22), the right-hand side is

\displaystyle  - \int_{{\bf R}^d_E} (Z \neg {\mathcal L}_u v) + (Z \neg dp')\ d\mathrm{vol}_E.

By Cartan’s formula, {Z \neg dp'} is a total derivative {{\mathcal L}_Z p'}, and hence this contribution to the integral formally vanishes as {Z} is divergence-free. The quantity {Z \neg {\mathcal L}_u v} can be written using the Leibniz rule as the difference of the total derivative {{\mathcal L}_u (Z \neg v)} and the quantity {{\mathcal L}_u Z \neg v}. The former quantity also gives no contribution to the integral as {u} is divergence free, thus

\displaystyle  \partial_t M(t) = \int_{{\bf R}^d_E} {\mathcal L}_u Z \neg v\ d\mathrm{vol}_E.

By Exercise 23, we have {{\mathcal L}_u Z = - {\mathcal L}_Z u = - {\mathcal L}_Z(\eta^{-1} \cdot v)}. Since {\eta} (and hence {\eta^{-1}}) is annihilated by {{\mathcal L}_Z}, and the form {(\eta^{-1} \cdot v) \neg w = \eta^{ij} v_i w_j} is symmetric in {v,w}, we can express {{\mathcal L}_Z(\eta^{-1} \cdot v) \neg v} as a total derivative

\displaystyle  {\mathcal L}_Z(\eta^{-1} \cdot v) \neg v = \frac{1}{2} {\mathcal L}_Z ( (\eta^{-1} \cdot v) \neg v ),

and so this integral also vanishes. Thus we obtain the conservation law {\partial_t M(t) = 0}. If we set the Killing vector field {Z} equal to the constant vector field {\frac{d}{dx^i}} for some {i=1,\dots,d}, we obtain conservation of the momentum components

\displaystyle  \int_{{\bf R}^d_E} u^i\ d\mathrm{vol}_E

for {i=1,\dots,d}; if we instead set the Killing vector field {Z} equal to the rotation vector field {x^i \frac{d}{dx^j} - x^j \frac{d}{dx^i}} (which one can easily verify to be Killing using (21)) we obtain conservation of the angular momentum components

\displaystyle  \int_{{\bf R}^d_E} x^i u^j - x^j u^i\ d\mathrm{vol}_E

for {i,j =1,\dots,d}. Unfortunately, this essentially exhausts the supply of Killing vector fields:

Exercise 27 Let {Z} be a smooth Killing vector field of the Euclidean metric {\eta}. Show that {Z} is a linear combination (with real coefficients) of the constant vector fields {\frac{d}{dx^i}}, {i=1,\dots,d} and the rotation vector fields {x^i \frac{d}{dx^j} - x^j \frac{d}{dx^i}}, {i,j=1,\dots,d}. (Hint: use (21) to show that all the second derivatives of components of {Z} vanish.)
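For the flat Euclidean metric, the Killing equation from (21) reduces to {\partial_i Z_j + \partial_j Z_i = 0}. A short sympy computation (an informal check, with the dilation field {x^i \frac{d}{dx^i}} thrown in as a non-example; it is not in the text) confirms that the constant and rotation fields satisfy it:

```python
import sympy as sp

x = sp.symbols('x1:4', real=True)  # coordinates on R^3

def lie_eta(Z):
    """Deformation tensor (L_Z eta)_{ij} = d_i Z_j + d_j Z_i for the flat metric;
    Z is Killing iff this vanishes identically."""
    return sp.Matrix(3, 3, lambda i, j: sp.diff(Z[j], x[i]) + sp.diff(Z[i], x[j]))

translation = sp.Matrix([1, 0, 0])       # d/dx^1
rotation = sp.Matrix([-x[1], x[0], 0])   # x^1 d/dx^2 - x^2 d/dx^1
dilation = sp.Matrix([x[0], x[1], x[2]]) # non-example: not Killing

print(lie_eta(translation).is_zero_matrix)  # True
print(lie_eta(rotation).is_zero_matrix)     # True
print(lie_eta(dilation).is_zero_matrix)     # False
```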

The vorticity {2}-form {\omega(t) \in \Omega^2( {\bf R}^d_E)} is defined as the exterior derivative of the covelocity:

\displaystyle  {}\omega := dv.

It already made an appearance in Notes 3 from the previous quarter. Taking exterior derivatives of (22) using (10) and Exercise 18 we obtain the appealingly simple vorticity equation

\displaystyle  {\mathcal D}_t \omega = 0. \ \ \ \ \ (32)

In two and three dimensions we may take the Hodge dual {*\omega} of the vorticity {2}-form to obtain either a scalar field (in dimension {d=2}) or a vector field (in dimension {d=3}), and then Exercise 18(iv) implies that

\displaystyle  {\mathcal D}_t *\omega = 0. \ \ \ \ \ (33)

In two dimensions, this gives us a lot of conservation laws, since one can apply the scalar chain rule to then formally conclude that

\displaystyle  {\mathcal D}_t F( *\omega) = 0

for any {F: {\bf R} \rightarrow {\bf R}}, which upon integration on {{\bf R}^2_E} using Exercise 20 gives the conservation law

\displaystyle  \partial_t \int_{{\bf R}^2_E} F(*\omega)\ d\mathrm{vol}_E = 0

for any such function {F}. Thus for instance the {L^p({\bf R}^2_E \rightarrow {\bf R})} norms of {*\omega} are formally conserved for every {0 < p < \infty}, and hence also for {p=\infty} by a limiting argument, recovering Proposition 24 from Notes 3 of the previous quarter.

In three dimensions there is also an interesting conservation law involving the vorticity. Observe that the wedge product {v \wedge \omega} of the covelocity and the vorticity is a {3}-form and can thus be integrated over {{\bf R}^3_E}. The helicity

\displaystyle  H(t) := \int_{{\bf R}^3_E} v(t) \wedge \omega(t) \ \ \ \ \ (34)

is a formally conserved quantity of the Euler equations. Indeed, formally differentiating and using Exercise 20 we have

\displaystyle  \partial_t H(t) = \int_{{\bf R}^3_E} {\mathcal D}_t ( v \wedge \omega).

From the Leibniz rule and (32) we have

\displaystyle  {\mathcal D}_t ( v \wedge \omega) = ({\mathcal D}_t v) \wedge \omega.

Applying (22) we can write this expression as {-d\tilde p \wedge \omega}. From (10) we have {d\omega=0}, hence this expression is also a total derivative {-d(\tilde p \omega)}. From Stokes’ theorem (14) we thus formally obtain the conservation of helicity: {H(t) = H(0)}; this was first observed by Moreau.

Exercise 28 Formally verify the conservation of momentum, angular momentum, and helicity directly from the original form (1) of the Euler equations.

Exercise 29 In even dimensions {d \geq 2}, show that the integral {\int_{{\bf R}^d_E} \bigwedge^{d/2} \omega(t)} (formed by taking the exterior product of {d/2} copies of {\omega}) is conserved by the flow, while in odd dimensions {d \geq 3}, show that the generalised helicity {\int_{{\bf R}^d_E} v(t) \wedge \bigwedge^{\frac{d-1}{2}} \omega(t)} is conserved by the flow. (This observation is due to Denis Serre, as well as unpublished work of Tartar.)

As it turns out, there are no further conservation laws for the Euler equations in Eulerian coordinates that are linear or quadratic integrals of the velocity field and its derivatives, at least in three dimensions; see this paper of Denis Serre. In particular, the Euler equations are not believed to be completely integrable. (But there are a few more conserved integrals of motion in the Lagrangian formalism; see Exercise 40.)

Exercise 30 Let {u: [0,T) \times {\bf R}^3_E \rightarrow {\bf R}^3_E} be a smooth solution to the Euler equations in three dimensions {{\bf R}^3_E}, let {*\omega} be the vorticity vector field, and let {f: [0,T) \times {\bf R}^3_E \rightarrow {\bf R}} be an arbitrary smooth scalar field. Establish Ertel’s theorem

\displaystyle  D_t( *\omega \cdot \nabla f ) = *\omega \cdot \nabla(D_t f).
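Ertel's theorem can be checked symbolically on an explicit solution. The sketch below (an informal verification, not a proof of the exercise) uses the steady shear flow {u = (0,0,g(x^1,x^2))}, which is the DiPerna-Majda flow of Exercise 39 with {f=0}, together with an arbitrary scalar field {\varphi}:

```python
import sympy as sp

t, x1, x2, x3 = sp.symbols('t x1 x2 x3', real=True)
g = sp.Function('g')(x1, x2)           # arbitrary shear profile
phi = sp.Function('phi')(t, x1, x2, x3)  # arbitrary scalar field

# Steady shear solution of Euler: u = (0, 0, g(x1, x2)).
u = sp.Matrix([0, 0, g])
# Hodge dual of the vorticity: *omega = curl u = (d2 g, -d1 g, 0).
w = sp.Matrix([sp.diff(g, x2), -sp.diff(g, x1), 0])

xs = [x1, x2, x3]
def D_t(h):  # material derivative d_t + u . nabla
    return sp.diff(h, t) + sum(u[i] * sp.diff(h, xs[i]) for i in range(3))

def w_grad(h):  # *omega . nabla
    return sum(w[i] * sp.diff(h, xs[i]) for i in range(3))

lhs = D_t(w_grad(phi))
rhs = w_grad(D_t(phi))
print(sp.simplify(lhs - rhs))  # expect 0
```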

Exercise 31 (Clebsch variables) Let {u: [0,T) \times {\bf R}^d_E \rightarrow {\bf R}^d_E} be a smooth solution to the Euler equations. Suppose that at time zero, the covelocity {v(0)} takes the form

\displaystyle  v(0) = \sum_{j=1}^k \theta_j(0) d \varphi_j(0)

for some smooth scalar fields {\theta_j(0), \varphi_j(0): {\bf R}^d_E \rightarrow {\bf R}}. Show that at all later times {t}, the covelocity can be expressed in the same form up to an exact term, {v(t) = \sum_{j=1}^k \theta_j(t) d \varphi_j(t) + dq(t)}, where the Clebsch variables {\theta_j, \varphi_j} are transported by the flow (thus {D_t \theta_j = D_t \varphi_j = 0}) and {q(t): {\bf R}^d_E \rightarrow {\bf R}} is a smooth scalar field.

— 3. Viewing the Euler equations in Lagrangian coordinates —

Throughout this section, {(u,p)} is a smooth solution to the Euler equations on {[0,T) \times {\bf R}^d_E}, and {X} is a volume-preserving trajectory map.

We pull back the Euler equations (22), (23), (24) to create a Lagrangian velocity field {\underline{u}: [0,T) \rightarrow \Gamma(T {\bf R}^d_L)}, a Lagrangian covelocity field {\underline{v}: [0,T) \rightarrow \Omega^1({\bf R}^d_L)}, a Lagrangian modified pressure field {\underline{\tilde p}: [0,T) \times {\bf R}^d_L \rightarrow {\bf R}}, and a Lagrangian vorticity field {\underline{\omega}: [0,T) \rightarrow \Omega^2({\bf R}^d_L)} by the formulae

\displaystyle  \underline{u}(t) := X(t)^* u(t)

\displaystyle  \underline{v}(t) := X(t)^* v(t)

\displaystyle  \underline{\omega}(t) := X(t)^* \omega(t) \ \ \ \ \ (35)

\displaystyle  \underline{\tilde p}(t) := X(t)^* \tilde p(t).

By Exercise 16, the Euler equations now take the form

\displaystyle  \partial_t \underline{v} = - d \underline{\tilde p} \ \ \ \ \ (36)

\displaystyle  \mathrm{div} \underline{u} = 0

\displaystyle  \underline{v} = (X^* \eta) \cdot \underline{u}

and the vorticity is given by

\displaystyle  \underline{\omega} = d \underline{v}

and obeys the vorticity equation

\displaystyle  \partial_t \underline{\omega} = 0.

We thus see that the Lagrangian vorticity {\underline{\omega}} is a pointwise conserved quantity:

\displaystyle  \underline{\omega}_{\alpha \beta}(t, a) = \underline{\omega}_{\alpha\beta}(0, a). \ \ \ \ \ (37)

This lets us solve for the Eulerian vorticity {\omega_{ij}} in terms of the trajectory map. Indeed, from (12), (35) we have

\displaystyle  \underline{\omega}_{\alpha \beta}(0,a) = \underline{\omega}_{\alpha \beta}(t,a) = \omega_{ij}(t,X(t,a)) \partial_\alpha X(t)^i(a) \partial_\beta X(t)^j(a);

applying the inverse {(\nabla X(t))^{-1}} of the linear transformation {\nabla X(t)}, we thus obtain the Cauchy vorticity formula

\displaystyle  \omega_{ij}(t,X(t,a)) = \underline{\omega}_{\alpha \beta}(0,a) (\nabla X(t)^{-1})^\alpha_i(a) (\nabla X(t)^{-1})^\beta_j(a). \ \ \ \ \ (38)

If we normalise the trajectory map by (4), then {\underline{\omega}(0) = \omega(0)}, and we thus have

\displaystyle  \omega_{ij}(t,X(t,a)) = \omega_{\alpha \beta}(0,a) (\nabla X(t)^{-1})^\alpha_i(a) (\nabla X(t)^{-1})^\beta_j(a). \ \ \ \ \ (39)

Thus for instance, we see that the support of the vorticity is transported by the flow:

\displaystyle  \mathrm{supp}(\omega(t)) = X(t)( \mathrm{supp}(\omega(0)) ).

Among other things, this shows that the volume and topology of the support of the vorticity remain constant in time. It also suggests that the Euler equations admit a number of “vortex patch” solutions in which the vorticity is compactly supported.

Exercise 32 Assume the normalisation (4).

  • (i) In the two-dimensional case {d=2}, show that the Cauchy vorticity formula simplifies to

    \displaystyle  \omega_{12}(t,X(t,a)) = \omega_{12}(0, a).

    Thus in this case, vorticity is simply transported by the flow.

  • (ii) In the three-dimensional case {d=3}, show that the Cauchy vorticity formula can be written using the Hodge dual {*\omega} of the vorticity as

    \displaystyle  *\omega^i(t, X(t,a)) = *\omega^\alpha(0,a) \partial_\alpha X^i( t, a ).

    Thus we see that the vorticity is transported and also stretched by the flow, with the stretching given by the matrix {\partial_\alpha X^i}.
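The three-dimensional Cauchy vorticity formula of part (ii) can also be verified symbolically on an explicit flow. The following sympy sketch (an informal check, assuming the normalisation (4)) uses the steady shear flow {u = (0,0,g(x^1,x^2))}, whose trajectory map is {X(t,a) = (a^1, a^2, a^3 + t g(a^1,a^2))}:

```python
import sympy as sp

t, a1, a2, a3 = sp.symbols('t a1 a2 a3', real=True)
g = sp.Function('g')

# Trajectory map of the steady shear flow u = (0, 0, g(x1, x2)).
X = sp.Matrix([a1, a2, a3 + t * g(a1, a2)])

# Hodge dual of the vorticity, *omega = (d2 g, -d1 g, 0); for this steady flow
# it is time independent, and since X^1 = a1, X^2 = a2 its value at X(t,a)
# equals its value at a.
w0 = sp.Matrix([sp.diff(g(a1, a2), a2), -sp.diff(g(a1, a2), a1), 0])
w_at_X = w0

# Right-hand side of the vector Cauchy formula: *omega^alpha(0,a) d_alpha X^i.
dX = X.jacobian([a1, a2, a3])  # dX[i, alpha] = d_alpha X^i
rhs = dX * w0

print(sp.simplify(w_at_X - rhs))  # expect the zero vector
```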

One can also phrase the conservation of vorticity in an integral form. If {S} is a two-dimensional oriented surface in {{\bf R}^3_L} that does not vary in time, then from (37) we see that the integral

\displaystyle  \int_S \underline{\omega}(t)

is formally conserved in time:

\displaystyle  \int_S \underline{\omega}(t) = \int_S \underline{\omega}(0).

Composing this with the trajectory map {X(t)} using (35), we conclude that

\displaystyle  \int_{X(t)(S)} \omega(t) = \int_{X(0)(S)} \omega(0).

Writing {\omega = dv} and using Stokes’ theorem (14), we arrive at the Kelvin circulation theorem

\displaystyle  \int_{X(t)(\partial S)} v(t) = \int_{X(0)(\partial S)} v(0).

The integral of the covelocity {v} along a loop {\gamma} is known as the circulation of the fluid along the loop; the Kelvin circulation theorem then asserts that this circulation remains constant over time as long as the loop evolves along the flow.

Exercise 33 (Cauchy invariants)

For more discussion of Cauchy’s investigation of the Cauchy invariants and vorticity formula, see this article of Frisch and Villone.

Exercise 34 (Transport of vorticity lines) Suppose we are in three dimensions {d=3}, so that the Hodge dual {* \omega} of vorticity is a vector field. A smooth curve {\gamma} (either infinite on both ends, or a closed loop) in {{\bf R}^3_E} is said to be a vortex line (or vortex ring, in the case of a closed loop) at time {t} if at every point {x} of the curve {\gamma}, the tangent to {\gamma} at {x} is parallel to the vorticity {*\omega(t,x)} at that point. Suppose that the trajectory map is normalised using (4). Show that if {\gamma} is a vortex line at time {0}, then {X(t)(\gamma)} is a vortex line at any other time {t}; thus, vortex lines (or vortex rings) flow along with the fluid.

Exercise 35 (Conservation of helicity in Lagrangian coordinates)

  • (i) In any dimension, establish the identity

    \displaystyle  \partial_t ( \underline{v}(t) \wedge \underline{\omega}(t) ) = - d( \underline{\tilde p} \underline{\omega}(t) )

    in Lagrangian spacetime.

  • (ii) Conclude that in three dimensions {d=3}, the quantity

    \displaystyle  \int_{{\bf R}^3_L} \underline{v}(t) \wedge \underline{\omega}(t)

    is formally conserved in time. Explain why this conserved quantity is the same as the helicity (34).

  • (iii) Continue assuming {d=3}. Define a vortex tube at time {t} to be a region {T \subset {\bf R}^3_E} in which, at every point {x} on the boundary {\partial T}, the vorticity vector field {*\omega(t,x)} is tangent to {\partial T}. Show that if {X(0)(T)} is a vortex tube at time {0}, then {X(t)(T)} is a vortex tube at time {t}, and the helicity {\int_{X(t)(T)} v(t) \wedge \omega(t)} on the vortex tube is formally conserved in time.
  • (iv) Let {d=3}. If the covelocity {v} can be expressed in Clebsch variables (Exercise 31) with {k=1}, show that the local helicity {\int_{X(t)(T)} v(t) \wedge \omega(t)} formally vanishes on every vortex tube {T}. This provides an obstruction to the existence of {k=1} Clebsch variables. (On the other hand, it is easy to find Clebsch variables on {{\bf R}^d} with {k=d} for an arbitrary covelocity {v}, simply by setting {\varphi_j} equal to the coordinate functions {\varphi_j(x) = x^j}.)

Exercise 36 In the three-dimensional case {d=3}, show that the material derivative {D_t} commutes with the operation {*\omega \cdot \nabla} of differentiation along the (Hodge dual of the) vorticity.

The Cauchy vorticity formula (39) can be used to obtain an integral representation for the velocity {u} in terms of the trajectory map {X}, leading to the vorticity-stream formulation of the Euler equations. Recall from 254A Notes 3 that if one takes the divergence of the (Eulerian) vorticity {\omega}, one obtains the Laplacian of the (Eulerian) covelocity {v}:

\displaystyle  \Delta v_j = \partial^i \omega_{ij},

where {\partial^i := \eta^{ik} \partial_k} are the partial derivatives raised by the Euclidean metric. For {d > 2}, we can use the fundamental solution {\frac{-1}{(d-2)|S^{d-1}|} \frac{1}{|x|^{d-2}}} of the Laplacian (see Exercise 18 of 254A Notes 1) to conclude that (formally, at least)

\displaystyle  v_j(t,x) = \frac{-1}{(d-2)|S^{d-1}|} \int_{{\bf R}^d_E} \frac{\partial^i \omega_{ij}(t,y)}{|x-y|^{d-2}}\ d\mathrm{vol}_E(y).

Integrating by parts (after first removing a small ball around {x}, and observing that the boundary terms from this ball go to zero as one shrinks the radius to zero) one obtains the Biot-Savart law

\displaystyle  v_j(t,x) = \frac{1}{|S^{d-1}|} \int_{{\bf R}^d_E} \frac{(x^i-y^i) \omega_{ij}(t,y)}{|x-y|^{d}}\ d\mathrm{vol}_E(y)

for the covelocity, or equivalently

\displaystyle  u^k(t,x) = \frac{1}{|S^{d-1}|} \int_{{\bf R}^d_E} \frac{\eta^{jk} (x^i-y^i) \omega_{ij}(t,y)}{|x-y|^{d}}\ d\mathrm{vol}_E(y)

for the velocity.
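The identity {\Delta v_j = \partial^i \omega_{ij}} underlying this derivation can be confirmed symbolically in the two-dimensional case by writing the divergence-free covelocity in terms of a stream function (a sanity check under the assumptions {d=2} and the flat metric, so that raising indices is trivial):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
psi = sp.Function('psi')(x1, x2)  # stream function: guarantees div v = 0

v = [-sp.diff(psi, x2), sp.diff(psi, x1)]        # covelocity components v_1, v_2
omega12 = sp.diff(v[1], x1) - sp.diff(v[0], x2)  # vorticity (dv)_{12}

xs = [x1, x2]
lap = lambda h: sum(sp.diff(h, xi, 2) for xi in xs)

# omega_{ij} is antisymmetric: omega_{11} = omega_{22} = 0, omega_{21} = -omega_{12}.
omega = [[0, omega12], [-omega12, 0]]
residuals = [
    sp.simplify(lap(v[j]) - sum(sp.diff(omega[i][j], xs[i]) for i in range(2)))
    for j in range(2)
]
print(residuals)  # expect [0, 0]
```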

Exercise 37 Show that this law is also valid in the two-dimensional case {d=2}.
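As an informal numerical test of the two-dimensional Biot-Savart law, one can take the Gaussian vortex {\omega(y) = e^{-|y|^2}}, for which the exact velocity field is tangential with speed {(1-e^{-r^2})/(2r)} at radius {r}. The quadrature below (midpoint rule on a truncated domain; both choices are assumptions not made in the text) should recover this approximately:

```python
import numpy as np

# Gaussian vortex omega(y) = exp(-|y|^2).
h = 0.05
s = np.arange(-6, 6, h) + h / 2        # cell centres of [-6, 6]^2
Y1, Y2 = np.meshgrid(s, s, indexing='ij')
omega = np.exp(-(Y1**2 + Y2**2))

x = np.array([1.0, 0.0])               # evaluation point, r = 1
d1, d2 = x[0] - Y1, x[1] - Y2
r2 = d1**2 + d2**2
# 2d Biot-Savart: u(x) = (1/2pi) int (x-y)^perp omega(y) / |x-y|^2 dy,
# with (x-y)^perp = (-(x2-y2), x1-y1).
u1 = (h**2 / (2 * np.pi)) * np.sum(-d2 * omega / r2)
u2 = (h**2 / (2 * np.pi)) * np.sum(d1 * omega / r2)

exact = (1 - np.exp(-1.0)) / 2         # exact tangential speed at r = 1
print(u1, u2, exact)
```

At the point {(1,0)} the exact velocity is {(0, (1-e^{-1})/2) \approx (0, 0.316)}, and the quadrature agrees with this up to discretisation error.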

Changing to Lagrangian variables, we conclude that

\displaystyle  u^k(t,X(t,a)) = \frac{1}{|S^{d-1}|} \int_{{\bf R}^d_L}

\displaystyle  \eta^{jk} \frac{(X^i(t,a)-X^i(t,b)) \omega_{ij}(t,X(t,b))}{|X(t,a)-X(t,b)|^{d}}\ d\mathrm{vol}_L(b).

Using the Cauchy vorticity formula (39) (assuming the normalisation (4)), we obtain

\displaystyle  u^k(t,X(t,a)) = \frac{1}{|S^{d-1}|} \int_{{\bf R}^d_L}

\displaystyle \frac{(X^i(t,a)-X^i(t,b)) (\nabla X(t)^{-1})^\alpha_i(b) (\nabla X(t)^{-1})^\beta_j(b)}{|X(t,a)-X(t,b)|^{d}} \omega_{\alpha \beta}(0,b) \ d\mathrm{vol}_L(b).

Combining this with (3), we obtain an integral-differential equation for the evolution of the trajectory map:

\displaystyle  \partial_t X^k(t,a) = \frac{1}{|S^{d-1}|} \int_{{\bf R}^d_L} \ \ \ \ \ (42)

\displaystyle  \eta^{jk} \frac{(X^i(t,a)-X^i(t,b)) (\nabla X(t)^{-1})^\alpha_i(b) (\nabla X(t)^{-1})^\beta_j(b)}{|X(t,a)-X(t,b)|^{d}} \omega_{\alpha \beta}(0,b) \ d\mathrm{vol}_L(b).

This is known as the vorticity-stream formulation of the Euler equations. In two and three dimensions, the formulation can be simplified using the alternate forms of the vorticity formula in Exercise 32. While the equation (42) looks complicated, it is actually well suited for Picard-type iteration arguments (of the type used in 254A Notes 1), due to the relatively small number of derivatives on the right-hand side. Indeed, it turns out that one can iterate this equation with the trajectory map placed in function spaces such as {C^0_t C^{1,\alpha}_x( [0,T) \times {\bf R}^d_L \rightarrow {\bf R}^d_E)}; see Chapter 4 of Bertozzi-Majda for details.

Remark 38 Because of the ability to solve the Euler equations in Lagrangian coordinates by an iteration method, the local well-posedness theory is slightly stronger in some respects in Lagrangian coordinates than it is in Eulerian coordinates. For instance, in this paper of Constantin, Kukavica, and Vicol it is shown that the Lagrangian coordinate Euler equations are well-posed in Gevrey spaces, while the Eulerian coordinate Euler equations are not. It also happens that the trajectory maps {X(t,a)} are real-analytic in {t} even if the initial data is merely smooth; see for instance this paper of Constantin-Vicol-Wu and the references therein. An example of this phenomenon is given in the exercise below.

Exercise 39 (DiPerna-Majda example) Let {f: {\bf R} \rightarrow {\bf R}} and {g: {\bf R}^2 \rightarrow {\bf R}} be smooth functions.

  • (i) Show that the DiPerna-Majda flow {u: {\bf R}^3 \rightarrow {\bf R}^3} defined by

    \displaystyle  u(t,x) = (f(x_2), 0, g( x_1- t f(x_2), x_2))

    solves the three-dimensional Euler equations (with zero pressure).

  • (ii) Show that the trajectory map with initial condition (4) is given by

    \displaystyle  X(t,a) = (a_1 + t f(a_2), a_2, a_3 + t g(a_1, a_2) );

    in particular the trajectory map is analytic in the time variable {t}, even though the Eulerian velocity field {u} need not be.

  • (iii) Show that the Lagrangian covelocity field {\underline{v}} is given by

    \displaystyle  \underline{v} = f(a_2) da_1 + g(a_1,a_2) da_3 + d( \frac{t}{2} f^2(a_2) + \frac{t}{2} g^2(a_1,a_2) )

    and the Lagrangian vorticity field {\underline{\omega}} is given by

    \displaystyle  \underline{\omega} = - f'(a_2) da_1 \wedge da_2 + dg(a_1,a_2) \wedge da_3.

    In particular the Lagrangian vorticity is conserved in time (as it ought to be).
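Part (i) of the exercise can be verified mechanically with a computer algebra system. The following sympy sketch (an informal check, not a substitute for the exercise) confirms that the DiPerna-Majda field is divergence-free and satisfies the zero-pressure Euler equations:

```python
import sympy as sp

t, x1, x2, x3 = sp.symbols('t x1 x2 x3', real=True)
f = sp.Function('f')
g = sp.Function('g')

# DiPerna-Majda flow (Exercise 39(i)) with zero pressure.
u = sp.Matrix([f(x2), 0, g(x1 - t * f(x2), x2)])

xs = [x1, x2, x3]
# Euler with p = 0: d_t u + (u . nabla) u = 0 and div u = 0.
material = sp.Matrix([
    sp.diff(u[i], t) + sum(u[j] * sp.diff(u[i], xs[j]) for j in range(3))
    for i in range(3)
])
div = sum(sp.diff(u[i], xs[i]) for i in range(3))

print(sp.simplify(material))  # expect the zero vector
print(sp.simplify(div))       # expect 0
```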

Exercise 40 Show that the integral

\displaystyle \int_{{\bf R}^d_L} \eta_{ij} (X^j(t,a) - a^\alpha \partial_\alpha X^j(t,a) - \frac{d+2}{2} t \partial_t X^j(t,a))

\displaystyle  \partial_t X^i(t,a)\ d\mathrm{vol}_L(a)

is formally conserved in time. (Hint: some of the terms arising from computing the derivative are more easily treated by moving to Eulerian coordinates and performing integration by parts there, rather than in Lagrangian coordinates. One can also proceed by rewriting the terms in this integral using the Eulerian covelocity {v} and the Lagrangian covelocity {\underline{v}}.) With the normalisation (4), conclude in particular that

\displaystyle  \int_{{\bf R}^d_E} x^i v_i(t,x)\ d\mathrm{vol}_E(x) = \int_{{\bf R}^d_L} a^\alpha \underline{v}_\alpha(t,a)\ d\mathrm{vol}_L(a)

\displaystyle + \frac{d+2}{2} t \int_{{\bf R}^d_E} |u(t,x)|^2\ d\mathrm{vol}_E(x).

This conservation law is related to a scaling symmetry of the Euler equations in Lagrangian coordinates, and is due to Shankar. It does not have a local expression in purely Eulerian coordinates (mainly because of the appearance of the label coordinate {a^\alpha}).

We summarise the dictionary between Eulerian and Lagrangian coordinates in the following table: