Planet Musings

October 28, 2016

Particlebites: Saving SUSY: Interpreting the 3.3σ ATLAS Stop Excess

Article: Surviving scenario of stop decays for ATLAS l+jets+E_T^miss search
Authors: Chengcheng Han, Mihoko M. Nojiri, Michihisa Takeuchi, and Tsutomu T. Yanagida
Reference: arXiv:1609.09303v1 [hep-ph]

If you’ve been following the ongoing SUSY searches, you know that much of the phase space accessible at colliders like the LHC has been ruled out. Nevertheless, many phenomenologists are working diligently to develop alternative frameworks that maintain the compatibility of the Minimal Supersymmetric Standard Model (MSSM, one of the most compelling SUSY models) with recent experimental results. If evidence of SUSY is found at the LHC, it could help us start answering questions about naturalness, dark matter, gauge coupling unification, and other BSM questions, so it’s no wonder so many researchers are invested in keeping the SUSY search alive.

This particular paper discusses recent 13 TeV ATLAS results in the l+jets+E_T^miss channel, where an excess of up to 3.3σ above Standard Model expectations was seen in the initial 13 TeV Run 2 dataset. Although CMS hasn’t reported any significant excess in this channel, both experiments see a moderate excess in the 0-lepton channel, so there’s some strong motivation for phenomenologists to explain these preliminary results as the presence of a new particle, namely a SUSY particle.
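To get a feel for what a 3.3σ excess means, the significance can be converted into a one-sided Gaussian tail probability. This is only a minimal sketch: the function name is illustrative, and it assumes the standard one-sided convention for local significance, which is not necessarily how ATLAS computed its number.

```python
import math

def one_sided_p_value(n_sigma: float) -> float:
    """Probability that a standard normal variable fluctuates above n_sigma."""
    # Upper tail of the unit Gaussian, written via the complementary error function.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

for n_sigma in (2.0, 3.3, 5.0):
    print(f"{n_sigma:.1f} sigma -> p = {one_sided_p_value(n_sigma):.2e}")
```

Under this convention, 3.3σ corresponds to a background-only fluctuation probability of roughly 5 in 10,000, still far from the 5σ discovery threshold.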

This ATLAS search is aimed at stop (the SUSY partner of the top quark) production, where the stop then decays into a top and a neutralino (a mixed state of the higgsino and gauginos, i.e. the SUSY partners of the Higgs and the gauge bosons), and the top then decays leptonically. The stop is a particularly interesting SUSY particle, because it plays a critical role in the naturalness of SUSY models; most natural SUSY models predict a light stop and higgsino. The analysis group defined 7 (non-exclusive) potential signal regions for this search, and excesses above 2σ were seen in 3 of them: DM_low, bC2x_diag, and SR1. The selection cuts for these regions are shown in Table 1. This paper discusses models to explain the DM_low excess of 3.3σ, but similar models could be used to explain the other excesses as well. The authors sought to create models which are compatible both with these results and with the previous stop parameter limits set by ATLAS and CMS.

Table 1: Summary of the selection cuts for the 7 signal regions considered in this search

They first explored the two simplest one-step stop decays. Depending on what the Lightest Supersymmetric Particle (LSP) is, these decays can have different constraints, so they conducted a scan over the entire parameter space. There are two cases for this type of decay: (1) the LSP is a bino (the SUSY partner of the weak hypercharge boson) and the Next-to-Lightest Supersymmetric Particle (NLSP) is the stop, leading to the simple decay shown in Figure 1a, or (2) the LSP is a higgsino, which can be charged or neutral, with each possibility having a different mass, which leads to the split decay shown in Figure 1b.

Figure 1: Decay diagrams for the Bino LSP scenario (left) and Higgsino LSP scenario (right)

We can see in Figure 2 that most (or all, in the case of the higgsino) of the preferred 2σ phase space for these models is ruled out by existing constraints on stop production, so unfortunately these models aren’t particularly promising. Consequently, the authors designed an additional model essentially combining these two processes, where the LSP is a bino and the NLSP is a higgsino. This allows for both one-step and two-step decays, as shown in Figure 3. The results for this model are much more exciting; almost all of the preferred 2σ phase space lies outside of the existing constraints, as shown in Figure 4!

Figure 2: 2 sigma preferred region and exclusion limits from experiments for Bino LSP (left) and Higgsino LSP (right)

Two additional references are included in the Figure 4 plot. Three benchmark models for different combinations of higgsino and stop masses are indicated with red crosses, and all of them fall in this allowed phase space. Furthermore, direct dark matter detection limits from the LUX experiment are shown as a dashed black line. The region to the left of this line has been excluded by LUX, so one of the considered benchmark models is not allowed. The second benchmark model falls near the exclusion line, so upcoming dark matter results will play an important role in telling us if this SUSY model can actually explain the excess!

Figure 3: Decay diagram for the Bino LSP, Higgsino NLSP scenario

Figure 4: 2 sigma preferred region and exclusion limits for the Bino LSP and Higgsino NLSP model with benchmark points and LUX exclusion limit

So, SUSY hasn’t been found at the LHC, but it’s not dead yet! There are promising excesses in the current ATLAS dataset which are consistent with benchmark MSSM models with expected LSP candidates. We look forward to new data from the LHC and other experiments to tell us more!

References and further reading:

  • Stephen P. Martin, “Supersymmetry primer” (arXiv:hep-ph/9709356)
  • Sven Krippendorf, Fernando Quevedo, Oliver Schlotterer, “Cambridge Lectures on Supersymmetry and Extra Dimensions” (arXiv:1011.1491)
  • John Ellis, “Supersymmetry, Dark Matter, and the LHC” (slides)


Edinburgh Mathematical Physics Group: William Gordon Seggie-Brown Research Fellowship

The School has just advertised this year’s Seggie-Brown Postdoctoral Fellowship. The closing date is on 28th November and, for my sins, I will be chairing the committee this year. Needless to say, we (= the group) particularly encourage strong applicants in Mathematical Physics. I believe that the last mathematical physicist who was awarded this fellowship […]

October 27, 2016

Clifford Johnson: (Dry) Watercolour

All change! Last week another style change took place, in service of a new story/chapter for the book. I've transitioned to a looser style, with final line art done with a charcoal-like finish, and the colour done as watercolour. It turned out that back in March when I went and hid for a week to work on the book, I thumbnailed and roughed a lot of pages (on two stories I think?) in a pretty tight manner, and so I've decided that I'm simply going to go in and sketch the final material all by hand, with no elaborate construction work for placing backgrounds (neither analogue nor digital), no measurements, no drawing of perspective grids, etc.

This turns out to mean that I can get the pre-colour work done pretty swiftly on some pages. Rather than take this as an opportunity to sprint ahead and make up some lost time, I decided to [...]

The post (Dry) Watercolour appeared first on Asymptotia.

n-Category Café The Kan Extension Seminar Returns

In early 2014, the n-Category Café hosted the Kan Extension Seminar, a graduate reading course in category theory modeled after Daniel Kan’s eponymous reading course in algebraic topology at MIT. My experience with the seminar, described here, was overwhelmingly positive, so I am delighted to announce that we’re back. Alexander Campbell, Brendan Fong, and I are organizing “Kan II” in early 2017 and we are currently soliciting applications for seminar participants.

Our plan is to read the following eight papers between mid January and mid May:

  • Hyland and Power, The category theoretic understanding of universal algebra: Lawvere theories and monads

  • Freyd, Algebra valued functors in general and tensor products in particular

  • Beck, Distributive laws

  • Kelly, On the operads of J.P. May

  • Kelly, Structures defined by finite limits in the enriched context, I

  • Kelly, On Clubs and data-type constructors

  • Lack and Rosicky, Notions of Lawvere theory

  • Berger, Mellies, and Weber, Monads with arities and their associated theories

We are seeking eight participants who will read and engage with all of the papers as well as prepare an oral presentation on one of them, followed by a blog post to be published here, on the n-Category Café. The course will conclude with a series of short public expository lectures given, by those able to attend, on July 16 in conjunction with the 2017 International Category Theory Conference at the University of British Columbia in Vancouver.

More details, including information about how to apply, can be found on the seminar website or by contacting any of the three organizers. Applications are due November 30th. I hope you will help us spread the word by passing this message along to those who might be interested.

Scott Aaronson: May reason trump the Trump in all of us

Two years ago, when I was the target of an online shaming campaign, what helped me through it were hundreds of messages of support from friends, slight acquaintances, and strangers of every background.  I vowed then to return the favor, by standing up when I saw decent people unfairly shamed.  Today I have an opportunity to make good.

Some time ago I had the privilege of interacting a bit with Sam Altman, president of the famed startup incubator Y Combinator (and a guy who’s thanked in pretty much everything Paul Graham writes).  By way of our mutual friend, the renowned former quantum computing researcher Michael Nielsen, Sam got in touch with me to solicit suggestions for “outside-the-box” scientists and writers, for a new grant program that Y Combinator was starting. I found Sam eager to delve into the merits of any suggestion, however outlandish, and was delighted to be able to make a difference for a few talented people who needed support.

Sam has also been one of the Silicon Valley leaders who’s written most clearly and openly about the threat to America posed by Donald Trump and the need to stop him, and he’s donated tens of thousands of dollars to anti-Trump causes.  Needless to say, I supported Sam on that as well.

Now Sam is under attack on social media, and there are even calls for him to resign as the president of Y Combinator.  Like me two years ago, Sam has instantly become the corporeal embodiment of the “nerd privilege” that keeps the marginalized out of Silicon Valley.

Why? Because, despite his own emphatic anti-Trump views, Sam rejected demands to fire Peter Thiel (who has an advisory role at Y Combinator) because of Thiel’s support for Trump.  Sam explained his reasoning at some length:

[A]s repugnant as Trump is to many of us, we are not going to fire someone over his or her support of a political candidate.  As far as we know, that would be unprecedented for supporting a major party nominee, and a dangerous path to start down (of course, if Peter said some of the things Trump says himself, he would no longer be part of Y Combinator) … The way we got into a situation with Trump as a major party nominee in the first place was by not talking to people who are very different than we are … I don’t understand how 43% of the country supports Trump.  But I’d like to find out, because we have to include everyone in our path forward.  If our best ideas are to stop talking to or fire anyone who disagrees with us, we’ll be facing this whole situation again in 2020.

The usual criticism of nerds is that we might have narrow technical abilities, but we lack wisdom about human affairs.  It’s ironic, then, that it appears to have fallen to Silicon Valley nerds to guard some of the most important human wisdom our sorry species ever came across—namely, the liberal ideals of the Enlightenment.  Like Sam, I despise pretty much everything Trump stands for, and I’ve been far from silent about it: I’ve blogged, donated money, advocated vote swapping, endured anonymous comments like “kill yourself kike”—whatever seemed like it might help even infinitesimally to ensure the richly-deserved electoral thrashing that Trump mercifully seems to be headed for in a few weeks.

But I also, I confess, oppose the forces that apparently see Trump less as a global calamity to be averted, than as a golden opportunity to take down anything they don’t like that’s ever been spotted within a thousand-mile radius of Trump Tower.  (Where does this Kevin Bacon game end, anyway?  Do “six degrees of Trump” suffice to contaminate you?)

And not only do I not feel a shadow of a hint of a moral conflict here, but it seems to me that precisely the same liberal Enlightenment principles are behind both of these stances.

But I’d go yet further.  It sort of flabbergasts me when social-justice activists don’t understand that, if we condemn not only Trump, not only his supporters, but even vociferous Trump opponents who associate with Trump supporters (!), all we’ll do is feed the narrative that got Trumpism as far as it has—namely, that of a smug, bubble-encased, virtue-signalling leftist elite subject to runaway political correctness spirals.  Like, a hundred million Americans’ worldviews revolve around the fear of liberal persecution, and we’re going to change their minds by firing anyone who refuses to fire them?  As a recent Washington Post story illustrates, the opposite approach is harder but can bear spectacular results.

Now, as for Peter Thiel: three years ago, he funded a small interdisciplinary workshop on the coast of France that I attended.  With me there were a bunch of honest-to-goodness conservative Christians, a Freudian psychoanalyst, a novelist, a right-wing radio host, some scientists and Silicon Valley executives, and of course Thiel himself.  Each, I found, offered tons to disagree about but also some morsels to learn.

Thiel’s worldview, focused on the technological and organizational greatness that (in his view) Western civilization used to have and has subsequently lost, was a bit too dark and pessimistic for me, and I’m a pretty dark and pessimistic person.  Thiel gave a complicated, meandering lecture that involved comparing modern narratives about Silicon Valley entrepreneurs against myths of gods, heroes, and martyrs throughout history, such as Romulus and Remus (the legendary founders of Rome).  The talk might have made more sense to Thiel than to his listeners.

At the same time, Thiel’s range of knowledge and curiosity was pretty awesome.  He avidly followed all the talks (including mine, on P vs. NP and quantum complexity theory) and asked pertinent questions. When the conversation turned to D-Wave, and Thiel’s own decision not to invest in it, he laid out the conclusions he’d come to from an extremely quick look at the question, then quizzed me as to whether he’d gotten anything wrong.  He hadn’t.

From that conversation among others, I formed the impression that Thiel’s success as an investor is, at least in part, down neither to luck nor to connections, but to a module in his brain that most people lack, which makes blazingly fast and accurate judgments about tech startups.  No wonder Y Combinator would want to keep him as an adviser.

But, OK, I’m so used to the same person being spectacularly right on some things and spectacularly wrong on others, that it no longer causes even slight cognitive dissonance.  You just take the issues one by one.

I was happy, on balance, when it came out that Thiel had financed the lawsuit that brought down Gawker Media.  Gawker really had used its power to bully the innocent, and it had broken the law to do it.  And if it’s an unaccountable, anti-egalitarian, billionaire Godzilla against a vicious, privacy-violating, nerd-baiting King Kong—well then, I guess I’m with Godzilla.

More recently, I was appalled when Thiel spoke at the Republican convention, pandering to the crowd with Fox-News-style attack lines that were unworthy of a mind of his caliber.  I lost a lot of respect for Thiel that day.  But that’s the thing: unlike with literally every other speaker at the GOP convention, my respect for Thiel had started from a point that made a decrease possible.

I reject huge parts of Thiel’s worldview.  I also reject any worldview that would threaten me with ostracism for talking to Thiel, attending a workshop he sponsors, or saying anything good about him.  This is not actually a difficult balance.

Today, when it sometimes seems like much of the world has united in salivating for a cataclysmic showdown between whites and non-whites, Christians and Muslims, “dudebros” and feminists, etc., and that the salivators differ mostly just in who they want to see victorious in the coming battle and who humiliated, it can feel lonely to stick up for naïve, outdated values like the free exchange of ideas, friendly disagreement, the presumption of innocence, and the primacy of the individual over the tribe.  But those are the values that took us all the way from a bronze spear through the enemy’s heart to a snarky rebuttal on the arXiv, and they’ll continue to build anything worth building.

And now to watch the third debate (I’ll check the comments afterward)…

Update (Oct. 20): See also this post from a blog called TheMoneyIllusion. My favorite excerpt:

So let’s see. Not only should Trump be shunned for his appalling political views, an otherwise highly respected Silicon Valley entrepreneur who just happens to support Trump (along with 80 million other Americans) should also be shunned. And a person who despises Trump and works against him but who defends Thiel’s right to his own political views should also resign. Does that mean I should be shunned too? After all, I’m a guy who hates Trump, writing a post that defends a guy who hates Trump, who wrote a post defending a guy’s freedom to support Trump, who in turn supports Trump. And suppose my mother sticks up for me? Should she also be shunned?

It’s almost enough to make me vote . . . no, just kidding.

Question … Which people on the left are beyond the pale? Suppose Thiel had supported Hugo Chavez? How about Castro? Mao? Pol Pot? Perhaps the degrees of separation could be calibrated to the awfulness of the left-winger:

Chavez: One degree of separation. (Corbyn, Sean Penn, etc.)

Castro: Two degrees of separation is still toxic.

Lenin: Three degrees of separation.

Mao: Four degrees of separation.

Pol Pot: Five degrees of separation.

Robert Helling: Daylight saving time about to end (end it shouldn't)

Twice a year, around the last Sunday in March and the last Sunday in October, everybody (in particular newspaper journalists) takes a few minutes off to rant about daylight saving time. So, for the first time, I want to join this tradition in writing.

Until I had kids, I could not have cared less about the question of changing the time twice a year. But at least for our kids (and then secondarily also for myself), I realize that biorhythm is quite strong and it takes more than a week to adapt to the 1 hour jet lag (in particular in spring, when it means getting out of bed "one hour earlier"). I still don't really care about cows that have to deliver their milk at different times, since there is no intrinsic reason that the clock on the wall has to show a particular time when it is done, and if it were really a problem, the farmers could do it at fixed UTC.

So, obviously, it is a nuisance. So what are the benefits that justify it? Well, obviously, in summer the sun sets at a later hour and we get more sun when being outside in the summer. That sounds reasonable. But why restrict it to the summer?

Which brings me to my point: If you ask me, I want to get rid of changing the offset to UTC twice a year and want to permanently adopt daylight saving time.

But now I hear people cry that this is "unnatural": we have to have the traditional time at least in the winter, when it does not matter as it's too cold to be outside (which only holds for people with defective clothing, as we know). So how natural is CET (the time zone we set our clocks to in winter)? Let's take people living in Munich as an example.

First of all: It is not solar time! CET is the "mean solar time" when you live at a longitude of 15 degrees east, which is (assuming the same latitude) close to Neumarkt an der Ybbs, somewhere in Austria not too far from Vienna. Munich is about 14 minutes behind. So, this time is artificial as well, and Berlin being closer to 15 degrees, it is probably Prussian.
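The size of this lag follows from the Earth turning 360 degrees in 24 hours, i.e. 4 minutes of mean solar time per degree of longitude. A minimal sketch, in which the city longitudes are approximate values assumed for illustration and the function name is made up:

```python
MINUTES_PER_DEGREE = 24 * 60 / 360  # the Earth turns one degree every 4 minutes

def solar_lag_minutes(zone_meridian_deg_east: float, longitude_deg_east: float) -> float:
    """Minutes by which local mean solar time lags behind the zone's clock time."""
    return (zone_meridian_deg_east - longitude_deg_east) * MINUTES_PER_DEGREE

CET_MERIDIAN = 15.0  # degrees east, the reference meridian of CET

# Approximate longitudes in degrees east (assumed values, not from the post).
cities = {"Munich": 11.58, "Berlin": 13.40}

for name, longitude in cities.items():
    lag = solar_lag_minutes(CET_MERIDIAN, longitude)
    print(f"{name}: mean solar time is {lag:.1f} minutes behind CET")
```

With these assumed longitudes, Berlin comes out at roughly half of Munich's lag, in line with the remark that Berlin sits closer to the 15 degree meridian.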

Also, a common time zone for Germany was established only in the 1870s, when the advent of railways and telegraphs made synchronization between different local times advantageous. So this "natural" time is not that old either.

It is so new that Christ Church college in Oxford still refuses to fully bow to it: Their clock tower shows Greenwich time. And the cathedral services start according to solar time (about five minutes later) because they don't care about modern shenanigans. ("How many Oxford deans does it take to change a light bulb?" ---- "Change??!??"). Similarly, in Bristol, there is a famous clock with two minute hands.

Plus, even if you live in Neumarkt an der Ybbs, your sundial does not always show the correct noon! Thanks to the tilt of the Earth's axis and the fact that the orbit of the Earth is elliptic, this varies through the year by a number of minutes.
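This seasonal drift between sundial time and mean solar time is known as the equation of time. The sketch below uses a common textbook approximation (accurate to roughly a minute), not an exact ephemeris; the function name is illustrative, and the sign convention assumed here is that positive values mean the sundial runs ahead of the clock.

```python
import math

def equation_of_time_minutes(day_of_year: int) -> float:
    """Approximate sundial time minus mean solar time, in minutes."""
    # Angle that advances through one full cycle per year, offset to late March.
    b = math.radians(360.0 / 365.0 * (day_of_year - 81))
    # The sin(2b) term mainly reflects the axial tilt, the other two the elliptic orbit.
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

values = {day: equation_of_time_minutes(day) for day in range(1, 366)}
earliest = max(values, key=values.get)
latest = min(values, key=values.get)
print(f"sundial furthest ahead: day {earliest}, {values[earliest]:+.1f} min")
print(f"sundial furthest behind: day {latest}, {values[latest]:+.1f} min")
```

In this approximation the sundial is furthest ahead around early November and furthest behind around mid February, with a swing of roughly a quarter of an hour in each direction.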

So, "winter time" is in no way more natural than the other time zone. So we should be free to choose a time zone according to what is convenient. At least for me, noon is not the center of my waking hours (it's more 5,5 : 12). So, aligning those more with the sun seems to be a pretty good idea.

PS: The title was a typo, but looking at it I prefer it the way it is...

Doug Natelson: Rice has an opening for Dean of Engineering

October 26, 2016

n-Category Café Higher Structures Journal

Michael Batanin, Ralph Kaufmann and Martin Markl are the editors of a new diamond open access journal called Higher Structures. The managing editor is Mark Weber, and here’s the editorial board:

Clemens Berger, Université Nice-Sophia Antipolis

Vladimir Dotsenko, Trinity College Dublin, the University of Dublin

Tobias Dyckerhoff, Hausdorff Center for Mathematics

Benoit Fresse, Université de Lille

Richard Garner, Macquarie University

André Henriques, Universiteit Utrecht

Joachim Kock, Universitat Autònoma de Barcelona

Stephen Lack, Macquarie University

Andrey Lazarev, Lancaster University

Muriel Livernet, Université Paris Diderot

Michael Makkai, McGill University

Yuri Manin, Max Planck Institute for Mathematics

Ieke Moerdijk, Universiteit Utrecht

Amnon Neeman, Australian National University

Maria Ofelia Ronco, Universidad de Talca

Jiří Rosický, Masaryk University

James Stasheff, University of Pennsylvania

Ross Street, Macquarie University

Bertrand Toën, Université de Toulouse

Boris Tsygan, Northwestern University

Bruno Vallette, Université Paris 13

Michel Van den Bergh, Universiteit Hasselt

Alexander Voronov, University of Minnesota

Here are some of their policies:

Focus and Scope

This journal publishes articles that make significant new contributions to mathematical science using higher structures, or that significantly advance our understanding of the foundational aspects of the theory of such structures. The scope of the journal includes: higher categories, operads and their generalisations, and applications of these to Algebra, Geometry, Topology, Combinatorics, Logic and Mathematical Physics.

Peer Review Process

Articles appearing in the journal have been carefully and critically refereed under the responsibility of members of the Editorial Board. Only papers judged to be both significant and excellent are accepted for publication.

Open Access Policy

This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge. Accepted articles will be licensed under a Creative Commons Attribution 4.0 International License.


This journal utilizes the LOCKSS system to create a distributed archiving system among participating libraries and permits those libraries to create permanent archives of the journal for purposes of preservation and restoration.

Matt Strassler: An Interesting Result from CMS, and its Implications

UPDATE 10/26: In the original version of this post, I stupidly forgot to include an effect, causing an error of a factor of about 5 in one of my estimates below. I had originally suggested that a recent result using ALEPH data was probably more powerful than a recent CMS result. But once the error is corrected, the two experiments appear to have comparable sensitivity. However, I was very conservative in my analysis of ALEPH, and my guess concerning CMS has a big uncertainty band, so it might go either way. It’s up to ALEPH experts and CMS experts to show us who really wins the day. Added reasoning and discussion marked in green below.

In Friday’s post, I highlighted the importance of looking for low-mass particles whose interactions with known particles are very weak. I referred to a recent preprint in which an experimental physicist, Dr. Arno Heister, reanalyzed ALEPH data in such a search.

A few hours later, Harvard Professor Matt Reece pointed me to a paper that appeared just two weeks ago: a very interesting CMS analysis of 2011-2012 data that did a search of this type — although it appears that CMS [one of the two general purpose detectors at the Large Hadron Collider (LHC)] didn’t think of it that way.

The title of the paper is obscure:  “Search for a light pseudo–scalar Higgs boson produced in association with bottom quarks in pp collisions at 8 TeV“.  Such spin-zero “pseudo-scalar” particles, which often arise in speculative models with more than one Higgs particle, usually decay to bottom quark/anti-quark pairs or tau/anti-tau pairs.  But they can have a very rare decay to muon/anti-muon, which is much easier to measure. The title of the paper gives no indication that the muon/anti-muon channel is the target of the search; you have to read the abstract. Shouldn’t the words “in the dimuon channel” or “dimuon resonance” appear in the title?  That would help researchers who are interested in dimuons, but not in pseudo-scalars, find the paper.

Here’s the main result of the paper:

At left is shown a plot of the number of events as a function of the invariant mass of the muon/anti-muon pairs.  CMS data is in black dots; estimated background is shown in the upper curve (with top quark backgrounds in the lower curve); and the peak at bottom shows what a simulated particle decaying to muon/anti-muon with a mass of 30 GeV/c² would look like. (Imagine sticking the peak on top of the upper curve to see how a signal would affect the data points).  At right are the resulting limits on the rate for such a resonance to be produced and then decay to muon/anti-muon, if it is radiated off of a bottom quark. [A limit of 100 femtobarns means that at most two thousand collisions of this type could have occurred during the year 2012.  But note that only about 1 in 100 of these collisions would have been observed, due to the difficulty of triggering on these collisions and some other challenges.]
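The conversion in the bracketed note is just the cross-section limit multiplied by the integrated luminosity. A minimal sketch, assuming roughly 20 inverse femtobarns for the 2012 CMS dataset (a round figure chosen here for illustration, not taken from the post) and the "1 in 100" efficiency quoted above:

```python
# Upper limit on cross-section times branching fraction, from the plot at right.
limit_fb = 100.0

# ASSUMPTION: integrated luminosity of the 2012 (8 TeV) CMS dataset, in fb^-1.
integrated_luminosity_inv_fb = 20.0

# "About 1 in 100" of these collisions would actually have been observed.
observation_efficiency = 0.01

max_produced = limit_fb * integrated_luminosity_inv_fb
max_observed = max_produced * observation_efficiency

print(f"at most {max_produced:.0f} such collisions produced")
print(f"of which about {max_observed:.0f} would have been observed")
```

A limit that is a few times weaker at other masses scales the allowed number of produced events by the same factor.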

[Note also the restriction of the mass of the dimuon pair to the range 25 GeV to 60 GeV. This may have been done purely for technical reasons, but if it was due to the theoretical assumptions, that restriction should be lifted.]

While this plot places moderate limits on spin-zero particles produced with a bottom quark, it’s equally interesting, at least to me, in other contexts. Specifically, it puts limits on any light spin-one particle (call it V) that mixes (either via kinetic or mass mixing) with the photon and Z and often comes along with at least one bottom quark… because for such particles the rate to decay to muons is not rare.  This is very interesting for hidden valley models specifically; as I mentioned on Friday, new spin-one and spin-zero particles often are produced together, giving a muon/anti-muon pair along with one or more bottom quark/anti-quark pairs.

But CMS interpreted its measurement only in terms of radiation of a new particle off a bottom quark. Now, what if a V particle decaying sometimes to muon/anti-muon were produced in a Z particle decay (a possibility alluded to already in 2006)? For a different production process, the angles and energies of the particles would be different, and since many events would be lost (due to triggering, transverse momentum cuts, and b-tagging inefficiencies at low transverse momentum) the limits would have to be fully recalculated by the experimenters. It would be great if CMS could add such an analysis before they publish this paper.

Still, we can make a rough back-of-the-envelope estimate, with big caveats. The LHC produced about 600 million Z particles at CMS in 2012. The plot at right tells us that if the V were radiated off a bottom quark, the maximum number of produced V’s decaying to muons would be about 2000 to 8000, depending on the V mass. Now if we could take those numbers directly, we’d conclude that the fraction of Z’s that could decay to muon/anti-muon plus bottom quarks in this way would be 3 to 12 per million. But the sensitivity of this search to a Z decay to V is probably much less than for a V radiated off bottom quarks [because (depending on the V mass) either the bottom quarks in the Z decay would be less energetic and more difficult to tag, or the muons are less energetic on average, or both.] So I’m guessing that the limits on Z decays to V are always worse than one per hundred thousand, for any V mass. (Thanks to Wei Xue for catching an error as I was finalizing my estimate.)
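The naive step of this estimate is a simple ratio, sketched below with the numbers quoted in the paragraph. The variable names are illustrative, and the sketch deliberately ignores the loss of sensitivity for Z decays discussed above, so it is an optimistic bound rather than a real limit.

```python
# Numbers quoted in the text.
z_produced = 600e6                # Z particles produced at CMS in 2012
max_v_to_muons = (2000.0, 8000.0)  # allowed V -> muon pairs, depending on the V mass

# Naive fraction of Z decays that could contain a V -> muon pair, per million Z's.
fractions_per_million = [n / z_produced * 1e6 for n in max_v_to_muons]

for n, frac in zip(max_v_to_muons, fractions_per_million):
    print(f"{n:.0f} events -> {frac:.1f} per million Z decays")
```

The two ends come out near 3 and 13 per million, consistent with the rough range in the text; the guessed limit of one per hundred thousand (10 per million) then reflects the expected loss of sensitivity.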

If that guess/estimate is correct, then the CMS search does not rule out the possibility of a hundred or so Z decays to V particles at each of the various LEP experiments.  That said, old LEP searches might rule this possibility out; if anyone knows of such a search, please comment or contact me.

As for whether Heister’s analysis of the ALEPH experiment’s data shows signs of such a signal, I think it unlikely (though some people seemed to read my post as saying the opposite.)  As I pointed out in Friday’s post, not only is the excess too small for excitement on its own, it also is somewhat too wide and its angular correlations look like the background (which comes, of course, from bottom quarks that decay to charm quarks plus a muon and neutrino.)  The point of Friday’s post, and of today’s, is that we should be looking.

In fact, because of Heister’s work (which, by the way, is his own, not endorsed by the ALEPH collaboration), we can draw interesting if rough conclusions. Ignore for now the bump at 30 GeV/c²; that’s more controversial. What about the absence of a bump between 35 and 50 GeV/c²? Unless there are subtleties with his analysis that I don’t understand, we learn that at ALEPH there were fewer than ten Z decays to a V particle (plus a source of bottom quarks) for V in this mass range. That limits such Z decays to about 2 to 3 per million. OOPS: Dumb mistake!! At this step, I forgot to include the fact that requiring bottom quarks in the ALEPH events only works about 20% of the time (thanks to Imperial College Professor Oliver Buchmuller for questioning my reasoning!) The real number is therefore about 5 times larger, more like 10 to 15 per million. If that rough estimate is correct, it would provide a constraint roughly comparable to the current CMS analysis.
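The corrected ALEPH estimate can be reproduced in a few lines. This is a sketch under one loud assumption: the size of the ALEPH Z sample, taken here as about 4 million, is inferred from the "fewer than ten" and "2 to 3 per million" figures above rather than stated in the post.

```python
# ASSUMPTION: ALEPH Z sample size, inferred from the numbers in the text.
aleph_z_decays = 4e6

max_signal_events = 10   # "fewer than ten" Z decays to V in this mass range
b_requirement_eff = 0.20  # requiring bottom quarks works about 20% of the time

# Original (mistaken) estimate: ignores the bottom-quark requirement.
naive_per_million = max_signal_events / aleph_z_decays * 1e6

# Corrected estimate: only ~20% of true signal events would have passed.
corrected_per_million = naive_per_million / b_requirement_eff

print(f"naive limit:     {naive_per_million:.1f} per million Z decays")
print(f"corrected limit: {corrected_per_million:.1f} per million Z decays")
```

Dividing by the efficiency is what produces the factor of 5 between the two estimates.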

[[BUT: In my original argument I was very conservative.  When I said “fewer than 10”, I was trying to be brief; really, looking at the invariant mass plot, the allowed numbers of excess events for a V with mass above 36 GeV is typically fewer than 7 or even 5.  And that doesn’t include any angular information, which for many signals would reduce the numbers to 3.   Including these effects properly brings the ALEPH bound back down to something close to my initial estimate.  Anyway, it’s clear that CMS is nipping at ALEPH’s heels, but I’m still betting they haven’t passed ALEPH — yet.]]

So my advice would be to set Heister’s bump aside and instead focus on the constraints that one can obtain, and the potential discoveries that one could make, with this type of analysis, either at LEP or at LHC. That’s where I think the real lesson lies.


ParticlebitesHorton Hears a Sterile Neutrino?

Article: Limits on Active to Sterile Neutrino Oscillations from Disappearance Searches in the MINOS, Daya Bay, and Bugey-3 Experiments
Authors:  Daya Bay and MINOS collaborations
Reference: arXiv:1607.01177v4

So far, the hunt for sterile neutrinos has come up empty. Could a joint analysis between MINOS, Daya Bay and Bugey-3 data hint at their existence?

Neutrinos, like the beloved Whos in Dr. Seuss’ “Horton Hears a Who!,” are light and elusive, yet have a large impact on the universe we live in. While neutrinos only interact with matter through the weak nuclear force and gravity, they played a critical role in the formation of the early universe. Neutrino physics is now an exciting line of research pursued by the Hortons of particle physics, cosmology, and astrophysics alike. While most of what we currently know about neutrinos is well described by a three-flavor neutrino model, a few inconsistent experimental results such as those from the Liquid Scintillator Neutrino Detector (LSND) and the Mini Booster Neutrino Experiment (MiniBooNE) hint at the presence of a new kind of neutrino that only interacts with matter through gravity. If this “sterile” kind of neutrino does in fact exist, it might also have played an important role in the evolution of our universe.


The three known neutrinos come in three flavors: electron, muon, or tau. The discovery of neutrino oscillation by the Sudbury Neutrino Observatory and the Super-Kamiokande Observatory, which won the 2015 Nobel Prize, proved that one flavor of neutrino can transform into another. This led to the realization that each neutrino mass state is a superposition of the three different neutrino flavor states. From neutrino oscillation measurements, most of the parameters that define the mixing between neutrino states are well known for the three standard neutrinos.

The relationship between the three known neutrino flavor states and mass states is usually expressed as a 3×3 matrix known as the PMNS matrix, for Bruno Pontecorvo, Ziro Maki, Masami Nakagawa and Shoichi Sakata. The PMNS matrix includes three mixing angles, the values of which determine “how much” of each neutrino flavor state is in each mass state. The distance required for one neutrino flavor to become another, the neutrino oscillation wavelength, is determined by the neutrino energy and by the difference between the squared masses of the two mass states involved. The values of mass splittings m_2^2-m_1^2 and m_3^2-m_2^2 are known to good precision.

A fourth flavor? Adding a sterile neutrino to the mix

A “sterile” neutrino is referred to as such because it would not interact weakly: it would only interact through the gravitational force. Neutrino oscillations involving the hypothetical sterile neutrino can be understood using a “four-flavor model,” which introduces a fourth neutrino mass state, m_4, heavier than the three known “active” mass states. This fourth neutrino state would be mostly sterile, with only a small contribution from a mixture of the three known neutrino flavors. If the sterile neutrino exists, it should be possible to experimentally observe neutrino oscillations with a wavelength set by the difference between m_4^2 and the square of the mass of another known neutrino mass state. Current observations suggest a squared mass difference in the range of 0.1-10 eV^2.

Oscillations between active and sterile states would result in the disappearance of muon (anti)neutrinos and electron (anti)neutrinos. In a disappearance experiment, you know how many neutrinos of a specific type you produce, and you count the number of that type of neutrino a distance away, and find that some of the neutrinos have “disappeared,” or in other words, oscillated into a different type of neutrino that you are not detecting.
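To make the idea of disappearance concrete, here is a minimal sketch using the standard two-flavor approximation, in which the survival probability is 1 - sin^2(2θ) sin^2(1.267 Δm^2 L / E), with Δm^2 in eV^2, L in km and E in GeV. The two-flavor simplification, the mixing value and the energy below are assumptions chosen for illustration; the joint analysis itself uses a full four-flavor model.

```python
import math


def survival_probability(sin2_2theta, dm2_ev2, length_km, energy_gev):
    """Two-flavor survival probability: 1 - sin^2(2θ) sin^2(1.267 Δm² L / E)."""
    phase = 1.267 * dm2_ev2 * length_km / energy_gev
    return 1.0 - sin2_2theta * math.sin(phase) ** 2


def oscillation_length_km(dm2_ev2, energy_gev):
    """Distance over which the probability completes one full oscillation."""
    return math.pi * energy_gev / (1.267 * dm2_ev2)


# Hypothetical illustration values, not taken from the paper.
sin2_2theta = 0.1
energy_gev = 3.0

for dm2 in (0.1, 1.0, 10.0):  # the range of splittings mentioned in the text
    length = oscillation_length_km(dm2, energy_gev)
    # Half an oscillation length is where the disappearance is largest.
    p_min = survival_probability(sin2_2theta, dm2, length / 2, energy_gev)
    print(f"dm2 = {dm2:5.1f} eV^2: wavelength = {length:7.2f} km, "
          f"minimum survival = {p_min:.2f}")
```

Larger mass splittings give shorter oscillation wavelengths, which is why searches for eV-scale sterile neutrinos care about comparatively short travel distances.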

A joint analysis by the MINOS and Daya Bay collaborations

The MINOS and Daya Bay collaborations have conducted a joint analysis to combine independent measurements of muon (anti)neutrino disappearance by MINOS and electron antineutrino disappearance by Daya Bay and Bugey-3. Here’s a breakdown of the involved experiments:

  • MINOS, the Main Injector Neutrino Oscillation Search: A long-baseline neutrino experiment with detectors at Fermilab and northern Minnesota that use an accelerator at Fermilab as the neutrino source
  • The Daya Bay Reactor Neutrino Experiment: Uses antineutrinos produced by the reactors of China’s Daya Bay Nuclear Power Plant and the Ling Ao Nuclear Power Plant
  • The Bugey-3 experiment: Performed in the early 1990s, used antineutrinos from the Bugey Nuclear Power Plant in France for its neutrino oscillation observations

MINOS and Daya Bay/Bugey-3 combined 90% confidence level limits (in red) compared to the LSND and MiniBooNE 90% confidence level allowed regions (in green/purple). Plots the mass splitting between mass states 1 and 4 (corresponding to the sterile neutrino) against a function of the \mu-e mixing angle, which is equivalent to a function involving the 1-4 and 2-4 mixing angles. Regions of parameter space to the right of the red contour are excluded, counting out the majority of the LSND/MiniBooNE allowed regions. Source: arXiv:1607.01177v4.

Assuming a four-flavor model, the MINOS and Daya Bay collaborations put new constraints on the value of the mixing angle \theta_{\mu e}, the parameter controlling electron (anti)neutrino appearance in experiments with short neutrino travel distances. As for the hypothetical sterile neutrino? The analysis excluded the parameter space allowed by the LSND and MiniBooNE appearance-based indications for the existence of light sterile neutrinos for \Delta m_{41}^2 < 0.8 eV^2 at a 95% confidence level. In other words, the MINOS and Daya Bay analysis essentially rules out the LSND and MiniBooNE inconsistencies that allowed for the presence of a sterile neutrino in the first place. These results illustrate just how at odds disappearance searches and appearance searches are when it comes to providing insight into the existence of light sterile neutrinos. If the Whos exist, they will need to be a little louder in order for the world to hear them.



n-Category Café Linear Algebraic Groups (Part 3)

This time we touch on some other aspects of algebraic group theory, again using the example of projective geometry. We describe the decomposition of projective space into ‘Bruhat cells’. These let us count the points of projective spaces over finite fields, which gets us a wee bit deeper into the fascinating and somewhat mysterious topic of ‘q-mathematics’.

As before, you can read John Simanyi’s wonderful notes in LaTeX. If you find mistakes, please let me know.

  • Lecture 3 (Sept. 29) - The Schubert decomposition of k\mathrm{P}^n into Bruhat cells. Examples: the real projective line \mathbb{R}\mathrm{P}^1, the complex projective line \mathbb{C}\mathrm{P}^1 and the real projective plane \mathbb{R}\mathrm{P}^2. Projective geometry over finite fields: for any prime power q, there is a field \mathbb{F}_q with q elements, and the cardinality of \mathbb{F}_q\mathrm{P}^n is the q-integer [n+1]_q = 1 + q + \cdots + q^n.
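As a small sanity check on the point count, here is a brute-force sketch that lists the points of projective space over a prime field and groups them into cells by the position of the leading coordinate. It assumes q = p is prime, so that the integers mod p form the field; general prime powers would need genuine finite-field arithmetic. The function names are invented for this illustration.

```python
from itertools import product


def projective_points(p, n):
    """Points of P^n over F_p (p prime), one normalized representative each.

    A point is a nonzero vector in F_p^(n+1) up to scaling; we keep the
    representative whose first nonzero coordinate equals 1.
    """
    points = []
    for v in product(range(p), repeat=n + 1):
        first_nonzero = next((x for x in v if x != 0), None)
        if first_nonzero == 1:
            points.append(v)
    return points


def cell_sizes(p, n):
    """Number of points in each cell, indexed by the cell's dimension."""
    sizes = [0] * (n + 1)
    for v in projective_points(p, n):
        leading = v.index(1)       # position of the leading 1
        sizes[n - leading] += 1    # the coordinates after it are free
    return sizes


for p, n in [(2, 2), (3, 2), (5, 3)]:
    sizes = cell_sizes(p, n)
    print(f"p={p}, n={n}: cells {sizes}, total {sum(sizes)}")
```

There is one cell of each dimension from 0 to n, and the cell of dimension d contains p^d points, so the total is 1 + p + ... + p^n.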

Scott AaronsonMy 5-minute quantum computing talk at the White House

(OK, technically it was in the Eisenhower Executive Office Building, which is not exactly the White House itself, but is adjacent to the West Wing in the White House complex.  And President Obama wasn’t there—maybe, like Justin Trudeau, he already knows everything about quantum computing?  But lots of people from the Office of Science and Technology Policy were!  And some of us talked with Valerie Jarrett, Obama’s adviser, when she passed us on her way to the West Wing.

The occasion was a Quantum Information Science policy workshop that OSTP held, and which the White House explicitly gave us permission to discuss on social media.  Indeed, John Preskill already tweeted photos from the event.  Besides me and Preskill, others in attendance included Umesh Vazirani, Seth Lloyd, Yaoyun Shi, Rob Schoelkopf, Krysta Svore, Hartmut Neven, Stephen Jordan…

I don’t know whether this is the first time that the polynomial hierarchy, or the notion of variation distance, were ever invoked in a speech at the White House.  But in any case, I was proud to receive a box of Hershey Kisses bearing the presidential seal.  I thought of not eating them, but then I got hungry, and realized that I can simply refill the box later if desired.

For regular readers of Shtetl-Optimized, my talk won’t have all that much that’s new, but in any case it’s short.

Incidentally, during the workshop, a guy from OSTP told me that, when he and others at the White House were asked to prepare materials about quantum computing, posts on Shtetl-Optimized (such as Shor I’ll Do It) were a huge help.  Honored though I was to have “served my country,” I winced, thinking about all the puerile doofosities I might’ve self-censored had I had any idea who might read them.  I didn’t dare ask whether anyone at the White House also reads the comment sections!

Thanks so much to all the other participants and to the organizers for a great workshop.  –SA)

Quantum Supremacy

by Scott Aaronson (UT Austin)

October 18, 2016

Thank you; it’s great to be here.  There are lots of directions that excite me enormously right now in quantum computing theory, which is what I work on.  For example, there’s the use of quantum computing to get new insight into classical computation, into condensed matter physics, and recently, even into the black hole information problem.

But since I have five minutes, I wanted to talk here about one particular direction—one that, like nothing else that I know of, bridges theory and experiment in the service of what we hope will be a spectacular result in the near future.  This direction is what’s known as “Quantum Supremacy”—John [Preskill], did you help popularize that term?  [John nods yes]—although some people have been backing away from the term recently, because of the campaign of one of the possible future occupants of this here complex.

But what quantum supremacy means to me, is demonstrating a quantum speedup for some task as confidently as possible.  Notice that I didn’t say a useful task!  I like to say that for me, the #1 application of quantum computing—more than codebreaking, machine learning, or even quantum simulation—is just disproving the people who say quantum computing is impossible!  So, quantum supremacy targets that application.

What is important for quantum supremacy is that we solve a clearly defined problem, with some relationship between inputs and outputs that’s independent of whatever hardware we’re using to solve the problem.  That’s part of why it doesn’t cut it to point to some complicated, hard-to-simulate molecule and say “aha!  quantum supremacy!”

One discovery, which I and others stumbled on 7 or 8 years ago, is that quantum supremacy seems to become much easier to demonstrate if we switch from problems with a single valid output to sampling problems: that is, problems of sampling exactly or approximately from some specified probability distribution.

Doing this has two advantages.  First, we no longer need a full, fault-tolerant quantum computer—in fact, very rudimentary types of quantum hardware appear to suffice.  Second, we can design sampling problems for which we can arguably be more confident that they really are hard for a classical computer, than we are that (say) factoring is classically hard.  I like to say that a fast classical factoring algorithm might collapse the world’s electronic commerce, but as far as we know, it wouldn’t collapse the polynomial hierarchy!  But with sampling problems, at least with exact sampling, we can often show the latter implication, which is about the best evidence you can possibly get for such a problem being hard in the present state of mathematics.

One example of these sampling tasks that we think are classically hard is BosonSampling, which Alex Arkhipov and I proposed in 2011.  BosonSampling uses a bunch of identical photons that are sent through a network of beamsplitters, then measured to count the number of photons in each output mode.  Over the past few years, this proposal has been experimentally demonstrated by quantum optics groups around the world, with the current record being a 6-photon demonstration by the O’Brien group in Bristol, UK.  A second example is the IQP (“Instantaneous Quantum Polynomial-Time”) or Commuting Hamiltonians model of Bremner, Jozsa, and Shepherd.

A third example—no doubt the simplest—is just to sample from the output distribution of a random quantum circuit, let’s say on a 2D square lattice of qubits with nearest-neighbor interactions.  Notably, this last task is one that the Martinis group at Google is working toward achieving right now, with 40-50 qubits.  They say that they’ll achieve it in as little as one or two years, which translated from experimental jargon, means maybe five years?  But not infinity years.

The challenges on the experimental side are clear: get enough qubits with long enough coherence times to achieve this.  But there are also some huge theoretical challenges remaining.

A first is, can we still solve classically hard sampling problems even in the presence of realistic experimental imperfections?  Arkhipov and I already thought about that problem—in particular, about sampling from a distribution that’s merely close in variation distance to the BosonSampling one—and got results that admittedly weren’t as satisfactory as the results for exact sampling.  But I’m delighted to say that, just within the last month or two, there have been some excellent new papers on the arXiv that tackle exactly this question, with both positive and negative results.

A second theoretical challenge is, how do we verify the results of a quantum supremacy experiment?  Note that, as far as we know today, verification could itself require classical exponential time.  But that’s not the showstopper that some people think, since we could target the “sweet spot” of 40-50 qubits, where classical verification is difficult (and in particular, clearly “costlier” than running the experiment itself), but also far from impossible with cluster computing resources.

If I have any policy advice, it’s this: recognize that a clear demonstration of quantum supremacy is at least as big a deal as (say) the discovery of the Higgs boson.  After this scientific milestone is achieved, I predict that the whole discussion of commercial applications of quantum computing will shift to a new plane, much like the Manhattan Project shifted to a new plane after Fermi built his pile under the Chicago stadium in 1942.  In other words: at this point, the most “applied” thing to do might be to set applications aside temporarily, and just achieve this quantum supremacy milestone—i.e., build the quantum computing Fermi pile—and thereby show the world that quantum computing speedups are a reality.  Thank you.

October 25, 2016

Terence TaoMath 246A, Notes 3: Cauchy’s theorem and its consequences

We now come to perhaps the most central theorem in complex analysis (save possibly for the fundamental theorem of calculus), namely Cauchy’s theorem, which allows one to compute (or at least transform) a large number of contour integrals {\int_\gamma f(z)\ dz} even without knowing any explicit antiderivative of {f}. There are many forms and variants of Cauchy’s theorem. To give one such version, we need the basic topological notion of a homotopy:

Definition 1 (Homotopy) Let {U} be an open subset of {{\bf C}}, and let {\gamma_0: [a,b] \rightarrow U}, {\gamma_1: [a,b] \rightarrow U} be two curves in {U}.

  • (i) If {\gamma_0, \gamma_1} have the same initial point {z_0} and final point {z_1}, we say that {\gamma_0} and {\gamma_1} are homotopic with fixed endpoints in {U} if there exists a continuous map {\gamma: [0,1] \times [a,b] \rightarrow U} such that {\gamma(0,t) = \gamma_0(t)} and {\gamma(1,t) = \gamma_1(t)} for all {t \in [a,b]}, and such that {\gamma(s,a) = z_0} and {\gamma(s,b) = z_1} for all {s \in [0,1]}.
  • (ii) If {\gamma_0, \gamma_1} are closed (but possibly with different initial points), we say that {\gamma_0} and {\gamma_1} are homotopic as closed curves in {U} if there exists a continuous map {\gamma: [0,1] \times [a,b] \rightarrow U} such that {\gamma(0,t) = \gamma_0(t)} and {\gamma(1,t) = \gamma_1(t)} for all {t \in [a,b]}, and such that {\gamma(s,a) = \gamma(s,b)} for all {s \in [0,1]}.
  • (iii) If {\gamma_2: [c,d] \rightarrow U} and {\gamma_3: [e,f] \rightarrow U} are curves with the same initial point and same final point, we say that {\gamma_2} and {\gamma_3} are homotopic with fixed endpoints up to reparameterisation in {U} if there is a reparameterisation {\tilde \gamma_2: [a,b] \rightarrow U} of {\gamma_2} which is homotopic with fixed endpoints in {U} to a reparameterisation {\tilde \gamma_3: [a,b] \rightarrow U} of {\gamma_3}.
  • (iv) If {\gamma_2: [c,d] \rightarrow U} and {\gamma_3: [e,f] \rightarrow U} are closed curves, we say that {\gamma_2} and {\gamma_3} are homotopic as closed curves up to reparameterisation in {U} if there is a reparameterisation {\tilde \gamma_2: [a,b] \rightarrow U} of {\gamma_2} which is homotopic as closed curves in {U} to a reparameterisation {\tilde \gamma_3: [a,b] \rightarrow U} of {\gamma_3}.

In the first two cases, the map {\gamma} will be referred to as a homotopy from {\gamma_0} to {\gamma_1}, and we will also say that {\gamma_0} can be continuously deformed to {\gamma_1} (either with fixed endpoints, or as closed curves).

Example 2 If {U} is a convex set, that is to say that {(1-s) z_0 + s z_1 \in U} whenever {z_0,z_1 \in U} and {0 \leq s \leq 1}, then any two curves {\gamma_0, \gamma_1: [0,1] \rightarrow U} from one point {z_0} to another {z_1} are homotopic, by using the homotopy

\displaystyle \gamma(s,t) := (1-s) \gamma_0(t) + s \gamma_1(t).

For a similar reason, in a convex open set {U}, any two closed curves will be homotopic to each other as closed curves.
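The straight-line homotopy of Example 2 is easy to experiment with numerically. The following sketch takes the unit disc as the convex set and two particular curves from -0.5 to 0.5; these choices, and the function names, are assumptions made purely for illustration. It checks on a grid that the endpoints stay fixed and that the deformation never leaves the disc.

```python
import cmath


def straight_line_homotopy(gamma0, gamma1):
    """Return gamma(s, t) = (1 - s) * gamma0(t) + s * gamma1(t)."""
    def gamma(s, t):
        return (1 - s) * gamma0(t) + s * gamma1(t)
    return gamma


# Illustrative curves (assumed for this sketch) from -0.5 to 0.5 in the unit disc.
def gamma0(t):
    return complex(-0.5 + t, 0.0)                 # straight segment


def gamma1(t):
    return -0.5 * cmath.exp(-1j * cmath.pi * t)   # upper half-circle of radius 0.5


gamma = straight_line_homotopy(gamma0, gamma1)

grid = [k / 50 for k in range(51)]
endpoints_fixed = all(
    abs(gamma(s, 0.0) - (-0.5)) < 1e-12 and abs(gamma(s, 1.0) - 0.5) < 1e-12
    for s in grid
)
stays_in_disc = all(abs(gamma(s, t)) < 1.0 for s in grid for t in grid)
print("endpoints fixed:", endpoints_fixed)
print("stays in the unit disc:", stays_in_disc)
```

A grid check is of course not a proof; convexity is what guarantees that every intermediate point lies in the set.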

Exercise 3 Let {U} be an open subset of {{\bf C}}.

  • (i) Prove that the property of being homotopic with fixed endpoints in {U} is an equivalence relation.
  • (ii) Prove that the property of being homotopic as closed curves in {U} is an equivalence relation.
  • (iii) If {\gamma_0, \gamma_1: [a,b] \rightarrow U} are closed curves with the same initial point, show that {\gamma_0} is homotopic to {\gamma_1} with fixed endpoints if and only if {\gamma_0} is homotopic to {\gamma_1} as closed curves.
  • (iv) Define a point in {U} to be a curve {\gamma_1: [a,b] \rightarrow U} of the form {\gamma_1(t) = z_0} for some {z_0 \in U} and all {t \in [a,b]}. Let {\gamma_0: [a,b] \rightarrow U} be a closed curve in {U}. Show that {\gamma_0} is homotopic with fixed endpoints to a point in {U} if and only if {\gamma_0} is homotopic as a closed curve to a point in {U}. (In either case, we will call {\gamma_0} homotopic to a point, null-homotopic, or contractible to a point in {U}.)
  • (v) If {\gamma_0, \gamma_1: [a,b] \rightarrow U} are curves with the same initial point and the same terminal point, show that {\gamma_0} is homotopic to {\gamma_1} with fixed endpoints in {U} if and only if {\gamma_0 + (-\gamma_1)} is homotopic to a point in {U}.
  • (vi) If {U} is connected, and {\gamma_0, \gamma_1: [a,b] \rightarrow U} are any two curves in {U}, show that there exists a continuous map {\gamma: [0,1] \times [a,b] \rightarrow U} such that {\gamma(0,t) = \gamma_0(t)} and {\gamma(1,t) = \gamma_1(t)} for all {t \in [a,b]}. Thus the notion of homotopy becomes rather trivial if one does not fix the endpoints or require the curve to be closed.
  • (vii) Show that if {\gamma_1: [a,b] \rightarrow U} is a reparameterisation of {\gamma_0: [a,b] \rightarrow U}, then {\gamma_0} and {\gamma_1} are homotopic with fixed endpoints in U.
  • (viii) Prove that the property of being homotopic with fixed endpoints in {U} up to reparameterisation is an equivalence relation.
  • (ix) Prove that the property of being homotopic as closed curves in {U} up to reparameterisation is an equivalence relation.

We can then phrase Cauchy’s theorem as an assertion that contour integration on holomorphic functions is a homotopy invariant. More precisely:

Theorem 4 (Cauchy’s theorem) Let {U} be an open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be holomorphic.

  • (i) If {\gamma_0: [a,b] \rightarrow U} and {\gamma_1: [c,d] \rightarrow U} are rectifiable curves that are homotopic in {U} with fixed endpoints up to reparameterisation, then

    \displaystyle \int_{\gamma_0} f(z)\ dz = \int_{\gamma_1} f(z)\ dz.

  • (ii) If {\gamma_0: [a,b] \rightarrow U} and {\gamma_1: [c,d] \rightarrow U} are closed rectifiable curves that are homotopic in {U} as closed curves up to reparameterisation, then

    \displaystyle \int_{\gamma_0} f(z)\ dz = \int_{\gamma_1} f(z)\ dz.

This version of Cauchy’s theorem is particularly useful for applications, as it explicitly brings into play the powerful technique of contour shifting, which allows one to compute a contour integral by replacing the contour with a homotopic contour on which the integral is easier to either compute or estimate. This formulation of Cauchy’s theorem also highlights the close relationship between contour integrals and the algebraic topology of the complex plane (and open subsets {U} thereof). Setting {\gamma_1} to be a point, we obtain an important special case of Cauchy’s theorem (which is in fact equivalent to the full theorem):

Corollary 5 (Cauchy’s theorem, again) Let {U} be an open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be holomorphic. Then for any closed rectifiable curve {\gamma} in {U} that is contractible in {U} to a point, one has {\int_\gamma f(z)\ dz = 0}.

Exercise 6 Show that Theorem 4 and Corollary 5 are logically equivalent.

An important feature to note about Cauchy’s theorem is the global nature of its hypothesis on {f}. The conclusion of Cauchy’s theorem only involves the values of a function {f} on the images of the two curves {\gamma_0, \gamma_1}. However, in order for the hypotheses of Cauchy’s theorem to apply, the function {f} must be holomorphic not only on the images of {\gamma_0, \gamma_1}, but on an open set {U} that is large enough (and sufficiently free of “holes”) to support a homotopy between the two curves. This point can be emphasised through the following fundamental near-counterexample to Cauchy’s theorem:

Example 7 Let {U := {\bf C} \backslash \{0\}}, and let {f: U \rightarrow {\bf C}} be the holomorphic function {f(z) := \frac{1}{z}}. Let {\gamma_{0,1,\circlearrowleft}: [0,2\pi] \rightarrow {\bf C}} be the closed unit circle contour {\gamma_{0,1,\circlearrowleft}(t) := e^{it}}. Direct calculation shows that

\displaystyle \int_{\gamma_{0,1,\circlearrowleft}} f(z)\ dz = 2\pi i \neq 0.

As a consequence of this and Cauchy’s theorem, we conclude that the contour {\gamma_{0,1,\circlearrowleft}} is not contractible to a point in {U}; note that this does not contradict Example 2 because {U} is not convex. Thus we see that the lack of holomorphicity (or singularity) of {f} at the origin can be “blamed” for the non-vanishing of the integral of {f} on the closed contour {\gamma_{0,1,\circlearrowleft}}, even though this contour does not come anywhere near the origin. Thus we see that the global behaviour of {f}, not just the behaviour in the local neighbourhood of {\gamma_{0,1,\circlearrowleft}}, has an impact on the contour integral.

One can of course rewrite this example to involve non-closed contours instead of closed ones. For instance, if we let {\gamma_0, \gamma_1: [0,\pi] \rightarrow U} denote the half-circle contours {\gamma_0(t) := e^{it}} and {\gamma_1(t) := e^{-it}}, then {\gamma_0,\gamma_1} are both contours in {U} from {+1} to {-1}, but one has

\displaystyle \int_{\gamma_0} f(z)\ dz = +\pi i

whereas

\displaystyle \int_{\gamma_1} f(z)\ dz = -\pi i.

In order for this to be consistent with Cauchy’s theorem, we conclude that {\gamma_0} and {\gamma_1} are not homotopic in {U} (even after reparameterisation).
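The integrals in Example 7 and in the half-circle variant can be checked numerically. The sketch below approximates a contour integral by summing the integrand at chord midpoints times the chord; this simple quadrature rule and the helper name `contour_integral` are choices made for this illustration, not part of the notes.

```python
import cmath


def contour_integral(f, gamma, a, b, steps=20000):
    """Approximate the integral of f along gamma: [a, b] -> C.

    The curve is cut into short chords, and f is sampled at each chord midpoint.
    """
    total = 0j
    h = (b - a) / steps
    z_prev = gamma(a)
    for k in range(1, steps + 1):
        z_next = gamma(a + k * h)
        total += f((z_prev + z_next) / 2) * (z_next - z_prev)
        z_prev = z_next
    return total


def reciprocal(z):
    return 1 / z


# Anticlockwise unit circle, then the upper and lower half-circles from +1 to -1.
full = contour_integral(reciprocal, lambda t: cmath.exp(1j * t), 0.0, 2 * cmath.pi)
upper = contour_integral(reciprocal, lambda t: cmath.exp(1j * t), 0.0, cmath.pi)
lower = contour_integral(reciprocal, lambda t: cmath.exp(-1j * t), 0.0, cmath.pi)

print("full circle:", full)
print("upper half: ", upper)
print("lower half: ", lower)
print("difference: ", upper - lower)
```

The two half-circle integrals differ by the full-circle value, as they should: the upper half-circle followed by the reversed lower half-circle is the full anticlockwise circle.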

In the specific case of functions of the form {\frac{1}{z}}, or more generally {\frac{f(z)}{z-z_0}} for some point {z_0} and some {f} that is holomorphic in some neighbourhood of {z_0}, we can quantify the precise failure of Cauchy’s theorem through the Cauchy integral formula, and through the concept of a winding number. These turn out to be extremely powerful tools for understanding both the nature of holomorphic functions and the topology of open subsets of the complex plane, as we shall see in this and later notes.

— 1. Proof of Cauchy’s theorem —

The underlying reason for the truth of Cauchy’s theorem can be explained in one sentence: complex differentiable functions behave locally like complex linear functions, which are conservative thanks to the fundamental theorem of calculus. More precisely, if {f(z) = a + bz} is any complex linear function of {z}, then {f} has an antiderivative {az + \frac{1}{2} bz^2}, and hence

\displaystyle \int_\gamma a+bz\ dz = 0 \ \ \ \ \ (1)

for any rectifiable closed curve {\gamma} in the complex plane.

Perhaps the slickest way to make this intuition rigorous is through the following special case of Cauchy’s theorem.

Theorem 8 (Goursat’s theorem) Let {U} be an open subset of {{\bf C}}, and {z_1,z_2,z_3} be complex numbers such that the solid (and closed) triangle spanned by {z_1,z_2,z_3} (or more precisely, the convex hull of {\{z_1,z_2,z_3\}}) is contained in {U}. (We allow the triangle to degenerate in that we allow the {z_1,z_2,z_3} to be collinear, or even coincident.) Then for any holomorphic function {f: U \rightarrow {\bf C}}, one has

\displaystyle \int_{\gamma_{z_1 \rightarrow z_2 \rightarrow z_3 \rightarrow z_1}} f(z)\ dz = 0,

where {\gamma_{z_1 \rightarrow z_2 \rightarrow z_3 \rightarrow z_1}} is the closed polygonal path that traverses the vertices {z_1, z_2, z_3} of the solid triangle in order.

Proof: Let us denote the triangular contour {\gamma_{z_1 \rightarrow z_2 \rightarrow z_3 \rightarrow z_1}} as {T_0}. It is convenient (though odd-looking at first sight) to prove this theorem by contradiction. That is to say, suppose for contradiction that we had

\displaystyle |\int_{T_0} f(z)\ dz| \geq \varepsilon \ \ \ \ \ (2)

for some {\varepsilon > 0}. We now run the following “divide and conquer” strategy. We let {z_{12} := \frac{z_1+z_2}{2}}, {z_{23} := \frac{z_2+z_3}{2}}, {z_{31} := \frac{z_3 + z_1}{2}} be the midpoints of the sides of the triangle with vertices {z_1, z_2, z_3}. Then from the basic properties of contour integration (see Exercise 16 of Notes 2) we can split the triangular integral {\int_{T_0} f(z)\ dz} as the sum of four integrals on smaller triangles, namely

\displaystyle \int_{\gamma_{z_1 \rightarrow z_{12} \rightarrow z_{31} \rightarrow z_1}} f(z)\ dz

\displaystyle \int_{\gamma_{z_{12} \rightarrow z_2 \rightarrow z_{23} \rightarrow z_{12}}} f(z)\ dz

\displaystyle \int_{\gamma_{z_{23} \rightarrow z_3 \rightarrow z_{31} \rightarrow z_{23}}} f(z)\ dz

\displaystyle \int_{\gamma_{z_{12} \rightarrow z_{23} \rightarrow z_{31} \rightarrow z_{12}}} f(z)\ dz.

(The reader is encouraged to draw a picture to visualise this decomposition.) By (2) and the triangle inequality (or, if one prefers, the pigeonhole principle), we must therefore have

\displaystyle |\int_{T_1} f(z)\ dz| \geq \frac{\varepsilon}{4}

where {T_1} is one of the four triangular contours {\gamma_{z_1 \rightarrow z_{12} \rightarrow z_{31} \rightarrow z_1}}, {\gamma_{z_{12} \rightarrow z_2 \rightarrow z_{23} \rightarrow z_{12}}}, {\gamma_{z_{23} \rightarrow z_3 \rightarrow z_{31} \rightarrow z_{23}}}, or {\gamma_{z_{12} \rightarrow z_{23} \rightarrow z_{31} \rightarrow z_{12}}}. Regardless of which of the four contours {T_1} is, observe that the triangular region enclosed by {T_1} is contained in that of {T_0}. Furthermore, the diameter of {T_1} is precisely half that of {T_0}, where the diameter {\mathrm{diam}(\gamma)} of a curve {\gamma: [a,b] \rightarrow {\bf C}} is defined by the formula

\displaystyle \mathrm{diam}(\gamma) := \sup_{t, t' \in [a,b]} |\gamma(t) - \gamma(t')|;

similarly, the perimeter {|T_1|} of {T_1} is precisely half that of {T_0}. If we iterate the above process, we can find a nested sequence {T_0, T_1, T_2, \dots} of triangular contours, each of which is contained in the previous one with half the diameter and perimeter, such that

\displaystyle |\int_{T_n} f(z)\ dz| \geq \frac{\varepsilon}{4^n} \ \ \ \ \ (3)

for all {n=0,1,2,\dots}. If we let {z_n} be any point enclosed by {T_n}, then from the decreasing diameters it is clear that the {z_n} are a Cauchy sequence and thus converge to some limit {z_*}, which is then contained in all of the closed triangles enclosed by any of the {T_n}.

In particular, {z_*} lies in {U} and so {f} is differentiable at {z_*}. This implies, for any {\varepsilon'>0}, that there exists a {\delta > 0} such that

\displaystyle |\frac{f(z) - f(z_*)}{z-z_*} - f'(z_*)| \leq \varepsilon'

whenever {z \in D(z_*,\delta) \backslash \{z_*\}}. We can rearrange this as

\displaystyle |f(z) - (f(z_*) + (z-z_*) f'(z_*))| \leq \varepsilon' |z-z_*|

on {D(z_*,\delta)}. In particular, for {n} large enough, this bound holds on the image of {T_n}. In this case we can bound {|z-z_*|} by {\mathrm{diam}(T_n)}, and hence by Exercise 16(v) of Notes 2,

\displaystyle |\int_{T_n} f(z)\ dz - \int_{T_n} (f(z_*) + (z-z_*) f'(z_*))\ dz| \leq \varepsilon' \hbox{diam}(T_n) |T_n|.

From (1), the second integral vanishes. As each {T_n} has half the diameter and perimeter of the previous, we thus have

\displaystyle |\int_{T_n} f(z)\ dz| \leq \frac{\varepsilon' \hbox{diam}(T_0) |T_0|}{4^n}.

But if one chooses {\varepsilon'} small enough depending on {\varepsilon} and {T_0}, we contradict (3). \Box
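The subdivision step in this proof can be explored numerically. The sketch below integrates around a triangle and around its four midpoint sub-triangles, for a holomorphic function and, as a contrast, for the non-holomorphic function {z \mapsto \overline{z}}. The midpoint quadrature rule, the particular triangle, and the function names are all assumptions made for illustration.

```python
import cmath


def segment_integral(f, z_start, z_end, steps=2000):
    """Midpoint-rule approximation of the integral of f along a straight segment."""
    dz = (z_end - z_start) / steps
    return sum(f(z_start + (k + 0.5) * dz) for k in range(steps)) * dz


def triangle_integral(f, z1, z2, z3):
    """Integral of f along the closed path z1 -> z2 -> z3 -> z1."""
    return (segment_integral(f, z1, z2)
            + segment_integral(f, z2, z3)
            + segment_integral(f, z3, z1))


def subdivide(z1, z2, z3):
    """The four sub-triangles from the proof, with matching orientation."""
    z12, z23, z31 = (z1 + z2) / 2, (z2 + z3) / 2, (z3 + z1) / 2
    return [(z1, z12, z31), (z12, z2, z23), (z23, z3, z31), (z12, z23, z31)]


def conj(z):
    return z.conjugate()


triangle = (0j, 2 + 0j, 1 + 2j)   # illustrative triangle

for name, f in [("exp", cmath.exp), ("conj", conj)]:
    whole = triangle_integral(f, *triangle)
    parts = [triangle_integral(f, *t) for t in subdivide(*triangle)]
    print(name, "whole:", whole)
    print(name, "sum of parts:", sum(parts))
    print(name, "largest part:", max(abs(p) for p in parts))
```

For the holomorphic function every integral is close to zero. For the conjugate function the integral is nonzero, the four pieces add up to the whole, and at least one piece carries a quarter or more of it in absolute value, which is the pigeonhole step used to select {T_1}.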

Remark 9 This is a rare example of an argument in which a hypothesis of differentiability, rather than continuous differentiability, is used, because one can localise any failure of the conclusion all the way down to a single point. Another instance of such an argument is the standard proof of Rolle’s theorem.

Exercise 10 Find a proof of Goursat’s theorem that avoids explicit use of proof by contradiction. (Hint: use the fact that a solid triangle is compact, in the sense that every open cover has a finite subcover. For the purposes of this question, ignore the possibility that the proof of this latter fact might also use proof by contradiction.)

Goursat’s theorem only directly handles triangular contours, but as long as one works “locally”, or more precisely in a convex domain, we can quickly generalise:

Corollary 11 (Local Cauchy’s theorem for polygonal paths) Let {U} be a convex open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be a holomorphic function. Then for any closed polygonal path {\gamma = \gamma_{z_1 \rightarrow \dots \rightarrow z_n \rightarrow z_1}} in {U}, we have {\int_\gamma f(z)\ dz = 0}.

Proof: We induct on the number of vertices {n}. The cases {n=1,2} are trivial, and the {n=3} case follows directly from Goursat’s theorem (using the convexity of {U} to ensure that the interior of the polygon lies in {U}). If {n > 3}, we can split

\displaystyle \int_{\gamma_{z_1 \rightarrow \dots \rightarrow z_n \rightarrow z_1}} f(z)\ dz = \int_{\gamma_{z_1 \rightarrow \dots \rightarrow z_{n-1} \rightarrow z_1}} f(z)\ dz + \int_{\gamma_{z_{n-1} \rightarrow z_n \rightarrow z_1 \rightarrow z_{n-1}}} f(z)\ dz.

The second integral on the right-hand side vanishes by Goursat’s theorem. The claim then follows from induction. \Box
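As a numerical sanity check of Corollary 11 (a sketch only, not part of the proof), one can approximate the contour integral over a closed polygonal path by applying the midpoint rule on each edge. The helper names `segment_integral` and `polygon_integral` and the particular pentagon are illustrative assumptions, and we take {U = {\bf C}}, which is convex. For the entire function {e^z} the integral should vanish up to discretisation error, while for the non-holomorphic function {\overline{z}} it equals {2i} times the enclosed area, which is {7} for this pentagon.

```python
import cmath

def segment_integral(f, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of f along the segment a -> b."""
    dz = (b - a) / steps
    return sum(f(a + (k + 0.5) * dz) for k in range(steps)) * dz

def polygon_integral(f, vertices, steps=2000):
    """Integral of f over the closed polygonal path through the given vertices."""
    n = len(vertices)
    return sum(
        segment_integral(f, vertices[k], vertices[(k + 1) % n], steps)
        for k in range(n)
    )

# An arbitrary anticlockwise pentagon (hypothetical example data).
vertices = [0, 2 + 0.5j, 3 + 2j, 1 + 3j, -1 + 1.5j]

holomorphic = polygon_integral(cmath.exp, vertices)
non_holomorphic = polygon_integral(lambda z: z.conjugate(), vertices)

print(abs(holomorphic))   # small: only discretisation error remains
print(non_holomorphic)    # close to 14j, i.e. 2i times the enclosed area 7
```

The contrast between the two outputs illustrates that the vanishing of the integral is a feature of holomorphicity rather than of the path.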

Exercise 12 By using the (real-variable) fundamental theorem of calculus and Fubini’s theorem in place of Goursat’s theorem, give an alternate proof of Corollary 11 in the case that {\gamma} is a rectangle {\gamma = \gamma_{a+bi \rightarrow c+bi \rightarrow c+di \rightarrow a+di \to a+bi}} and the derivative {f'} of {f} is continuous. (One can also use Stokes’ theorem in place of the fundamental theorem of calculus and Fubini’s theorem.)

We can amplify Corollary 11 using the fundamental theorem of calculus again:

Corollary 13 (Local Cauchy’s theorem) Let {U} be a convex open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be a holomorphic function. Then {f} has an antiderivative {F: U \rightarrow {\bf C}}. Also, {\int_\gamma f(z)\ dz = 0} for any closed rectifiable curve {\gamma} in {U}, and {\int_{\gamma_1} f(z)\ dz = \int_{\gamma_2} f(z)\ dz} whenever {\gamma_1, \gamma_2} are two rectifiable curves in {U} with the same initial point and same terminal point. In other words, {f} is conservative on {U}.

Proof: The first claim follows from Corollary 11 and the second fundamental theorem of calculus (Theorem 30 from Notes 2). The remaining claims then follow from the first fundamental theorem of calculus (Theorem 27 from Notes 2). \Box

We can now prove Cauchy’s theorem in the form of Theorem 4.

Proof: We will just prove part (i), as part (ii) is similar (and in any event it follows from part (i)). Since reparameterisation does not affect the integral, we may assume without loss of generality that {\gamma_0: [a,b] \rightarrow U} and {\gamma_1: [a,b] \rightarrow U} are homotopic with fixed endpoints, and not merely homotopic with fixed endpoints up to reparameterisation.

Let {\gamma: [0,1] \times [a,b] \rightarrow U} be a homotopy from {\gamma_0} to {\gamma_1}. Note that for any {s \in [0,1]} and {t \in [a,b]}, {\gamma(s,t)} lies in the open set {U}. From compactness, there must exist a radius {r>0} such that {D(\gamma(s,t),r) \subset U} for all {s \in [0,1]} and {t \in [a,b]}. Next, as {\gamma} is continuous on a compact set, it is uniformly continuous. In particular, there exists {\delta > 0} such that

\displaystyle |\gamma(s',t') - \gamma(s,t)| \leq \frac{r}{4}

whenever {s,s' \in [0,1]} and {t,t' \in [a,b]} are such that {|s-s'| \leq \delta} and {|t-t'| \leq \delta}.

Now partition {[0,1]} and {[a,b]} as {0 = s_0 < \dots < s_n = 1} and {a = t_0 < \dots < t_m = b} in such a way that {|s_i - s_{i-1}| \leq \delta} and {|t_j-t_{j-1}| \leq \delta} for all {1 \leq i \leq n} and {1 \leq j \leq m}. For each such {i} and {j}, let {C_{i,j}} denote the closed polygonal contour

\displaystyle C_{i,j} := \gamma_{\gamma(s_i,t_{j-1}) \rightarrow \gamma(s_i,t_j) \rightarrow \gamma(s_{i-1},t_j) \rightarrow \gamma(s_{i-1},t_{j-1}) \rightarrow \gamma(s_i,t_{j-1})}.

(the reader is encouraged here to draw a picture of the situation; we are using polygonal contours here rather than the homotopy {\gamma} because we did not require any rectifiability properties on the homotopy). By construction, the diameter of this contour is at most {\frac{r}{4}+\frac{r}{4}+\frac{r}{4}+\frac{r}{4} = r}, so the contour is contained entirely in the disk {D( \gamma(s_i,t_j), r)}. This disk is convex and contained in {U}. Applying Corollary 11 or Corollary 13, we conclude that

\displaystyle \int_{C_{i,j}} f(z)\ dz = 0

for all {1 \leq i \leq n} and {1 \leq j \leq m}. If we sum this over all {i} and {j}, and noting that the homotopy fixes the endpoints, we conclude after a lot of cancelling that

\displaystyle \int_{\gamma_{\gamma(0,t_0) \rightarrow \gamma(0, t_1) \rightarrow \dots \rightarrow \gamma(0, t_m)}} f(z)\ dz = \int_{\gamma_{\gamma(1,t_0) \rightarrow \gamma(1, t_1) \rightarrow \dots \rightarrow \gamma(1, t_m)}} f(z)\ dz

(again, the reader is encouraged to draw a picture to see this cancellation). However, from a further application of Corollary 13 we have

\displaystyle \int_{\gamma_{\gamma(0,t_{j-1}) \rightarrow \gamma(0,t_j)}} f(z)\ dz = \int_{\gamma_{0,[t_{j-1},t_j]}} f(z)\ dz

for {j=1,\dots,m}, where {\gamma_{0,[t_{j-1},t_j]}: [t_{j-1},t_j] \rightarrow U} is the restriction of {\gamma_0: [a,b] \rightarrow U} to {[t_{j-1},t_j]}, and similarly for {\gamma_1}. Putting all this together we conclude that

\displaystyle \int_{\gamma_0} f(z)\ dz = \int_{\gamma_1} f(z)\ dz

as required. \Box

One nice feature of Cauchy’s theorem is that it allows one to integrate holomorphic functions on curves that are not necessarily rectifiable. Indeed, if {\gamma: [a,b] \rightarrow U} is a curve in {U}, then for a sufficiently fine partition {a = t_0 < t_1 < \dots < t_n = b}, the polygonal (and hence rectifiable) path {\gamma_{\gamma(t_0) \rightarrow \gamma(t_1) \rightarrow \dots \rightarrow \gamma(t_n)}} will be contained in {U}, and furthermore be homotopic to {\gamma} with fixed endpoints. One can then define {\int_\gamma f(z)\ dz} when {f} is holomorphic in {U} and {\gamma} is non-rectifiable by declaring

\displaystyle \int_\gamma f(z)\ dz := \int_{\tilde \gamma} f(z)\ dz

where {\tilde \gamma} is any rectifiable curve that is homotopic (with fixed endpoints) to {\gamma}. This definition is well defined thanks to the above discussion as well as Cauchy’s theorem; also observe that the exact open set {U} in which the homotopy lives is not relevant, since given any two open sets {U,U'} containing the image of {\gamma} one can find a rectifiable curve {\tilde \gamma} which is homotopic to {\gamma} with fixed endpoints in {U \cap U'}, and hence in {U} and {U'} separately. With this extended notion of the contour integral, one can then remove the hypothesis of rectifiability from many theorems involving integration of holomorphic functions. In particular, Cauchy’s theorem itself now holds for non-rectifiable curves. This reflects some duality in the integration concept {\int_\gamma f(z)\ dz}; if one assumes more regularity on the function {f}, one can get away with worse regularity on the curve {\gamma}, and vice versa.

A special case of Cauchy’s theorem is worth recording explicitly. We say that an open set {U} in the complex plane is simply connected if it is non-empty, connected, and if every closed curve in {U} is contractible in {U} to a point. For instance, from Example 2 we see that any convex non-empty open set is simply connected. From Theorem 4 we then have

Theorem 14 (Cauchy’s theorem, simply connected case) Let {U} be a simply connected open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be holomorphic. Then {\int_\gamma f(z)\ dz = 0} for any closed curve {\gamma} in {U}. In particular (by Exercise 31 of Notes 2), {f} is conservative and has an antiderivative.

— 2. Consequences of Cauchy’s theorem —

Now that we have Cauchy’s theorem, we use it to quickly give a large number of striking consequences. We begin with a special case of the Cauchy integral formula.

Theorem 15 (Cauchy integral formula, special case) Let {U} be an open subset of {{\bf C}}, let {f: U \rightarrow {\bf C}} be holomorphic, and let {z_0} be a point in {U}. Let {r>0} be such that the closed disk {\overline{D(z_0,r)} := \{ z \in {\bf C}: |z-z_0| \leq r \}} is contained in {U}. Let {\gamma} be a closed curve in {U \backslash \{z_0\}} that is homotopic (as a closed curve, and up to reparameterisation) in {U \backslash \{z_0\}} to {\gamma_{z_0,r,\circlearrowleft}} in {U}. Then

\displaystyle f(z_0) = \frac{1}{2\pi i} \int_\gamma \frac{f(z)}{z-z_0}\ dz.

Here we are already taking advantage of the ability to integrate holomorphic functions (such as {\frac{f(z)}{z-z_0}}, which is holomorphic on {U \backslash \{z_0\}}) on curves {\gamma} that are not necessarily rectifiable.

Note the remarkable feature here that the value of {f} at a point {z_0} that does not lie on {\gamma} is completely determined by the values of {f} on the curve {\gamma}, which is a strong manifestation of the “rigid” or “global” nature of holomorphic functions. Such a formula is certainly not available in the real case (Cauchy’s theorem is technically true on the real line, but there is no analogue of the circular contours {\gamma_{z_0,r,\circlearrowleft}} available in that setting).

Proof: Observe that for any {0 < \varepsilon < r}, the circles {\gamma_{z_0,r,\circlearrowleft}} and {\gamma_{z_0,\varepsilon,\circlearrowleft}} are homotopic (as closed curves) in {\overline{D(z_0,r)} \backslash \{z_0\}}, and hence in {U \backslash \{z_0\}}. Since the function {z \mapsto \frac{f(z)}{z-z_0}} is holomorphic on {U \backslash \{z_0\}}, we conclude from Cauchy’s theorem that

\displaystyle \int_\gamma \frac{f(z)}{z-z_0}\ dz = \int_{\gamma_{z_0,\varepsilon,\circlearrowleft}} \frac{f(z)}{z-z_0}\ dz

As {f} is complex differentiable at {z_0}, there exists a finite {M} such that

\displaystyle |\frac{f(z)-f(z_0)}{z-z_0}| \leq M

for all {z} in {\gamma_{z_0,\varepsilon,\circlearrowleft}}, and all sufficiently small {\varepsilon}. The length of this circle is of course {2\pi \varepsilon}. Applying Exercise 16(v) of Notes 2 we have

\displaystyle | \int_{\gamma_{z_0,\varepsilon,\circlearrowleft}} \frac{f(z)-f(z_0)}{z-z_0}\ dz| \leq 2 \pi \varepsilon M.

On the other hand, from explicit computation (cf. Example 7) we have

\displaystyle \int_{\gamma_{z_0,\varepsilon,\circlearrowleft}} \frac{1}{z-z_0}\ dz = 2\pi i;

putting all this together, we see that

\displaystyle |\frac{1}{2\pi i} \int_\gamma \frac{f(z)}{z-z_0}\ dz - f(z_0)| \leq \varepsilon M.

Sending {\varepsilon} to zero, we obtain the claim. \Box
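The Cauchy integral formula is easy to test numerically. The following is a minimal sketch, assuming the entire function {f(z) = e^z} and a contour (the unit circle around the origin) that is deliberately not centred at {z_0} but is homotopic in {{\bf C} \backslash \{z_0\}} to a small circle around {z_0}. The helper `circle_integral` and the sample point `z0` are hypothetical choices made for illustration.

```python
import cmath
import math

def circle_integral(g, center, radius, n=400):
    """Riemann-sum approximation of the integral of g over the anticlockwise
    circle t -> center + radius * e^{it}, 0 <= t <= 2*pi."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)        # point on the unit circle
        z = center + radius * w                    # point on the contour
        dz = 1j * radius * w * (2 * math.pi / n)   # gamma'(t) dt
        total += g(z) * dz
    return total

f = cmath.exp        # an entire function, so U is all of C
z0 = 0.3 + 0.2j      # the point at which f is recovered (illustrative choice)

# The unit circle around 0 is not centred at z0, but winds once around it.
integral = circle_integral(lambda z: f(z) / (z - z0), 0, 1.0)
recovered = integral / (2j * math.pi)

print(recovered)     # numerically close to f(z0)
print(f(z0))
```

Only values of {f} on the circle enter the computation, yet the value at the interior point {z_0} is recovered to near machine precision.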

Note the same argument would give

\displaystyle m f(z_0) = \frac{1}{2\pi i} \int_\gamma \frac{f(z)}{z-z_0}\ dz

if {\gamma} were homotopic to the curve {t \mapsto z_0 + r e^{i m t}}: {0 \leq t \leq 2\pi} rather than {\gamma_{z_0,r,\circlearrowleft}}, for some integer {m}. In particular, if {\gamma} were homotopic to a point in {U \backslash \{z_0\}}, then the right-hand side would vanish.

Remark 16 For various explicit examples of closed contours {\gamma}, it is also possible to prove the Cauchy integral formula by applying Cauchy’s theorem to various “keyhole contours”. We will not pursue this approach here, but see for instance Chapter 2 of Stein-Shakarchi.

Exercise 17 (Mean value property and Poisson kernel) Let {U} be an open subset of {{\bf C}}, and let {\overline{D(z_0,r)}} be a closed disk contained in {U}.

  • (i) If {f: U \rightarrow {\bf C}} is holomorphic, show that

    \displaystyle f(z_0) = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{i\theta})\ d\theta.  Use this to give an alternate proof of Exercise 26 from Notes 1.

  • (ii) If {u: U \rightarrow {\bf R}} is harmonic, show that

    \displaystyle u(z_0) = \frac{1}{2\pi} \int_0^{2\pi} u(z_0 + re^{i\theta})\ d\theta.  Use this to give an alternate proof of Theorem 25 from Notes 1.

  • (iii) If {u: U \rightarrow {\bf R}} is harmonic, show that

    \displaystyle u(z) = \frac{1}{2\pi} \int_0^{2\pi} P( \frac{z - z_0}{re^{i\theta}} ) u(z_0 + re^{i\theta})\ d\theta

    for any {z \in D(z_0,r)}, where the Poisson kernel {P: D(0,1) \rightarrow {\bf R}} is defined by the formula

    \displaystyle P(z) := \mathrm{Re} \frac{1 + z}{1-z}.

    (Hint: it simplifies the calculations somewhat if one reduces to the case {z_0=0}, {r=1}, and {z = s} for some {0 < s < 1}. Then compute the integral {\frac{1}{2\pi i} \int_{\gamma_{0,1,\circlearrowleft}} f(w) \frac{1}{2} (\frac{1+s/w}{1-s/w} + \frac{1+sw}{1-sw})\ dw} in two different ways, where {f=u+iv} is holomorphic with real part {u}.)
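Before attempting Exercise 17, it can be reassuring to check the mean value property and the Poisson formula numerically. The sketch below is not a proof and makes several illustrative assumptions: the harmonic function is taken to be {u = \mathrm{Re}(z^3 + 2z)}, the disk is {D(1+i, 2)}, and the names `poisson_kernel` and `poisson_integral` are hypothetical helpers. The integral {\frac{1}{2\pi} \int_0^{2\pi} \dots\ d\theta} is replaced by an average over equally spaced angles.

```python
import cmath
import math

def u(z):
    """A harmonic function: the real part of the entire function z^3 + 2z."""
    return (z**3 + 2 * z).real

def poisson_kernel(w):
    """P(w) = Re((1 + w) / (1 - w)) for |w| < 1."""
    return ((1 + w) / (1 - w)).real

def poisson_integral(u, z, z0, r, n=720):
    """Average over n equally spaced angles theta of
    P((z - z0) / (r e^{i theta})) * u(z0 + r e^{i theta})."""
    total = 0.0
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += poisson_kernel((z - z0) / (r * e)) * u(z0 + r * e)
    return total / n

z0, r = 1 + 1j, 2.0
z = z0 + (0.6 - 0.8j)     # a point of D(z0, r), at distance 1 from z0

mean_value = poisson_integral(u, z0, z0, r)   # P(0) = 1: plain circle average
poisson_value = poisson_integral(u, z, z0, r)

print(mean_value, u(z0))      # the two numbers should agree
print(poisson_value, u(z))    # likewise
```

Taking {z = z_0} reduces the Poisson formula in part (iii) to the mean value property in part (ii), which is why one helper covers both checks.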

The first important consequence of the Cauchy integral formula is the analyticity of holomorphic functions:

Corollary 18 (Holomorphic functions are analytic) Let {U} be an open subset of {{\bf C}}, let {f: U \rightarrow {\bf C}} be holomorphic, and let {z_0} be a point in {U}. Let {r>0} be such that the closed disk {\overline{D(z_0,r)} := \{ z \in {\bf C}: |z-z_0| \leq r \}} is contained in {U}. For each natural number {n}, let {a_n} denote the complex number

\displaystyle a_n := \frac{1}{2\pi i} \int_{\gamma_{z_0,r,\circlearrowleft}} \frac{f(z)}{(z-z_0)^{n+1}}\ dz. \ \ \ \ \ (4)

Then the power series {\sum_{n=0}^\infty a_n (z-z_0)^n} has radius of convergence at least {r}, and converges to {f(z)} inside the disk.

Proof: By continuity, there exists a finite {M} such that {|f(z)| \leq M} for all {z} on the circle {\gamma_{z_0,r,\circlearrowleft}}, which of course has length {2\pi r}. From Exercise 16(v) of Notes 2 we conclude that

\displaystyle |a_n| \leq \frac{1}{2\pi} 2\pi r \frac{M}{r^{n+1}}.

From this and Proposition 7 of Notes 1, we see that the radius of convergence of the power series {\sum_{n=0}^\infty a_n (z-z_0)^n} is indeed at least {r}.

Next, for any {w \in D(z_0,r)}, the circle {\gamma_{z_0,r,\circlearrowleft}} is homotopic (as a closed curve) in {\overline{D(z_0,r)} \backslash \{w\}} (and hence in {U \backslash \{w\}}) to {\gamma_{w,\varepsilon,\circlearrowleft}} for {\varepsilon} small enough that {D(w,\varepsilon)} lies in {D(z_0,r)}. Applying the Cauchy integral formula, we conclude that

\displaystyle f(w) = \frac{1}{2\pi i} \int_{\gamma_{z_0,r,\circlearrowleft}} \frac{f(z)}{z-w}\ dz.

On the other hand, from the geometric series formula (Exercise 12 of Notes 1) one has

\displaystyle \frac{1}{z-w} = \sum_{n=0}^\infty \frac{1}{(z-z_0)^{n+1}} (w-z_0)^n

for all {w \in D(z_0,r)}, and thus

\displaystyle f(w) = \frac{1}{2\pi i} \int_{\gamma_{z_0,r,\circlearrowleft}} (\sum_{n=0}^\infty \frac{f(z)}{(z-z_0)^{n+1}} (w-z_0)^n)\ dz.

If we could interchange the sum and integral, we would conclude from (4) that

\displaystyle f(w) = \sum_{n=0}^\infty a_n (w-z_0)^n

which would give the claim. To justify the interchange, we will use the Weierstrass {M}-test (the dominated convergence theorem would also work here). We have the pointwise bound

\displaystyle |\frac{f(z)}{(z-z_0)^{n+1}} (w-z_0)^n| \leq \frac{M}{r^{n+1}} |w-z_0|^n;

by the geometric series formula and the hypothesis {w \in D(z_0,r)}, the sum {\sum_{n=0}^\infty \frac{M}{r^{n+1}} |w-z_0|^n} is finite, and so the {M}-test applies and we are done. \Box
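Formula (4) can also be used as a practical recipe for computing Taylor coefficients. The sketch below assumes {f(z) = e^z}, {z_0 = 0} and {r = 1}, so that the expected coefficients are {a_n = 1/n!}; the function name `taylor_coefficient` and the sample point `w` are illustrative choices. The contour integral is approximated by a Riemann sum over equally spaced points on the circle.

```python
import cmath
import math

def taylor_coefficient(f, z0, r, n, samples=256):
    """Approximate a_n = (1/(2 pi i)) * integral of f(z)/(z - z0)^(n+1) dz over
    the anticlockwise circle of radius r around z0, as in formula (4)."""
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)
        z = z0 + r * w
        dz = 1j * r * w * (2 * math.pi / samples)
        total += f(z) / (z - z0) ** (n + 1) * dz
    return total / (2j * math.pi)

z0, r = 0, 1.0
coeffs = [taylor_coefficient(cmath.exp, z0, r, n) for n in range(20)]

w = 0.5 + 0.25j     # a point of D(z0, r), chosen for illustration
partial_sum = sum(a * (w - z0) ** n for n, a in enumerate(coeffs))

print(coeffs[:4])                   # compare with 1, 1, 1/2, 1/6
print(partial_sum, cmath.exp(w))    # the truncated series against f(w)
```

The truncated series already agrees with {f(w)} to high accuracy, consistent with convergence inside the disk.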

Remark 19 A function {f: U \rightarrow {\bf C}} on an open set {U \subset {\bf C}} is said to be complex analytic on {U} if, for every {z_0 \in U}, there is a power series {\sum_{n=0}^\infty a_n(z-z_0)^n} with a positive radius of convergence that converges to {f} on some neighbourhood of {z_0}. Combining the above corollary with Theorem 15 of Notes 1, we see that {f} is holomorphic on {U} if and only if {f} is complex analytic on {U}; thus the terms “complex differentiable”, “holomorphic”, and “complex analytic” may be used interchangeably. This can be contrasted with the real variable case: there is a completely parallel notion of a real analytic function {f: (a,b) \rightarrow {\bf R}} (i.e., a function that, around every point {x_0} in the domain, can be expanded as a convergent power series in some neighbourhood of that point), and real analytic functions are automatically smooth and differentiable, but the converse is quite false.

Recalling (see Remark 21 of Notes 1) that power series are infinitely differentiable (in both the real and complex senses) inside their disk of convergence, and working locally in various small disks in {U}, we conclude

Corollary 20 Let {U} be an open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be a holomorphic function. Then {f': U \rightarrow {\bf C}} is also holomorphic, and {f} is smooth (i.e. infinitely differentiable in the real sense).

In view of this corollary, we may now drop hypotheses of continuous first or second differentiability from several of the theorems in Notes 1, such as Exercise 26 from that set of notes.

Combining Corollary 20 with Proposition 28 of Notes 1 (with {{\bf C}} replaced by various rectangles in {U}), we obtain a form of elliptic regularity:

Corollary 21 (Elliptic regularity) Let {U} be an open subset of {{\bf C}}, and let {u: U \rightarrow {\bf R}} be a harmonic function. Then {u} is smooth.

In fact one can even omit the hypothesis of continuous twice differentiability in the definition of harmonicity if one works with the notion of weak harmonicity, but this is a topic for a PDE or distribution theory course and will not be pursued further here.

Another immediate consequence of Corollary 18 is a version of the factor theorem:

Corollary 22 (Factor theorem for analytic functions) Let {U} be an open subset of {{\bf C}}, and let {z_0} be a point in {U}. Let {f: U \rightarrow {\bf C}} be a complex analytic function that vanishes at {z_0}. Then there exists a unique complex analytic function {g: U \rightarrow {\bf C}} such that {f(z) = g(z) (z-z_0)} for all {z \in U}.

Proof:  For {z \neq z_0}, we can simply define {g(z) := f(z)/(z-z_0)}, and this is clearly the unique choice here.  Uniqueness at {z_0} follows from continuity. For {z} equal to or near {z_0}, we can expand {f} as a Taylor series {f(z) = \sum_{n=1}^\infty a_n (z-z_0)^n} (noting that the constant term vanishes since {f(z_0)=0}) and then set {g(z) := \sum_{n=0}^\infty a_{n+1} (z-z_0)^n}. One can check that these two definitions of {g} agree on their common domain, and that {g} is complex differentiable (and hence analytic) on {U}. \Box

Yet another consequence is the important property of analytic continuation:

Corollary 23 (Analytic continuation) Let {U} be a connected non-empty open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}}, {g: U \rightarrow {\bf C}} be complex analytic functions. If {f} and {g} agree on some non-empty open subset of {U}, then they in fact agree on all of {U}.

Proof: Let {V} denote the set of all points {z_0} in {U} where {f} and {g} agree to all orders, that is to say that

\displaystyle f^{(n)}(z_0) = g^{(n)}(z_0)

for all {n=0,1,\dots}. By hypothesis, {V} is non-empty; by the continuity of the {f^{(n)}}, {V} is closed in {U}; and from analyticity and Taylor expansion (Exercise 17 of Notes 1), {V} is open. As {U} is connected, {V} must therefore be all of {U}, and the claim follows. \Box

There is also a variant of the above corollary:

Corollary 24 (Non-trivial analytic functions have isolated zeroes) Let {U} be a connected non-empty open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be an analytic function which vanishes at some point {z_0 \in U} but is not identically zero. Then there exists a disk {D(z_0,r)} in {U} on which {f} does not vanish except at {z_0}; in other words, all the zeroes of {f} are isolated points.

Proof: If all the derivatives {f^{(n)}(z_0)} of {f} at {z_0} vanish, then by Taylor expansion {f} vanishes in some open neighbourhood of {z_0}, and then by Corollary 23 {f} vanishes everywhere, a contradiction. Thus at least one of the {f^{(n)}(z_0)} is non-zero. If {n_0} is the first natural number for which {f^{(n_0)}(z_0) \neq 0}, then by iterating the factor theorem (Corollary 22) we see that {f(z) = (z-z_0)^{n_0} g(z)} for some analytic function {g:U \rightarrow {\bf C}} which is non-vanishing at {z_0}. By continuity, {g} is also non-vanishing in some disk {D(z_0,r)} in {U}, and the claim follows. \Box

One particular consequence of the above corollary is that if two entire functions {f,g} agree on the real line (or even on an infinite bounded subset of the complex plane), then they must agree everywhere, since otherwise {f-g} would have a non-isolated zero, contradicting Corollary 24. This strengthens Corollary 23, and helps explain why real-variable identities such as {\sin^2(x)+\cos^2(x)=1} automatically extend to their complex counterparts {\sin^2(z) + \cos^2(z) = 1}. Another consequence is that if an entire function {f: {\bf C} \rightarrow {\bf C}} is real-valued on the real axis, then one has the identity

\displaystyle f(z) = \overline{f(\overline{z})}

for all complex {z}, because this identity already holds on the real line, and both sides are complex analytic. Thus for instance

\displaystyle \sin(z) = \overline{\sin(\overline{z})}.

Next, if we combine Corollary 18 with Exercise 17 of Notes 1, as well as Cauchy’s theorem, we obtain

Theorem 25 (Higher order Cauchy integral formula, special case) Let {U} be an open subset of {{\bf C}}, let {f: U \rightarrow {\bf C}} be holomorphic, and let {z_0} be a point in {U}. Let {r>0} be such that the closed disk {\overline{D(z_0,r)} := \{ z \in {\bf C}: |z-z_0| \leq r \}} is contained in {U}. Let {\gamma} be a closed curve in {U \backslash \{z_0\}} that is homotopic (as a closed curve, up to reparameterisation) in {U \backslash \{z_0\}} to {\gamma_{z_0,r,\circlearrowleft}} in {U}. Then for any natural number {n}, the {n^{\mathrm{th}}} derivative {f^{(n)}(z_0)} of {f} at {z_0} is given by the formula

\displaystyle f^{(n)}(z_0) = \frac{n!}{2\pi i} \int_\gamma \frac{f(z)}{(z-z_0)^{n+1}}\ dz.

Exercise 26 Give an alternate proof of Theorem 25 by rigorously differentiating the Cauchy integral formula with respect to the {z_0} parameter.

Combining Theorem 25 with Exercise 16(v) of Notes 2, we obtain a more quantitative form of Corollary 20, which asserts not only that the higher derivatives of a holomorphic function exist, but also places a bound on them:

Corollary 27 (Cauchy inequalities) Let {U} be an open subset of {{\bf C}}, let {f: U \rightarrow {\bf C}} be holomorphic, and let {z_0} be a point in {U}. Let {r>0} be such that the closed disk {\overline{D(z_0,r)} := \{ z \in {\bf C}: |z-z_0| \leq r \}} is contained in {U}. Suppose that there is an {M} such that {|f(z)| \leq M} on the circle {\{ z \in {\bf C}: |z-z_0| = r\}}. Then for any natural number {n}, we have

\displaystyle |f^{(n)}(z_0)| \leq \frac{n!}{r^n} M. \ \ \ \ \ (5)

Note that the {n=0} case of this corollary is compatible with the maximum principle (Exercise 26 of Notes 1).
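Both Theorem 25 and the Cauchy inequalities (5) can be checked numerically in a simple case. The sketch below assumes {f(z) = \sin(z)}, {z_0 = 0.5} and {r = 1}; the helper names `derivative_via_cauchy` and `sup_on_circle` are hypothetical, and the supremum {M} is only sampled at finitely many points of the circle, so it slightly underestimates the true bound.

```python
import cmath
import math

def derivative_via_cauchy(f, z0, r, n, samples=512):
    """n-th derivative at z0 via Theorem 25:
    (n!/(2 pi i)) * integral of f(z)/(z - z0)^(n+1) dz over the circle |z - z0| = r."""
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)
        z = z0 + r * w
        total += f(z) / (z - z0) ** (n + 1) * (1j * r * w * 2 * math.pi / samples)
    return math.factorial(n) * total / (2j * math.pi)

def sup_on_circle(f, z0, r, samples=512):
    """Sampled maximum of |f| on the circle |z - z0| = r (an underestimate of M)."""
    return max(
        abs(f(z0 + r * cmath.exp(2j * math.pi * k / samples)))
        for k in range(samples)
    )

z0, r = 0.5, 1.0
M = sup_on_circle(cmath.sin, z0, r)
# The derivatives of sin cycle with period 4.
exact = [cmath.sin(z0), cmath.cos(z0), -cmath.sin(z0), -cmath.cos(z0)]

rows = []
for n in range(6):
    d = derivative_via_cauchy(cmath.sin, z0, r, n)
    bound = math.factorial(n) * M / r**n
    rows.append((d, bound))
    print(n, abs(d), bound)   # abs(d) should not exceed bound
```

In this example the bound has a good deal of slack, which grows with {n} because of the factor {n!}.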

The right-hand side of (5) has a denominator {r^n} that improves when {r} gets large. In particular we have the remarkable theorem of Liouville:

Theorem 28 (Liouville’s theorem) Let {f: {\bf C} \rightarrow {\bf C}} be an entire function that is bounded. Then {f} is constant.

Proof: By hypothesis, there is a finite {M} such that {|f(z)| \leq M} for all {z \in {\bf C}}. Applying the Cauchy inequalities with {n=1} and any disk {D(z_0,r)}, we conclude that

\displaystyle |f'(z_0)| \leq \frac{M}{r}

for any {z_0 \in {\bf C}} and {r>0}. Sending {r} to infinity, we conclude that {f'} vanishes identically. The claim then follows from the fundamental theorem of calculus. \Box

This theorem displays a strong “rigidity” property for entire functions; if such a function is even vaguely close to being constant (by being bounded), then it almost magically “snaps into place” and actually is forced to be a constant! This is in stark contrast to the real case, in which there are functions such as {\sin(x)} that are differentiable (and even smooth and analytic) on the real line and bounded, but definitely not constant. Note that the complex analogue {\sin(z)} of the sine function is not a counterexample to Liouville’s theorem, since {\sin(z)} becomes quite unbounded away from the real axis (Exercise 16 of Notes 0). This also fits well with the intuition of harmonic functions (and hence also holomorphic functions) being “balanced” in that any convexity in one direction has to be balanced by concavity in the orthogonal direction, and vice versa (as discussed before Theorem 25 of Notes 1): any attempt to create an entire function that is bounded and oscillating in one direction will naturally force that function to become unbounded in the orthogonal direction.

Exercise 29 Let {f: {\bf C} \rightarrow {\bf C}} be an entire function which is of polynomial growth in the sense that there exists a finite quantity {M>0} and some exponent {A \geq 0} such that {|f(z)| \leq M (1+|z|)^A} for all {z \in {\bf C}}. Show that {f} is, in fact, a polynomial.

Now we can prove the fundamental theorem of algebra discussed back in Notes 0.

Theorem 30 (Fundamental theorem of algebra) Let

\displaystyle P(z) = a_n z^n + \dots + a_0

be a polynomial of degree {n \geq 0} for some {a_0,\dots,a_n \in {\bf C}} with {a_n} non-zero. Then there exist complex numbers {z_1,\dots,z_n} such that

\displaystyle P(z) = a_n (z-z_1) \dots (z-z_n).

Proof: This is trivial for {n=0,1}, so suppose inductively that {n \geq 2} and the claim has already been proven for {n-1}. Suppose for contradiction that the equation {P(z)=0} has no roots in the complex plane. Then the function {1/P(z)} is entire. Also, this function goes to zero as {|z| \rightarrow \infty}, and so is bounded on the exterior of any sufficiently large disk; as it is also continuous, it is bounded on any disk and is thus bounded everywhere. By Liouville’s theorem, {1/P(z)} is constant, which implies that {P(z)} is constant, which is absurd (for instance, the {n^{\mathrm{th}}} derivative of {P} is the non-zero function {n! a_n}). Hence {P(z)} has at least one root {z_n}. By the factor theorem (which works in any field, including the complex numbers) we can then write {P(z) = Q(z) (z-z_n)} for some polynomial {Q(z)}, which by the long division algorithm (or by comparing coefficients) must take the form

\displaystyle Q(z) = a_n z^{n-1} + b_{n-2} z^{n-2} + \dots + b_0

for some complex numbers {b_0,\dots,b_{n-2}}. The claim then follows from the induction hypothesis. \Box

The following exercises show that {{\bf C}} can be alternatively defined as an algebraic closure of the reals {{\bf R}} (together with a designated square root {i} of {-1}), and that extending {{\bf R}} using a different irreducible polynomial than {x^2+1} would still give a field isomorphic to the complex numbers, thus supporting the notion that the complex numbers are not an arbitrary extension of the reals, but rather a quite natural and canonical one.

Exercise 31 Let {k} be a field containing {{\bf R}} which is a finite extension of {{\bf R}}, in the sense that {k} is a finite-dimensional vector space over {{\bf R}}. Show that {k} is isomorphic (as a field) to either {{\bf R}} or {{\bf C}}. (Hint: if {\alpha} is some element of {k} not in {{\bf R}}, show that {P(\alpha)=0} for some irreducible polynomial {P} with real coefficients but no real roots. Use this to set up an isomorphism between the field {\tilde k} generated by {{\bf R}} and {\alpha} with {{\bf C}}. If there is an element {\beta} of {k} not in this field {\tilde k}, show that {Q(\beta)=0} for some irreducible polynomial {Q} with coefficients in {\tilde k} and no roots in {\tilde k}, and contradict the fundamental theorem of algebra.)

Exercise 32 A field {k} is said to be algebraically closed if the conclusion of Theorem 30 holds with {{\bf C}} replaced by {k}. Show that any algebraically closed field {k} containing {{\bf R}} contains a subfield that is isomorphic to {{\bf C}} (and which contains {{\bf R}} as a subfield, isomorphic to the copy of {{\bf R}} inside {{\bf C}}). Thus, up to isomorphism, {{\bf C}} is the unique algebraic closure of {{\bf R}}, that is to say a minimal algebraically closed field containing {{\bf R}}.

Another nice consequence of the Cauchy integral formula is a converse to Cauchy’s theorem known as Morera’s theorem.

Theorem 33 (Morera’s theorem) Let {U} be an open subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be a continuous function. Suppose that {f} is conservative in the sense that {\int_\gamma f(z)\ dz = 0} for any closed polygonal path in {U}. Then {f} is holomorphic on {U}.

Proof: By working locally with small balls in {U} we may assume that {U} is a ball (and in particular connected). By Exercise 31 of Notes 2, {f} has an antiderivative {F: U \rightarrow {\bf C}}. By definition, {F} is complex differentiable at every point of {U} (with derivative {f}), so by Corollary 20, {F} is smooth, which implies in particular that {f = F'} is holomorphic on {U} as claimed. \Box

The power of Morera’s theorem comes from the fact that there are no differentiability requirements in the hypotheses on {f}, and yet the conclusion is that {f} is differentiable (and hence smooth, by Corollary 20); it can be viewed as another manifestation of “elliptic regularity”. Here is one basic application of Morera’s theorem:

Theorem 34 (Uniform limit of holomorphic functions is holomorphic) Let {U} be an open subset of {{\bf C}}, and let {f_n: U \rightarrow {\bf C}} be a sequence of holomorphic functions that converge uniformly on compact sets to a limit {f: U \rightarrow {\bf C}}. Then {f} is also holomorphic. Furthermore, for each natural number {k}, the derivatives {f_n^{(k)}: U \rightarrow {\bf C}} also converge uniformly on compact sets to {f^{(k)}: U \rightarrow {\bf C}}. (In particular, {f_n^{(k)}} converges pointwise to {f^{(k)}} on {U}.)

Proof: Again we may work locally and assume that {U} is a ball (and in particular is convex and simply connected). The {f_n} are continuous, hence their locally uniform limit {f} is also continuous. From Corollary 11 (or Theorem 14), we have {\int_\gamma f_n(z)\ dz = 0} on any closed polygonal path in {U}, hence on taking locally uniform limits we also have {\int_\gamma f(z)\ dz = 0} for such paths. The holomorphicity of {f} then follows from Morera’s theorem. The uniform convergence of {f_n^{(k)}} to {f^{(k)}} on compact sets {K} follows from applying Theorem 25 to circular contours {\gamma_{z_0,\varepsilon,\circlearrowleft}} for {z_0 \in K} and {\varepsilon>0} small enough that these contours lie in {U} (note from compactness that one can take {\varepsilon} independent of {z_0}). \Box

Actually, one can weaken the uniform nature of the convergence in Theorem 34 substantially; even the weak limit of holomorphic functions in the space of locally integrable functions on {U} will remain holomorphic. However, we will not need these weaker versions of this theorem here.

Exercise 35 (Riemann’s theorem on removable singularities) Let {U} be an open subset of {{\bf C}}, let {z_0} be a point in {U}, and let {f: U \backslash \{z_0\} \rightarrow {\bf C}} be a holomorphic function on {U \backslash \{z_0\}} which is bounded near {z_0}, in the sense that it is bounded on some punctured disk {D(z_0,r) \backslash \{z_0\}} contained in {U}. Show that {f} has a removable singularity at {z_0}, in the sense that {f} is the restriction to {U \backslash \{z_0\}} of a holomorphic function {\tilde f: U \rightarrow {\bf C}} on {U}. (Hint: show that {f} is conservative near {z_0}, find an antiderivative, extend it to {U}, and use Morera’s theorem to show that this extension is holomorphic. Alternatively, one can also proceed by some version of the Cauchy integral formula.)

Exercise 36 (Integrals of holomorphic functions) Let {U} be an open subset of {{\bf C}}, and let {f: [0,1] \times U \rightarrow {\bf C}} be a continuous function such that, for each {t \in [0,1]}, the function {z \mapsto f(t,z)} is holomorphic on {U}. Show that the function {z \mapsto \int_0^1 f(t,z)\ dt} is also holomorphic on {U}. (Hint: work locally and use Cauchy’s theorem, Morera’s theorem, and Fubini’s theorem.)

Exercise 37 (Schwarz reflection principle) Let {U} be an open subset of {{\bf C}} that is symmetric around the real axis, that is to say {\overline{z} \in U} whenever {z \in U}. Let {f_+: \overline{U_+} \rightarrow {\bf C}} be a continuous function on the set {\overline{U_+} := \{ z \in U: \mathrm{Im}(z) \geq 0\}} that is holomorphic in the open subset {U_+ := \{ z \in U: \mathrm{Im}(z) > 0 \}}. Similarly, let {f_-: \overline{U_-} \rightarrow {\bf C}} be continuous on {\overline{U_-} := \{ z \in U: \mathrm{Im}(z) \leq 0\}} that is holomorphic in the open subset {U_- := \{ z \in U: \mathrm{Im}(z) < 0 \}}. Suppose further that {f_+} and {f_-} agree on {U \cap {\bf R}}. Show that {f_+} and {f_-} are both restrictions of a single holomorphic function {f: U \rightarrow {\bf C}}.

The following two Venn diagrams (or more precisely, Euler diagrams) summarise the relationships between different types of regularity amongst continuous functions over both the reals and the complexes. The first diagram


describes the class of continuous functions on some interval {(a,b)} in the real line; such functions are automatically conservative, but not necessarily differentiable, while differentiable functions are not necessarily smooth, and smooth functions are not necessarily analytic. On the other hand, when considering the class of continuous functions on an open subset {U} of {{\bf C}}, the picture is different:


Now, very few continuous functions are conservative, and only slightly more functions are complex differentiable (and for simply connected domains {U}, these two classes in fact coincide). Whereas in the real case, differentiable functions were considerably less regular than analytic functions, in the complex case the two classes in fact coincide.

— 3. Winding number —

One defect of the current formulation of the Cauchy integral formula (see Theorem 15 and the ensuing discussion) is that the curve {\gamma} involved has to be homotopic (as a closed curve, up to reparameterisation) to a circle {\gamma_{z_0,r,\circlearrowleft}}, or at least to a curve of the form {t \mapsto z_0 + re^{imt}}, {t \in [0,2\pi]} for some integer {m}. We now investigate what happens when this hypothesis is removed. A key notion is that of a winding number.

Definition 38 (Winding number) Let {\gamma} be a closed curve, and let {z_0} be a complex number that is not in the image of {\gamma}. The winding number {W_\gamma(z_0)} of {\gamma} around {z_0} is defined by the integral

\displaystyle W_\gamma(z_0) := \frac{1}{2\pi i} \int_\gamma \frac{dz}{z-z_0}. \ \ \ \ \ (6)

Here we again take advantage of the ability to integrate holomorphic functions on curves that are not necessarily rectifiable. Clearly the winding number is unchanged if we replace {\gamma} by any equivalent curve, and if one replaces the curve {\gamma} with its reversal {-\gamma}, then the winding number is similarly negated. In some texts, the winding number is also referred to as the index or degree.

From the Cauchy integral formula we see that

\displaystyle W_{\gamma}(z_0) = 1

when {\gamma} is homotopic in {{\bf C} \backslash \{z_0\}} (as a closed curve, up to reparameterisation) to a circle {\gamma_{z_0,r,\circlearrowleft}}, and more generally that

\displaystyle W_{\gamma}(z_0) = m

if {\gamma} is homotopic in {{\bf C} \backslash \{z_0\}} (as a closed curve, up to reparameterisation) to a curve of the form {t \mapsto z_0 + r e^{imt}}, {t \in [0,2\pi]}. Thus we see, intuitively at least, that {W_\gamma(z_0)} measures the number of times {\gamma} winds counterclockwise about {z_0}, which explains the term “winding number”.
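As a numerical sanity check of this interpretation, here is a minimal Python sketch (an illustration only, not part of the development in these notes). It replaces the curve by a closed polygonal path through sample points and uses the fact that, on a straight segment from {p} to {q} that avoids {z_0}, the integral of {\frac{dz}{z-z_0}} equals the principal logarithm of {(q-z_0)/(p-z_0)}. The names `winding_number` and `sampled_circle`, and the choice of 400 sample points, are assumptions made for this example.

```python
import cmath
import math

def winding_number(vertices, z0):
    """Winding number about z0 of the closed polygon through `vertices`.

    Each edge contributes the principal logarithm of (q - z0) / (p - z0),
    which is the exact integral of dz / (z - z0) along that edge.
    """
    total = 0j
    n = len(vertices)
    for k in range(n):
        p = vertices[k] - z0
        q = vertices[(k + 1) % n] - z0
        total += cmath.log(q / p)
    return total / (2j * math.pi)

def sampled_circle(z0, r, m, samples=400):
    """Sample points of t -> z0 + r e^{imt}, t in [0, 2 pi)."""
    return [z0 + r * cmath.exp(1j * m * 2 * math.pi * k / samples)
            for k in range(samples)]

z0 = 1 + 2j
for m in (-2, 0, 1, 3):
    w = winding_number(sampled_circle(z0, 0.5, m), z0)
    print(m, round(w.real), abs(w.imag) < 1e-9)
```

For each {m} in the loop, the computed value should agree with {m} up to rounding error, with negligible imaginary part.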

We can now state a more general form of the Cauchy integral formula:

Theorem 39 (General Cauchy integral formula) Let {U} be a simply connected subset of {{\bf C}}, let {\gamma} be a closed curve in {U}, and let {f: U \rightarrow {\bf C}} be holomorphic. Then for any {z_0} that lies in {U} but not in the image of {\gamma}, we have

\displaystyle \frac{1}{2\pi i} \int_\gamma \frac{f(z)}{z-z_0}\ dz = W_\gamma(z_0) f(z_0).

Proof: By Corollary 22 (or Exercise 35), we have {f(z) - f(z_0) = (z-z_0) g(z)} for some holomorphic function {g: U \rightarrow {\bf C}}. Hence by Theorem 14 we have

\displaystyle \int_\gamma \frac{f(z)-f(z_0)}{z-z_0}\ dz = \int_\gamma g(z)\ dz = 0.

The claim then follows from (6). \Box
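The formula can be tested numerically on a curve that winds twice around {z_0}. The sketch below is only an illustration under stated assumptions: it takes {U = {\bf C}}, {f = \exp}, the curve {t \mapsto 2e^{2it}} on {[0,2\pi]}, and the point {z_0 = 0.5 + 0.3i}, and it approximates contour integrals by equally spaced Riemann sums in the parameter {t}. The helper name `contour_integral` is hypothetical.

```python
import cmath
import math

def contour_integral(g, gamma, dgamma, samples=4000):
    """Approximate the integral of g along t -> gamma(t), t in [0, 2 pi],
    by an equally spaced Riemann sum of g(gamma(t)) * gamma'(t)."""
    h = 2 * math.pi / samples
    return sum(g(gamma(k * h)) * dgamma(k * h) for k in range(samples)) * h

m = 2                                             # the curve winds m times around the origin
gamma = lambda t: 2 * cmath.exp(1j * m * t)       # closed curve in U = C
dgamma = lambda t: 2j * m * cmath.exp(1j * m * t) # its derivative
z0 = 0.5 + 0.3j                                   # a point not on the curve

# Left-hand side of the formula with f = exp.
lhs = contour_integral(lambda z: cmath.exp(z) / (z - z0), gamma, dgamma) / (2j * math.pi)
# Winding number computed from definition (6).
winding = contour_integral(lambda z: 1 / (z - z0), gamma, dgamma) / (2j * math.pi)
rhs = winding * cmath.exp(z0)

print(abs(winding - m), abs(lhs - rhs))
```

Both printed quantities should be close to zero, reflecting that the left-hand side picks up the factor {W_\gamma(z_0) = 2}.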

Exercise 40 (Higher order general Cauchy integral formula) With {U, \gamma, f, z_0} as in the above theorem, show that

\displaystyle W_\gamma(z_0) f^{(n)}(z_0) = \frac{n!}{2\pi i} \int_\gamma \frac{f(z)}{(z-z_0)^{n+1}}\ dz

for every natural number {n}. (Hint: instead of approximating {f(z)} by {f(z_0)}, use a partial Taylor expansion of {f}. Many of the terms that arise can be handled using the fundamental theorem of calculus. Alternatively, one can use differentiation under the integral sign and Lemma 44 below.)

To use Theorem 39, it becomes of interest to obtain more properties of the winding number. From Cauchy’s theorem we have

Lemma 41 (Homotopy invariance) Let {z_0 \in {\bf C}}, and let {\gamma_0, \gamma_1} be two closed curves in {{\bf C} \backslash \{z_0\}} that are homotopic as closed curves up to reparameterisation in {{\bf C} \backslash \{z_0\}}. Then {W_{\gamma_0}(z_0) = W_{\gamma_1}(z_0)}.

The following specific corollary of this lemma will be useful for us.

Corollary 42 (Rouche’s theorem for winding number) Let {\gamma_0: [a,b] \rightarrow {\bf C}} be a closed curve, and let {z_0} lie outside of the image of {\gamma_0}. Let {\gamma_1: [a,b] \rightarrow {\bf C}} be a closed curve such that

\displaystyle |\gamma_1(t) -\gamma_0(t)| < |\gamma_0(t) - z_0| \ \ \ \ \ (7)

for all {t \in [a,b]}. Then {W_{\gamma_0}(z_0) = W_{\gamma_1}(z_0)}.

Proof: The map {\gamma: [0,1] \times [a,b] \rightarrow {\bf C}} defined by {\gamma(s,t) := (1-s) \gamma_0(t) + s \gamma_1(t)} is a homotopy from {\gamma_0} to {\gamma_1}; by (7) and the triangle inequality, it avoids {z_0}. The claim then follows from Lemma 41. \Box
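To see the role of hypothesis (7), one can experiment numerically. The following sketch is an illustration with assumed example curves: {\gamma_0} is the unit circle, {z_0 = 0}, and the perturbations {\gamma_0(t) + c e^{7it}} are tried with {c = 0.9} (so that (7) holds) and {c = 1.5} (so that (7) fails). Winding numbers are computed on polygonal approximations by summing principal logarithms; the function names are hypothetical.

```python
import cmath
import math

def winding_number(vertices, z0):
    """Integer winding number about z0 of the closed polygon through `vertices`."""
    total = 0j
    n = len(vertices)
    for k in range(n):
        total += cmath.log((vertices[(k + 1) % n] - z0) / (vertices[k] - z0))
    return round((total / (2j * math.pi)).real)

def sample(curve, samples=4000):
    """Sample a curve parameterised on [0, 2 pi) at equally spaced times."""
    return [curve(2 * math.pi * k / samples) for k in range(samples)]

z0 = 0j
gamma0 = lambda t: cmath.exp(1j * t)
small = lambda t: gamma0(t) + 0.9 * cmath.exp(7j * t)  # |gamma_1 - gamma_0| = 0.9 < 1: (7) holds
large = lambda t: gamma0(t) + 1.5 * cmath.exp(7j * t)  # |gamma_1 - gamma_0| = 1.5 > 1: (7) fails

w0 = winding_number(sample(gamma0), z0)
w_small = winding_number(sample(small), z0)
w_large = winding_number(sample(large), z0)
print(w0, w_small, w_large)
```

The small perturbation keeps the winding number of {\gamma_0}, while the large one is dominated by the {e^{7it}} term and winds seven times around the origin, so the conclusion of the corollary can fail without (7).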

Corollary 42 can be used to compute the winding number near infinity as follows. Given a curve {\gamma: [a,b] \rightarrow {\bf C}} and a point {z_0}, define the distance

\displaystyle \mathrm{dist}(z_0,\gamma) := \inf_{t \in [a,b]} |z_0-\gamma(t)|

and the diameter

\displaystyle \mathrm{diam}(\gamma) := \sup_{t,t' \in [a,b]} |\gamma(t) - \gamma(t')|.

Corollary 43 (Vanishing near infinity) Let {\gamma} be a closed curve. Then {W_\gamma(z_0) = 0} whenever {z_0 \in {\bf C}} is such that {\mathrm{dist}(z_0,\gamma) > \mathrm{diam}(\gamma)}.

Proof: Apply Corollary 42 with {\gamma_0} equal to {\gamma} and {\gamma_1} equal to any point in the image of {\gamma_0}. \Box

Corollary 42 also gives local constancy of the winding number:

Lemma 44 (Local constancy in {z_0}) Let {\gamma} be a closed curve. Then {W_\gamma} is locally constant. That is to say, if {z_0} does not lie in the image of {\gamma}, then there exists a disk {D(z_0,r)} outside of the image of {\gamma} such that {W_\gamma(z) = W_\gamma(z_0)} for all {z \in D(z_0,r)}.

Proof: From Corollary 42, we see that if {r} is small enough and {h \in D(0,r)}, then

\displaystyle W_{\gamma-h}(z_0) = W_\gamma(z_0),

where {\gamma-h: t \mapsto \gamma(t)-h} is the translation of {\gamma} by {h}. But by a translation change of variables we see that

\displaystyle W_{\gamma-h}(z_0) = W_\gamma(z_0+h)

and the claim follows. \Box

Exercise 45 Give an alternate proof of Lemma 44 based on differentiation under the integral sign and using the fact that {\frac{1}{(z-z_0)^2}} has an antiderivative away from {z_0}.

As confirmation of the interpretation of {W_\gamma(z_0)} as a winding number, we can now establish integrality:

Lemma 46 (Integrality) Let {\gamma} be a closed curve, and let {z_0} lie outside of the image of {\gamma}. Then {W_\gamma(z_0)} is an integer.

Proof: By Corollary 42 we may assume without loss of generality that {\gamma} is a closed polygonal path. By partitioning a polygon into triangles (and using Lemma 44 to move {z_0} slightly out of the way of any new edges formed by this partition) it suffices to verify this for triangular {\gamma}. But this follows from the Cauchy integral formula (if {z_0} is in the interior of the triangle) or Cauchy’s theorem (if {z_0} is in the exterior). \Box

Exercise 47 Give another proof of Lemma 46 by restricting again to closed polygonal paths {\gamma: [a,b] \rightarrow {\bf C}}, and showing that the function {t \mapsto \exp( \int_a^t \frac{\gamma'(s)}{\gamma(s)-z_0}\ ds ) / (\gamma(t) - z_0)} is constant on {[a,b]} by establishing that it is continuous and has vanishing derivative at all but finitely many points. (Note that {\gamma'(s)} exists for all but finitely many {s}, so the integral here can be well defined.)

We now come to a fundamental and well known theorem about simple closed curves, namely the Jordan curve theorem.

Theorem 48 (Jordan curve theorem) Let {\gamma: [a,b] \rightarrow {\bf C}} be a non-trivial simple closed curve. Then there is an orientation {\sigma \in \{-1,+1\}} such that the complex plane {{\bf C}} is partitioned into the boundary region {\gamma([a,b])}, the exterior region

\displaystyle \{ z_0 \not \in \gamma([a,b]): W_\gamma(z_0) = 0 \}, \ \ \ \ \ (8)

and the interior region

\displaystyle \{ z_0 \not \in \gamma([a,b]): W_\gamma(z_0) = \sigma \}. \ \ \ \ \ (9)

Furthermore the exterior region is connected and unbounded, and the interior region is connected, non-empty and bounded. Finally, if {U} is any open set that contains {\gamma} and its interior, then {\gamma} is contractible to a point in {U}.

This theorem is relatively easy to prove for “nice” curves, such as polygons, but is surprisingly delicate to prove in general. Some idea of the subtlety involved can be seen by considering pathological examples such as the lakes of Wada, which are three disjoint open connected subsets of {{\bf C}} which all happen to have exactly the same boundary! This does not contradict the Jordan curve theorem, because the boundary set in this example is not given by a simple closed curve. However it does indicate that one has to carefully use the hypothesis of being a simple closed curve in order to prove Theorem 48. Another indication of the difficulty of the theorem is its global nature; the claim does not hold if one replaces the complex plane {{\bf C}} by other surfaces such as the torus, the projective plane, or the Klein bottle, so the global topological structure of the complex plane must come into play at some point. For the sake of completeness, we give a proof of this theorem in an appendix to these notes.

If the quantity {\sigma} in the above theorem is equal to {+1}, we say that the simple closed curve {\gamma} has an anticlockwise orientation; if instead {\sigma=-1} we say that {\gamma} has a clockwise orientation. Thus for instance, {\gamma_{z_0,r,\circlearrowleft}} has an anticlockwise orientation, while its reversal {-\gamma_{z_0,r,\circlearrowleft}} has the clockwise orientation.

Exercise 49 Let {\gamma_1}, {\gamma_2} be non-trivial simple closed curves.

  • (i) If {\gamma_1,\gamma_2} have disjoint image, show that {\gamma_2} either lies entirely in the interior of {\gamma_1}, or in the exterior.
  • (ii) If {\gamma_2} avoids the exterior of {\gamma_1}, show that the interior of {\gamma_2} is contained in the interior of {\gamma_1}, and the exterior of {\gamma_2} contains the exterior of {\gamma_1}.
  • (iii) If {\gamma_2} avoids the interior of {\gamma_1}, and {\gamma_1} avoids the interior of {\gamma_2}, and the two curves have disjoint images, show that the interior of {\gamma_2} is contained in the exterior of {\gamma_1}, and the exterior of {\gamma_2} contains the interior of {\gamma_1}.

(This is all visually “obvious” as soon as one draws a picture, but the challenge is to provide a rigorous proof. One should of course use the Jordan curve theorem extensively to do so. You will not need to use the final part of the Jordan curve theorem concerning contractibility.)

Exercise 50 Let {\gamma} be a non-trivial simple closed curve. Show that the interior of {\gamma} is simply connected. (Hint: first show that any simple closed polygonal path in the interior of {\gamma} is contractible to a point in the interior; then extend this to closed polygonal paths that are not necessarily simple by an induction on the number of edges in the path; then handle general closed curves.)

Remark 51 There is a refinement of the Jordan curve theorem known as the Jordan-Schoenflies theorem, that asserts that for every non-trivial simple closed curve {\gamma} there is a homeomorphism {\phi: {\bf C} \rightarrow {\bf C}} that maps {\gamma} to the unit circle {S^1}, the interior of {\gamma} to the unit disk {D(0,1)}, and the exterior to the exterior region {\{ z \in {\bf C}: |z| > 1 \}}. The proof of this improved version of the Jordan curve theorem will have to wait until we have the Riemann mapping theorem (as well as a refinement of this theorem due to Carathéodory). The Jordan-Schoenflies theorem may seem self-evident, but it is worth pointing out that the analogous result in three dimensions fails without additional regularity assumptions on the boundary surface, thanks to the counterexample of the Alexander horned sphere.

From the Jordan curve theorem we have yet another form of the Cauchy theorem and Cauchy integral formula:

Theorem 52 (Cauchy’s theorem and Cauchy integral formula for simple curves) Let {\gamma} be a simple closed curve, and let {U} be an open set containing {\gamma} and its interior. Let {f: U \rightarrow {\bf C}} be a holomorphic function.

  • (i) (Cauchy’s theorem) One has {\int_\gamma f(z)\ dz = 0}.
  • (ii) (Cauchy integral formula) If {z_0 \in U} lies outside of the image of {\gamma}, then the expression {\frac{1}{2\pi i} \int_\gamma \frac{f(z)}{z-z_0}\ dz} vanishes if {z_0} lies in the exterior of {\gamma}, equals {f(z_0)} if {z_0} lies in the interior of {\gamma} and {\gamma} is oriented anti-clockwise, and equals {-f(z_0)} if {z_0} lies in the interior of {\gamma} and {\gamma} is oriented clockwise.

Exercise 53 Let {P(z) = a_n z^n + \dots + a_0} be a polynomial with complex coefficients {a_0,\dots,a_n} and {a_n \neq 0}. For any {R>0}, let {\gamma_R: [0,2\pi] \rightarrow {\bf C}} denote the closed contour {\gamma_R(t) := P(R e^{it})}.

  • (i) Show that if {R} is sufficiently large, then {W_{\gamma_R}(0) = n}.
  • (ii) Show that if {P} does not vanish on the closed disk {\overline{D(0,R)}}, then {W_{\gamma_R}(0)=0}.
  • (iii) Use these facts to give an alternate proof of the fundamental theorem of algebra that does not invoke Liouville’s theorem.
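Parts (i) and (ii) can be explored numerically before attempting a proof. The sketch below is an illustration only: the cubic {P} with zeroes at {1}, {-2} and {3i} is a hypothetical example, the radii {R=10} and {R=0.5} are arbitrary choices on either side of all the zeroes, and the winding number is computed on a polygonal approximation of {\gamma_R}.

```python
import cmath
import math

def winding_number(vertices, z0=0j):
    """Integer winding number about z0 of the closed polygon through `vertices`."""
    total = 0j
    n = len(vertices)
    for k in range(n):
        total += cmath.log((vertices[(k + 1) % n] - z0) / (vertices[k] - z0))
    return round((total / (2j * math.pi)).real)

def P(z):
    # Hypothetical example polynomial of degree n = 3 with zeroes at 1, -2 and 3i.
    return (z - 1) * (z + 2) * (z - 3j)

def gamma_R(R, samples=4000):
    """Sample points of the closed contour t -> P(R e^{it}), t in [0, 2 pi)."""
    return [P(R * cmath.exp(2j * math.pi * k / samples)) for k in range(samples)]

w_large = winding_number(gamma_R(10.0))  # R larger than the modulus of every zero
w_small = winding_number(gamma_R(0.5))   # P has no zero in the closed disk of radius 0.5
print(w_large, w_small)
```

In this example the large radius gives winding number {3}, the degree of {P}, and the small radius gives {0}, consistent with (i) and (ii).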

In the case when the closed curve {\gamma} is a contour (which includes of course the case of closed polygonal paths), one can describe the interior and exterior regions, as well as the winding number, more explicitly.

Exercise 54 (Local structure of interior and exterior) Let {\gamma = \gamma_1 + \dots + \gamma_n: [a,b] \rightarrow {\bf C}} be a simple closed contour formed by concatenating smooth curves {\gamma_1,\dots,\gamma_n} together. Let {z_0} be an interior point of one of these curves {\gamma_i: [a_i,b_i] \rightarrow {\bf C}}, thus {z_0 = \gamma_i(t_i)} for some {a_i < t_i < b_i}. Write {\gamma'_i(t_i) = r e^{i\theta}} for some {r > 0} and {\theta \in {\bf R}}. Recall from Exercise 24 of Notes 2 that for sufficiently small {\varepsilon}, the set {\gamma([a,b]) \cap D(z_0,\varepsilon)} can be expressed as a graph of the form

\displaystyle  \gamma([a,b]) \cap D(z_0,\varepsilon) = \{ z_0 + e^{i\theta} (s + i f(s)): s \in I_\varepsilon \}

for some interval {I_\varepsilon} and some continuously differentiable function {f: I_\varepsilon \rightarrow {\bf R}} with {f(0) =f'(0) = 0}. Show that if {\gamma} is oriented anticlockwise, and {\varepsilon} is sufficiently small then the interior of {\gamma} contains all points in {D(z_0,\varepsilon)} of the form {z_0 + e^{i\theta} (s + i (f(s)+u))} for some {s \in I_\varepsilon} and {u>0}, and the exterior of {\gamma} contains all points in {D(z_0,\varepsilon)} of the form {z_0 + e^{i\theta} (s + i (f(s)+u))} for some {s \in I_\varepsilon} and {u<0}. Similarly, if {\gamma} is oriented clockwise, show that the same claims hold with the roles of {u>0} and {u<0} swapped.

Exercise 55 (Alexander numbering rule) Let {\gamma = \gamma_1 + \dots + \gamma_n: [a,b] \rightarrow {\bf C}} be a simple closed contour oriented anticlockwise formed by concatenating smooth curves {\gamma_1,\dots,\gamma_n} together. Let {\sigma = \sigma_1 + \dots + \sigma_m: [c,d] \rightarrow {\bf C}} be a contour formed by concatenating smooth curves {\sigma_1,\dots,\sigma_m}, with initial point {z_0} and final point {z_1}. Assume that there are only finitely many points {w_1,\dots,w_k} where the images of {\gamma} and of {\sigma} intersect. Furthermore, assume at each of the points {w_l}, {l=1,\dots,k}, that one has a “smooth simple transverse intersection” in the sense that the following axioms are obeyed:

  • (i) {w_l} lies in the interior of one of the smooth curves {\gamma_i: [a_i,b_i] \rightarrow {\bf C}} that make up {\gamma}, thus {w_l = \gamma_i(t_i)} for some {a_i < t_i < b_i}.
  • (ii) {w_l} lies in the interior of one of the smooth curves {\sigma_j: [c_j,d_j] \rightarrow {\bf C}} that make up {\sigma}, thus {w_l = \sigma_j(s_j)} for some {c_j < s_j < d_j}.
  • (iii) {w_l} is only traversed once by {\sigma}, thus there do not exist {t \neq t'} in {[c,d]} such that {\sigma(t)=\sigma(t')=w_l}.
  • (iv) The derivatives {\gamma'_i(t_i)} and {\sigma'_j(s_j)} are linearly independent over {{\bf R}}. In other words, we either have a crossing from the right in which {\sigma'_j(s_j) = \lambda e^{i\theta} \gamma'_i(t_i)} for some {\lambda > 0} and {0 < \theta < \pi}, or else we have a crossing from the left in which {\sigma'_j(s_j) = \lambda e^{i\theta} \gamma'_i(t_i)} for some {\lambda > 0} and {-\pi < \theta < 0}.

Show that {W_\gamma(z_0) - W_\gamma(z_1)} is equal to the total number of crossings from the left, minus the total number of crossings from the right.
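The rule can be checked on a small example before proving it. The sketch below is an illustration under simplifying assumptions: {\gamma} is the anticlockwise unit square with vertices {0, 1, 1+i, i}, and {\sigma} is a single straight segment from {z_0} to {z_1} that meets the square only at interior points of edges. A crossing is classified by the sign of {\mathrm{Im}(\overline{\gamma'} \sigma')}, which is the sign of {\sin \theta} in axiom (iv). The function names are hypothetical.

```python
import cmath
import math

def winding_number(vertices, z0):
    """Integer winding number about z0 of the closed polygon through `vertices`."""
    total = 0j
    n = len(vertices)
    for k in range(n):
        total += cmath.log((vertices[(k + 1) % n] - z0) / (vertices[k] - z0))
    return round((total / (2j * math.pi)).real)

def cr(a, b):
    """Planar cross product of a and b viewed as vectors: Im(conj(a) * b)."""
    return (a.conjugate() * b).imag

def signed_crossings(polygon, z_start, z_end):
    """Count (left, right) crossings of the segment z_start -> z_end with the polygon edges."""
    left = right = 0
    d = z_end - z_start                    # direction sigma' of the segment
    n = len(polygon)
    for k in range(n):
        p, q = polygon[k], polygon[(k + 1) % n]
        e = q - p                          # direction gamma' along this edge
        c = cr(e, d)                       # same sign as sin(theta) in axiom (iv)
        if c == 0:
            continue                       # parallel directions: no transverse crossing
        w = z_start - p
        u = cr(w, d) / c                   # crossing point is p + u * e
        v = cr(w, e) / c                   # crossing point is z_start + v * d
        if 0 < u < 1 and 0 < v < 1:
            if c > 0:
                right += 1                 # 0 < theta < pi: crossing from the right
            else:
                left += 1                  # -pi < theta < 0: crossing from the left
    return left, right

square = [0j, 1 + 0j, 1 + 1j, 1j]          # anticlockwise unit square
for z0, z1 in [(0.5 + 0.5j, 2 + 0.7j), (-1 + 0.3j, 2 + 0.6j), (2 + 0.7j, 0.5 + 0.5j)]:
    left, right = signed_crossings(square, z0, z1)
    print(winding_number(square, z0) - winding_number(square, z1), left - right)
```

In each of the three test segments the two printed numbers should agree, as the exercise predicts.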

Exercise 56 Let {U} be a non-empty connected open subset of {{\bf C}}. Show that {U} is simply connected if and only if every holomorphic function on {U} is conservative.

— 4. Appendix: proof of the Jordan curve theorem (optional) —

We now prove the Jordan curve theorem. We begin with a variant of Corollary 42 in which the curve {\gamma_1} is only required to have image close to the image of {\gamma_0}, rather than be close to {\gamma_0} in a pointwise (and uniform) sense. For any curve {\gamma} and any {r>0}, let {N_r(\gamma) := \{ z \in {\bf C}: \mathrm{dist}(z,\gamma) < r \}} denote the {r}-neighbourhood of {\gamma}.

Proposition 57 Let {\gamma_0} be a non-trivial simple closed curve, and let {\delta>0}. Suppose that {\varepsilon>0} is sufficiently small depending on {\gamma_0} and {\delta}. Let {\gamma_1} be a closed curve (not necessarily simple) whose image lies in {N_{\varepsilon}(\gamma_0)}. Then {\gamma_1} is homotopic (as a closed curve, up to reparameterisation) to {m\gamma_0} in {N_\delta(\gamma_0)} for some integer {m}, where {m\gamma_0} is defined as the concatenation of {m} copies of {\gamma_0} if {m} is positive, the trivial curve at the initial point of {\gamma_0} if {m} is zero, and the concatenation of {-m} copies of {-\gamma_0} if {m} is negative. In particular, from Lemma 41 one has

\displaystyle W_{\gamma_1}(z_0) = m W_{\gamma_0}(z_0)

for all {z_0 \in {\bf C} \backslash N_{\delta}(\gamma_0)}.


Proof: After reparameterisation, we can take {\gamma_0: [0,1] \rightarrow {\bf C}} to have domain on the unit interval {[0,1]}, and then by periodic extension we can view {\gamma_0: {\bf R} \rightarrow {\bf C}} as a continuous {1}-periodic function on {{\bf R}}.

As {[0,1]} is compact, {\gamma_0} is uniformly continuous on {[0,1]}, and hence also on {{\bf R}}. In particular, there exists {0 < \kappa < \frac{1}{10}} such that

\displaystyle |\gamma_0(t_1) - \gamma_0(t_2)| \leq \frac{\delta}{2} \ \ \ \ \ (10)

whenever {t_1,t_2 \in {\bf R}} are such that {|t_1-t_2| \leq \kappa}.

Fix this {\kappa}. Observe that the function {(t_1,t_2) \mapsto |\gamma_0(t_1)-\gamma_0(t_2)|} is continuous and nowhere vanishing on the region {\{ (t_1,t_2) \in [0,2] \times [0,2]: \kappa \leq |t_1-t_2| \leq 1-\kappa \}}. Thus, if {\varepsilon} is small enough depending on {\gamma_0,\kappa}, we have the lower bound

\displaystyle |\gamma_0(t_1)-\gamma_0(t_2)| \geq 3\varepsilon

whenever {t_1,t_2 \in [0,2]} are such that {\kappa \leq |t_1-t_2| \leq 1-\kappa}. Using the {1}-periodicity of {\gamma_0}, we conclude that if {t_1,t_2 \in {\bf R}} are such that

\displaystyle |\gamma_0(t_1)-\gamma_0(t_2)| < 3\varepsilon

then there must be an integer {m_{t_1,t_2}} such that

\displaystyle |t_2 - (t_1+m_{t_1,t_2})| < \kappa. \ \ \ \ \ (11)

Note that this integer {m_{t_1,t_2}} is uniquely determined by {t_1} and {t_2}.

Let {[a,b]} be the domain of {\gamma_1: [a,b] \rightarrow {\bf C}}. By the uniform continuity of {\gamma_1}, we can find a partition {a = s_0 < \dots < s_n = b} of {[a,b]} such that

\displaystyle |\gamma_1(s) - \gamma_1(s')| < \varepsilon \ \ \ \ \ (12)

for all {1 \leq j \leq n} and {s_{j-1} \leq s, s' \leq s_j}. Since the image of {\gamma_1} lies in {N_{\varepsilon}(\gamma_0)}, we can find, for each {0 \leq j \leq n}, a real number {t_j} such that

\displaystyle |\gamma_1(s_j) - \gamma_0(t_j)| < \varepsilon. \ \ \ \ \ (13)

Since {\gamma_1} is closed, we may arrange matters so that

\displaystyle \gamma_0(t_0) = \gamma_0(t_n). \ \ \ \ \ (14)

From the triangle inequality and (12), (13) we have

\displaystyle |\gamma_0(t_j) - \gamma_0(t_{j-1})| < 3\varepsilon.

Using (11), we conclude that for each {1 \leq j \leq n}, there is an integer {m_j} such that

\displaystyle |t_j - (t_{j-1}+m_j)| < \kappa.

As {\gamma_0} is {1}-periodic, we have the freedom to shift each of the {t_j} by an arbitrary integer, and by doing this for {t_1, \dots, t_n} in turn, we may assume without loss of generality that all the {m_j} vanish, thus

\displaystyle |t_j - t_{j-1}| < \kappa \ \ \ \ \ (15)

for all {j=1,\dots,n}. In particular, from (10) we have

\displaystyle |\gamma_0(t) - \gamma_0(t')| < \frac{\delta}{2} \ \ \ \ \ (16)

whenever {t_{j-1} \leq t, t' \leq t_j}. Also, as {\gamma_0} is simple, we have from (14) that

\displaystyle t_n = t_0 + m

for some integer {m}. (Note that by enforcing (15), we no longer have the freedom to individually move {t_0} or {t_n} by an integer, so we cannot assume without loss of generality that {m} vanishes.)

For {j=1,\dots,n}, let {\gamma_{0,j}: [j-1,j] \rightarrow {\bf C}} denote the curve

\displaystyle \gamma_{0,j}(t) := \gamma_0( t_{j-1} + (t-j+1) (t_j - t_{j-1}) )

from {\gamma_0(t_{j-1})} to {\gamma_0(t_j)}; similarly let {\gamma_{1,j}: [j-1,j] \rightarrow {\bf C}} denote the curve

\displaystyle \gamma_{1,j}(t) := \gamma_1( s_{j-1} + (t-j+1) (s_j - s_{j-1}) )

from {\gamma_1(s_{j-1})} to {\gamma_1(s_j)}. Observe from (16), (12), (13) that for each {j=1,\dots,n}, the images of {\gamma_{0,j}} and {\gamma_{1,j}} both lie in {D( \gamma_0(t_j), \frac{\delta}{2} + 3\varepsilon)}, which will lie in {N_\delta(\gamma_0([a,b]))} if {\varepsilon} is small enough. We can thus form a homotopy {\gamma: [0,1] \times [0,n] \rightarrow N_\delta(\gamma_0([a,b]))} from {\gamma_{0,1} + \dots + \gamma_{0,n}} to {\gamma_{1,1} + \dots + \gamma_{1,n}} by defining

\displaystyle \gamma( s, t ) = (1-s) \gamma_{0,j}(t) + s \gamma_{1,j}(t)

for all {1 \leq j \leq n} and {j-1 \leq t \leq j}. Thus {\gamma_{0,1} + \dots + \gamma_{0,n}} and {\gamma_{1,1} + \dots + \gamma_{1,n}} are homotopic as closed curves in {N_\delta(\gamma_0([a,b]))}. But by Exercise 3, {\gamma_{0,1} + \dots + \gamma_{0,n}} is homotopic up to reparameterisation as closed curves to {m \gamma_0} in {N_\delta(\gamma_0([a,b]))}, and {\gamma_{1,1} + \dots + \gamma_{1,n}} is similarly homotopic up to reparameterisation as closed curves to {\gamma_1} in {N_\delta(\gamma_0([a,b]))}, and the claim follows. \Box

We can now prove Theorem 48. We first verify the claim in the easy (and visually intuitive) case that {\gamma} is a non-trivial simple closed polygonal curve. Removing the polygon {\gamma([a,b])} from {{\bf C}} leaves an open set, which we may decompose into connected components as per Exercise 34 of Notes 2. On each of these components, the winding number {W_\gamma} is constant. Since each component has a non-empty boundary that is contained in {\gamma([a,b])}, this constant value of {W_\gamma} must also be attained arbitrarily close to {\gamma([a,b])}.

Now, a routine application of the Cauchy integral formula (see Exercise 59) shows that as {z_0} crosses one of the edges of the polygon {\gamma([a,b])}, the winding number {W_\gamma(z_0)} is shifted by either {+1} or {-1}. Hence at each point {z} on {\gamma([a,b])}, the winding number will take two values {\{ k, k+1\}} in a sufficiently small neighbourhood of {z} (excluding {\gamma([a,b])}). By a continuity argument, the integer {k} is independent of {z}. On the other hand, from Corollary 43 the winding number must be able to attain the value of zero. Thus we have {\{k,k+1\} = \{0,\sigma\}} for some {\sigma = \pm 1}. Dividing a small neighbourhood of {\gamma([a,b])} (excluding {\gamma([a,b])} itself) into the regions where the winding numbers are {0} or {\sigma}, a further continuity argument shows that each of these regions lies in a single connected component. Thus there are only two connected components, one where the winding number is zero and one where the winding number is {\sigma}. From Corollary 43 the latter component is bounded, hence the former is unbounded, and the claim follows.

Now we handle the significantly more difficult case when {\gamma} is just a non-trivial simple closed curve. As one may expect, the strategy will be to approximate this curve by a polygonal path, but some care has to be taken when performing a limit, in order to prevent the interior region from collapsing into nothingness, or becoming disconnected, in the limit.

The first challenge is to ensure that there is at least one point {z_0} outside of {\gamma([a,b])} at which {W_\gamma(z_0)} is non-zero. This is actually rather tricky; we will achieve this by a parity argument (loosely inspired by a nonstandard version of this argument from this paper of Kanovei and Reeken). Clearly, {\gamma([a,b])} contains at least two points; by an appropriate rotation, translation, and dilation we may assume that {\gamma([a,b])} contains the points {+i} and {-i}, with {i} being both the initial point and the final point. Then we can decompose {\gamma = \gamma_1 + \gamma_2}, where {\gamma_1: [a,c] \rightarrow {\bf C}} is a curve from {i} to {-i}, and {\gamma_2: [c,b] \rightarrow {\bf C}} is a curve from {-i} to {i}.


Observe from the simplicity of {\gamma} that {|\gamma_1(t_1) - \gamma_2(t_2)| > 0} whenever {t_1 \in [a,c]} and {t_2 \in [c,b]} are such that

\displaystyle |\mathrm{Im}(\gamma_1(t_1))|, |\mathrm{Im}(\gamma_2(t_2))| \leq \frac{1}{2}. \ \ \ \ \ (17)

Thus, by compactness, there exists {0 < \delta < \frac{1}{10}} such that one has the lower bound

\displaystyle |\gamma_1(t_1) - \gamma_2(t_2)| \geq \delta \ \ \ \ \ (18)

separating {\gamma_1} from {\gamma_2} whenever {t_1 \in [a,c]}, {t_2 \in [c,b]} are such that (17) holds.

Next, for any natural number {N}, we may approximate {\gamma: [a,b] \rightarrow {\bf C}} by a polygonal closed path {\gamma^{(N)}: [a,b] \rightarrow {\bf C}} with

\displaystyle |\gamma^{(N)}(t) - \gamma(t)| < \frac{1}{N} \ \ \ \ \ (19)

for all {t \in [a,b]}. Although it is not particularly necessary, we can ensure that {\gamma^{(N)}(a) = \gamma(a) = i} and {\gamma^{(N)}(c) = \gamma(c) = -i}. By perturbing the edges of the polygonal path {\gamma^{(N)}} slightly, we may assume that none of the vertices of {\gamma^{(N)}} lie on the real axis, and that none of the self-crossings of {\gamma^{(N)}} (if any exist) lie on the real axis; thus, whenever {\gamma^{(N)}} crosses the real axis, it does so at an interior point of an edge, with no other edge of {\gamma^{(N)}} passing through that point. Note that we do not assert that the curve {\gamma^{(N)}} is simple; with some more effort one could “prune” {\gamma^{(N)}} by deleting short loops to make it simple, but this turns out to be unnecessary for the parity argument we give below.

Let {x^{(N)}_1 < x^{(N)}_2 < \dots < x^{(N)}_{n^{(N)}}} be the points on the real axis where {\gamma^{(N)}} crosses. By Exercise 59 below, the winding number {W_{\gamma^{(N)}}(x)} changes by {+1} or {-1} as {x} crosses each of the {x^{(N)}_j}; by Lemma 44, this winding number is constant otherwise, and by Corollary 43 it vanishes near infinity. Thus {n^{(N)}} is even, and the winding number is odd between {x^{(N)}_j} and {x^{(N)}_{j+1}} for any odd {j}.

Next, observe that each point {x^{(N)}_j} belongs to exactly one of the polygonal paths {\gamma^{(N)}([a,c])} or {\gamma^{(N)}([c,b])}. Since each of these curves starts on one side of the real axis and ends up on the other, they must both cross the real axis an odd number of times. On the other hand, the crossing points {x^{(N)}_1,\dots,x^{(N)}_{n^{(N)}}} can be grouped into pairs {\{x^{(N)}_j,x^{(N)}_{j+1}\}} with {j} odd. We conclude that there must exist an odd {j} such that one of the {x^{(N)}_j,x^{(N)}_{j+1}} lies in {\gamma^{(N)}([a,c])} and the other lies in {\gamma^{(N)}([c,b])}.

Fix such a {j}. For the sake of discussion, let us suppose that {x^{(N)}_j} lies in {\gamma^{(N)}([a,c])} and {x^{(N)}_{j+1}} lies in {\gamma^{(N)}([c,b])}. From (19) we have

\displaystyle \mathrm{dist}( x^{(N)}_j, \gamma([a,c]) ) \leq \frac{1}{N}; \quad \mathrm{dist}( x^{(N)}_{j+1}, \gamma([c,b]) ) \leq \frac{1}{N}

and from (18) we have

\displaystyle \mathrm{dist}( x, \gamma([a,c]) ) + \mathrm{dist}( x, \gamma([c,b]) ) \geq \delta

for any {x \in [x^{(N)}_j, x^{(N)}_{j+1}]}. By the intermediate value theorem, we can thus (for {N} large enough) find {x^{(N)}_j < x^{(N)}_* < x^{(N)}_{j+1}} such that

\displaystyle \mathrm{dist}( x^{(N)}_*, \gamma([a,c]) ) = \mathrm{dist}( x^{(N)}_*, \gamma([c,b]) )

and thus

\displaystyle \mathrm{dist}( x^{(N)}_*, \gamma([a,c]) ), \mathrm{dist}( x^{(N)}_*, \gamma([c,b]) ) \geq \frac{\delta}{2}

or equivalently

\displaystyle \mathrm{dist}( x^{(N)}_*, \gamma([a,b]) ) \geq \frac{\delta}{2}.

We arrive at the same conclusion in the opposite case when {x^{(N)}_j} lies in {\gamma^{(N)}([c,b])} and {x^{(N)}_{j+1}} lies in {\gamma^{(N)}([a,c])}.

By Corollary 43 (and (19)), the {x^{(N)}_*} are bounded in {N}. By the Bolzano-Weierstrass theorem, we can thus extract a subsequence of the {x^{(N)}_*} that converges to some limit {x_*}. By continuity we then have

\displaystyle \mathrm{dist}( x_*, \gamma([a,b]) ) \geq \frac{\delta}{2},

in particular {x_*} does not lie in {\gamma([a,b])}. By construction of {x^{(N)}_*}, we know that {W_{\gamma^{(N)}}( x^{(N)}_* )} is odd for all {N}; using Lemma 44 and Corollary 42 we conclude that {W_\gamma(x_*)} is also odd. Thus we have found at least one point where the winding number is non-zero.

Now we can finish the proof of the Jordan curve theorem. Let {\gamma: [a,b] \rightarrow {\bf C}} be a non-trivial simple closed curve. By the preceding discussion, we can find a point {z_*} outside of {\gamma([a,b])} where the winding number {W_\gamma} is non-zero. Let {\delta > 0} be a sufficiently small parameter, and let {0 < \varepsilon = \varepsilon(\delta) < \delta} be sufficiently small depending on {\delta}. By compactness, one can cover the region {N_{\varepsilon/10}(\gamma([a,b]))} by a finite number of (solid) squares {S} of sidelength {\varepsilon/10} and sides parallel to the real and imaginary axes; by perturbation we may assume that no edge of one square is collinear to an edge of any other square. These squares all lie in {N_\varepsilon(\gamma([a,b]))}, and in particular will not contain {z_*} if {\delta} is small enough; their union can easily be seen to be connected. The boundaries of these squares divide the complex plane into a finite number of polygonal regions (one of which is unbounded). One of these regions, call it {\Omega_\delta}, contains the point {z_*}. This region cannot contain any interior point of a square {S}, since otherwise {\Omega_\delta} would be trapped inside a square of sidelength {\varepsilon/10} and hence not contain {z_*}. In particular, {\Omega_\delta} avoids {N_{\varepsilon/10}(\gamma([a,b]))}. The region {\Omega_\delta} cannot be unbounded, since one could then continuously move {z_*} to infinity without ever meeting {\gamma([a,b])}, contradicting Lemma 44, Corollary 43, and the non-vanishing nature of {W_\gamma(z_*)}. The boundary of {\Omega_\delta} consists of one or more disjoint closed polygonal paths, whose edges consist of horizontal and vertical line segments. Actually, the boundary must consist of just one closed path, since otherwise the union of the squares {S} would be disconnected, a contradiction.
Let {\gamma_\delta} denote the path that bounds {\Omega_\delta} (traversed in either of the two possible directions). This path must be simple, because a crossing can only be formed by an edge of one square {S} crossing an edge of another square {S'} at a point that is not on the corner of either of the two squares; as {\Omega_\delta} avoids both {S} and {S'}, it can thus only occupy one quadrant of a neighbourhood of this crossing and so cannot bound all four edges of the crossing.


Applying the Jordan curve theorem to the polygonal path {\gamma_\delta}, we conclude that there is {\sigma_\delta \in \{-1,+1\}} such that {W_{\gamma_\delta}(z) = \sigma_\delta} on {\Omega_\delta}, and {W_{\gamma_\delta}(z) = 0} for all {z} outside of {\Omega_\delta} and {\tilde \gamma([a,b])}. On the other hand, by Proposition 57 there is an integer {m_\delta} such that {\gamma_\delta} is homotopic (as closed curves, up to reparameterisation) in {N_\delta(\gamma([a,b]))} to {m_\delta \gamma}, so in particular

\displaystyle W_{\gamma_\delta}(z) = m_\delta W_\gamma(z)

for all {z \not \in N_\delta(\gamma([a,b]))}. Applying this to {z = z_*}, we conclude that {m_\delta} is either {+1} or {-1}. If we write {\sigma_\delta = m_\delta \sigma} (where {\sigma}, a priori, may depend on {\delta}), then {\sigma \in \{-1,+1\}}, and we have

\displaystyle W_\gamma(z) = \sigma \ \ \ \ \ (20)

for {z \in \Omega_\delta \backslash N_\delta(\gamma([a,b]))}, and

\displaystyle W_\gamma(z) = 0 \ \ \ \ \ (21)

for {z \in ({\bf C} \backslash \Omega_\delta) \backslash N_\delta(\gamma([a,b]))}. Thus {W_\gamma} takes only two values outside of {N_\delta(\gamma([a,b]))}. Sending {\delta \rightarrow 0}, we conclude that {\sigma} is in fact independent of {\delta}, and {W_\gamma} takes only the two values {0, \sigma} outside of {\gamma([a,b])}.

We now define the interior and exterior regions by (9), (8); we have then partitioned {{\bf C}} into the interior, exterior, and {\gamma([a,b])}. From Lemma 44 the interior and exterior are open, and from Corollary 43 the interior is bounded, and hence the exterior is unbounded. The point {z_*} lies in the interior, so the interior is non-empty. The only remaining task is to show that the interior and exterior are connected. Suppose for instance that {z_1, z_2} lie in the interior region. Then for {\delta} small enough, {z_1, z_2} lie outside of {N_\delta(\gamma([a,b]))}. From (20), (21) we conclude that {z_1,z_2} lie in {\Omega_\delta}. As {\Omega_\delta} is connected, we can thus join {z_1} to {z_2} by a path in {\Omega_\delta}. As the region {\Omega_\delta} avoids {N_{\varepsilon(\delta)/10}(\gamma([a,b]))}, we see from Lemma 44 that the winding number {W_\gamma} stays constant on this path, and so the path remains in the interior region (9). This establishes the connectedness of the interior region; the connectedness of the exterior is proven similarly.

It remains to prove the contractibility of {\gamma} in any open set {U} that contains {\gamma} and its interior. Once again, we begin with the simpler case when {\gamma} is a simple closed polygonal path. We induct on the number {n} of edges in {\gamma}. The cases {n \leq 3} can be handled by direct calculation, so suppose that {n>3} and the claim has been proven for all smaller values of {n}. We may remove any edges of zero length from the polygon. If the interior of the polygon is convex, then the claim follows from Example 2, so we may assume that the interior is non-convex. This implies that one of the interior angles in the polygon exceeds {\pi} (see Exercise 58 below), thus there are two adjacent edges {e,f} whose interior angle exceeds {\pi}. If one extends {e} in the interior until it meets the polygon again, this will divide the polygon into two subpolygons {\gamma_1,\gamma_2}, each of which can be verified to have fewer than {n} edges. By Exercise 49 (which does not use the contractibility part of the Jordan curve theorem), the interiors of {\gamma_1} and {\gamma_2} are contained in the interior of {\gamma}, and so by the induction hypothesis they are contractible to a point in {U}. Using Exercise 3 we conclude that {\gamma} is contractible to a point in {U} also.


Now suppose that {\gamma: [a,b] \rightarrow {\bf C}} is an arbitrary simple closed curve. Let {\delta > 0} be a small parameter. As before, we can find a simple polygonal path {\gamma_\delta} whose interior {\Omega_\delta} lies in the interior of {\gamma}, and such that {\gamma_\delta} is homotopic to {m_\delta \gamma} in {N_\delta(\gamma([a,b]))}, and hence in {U} if {\delta} is small enough, for some {m_\delta = \pm 1}. From the previous discussion we see that {\gamma_\delta} is contractible to a point in {U}, and so {m_\delta \gamma} is also. The claim then follows (after reversing the contour {m_\delta \gamma} if necessary). This concludes the proof of the Jordan curve theorem.

Exercise 58 Let {\gamma} be a simple closed polygonal path with all edges of positive length. Suppose that all interior angles of {\gamma} (that is, the angle that two adjacent edges make in the interior of the polygon) are less than or equal to {\pi}. Show that the interior of {\gamma} is convex. (Hint: use a continuity argument to show that every line meets the interior of {\gamma} in at most one interval.)
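As a rough numerical companion to Exercise 58 (an illustrative sketch, not a solution), one can compute the interior angles of a simple closed polygonal path and list the vertices at which the angle exceeds {\pi}. The sketch assumes the vertices are given anticlockwise as Python complex numbers; the function name and the L-shaped test polygon are invented for illustration.

```python
import cmath

def interior_angles(vertices):
    """Interior angles of a simple polygon whose vertices are listed anticlockwise.

    Assumes all edges have positive length. At each vertex the path turns by
    the signed angle phase(e_out / e_in); the interior angle is pi minus that turn.
    """
    n = len(vertices)
    angles = []
    for k in range(n):
        e_in = vertices[k] - vertices[k - 1]          # edge arriving at vertex k
        e_out = vertices[(k + 1) % n] - vertices[k]   # edge leaving vertex k
        angles.append(cmath.pi - cmath.phase(e_out / e_in))
    return angles

# Hypothetical L-shaped polygon, listed anticlockwise; it is not convex.
L_shape = [0 + 0j, 2 + 0j, 2 + 1j, 1 + 1j, 1 + 2j, 0 + 2j]
angles = interior_angles(L_shape)
reflex = [k for k, a in enumerate(angles) if a > cmath.pi + 1e-12]
print(reflex)  # indices of the vertices whose interior angle exceeds pi
```

For this polygon the only such vertex is the inner corner at {1+i}, which is also where the polygon fails to be convex.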

Exercise 59 Let {\gamma} be a non-trivial simple closed polygonal curve, and let {z_0} be a point in the interior of an edge {e} of {\gamma} (i.e., {z_0} is not one of the two vertices of {e}). Let {z, z'} be two points sufficiently close to {z_0} that lie on opposite sides of {e}. Without using the Jordan curve theorem, show that {|W_\gamma(z) - W_\gamma(z')| = 1}. (Hint: replace {\gamma} by a “local” closed contour that is quite short, and a “global” closed contour which avoids the line segment connecting {z} and {z'}. Then use the Cauchy integral formula.)
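The jump in Exercise 59 can be observed numerically. The following is a minimal sketch, assuming the polygonal path is given by a list of complex vertices; it computes the winding number by summing the signed angle that each edge subtends at the point, which is one standard way to evaluate {W_\gamma} for polygonal paths. The helper name and the unit square are illustrative choices.

```python
import cmath

def winding_number(vertices, z):
    """Winding number around z of the closed polygonal path through `vertices`.

    Sums the signed angle subtended at z by each edge; z must not lie on the path.
    """
    total = 0.0
    n = len(vertices)
    for k in range(n):
        a = vertices[k] - z
        b = vertices[(k + 1) % n] - z
        total += cmath.phase(b / a)  # signed angle, strictly between -pi and pi
    return round(total / (2 * cmath.pi))

square = [0 + 0j, 1 + 0j, 1 + 1j, 0 + 1j]  # anticlockwise unit square
z_above = 0.5 + 0.01j   # just above the bottom edge (inside the square)
z_below = 0.5 - 0.01j   # just below the bottom edge (outside the square)
w_above = winding_number(square, z_above)
w_below = winding_number(square, z_below)
print(w_above, w_below, abs(w_above - w_below))
```

The two points sit on opposite sides of the bottom edge, and the computed winding numbers differ by exactly one.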

Exercise 60 (Jordan arc theorem) Let {\gamma: [a,b] \rightarrow {\bf C}} be a simple non-closed curve. Show that the complement of {\gamma([a,b])} in {{\bf C}} is connected. (Hint: first establish a variant of Proposition 57 for non-closed curves, in which {m} is now set to zero. Then adapt the proof of the Jordan curve theorem.)

Exercise 61 Let {U} be a bounded connected non-empty open subset of {{\bf C}}. Show that {U} is simply connected if and only if the complement {{\bf C} \backslash U} is connected. (Hint: suppose that there is a point {z_*} in {{\bf C} \backslash U} that is separated from infinity by {U}. Show that there is some compact subset {K} of {U} that also separates {z_*} from infinity. Then cover {K} by small squares as in the proof of the Jordan curve theorem to locate a simple closed polygonal path in {U} that separates {z_*} from infinity.)

(The exercises below were added after the notes were first released; they will ultimately be moved to a more appropriate location, but are being placed here for now in order to not disrupt existing numbering.)

Exercise 62 Let {U} be a simply connected subset of {{\bf C}}, and let {u: U \rightarrow {\bf R}} be a harmonic function. Show that {u} has a harmonic conjugate {v: U \rightarrow {\bf R}}, which is unique up to additive constants.

One can interpret Cauchy’s theorem through the lens of algebraic topology, and particularly through the machinery of homology and cohomology. We will not develop this perspective in depth in these notes, but the following exercise will give a brief glimpse of the connections to homology and cohomology.

Exercise 63 Let {U} be an open subset of {{\bf C}}. Define a {0}-chain in {U} to be a formal linear combination

\displaystyle \sum_{i=1}^n m_i [z_i]

of points {z_i \in U} (which we enclose in brackets to avoid confusion with the arithmetic operations on {{\bf C}}, in particular {[z]+[w]} is not identified with {[z+w]}), where {n} is a natural number and the {m_i} are integers; these form an additive abelian group in the usual fashion. Similarly, define a {1}-chain in {U} to be a formal linear combination

\displaystyle \sum_{i=1}^n m_i [\gamma_i]

of curves {\gamma_i: [0,1] \rightarrow U} in {U}, which (for very minor notational reasons) we will fix to have domain in the unit interval {[0,1]}. Finally, define a {2}-chain in {U} to be a formal linear combination

\displaystyle \sum_{i=1}^n m_i [T_i]

of {2}-simplices {T_i: \Delta_2 \rightarrow U}, defined as continuous maps from the solid triangle {\Delta_2 := \{ (x,y) \in {\bf R}^2: x,y \geq 0; x+y \leq 1 \}}.

Given a {1}-chain {c = \sum_{i=1}^n m_i [\gamma_i]} in {U}, we define its boundary {\partial c} to be the {0}-chain

\displaystyle \partial c := \sum_{i=1}^n m_i ( [\gamma_i(1)] - [\gamma_i(0)] )

and call {c} a {1}-cycle if {\partial c = 0}. Similarly, given a {2}-chain {c = \sum_{i=1}^n m_i [T_i]}, we define its boundary {\partial c} to be the {1}-chain

\displaystyle \partial c := \sum_{i=1}^n m_i ( [t \mapsto T_i(0,t)] + [t \mapsto T_i(t,1-t)]

\displaystyle + [t \mapsto T_i(1-t,0)] )

where {t \mapsto T_i(0,t)} is the curve on {[0,1]} that maps {t} to {T_i(0,t)}, and similarly for {t \mapsto T_i(t,1-t)} and {t \mapsto T_i(1-t,0)}. If {c := \sum_{i=1}^n m_i [\gamma_i]} is a {1}-cycle, and {f: U \rightarrow {\bf C}} is holomorphic, define the integral {\int_c f} by

\displaystyle \int_c f := \sum_{i=1}^n m_i \int_{\gamma_i} f(z)\ dz.

If {z_0} lies outside of a {1}-cycle {c}, define the winding number

\displaystyle W_c(z_0) := \frac{1}{2\pi i} \int_c \frac{1}{z-z_0}.

  • (i) Show that if {c} is a {2}-chain in {U}, then {\partial \partial c = 0}.
  • (ii) Show that if {c} is a {2}-chain in {U} and {f: U \rightarrow {\bf C}} is holomorphic, then {\int_{\partial c} f = 0}.
  • (iii) If {c} is a {1}-cycle in {U}, and {W_c(z) = 0} for all {z \in {\bf C} \backslash U}, show that {\int_c f = 0}. (Hint: first perturb {c} to be the union of line segments coming from a grid of some small sidelength {\varepsilon}. Observe that the winding number {W_c(z)} is constant whenever {z} ranges in the interior of one of the squares in this grid. Then find another {1}-cycle {c'} coming from summing boundaries of such squares such that {W_c(z) = W_{c'}(z)} for all {z} in the interior of grid squares. Then show that {\int_{c-c'} f = 0} and {\int_{c'} f = 0}.)
  • (iv) If {c} is a {1}-cycle, and {f: U \rightarrow {\bf C}} has an antiderivative, show that {\int_c f = 0}.
  • (v) If {c = \sum_{i=1}^n m_i [\gamma_i]} is a {1}-cycle, {W_c(z) = 0} for all {z \in {\bf C} \backslash U}, {f: U \rightarrow {\bf C}} is holomorphic, and {z_0} is a point lying outside of any of the {\gamma_i}, show that

    \displaystyle \frac{1}{2\pi i} \int_c \frac{f(z)}{z-z_0} = W_c( z_0) f(z_0).
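To get a feel for the definitions in Exercise 63, here is a small sketch that models chains concretely and checks {\partial \partial c = 0} on one example. The representation is an assumption made for illustration only: a {1}-chain is a list of pairs {(m_i, \gamma_i)} with {\gamma_i} a Python function on {[0,1]}, a {2}-chain is a list of pairs {(m_i, T_i)}, and a {0}-chain is a dictionary from points to integer coefficients. The simplex `T` is a made-up example. A single numerical check is of course not a proof of part (i).

```python
from collections import Counter

def boundary_1chain(chain):
    """Boundary of a 1-chain [(m_i, gamma_i)]: a 0-chain as {point: coefficient}."""
    out = Counter()
    for m, gamma in chain:
        out[gamma(1.0)] += m   # + m_i [gamma_i(1)]
        out[gamma(0.0)] -= m   # - m_i [gamma_i(0)]
    return {p: k for p, k in out.items() if k != 0}

def boundary_2chain(chain):
    """Boundary of a 2-chain [(m_i, T_i)]: a 1-chain with three curves per simplex."""
    out = []
    for m, T in chain:
        out.append((m, lambda t, T=T: T(0.0, t)))
        out.append((m, lambda t, T=T: T(t, 1.0 - t)))
        out.append((m, lambda t, T=T: T(1.0 - t, 0.0)))
    return out

def T(x, y):
    """Hypothetical 2-simplex: a continuous map from the solid triangle into C."""
    return complex(x, y) ** 2 + 1

c = [(3, T)]
dd_c = boundary_1chain(boundary_2chain(c))
print(dd_c)  # empty dictionary: every endpoint appears once with + and once with -
```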

Exercise 64 Let {U} be a subset of the complex plane which is star-shaped, which means that there exists {z_0 \in U} such that for any {z \in U}, the line segment {\{ (1-t) z_0 + tz: t \in [0,1]\}} is also contained in {U}. Show that every star-shaped set is simply connected.


Filed under: 246A - complex analysis, math.AT Tagged: Cauchy's theorem, contour integration, Jordan's theorem, simply connected

David Hogg#dsesummit, day 1

I'm at the Moore-Sloan Data Science Environments annual summit. Much of what we have been doing doesn't exactly count as research, by my (constantly weakening) standards. However, there was an absolutely great and wide-ranging discussion of Hack Weeks and Sprints and their role in education and scientific investigation. This led to a group of us committing to start a paper on the subject (not a white paper, but a paper). The just-started draft is here, and we accept pull requests.

There were some great lightning talks at dinner time. My personal favorite was Kellie Ottoboni (Berkeley) talking about the finiteness of the state space of random number generators. She (with Stark and Rivest) is looking at whether random number generators with an infinite state space are possible, capitalizing on the ideas around cryptographic hash functions. She sowed some (deserved) fear about using a 32-bit random-number generator in combinatoric contexts. Since our own emcee makes combinatoric choices, this could conceivably be relevant to our master branch!
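A back-of-the-envelope illustration of the concern (a sketch under a simplifying assumption, not taken from the talk): if a deterministic shuffle is driven by a generator whose entire state fits in 32 bits, it can produce at most 2^32 distinct permutations, so it cannot reach every ordering of a list once n! exceeds 2^32. The snippet finds the smallest such n.

```python
import math

state_bits = 32                 # assumed size of the generator's full state
num_states = 2 ** state_bits

# Smallest n for which the number of permutations n! exceeds the number of states.
n = 1
while math.factorial(n) <= num_states:
    n += 1

print(n, math.factorial(n), num_states)
```

So under this assumption, some shuffles of a 13-element list can never come out, whatever the seed.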

Matt StrasslerA Hidden Gem At An Old Experiment?

This summer there was a blog post claiming that “The LHC ‘nightmare scenario’ has come true” — implying that the Large Hadron Collider [LHC] has found nothing but a Standard Model Higgs particle (the simplest possible type), and will find nothing more of great importance. With all due respect for the considerable intelligence and technical ability of the author of that post, I could not disagree more; not only are we not in a nightmare, it isn’t even night-time yet, and hardly time for sleep or even daydreaming. There’s a tremendous amount of work to do, and there may be many hidden discoveries yet to be made, lurking in existing LHC data. Or elsewhere.

I can defend this claim (and have done so as recently as this month; here are my slides). But there’s evidence from another quarter that it is far too early for such pessimism.  It has appeared in a new paper (a preprint, so not yet peer-reviewed) by an experimentalist named Arno Heister, who is evaluating 20-year old data from the experiment known as ALEPH.

In the early 1990s the Large Electron-Positron (LEP) collider at CERN, in the same tunnel that now houses the LHC, produced nearly 4 million Z particles at the center of ALEPH; the Z’s decayed immediately into other particles, and ALEPH was used to observe those decays.  Of course the data was studied in great detail, and you might think there couldn’t possibly be anything still left to find in that data, after over 20 years. But a hidden gem wouldn’t surprise those of us who have worked in this subject for a long time — especially those of us who have worked on hidden valleys. (Hidden Valleys are theories with a set of new forces and low-mass particles, which, because they aren’t affected by the known forces excepting gravity, interact very weakly with the known particles.  They are also often called “dark sectors” if they have something to do with dark matter.)

For some reason most experimenters in particle physics don’t tend to look for things just because they can; they stick to signals that theorists have already predicted. Since hidden valleys only hit the market in a 2006 paper I wrote with then-student Kathryn Zurek, long after the experimenters at ALEPH had moved on to other experiments, nobody went back to look in ALEPH or other LEP data for hidden valley phenomena (with one exception.) I didn’t expect anyone to ever do so; it’s a lot of work to dig up and recommission old computer files.

This wouldn’t have been a problem if the big LHC experiments (ATLAS, CMS and LHCb) had looked extensively for the sorts of particles expected in hidden valleys. ATLAS and CMS especially have many advantages; for instance, the LHC has made over a hundred times more Z particles than LEP ever did. But despite specific proposals for what to look for (and a decade of pleading), only a few limited searches have been carried out, mostly for very long-lived particles, for particles with mass of a few GeV/c² or less, and for particles produced in unexpected Higgs decays. And that means that, yes, hidden physics could certainly still be found in old ALEPH data, and in other old experiments. Kudos to Dr. Heister for taking a look.

Now, has he actually found something hidden at ALEPH? It’s far too early to say. Dr. Heister is careful not to make a strong claim: his paper refers to an observed excess, not to the discovery of or even evidence for anything. But his analysis can be interpreted as showing a hint of a new particle (let’s call it the V particle, just to have a name for it) decaying sometimes to a muon and an anti-muon, and probably also sometimes to an electron and an anti-electron, with a rest mass about 1/3 of that of the Z particle — about 30 GeV/c². Here’s one of the plots from his paper, showing the invariant mass of the muon and anti-muon in Z decays that also have evidence of a bottom quark and a bottom anti-quark (each one giving a jet of hadrons that has been “b-tagged”).  There’s an excess at about 30 GeV.


ALEPH data as analyzed in Heister’s paper, showing the number of Z particle decays with two bottom quark jets and a muon/anti-muon pair, as a function of the invariant mass of the muon/anti-muon pair.  The bump at around 30 GeV is unexpected; might it be a new particle?  Not likely, but not impossible.
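For readers who want to see what “invariant mass” means computationally, here is a minimal sketch. The four-momenta below are invented for illustration and are not ALEPH data; the muon mass is neglected, and energies and momenta are in GeV with c = 1.

```python
import math

def invariant_mass(p1, p2):
    """Invariant mass of a two-particle system.

    Each argument is a four-momentum (E, px, py, pz) in GeV, with c = 1.
    """
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # clamp tiny negative values caused by rounding

# Made-up example: two back-to-back muons of 15 GeV each (muon mass neglected).
mu_minus = (15.0, 15.0, 0.0, 0.0)
mu_plus = (15.0, -15.0, 0.0, 0.0)
mass = invariant_mass(mu_minus, mu_plus)
print(mass)
```

A particle of a given rest mass that decays to such a pair gives the same value of this quantity in every decay, up to its width and the detector resolution, which is why it would show up as a bump in a plot like the one above.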

The simplest physical effect that would produce such a bump is a new particle; indeed this is how the Z particle itself was identified, over three decades ago.

However, the statistical significance of the bump is still only (after look-elsewhere effect) at most 3 standard deviations, according to the paper. So this bump could just be a fluke; we’ve seen similar ones disappear with more data, for example this one. There are also a couple of serious issues that will give experts pause (the width of the bump is surprisingly large; the angular correlations seem consistent with background rather than a new signal; etc.) So the data itself is not enough to convince anyone, including Dr. Heister, though it is certainly interesting.

Conversely it is intriguing that the bump in the plot above is observed in events with bottom quarks. It is common for hidden valleys (including everything from simple abelian Higgs models to more complex confining models) to contain

  • at least one spin-one particle V (which can decay to muon/anti-muon or electron/positron) and
  • at least one spin-zero particle S (which can decay to bottom/anti-bottom preferentially, with occasional decays to tau/anti-tau.)

For example, in such models, a rare decay such as Z  ⇒ V + S, producing a muon/anti-muon pair plus two bottom quark/anti-quark jets, would often be a possibility.*

*[In this case the bottom and anti-bottom jets would themselves show a peak in their invariant mass, but unfortunately their distribution in the presence of a candidate V was not shown. One other obvious prediction of such a model is a handful of striking Z ⇒ V + S ⇒ muon/anti-muon + tau/anti-tau events; but the expected number is very small and somewhat model-dependent.]

Another possibility (also common in hidden valleys) is that the bottom-tagged jets aren’t actually from real bottom quarks, and are instead fake bottom jets generated by one or two new long-lived hidden valley particles.

But clearly, before anyone gets excited, far more evidence is required. We’ll need to see similar studies done at one or more of the three other experiments that ran concurrently with ALEPH — L3, OPAL, and DELPHI. And of course ATLAS, CMS, and LHCb will surely take a look in their own data; for instance, ATLAS and CMS could search for a dilepton resonance in events with at least two bottom-tagged jets, where the whole system of bottom-tagged jets and dileptons has an invariant mass not greater than about 100 GeV/c². [[IMPORTANT NOTE ADDED: It has been pointed out to me (thanks Matt Reece) that there was a relevant CMS search from 2015 that had somehow entirely escaped my attention, in which one b-tag was required and a di-muon bump was sought between 25 and 60 GeV.  Although not aimed at hidden valleys, it provides one of the few constraints upon them in this mass range.  And at first glance, it seems to disfavor any signal large enough to explain the ALEPH excess.  But there might be subtleties, so let me not draw firm conclusions yet.]] They should also look for the V particle in other ways — perhaps following the methods I’ve suggested repeatedly (see for example pages 40-45 of this 2008 talk) — since the V might not only appear in Z particle decays. [That is: look for boosted V’s; look for V’s in high-energy events or high missing-energy events; look for V’s in events with many jets, possibly with bottom-tags; etc.] In any case, if anything like the V particle really exists, several (and perhaps all) of the experiments should see some evidence for it, and in more than just a single context.

Though we should be skeptical that today’s paper on ALEPH data is the first step toward a major discovery, at minimum it is important for what it indirectly confirms: that searches at the LHC are far from complete, and that discoveries might lie hidden, for example in rare Z decays (and in rare decays of other particles, such as the top quark.) Neither ATLAS, CMS nor LHCb have ever done a search for rare but spectacular Z particle decays, but they certainly could, as they recently did for the Higgs particle; and if Heister’s excess turns out to be a real signal, they will be seen to have missed a huge opportunity.  So I hope that Heister’s paper, at a minimum, will encourage the LHC experiments to undertake a broader and more comprehensive program of searches for low-mass particles with very weak interactions.  Otherwise, my own nightmare, in which the diamonds hidden in the rough might remain undetected — perhaps for decades — might come true.

Filed under: Other Collider News, Particle Physics Tagged: ALEPH, atlas, cms, dilepton, HiddenValleys, LEP, LHC, LHCb

October 24, 2016

David Hogg#GaiaSprint, day 5

Today was the final day for the Sprint, and included an incredible wrap-up. My best way to communicate the awesome is just to link out to the final wrap-up slides, which we all edited simultaneously. Each participant was permitted one slide, and we worked through the full crowd (one presentation each, and questions) in a few hours. Amazing things were accomplished this week, and I anticipate multiple papers submitted to the refereed literature. My own work was on vertical heating of the Milky Way disk, measurement of the disk mid-plane location and tilt (yes, I think we have a result), and the metallicities of co-moving star pairs. The day ended with a short talk by Jim Simons (Simons) who told us about his plans for the CCA and his other centers for computational science.

Terence TaoMath 246A, Notes 4: singularities of holomorphic functions

In the previous set of notes we saw that functions {f: U \rightarrow {\bf C}} that were holomorphic on an open set {U} enjoyed a large number of useful properties, particularly if the domain {U} was simply connected. In many situations, though, we need to consider functions {f} that are only holomorphic (or even well-defined) on most of a domain {U}, thus they are actually functions {f: U \backslash S \rightarrow {\bf C}} outside of some small singular set {S} inside {U}. (In this set of notes we only consider interior singularities; one can also discuss singular behaviour at the boundary of {U}, but this is a whole separate topic and will not be pursued here.) Since we have only defined the notion of holomorphicity on open sets, we will require the singular sets {S} to be closed, so that the domain {U \backslash S} on which {f} remains holomorphic is still open. A typical class of examples are the functions of the form {\frac{f(z)}{z-z_0}} that were already encountered in the Cauchy integral formula; if {f: U \rightarrow {\bf C}} is holomorphic and {z_0 \in U}, such a function would be holomorphic save for a singularity at {z_0}. Another basic class of examples are the rational functions {P(z)/Q(z)}, which are holomorphic outside of the zeroes of the denominator {Q}.

Singularities come in varying levels of “badness” in complex analysis. The least harmful type of singularity is the removable singularity – a point {z_0} which is an isolated singularity (i.e., an isolated point of the singular set {S}) where the function {f} is undefined, but for which one can extend the function across the singularity in such a fashion that the function becomes holomorphic in a neighbourhood of the singularity. A typical example is that of the complex sinc function {\frac{\sin(z)}{z}}, which has a removable singularity at the origin {0}, which can be removed by declaring the sinc function to equal {1} at {0}. The detection of isolated removable singularities can be accomplished by Riemann’s theorem on removable singularities (Exercise 35 from Notes 3): if a holomorphic function {f: U \backslash S \rightarrow {\bf C}} is bounded near an isolated singularity {z_0 \in S}, then the singularity at {z_0} may be removed.
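As a quick numerical sanity check of the sinc example (a sketch, not a proof), one can evaluate {\frac{\sin(z)}{z}} at complex points approaching the origin and watch the values stay bounded and approach {1}, as Riemann's theorem requires of a removable singularity. The direction {e^{0.7 i}} of approach is an arbitrary choice.

```python
import cmath

def sinc(z):
    """Complex sinc function, with the removable singularity at 0 filled in."""
    return 1.0 if z == 0 else cmath.sin(z) / z

errors = []
for r in (1e-1, 1e-3, 1e-5):
    z = r * cmath.exp(0.7j)          # a point at distance r from the origin
    errors.append(abs(sinc(z) - 1))  # shrinks roughly like r**2 / 6
print(errors)
```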

After removable singularities, the mildest form of singularity one can encounter is that of a pole – an isolated singularity {z_0} such that {f(z)} can be factored as {f(z) = \frac{g(z)}{(z-z_0)^m}} for some {m \geq 1} (known as the order of the pole), where {g} has a removable singularity at {z_0} (and is non-zero at {z_0} once the singularity is removed). Such functions have already made a frequent appearance in previous notes, particularly the case of simple poles when {m=1}. The behaviour near {z_0} of a function {f} with a pole of order {m} is well understood: for instance, {|f(z)|} goes to infinity as {z} approaches {z_0} (at a rate comparable to {|z-z_0|^{-m}}). These singularities are not, strictly speaking, removable; but if one compactifies the range {{\bf C}} of the holomorphic function {f: U \backslash S \rightarrow {\bf C}} to a slightly larger space {{\bf C} \cup \{\infty\}} known as the Riemann sphere, then the singularity can be removed. In particular, functions {f: U \backslash S \rightarrow {\bf C}} which only have isolated singularities that are either poles or removable can be extended to holomorphic functions {f: U \rightarrow {\bf C} \cup \{\infty\}} to the Riemann sphere. Such functions are known as meromorphic functions, and are nearly as well-behaved as holomorphic functions in many ways. In fact, in one key respect, the family of meromorphic functions is better: the meromorphic functions on {U} turn out to form a field, in particular the quotient of two meromorphic functions is again meromorphic (if the denominator is not identically zero).

Unfortunately, there are isolated singularities that are neither removable nor poles, and are known as essential singularities. A typical example is the function {f(z) = e^{1/z}}, which turns out to have an essential singularity at {z=0}. The behaviour of such essential singularities is quite wild; we will show here the Casorati-Weierstrass theorem, which shows that the image of {f} near the essential singularity is dense in the complex plane, as well as the more difficult great Picard theorem which asserts that in fact the image can omit at most one point in the complex plane. Nevertheless, around any isolated singularity (even the essential ones) {z_0}, it is possible to expand {f} as a variant of a Taylor series known as a Laurent series {\sum_{n=-\infty}^\infty a_n (z-z_0)^n}. The {\frac{1}{z-z_0}} coefficient {a_{-1}} of this series is particularly important for contour integration purposes, and is known as the residue of {f} at the isolated singularity {z_0}. These residues play a central role in a common generalisation of Cauchy’s theorem and the Cauchy integral formula known as the residue theorem, which is a particularly useful tool for computing (or at least transforming) contour integrals of meromorphic functions, and has proven to be a particularly popular technique to use in analytic number theory. Within complex analysis, one important consequence of the residue theorem is the argument principle, which gives a topological (and analytical) way to control the zeroes and poles of a meromorphic function.
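To make the notion of residue concrete, here is a small numerical sketch for {f(z) = e^{1/z}}. Expanding the exponential gives the Laurent series {e^{1/z} = \sum_{n=0}^\infty \frac{1}{n!} z^{-n}}, so the {\frac{1}{z}} coefficient is {1}. The code approximates {\frac{1}{2\pi i} \int f(z)\ dz} over a small anticlockwise circle around the singularity by an equally spaced Riemann sum; the function name, radius and sample count are illustrative choices, not anything fixed by the notes.

```python
import cmath

def residue(f, z0, radius=1.0, n=256):
    """Approximate (1/(2 pi i)) * integral of f over the anticlockwise circle
    of the given radius about z0, using n equally spaced sample points."""
    total = 0j
    for k in range(n):
        step = radius * cmath.exp(2j * cmath.pi * k / n)  # w - z0 on the circle
        total += f(z0 + step) * step  # dw = i * step * dtheta; the i cancels
    return total / n

res = residue(lambda z: cmath.exp(1 / z), 0j)
print(res)  # close to 1, the coefficient of 1/z in the Laurent series
```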

Finally, there are the non-isolated singularities. Little can be said about these singularities in general (for instance, the residue theorem does not directly apply in the presence of such singularities), but certain types of non-isolated singularities are still relatively easy to understand. One particularly common example of such non-isolated singularity arises when trying to invert a non-injective function, such as the complex exponential {z \mapsto \exp(z)} or a power function {z \mapsto z^n}, leading to branches of multivalued functions such as the complex logarithm {z \mapsto \log(z)} or the {n^{th}} root function {z \mapsto z^{1/n}} respectively. Such branches will typically have a non-isolated singularity along a branch cut; this branch cut can be moved around the complex domain by switching from one branch to another, but usually cannot be eliminated entirely, unless one is willing to lift up the domain {U} to a more general type of domain known as a Riemann surface. As such, one can view branch cuts as being an “artificial” form of singularity, being an artefact of a choice of local coordinates of a Riemann surface, rather than reflecting any intrinsic singularity of the function itself. The further study of Riemann surfaces is an important topic in complex analysis (as well as the related fields of complex geometry and algebraic geometry), but unfortunately this topic will probably be postponed to the next course in this sequence (which I will not be teaching).
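The branch cut phenomenon is easy to see numerically. The sketch below uses Python's `cmath.log`, which implements the principal branch with its cut along the negative real axis, and compares it with a second branch, `log_alt`, defined here purely for illustration, whose cut lies along the positive real axis instead. Across the negative real axis the first branch jumps by about {2\pi i} while the second is continuous.

```python
import cmath

eps = 1e-12
above = cmath.log(complex(-1.0, eps))    # just above the negative real axis
below = cmath.log(complex(-1.0, -eps))   # just below it
jump = above.imag - below.imag           # close to 2*pi for the principal branch

def log_alt(z):
    """Another branch of the logarithm, with its cut along the positive real axis."""
    return cmath.log(-z) + 1j * cmath.pi

alt_jump = abs(log_alt(complex(-1.0, eps)) - log_alt(complex(-1.0, -eps)))
print(jump, alt_jump)
```

Both functions invert the exponential where they are defined; switching branches has moved the cut rather than removed it.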

— 1. Laurent series —

Suppose we are given a holomorphic function {f: U \rightarrow {\bf C}} and a point {z_0} in {U}. For a sufficiently small radius {r > 0}, the circle {\gamma_{z_0,r,\circlearrowleft}} and its interior both lie in {U}, and the Cauchy integral formula tells us that

\displaystyle f(z) = \frac{1}{2\pi i} \int_{\gamma_{z_0,r,\circlearrowleft}} \frac{f(w)}{w-z}\ dw

in the interior of this circle. In Corollary 18 of Notes 3, this was used to form a convergent Taylor series expansion

\displaystyle f(z) = \sum_{n=0}^\infty a_n (z-z_0)^n

in the interior of this circle, where the coefficients {a_n} could be reconstructed from the values of {f} on the circle {\gamma_{z_0,r,\circlearrowleft}} by the formula

\displaystyle a_n := \frac{1}{2\pi i} \int_{\gamma_{z_0,r,\circlearrowleft}} \frac{f(w)}{(w-z_0)^{n+1}}\ dw.
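As an illustration of how the coefficients are recovered from values on the circle, here is a minimal numerical sketch. It takes the integrand to be {\frac{f(w)}{(w-z_0)^{n+1}}}, parameterises the circle as {w = z_0 + r e^{i\theta}}, and replaces the integral by an equally spaced Riemann sum. The function name and the choice {f = \exp}, {z_0 = 0}, {r = 1} are assumptions made for the example.

```python
import cmath
import math

def taylor_coefficient(f, z0, n, radius=1.0, samples=64):
    """Approximate a_n = (1/(2 pi i)) * integral of f(w) / (w - z0)**(n+1) dw
    over the anticlockwise circle |w - z0| = radius."""
    total = 0j
    for k in range(samples):
        step = radius * cmath.exp(2j * cmath.pi * k / samples)  # w - z0
        # dw = i * step * dtheta, so one power of step and the factor i cancel.
        total += f(z0 + step) / step ** n
    return total / samples

coeffs = [taylor_coefficient(cmath.exp, 0j, n) for n in range(5)]
for n, a in enumerate(coeffs):
    print(n, a, 1 / math.factorial(n))  # computed value next to 1/n!
```

For the exponential function the computed values agree with the familiar coefficients {\frac{1}{n!}} to within rounding error.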

Now suppose that {f: U \backslash \{z_0\} \rightarrow {\bf C}} is only known to be holomorphic outside of {z_0}. Then the Cauchy integral formula no longer directly applies, because the interiors of contours such as {\gamma_{z_0,r,\circlearrowleft}} are no longer contained in the region {U \backslash \{z_0\}} where {f} is holomorphic. To deal with this issue, we use the following convenient decomposition.

Lemma 1 (Cauchy integral formula decomposition in annular regions) Let {f: U \rightarrow {\bf C}} be a holomorphic function. Let {\gamma_1}, {\gamma_2} be simple closed anticlockwise contours in {U} such that {\gamma_2} is contained in the interior {\mathrm{int}(\gamma_1)} of {\gamma_1} (or equivalently, by Exercise 49 of Notes 3, that {\gamma_1} is contained in the exterior {\mathrm{ext}(\gamma_2)} of {\gamma_2}). Suppose also that the “annular region” {\mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)} is contained in {U}. Then there exists a decomposition

\displaystyle f = f_1 + f_2

on {U}, where {f_1: U \cup \mathrm{int}(\gamma_1) \rightarrow {\bf C}} is holomorphic on the union of {U} and the interior of {\gamma_1}, and {f_2: U \cup \mathrm{ext}(\gamma_2) \rightarrow {\bf C}} is holomorphic on the union of {U} and the exterior of {\gamma_2}, with {f_2(z) \rightarrow 0} as {z \rightarrow \infty}. Furthermore, if {U} is connected, then this decomposition is unique.

In addition, we have the Cauchy integral type formulae

\displaystyle f_1(z) = \frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)}{w-z}\ dw

for {z} in the interior of {\gamma_1}, and

\displaystyle f_2(z) = - \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)}{w-z}\ dw

for {z} in the exterior of {\gamma_2}. In particular, we have

\displaystyle f(z) = \frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)}{w-z}\ dw - \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)}{w-z}\ dw \ \ \ \ \ (1)

for {z} in {\mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)}.
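A concrete instance may help fix ideas. Take {f(z) = \frac{1}{z-3} + \frac{1}{z}} on {U = {\bf C} \backslash \{0,3\}}, with {\gamma_1} the anticlockwise circle of radius {2} and {\gamma_2} the anticlockwise circle of radius {1}, both centred at the origin. Here one expects {f_1(z) = \frac{1}{z-3}} and {f_2(z) = \frac{1}{z}}, since the former is holomorphic inside {\gamma_1} and the latter is holomorphic outside {\gamma_2} and decays at infinity. The sketch below checks this numerically using the Cauchy kernel {\frac{f(w)}{w-z}}; the example function, the test point and the helper name are all illustrative assumptions.

```python
import cmath

def circle_integral(g, radius, n=2000):
    """Approximate (1/(2 pi i)) * integral of g over the anticlockwise
    circle |w| = radius, using n equally spaced sample points."""
    total = 0j
    for k in range(n):
        w = radius * cmath.exp(2j * cmath.pi * k / n)
        total += g(w) * w  # dw = i * w * dtheta; the i cancels against 1/(2 pi i)
    return total / n

def f(w):
    return 1 / (w - 3) + 1 / w

z = 1.5 + 0.2j  # a point of the annular region 1 < |z| < 2
f1 = circle_integral(lambda w: f(w) / (w - z), 2.0)    # integral over gamma_1
f2 = -circle_integral(lambda w: f(w) / (w - z), 1.0)   # minus integral over gamma_2
print(f1, 1 / (z - 3))
print(f2, 1 / z)
print(f1 + f2, f(z))
```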


Proof: We begin with uniqueness. Suppose we have two decompositions

\displaystyle f = f_1 + f_2 = f'_1 + f'_2

on {U}, where {f_1,f'_1: U \cup \mathrm{int}(\gamma_1) \rightarrow {\bf C}} and {f_2,f'_2: U \cup \mathrm{ext}(\gamma_2) \rightarrow {\bf C}} are holomorphic, and {f_2, f'_2} both go to zero at infinity. Then the holomorphic functions {f_1-f'_1: \mathrm{int}(\gamma_1) \rightarrow {\bf C}} and {f'_2 - f_2: \mathrm{ext}(\gamma_2) \rightarrow {\bf C}} agree on the common domain {\mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)}, and are hence restrictions of a single entire function {F: {\bf C} \rightarrow {\bf C}}. But {F} goes to zero at infinity and is hence bounded; applying Liouville’s theorem (Theorem 28 of Notes 3) we see that {F} vanishes entirely. This gives {f_1=f'_1} and {f_2=f'_2} on the non-empty open set {\mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)}, and then we have {f_1=f'_1} and {f_2=f'_2} on {U \cup \mathrm{int}(\gamma_1)} and {U \cup \mathrm{ext}(\gamma_2)} by analytic continuation (Corollary 23 of Notes 3).

Now for existence. Suppose that we can establish the identity (1) for {z} in {\mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)}. Then we can define {f_1} on {\mathrm{int}(\gamma_1)} by

\displaystyle f_1(z) := \frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)}{w-z}\ dw

and on {U \cap \mathrm{ext}(\gamma_2)} by

\displaystyle f_1(z) := f(z) + \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)}{w-z}\ dw,

noting from (1) that this consistently defines {f_1} on

\displaystyle \mathrm{int}(\gamma_1) \cup (U \cap \mathrm{ext}(\gamma_2)) = U \cup \mathrm{int}(\gamma_1).

From Exercise 36 of Notes 3 we see that {f_1} is holomorphic. Similarly, we define {f_2} on {\mathrm{ext}(\gamma_2)} by

\displaystyle f_2(z) := - \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)}{w-z}\ dw

and on {U \cap \mathrm{int}(\gamma_1)} by

\displaystyle f_2(z) := f(z) - \frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)}{w-z}\ dw.

One can then verify that {f_1,f_2} obey all the required properties.

Thus it remains to establish (1). This follows from the homology form of the Cauchy integral formula (Exercise 63(v) of Notes 3), but we can also avoid explicit use of homology by the following “keyhole contour” argument. For {z \in \mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)}, we have

\displaystyle \frac{1}{2\pi i} \int_{\gamma_1} \frac{1}{w-z}\ dw = 1

and

\displaystyle \frac{1}{2\pi i} \int_{\gamma_2} \frac{1}{w-z}\ dw = 0

and so to prove (1), it suffices to show that

\displaystyle \frac{1}{2\pi i} \int_{\gamma_1} \frac{f(w)-f(z)}{w-z}\ dw = \frac{1}{2\pi i} \int_{\gamma_2} \frac{f(w)-f(z)}{w-z}\ dw.

By the factor theorem (Corollary 22 of Notes 3) it thus suffices to show that

\displaystyle \int_{\gamma_1} F(w)\ dw = \int_{\gamma_2} F(w)\ dw \ \ \ \ \ (2)

whenever {F: U \rightarrow {\bf C}} is holomorphic.

By perturbing {\gamma_1,\gamma_2} using Cauchy’s theorem we may assume that these curves are simple closed polygonal paths (if one wishes, one can also restrict the edges to be horizontal and vertical, although this is not strictly necessary for the argument). By connecting a point in {\gamma_1} to a point in {\gamma_2} by a polygonal path in the interior of {\gamma_1}, and removing loops, self-intersections, or excursions into the interior (or image) of {\gamma_2}, we can find a simple polygonal path {\gamma_3} from a point {z_1} in {\gamma_1} to a point {z_2} in {\gamma_2} that lies entirely in {\mathrm{int}(\gamma_1) \cap \mathrm{ext}(\gamma_2)} except at the endpoints. By rearranging {\gamma_1} and {\gamma_2} we may assume that {z_1} is the initial and terminal point of {\gamma_1}, and {z_2} is the initial and terminal point of {\gamma_2}. Then the closed polygonal path {\gamma_1 + \gamma_3 + \gamma_2 + (-\gamma_3)} has vanishing winding number in the interior of {\gamma_2} or exterior of {\gamma_1}, thus {U} contains all the points where the winding number is non-zero. This path is not simple, but we can approximate it to arbitrary accuracy {\varepsilon} by a simple closed polygonal path {\gamma_\varepsilon} by shifting the simple polygonal paths {\gamma_3} and {-\gamma_3} slightly; for {\varepsilon} small enough, the interior of {\gamma_\varepsilon} will then lie in {U}. Applying Cauchy’s theorem (Theorem 52 of Notes 3) we conclude that

\displaystyle \int_{\gamma_\varepsilon} F(w)\ dw = 0;

taking limits as {\varepsilon \rightarrow 0} we obtain (2) as claimed. \Box
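As a numerical sanity check of the decomposition (not part of the argument), the following Python sketch takes {\gamma_1} and {\gamma_2} to be the anticlockwise circles of radius {2} and {1/2} about the origin, and uses the test function {f(w) = e^w + 1/w + 3/w^2}. These choices, the helper name circle_integral, and the trapezoid-rule discretisation are illustrative assumptions rather than anything from the notes. For this {f}, uniqueness suggests that one should find {f_1(z) = e^z} and {f_2(z) = 1/z + 3/z^2}.

```python
import cmath
import math

def circle_integral(g, center, radius, n=2000):
    """Approximate the integral of g over the anticlockwise circle
    |w - center| = radius, using the trapezoid rule in the angle."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        w = center + radius * e
        total += g(w) * 1j * radius * e * dtheta  # g(w) * w'(theta) * dtheta
    return total

def f(w):
    # Hypothetical test function, holomorphic away from w = 0.
    return cmath.exp(w) + 1 / w + 3 / w**2

def f1(z):
    # Cauchy-type integral over the outer circle gamma_1 (radius 2).
    return circle_integral(lambda w: f(w) / (w - z), 0, 2.0) / (2j * math.pi)

def f2(z):
    # Minus the Cauchy-type integral over the inner circle gamma_2 (radius 1/2).
    return -circle_integral(lambda w: f(w) / (w - z), 0, 0.5) / (2j * math.pi)

z = 1.0 + 0.5j  # a point between the two circles
print(abs(f1(z) + f2(z) - f(z)))           # discrepancy in identity (1)
print(abs(f1(z) - cmath.exp(z)))           # f_1 versus e^z
print(abs(f2(z) - (1 / z + 3 / z**2)))     # f_2 versus 1/z + 3/z^2
```

All three printed quantities should be at the level of rounding error, since the trapezoid rule converges very rapidly for analytic integrands on a circle.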

Exercise 2 Let {\gamma_0} be a simple closed anticlockwise contour, and let {\gamma_1,\dots,\gamma_n} be simple closed anticlockwise contours in the interior {\mathrm{int}(\gamma_0)} of {\gamma_0} whose images are disjoint, and such that the interiors {\mathrm{int}(\gamma_1),\dots,\mathrm{int}(\gamma_n)} are also disjoint. Let {U} be an open set containing {\gamma_0,\gamma_1,\dots,\gamma_n} and the region

\displaystyle \Omega := \mathrm{int}(\gamma_0) \cap \mathrm{ext}(\gamma_1) \cap \dots \cap \mathrm{ext}(\gamma_n).

Let {f: U \rightarrow {\bf C}} be holomorphic. Show that for any {z_0 \in \Omega}, one has

\displaystyle f(z_0) = \frac{1}{2\pi i} \int_{\gamma_0} \frac{f(z)}{z-z_0}\ dz - \sum_{j=1}^n \frac{1}{2\pi i} \int_{\gamma_j} \frac{f(z)}{z-z_0}\ dz.

(Hint: induct on {n} using Lemma 1.)

Exercise 3 (Painlevé’s theorem on removable singularities) Let {U} be an open subset of {{\bf C}}. Let {S} be a compact subset of {U} which has zero length in the following sense: for any {\varepsilon>0}, one can cover {S} by a countable number of disks {D(z_n,r_n)} such that {\sum_n r_n < \varepsilon}. Let {f: U \backslash S \rightarrow {\bf C}} be a bounded holomorphic function. Show that the singularities in {S} are removable in the sense that there is an extension {\tilde f: U \rightarrow {\bf C}} of {f} to {U} which remains holomorphic. (Hint: one can work locally in some disk in {U} that contains a portion of {S}. Cover this portion by a finite number of small disks, group them into connected components, use the previous exercise, and take an appropriate limit.) Note that this result generalises Riemann’s theorem on removable singularities, see Exercise 35 from Notes 3. The situation when {S} has positive length is considerably more subtle, and leads to the theory of analytic capacity, which we will not discuss further here.

Now suppose that {f: U \rightarrow {\bf C}} is holomorphic for some open set {U} that contains an annulus of the form

\displaystyle \{ z \in {\bf C}: r < |z-z_0| < R \} \ \ \ \ \ (3)

for some {z_0 \in {\bf C}} and {0 \leq r < R \leq \infty}. From Lemma 1, we can split {f = f_1 + f_2}, where {f_1} is holomorphic in {D(z_0,R)}, and {f_2} is holomorphic in the exterior region {\{ z: |z-z_0| > r \}}, with {f_2(z)} going to zero as {z \rightarrow \infty}. From Corollary 18 of Notes 3, one has a Taylor expansion

\displaystyle f_1(z) = \sum_{n=0}^\infty a_n (z-z_0)^n

for some coefficients {a_0,a_1,\dots \in {\bf C}} that is absolutely convergent in the disk {D(z_0,R)}. One cannot directly apply this Taylor expansion to {f_2}. However, observe that the function {z \mapsto f_2(z_0 + \frac{1}{z})} is holomorphic in the punctured disk {D(0,1/r) \backslash \{0\}}, and goes to zero as one approaches zero. By Riemann’s theorem (Exercise 35 from Notes 3), this function may be extended to {D(0,1/r)} to a holomorphic function that vanishes at the origin. Applying Corollary 18 of Notes 3 again, we conclude that there is a Taylor expansion

\displaystyle f_2( z_0 + \frac{1}{z} ) = \sum_{n=1}^\infty a_{-n} z^n

for some coefficients {a_{-1}, a_{-2},\dots \in {\bf C}} that is absolutely convergent in the punctured disk {D(0,1/r) \backslash \{0\}}. Changing variables, we conclude that

\displaystyle f_2(z) = \sum_{n=-\infty}^{-1} a_n (z-z_0)^n

for {z} in {\{ z: |z-z_0| > r \}}, and thus

\displaystyle f(z) = \sum_{n=-\infty}^\infty a_n (z-z_0)^n \ \ \ \ \ (4)

for all {z} in (3), with the doubly infinite series on the right-hand side being absolutely convergent. This series is known as the Laurent series in the annulus (3). The coefficients {a_n} may be explicitly computed in terms of {f}:

Exercise 4 (Fourier inversion formula) Let {f: U \rightarrow {\bf C}} be holomorphic on some open set {U} that contains an annulus of the form (3), and let {a_n} be the coefficients of the Laurent expansion (4) in this annulus. Show that the coefficients {a_n} are uniquely determined by {f} and {r,R}, and are given by the formula

\displaystyle a_n = \frac{1}{2\pi i} \int_\gamma \frac{f(z)}{(z-z_0)^{n+1}}\ dz

for all integers {n}, whenever {\gamma} is a simple closed curve in the annulus with {W_\gamma(z_0) = 1}. Also establish the bounds

\displaystyle \liminf_{n \rightarrow +\infty} |a_n|^{-1/n} \geq R \ \ \ \ \ (5)

and

\displaystyle \limsup_{n \rightarrow -\infty} |a_n|^{-1/n} \leq r. \ \ \ \ \ (6)

The following modification of the above exercise may help explain the terminology “Fourier inversion formula”.

Exercise 5 (Fourier inversion formula, again) Let {0 < r < 1 < R}.

The Laurent series for a given function can vary as one varies the annulus. Consider for instance the function {f(z) = \frac{1}{1-z}}. In the annulus {\{ z: 0 < |z| < 1 \}}, the Laurent expansion coincides with the Taylor expansion:

\displaystyle \frac{1}{1-z} = \sum_{n=0}^\infty z^n = 1 + z + z^2 + \dots.

On the other hand, in the exterior region {\{ z: |z| > 1\}}, the Taylor expansion is no longer convergent. Instead, if one writes {\frac{1}{1-z} = \frac{-1/z}{1-1/z}} and uses the geometric series formula, one instead has the Laurent expansion

\displaystyle \frac{1}{1-z} = \sum_{n=-\infty}^{-1} -z^n = -\frac{1}{z} - \frac{1}{z^2} - \dots

in this region.
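To see the dependence on the annulus concretely, here is a small Python sketch that approximates the Laurent coefficients of {\frac{1}{1-z}} from the contour integral formula of Exercise 4, once on the circle {|z|=1/2} and once on the circle {|z|=2}. The function name laurent_coefficient and the sample count are hypothetical choices made for this illustration.

```python
import cmath
import math

def laurent_coefficient(f, n, radius, center=0j, samples=4096):
    """Approximate a_n = (1/(2 pi i)) * integral of f(z)/(z - center)^(n+1) dz
    over the anticlockwise circle |z - center| = radius."""
    total = 0j
    for k in range(samples):
        e = cmath.exp(2j * math.pi * k / samples)
        z = center + radius * e
        # On the circle, dz / (2 pi i (z - center)) equals dtheta / (2 pi),
        # so a_n is the average of f(z) / (z - center)^n over the circle.
        total += f(z) / (radius * e) ** n
    return total / samples

def f(z):
    return 1 / (1 - z)

# Coefficients a_{-3}, ..., a_3 in the two regions.
inner = [laurent_coefficient(f, n, 0.5) for n in range(-3, 4)]
outer = [laurent_coefficient(f, n, 2.0) for n in range(-3, 4)]
print([round(a.real, 6) for a in inner])
print([round(a.real, 6) for a in outer])
```

On the inner circle the coefficients should be close to {0} for negative {n} and {1} for non-negative {n}; on the outer circle they should be close to {-1} for negative {n} and {0} otherwise, matching the two expansions above.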

Exercise 6 Find the Laurent expansions for the function {f(z) := \frac{1}{(z-1)(z-2)}} in the regions {\{ z: 0 < |z| < 1 \}}, {\{ z: 1 < |z| < 2 \}}, and {\{ z: |z| > 2 \}}. (Hint: use partial fractions.)

We can use Laurent series to analyse an isolated singularity. Suppose that {f: D(z_0,r) \backslash \{z_0\} \rightarrow {\bf C}} is holomorphic on a punctured disk {D(z_0,r) \backslash \{z_0\}}. By the above discussion, we have a Laurent series expansion (4) in this punctured disk. If the singularity is removable, then the Laurent series must coincide with the Taylor series (by the uniqueness component of Exercise 4), so in particular {a_n=0} for all negative {n}; conversely, if {a_n} vanishes for all negative {n}, then the Laurent series matches up with a convergent Taylor series and so the singularity is removable. We then adopt the following classification:

  • (i) {f} has a removable singularity at {z_0} if one has {a_n=0} for all negative {n}. If furthermore there is an {m \geq 0} such that {a_m \neq 0} and {a_n=0} for {n < m}, we say that {f} has a zero of order {m} at {z_0} (after removing the singularity). Zeroes of order {1} are known as simple zeroes, zeroes of order {2} are known as double zeroes, and so forth.
  • (ii) {f} has a pole of order {m} at {z_0} for some {m \geq 1} if one has {a_{-m} \neq 0}, and {a_n=0} for all {n < -m}. Poles of order {1} are known as simple poles, poles of order {2} are double poles, and so forth.
  • (iii) {f} has an essential singularity if {a_n \neq 0} for infinitely many negative {n}.

It is clear that any holomorphic function {f: D(z_0,r) \backslash \{z_0\} \rightarrow {\bf C}} will be of exactly one of the above three categories. Also, from the uniqueness of Laurent series, shrinking {r} does not affect which of the three categories {f} will lie in (or what order of pole {f} will have, in the second category). Thus, we can classify any isolated singularity {z_0 \in S} of a holomorphic function {f: U \backslash S \rightarrow {\bf C}} with singularities as being either removable, a pole of some finite order, or an essential singularity by restricting {f} to a small punctured disk {D(z_0,r) \backslash \{z_0\}} and inspecting the Laurent coefficients {a_n} for negative {n}.

Example 7 The function {z \mapsto e^{1/z}} has a Laurent expansion

\displaystyle e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2! z^2} + \dots

and thus has an essential singularity at {z=0}.

It is clear from the definition (and the holomorphicity of Taylor series) that (as discussed in the introduction), a holomorphic function {f: U \backslash S \rightarrow {\bf C}} has a pole of order {m} at an isolated singularity {z_0 \in S} if and only if it is of the form {f(z) = \frac{g(z)}{(z-z_0)^m}} for some holomorphic {g: U \backslash S \cup \{z_0\} \rightarrow {\bf C}} with {g(z_0) \neq 0}. Similarly, a holomorphic function would have a zero of order {m} at {z_0} if and only if {f(z) = g(z) (z-z_0)^m} for some {g: U \backslash S \cup \{z_0\} \rightarrow {\bf C}} with {g(z_0) \neq 0}.

We can now define a class of functions that only have “nice” singularities:

Definition 8 (Meromorphic functions) Let {U} be an open subset of {{\bf C}}. A function {f:U \backslash S \rightarrow {\bf C}} defined on {U} outside of a singular set {S} is said to be meromorphic on {U} if

  • (i) {S} is closed and discrete (i.e., all points in {S} are isolated); and
  • (ii) Every {z_0 \in S} is either a removable singularity or a pole of finite order.

Two meromorphic functions {f_1: U \backslash S_1 \rightarrow {\bf C}}, {f_2: U \backslash S_2 \rightarrow {\bf C}} are said to be equivalent if they agree on their common domain of definition {U \backslash (S_1 \cup S_2)}. It is easy to see that this is an equivalence relation. It is common to identify meromorphic functions up to equivalence, similarly to how in measure theory it is common to identify functions which agree almost everywhere.

Exercise 9 (Meromorphic functions form a field) Let {M(U)} denote the space of meromorphic functions on a connected open set {U \subset {\bf C}}, up to equivalence. Show that {M(U)} is a field (with the obvious field operations). What happens if {U} is not connected?

Exercise 10 (Order is a valuation) If {f: U \backslash S \rightarrow {\bf C}} is a meromorphic function, and {z_0 \in U}, define the order {\mathrm{ord}_{z_0}(f)} of {f} at {z_0} as follows:

  • (a) If {f} has a removable singularity at {z_0}, and has a zero of order {m} at {z_0} once the singularity is removed, then {\mathrm{ord}_{z_0}(f) := m}.
  • (b) If {f} is holomorphic at {z_0}, and has a zero of order {m} at {z_0}, then {\mathrm{ord}_{z_0}(f) := m}.
  • (c) If {f} has a pole of order {m} at {z_0}, then {\mathrm{ord}_{z_0}(f) := -m}.
  • (d) If {f} is identically zero, then {\mathrm{ord}_{z_0}(f) = +\infty}.

Establish the following facts:

  • (i) If {f_1: U \backslash S_1 \rightarrow {\bf C}} and {f_2: U \backslash S_2 \rightarrow {\bf C}} are equivalent meromorphic functions, then {\mathrm{ord}_{z_0}(f_1) = \mathrm{ord}_{z_0}(f_2)} for all {z_0 \in U}. In particular, one can meaningfully define the order of an element of {M(U)} at any point {z_0} in {U}, where {M(U)} is as in the preceding exercise.
  • (ii) If {f,g \in M(U)} and {z_0 \in U}, show that {\mathrm{ord}_{z_0}(fg) = \mathrm{ord}_{z_0}(f) + \mathrm{ord}_{z_0}(g)}. If {g} is not zero, show that {\mathrm{ord}_{z_0}(f/g) = \mathrm{ord}_{z_0}(f) - \mathrm{ord}_{z_0}(g)}.
  • (iii) If {f,g \in M(U)} and {z_0 \in U}, show that {\mathrm{ord}_{z_0}(f+g) \geq \min( \mathrm{ord}_{z_0}(f), \mathrm{ord}_{z_0}(g) )}. Furthermore, show that if {\mathrm{ord}_{z_0}(f) \neq \mathrm{ord}_{z_0}(g)}, then the above inequality is in fact an equality.

In the language of abstract algebra, the above facts are asserting that {\mathrm{ord}_{z_0}} is a valuation on the field {M(U)}.
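For rational functions the order can be computed by hand, which gives a quick way to experiment with these properties. The Python sketch below is only an illustration under stated assumptions: a rational function is represented as a hypothetical pair (numerator, denominator) of coefficient lists with the constant term first, and the order at {z_0} is taken to be the multiplicity of {z_0} as a root of the numerator minus its multiplicity as a root of the denominator. All helper names are invented for this example.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [Fraction(a) + b for a, b in zip(p, q)]

def root_multiplicity(p, z0):
    """Multiplicity of z0 as a root of the nonzero polynomial p."""
    if not any(p):
        raise ValueError("the zero polynomial has order +infinity")
    m = 0
    while True:
        # Synthetic (Horner) division of p by (z - z0), highest degree first.
        quotient, carry = [], Fraction(0)
        for a in reversed(p):
            carry = a + carry * z0
            quotient.append(carry)
        remainder = quotient.pop()  # this is p(z0)
        if remainder != 0:
            return m
        p = list(reversed(quotient))
        m += 1

def order(f, z0):
    """Order at z0 of the rational function f = (numerator, denominator)."""
    return root_multiplicity(f[0], z0) - root_multiplicity(f[1], z0)

def mul(f, g):
    return poly_mul(f[0], g[0]), poly_mul(f[1], g[1])

def add(f, g):
    num = poly_add(poly_mul(f[0], g[1]), poly_mul(g[0], f[1]))
    return num, poly_mul(f[1], g[1])

z0 = 1
f = ([1, -2, 1], [1, 1])       # (z-1)^2 / (z+1): order 2 at z0
g = ([2, 1], [-1, 3, -3, 1])   # (z+2) / (z-1)^3: order -3 at z0
h = ([1], [-1, 1])             # 1 / (z-1): order -1 at z0
k = ([-2, 1], [-1, 1])         # (z-2) / (z-1): order -1 at z0

print(order(mul(f, g), z0))    # additivity: 2 + (-3)
print(order(add(f, g), z0))    # orders differ, so the minimum is attained
print(order(add(h, k), z0))    # equal orders: h + k = 1, strictly above the minimum
```

The last example shows why (iii) is only an inequality in general: the two simple poles cancel, and the sum has order {0} rather than {-1}.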

The behaviour of a holomorphic function near an isolated singularity depends on the type of singularity.

Theorem 11 Let {f: U \backslash S \rightarrow {\bf C}} be holomorphic on an open set {U} outside of a singular set {S}, and let {z_0} be an isolated singularity in {S}.

Proof: Part (i) is obvious. Part (ii) is immediate from the factorisation {f(z) = g(z) / (z-z_0)^m} and noting that {g(z)} converges to the non-zero value {g(z_0)} as {z \rightarrow z_0}. The {c=\infty} case of (iii) follows from Riemann’s theorem on removable singularities (Exercise 35 from Notes 3). Now suppose {c} is finite. If (iii) failed, then there exist {\varepsilon, r > 0} such that {f} avoids the disk {D(c,\varepsilon)} on the domain {D(z_0,r) \backslash \{z_0\}}. In particular, the function {\frac{1}{f-c}} is bounded and holomorphic on {D(z_0,r) \backslash \{z_0\}}, and thus extends holomorphically to {D(z_0,r)} by Riemann’s theorem. This function cannot vanish identically, so we must have {\frac{1}{f(z)-c} = (z-z_0)^m g(z)} on {D(z_0,r) \backslash \{z_0\}} for some {m \geq 0} and some holomorphic {g: D(z_0,r) \rightarrow {\bf C}} that does not vanish at {z_0}. Rearranging this as {f(z) = c + \frac{1/g(z)}{(z-z_0)^m}}, we see that {f} has a pole or removable singularity at {z_0}, a contradiction. \Box
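The behaviour near an essential singularity can be made quite explicit for the function {z \mapsto e^{1/z}} of Example 7. For any non-zero complex number {c}, the points {z_k := 1/(\mathrm{Log}(c) + 2\pi i k)} tend to zero as {k \rightarrow \infty} and satisfy {e^{1/z_k} = c}, so this particular function actually attains the value {c} arbitrarily close to the origin. The Python sketch below checks this numerically; the function name preimages_near_zero and the sample value of {c} are hypothetical choices for this illustration.

```python
import cmath
import math

def preimages_near_zero(c, count=5, start=1):
    """Return the points z_k = 1 / (log c + 2 pi i k) for
    k = start, ..., start + count - 1, each satisfying exp(1/z_k) = c.
    Requires c != 0; cmath.log is the principal branch of the logarithm."""
    base = cmath.log(c)
    return [1 / (base + 2j * math.pi * k) for k in range(start, start + count)]

c = 3 - 4j  # an arbitrary non-zero target value
for k in (1, 10, 100, 1000):
    z = preimages_near_zero(c, count=1, start=k)[0]
    # |z| shrinks towards 0 while exp(1/z) stays equal to c up to rounding.
    print(k, abs(z), abs(cmath.exp(1 / z) - c))
```

The remaining values {c=0} and {c=\infty} are approached along the real axis instead, for instance by taking {z = -1/k} and {z = 1/k} respectively.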

In Theorem 56 below we will establish a significant strengthening of the Casorati-Weierstrass theorem known as the Great Picard Theorem.

Exercise 12 Let {f: {\bf C} \backslash S \rightarrow {\bf C}} be holomorphic in {{\bf C}} outside of a discrete set {S} of singularities. Let {z_0 \in {\bf C} \backslash S}. Show that the radius of convergence of the Taylor series of {f} around {z_0} is equal to the distance from {z_0} to the nearest non-removable singularity in {S}, or {+\infty} if no such non-removable singularity exists. (This fact provides a neat way to understand the rate of growth of a sequence {a_n}: form its generating function {\sum_{n=0}^\infty a_n z^n}, locate the singularities of that function, and find out how close they get to the origin. This is a simple example of the methods of analytic combinatorics in action.)
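To illustrate the parenthetical remark with a standard example not taken from these notes, consider the Fibonacci numbers {F_n}, whose generating function is {\sum_{n=0}^\infty F_n z^n = \frac{z}{1-z-z^2}}. The poles are the roots of {z^2+z-1}, and the one nearest the origin is {\frac{\sqrt{5}-1}{2}}, so one expects {F_n^{1/n}} to approach the reciprocal {\frac{1+\sqrt{5}}{2}}. The Python sketch below compares the two numerically; the function names are hypothetical.

```python
import math

def fibonacci(n):
    """Return F_n with F_0 = 0, F_1 = 1, using exact integer arithmetic."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Poles of z / (1 - z - z^2), i.e. the roots of z^2 + z - 1 = 0.
poles = [(-1 + math.sqrt(5)) / 2, (-1 - math.sqrt(5)) / 2]
radius = min(abs(p) for p in poles)  # distance to the nearest singularity

def nth_root_growth(n):
    """Compute F_n^(1/n) as exp(log(F_n) / n); math.log accepts large ints."""
    return math.exp(math.log(fibonacci(n)) / n)

for n in (10, 100, 1000):
    print(n, nth_root_growth(n), 1 / radius)
```

The convergence of {F_n^{1/n}} is slow (the error decays like {1/n}), which is typical of root-test estimates.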

A curious feature of the singularities in complex analysis is that the order of singularity is “quantised”: one can have a pole of order {1}, {2}, or {3} (for instance), but not a pole of order {2.5} or {0.7}. This quantisation can be exploited: if for instance one somehow knows that the order of the pole is less than {m+1-\varepsilon} for some integer {m} and real number {\varepsilon>0}, then the singularity must be removable or a pole of order at most {m}. The following exercise formalises this assertion:

Exercise 13 Let {f: D(z_0,r) \backslash \{z_0\} \rightarrow {\bf C}} be holomorphic on a disk {D(z_0,r)} except for a singularity at {z_0}. Let {m} be an integer, and suppose that there exist {C>0}, {\varepsilon>0} such that one has the upper bound

\displaystyle |f(z)| \leq C |z-z_0|^{-m-1+\varepsilon}

for all {z \in D(z_0,r) \backslash \{z_0\}}. Show that the singularity of {f} at {z_0} is either removable, or a pole of order at most {m} (the latter option is only possible when {m} is positive). (Hint: use Exercise 4 and a limiting argument to evaluate the Laurent coefficients {a_n} for {n \leq -1}.) In particular, if one has

\displaystyle |f(z)| \leq C |z-z_0|^{-1+\varepsilon}

for all {z \in D(z_0,r) \backslash \{z_0\}}, then the singularity is removable.

As mentioned in the introduction, the theory of meromorphic functions becomes cleaner if one replaces the complex plane {{\bf C}} with the Riemann sphere. This sphere is a model example of a Riemann surface, and we will now digress to briefly introduce this more general concept (though we will not develop the general theory of Riemann surfaces in any depth here). To motivate the definition, let us first recall from differential geometry the notion of a smooth {n}-dimensional manifold {M} (over the reals).

Definition 14 (Smooth manifold) Let {n \geq 1}, and let {M} be a topological space. An ({n}-dimensional real) atlas for {M} is an open cover {(U_\alpha)_{\alpha \in A}} of {M} together with a family of homeomorphisms {\phi_\alpha: U_\alpha \rightarrow V_\alpha} (known as coordinate charts) from each {U_\alpha} to an open subset {V_\alpha} of {{\bf R}^n}. Furthermore, the atlas is said to be smooth if for any {\alpha,\beta \in A}, the transition map {\phi_\beta \circ \phi_\alpha^{-1}: \phi_\alpha(U_\alpha \cap U_\beta) \rightarrow \phi_\beta(U_\alpha \cap U_\beta)}, which maps one open subset of {{\bf R}^n} to another, is required to be smooth (i.e., infinitely differentiable). A map {f: M \rightarrow M'} from one topological space {M} (equipped with a smooth atlas of coordinate charts {\phi_\alpha: U_\alpha \rightarrow V_\alpha} for {\alpha \in A}) to another {M'} (equipped with a smooth atlas of coordinate charts {\phi'_\beta: U'_\beta \rightarrow V'_\beta} for some {\beta \in B}) is said to be smooth if, for any {\alpha \in A} and {\beta \in B}, the maps {\phi'_\beta \circ f \circ \phi_\alpha^{-1}: \phi_\alpha( f^{-1}(U'_\beta) \cap U_\alpha ) \rightarrow V_\beta} are smooth; if {f} is invertible and {f} and {f^{-1}} are both smooth, we say that {f} is a diffeomorphism, and that {M} and {M'} are diffeomorphic. Two smooth atlases on {M} are said to be equivalent if the identity map from {M} (equipped with one of the two atlases) to {M} (equipped with the other atlas) is a diffeomorphism; this is easily seen to be an equivalence relation, and an equivalence class of such atlases is called a smooth structure on {M}. A smooth {n}-dimensional real manifold is a Hausdorff topological space {M} equipped with a smooth structure. (In some texts the mild additional condition of second countability on {M} is also imposed.) 
A map {f: M \rightarrow M'} between two smooth manifolds is said to be smooth, if the map {f} from {M} (equipped with one of the atlases in the smooth structure on {M}) to {M'} (equipped with one of the atlases in the smooth structure on {M'}) is smooth; it is easy to see that this definition is independent of the choices of atlas. We may similarly define the notion of a diffeomorphism between two smooth manifolds.

This definition may seem excessively complicated, but it captures the modern geometric philosophy that one should strive as much as possible to work with objects that are coordinate-independent in that they do not depend on which atlas of coordinate charts one picks within the equivalence class of the given smooth structure in order to perform computations or to define foundational concepts. One can also define smooth manifolds more abstractly, without explicit reference to atlases, by working instead with the structure sheaf of the rings {C^\infty(U)} of smooth real-valued functions on open subsets {U} of the manifold {M}, but we will not need to do so here.

Example 15 A simple example of a smooth {1}-dimensional manifold is the unit circle {S^1 = \{ z \in {\bf C}: |z| = 1 \}}; there are many equivalent atlases one could place on this circle to define the smooth structure, but one example would be the atlas consisting of the two charts {\phi_1: U_1 \rightarrow V_1}, {\phi_2: U_2 \rightarrow V_2}, defined by setting {V_1 := (-\pi, \pi)}, {V_2 := (0, 2\pi)}, {U_1 := \{ e^{i \theta}: \theta \in V_1 \} = S^1 \backslash \{-1\}}, {U_2 := \{ e^{i\theta}: \theta \in V_2 \} = S^1 \backslash \{1\}}, {\phi_1(e^{i\theta}) := \theta} for {\theta \in V_1}, and {\phi_2(e^{i\theta}) := \theta} for {\theta \in V_2}. Another smooth manifold, which turns out to be diffeomorphic to the unit circle {S^1}, is the one-point compactification {{\bf R} \cup \{\infty\}} of the real numbers, with the two charts {\phi_1: U_1 \rightarrow V_1}, {\phi_2: U_2 \rightarrow V_2} defined by setting {V_1=V_2:={\bf R}}, {U_1 := {\bf R}}, {U_2 := ({\bf R} \backslash \{0\}) \cup \{\infty\}}, {\phi_1} to be the identity map, and {\phi_2} defined by setting {\phi_2(x) := 1/x} for {x \in {\bf R} \backslash \{0\}} and {\phi_2(\infty) := 0}.

Exercise 16 Verify that the unit circle {S^1} is indeed diffeomorphic to the one-point compactification {{\bf R} \cup \{\infty\}}.

A Riemann surface is defined similarly to a smooth manifold, except that the dimension {n} is restricted to be one, the reals {{\bf R}} are replaced with the complex numbers, and the requirement of smoothness is replaced with holomorphicity (thus Riemann surfaces are to the complex numbers as smooth curves are to the real numbers). More precisely:

Definition 17 (Riemann surface) Let {M} be a Hausdorff topological space. A holomorphic atlas on {M} is an open cover {(U_\alpha)_{\alpha \in A}} of {M} together with a family of homeomorphisms {\phi_\alpha: U_\alpha \rightarrow V_\alpha} (known as coordinate charts) from each {U_\alpha} to an open subset {V_\alpha} of {{\bf C}}, such that, for any {\alpha,\beta \in A}, the transition map {\phi_\beta \circ \phi_\alpha^{-1}: \phi_\alpha(U_\alpha \cap U_\beta) \rightarrow \phi_\beta(U_\alpha \cap U_\beta)}, which maps one open subset of {{\bf C}} to another, is required to be holomorphic. A map {f: M \rightarrow M'} from one space {M} (equipped with coordinate charts {\phi_\alpha: U_\alpha \rightarrow V_\alpha} for {\alpha \in A}) to another {M'} (equipped with coordinate charts {\phi'_\beta: U'_\beta \rightarrow V'_\beta} for some {\beta \in B}) is said to be holomorphic if, for any {\alpha \in A} and {\beta \in B}, the maps {\phi'_\beta \circ f \circ \phi_\alpha^{-1}: \phi_\alpha( f^{-1}(U'_\beta) \cap U_\alpha ) \rightarrow V_\beta} are holomorphic; if {f} is invertible and {f} and {f^{-1}} are both holomorphic, we say that {f} is a complex diffeomorphism, and that {M} and {M'} are complex diffeomorphic. Two holomorphic atlases on {M} are said to be equivalent if the identity map from {M} (equipped with one of the atlases) to {M} (equipped with the other atlas) is a complex diffeomorphism; this is easily seen to be an equivalence relation, and we refer to such an equivalence class as a (one-dimensional) complex structure on {M}. A Riemann surface is a Hausdorff topological space {M}, equipped with a one-dimensional complex structure. (Again, in some texts the hypothesis of second countability is imposed. This makes little difference in practice, as most Riemann surfaces one actually encounters will be second countable.)

By considering dimensions {n} greater than one, one can arrive at the more general notion of a complex manifold, the study of which is the focus of complex geometry (and also plays a central role in the closely related fields of several complex variables and complex algebraic geometry). However, we will not need to deal with higher-dimensional complex manifolds in this course. The notion of a Riemann surface should not be confused with that of a Riemannian manifold, which is the topic of study of Riemannian geometry rather than complex geometry.

Clearly any open subset {U} of the complex numbers {{\bf C}} is a Riemann surface, in which one can use the atlas that only consists of one “tautological” chart, the identity map {\phi: U \rightarrow U}. More generally, any open subset of a Riemann surface is again a Riemann surface. If {U,V} are open subsets of the complex numbers, and {f: U \rightarrow V} is a map, then by unpacking all the definitions we see that {f: U \rightarrow V} is holomorphic in the sense of Definition 17 if and only if it is holomorphic in the usual sense.

Now we come to the Riemann sphere {{\bf C} \cup \{\infty\}}, which is to the complex numbers as {{\bf R} \cup \{\infty\}} is to the real numbers. As a set, this is the complex numbers {{\bf C}} with one additional point (the point at infinity) {\infty} attached. Topologically, this is the one-point compactification of the complex numbers {{\bf C}}: the open sets of {{\bf C} \cup \{\infty\}} are either subsets of {{\bf C}} that were already open, or complements {({\bf C} \cup \{\infty\}) \backslash K} of compact subsets {K} of {{\bf C}}. As a Riemann surface, the complex structure can be described by the atlas of coordinate charts {\phi_1: U_1 \rightarrow V_1}, {\phi_2: U_2 \rightarrow V_2}, where {V_1 = V_2 := {\bf C}}, {U_1 := {\bf C}}, {U_2 := ({\bf C} \backslash \{0\}) \cup \{\infty\}}, {\phi_1} is the identity map, and {\phi_2(z)} equals {1/z} for {z \in {\bf C} \backslash \{0\}} with {\phi_2(\infty) = 0}. It is not difficult to verify that this is indeed a Riemann surface (basically because the map {z \mapsto \frac{1}{z}} is holomorphic on {{\bf C} \backslash \{0\}}). One can identify the Riemann sphere with a geometric sphere, and specifically the sphere {S^2 := \{ (z,t) \in {\bf C} \times {\bf R}: |z|^2 + (t-\frac{1}{2})^2 = \frac{1}{4} \}}, through the device of stereographic projection through the north pole {N := (0,1) \in {\bf C} \times {\bf R}}, identifying a point {z} in {{\bf C} \subset {\bf C} \cup \{\infty\}} with the point {(\frac{z}{1+|z|^2}, \frac{|z|^2}{1+|z|^2})} on {S^2 \backslash \{N\}} collinear with that point, and the point at infinity {\infty} identified with the north pole {N}. This geometric perspective is especially helpful when thinking about Möbius transformations, as is for instance exemplified by this excellent video. (We may cover Möbius transformations in a subsequent set of notes.)
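The stereographic identification is easy to test numerically. The following Python sketch implements the map {z \mapsto (\frac{z}{1+|z|^2}, \frac{|z|^2}{1+|z|^2})} described above together with its inverse on {S^2 \backslash \{N\}}; the function names to_sphere and from_sphere are hypothetical, and the inverse formula {(\zeta,t) \mapsto \zeta/(1-t)} is derived here from the observation that {1-t = \frac{1}{1+|z|^2}}.

```python
def to_sphere(z):
    """Map z in C to the point (zeta, t) in C x R lying on the sphere
    |zeta|^2 + (t - 1/2)^2 = 1/4."""
    s = 1 / (1 + abs(z) ** 2)
    return z * s, abs(z) ** 2 * s

def from_sphere(zeta, t):
    """Inverse of to_sphere on the sphere minus the north pole N = (0, 1).
    Since 1 - t = 1 / (1 + |z|^2), one recovers z = zeta / (1 - t)."""
    return zeta / (1 - t)

z = 1 + 2j
zeta, t = to_sphere(z)
print(abs(zeta) ** 2 + (t - 0.5) ** 2)   # should be 1/4 up to rounding
print(from_sphere(zeta, t))              # should recover z

# Points far from the origin land close to the north pole N = (0, 1).
print(to_sphere(1e6))
```

The last line illustrates why the point at infinity is identified with {N}: as {|z| \rightarrow \infty}, the image point tends to {(0,1)}.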

By unpacking the definitions, we can now work out what it means for a function to be holomorphic to or from the Riemann sphere. For instance, if {f: U \rightarrow {\bf C} \cup \{\infty\}} is a map from an open subset {U} of {{\bf C}} to the Riemann sphere {{\bf C} \cup \{\infty\}}, then {f} is holomorphic if and only if

  • (i) {f} is continuous;
  • (ii) {f} is holomorphic on the set {\{ z \in U: f(z) \neq \infty\}} (which is open thanks to (i)); and
  • (iii) {1/f} is holomorphic on the set {\{ z \in U: f(z) \neq 0 \}} (which is open thanks to (i)), where we adopt the convention {1/\infty = 0}.

Similarly, if a function {f: U \rightarrow {\bf C} \cup \{ \infty\}} is a map from an open subset {U} of the Riemann sphere {{\bf C} \cup \{\infty\}} to the Riemann sphere, then {f} is holomorphic if and only if

  • (i) {z \mapsto f(z)} is holomorphic on {U \cap {\bf C}}; and
  • (ii) {z \mapsto f(1/z)} is holomorphic on {\{ z \in {\bf C} \cup \{\infty\}: 1/z \in U \}}, where we again adopt the convention {1/\infty=0}.

We can then identify meromorphic functions with holomorphic functions on the Riemann sphere:

Exercise 18 Let {U \subset {\bf C}} be open, let {S} be a discrete subset of {U}, and let {f: U \backslash S \rightarrow {\bf C}} be a function. Show that the following are equivalent:

  • (i) {f} is meromorphic on {U}.
  • (ii) {f} is the restriction of a holomorphic function {\tilde f: U \rightarrow {\bf C} \cup \{\infty\}} to the Riemann sphere.

Furthermore, if (ii) holds, show that {\tilde f} is uniquely determined by {f}, and is unaffected if one replaces {f} with an equivalent meromorphic function.

Among other things, this exercise implies that the composition of two meromorphic functions is again meromorphic (outside of where the composition is undefined, of course).

Exercise 19 Let {f: {\bf C} \cup \{\infty\} \rightarrow {\bf C} \cup \{\infty\}} be a holomorphic map from the Riemann sphere to itself. Show that {f} is a rational function in the sense that there exist polynomials {P(z), Q(z)} of one complex variable, with {Q} not identically zero, such that {f(z) = P(z) / Q(z)} for all {z \in {\bf C}} with {Q(z) \neq 0}. (Hint: show that {f} has finitely many poles, and eliminate them by multiplying {f} by appropriate linear factors. Then use Exercise 29 from Notes 3.)

Exercise 20 (Partial fractions) Let {P(z)} be a polynomial of one complex variable, which by the fundamental theorem of algebra we may write as

\displaystyle P(z) = a (z-z_1)^{d_1} \dots (z-z_k)^{d_k}

for some distinct roots {z_1,\dots,z_k \in {\bf C}}, some non-zero {a}, and some positive integers {d_1,\dots,d_k}. Let {Q(z)} be another polynomial of one complex variable. Show that there exist unique polynomials {R_1(z),\dots,R_k(z),S(z)}, with each {R_i} having degree less than {d_i} for {i=1,\dots,k}, such that one has the partial fraction decomposition

\displaystyle \frac{Q(z)}{P(z)} = \sum_{j=1}^k \frac{R_j(z)}{(z-z_j)^{d_j}} + S(z)

for all {z \in {\bf C} \backslash \{z_1,\dots,z_k\}}. Furthermore, show that {S} vanishes if the degree {\mathrm{deg}(Q)} of {Q} is less than the degree {\mathrm{deg}(P)} of {P}, and has degree {\mathrm{deg}(Q)-\mathrm{deg}(P)} otherwise.

— 2. The residue theorem —

Now we can prove a significant generalisation of the Cauchy theorem and Cauchy integral formula, known as the residue theorem.

Suppose one has a function {f: U \backslash S \rightarrow {\bf C}} holomorphic on an open set {U \subset {\bf C}} outside of a singular set {S}. If {z_0 \in S} is an isolated singularity of {f}, then we have a Laurent expansion

\displaystyle f(z) = \sum_{n=-\infty}^\infty a_n (z-z_0)^n

which is convergent in some punctured disk {D(z_0,r) \backslash \{z_0\}}. The coefficient {a_{-1}} plays a privileged role and is known as the residue of {f} at {z_0}; we denote it by {\mathrm{Res}(f; z_0)}. Clearly this quantity is local in the sense that it only depends on the behaviour of {f} in a neighbourhood of {z_0}; in particular, it does not depend on the domain {U} so long as {z_0} remains inside of that domain. By convention, we also set {\mathrm{Res}(f;z_0)=0} if {f} is holomorphic at {z_0} (i.e., if {z_0 \in U \backslash S}).

We then have

Theorem 21 (Residue theorem) Let {U \subset {\bf C}} be a simply connected open set, and let {f: U \backslash S \rightarrow {\bf C}} be holomorphic outside of a closed discrete singular set {S} (thus all singularities in {S} are isolated singularities). Let {\gamma} be a closed curve in {U \backslash S}. Then

\displaystyle \frac{1}{2\pi i} \int_\gamma f(z)\ dz = \sum_{z_0 \in S} W_\gamma(z_0) \mathrm{Res}(f; z_0),

where only finitely many of the terms on the right-hand side are non-zero.

Proof: The image of {\gamma} is contained in some large ball; restricting {U} and {S} to this ball, we may assume without loss of generality that {S} is both discrete and compact, and thus finite (by the Bolzano-Weierstrass theorem).

Next, we reduce to the case where all the residues {\mathrm{Res}(f;z_0)} vanish. We introduce the rational function {g: {\bf C} \backslash S \rightarrow {\bf C}} defined by

\displaystyle g(z) := \sum_{z_0 \in S} \frac{\mathrm{Res}(f;z_0)}{z-z_0}.

From Laurent expansion around each singularity {z_0} we see that {\mathrm{Res}(g;z_0) = \mathrm{Res}(f;z_0)} for all {z_0 \in S}, thus {\mathrm{Res}(f-g;z_0) = 0}. Also, from the definition of winding number (see Definition 38 of Notes 3) we have

\displaystyle \frac{1}{2\pi i} \int_\gamma g(z)\ dz = \sum_{z_0 \in S} W_\gamma(z_0) \mathrm{Res}(f; z_0).

Setting {F := f-g}, it thus suffices to show that

\displaystyle \int_\gamma F(z)\ dz = 0. \ \ \ \ \ (9)

As {U} is simply connected, {\gamma: [a,b] \rightarrow U} is homotopic in {U} (as closed contours) to a point. Let {\tilde \gamma: [0,1] \times [a,b] \rightarrow U} denote the homotopy. We would like to mimic the proof of Cauchy’s theorem (Theorem 4 of Notes 3) to conclude (9). The difficulty is that the homotopy {\tilde \gamma} may pass through points {z_0} in {S}. However, note from the vanishing of the residue {\mathrm{Res}(F;z_0)} that one has a Laurent expansion of the form

\displaystyle F(z) = \sum_{n=0}^\infty a_n (z-z_0)^n + \sum_{n=-\infty}^{-2} a_n (z-z_0)^n

for some coefficients {a_n, n \neq -1}, in some punctured disk {D(z_0,r) \backslash \{z_0\}}, with both series being absolutely convergent in this punctured disk. From term by term differentiation (see Theorem 15 of Notes 1) we see that {F} has an antiderivative in this punctured disk, namely

\displaystyle G(z) := \sum_{n=0}^\infty a_n \frac{(z-z_0)^{n+1}}{n+1} + \sum_{n=-\infty}^{-2} a_n \frac{(z-z_0)^{n+1}}{n+1}

(note how crucial it is that the {n=-1} term is absent in order to form this antiderivative). The absolute convergence of the series on the right-hand side in {D(z_0,r)} can be seen from the comparison test. From the fundamental theorem of calculus, we thus conclude that {F} is conservative on {D(z_0,r) \backslash \{z_0\}}. Also, for any {z_0 \in U} that is not in {S}, we see from Cauchy’s theorem that {F} is conservative on {D(z_0,r)} for some radius {r>0}. Putting this together using a compactness argument, we conclude that there exists a radius {r>0}, such that for all {z_0} in the image of the homotopy {\tilde \gamma}, the function {F} is conservative in {D(z_0,r) \backslash S}.

Now we repeat the proof of Cauchy’s theorem (Theorem 4 of Notes 3), discretising the homotopy {\tilde \gamma} into short closed polygonal paths {C_{i,j}} (each of diameter less than {r}) around which the integral of {F} is zero, to conclude (9). The argument is completely analogous, save for the technicality that the paths {C_{i,j}} may occasionally pass through one of the points in {S}. But this can be easily rectified by perturbing each of the paths {C_{i,j}} by adding a short detour around any point of {S} that is passed through; we leave the details to the interested reader. \Box

Combining the residue theorem with the Jordan curve theorem, we obtain the following special case, which is already enough for many applications:

Corollary 22 (Residue theorem for simple closed contours) Let {\gamma} be a simple closed anticlockwise contour in {{\bf C}}. Suppose that {f: U \backslash S \rightarrow {\bf C}} is holomorphic on an open set {U} containing the image and interior of {\gamma}, outside of a closed discrete {S} that does not intersect the image of {\gamma}. Then we have

\displaystyle \frac{1}{2\pi i} \int_\gamma f(z)\ dz = \sum_{z_0 \in S \cap \mathrm{int}(\gamma)} \mathrm{Res}(f;z_0).

If {\gamma} is oriented clockwise instead of anticlockwise, then we instead have

\displaystyle \frac{1}{2\pi i} \int_\gamma f(z)\ dz = -\sum_{z_0 \in S \cap \mathrm{int}(\gamma)} \mathrm{Res}(f;z_0).
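
As a numerical sanity check of Corollary 22 (a sketch that is not part of the original notes), one can approximate the contour integral over a circle by a Riemann sum and compare it with the sum of the enclosed residues. The test function {f(z) = e^z/(z(z-2))} and the helper name `circle_integral` are illustrative assumptions. This function has simple poles at {0} and {2}, with residues {-1/2} and {e^2/2} respectively.

```python
import cmath
import math

def circle_integral(f, center=0j, radius=1.0, n=2000):
    """Approximate (1/(2*pi*i)) * integral of f over the anticlockwise
    circle |z - center| = radius, using n equally spaced sample points."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)          # point on the unit circle
        z = center + radius * w
        dz = 1j * radius * w * (2 * math.pi / n)     # gamma'(theta) * d(theta)
        total += f(z) * dz
    return total / (2j * math.pi)

def f(z):
    # Illustrative test function with simple poles at 0 and 2.
    return cmath.exp(z) / (z * (z - 2))

inside_unit = circle_integral(f, radius=1.0)   # encloses only the pole at 0
inside_big = circle_integral(f, radius=3.0)    # encloses the poles at 0 and 2

print(inside_unit)   # close to -1/2
print(inside_big)    # close to -1/2 + e^2/2
```

Reversing the direction of traversal flips the sign of both values, in line with the clockwise case of the corollary.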

Exercise 23 (Homology version of residue theorem) Show that the residue theorem continues to hold when the closed curve {\gamma} is replaced by a {1}-cycle (as in Exercise 63 of Notes 3) that avoids all the singularities in {S}, and the requirement that {U} be simply connected is replaced by the requirement that {U} contains all the points {z_0} outside of the image of {\gamma} where {W_\gamma(z_0) \neq 0}.

Exercise 24 (Exterior version of residue theorem) Let {\gamma} be a simple closed anticlockwise contour in {{\bf C}}. Suppose that {f: U \backslash S \rightarrow {\bf C}} is holomorphic on an open set {U} containing the image and exterior of {\gamma}, outside of a finite {S} that does not intersect the image of {\gamma}. Suppose also that {z f(z)} converges to a finite limit {c} in the limit {|z| \rightarrow \infty}. Show that

\displaystyle \frac{1}{2\pi i} \int_\gamma f(z)\ dz = c -\sum_{z_0 \in S \cap \mathrm{ext}(\gamma)} \mathrm{Res}(f;z_0).

If {\gamma} is oriented clockwise instead of anticlockwise, show instead that

\displaystyle \frac{1}{2\pi i} \int_\gamma f(z)\ dz = -c + \sum_{z_0 \in S \cap \mathrm{ext}(\gamma)} \mathrm{Res}(f;z_0).

In order to use the residue theorem effectively, one of course needs some tools to compute the residue at a given point. The Fourier inversion formula (4) expresses such residues as a contour integral, but this is not so useful in practice as often the best way to compute such integrals is via the residue theorem, leaving one back where one started! But if the singularity is not an essential one, we have some useful formulae:

Exercise 25 Let {f: U \backslash S \rightarrow {\bf C}} be holomorphic on an open set {U} outside of a singular set {S}, and let {z_0} be an isolated point of {S}.

  • (i) If {f} has a removable singularity at {z_0}, show that {\mathrm{Res}(f;z_0)=0}.
  • (ii) If {f} has a simple pole at {z_0}, show that {\mathrm{Res}(f;z_0)=\lim_{z \rightarrow z_0; z \in U \backslash S} f(z) (z-z_0)}.
  • (iii) If {f} has a pole of order at most {m} at {z_0} for some {m \geq 1}, show that

    \displaystyle \mathrm{Res}(f;z_0)= \frac{1}{(m-1)!} \lim_{z \rightarrow z_0; z \in U \backslash S} \frac{d^{m-1}}{dz^{m-1}} (f(z) (z-z_0)^m).

    In particular, if {f(z) = g(z)/(z-z_0)^m} near {z_0} for some {g} that is holomorphic at {z_0}, then

    \displaystyle \mathrm{Res}(f;z_0)= \frac{1}{(m-1)!} \frac{d^{m-1}}{dz^{m-1}} g(z_0).

Using these facts, show that Cauchy’s theorem (Theorem 14 from Notes 3), the Cauchy integral formula (Theorem 39 from Notes 3), and the higher order Cauchy integral formula (Exercise 40 from Notes 3) can be derived from the residue theorem. (Of course, this is not an independent proof of these theorems, as they were used in the proof of the residue theorem!)
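
The higher order pole formula can be checked numerically; the following is a minimal sketch under assumptions that are not in the original notes. For the illustrative family {f(z) = e^{2z}/z^m} one has {g(z) = e^{2z}}, so {g^{(m-1)}(0) = 2^{m-1}} and the residue at {0} should be {2^{m-1}/(m-1)!}. The helper `residue_by_contour` (a hypothetical name) computes the residue directly as a contour integral over a small circle.

```python
import cmath
import math

def residue_by_contour(f, z0, radius=0.5, n=2000):
    """Approximate (1/(2*pi*i)) * integral of f over the anticlockwise circle
    |z - z0| = radius. Since dz = i (z - z0) d(theta), this is the average of
    f(z) * (z - z0) over equally spaced points of the circle."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        total += f(z0 + radius * w) * radius * w
    return total / n

for m in range(1, 6):
    numeric = residue_by_contour(lambda z, m=m: cmath.exp(2 * z) / z ** m, 0j)
    predicted = 2 ** (m - 1) / math.factorial(m - 1)
    print(m, numeric.real, predicted)
```

The two printed columns should agree up to rounding error, which in particular confirms that the normalising factor is {1/(m-1)!}.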

The residue theorem can be applied in countless ways; we give only a small sample of them below.

Exercise 26 Use the residue theorem to give an alternate proof of the fundamental theorem of algebra, by considering the integral {\int_{\gamma_{0,R,\circlearrowleft}} \frac{z^{n-1}}{P(z)}\ dz} for a polynomial {P} of degree {n} and some large radius {R}.

Exercise 27 Let {f: {\bf C} \rightarrow {\bf C}} be a Dirichlet polynomial of the form

\displaystyle f(s) := \sum_{n=1}^\infty \frac{a_n}{n^s}

for some sequence {a_1,a_2,\dots} of complex numbers, with only finitely many of the {a_n} non-zero. Establish Perron’s formula

\displaystyle \sum_{n \leq x} a_n = \lim_{T \rightarrow +\infty} \frac{1}{2\pi i} \int_{\gamma_{c-iT \rightarrow c+iT}} f(s) \frac{x^s}{s}\ ds

for any real numbers {x,c>0} with {x} not an integer. What happens if {x} is an integer? Generalisations and variants of this formula, particularly with the Dirichlet polynomial replaced by more general Dirichlet series in which infinitely many of the {a_n} are allowed to be non-zero, are of particular use in analytic number theory; see for instance this previous blog post.
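
Perron's formula can be illustrated numerically by truncating the vertical line integral at a large but finite height {T}. The sketch below is an assumption-laden illustration, not a proof: the coefficients, the values {x = 3.5} and {c = 1}, the truncation {T = 2000}, the midpoint rule, and the function name `perron_partial_sum` are all choices made here for demonstration.

```python
import cmath
import math

def perron_partial_sum(coeffs, x, c=1.0, T=2000.0, steps=80000):
    """Approximate (1/(2*pi*i)) * integral of f(s) x^s / s over the segment
    from c - iT to c + iT, where f(s) = sum_n a_n / n^s and coeffs[n-1] = a_n.
    A midpoint rule in t is used, with s = c + i t and ds = i dt."""
    logs = [(math.log(n), a) for n, a in enumerate(coeffs, start=1) if a != 0]
    log_x = math.log(x)
    h = 2 * T / steps
    total = 0j
    for k in range(steps):
        s = complex(c, -T + (k + 0.5) * h)
        f_s = sum(a * cmath.exp(-s * log_n) for log_n, a in logs)
        total += f_s * cmath.exp(s * log_x) / s
    # ds = i dt, so the prefactor 1/(2*pi*i) becomes 1/(2*pi).
    return total * h / (2 * math.pi)

coeffs = [3, 1, 4, 1, 5]                 # a_1, ..., a_5 (illustrative)
approx = perron_partial_sum(coeffs, 3.5)
print(approx.real)                       # approximately a_1 + a_2 + a_3 = 8
```

The agreement is only approximate for finite {T}; the truncation error decays roughly like {1/T}, and more slowly when {x} is close to an integer.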

Exercise 28 (Spectral theorem for matrices) This exercise presumes some familiarity with linear algebra. Let {n} be a positive integer, and let {M_n({\bf C})} denote the ring of {n \times n} complex matrices. Let {A} be a matrix in {M_n({\bf C})}. The characteristic polynomial {\Delta(z) := \mathrm{det}(A - zI)}, where {I} is the {n \times n} identity matrix, is a polynomial of degree {n} in {z} with leading coefficient {(-1)^n}; we let {z_1,\dots,z_k} be the distinct zeroes of this polynomial, and let {d_1,\dots,d_k} be the multiplicities; thus by the fundamental theorem of algebra we have

\displaystyle \mathrm{det}(A-zI) = (-1)^n (z-z_1)^{d_1} \dots (z-z_k)^{d_k}.

We refer to the set {\{z_1,\dots,z_k\}} as the spectrum of {A}. Let {\gamma} be any closed anticlockwise curve that contains the spectrum of {A} in its interior, and let {U} be an open subset of {{\bf C}} that contains {\gamma} and its interior.

  • (i) Show that the resolvent {(A-zI)^{-1}} is a meromorphic function on {{\bf C}} with poles at the spectrum of {A}, where we call a matrix-valued function meromorphic if each of its {n^2} components is meromorphic. (Hint: use the adjugate matrix.)
  • (ii) For any holomorphic {f: U \rightarrow {\bf C}}, we define the matrix {f(A) \in M_n({\bf C})} by the formula

    \displaystyle f(A) := \frac{1}{2\pi i} \int_\gamma f(z) (zI-A)^{-1}\ dz

    (cf. the Cauchy integral formula). We refer to {f(A)} as the holomorphic functional calculus for {A} applied to {f}. Show that the matrix {f(A)} does not depend on the choice of {\gamma}, depends linearly on {f}, and equals the identity matrix when {f} is the constant function {1}. Furthermore, if {g: U \rightarrow {\bf C}} is the function {g(z) := z f(z)}, show that

    \displaystyle g(A) = A f(A) = f(A) A.

    Conclude in particular that if {P} is a polynomial

    \displaystyle P(z) = a_m z^m + \dots + a_1 z + a_0

    with complex coefficients {a_0,\dots,a_m \in {\bf C}}, then the function {P(A)} (as defined by the holomorphic functional calculus) matches how one would define {P(A)} algebraically, in the sense that

    \displaystyle P(A) = a_m A^m + \dots + a_1 A + a_0 I.

  • (iii) Prove the Cayley-Hamilton theorem {\Delta(A)=0}. (Note from (ii) that it does not matter whether one interprets {\Delta(A)} algebraically, or via the holomorphic functional calculus.)
  • (iv) If {f: U \rightarrow {\bf C}} is holomorphic, show that the matrix-valued function {z \mapsto (f(A)-f(z)) (A-zI)^{-1}} has only removable singularities in {U}.
  • (v) If {f,g: U \rightarrow {\bf C}} are holomorphic, establish the identity

    \displaystyle (fg)(A) = f(A) g(A).

  • (vi) Show that there exist matrices {P_1,\dots,P_k \in M_n({\bf C})} that are idempotent (thus {P_j^2=P_j} for all {j=1,\dots,k}), commute with each other and with {A}, sum to the identity (thus {P_1+\dots+P_k = I}), annihilate each other (thus {P_j P_{j'} = 0} for all distinct {j,j' = 1,\dots,k}) and are such that for each {j=1,\dots,k}, one has the nilpotency property

    \displaystyle (P_j (A - z_j I) P_j)^{d_j} = 0.

    In particular, we have the spectral decomposition

    \displaystyle A = \sum_{j=1}^k P_j (z_j I + N_j) P_j

    where each {N_j} is a nilpotent matrix with {N_j^{d_j} = 0}. Finally, show that the range of {P_j} (viewed as a linear operator from {{\bf C}^n} to itself) has dimension {d_j}. Find a way to interpret each {P_j} as the (negative of the) “residue” of the resolvent operator {(A-zI)^{-1}} at {z_j}.
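
The holomorphic functional calculus can be explored numerically for a small matrix. The sketch below makes several assumptions that are not in the original exercise: it works only with {2 \times 2} matrices stored as nested lists, it uses the illustrative matrix with rows {(2,1)} and {(0,3)}, it writes the resolvent in the form {(zI-A)^{-1}} (the sign convention under which the constant function {1} is sent to the identity), and the helper names are hypothetical.

```python
import cmath
import math

def resolvent(A, z):
    """Return (zI - A)^{-1} for a 2x2 matrix A given as nested lists."""
    a, b = z - A[0][0], -A[0][1]
    c, d = -A[1][0], z - A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def functional_calculus(f, A, center, radius, n=4000):
    """Approximate (1/(2*pi*i)) * integral of f(z) (zI - A)^{-1} dz over the
    anticlockwise circle |z - center| = radius."""
    result = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        z = center + radius * w
        R = resolvent(A, z)
        weight = f(z) * radius * w / n    # f(z) dz / (2*pi*i) for this sample
        for i in range(2):
            for j in range(2):
                result[i][j] += weight * R[i][j]
    return result

def max_diff(M, N):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(M[i][j] - N[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [0.0, 3.0]]              # eigenvalues 2 and 3

one_of_A = functional_calculus(lambda z: 1.0, A, center=0j, radius=5.0)
z_of_A = functional_calculus(lambda z: z, A, center=0j, radius=5.0)
exp_of_A = functional_calculus(cmath.exp, A, center=0j, radius=5.0)
# A small circle around the eigenvalue 2 alone yields the spectral projection.
P_2 = functional_calculus(lambda z: 1.0, A, center=2 + 0j, radius=0.5)

print(one_of_A)   # close to the identity matrix
print(z_of_A)     # close to A
print(P_2)        # close to [[1, -1], [0, 0]]
```

Here `P_2` is idempotent and commutes with {A}, as predicted by part (vi), and it is the residue of {(zI-A)^{-1}} at {z=2}, which is the negative of the residue of {(A-zI)^{-1}}.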

Under some additional hypotheses, it is possible to extend the analysis in the above exercise to infinite-dimensional matrices or other linear operators, but we will not do so here.

— 3. The argument principle —

We have not yet defined the complex logarithm {\log z} of a complex number {z}, but one of the properties we would expect of this logarithm is that its derivative should be the reciprocal function: {\frac{d}{dz} \log z = \frac{1}{z}}. In particular, by the chain rule we would expect the formula

\displaystyle \frac{d}{dz} \log f(z) = \frac{f'(z)}{f(z)} \ \ \ \ \ (10)

for a holomorphic function {f}, at least away from the zeroes of {f}. Inspired by this formal calculation, we refer to the function {\frac{f'(z)}{f(z)}} as the log-derivative of {f}. Observe that the product rule and quotient rule, when applied to complex differentiable functions {f,g} that are non-zero at some point {z}, give the formulae

\displaystyle \frac{(fg)'(z)}{(fg)(z)} = \frac{f'(z)}{f(z)} + \frac{g'(z)}{g(z)} \ \ \ \ \ (11)


\displaystyle \frac{(f/g)'(z)}{(f/g)(z)} = \frac{f'(z)}{f(z)} - \frac{g'(z)}{g(z)} \ \ \ \ \ (12)

which are of course consistent with the formal calculation (10), given how we expect the logarithm to act on products and quotients. Thus, for instance, if {P}, {Q} are polynomials that are factored as

\displaystyle P(z) = a (z-z_1)^{d_1} \dots (z-z_k)^{d_k}


\displaystyle Q(z) = b (z-w_1)^{e_1} \dots (z-w_l)^{e_l}

for some non-zero complex numbers {a,b}, distinct complex numbers {z_1,\dots,z_k,w_1,\dots,w_l}, and positive integers {d_1,\dots,d_k,e_1,\dots,e_l}, then the log-derivative of the rational function {P/Q} is given by

\displaystyle \frac{(P/Q)'(z)}{(P/Q)(z)} = \frac{d_1}{z-z_1} + \dots + \frac{d_k}{z-z_k} - \frac{e_1}{z-w_1} - \dots - \frac{e_l}{z-w_l}.

In particular, the log-derivative of {P/Q} is meromorphic with poles at {z_1,\dots,z_k, w_1,\dots,w_l}, with a residue of {+d_j} at each zero {z_j} of {P/Q}, and a residue of {-e_j} at each pole {w_j} of {P/Q}.

A general rule of thumb in complex analysis is that holomorphic functions behave like generalisations of polynomials, and meromorphic functions behave like generalisations of rational functions. In view of this rule of thumb and the above calculation, the following lemma should thus not be surprising:

Lemma 29 Let {f: U \backslash S \rightarrow {\bf C}} be a holomorphic function on an open set {U \subset {\bf C}} outside of a singular set {S}, and let {z_0} be either an element of {U \backslash S} or an isolated point of {S}.

  • (i) If {f} is holomorphic and non-zero at {z_0}, then the log-derivative {\frac{f'(z)}{f(z)}} is also holomorphic at {z_0}.
  • (ii) If {f} is holomorphic at {z_0} with a zero of order {m \geq 1}, then the log-derivative {\frac{f'(z)}{f(z)}} has a simple pole at {z_0} with residue {m}.
  • (iii) If {f} has a removable singularity at {z_0}, and is non-zero once the singularity is removed, then the log-derivative {\frac{f'(z)}{f(z)}} has a removable singularity at {z_0}.
  • (iv) If {f} has a removable singularity at {z_0}, and has a zero of order {m \geq 1} once the singularity is removed, then the log-derivative {\frac{f'(z)}{f(z)}} has a simple pole at {z_0} with residue {m}.
  • (v) If {f} has a pole of order {m\geq 1} at {z_0}, then the log-derivative {\frac{f'(z)}{f(z)}} has a simple pole at {z_0} with residue {-m}.

Proof: The claim (i) is obvious. For (ii), we use Taylor expansion to factor {f(z) = (z-z_0)^m g(z)} for some {g} holomorphic and non-zero near {z_0}, and then from (11) we have

\displaystyle \frac{f'(z)}{f(z)} = \frac{g'(z)}{g(z)} + \frac{m}{z-z_0}.

Since {\frac{g'}{g}} is holomorphic at {z_0}, the claim (ii) follows. The claim (v) is proven similarly using a factorisation {f(z) = g(z) / (z-z_0)^m}, and using (12) in place of (11). The claims (iii), (iv) then follow from (i), (ii) respectively after removing the singularity. \Box

Remark 30 Note that the lemma does not cover all possible singularity and zero scenarios. For instance, {f} could be identically zero, in which case the log-derivative is nowhere defined. If {f} has an essential singularity then the log-derivative can have a pole (as seen for instance by the example {f(z) = \exp( 1 / z^m )} for some {m \geq 1}) or another essential singularity (as can be seen for instance by the example {f(z) = \exp(\exp(1/z))}). Finally, if {f} has a non-isolated singularity, then the log-derivative could exhibit a wide range of behaviour (but probably will be quite wild as one approaches the singular set).

By combining the above lemma with the residue theorem, we obtain the argument principle:

Theorem 31 (Argument principle) Let {\gamma: [a,b] \rightarrow {\bf C}} be a simple closed anticlockwise contour. Let {U} be an open set containing {\gamma} and its interior. Let {f} be a meromorphic function on {U} that is holomorphic and non-zero on the image of {\gamma}. Suppose that after removing all the removable singularities of {f}, {f} has zeroes {z_1,\dots,z_k} in the interior of {\gamma} (of orders {d_1,\dots,d_k} respectively), and poles {w_1,\dots,w_l} in the interior of {\gamma} (of orders {e_1,\dots,e_l} respectively). ({f} is also allowed to have zeroes and poles in the exterior of {\gamma}.) Then we have

\displaystyle \sum_{j=1}^k d_j - \sum_{j=1}^l e_j = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\ dz \ \ \ \ \ (13)

\displaystyle = W_{f \circ \gamma}(0)

where {f \circ \gamma: [a,b] \rightarrow {\bf C}} is the closed contour {t \mapsto f(\gamma(t))}.

Proof: The first equality of (13) follows from the residue theorem and Lemma 29. From the change of variables formula (Exercise 16(ix) of Notes 2) we have

\displaystyle \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\ dz = \frac{1}{2\pi i} \int_{f \circ \gamma} \frac{1}{z}\ dz

and the second identity also follows. \Box
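
Both sides of the argument principle can be computed numerically. The following sketch uses an illustrative meromorphic function chosen here, {f(z) = z^2 (z-3)/(z-1/2)}, on the unit circle: inside the circle it has a double zero at {0} and a simple pole at {1/2}, so the count of zeroes minus poles is {2 - 1 = 1}. The helper names are hypothetical.

```python
import cmath
import math

def f(z):
    return z ** 2 * (z - 3) / (z - 0.5)

def df(z):
    # Quotient rule applied to (z^3 - 3 z^2) / (z - 0.5).
    return ((3 * z ** 2 - 6 * z) * (z - 0.5) - (z ** 3 - 3 * z ** 2)) / (z - 0.5) ** 2

def log_derivative_integral(f, df, radius=1.0, n=4000):
    """Approximate (1/(2*pi*i)) * integral of f'/f over the anticlockwise
    circle |z| = radius. Since dz = i z d(theta), this is the average of
    z f'(z)/f(z) over equally spaced points of the circle."""
    total = 0j
    for k in range(n):
        z = radius * cmath.exp(2j * math.pi * k / n)
        total += z * df(z) / f(z)
    return total / n

def winding_number_of_image(f, radius=1.0, n=4000):
    """Winding number of the image curve f(gamma(t)) around the origin,
    obtained by accumulating small changes of argument along the curve."""
    total = 0.0
    prev = f(complex(radius, 0.0))
    for k in range(1, n + 1):
        cur = f(radius * cmath.exp(2j * math.pi * k / n))
        total += cmath.phase(cur / prev)   # small argument increment
        prev = cur
    return round(total / (2 * math.pi))

count = log_derivative_integral(f, df)
winding = winding_number_of_image(f)
print(count.real, winding)   # both close to (or equal to) 1
```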

We isolate the special case of the argument principle when there are no poles for special mention:

Corollary 32 (Special case of argument principle) Let {\gamma} be a simple closed anticlockwise contour, let {U} be an open set containing the image of {\gamma} and its interior, and let {f: U \rightarrow {\bf C}} be holomorphic. Suppose that {f} has no zeroes on the image of {\gamma}. Then the number of zeroes of {f} (counting multiplicity) in the interior of {\gamma} is equal to the winding number {W_{f \circ \gamma}(0)} of {f \circ \gamma} around the origin.

Recalling that the winding number is a homotopy invariant (Lemma 41 of Notes 3), we conclude that the number of zeroes of a holomorphic function {f} in the interior of a simple closed anticlockwise contour is also invariant with respect to continuous perturbations, so long as zeroes never cross the contour itself. More precisely:

Corollary 33 (Stability of number of zeroes) Let {U} be an open set. Let {\gamma_0: [a,b] \rightarrow U}, {\gamma_1: [a,b] \rightarrow U} be simple closed anticlockwise contours that are homotopic as closed curves via some homotopy {\gamma: [0,1] \times [a,b] \rightarrow U}; suppose also that {U} contains the interiors of {\gamma_0} and {\gamma_1}. Let {f_0, f_1: U \rightarrow {\bf C}} be holomorphic, and let {f: [0,1] \times U \rightarrow {\bf C}} be a continuous function such that {f(0,z) = f_0(z)} and {f(1,z) = f_1(z)} for all {z \in U}. Suppose that {f(s,\gamma(s,t)) \neq 0} for all {s \in [0,1]} and {t \in [a,b]} (i.e., at time {s}, the curve {\gamma(s,\cdot)} never encounters any zeroes of {f(s,\cdot)}). Then the number of zeroes (counting multiplicity) of {f_0} in the interior of {\gamma_0} equals the number of zeroes of {f_1} in the interior of {\gamma_1} (counting multiplicity).

Proof: By Corollary 32, it suffices to show that

\displaystyle W_{f_0 \circ \gamma_0}(0) = W_{f_1 \circ \gamma_1}(0).

But the curves {f_0 \circ \gamma_0: [a,b] \rightarrow {\bf C} \backslash \{0\}} and {f_1 \circ \gamma_1: [a,b] \rightarrow {\bf C} \backslash \{0\}} are homotopic as closed curves in {{\bf C} \backslash \{0\}}, using the homotopy {F: [0,1] \times [a,b] \rightarrow {\bf C} \backslash \{0\}} defined by

\displaystyle F( s, t ):= f(s,\gamma(s,t))

(note that this avoids the origin by hypothesis). The claim then follows from Lemma 41 of Notes 3. \Box

Informally, the above corollary asserts that zeroes of holomorphic functions cannot be created or destroyed, as long as they are confined within a closed contour.

Example 34 Let {\gamma} be the unit circle {\gamma = \gamma_{0,1,\circlearrowleft}}. The polynomial {f_0(z) := z^2} has a double zero at {0}, so (counting multiplicity) has two zeroes in the interior of {\gamma}. If we consider instead the perturbation {f_\varepsilon(z) = z^2 + \varepsilon} for some {\varepsilon>0}, this has simple zeroes at {+i\sqrt{\varepsilon}} and {-i\sqrt{\varepsilon}} respectively, so as long as {\varepsilon<1}, the holomorphic function {f_\varepsilon} also has two zeroes in the interior of {\gamma}; but as {\varepsilon} crosses {1}, the zeroes of {f_\varepsilon} pass through {\gamma}, and one no longer has any zeroes of {f_\varepsilon} in the interior of {\gamma}. The situation can be contrasted with the real case: the function {x \mapsto x^2+\varepsilon} has a double zero at the origin when {\varepsilon=0}, but as soon as {\varepsilon} becomes positive, the zeroes immediately disappear from the real line. Note that the stability of zeroes fails if we do not count zeroes with multiplicity; thus, as a general rule of thumb, one should always try to count zeroes with multiplicity when doing complex analysis. (Heuristically, one can think of a zero of order {m} as {m} simple zeroes that are “infinitesimally close together”.)

Example 35 When one considers meromorphic functions instead of holomorphic ones, then the number of zeroes inside a region need not be stable any more, but the number of zeroes minus the number of poles will be stable. Consider for instance the meromorphic function {f_0(z) = \frac{z^2}{z^2}}, which has a removable singularity at {0} but no zeroes or poles. If we perturb it to {f_\varepsilon(z) := \frac{z^2+\varepsilon}{z^2}} for some {\varepsilon>0}, then we suddenly have a double pole at {0}, but this is balanced by two simple zeroes at {+i\sqrt{\varepsilon}} and {-i\sqrt{\varepsilon}}; in the limit as {\varepsilon \rightarrow 0} we see that the two zeroes “collide” with the double pole, annihilating both the zeroes and the poles.

A particularly useful special case of the stability of zeroes is Rouche’s theorem:

Theorem 36 (Rouche’s theorem) Let {\gamma} be a simple closed contour, and let {U} be an open set containing the image of {\gamma} and its interior. Let {f, g: U \rightarrow {\bf C}} be holomorphic. If one has {|g(z)| < |f(z)|} for all {z} in the image of {\gamma}, then {f} and {f+g} have the same number of zeroes (counting multiplicity) in the interior of {\gamma}.

Proof: We may assume without loss of generality that {\gamma} is anticlockwise. By hypothesis, {f} and {f+g} cannot have zeroes on the image of {\gamma}. The claim then follows from Corollary 33 with {\gamma_0=\gamma_1=\gamma}, {\gamma(s,t) := \gamma(t)}, {f_0(z) := f(z)}, {f_1(z) := f(z) + g(z)}, and {f(s,z) := f(z) + s g(z)}. \Box
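
Here is a small numerical illustration of Rouche's theorem, using an example chosen here rather than taken from the notes: the polynomial {p(z) = z^5 + 3z + 1}. On the unit circle {|z^5+1| \leq 2 < 3 = |3z|}, so {p} should have as many zeroes in the unit disk as {3z}, namely one. On the circle of radius {2} one has {|3z+1| \leq 7 < 32 = |z^5|}, so {p} should have five zeroes inside. The sketch checks the inequalities on sample points and counts zeroes via the argument principle; the helper names are hypothetical.

```python
import cmath
import math

def circle_points(radius, n=4000):
    """Equally spaced points on the circle |z| = radius."""
    return [radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]

def dominated_on_circle(small, big, radius):
    """Check |small(z)| < |big(z)| at the sample points of the circle."""
    return all(abs(small(z)) < abs(big(z)) for z in circle_points(radius))

def count_zeros(p, dp, radius):
    """Number of zeroes of p in |z| < radius, computed as the average of
    z p'(z)/p(z) over the circle (a discretised argument principle)."""
    pts = circle_points(radius)
    total = sum(z * dp(z) / p(z) for z in pts)
    return round((total / len(pts)).real)

def p(z):
    return z ** 5 + 3 * z + 1

def dp(z):
    return 5 * z ** 4 + 3

ok_unit = dominated_on_circle(lambda z: z ** 5 + 1, lambda z: 3 * z, 1.0)
ok_two = dominated_on_circle(lambda z: 3 * z + 1, lambda z: z ** 5, 2.0)
zeros_in_unit = count_zeros(p, dp, 1.0)
zeros_in_two = count_zeros(p, dp, 2.0)
print(ok_unit, ok_two, zeros_in_unit, zeros_in_two)
```

Checking the inequality at finitely many sample points is of course only a heuristic; in this example the bounds above establish it on the whole circle.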

Rouche’s theorem has many consequences for complex analysis. One basic consequence is the open mapping theorem:

Theorem 37 (Open mapping theorem) Let {U} be an open connected non-empty subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be holomorphic and not constant. Then {f(U)} is also open.

Proof: Let {z_0 \in U}. As {f} is not constant, the zeroes of {f(z)-f(z_0)} are isolated (Corollary 24 of Notes 3). Thus, for {r} sufficiently small, {f(z)-f(z_0)} is nonvanishing on the image of the circle {\gamma_{z_0,r,\circlearrowleft}}. Clearly {f(z)-f(z_0)} has at least one zero in the interior of this circle. Thus, by Rouche’s theorem, if {w} is sufficiently close to {f(z_0)}, then {f(z)-w} will also have at least one zero in the interior of this circle. In particular, {f(U)} contains a neighbourhood of {f(z_0)}, and the claim follows. \Box

Exercise 38 Use Rouche’s theorem to obtain another proof of the fundamental theorem of algebra, by showing that a polynomial {P(z) = a_n z^n + \dots + a_0} with {a_n \neq 0} and {n \geq 1} has exactly {n} zeroes (counting multiplicity) in the complex plane. (Hint: compare {P(z)} with {a_n z^n} inside some large circle {\gamma_{0,R,\circlearrowleft}}.)

Exercise 39 (Inverse function theorem) Let {U} be an open subset of {{\bf C}}, let {z_0 \in U}, and let {f: U \rightarrow {\bf C}} be a holomorphic function such that {f'(z_0) \neq 0}. Show that there exists a neighbourhood {V} of {z_0} in {U} such that the map {f: V \rightarrow f(V)} is a complex diffeomorphism; that is to say, it is holomorphic, invertible, and the inverse is also holomorphic. Finally, show that

\displaystyle (f^{-1})'(w) = \frac{1}{f'(f^{-1}(w))}

for all {w \in f(V)}. (Hint: one can either mimic the real-variable proof of the inverse function theorem using the contraction mapping theorem, or one can use Rouche’s theorem and the open mapping theorem to construct the inverse.)

Exercise 40 Let {U} be an open subset of {{\bf C}}, and {f: U \to {\bf C}} be a map. Show that the following are equivalent:

  • (i) {f} is a local complex diffeomorphism. That is to say, for every {z_0 \in U} there is a neighbourhood {V} of {z_0} in {U} such that {f(V)} is open and {f: V \rightarrow f(V)} is a complex diffeomorphism (as defined in the preceding exercise).
  • (ii) {f} is holomorphic on {U} and is a local homeomorphism. That is to say, for every {z_0 \in U} there is a neighbourhood {V} of {z_0} in {U} such that {f(V)} is open and {f: V \rightarrow f(V)} is a homeomorphism.
  • (iii) {f} is holomorphic on {U} and is a local injection. That is to say, for every {z_0 \in U} there is a neighbourhood {V} of {z_0} in {U} such that {f: V \rightarrow f(V)} is injective.
  • (iv) {f} is holomorphic on {U}, and the derivative {f'} is nowhere vanishing.

Exercise 41 (Hurwitz’s theorem) Let {U} be an open connected non-empty subset of {{\bf C}}, and let {f_n: U \rightarrow {\bf C}} be a sequence of holomorphic functions that converge uniformly on compact sets to a limit {f: U \rightarrow {\bf C}} (which is then necessarily also holomorphic, thanks to Theorem 34 of Notes 3). Prove the following two versions of Hurwitz’s theorem:

  • (i) If none of the {f_n} have any zeroes in {U}, show that either {f} also has no zeroes in {U}, or is identically zero.
  • (ii) If all of the {f_n} are univalent (that is to say, they are injective holomorphic functions), show that either {f} is also univalent, or is constant.

Exercise 42 (Bloch’s theorem) The purpose of this exercise is to establish a more quantitative variant of the open mapping theorem, due to Bloch; this will be useful later in these notes for proving the Picard and Montel theorems. Let {f: D(z_0,R) \rightarrow {\bf C}} be a holomorphic function on a disk {D(z_0,R)}, and suppose that {f'(z_0)} is non-zero.

  • (i) Suppose that {|f'(z)| \leq 2 |f'(z_0)|} for all {z \in D(z_0,R)}. Show that there is an absolute constant {c>0} such that {f(D(z_0,R))} contains the disk {D(f(z_0), c|f'(z_0)| R)}. (Hint: one can normalise {z_0=0}, {R=1}, {f'(z_0)=1}. Use the higher order Cauchy integral formula to get some bound on {f''(z)} for {z} near the origin, and use this to approximate {f} by {z} near the origin. Then apply Rouche’s theorem.)
  • (ii) Without the hypothesis in (i), show that there is an absolute constant {c'>0} such that {f(D(z_0,R))} contains a disk of radius {c' |f'(z_0)| R}. (Hint: if one has {|f'(z)| \leq 2 |f'(z_0)|} for all {z \in D(z_0,R/4)}, then we can apply (i) with {R} replaced by {R/4}. If not, pick {z_1 \in D(z_0,R/4)} with {|f'(z_1)| > 2 |f'(z_0)|}, and start over with {z_0} replaced by {z_1} and {R} replaced by {R/2}. One cannot iterate this process indefinitely as it will create a singularity of {f} in {D(z_0,R)}.)

— 4. Branches of the complex logarithm —

We have refrained until now from discussing one of the most basic transcendental functions in complex analysis, the complex logarithm. In real analysis, the real logarithm {\ln: (0,+\infty) \rightarrow {\bf R}} can be defined as the inverse of the exponential function {\exp: {\bf R} \rightarrow (0,+\infty)}; it can also be equivalently defined as the antiderivative of the function {x \mapsto \frac{1}{x}}, with the initial condition {\ln(1) = 0}. (We use {\ln} here for the real logarithm in order to distinguish it from the complex logarithm below.)

Let’s see what happens when one tries to extend these definitions to the complex domain. We begin with the inversion of the complex exponential. From Euler’s formula we have that {e^{2\pi i} = 1}; more generally, we have {e^z = e^w} whenever {z = w + 2 k \pi i} for some integer {k}. In particular, the exponential function {\exp: {\bf C} \rightarrow {\bf C}} is not injective. Indeed, for any non-zero {z \in {\bf C}}, we have a multi-valued logarithm

\displaystyle \log(z) := \{ w: e^w = z \}

which, by Euler’s formula, can be written as

\displaystyle \log(z) = \ln |z| + i \mathrm{arg}(z)

where

\displaystyle \mathrm{arg}(z) := \{ \theta \in {\bf R}: \cos \theta + i \sin \theta = \frac{z}{|z|} \}

denotes all the possible arguments of {z} in polar form. These arguments are a coset of the group {2\pi {\bf Z} := \{ 2\pi k: k \in {\bf Z} \}}, and so the complex logarithm {\log(z)} is a coset of the group {2\pi i {\bf Z} := \{ 2\pi i k: k \in {\bf Z}\}}. For instance, if {z = 2i}, then

\displaystyle \mathrm{arg}(2i) = \{ \frac{\pi}{2} + 2 k \pi: k \in {\bf Z} \}

and

\displaystyle \log(2i) = \{ \ln 2 + \frac{\pi i}{2} + 2 k \pi i: k \in {\bf Z} \}.

The complex exponential never vanishes, so by our definitions we see that {\log(0)} is the empty set. As such, we will usually omit the origin from the domain {{\bf C}} when discussing the complex logarithm.
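
The multi-valued nature of the logarithm is easy to see numerically. The following sketch lists a few elements of {\log(2i)} and checks that each one exponentiates back to {2i}; the function name `log_values` and the choice to list only the shifts {k=-2,\dots,2} are illustrative assumptions.

```python
import cmath
import math

def log_values(z, ks=range(-2, 3)):
    """Return the elements ln|z| + i(theta + 2*pi*k) of the multi-valued
    logarithm of z for the given integers k, where theta is one argument
    of z. The empty list is returned for z = 0."""
    if z == 0:
        return []
    base = complex(math.log(abs(z)), cmath.phase(z))
    return [base + 2j * math.pi * k for k in ks]

values = log_values(2j)
for w in values:
    print(w, cmath.exp(w))   # every exponential is close to 2i
```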

Of course, one also encounters multi-valued functions in real analysis, starting when one tries to invert the squaring function {x \mapsto x^2}, as any given positive number {y} has two square roots. In the real case, one can eliminate this multi-valuedness by picking a branch of the square root function – a function which selects one of the multiple choices for that function at each point in the domain. In particular, we have the positive branch {x \mapsto \sqrt{x}} of the square root function on {[0,+\infty)}, as well as the negative branch {x \mapsto - \sqrt{x}}. One could also create more discontinuous branches of the square root function, for instance the function that sends {x} to {\sqrt{x}} for {0 \leq x \leq 5}, and {x} to {-\sqrt{x}} for {x > 5}.

Suppose now that we have a branch {f: {\bf C} \backslash \{0\} \rightarrow {\bf C}} of the logarithm function, thus

\displaystyle \exp(f(z))=z \ \ \ \ \ (14)

for any {z \in {\bf C} \backslash \{0\}}. If {f} is complex differentiable at some point {z_0 \in {\bf C} \backslash \{0\}}, then by differentiating (14) at {z_0} using the chain rule, we see that

\displaystyle f'(z_0) \exp( f(z_0) ) = 1

and hence by (14) again we have

\displaystyle f'(z_0) = \frac{1}{z_0}

(which is of course consistent with the real-variable formula {\frac{d}{dx} \ln x = \frac{1}{x}}). If now {\gamma} is a closed contour in {{\bf C} \backslash \{0\}}, and {f} is differentiable on the entire image of {\gamma}, then the fundamental theorem of calculus tells us that

\displaystyle \int_\gamma \frac{dz}{z} = \int_\gamma f'(z)\ dz = 0.

On the other hand, {\int_\gamma \frac{dz}{z}} is equal to {2\pi i W_\gamma(0)}. We thus conclude that for any branch {f} of the complex logarithm, the set on which {f} is complex differentiable cannot contain any closed curve that winds non-trivially around the origin. Thus for instance one cannot find a branch of {\log} that is holomorphic on all of {{\bf C} \backslash \{0\}}, or even on a neighbourhood of the unit circle (or any other curve going around the origin).

On the other hand, if {U} is a simply connected open subset of {{\bf C} \backslash \{0\}}, then from Cauchy’s theorem the function {\frac{1}{z}} is conservative on {U}. If we pick a point {z_0} in {U} and arbitrarily select a logarithm {w_0 \in \log(z_0)} of {z_0}, we can then use the fundamental theorem of calculus to find an antiderivative {f: U \rightarrow {\bf C}} of {\frac{1}{z}} on {U} with {f(z_0)=w_0}. By definition, {f} is holomorphic, and from the chain rule we have for all {z \in U} that

\displaystyle \frac{d}{dz} \exp(f(z)) = f'(z) \exp(f(z))

\displaystyle = \frac{1}{z} \exp(f(z))

and hence by the quotient rule

\displaystyle \frac{d}{dz} \frac{\exp(f(z))}{z} = 0.

As {U} is connected, {\frac{\exp(f(z))}{z}} must therefore be constant; by construction we have {\frac{\exp(f(z_0))}{z_0}=1}, and thus

\displaystyle \frac{\exp(f(z))}{z} = 1

for all {z \in U}. In other words, {f} is a branch of the complex logarithm.
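The construction above can be imitated numerically. The sketch below takes {U} to be the plane with the negative real axis removed, {z_0 = 1} and {w_0 = 0}, and integrates {\frac{1}{w}} along the straight segment from {1} to {z} (which stays in {U} because {U} is star-shaped around {1}). The quadrature rule, the step count and the sample points are assumptions made purely for illustration.

```python
import cmath

def log_by_integration(z, steps=20000):
    """f(z) = integral of dw/w along the straight segment from 1 to z,
    approximated by the midpoint rule; by construction f(1) = 0.
    Assumes z is not on the non-positive real axis, so the segment avoids 0."""
    total = 0j
    dz = (z - 1) / steps
    for j in range(steps):
        midpoint = 1 + (j + 0.5) * dz
        total += dz / midpoint
    return total

for z in (-2 + 2j, 3 - 1j, 0.5j):
    f = log_by_integration(z)
    # exp(f(z)) should reproduce z, i.e. f behaves like a branch of the logarithm
    print(z, f, cmath.exp(f))
```

On this particular region the antiderivative normalised by {f(1)=0} should agree with the standard branch, which is what Python's `cmath.log` computes away from the cut.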

Thus, for instance, the region {{\bf C} \backslash \{ -x: x \geq 0\}} formed by excluding the negative real axis from the complex plane is simply connected (it is star-shaped around {1}), and so must admit a holomorphic branch of the complex logarithm. One such branch is the standard branch {\mathrm{Log}: {\bf C} \backslash \{0\} \rightarrow {\bf C}} of the complex logarithm, defined as

\displaystyle \mathrm{Log}(z) := \ln |z| + i \mathrm{Arg}(z)

where {\mathrm{Arg}(z)} is the standard branch of the argument, defined as the unique argument in {\mathrm{arg}(z)} in the interval {(-\pi,\pi]}. This branch of the logarithm is continuous on {{\bf C} \backslash \{ -x: x \geq 0\}}, and hence (by the exercise below) is holomorphic on this region, and is thus an antiderivative of {1/z} here. Similarly if one replaces the negative real axis by other rays emanating from the origin (or indeed by arbitrary simple curves from zero to infinity; see Exercise 44 below).
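For readers who like to experiment, the definition of {\mathrm{Log}} can be transcribed directly. This is a small sketch assuming Python's `cmath.phase` as a stand-in for {\mathrm{Arg}}; the helper name `standard_log` and the sample points are hypothetical.

```python
import cmath
import math

def standard_log(z):
    """Log(z) = ln|z| + i*Arg(z) for z != 0, with Arg(z) read off from cmath.phase."""
    return complex(math.log(abs(z)), cmath.phase(z))

samples = [1 + 1j, -2 + 0.5j, -2 - 0.5j, 3j, 0.25, -1]
for z in samples:
    w = standard_log(z)
    # exp(Log(z)) should give back z, up to rounding
    print(z, w, cmath.exp(w))
```

Note the two samples {-2 \pm 0.5i}: their logarithms have imaginary parts close to {+\pi} and {-\pi} respectively, reflecting the discontinuity across the negative real axis. On the cut itself, Python additionally distinguishes the sign of a zero imaginary part (so `complex(-1.0, -0.0)` has phase {-\pi}), a floating-point convention that the definition in the text does not need.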

Exercise 43 Let {U} be a connected non-empty open subset of {{\bf C} \backslash \{0\}}.

  • (i) If {f: U \rightarrow {\bf C}} and {g: U \rightarrow {\bf C}} are continuous branches of the complex logarithm, show that there exists an integer {k} such that {f(z) = g(z) + 2\pi i k} for all {z \in U}.
  • (ii) Show that any continuous branch {f: U \rightarrow {\bf C}} of the complex logarithm is holomorphic.
  • (iii) Show that there is a continuous branch {f: U \rightarrow {\bf C}} of the logarithm if and only if {0} and {\infty} lie in the same connected component of {({\bf C} \cup \{\infty\}) \backslash U}. (Hint: for the “if” direction, use a continuity argument to show that the winding number of any closed curve in {U} around {0} vanishes. For the “only if”, encircle the connected component of {0} in {({\bf C} \cup \{\infty\}) \backslash U} (which is a compact subset of {{\bf C}} by hypothesis) by a simple polygonal path in {U}.)

Exercise 44 Let {\gamma: [0,+\infty) \rightarrow {\bf C}} be a continuous injective map with {\gamma(0)=0} and {\gamma(t) \rightarrow \infty} as {t \rightarrow \infty}.

It is instructive to view the identity

\displaystyle \int_{\gamma_{0,1,\circlearrowleft}} \frac{dz}{z} = 2\pi i \ \ \ \ \ (15)

through the lens of branches of the complex logarithm such as the standard branch {\mathrm{Log}}. From the fundamental theorem of calculus, one has

\displaystyle \int_\gamma \frac{dz}{z} = \mathrm{Log}(\gamma(b)) - \mathrm{Log}(\gamma(a))

for any curve {\gamma} that avoids the negative real axis. Of course, the contour {\gamma_{0,1,\circlearrowleft}} does not avoid this negative axis, but it can be approximated by (non-closed) contours that do. More precisely, one has

\displaystyle \int_{\gamma_{0,1,\circlearrowleft}} \frac{dz}{z} = \lim_{\varepsilon \rightarrow 0^+} \int_{\gamma_{[-\pi+\varepsilon,\pi-\varepsilon]}} \frac{dz}{z}

where {\gamma_{[-\pi+\varepsilon,\pi-\varepsilon]}: [-\pi+\varepsilon,\pi-\varepsilon] \rightarrow {\bf C}} is the map {t \mapsto e^{it}}. As each {\gamma_{[-\pi+\varepsilon,\pi-\varepsilon]}} avoids the negative real axis, we thus have

\displaystyle \int_{\gamma_{0,1,\circlearrowleft}} \frac{dz}{z} = \lim_{\varepsilon \rightarrow 0^+} \mathrm{Log}( e^{i (\pi-\varepsilon)}) - \mathrm{Log}( e^{i (-\pi+\varepsilon)}).

We observe that {\mathrm{Log}} has a jump discontinuity of {2\pi i} on the negative real axis, and specifically

\displaystyle \lim_{\varepsilon \rightarrow 0^+} \mathrm{Log}(e^{i(\pi-\varepsilon)}) = i \pi

and

\displaystyle \lim_{\varepsilon \rightarrow 0^+} \mathrm{Log}(e^{i(-\pi+\varepsilon)}) = -i \pi,

which gives an alternate derivation of the identity (15). More generally, the identity

\displaystyle \int_{\gamma} \frac{dz}{z} = 2\pi i W_\gamma(0)

for any closed curve {\gamma} avoiding the origin can be interpreted using the standard branch of the logarithm as a version of the Alexander numbering rule (Exercise 55 of Notes 3): each crossing of {\gamma} across the branch cut triggers a jump up or down in the count towards the winding number, depending on whether the crossing was in the anticlockwise or clockwise direction.
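The jump of {\mathrm{Log}} across the negative real axis is easy to observe numerically. The following sketch evaluates the standard branch (via Python's `cmath.log`) at the two endpoints {e^{i(\pi-\varepsilon)}} and {e^{i(-\pi+\varepsilon)}} of the truncated circle for a few illustrative values of {\varepsilon}; the helper name is an assumption.

```python
import cmath
import math

def log_on_unit_circle(angle):
    """Standard branch of the logarithm evaluated at e^{i*angle}."""
    return cmath.log(cmath.exp(1j * angle))

for eps in (1e-1, 1e-3, 1e-6):
    upper = log_on_unit_circle(math.pi - eps)    # just above the cut
    lower = log_on_unit_circle(-math.pi + eps)   # just below the cut
    # the difference equals i*(2*pi - 2*eps) up to rounding
    print(eps, upper - lower)

jump = log_on_unit_circle(math.pi - 1e-6) - log_on_unit_circle(-math.pi + 1e-6)
```

As {\varepsilon} shrinks, the difference approaches {2\pi i}, which is the contribution of a single anticlockwise crossing of the branch cut to the integral of {\frac{dz}{z}}.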

One can use branches of the complex logarithm to create branches of the {n^{\mathrm{th}}} root functions {z \mapsto z^{1/n}} for natural numbers {n>1}. As with the complex exponential, the function {z \mapsto z^n} is not injective, and so {z^{1/n}} is multivalued (see Exercise 15 of Notes 0). One cannot form a continuous branch of this function on {{\bf C} \backslash \{0\}} for any {n \geq 2}, as the corresponding branch of {1/z^{1/n}} would then contradict the quantisation of order of singularities (Exercise 13). However, on any domain {U} where there is a holomorphic branch {f: U \rightarrow {\bf C}} of the complex logarithm, one can define a holomorphic branch {g: U \rightarrow {\bf C}} of the {n^{\mathrm{th}}} root function by the formula

\displaystyle g(z) := \exp( f(z) / n ).

It is easy to see that {g} is indeed holomorphic with {g(z)^n = z} for all {z \in U}. Thus for instance we have the standard branch {z \mapsto \exp( (\mathrm{Log} z) / n )} of the {n^{\mathrm{th}}} root function, which is holomorphic away from the negative real axis. More generally, one can define a “standard branch of {z \mapsto z^\alpha}” for any complex {\alpha} by the formula {z \mapsto \exp( \alpha \mathrm{Log} z )}; for instance, the standard branch of {i^i} can be computed to be {e^{-\pi/2}}.
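The formula {z \mapsto \exp( \alpha \mathrm{Log} z )} is short enough to try out directly. In this sketch `cmath.log` plays the role of {\mathrm{Log}}; the function names and the sample inputs are illustrative assumptions.

```python
import cmath
import math

def standard_power(z, alpha):
    """Standard branch of z -> z**alpha, i.e. exp(alpha * Log z), for z != 0."""
    return cmath.exp(alpha * cmath.log(z))

def standard_root(z, n):
    """Standard branch of the n-th root, exp(Log(z) / n)."""
    return standard_power(z, 1 / n)

i_to_the_i = standard_power(1j, 1j)          # compare with exp(-pi/2)
cube_root = standard_root(-8 + 1e-9j, 3)     # a point just above the branch cut

print(i_to_the_i, math.exp(-math.pi / 2))
print(cube_root, cube_root ** 3)
```

Note that the standard cube root of a point just above {-8} is close to {1 + i\sqrt{3}} rather than to the real cube root {-2}: the standard branch selects the root whose argument is one third of {\mathrm{Arg}(z)}.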

The presence of branch cuts can prevent one from directly applying the residue theorem to calculate integrals involving branches of multi-valued functions. But in some cases, the presence of the branch cut can actually be exploited to compute an integral. The following exercise provides an example:

Exercise 45 Compute the improper integral

\displaystyle \int_0^\infty \frac{dx}{\sqrt{x} (x+1)} := \lim_{\varepsilon \rightarrow 0, R \rightarrow \infty} \int_\varepsilon^R \frac{dx}{\sqrt{x} (x+1)}

by applying the residue theorem to the function {\frac{f(z)}{z+1}} for some branch {f(z)} of {z \mapsto z^{-1/2}} with branch cut on the positive real axis, and using a “keyhole” contour that is a perturbation of

\displaystyle \gamma_{0,\varepsilon,\circlearrowright} + \gamma_{\varepsilon \rightarrow R} + \gamma_{0,R,\circlearrowleft} + \gamma_{R \rightarrow \varepsilon};

the key point is that the branch cut makes the contributions of (the perturbations of) {\gamma_{\varepsilon \rightarrow R}} and {\gamma_{R \rightarrow \varepsilon}} fail to cancel each other.

The construction of holomorphic branches of {\log z} can be extended to other logarithms:

Exercise 46 Let {U} be a simply connected subset of {{\bf C}}, and let {f: U \rightarrow {\bf C}} be a holomorphic function with no zeroes on {U}.

  • (i) Show that there exists a holomorphic branch {g: U \rightarrow {\bf C}} of the complex logarithm {\log f}, thus {\exp(g) = f}.
  • (ii) Show that for any natural number {n > 1}, there exists a holomorphic branch {h: U \rightarrow {\bf C}} of the root function {f^{1/n}}, thus {h^n = f}.

Actually, one can invert other non-injective holomorphic functions than the complex exponential, provided that these functions are covering maps. We recall this topological concept:

Definition 47 (Covering map) Let {f: M \rightarrow N} be a continuous map between two connected topological spaces {M, N}. We say that {f} is a covering map if, for each {z_0 \in N}, there exists an open neighbourhood {U} of {z_0} in {N} such that the preimage {f^{-1}(U)} is the disjoint union of open subsets {V_\alpha, \alpha \in A} of {M}, such that for each {\alpha \in A}, the map {f: V_\alpha \rightarrow U} is a homeomorphism. In this situation, we call {M} a covering space of {N}.

In complex analysis, one specialises to the situation in which {M,N} are Riemann surfaces (e.g. they could be open subsets of {{\bf C}}), and {f} is a holomorphic map. In that case, the homeomorphisms {f: V_\alpha \rightarrow U} are in fact complex diffeomorphisms, thanks to Exercise 40.

Example 48 The exponential map {\exp: {\bf C} \rightarrow {\bf C} \backslash \{0\}} is a covering map, because for any element of {{\bf C} \backslash \{0\}} written in polar form as {r e^{i\theta}}, one can pick (say) the neighbourhood

\displaystyle U := \{ s e^{i\alpha}: r/2 < s < 2r; \theta - \frac{\pi}{2} < \alpha < \theta + \frac{\pi}{2} \}

of {re^{i\theta}}, and observe that the preimage {\exp^{-1}(U) = \log U} of {U} is the disjoint union of the open sets

\displaystyle V_k := \{ x + i \alpha: \ln(r/2) < x < \ln(2r); \theta + 2 k \pi - \frac{\pi}{2} < \alpha < \theta + 2 k \pi + \frac{\pi}{2} \}

for {k \in {\bf Z}}, and that the exponential map {\exp: V_k \rightarrow U} is a diffeomorphism. A similar calculation shows that for any natural number {n > 1}, the map {z \mapsto z^n} is a covering map from {{\bf C} \backslash \{0\}} to {{\bf C} \backslash \{0\}}. However, the map {z \mapsto z^n} is not a covering map from {{\bf C}} to {{\bf C}}, because it fails to be a local diffeomorphism at zero due to the vanishing derivative (here we use Exercise 40). One final (non-)example: the map {z \mapsto z^3} is not a covering map from the upper half-plane {\{ z \in {\bf C}: \mathrm{Im}(z)>0\}} to {{\bf C} \backslash \{0\}}, because the preimage of any small disk {D(1,r)} around {1} splits into two disconnected regions, and only one of them is homeomorphic to {D(1,r)} via the map {z \mapsto z^3}.
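The two examples above can be probed numerically. The sketch below lists a few points of the fibre {\exp^{-1}(z)}, and then counts the cube roots of a point that lie in the upper half-plane. It uses a complementary test to the one in the text: over a connected base, the fibres of a covering map all have the same cardinality, so a varying count rules out the covering property. The helper names and sample points are assumptions.

```python
import cmath
import math

def exp_fibre(z, ks=range(-2, 3)):
    """A few points Log(z) + 2*pi*i*k of the fibre of exp over z != 0."""
    base = cmath.log(z)
    return [base + 2j * math.pi * k for k in ks]

def cube_roots(w):
    """All three cube roots of w != 0."""
    base = cmath.exp(cmath.log(w) / 3)
    return [base * cmath.exp(2j * math.pi * k / 3) for k in range(3)]

def upper_half_plane_preimages(w):
    """Preimages of w under z -> z^3 restricted to {Im(z) > 0}."""
    return [z for z in cube_roots(w) if z.imag > 0]

fibre = exp_fibre(2 + 1j)
print([cmath.exp(p) for p in fibre])          # every entry is close to 2+1j

print(len(upper_half_plane_preimages(1)))     # preimages of 1
print(len(upper_half_plane_preimages(1j)))    # preimages of i
```

The point {1} has a single preimage in the upper half-plane (namely {e^{2\pi i/3}}), while {i} has two (namely {e^{i\pi/6}} and {e^{5i\pi/6}}), so the restricted cube map cannot be a covering map onto {{\bf C} \backslash \{0\}}.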

From topology we have the following lifting property:

Lemma 49 (Lifting lemma) Let {f: M \rightarrow N} be a continuous covering map between two path-connected and locally path-connected topological spaces {M,N}. Let {U} be a simply connected and path connected topological space, and let {g: U \rightarrow N} be continuous. Let {z_0 \in U}, and let {p \in M} be such that {f(p) = g(z_0)}. Then there exists a unique continuous map {h: U \rightarrow M} such that {g = f \circ h} and {h(z_0)=p}, which we call a lift of {g} by {f}.

Proof: We first verify uniqueness. If we have two continuous functions {h_1, h_2: U \rightarrow M} with {h_1(z_0) = h_2(z_0) = p} and {g = f \circ h_1 = f \circ h_2}, then the set {\Omega := \{ z \in U: h_1(z) = h_2(z) \}} is clearly closed in {U} and contains {z_0}. From the covering map property we also see that {\Omega} is open, and hence by connectedness we have {h_1=h_2} on all of {U}, giving the claim.

To verify existence of the lift, we first prove the existence of monodromy. More precisely, given any curve {\gamma:[a,b] \rightarrow U} with {\gamma(a) = z_0} we show that there exists a unique curve {\tilde \gamma: [a,b] \rightarrow M} such that {\tilde \gamma(a) = p} and {f \circ \tilde \gamma = g \circ \gamma} (the reader is encouraged to draw a picture to describe this situation). Uniqueness follows from the connectedness argument used to prove uniqueness of the lift {h}, so we turn to existence. As in previous notes, we rely on a continuity argument. Let {\Omega} be the set of all {a \leq t \leq b} for which there exists a curve {\tilde \gamma_{[a,t]}: [a,t] \rightarrow M} such that {f \circ \tilde \gamma_{[a,t]} = g \circ \gamma_{[a,t]}}, where {\gamma_{[a,t]}} is the restriction of {\gamma} to {[a,t]}. Clearly {\Omega} is closed in {[a,b]} and contains {a}; using the covering map property it is not difficult to show that {\Omega} is also open in {[a,b]}. Thus {\Omega} is all of {[a,b]}, giving the claim.

Now let {\gamma_0: [a,b] \rightarrow U}, {\gamma_1: [a,b] \rightarrow U} be homotopic curves with fixed endpoints, with initial point {\gamma_0(a)=\gamma_1(a)=z_0} and some terminal point {z_1}, and let {\gamma: [0,1] \times [a,b] \rightarrow U} be a homotopy. For each {s \in [0,1]}, we have a curve {\gamma_s: [a,b] \rightarrow U} given by {\gamma_s(t) := \gamma(s,t)}, and by the preceding paragraph we can associate a curve {\tilde \gamma_s: [a,b] \rightarrow M} such that {\tilde \gamma_s(a) = p} and {f \circ \tilde \gamma_s = g \circ \gamma_s}. Another application of the continuity method shows that for all {t \in [a,b]}, the map {s \mapsto \tilde \gamma_s(t)} is continuous; in particular, the map {s \mapsto \tilde \gamma_s(b)} is continuous. On the other hand, {\tilde \gamma_s(b)} lies in {f^{-1}(g(z_1))}, which is a discrete set thanks to the covering map property. We conclude that {\tilde \gamma_s(b)} is constant in {s}, and in particular that {\tilde \gamma_0(b) = \tilde \gamma_1(b)}.

Since {U} is simply connected, any two curves {\gamma_0, \gamma_1:[a,b] \rightarrow U} with fixed endpoints are homotopic. We can thus define a function {h: U \rightarrow M} by declaring {h(z_1)} for any {z_1 \in U} to be the point {\tilde \gamma(1)}, where {\gamma: [0,1] \rightarrow U} is any curve from {z_0} to {z_1}, and {\tilde \gamma} is constructed as before. By construction we have {g = f \circ h}, and from the local path connectedness of {N} and the covering map property of {f} we can check that {h} is continuous. The claim follows. \Box

We can specialise this to the complex case and obtain

Corollary 50 (Holomorphic lifting lemma) Let {f: M \rightarrow N} be a holomorphic covering map between two path-connected Riemann surfaces {M,N}. Let {U} be a simply connected and path connected Riemann surface, and let {g: U \rightarrow N} be holomorphic. Let {z_0 \in U}, and let {p \in M} be such that {f(p) = g(z_0)}. Then there exists a unique holomorphic map {h: U \rightarrow M} such that {g = f \circ h} and {h(z_0)=p}, which we call a lift of {g} by {f}.


Proof:  A Riemann surface is automatically locally path-connected, and a connected Riemann surface is automatically path connected (observe that the set of all points on the surface that can be path-connected to a reference point {p} is open, closed, and non-empty).  Applying Lemma 49, we obtain all the required claims, except that the lift {h} produced is only known to be continuous rather than holomorphic. But then we can locally express {h} as the composition of one of the local inverses of {f} with {g}. Applying Exercise 40, these local inverses are holomorphic, and so {h} is holomorphic also. \Box

Remark 51 It is also possible to establish the above corollary using the monodromy theorem and analytic continuation.

Exercise 52 Establish Exercise 46 using Corollary 50.

Exercise 53 Let {U} be simply connected, and let {f: U \rightarrow {\bf C} \backslash \{-1,+1\}} be holomorphic and avoid taking the values {+1,-1}. Show that there exists a holomorphic function {g: U \rightarrow {\bf C}} such that {f = \cos g}. (This can be proven either through Corollary 50, or by using the quadratic formula to solve for {e^{ig}} and then applying Exercise 46.)

In some cases it is also possible to obtain lifts in non-simply connected domains:

Exercise 54 Show that there exists a holomorphic function {f: {\bf C} \backslash [0,1] \rightarrow {\bf C}} such that {\exp(f(z)) = \frac{z-1}{z}} for all {z \in {\bf C} \backslash [0,1]}. (Hint: use the Schwarz reflection principle, see Exercise 37 of Notes 3.)

As an illustration of what one can do with all this machinery, let us now prove the Picard theorems. We begin with the easier “little” Picard theorem.

Theorem 55 (Little Picard theorem) Let {f: {\bf C} \rightarrow {\bf C}} be entire and non-constant. Then {f({\bf C})} omits at most one point of {{\bf C}}.

The example of the exponential function {\exp: {\bf C} \rightarrow {\bf C}}, whose range omits the origin, shows that one cannot make any stronger conclusion about {f({\bf C})}.

Proof: Suppose for contradiction that we have an entire non-constant function {f: {\bf C} \rightarrow {\bf C}} such that {f({\bf C})} omits at least two points. After applying a linear transformation, we may assume that {f({\bf C})} avoids {0} and {1}, thus {f} takes values in {{\bf C} \backslash \{0,1\}}.

At this point, the most natural thing to do from a Riemann surface point of view would be to cover {{\bf C} \backslash \{0,1\}} by a bounded region, so that Liouville’s theorem may be applied. This can be done easily once one has the machinery of elliptic functions; but as we do not have this machinery yet, we will instead use a more ad hoc covering of {{\bf C} \backslash \{0,1\}} using the exponential and trigonometric functions to achieve a passable substitute for this strategy.

We turn to the details. Since {f({\bf C})} avoids {0}, we may apply Exercise 46 to write {f = \exp(2\pi i g)} for some entire {g: {\bf C} \rightarrow {\bf C}}. As {f({\bf C})} avoids {1}, {g({\bf C})} must avoid the integers {{\bf Z}}.

Next, we apply Exercise 53 to write {g = \cos(h)} for some entire {h: {\bf C} \rightarrow {\bf C}}. The set {h({\bf C})} must now avoid all complex numbers of the form {\pm i \cosh^{-1}(j) + 2 \pi k} for natural numbers {j} and integers {k}. In particular, if {C} is large enough, we see that {h({\bf C})} does not contain any disk of the form {D(w,C)}. Applying Bloch’s theorem (Exercise 42(ii)) in the contrapositive, we conclude that for any disk {D(z_0,R)} in {{\bf C}}, one has {|h'(z_0)|\leq C'/R} for some absolute constant {C'}. Sending {R} to infinity and using the fundamental theorem of calculus, we conclude that {h} is constant, hence {g} and {f} are also constant, a contradiction. \Box
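The claim that {h} must avoid the points {\pm i \cosh^{-1}(j) + 2 \pi k} can be checked numerically: at each such point {\cos(h)} equals the integer {j}, and hence {\exp(2\pi i \cos(h))} equals the forbidden value {1}. The sketch below tests a finite range of {j} and {k} (an arbitrary illustrative choice) and also measures the vertical spacing of these points.

```python
import cmath
import math

def forbidden_point(j, k, sign):
    """The point sign * i * arccosh(j) + 2*pi*k, for a natural number j >= 1 and an integer k."""
    return sign * 1j * math.acosh(j) + 2 * math.pi * k

worst_error = 0.0
for j in range(1, 51):
    for k in range(-3, 4):
        for sign in (1, -1):
            h = forbidden_point(j, k, sign)
            g = cmath.cos(h)                   # should equal the integer j
            f = cmath.exp(2j * math.pi * g)    # should equal the forbidden value 1
            worst_error = max(worst_error, abs(g - j), abs(f - 1))

# Vertical gaps between consecutive forbidden points in one column
largest_gap = max(math.acosh(j + 1) - math.acosh(j) for j in range(1, 1000))

print(worst_error)
print(largest_gap, math.acosh(2))
```

In the tested range the vertical gaps are largest between {j=1} and {j=2}, while the horizontal spacing is always {2\pi}. This is consistent with the assertion that every sufficiently large disk {D(w,C)} contains one of the forbidden points.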

Now we prove the more difficult “great” Picard theorem.

Theorem 56 (Great Picard theorem) Let {f: D(z_0,r) \backslash \{z_0\} \rightarrow {\bf C}} be holomorphic on a disk {D(z_0,r)} outside of a singularity at {z_0}. If this singularity is essential, then {f( D(z_0,r) \backslash \{z_0\} )} omits at most one point of {{\bf C}}.

Note that if one only has a pole at {z_0}, e.g. if {f(z) = (z-z_0)^{-m}} for some natural number {m}, then the conclusion of the great Picard theorem fails. This result easily implies both the little Picard theorem (because if {f: {\bf C} \rightarrow {\bf C}} is entire and non-polynomial, then {f(1/z)} has an essential singularity at the origin) and the Casorati-Weierstrass theorem (Theorem 11(iii)). By repeatedly passing to smaller neighbourhoods, one in fact sees that with at most one exception, every complex number {c \in {\bf C}} is attained infinitely often by a function holomorphic in a punctured disk around an essential singularity.

Proof: This will be a variant of the proof of the little Picard theorem; it would again be more natural to use elliptic functions, but we will use some passable substitutes for such functions concocted in an ad hoc fashion out of exponential and trigonometric functions.

Assume for contradiction that {f: D(z_0,r) \backslash \{z_0\} \rightarrow {\bf C}} has an essential singularity at {z_0} and avoids at least two points in {{\bf C}}. Applying linear transformations to both the domain and range of {f}, we may normalise {z_0=0}, {r=1}, and assume that {f} avoids {0} and {1}, thus we have a holomorphic map {f: D(0,1) \backslash \{0\} \rightarrow {\bf C} \backslash \{0,1\}} with an essential singularity at {0}.

The domain {D(0,1) \backslash \{0\}} is not simply connected, so we work instead with the function

\displaystyle F: \{ z \in {\bf C}: \mathrm{Re}(z) > 0 \} \rightarrow {\bf C} \backslash \{0,1\}

defined by

\displaystyle F(z) := f( \exp(-z) ).

Clearly {F} is holomorphic on the right-half plane {\{ z \in {\bf C}: \mathrm{Re}(z) > 0 \}} and avoids {0,1}. We also observe that {F} obeys the periodicity property

\displaystyle F(z + 2\pi i) = F(z). \ \ \ \ \ (16)

As the right-half plane is simply connected, we may (as before) express {F = \exp(2\pi i G)} for some holomorphic function {G: \{ z \in {\bf C}: \mathrm{Re}(z) > 0 \} \rightarrow {\bf C} \backslash {\bf Z}}, and then write {G = \cos(H)} for some holomorphic function {H: \{z \in {\bf C}: \mathrm{Re}(z) > 0 \} \rightarrow {\bf C}} that avoids all numbers of the form {\pm i \cosh^{-1}(j) + 2 \pi k} for natural numbers {j} and integers {k}. Using Bloch’s theorem as before, we see that for any disk {D(z_0,R)} in the right-half plane {\{z \in {\bf C}: \mathrm{Re}(z) > 0 \}}, we have {|H'(z_0)| \leq C / R} for some absolute constant {C}. We cannot set {R} to infinity any more, but we can make {R} as large as the real part of {z_0}, giving the bound

\displaystyle |H'(z_0)| \leq \frac{C}{\mathrm{Re}(z_0)}.

In particular, on integrating along a line segment from {2+iy} to {x+iy}, and using the boundedness of {H} on the compact set {\{ 2+iy: 0 \leq y \leq 2\pi\}}, we obtain a bound of the form

\displaystyle |H(x+iy)| \leq A \log x

for some {A > 0}, and all {x \geq 2} and {0 \leq y \leq 2\pi}. Taking cosines using the formula {\cos(z) = (e^{iz}+e^{-iz})/2}, we obtain a polynomial type bound

\displaystyle |G(x+iy)| \leq x^{A} \ \ \ \ \ (17)

for all {x \geq 2} and {0 \leq y \leq 2\pi}.

On the other hand, from (16) and the identity {F = \exp(2\pi i G)} one has

\displaystyle \exp( 2\pi i G( z + 2\pi i ) ) = \exp( 2\pi i G(z) )

and hence

\displaystyle G(z+2\pi i) - G(z) \in {\bf Z}

for all {z} in the right half-plane. The set {{\bf Z}} is discrete, the function {z \mapsto G(z+2\pi i)-G(z)} is continuous, and the right half-plane is connected, so this function must in fact be constant. That is to say, there exists an integer {k} such that

\displaystyle G(z+2\pi i) - G(z) = k

for all {z} in the right half-plane. Equivalently, the function {G(z) - \frac{kz}{2\pi i}} is periodic with period {2\pi i}. From (17) and the triangle inequality we conclude that

\displaystyle |G(x+iy) - \frac{k(x+iy)}{2\pi i}| \leq x^A + |k| x \ \ \ \ \ (18)

for all {x \geq 2} and {0 \leq y \leq 2\pi}, and hence (by periodicity) for all {x \geq 2} and {y \in {\bf R}}.

We now upgrade the bound (18) by exploiting the quantisation of pole orders (Exercise 13). As the function {G(z) - \frac{kz}{2\pi i}} is periodic with period {2\pi i} on the right half-plane, we may write

\displaystyle G(z) - \frac{kz}{2\pi i} = g( \exp( -z ) ) \ \ \ \ \ (19)

for some function {g: D(0,1) \backslash \{0\} \rightarrow {\bf C}}, which is holomorphic thanks to the chain rule. From (18) we have

\displaystyle |g(z)| \leq \log^A \frac{1}{|z|} + |k| \log \frac{1}{|z|}

when {0 < |z| \leq e^{-2}}. Applying Exercise 13 (with, say, {m=0} and {\varepsilon=1/2}), we conclude that {g} has a removable singularity and is thus in particular bounded on (say) the disk {D(0,e^{-2})}. From (19) we conclude that {G(z) - \frac{kz}{2\pi i}} is bounded on the region {\{ z: \mathrm{Re}(z) \geq 2 \}}; multiplying by {2\pi i} and taking exponentials, we conclude that {F(z) e^{-kz}} is also bounded on this region. Since {F(z) = f(\exp(-z))}, we conclude that {f(z) z^k} is bounded on {D(0,e^{-2}) \backslash \{0\}}, and thus by Riemann’s theorem (Exercise 35 from Notes 3) has a removable singularity at the origin. But by taking Laurent series, this implies that {f} has at worst a pole (of order at most {k}) at the origin, contradicting the hypothesis that the singularity of {f} at the origin was essential. \Box

Exercise 57 (Montel’s theorem) Let {U} be an open subset of the complex plane. Define a holomorphic normal family on {U} to be a collection {{\mathcal F}} of holomorphic functions {f: U \rightarrow {\bf C}} with the following property: given any sequence {f_n} in {{\mathcal F}}, there exists a subsequence {f_{n_j}} which is uniformly convergent on compact sets (i.e., for every compact subset {K} of {U}, the sequence {f_{n_j}} converges uniformly on {K} to some limit). Similarly, define a meromorphic normal family to be a collection {{\mathcal F}} of meromorphic functions {f: U \rightarrow {\bf C} \cup \{\infty\}} such that for any sequence {f_n} in {{\mathcal F}}, there exists a subsequence {f_{n_j}: U \rightarrow {\bf C} \cup \{\infty\}} that are uniformly convergent on compact sets, using the metric on the Riemann sphere induced by the identification with the geometric sphere {\{ (z,t) \in {\bf C} \times {\bf R}: |z|^2 + (t-1/2)^2 = 1/4\}}. (More succinctly, normal families are those families of holomorphic or meromorphic functions that are precompact in the locally uniform topology.)

  • (i) (Little Montel theorem) Suppose that {{\mathcal F}} is a collection of holomorphic functions {f: U \rightarrow {\bf C}} that are uniformly bounded on compact sets (i.e., for each compact {K \subset U} there exists a constant {C_K} such that {|f(z)| \leq C_K} for all {f \in {\mathcal F}} and {z \in K}). Show that {{\mathcal F}} is a holomorphic normal family. (Hint: use the higher order Cauchy integral formula to establish some equicontinuity on this family on compact sets, then use the Arzelà-Ascoli theorem.)
  • (ii) (Great Montel theorem) Let {z_0,z_1,z_2 \in {\bf C} \cup \{\infty\}} be three distinct elements of the Riemann sphere, and suppose that {{\mathcal F}} is a family of meromorphic functions {f: U \rightarrow ({\bf C} \cup \{\infty\}) \backslash \{z_0,z_1,z_2\}} which avoid the three points {z_0,z_1,z_2}. Show that {{\mathcal F}} is a meromorphic normal family. (Hint: use some elementary transformations to reduce to the case {z_0=0, z_1=1, z_2=\infty}. Then, as in the proof of the Picard theorems, express each element {f} of {{\mathcal F}} locally in the form {f = \exp(2\pi i \cos(h))} and use Bloch’s theorem to get some uniform bounds on {h'}.)

Exercise 58 (Harnack principle) Let {U} be an open connected subset of {{\bf C}}, and let {u_n: U \rightarrow {\bf R}} be a sequence of harmonic functions which is pointwise nondecreasing (thus {u_{n+1}(z) \geq u_n(z)} for all {z \in U} and {n \geq 1}). Show that {\sup_n u_n} is either infinite everywhere on {U}, or is harmonic. (Hint: work locally in a disk. Write each {u_n} on this disk as the real part of a holomorphic function {f_n}, and apply Montel’s theorem followed by the Hurwitz theorem to {e^{-f_n}}.) This result is known as Harnack’s principle.

Exercise 59

  • (i) Show that the function {z \mapsto \log|z|} is harmonic on {{\bf C} \backslash \{0\}} but has no harmonic conjugate.
  • (ii) Let {r>0}, and let {u: D(0,r) \backslash \{0\} \rightarrow {\bf R}} be a harmonic function obeying the bounds

    \displaystyle  |u(z)| \leq C_1 \log \frac{1}{|z|} + C_2

    for all {z \in D(0,r) \backslash \{0\}} and some constants {C_1,C_2}. Show that there exists a real number {c} and a harmonic function {w: D(0,r) \rightarrow {\bf R}} such that

    \displaystyle  u(z) = c \log |z| + w(z)

    for all {z \in D(0,r) \backslash \{0\}}. (Hint: one can find a conjugate of {u} outside of some branch cut, say the negative real axis restricted to {D(0,r)}. Adjust {u} by a multiple of {\log |z|} until the conjugate becomes continuous on this branch cut.)

Exercise 60 (Local description of holomorphic maps) Let {f: U \rightarrow {\bf C}} be a holomorphic function on an open subset {U} of {{\bf C}}, let {z_0} be a point in {U}, and suppose that {f} has a zero of order {n} at {z_0} for some {n \geq 1}. Show that there exists a neighbourhood {V} of {z_0} in {U} on which one has the factorisation {f(z) = g(z)^n}, where {g: V \rightarrow {\bf C}} is holomorphic with a simple zero at {z_0} (and hence a complex diffeomorphism from a sufficiently small neighbourhood of {z_0} to a neighbourhood of {0}). Use this to give an alternate proof of the open mapping theorem (Theorem 37).

Exercise 61 (Winding number and lifting) Let {z_0 \in {\bf C}}, let {\gamma: [a,b] \rightarrow {\bf C} \backslash \{z_0\}} be a closed curve avoiding {z_0}, and let {k} be an integer. Show that the following are equivalent:

  • (i) {W_{\gamma}(z_0) = k}.
  • (ii) There exists a complex number {w} and a curve {\eta: [a,b] \rightarrow {\bf C}} from {w} to {w+2k\pi i} such that {\gamma(t) = z_0 + \exp(\eta(t))} for all {t \in [a,b]}.
  • (iii) {\gamma} is homotopic up to reparameterisation as closed curves in {{\bf C} \backslash \{z_0\}} to the curve {\gamma_{z_0,r,k \circlearrowleft}: [0,2\pi] \rightarrow {\bf C} \backslash \{z_0\}} that maps {t} to {z_0 + r e^{ikt}} for some {r>0}.
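The equivalence of (i) and (ii) suggests a practical way to compute winding numbers: lift a sampled curve through the exponential map and read off the change in the imaginary part. The sketch below is a discrete version under the assumption that consecutive samples are close together compared with their distance to {z_0}; all function names and test curves are hypothetical.

```python
import cmath
import math

def lift_through_exp(points, z0=0j):
    """Given samples of a curve avoiding z0, return values eta_j with
    z0 + exp(eta_j) = points[j], each chosen continuously from the previous one."""
    eta = [cmath.log(points[0] - z0)]
    for prev, cur in zip(points, points[1:]):
        # Principal logarithm of the ratio: valid when the argument changes
        # by less than pi between consecutive samples (an assumption here).
        eta.append(eta[-1] + cmath.log((cur - z0) / (prev - z0)))
    return eta

def winding_number(points, z0=0j):
    """For a closed sampled curve, the lift runs from w to w + 2*k*pi*i; return k."""
    eta = lift_through_exp(points, z0)
    return round((eta[-1] - eta[0]).imag / (2 * math.pi))

def circle(center, radius, m, steps=400):
    """Samples of t -> center + radius * e^{i*m*t} for 0 <= t <= 2*pi."""
    return [center + radius * cmath.exp(1j * m * 2 * math.pi * j / steps)
            for j in range(steps + 1)]

print(winding_number(circle(0, 1, 3)))
print(winding_number(circle(0, 1, -2)))
print(winding_number(circle(5, 1, 1)))
print(winding_number(circle(1 + 1j, 2, 1), z0=1 + 1j))
```

The first two circles wind three times anticlockwise and twice clockwise around the origin, the third does not enclose it, and the last winds once around its own centre.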

Filed under: 246A - complex analysis, math.AT, math.CA, math.CV, math.DG Tagged: argument principle, branch cut, complex logarithm, Riemann sphere, Rouche's theorem, singularity

David Hogg#GaiaSprint, day 4

Today was another impressive day at the Sprint. Jonathan Bird (Vanderbilt) got together a break-out session to talk about low-hanging projects in Gaia DR1 that no-one is currently doing, just to record ideas and inspire conversation. That led to this impressive list! Not everything on that list is low-hanging (and not everything in this telegraphic document is really comprehensible), but there are lots of Gaia projects that could be done right now.

Meanwhile, Adrian Price-Whelan (Princeton) noticed that thousands (yes, thousands) of the co-moving stellar pairs found by Semyeong Oh (Princeton) and us have both members observed by RAVE-on. He started making plots of their differences in velocity and abundances. It looks like there are some interlopers (more than we expect from naive contamination estimates), but a big core of pairs that have both identical velocities and identical abundances. Exciting! Now if only we can convince Keith Hawkins (Columbia) to measure detailed abundances...?

In the afternoon, Jackie Faherty (AMNH), David Rodriguez (AMNH), and Brian Abbott (AMNH) came to show us a visualization tool with the TGAS data uploaded. The most fun visualization was the one that runs the clock forwards and backwards on the proper motions! They are also looking forward to putting Gaia data on the dome of the Rose Center Planetarium!

In the evening check-in, there were some impressive results. Doug Finkbeiner (Harvard) showed us his pip-installable and software-operable tools (built with Greg Green) to access the 3-d dust map built from the PanSTARRS data. Jason Sanders (Cambridge) compared age–velocity relationships expected from toy models with those observed in the TGAS+RAVE data, where he estimates ages using isochrone fitting and photometry. He finds heating at very short ages, which is apparently not surprising. Dan Foreman-Mackey (UW) showed fits that he and Tim Morton (Princeton) have been doing to get better parameters for exoplanet host stars and the input catalog to the Kepler mission. They are literally doing the entire input catalog, because this is necessary for population studies. One thing they find is that some conclusions about planet insolation (think: habitability) will change in this era of Gaia.

I mentioned PanSTARRS above, but I should note that Finkbeiner could not actually work on the PanSTARRS data at the Gaia Sprint, because we had rules about openness and data sharing, which you can read on the meeting page. I can't adequately say just how appreciative we all are of the Gaia DPAC teams for making their data public. I should also say how appreciative we all are of the other surveys and collaborations and tool builders who make their data and software public for us all to use. Of course the data and tool releasers benefit from these releases enormously, but these releases also require a certain level of bravery, honesty, and time commitment; it isn't easy.

Robert HellingMandatory liability for software is a horrible idea

Over the last few days, a number of prominent web sites including Twitter, Snapchat and Github were effectively unreachable for an extended period of time. As became clear, the problem was that DynDNS, a provider of DNS services for these sites, was under a number of very heavy DDoS (distributed denial of service) attacks that were mainly coming from compromised internet of things devices, in particular web cams.

Even though I do not see a lot of benefit in being able to change the color of my bedroom light via the internet, I love the idea of having lots of cheap devices (I continue to have a lot of fun with C.H.I.P.s, full scale Linux computers with a number of ports for just 5USD, also for Subsurface, in particular those open opportunities for the mobile version). There are of course concerns about how one can economically maintain a stable update cycle for those, in particular once they are built into black-box customer devices.

Now, after some dust has settled, comes of course the question "Who is to blame?" and whether we should do anything about this. Of course, the manufacturer of the web cam made this possible through far from perfect firmware. Also, you could blame DynDNS for not being able to withstand the storms that from time to time sweep the internet (a pretty rough place after all) or services like Twitter for having a single point of failure in DynDNS (but that might be hard to prevent given the nature of the DNS system).

More than once I have now heard a call for new laws that would introduce a liability for the manufacturer of the web cam as they did not provide firmware updates in time that prevent these devices from being owned and then DDoSing around on the internet.

This, I am convinced, would be a terrible idea: It would make many IT businesses totally uneconomic. Let's stick for example with the case at hand. What is the order of magnitude of damages that occurred to the big companies like Twitter? They probably lost ad revenue of about a weekend. Twitter recently made $6\cdot 10^8$ dollars per quarter, which averages to about 6.5 million per day. Should the web cam manufacturer (or OEM or distributor) now owe Twitter 13 million dollars? I am sure that would cause immediate bankruptcy. Or just the risk that this could happen would prevent anybody from producing web cams or similar things in the future, as nobody can produce non-trivial software that is free of bugs. You should strive to weed out all known bugs and provide updates, of course, but should you be made responsible if you couldn't? Responsible in a financial sense?
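The back-of-the-envelope estimate above can be reproduced in a few lines. This is only a sketch: the quarterly figure is the one quoted in the text, while the 92-day quarter and the two-day outage window are assumptions made here for illustration.

```python
# Rough damage estimate, following the figures quoted in the text.
quarterly_revenue_usd = 6e8   # "6 * 10^8 dollars per quarter" from the text
days_per_quarter = 92         # assumption: a 92-day quarter
weekend_days = 2              # assumption: "about a weekend" means two days

daily_revenue_usd = quarterly_revenue_usd / days_per_quarter
weekend_loss_usd = weekend_days * daily_revenue_usd

print(f"per day:     {daily_revenue_usd / 1e6:.1f} million USD")
print(f"per weekend: {weekend_loss_usd / 1e6:.1f} million USD")
```

With these assumptions the daily figure comes out at roughly 6.5 million and the weekend figure at roughly 13 million, matching the order of magnitude used in the argument.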

What was the damage caused by the Heartbleed bug? I am sure this was much more expensive. Who should pay for this? OpenSSL? Everybody that links against OpenSSL? The person that committed the wrong patch? The person that missed it in code review?

Even if you don't call up these astronomic sums and instead have a fixed fine (e.g. an unfixed vulnerability that gives root access to an attacker from the net costs 10000$), that would immediately stop all open source development. If you give away your software for free, do you really want to pay fines if not everything is perfect? I surely wouldn't.

For that reason, the GPL has the clauses (and other open source licenses have similar ones) stating

(capitalization in the original). Of course, there is "required by applicable law" but I cannot see people giving you software for free if you later make them pay fines.

And of course, it is also almost impossible to make exceptions in the law for this. For example, a "non-commercial" exception does not help: even though you do not charge for open source software, a lot of it is actually provided with some sort of commercial interest.

Yes, I can understand the tendency to make creators of defective products that don't give a damn about an update path responsible for the stuff they ship out. And I have the greatest sympathy for consumer protection laws. But here, the collateral damage would be huge (we might well lose the whole open source universe and every small software company, leaving only the few big ones that can afford the herds of lawyers to defend against these fines).

Note that I only argue against mandatory liability. It should of course always be a possibility that a provider of software/hardware gives some sort of "fit for purpose" guarantee to its customers, or offers a servicing contract where they promise to fix bugs (maybe so that the customer can fulfill her own liabilities to her customers). But in most cases, the provider will charge for that. And the price might be higher than the current price of a light bulb with an IP address.

The internet is a rough place. If you expose your service to it, you had better make sure you can handle every combination of 0s and 1s that comes in from there, or live with it. Don't blame the source of the bits (no matter how brain dead the people at the other end might be).

Terence TaoAnother problem about power series

By an odd coincidence, I stumbled upon a second question in as many weeks about power series, and once again the only way I know how to prove the result is by complex methods; once again, I am leaving it here as a challenge to any interested readers, and I would be particularly interested in knowing of a proof that was not based on complex analysis (or thinly disguised versions thereof), or for a reference to previous literature where something like this identity has occurred. (I suspect for instance that something like this may have shown up before in free probability, based on the answer to part (ii) of the problem.)

Here is a purely algebraic form of the problem:

Problem 1 Let {F = F(z)} be a formal function of one variable {z}. Suppose that {G = G(z)} is the formal function defined by

\displaystyle  G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}

\displaystyle  = F + \left(\frac{F^2}{2}\right)' + \left(\frac{F^3}{6}\right)'' + \dots

\displaystyle  = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + \dots,

where we use {f^{(k)}} to denote the {k}-fold derivative of {f} with respect to the variable {z}.

  • (i) Show that {F} can be formally recovered from {G} by the formula

    \displaystyle  F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}

    \displaystyle  = G - \left(\frac{G^2}{2}\right)' + \left(\frac{G^3}{6}\right)'' - \dots

    \displaystyle  = G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') - \dots.

  • (ii) There is a remarkable further formal identity relating {F(z)} with {G(z)} that does not explicitly involve any infinite summation. What is this identity?

To rigorously formulate part (i) of this problem, one could work in the commutative differential ring of formal infinite series generated by polynomial combinations of {F} and its derivatives (with no constant term). Part (ii) is a bit trickier to formulate in this abstract ring; the identity in question is easier to state if {F, G} are formal power series, or (even better) convergent power series, as it involves operations such as composition or inversion that can be more easily defined in those latter settings.

To illustrate Problem 1(i), let us compute up to third order in {F}, using {{\mathcal O}(F^4)} to denote any quantity involving four or more factors of {F} and its derivatives, and similarly for other exponents than {4}. Then we have

\displaystyle  G = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + {\mathcal O}(F^4)

and hence

\displaystyle  G' = F' + (F')^2 + FF'' + {\mathcal O}(F^3)

\displaystyle  G'' = F'' + {\mathcal O}(F^2);

multiplying, we have

\displaystyle  GG' = FF' + F (F')^2 + F^2 F'' + F (F')^2 + {\mathcal O}(F^4)


\displaystyle  G (G')^2 + \frac{1}{2} G^2 G'' = F (F')^2 + \frac{1}{2} F^2 F'' + {\mathcal O}(F^4)

and hence after a lot of canceling

\displaystyle  G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') = F + {\mathcal O}(F^4).

Thus Problem 1(i) holds up to errors of {{\mathcal O}(F^4)} at least. In principle one can continue verifying Problem 1(i) to increasingly high order in {F}, but the computations rapidly become quite lengthy, and I do not know of a direct way to ensure that one always obtains the required cancellation at the end of the computation.

Problem 1(i) can also be posed in formal power series: if

\displaystyle  F(z) = a_1 z + a_2 z^2 + a_3 z^3 + \dots

is a formal power series with no constant term with complex coefficients {a_1, a_2, \dots} with {|a_1|<1}, then one can verify that the series

\displaystyle  G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}

makes sense as a formal power series with no constant term, thus

\displaystyle  G(z) = b_1 z + b_2 z^2 + b_3 z^3 + \dots.

For instance it is not difficult to show that {b_1 = \frac{a_1}{1-a_1}}. If one further has {|b_1| < 1}, then it turns out that

\displaystyle  F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}

as formal power series. Currently the only way I know how to show this is by first proving the claim for power series with a positive radius of convergence using the Cauchy integral formula, but even this is a bit tricky unless one has managed to guess the identity in (ii) first. (In fact, the way I discovered this problem was by first trying to solve (a variant of) the identity in (ii) by Taylor expansion in the course of attacking another problem, and obtaining the transform in Problem 1 as a consequence.)
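The formal power series version of Problem 1(i) can be spot-checked with exact rational arithmetic. The sketch below is not a proof and is not the method described in the text: it makes the simplifying assumption {a_1 = 0} (so that {b_1 = 0} as well), under which only finitely many terms of the sum contribute to each coefficient and truncated series suffice. The helper names `mul`, `deriv` and `transform` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction
from math import factorial

def mul(a, b, deg):
    """Product of two coefficient lists (index = power of z), truncated to degree deg."""
    out = [Fraction(0)] * (deg + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > deg:
                break
            out[i + j] += ai * bj
    return out

def deriv(a):
    """Formal derivative of a coefficient list."""
    return [k * a[k] for k in range(1, len(a))]

def transform(f, deg, sign=1):
    """Coefficients up to z^deg of sum_{n>=1} sign^(n-1) (f^n / n!)^{(n-1)}.

    Assumes f has no constant and no linear term, so that the n-th
    summand starts at z^(n+1) and terms with n >= deg do not contribute.
    """
    assert len(f) >= 2 and f[0] == 0 and f[1] == 0
    total = [Fraction(0)] * (deg + 1)
    power = [Fraction(1)]                      # f^0
    for n in range(1, deg + 1):
        # f^n is needed up to degree deg + n - 1 before differentiating n-1 times
        power = mul(power, f, deg + n - 1)
        term = power
        for _ in range(n - 1):
            term = deriv(term)
        for k in range(deg + 1):
            total[k] += sign ** (n - 1) * term[k] / factorial(n)
    return total

# Example: F(z) = z^2, checked up to degree 5.
F = [0, 0, 1]
G = transform(F, 5)                 # forward transform F -> G
F_back = transform(G, 5, sign=-1)   # alternating-sign transform G -> F
print([int(c) for c in G])
print([int(c) for c in F_back])
```

For {F(z) = z^2} the forward transform gives {G(z) = z^2 + 2z^3 + 5z^4 + 14z^5 + \dots}, and applying the alternating-sign series to this {G} returns {z^2} with all higher coefficients up to {z^5} vanishing, consistent with Problem 1(i) in this special case.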

The transform that takes {F} to {G} resembles both the exponential function

\displaystyle  \exp(F) = \sum_{n=0}^\infty \frac{F^n}{n!}

and Taylor’s formula

\displaystyle  F(z) = \sum_{n=0}^\infty \frac{F^{(n)}(0)}{n!} z^n

but does not seem to be directly connected to either (this is more apparent once one knows the identity in (ii)).


John BaezOpen and Interconnected Systems

Brendan Fong finished his thesis a while ago, and here it is!

• Brendan Fong, The Algebra of Open and Interconnected Systems, Ph.D. thesis, Department of Computer Science, University of Oxford, 2016.

This material is close to my heart, since I’ve informally served as Brendan’s advisor since 2011, when he came to Singapore to work with me on chemical reaction networks. We’ve been collaborating intensely ever since. I just looked at our correspondence, and I see it consists of 880 emails!

At some point I gave him a project: describe the category whose morphisms are electrical circuits. He took up the challenge much more ambitiously than I’d ever expected, developing powerful general frameworks to solve not only this problem but also many others. He did this in a number of papers, most of which I’ve already discussed:

• Brendan Fong, Decorated cospans, Th. Appl. Cat. 30 (2015), 1096–1120. (Blog article here.)

• Brendan Fong and John Baez, A compositional framework for passive linear circuits. (Blog article here.)

• Brendan Fong, John Baez and Blake Pollard, A compositional framework for Markov processes. (Blog article here.)

• Brendan Fong and Brandon Coya, Corelations are the prop for extraspecial commutative Frobenius monoids. (Blog article here.)

• Brendan Fong, Paolo Rapisarda and Paweł Sobociński, A categorical approach to open and interconnected dynamical systems.

But Brendan’s thesis is the best place to see a lot of this material in one place, integrated and clearly explained.

I wanted to write a summary of his thesis. But since he did that himself very nicely in the preface, I’m going to be lazy and just quote that! (I’ll leave out the references, which are crucial in scholarly prose but a bit off-putting in a blog.)


This is a thesis in the mathematical sciences, with emphasis on the mathematics. But before we get to the category theory, I want to say a few words about the scientific tradition in which this thesis is situated.

Mathematics is the language of science. Twinned so intimately with physics, over the past centuries mathematics has become a superb—indeed, unreasonably effective—language for understanding planets moving in space, particles in a vacuum, the structure of spacetime, and so on. Yet, while Wigner speaks of the unreasonable effectiveness of mathematics in the natural sciences, equally eminent mathematicians, not least Gelfand, speak of the unreasonable ineffectiveness of mathematics in biology and related fields. Why such a difference?

A contrast between physics and biology is that while physical systems can often be studied in isolation—the proverbial particle in a vacuum—biological systems are necessarily situated in their environment. A heart belongs in a body, an ant in a colony. One of the first to draw attention to this contrast was Ludwig von Bertalanffy, biologist and founder of general systems theory, who articulated the difference as one between closed and open systems:

Conventional physics deals only with closed systems, i.e. systems which are considered to be isolated from their environment. […] However, we find systems which by their very nature and definition are not closed systems. Every living organism is essentially an open system. It maintains itself in a continuous inflow and outflow, a building up and breaking down of components, never being, so long as it is alive, in a state of chemical and thermodynamic equilibrium but maintained in a so-called ‘steady state’ which is distinct from the latter.

While the ambitious generality of general systems theory has proved difficult, von Bertalanffy’s philosophy has had great impact in his home field of biology, leading to the modern field of systems biology. Half a century later, Dennis Noble, another great pioneer of systems biology and the originator of the first mathematical model of a working heart, describes the shift as one from reduction to integration.

Systems biology […] is about putting together rather than taking apart, integration rather than reduction. It requires that we develop ways of thinking about integration that are as rigorous as our reductionist programmes, but different. It means changing our philosophy, in the full sense of the term.

In this thesis we develop rigorous ways of thinking about integration or, as we refer to it, interconnection.

Interconnection and openness are tightly related. Indeed, openness implies that a system may be interconnected with its environment. But what is an environment but comprised of other systems? Thus the study of open systems becomes the study of how a system changes under interconnection with other systems.

To model this, we must begin by creating language to describe the interconnection of systems. While reductionism hopes that phenomena can be explained by reducing them to “elementary units investigable independently of each other” (in the words of von Bertalanffy), this philosophy of integration introduces as an additional and equal priority the investigation of the way these units are interconnected. As such, this thesis is predicated on the hope that the meaning of an expression in our new language is determined by the meanings of its constituent expressions together with the syntactic rules combining them. This is known as the principle of compositionality.

Also commonly known as Frege’s principle, the principle of compositionality both dates back to Ancient Greek and Vedic philosophy, and is still the subject of active research today. More recently, through the work of Montague in natural language semantics and Strachey and Scott in programming language semantics, the principle of compositionality has found formal expression as the dictum that the interpretation of a language should be given by a homomorphism from an algebra of syntactic representations to an algebra of semantic objects. We too shall follow this route.

The question then arises: what do we mean by algebra? This mathematical question leads us back to our scientific objectives: what do we mean by system? Here we must narrow, or at least define, our scope. We give some examples. The investigations of this thesis began with electrical circuits and their diagrams, and we will devote significant time to exploring their compositional formulation. We discussed biological systems above, and our notion of system includes these, modelled say in the form of chemical reaction networks or Markov processes, or the compartmental models of epidemiology, population biology, and ecology. From computer science, we consider Petri nets, automata, logic circuits, and the like. More abstractly, our notion of system encompasses matrices and systems of differential equations.

Drawing together these notions of system are well-developed diagrammatic representations based on network diagrams— that is, topological graphs. We call these network-style diagrammatic languages. In abstract, by ‘system’ we shall simply mean that which can be represented by a box with a collection of terminals, perhaps of different types, through which it interfaces with the surroundings. Concretely, one might envision a circuit diagram with terminals, such as


The algebraic structure of interconnection is then simply the structure that results from the ability to connect terminals of one system with terminals of another. This graphical approach motivates our language of interconnection: indeed, these diagrams will be the expressions of our language.

We claim that the existence of a network-style diagrammatic language to represent a system implies that interconnection is inherently important in understanding the system. Yet, while each of these example notions of system are well-studied in and of themselves, their compositional, or algebraic, structure has received scant attention. In this thesis, we study an algebraic structure called a ‘hypergraph category’, and argue that this is the relevant algebraic structure for modelling interconnection of open systems.

Given these pre-existing diagrammatic formalisms and our visual intuition, constructing algebras of syntactic representations is thus rather straightforward. The semantics and their algebraic structure are more subtle.

In some sense our semantics is already given to us too: in studying these systems as closed systems, scientists have already formalised the meaning of these diagrams. But we have shifted from a closed perspective to an open one, and we need our semantics to also account for points of interconnection.

Taking inspiration from Willems’ behavioural approach and Deutsch’s constructor theory, in this thesis I advocate the following position. First, at each terminal of an open system we may make measurements appropriate to the type of terminal. Given a collection of terminals, the universum is then the set of all possible measurement outcomes. Each open system has a collection of terminals, and hence a universum. The semantics of an open system is the subset of measurement outcomes on the terminals that are permitted by the system. This is known as the behaviour of the system.

For example, consider a resistor of resistance r. This has two terminals, the two ends of the resistor, and at each terminal, we may measure the potential and the current. Thus the universum of this system is the set \mathbb{R}\oplus\mathbb{R}\oplus\mathbb{R}\oplus\mathbb{R}, where the summands represent respectively the potentials and currents at each of the two terminals. The resistor is governed by Kirchhoff’s current law, or conservation of charge, and Ohm’s law. Conservation of charge states that the current flowing into one terminal must equal the current flowing out of the other terminal, while Ohm’s law states that this current will be proportional to the potential difference, with constant of proportionality 1/r. Thus the behaviour of the resistor is the set

\displaystyle{   \big\{\big(\phi_1,\phi_2,     -\tfrac1r(\phi_2-\phi_1),\tfrac1r(\phi_2-\phi_1)\big)\,\big\vert\,     \phi_1,\phi_2 \in \mathbb{R}\big\} }

Note that in this perspective a law such as Ohm’s law is a mechanism for partitioning behaviours into possible and impossible behaviours.
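As a small illustration of this "behaviour as a subset of the universum" viewpoint, the following sketch tests whether a measurement outcome (two potentials, two currents) is permitted by a resistor. It is only a sketch under stated assumptions: the function names `resistor_point` and `in_behaviour` and the numerical tolerance are hypothetical choices made here, and the sign convention simply follows the set displayed above.

```python
import math

def resistor_point(r, phi1, phi2):
    """Return the permitted outcome (phi1, phi2, i1, i2) for given potentials."""
    i = (phi2 - phi1) / r          # Ohm's law, constant of proportionality 1/r
    return (phi1, phi2, -i, i)     # the two currents are equal and opposite

def in_behaviour(r, point, tol=1e-9):
    """Check whether a point of the universum R^4 lies in the resistor's behaviour."""
    phi1, phi2, i1, i2 = point
    expected = (phi2 - phi1) / r
    return (math.isclose(i1, -expected, abs_tol=tol)
            and math.isclose(i2, expected, abs_tol=tol))

# A resistor with r = 2, potentials 1 and 5 at its two terminals.
print(resistor_point(2.0, 1.0, 5.0))
print(in_behaviour(2.0, (1.0, 5.0, -2.0, 2.0)))   # satisfies both laws
print(in_behaviour(2.0, (1.0, 5.0, 2.0, 2.0)))    # violates conservation of charge
```

The membership test is exactly the partition into possible and impossible behaviours: the second outcome is rejected because its two currents do not balance.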

Interconnection of terminals then asserts the identification of the variables at the identified terminals. Fixing some notion of open system and subsequently an algebra of syntactic representations for these systems, our approach, based on the principle of compositionality, requires this to define an algebra of semantic objects and a homomorphism from syntax to semantics. The first part of this thesis develops the mathematical tools necessary to pursue this vision for modelling open systems and their interconnection.

The next goal is to demonstrate the efficacy of this philosophy in applications. At core, this work is done in the faith that the right language allows deeper insight into the underlying structure. Indeed, after setting up such a language for open systems there are many questions to be asked: Can we find a sound and complete logic for determining when two syntactic expressions have the same semantics? Suppose we have systems that have some property, for example controllability. In what ways can we interconnect controllable systems so that the combined system is also controllable? Can we compute the semantics of a large system quicker by computing the semantics of subsystems and then composing them? If I want a given system to achieve a specified trajectory, can we interconnect another system to make it do so? How do two different notions of system, such as circuit diagrams and signal flow graphs, relate to each other? Can we find homomorphisms between their syntactic and semantic algebras? In the second part of this thesis we explore some applications in depth, providing answers to questions of the above sort.

Outline of the thesis

The thesis is divided into two parts. Part I, comprising Chapters 1 to 4, focuses on mathematical foundations. In it we develop the theory of hypergraph categories and a powerful tool for constructing and manipulating them: decorated corelations. Part II, comprising Chapters 5 to 7, then discusses applications of this theory to examples of open systems.

The central refrain of this thesis is that the syntax and semantics of network-style diagrammatic languages can be modelled by hypergraph categories. These are introduced in Chapter 1. Hypergraph categories are symmetric monoidal categories in which every object is equipped with the structure of a special commutative Frobenius monoid in a way compatible with the monoidal product. As we will rely heavily on properties of monoidal categories, their functors, and their graphical calculus, we begin with a whirlwind review of these ideas. We then provide a definition of hypergraph categories and their functors, a strictification theorem, and an important example: the category of cospans in a category with finite colimits.

A cospan is a pair of morphisms

X \to N \leftarrow Y

with a common codomain. In Chapter 2 we introduce the idea of a ‘decorated cospan’, which equips the apex N with extra structure. Our motivating example is cospans of finite sets decorated by graphs, as in this picture:

Here graphs are a proxy for expressions in a network-style diagrammatic language. To give a bit more formal detail, let \mathcal C be a category with finite colimits, writing its coproduct as +, and let (\mathcal D, \otimes) be a braided monoidal category. Decorated cospans provide a method of producing a hypergraph category from a lax braided monoidal functor

F\colon (\mathcal C,+) \to (\mathcal D, \otimes)

The objects of these categories are simply the objects of \mathcal C, while the morphisms are pairs comprising a cospan X \rightarrow N \leftarrow Y in \mathcal C together with an element I \to FN in \mathcal D—the so-called decoration. We will also describe how to construct hypergraph functors between decorated cospan categories. In particular, this provides a useful tool for constructing a hypergraph category that captures the syntax of a network-style diagrammatic language.

Having developed a method to construct a category where the morphisms are expressions in a diagrammatic language, we turn our attention to categories of semantics. This leads us to the notion of a corelation, to which we devote Chapter 3. Given a factorisation system (\mathcal{E},\mathcal{M}) on a category \mathcal{C}, we define a corelation to be a cospan X \to N \leftarrow Y such that the copairing of the two maps, a map X+Y \to N, is a morphism in \mathcal{E}. Factorising maps X+Y \to N using the factorisation system leads to a notion of equivalence on cospans, and this helps us describe when two diagrams are equivalent. Like cospans, corelations form hypergraph categories.

In Chapter 4 we decorate corelations. Like decorated cospans, decorated corelations are corelations together with some additional structure on the apex. We again use a lax braided monoidal functor to specify the sorts of extra structure allowed. Moreover, decorated corelations too form the morphisms of a hypergraph category. The culmination of our theoretical work is to show that every hypergraph category and every hypergraph functor can be constructed using decorated corelations. This implies that we can use decorated corelations to construct a semantic hypergraph category for any network-style diagrammatic language, as well as a hypergraph functor from its syntactic category that interprets each diagram. We also discuss how the intuitions behind decorated corelations guide construction of these categories and functors.

Having developed these theoretical tools, in the second part we turn to demonstrating that they have useful applications. Chapter 5 uses corelations to formalise signal flow diagrams representing linear time-invariant discrete dynamical systems as morphisms in a category. Our main result gives an intuitive sound and fully complete equational theory for reasoning about these linear time-invariant systems. Using this framework, we derive a novel structural characterisation of controllability, and consequently provide a methodology for analysing controllability of networked and interconnected systems.

Chapter 6 studies passive linear networks. Passive linear networks are used in a wide variety of engineering applications, but the best studied are electrical circuits made of resistors, inductors and capacitors. The goal is to construct what we call the ‘black box functor’, a hypergraph functor from a category of open circuit diagrams to a category of behaviours of circuits. We construct the former as a decorated cospan category, with each morphism a cospan of finite sets decorated by a circuit diagram on the apex. In this category, composition describes the process of attaching the outputs of one circuit to the inputs of another. The behaviour of a circuit is the relation it imposes between currents and potentials at their terminals. The space of these currents and potentials naturally has the structure of a symplectic vector space, and the relation imposed by a circuit is a Lagrangian linear relation. Thus, the black box functor goes from our category of circuits to the category of symplectic vector spaces and Lagrangian linear relations. Decorated corelations provide a critical tool for constructing these hypergraph categories and the black box functor.

Finally, in Chapter 7 we mention two further research directions. The first is the idea of a ‘bound colimit’, which aims to describe why epi-mono factorisation systems are useful for constructing corelation categories of semantics for open systems. The second research direction pertains to applications of the black box functor for passive linear networks, discussing the work of Jekel on the inverse problem for electric circuits and the work of Baez, Fong, and Pollard on open Markov processes.

October 23, 2016

David Hogg#GaiaSprint, day 3

(As usual, these blog notes are only biased, imperfect, personal highlights. They are not minutes of the meeting in any sense!) Anthony Brown (Leiden) kicked off the day by comparing the all-sky image of the Gaia TGAS catalog with the all-sky image of the stars that Gaia uses to set its attitude. This latter catalog is close to a random sampling of stars, so it makes a beautiful all-sky image.

Yesterday's check-in meeting continued this morning with Bovy showing the Oort constants. He claimed that he needed something to do while his data files unzipped, so he decided to measure the Oort constants, including constant C, which he claims has never really been measured before! This continues the theme of the awesomeness of the Gaia data: You measure things that have never before been possible while your files are unzipping. Bovy also gave us a tiny reminder of what the Oort constants are. Years ago, Bovy and I (more-or-less) failed to measure these constants in the SDSS data.

Daniel Michalik (Lund) came in by phone to tell us about the construction of the TGAS Catalog, and Alcione Mora (ESAC) told us about the Gaia Archive and how to use it. In Michalik's talk I was reminded that there are two small circles on the sky (small as in not great) where there will be close to 200 observations per star; these are great places to concentrate observing programs: Why wait until after Gaia to do the follow-up observing on the amazing time-domain astrophysics that will be discovered in those sky regions?

I spent my sprinting time working with Price-Whelan on the mid-plane of the Milky Way disk, with Bird on the age-velocity relationship, including a generative model for the ages, and with Ness on the causal relationships between metallicity, age, and vertical kinematics. On the latter, the question is: Is heating “caused” by age or by metallicity? (Or maybe some more sophisticated question than that.) The answer seems to be that in some parts of abundance space it is clearly age, and in others it is clearly metallicity. I hope this holds up!

At the evening check-in session, Ruth Angus (Columbia) showed that, of Semyeong Oh's comoving pairs of stars that both have gyrochronology ages, they seem (usually) to show the same-ish age. It is early, but it looks like a possible confirmation of the effectiveness of the gyrochronology, possibly in parts of the H-R diagram where it hasn't been well tested previously.

After that, Vasily Belokurov (Cambridge) blew us all away by punking the Gaia DR1 uncertainty model to find time-variable sources in the billion-star catalog. He then found a bridge of variable stars connecting the LMC to the SMC! That made me afraid, very afraid.

BackreactionThe concordance model strikes back

Two weeks ago, I summarized a recent paper by McGaugh et al. who reported a correlation in galactic structures. The researchers studied a data-set with the rotation curves of 153 galaxies and showed that the gravitational acceleration inferred from the rotational velocity (including dark matter), g_obs, is strongly correlated with the gravitational acceleration from the normal matter (stars and gas), g_bar.

Figure from arXiv:1609.05917 [astro-ph.GA] 

This isn’t actually new data or a new correlation, but a new way to look at correlations in previously available data.

The authors of the paper were very careful not to jump to conclusions from their results, but merely stated that this correlation requires some explanation. That galactic rotation curves have surprising regularities, however, has been evidence in favor of modified gravity for two decades, so the implication was clear: Here is something that the concordance model might have trouble explaining.

As I remarked in my previous blogpost, while the correlation does seem to be strong, it would be good to see the results of a simulation with the concordance model that describes dark matter, as usual, as a pressureless, cold fluid. In this case too one would expect there to be some relation. Normal matter forms galaxies in the gravitational potentials previously created by dark matter, so the two components should have some correlation with each other. The question is how much.

Just the other day, a new paper appeared on the arxiv, which looked at exactly this. The authors of the new paper analyzed the result of a specific numerical simulation within the concordance model. And they find that the correlation in this simulated sample is actually stronger than the observed one!

Figure from arXiv:1610.06183 [astro-ph.GA]

Moreover, they also demonstrate that in the concordance model, the slope of the best-fit curve should depend on the galaxies’ redshift (z), i.e. the age of the galaxy. This would be a way to test which explanation is correct.

Figure from arXiv:1610.06183 [astro-ph.GA]

I am not familiar with the specific numerical code that the authors use and hence I am not sure what to make of this. It’s been known for a long time that the concordance model has difficulties getting structures on galactic scales right, especially galactic cores, and so it isn’t clear to me just how many parameters this model uses to work right. If the parameters were previously chosen so as to match observations already, then this result is hardly surprising.

McGaugh, one of the authors of the first paper, has already offered some comments (ht Yves). He notes that the sample size of the galaxies in the simulation is small, which might at least partly account for the small scatter. He is also skeptical of the results: “It is true that a single model does something like this as a result of dissipative collapse. It is not true that an ensemble of such models are guaranteed to fall on the same relation.”

I am somewhat puzzled by this result because, as I mentioned above, the correlation in the McGaugh paper is based on previously known correlations, such as the brightness-velocity relation which, to my knowledge, hadn’t been explained by the concordance model. So I would find it surprising should the results of the new paper hold up. I’m sure we’ll hear more about this in the near future.

Jordan EllenbergThe greatest Cub/Indian

Congratulations to the Cubs, the Indians, and their fanbases, one of which will enjoy a long-awaited championship!

Now here’s the question.  Which player in baseball history was the best combined Cub/Indian?  My methodology, as it was last year, is to draw the top 200 position players and pitchers from each team by Wins Above Replacement, using the Baseball Reference Play Index.  Then I find the players with the highest value of

(WAR for team 1 * WAR for team 2)

Now I have to admit I couldn’t actually think of a player who played for both the Cubs and the Indians!  And this was borne out by the Play Index results:  there were only five position players and no pitchers who ranked in the top 200 all-time contributors to each team.  Pretty surprising, considering how long both teams have been around!  And here are your top five Cub/Indians:

  1.  Riggs Stephenson (193.6)
  2.  Andre Thornton (98.8)
  3.  Jose Cardenal (47.4)
  4.  Mel Hall (9.0)
  5.  Mitch Webster (7.8)

I almost wonder whether I did something wrong here.  There was so much more overlap last year between the Royals and the Mets!  But until you tell me otherwise, it’s the Riggs Stephenson Series.
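The ranking method described above can be sketched in a few lines of Python. This is only an illustration: the player names and WAR values are hypothetical placeholders, not Play Index data, and `best_combined` is a name made up for this sketch.

```python
# Hypothetical illustration: names and WAR values below are placeholders,
# not real Baseball Reference data.

def best_combined(team1_war, team2_war, top_n=5):
    """Rank players who appear in both dicts by the product of their WAR."""
    shared = team1_war.keys() & team2_war.keys()
    scored = [(name, team1_war[name] * team2_war[name]) for name in shared]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Each dict stands in for one team's top-200 list (player -> WAR with that team).
team1 = {"Player A": 10.0, "Player B": 4.0, "Player C": 7.5}
team2 = {"Player A": 3.0, "Player C": 2.0, "Player D": 12.0}

for name, score in best_combined(team1, team2):
    print(f"{name}: {score:.1f}")
```

Only players present in both lists get a score, which is why a short overlap between the two top-200 lists produces a short ranking.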

October 22, 2016

Terence Tao246A, Notes 5: conformal mapping

In the previous set of notes we introduced the notion of a complex diffeomorphism {f: U \rightarrow V} between two open subsets {U,V} of the complex plane {{\bf C}} (or more generally, two Riemann surfaces): an invertible holomorphic map whose inverse was also holomorphic. (Actually, the last part is automatic, thanks to Exercise 40 of Notes 4.) Such maps are also known as biholomorphic maps or conformal maps (although in some literature the notion of “conformal map” is expanded to permit maps such as the complex conjugation map {z \mapsto \overline{z}} that are angle-preserving but not orientation-preserving, as well as maps such as the exponential map {z \mapsto \exp(z)} from {{\bf C}} to {{\bf C} \backslash \{0\}} that are only locally injective rather than globally injective). Such complex diffeomorphisms can be used in complex analysis (or in the analysis of harmonic functions) to change the underlying domain {U} to a domain that may be more convenient for calculations, thanks to the following basic lemma:

Lemma 1 (Holomorphicity and harmonicity are conformal invariants) Let {\phi: U \rightarrow V} be a complex diffeomorphism between two Riemann surfaces {U,V}.

  • (i) If {f: V \rightarrow W} is a function to another Riemann surface {W}, then {f} is holomorphic if and only if {f \circ \phi: U \rightarrow W} is holomorphic.
  • (ii) If {U,V} are open subsets of {{\bf C}} and {u: V \rightarrow {\bf R}} is a function, then {u} is harmonic if and only if {u \circ \phi: U \rightarrow {\bf R}} is harmonic.

Proof: Part (i) is immediate since the composition of two holomorphic functions is holomorphic. For part (ii), observe that if {u: V \rightarrow {\bf R}} is harmonic then on any ball {B(z_0,r)} in {V}, {u} is the real part of some holomorphic function {f: B(z_0,r) \rightarrow {\bf C}} thanks to Exercise 62 of Notes 3. By part (i), {f \circ \phi: \phi^{-1}(B(z_0,r)) \rightarrow {\bf C}} is also holomorphic. Taking real parts we see that {u \circ \phi} is harmonic on the preimage {\phi^{-1}(B(z_0,r))} of each ball {B(z_0,r)} in {V}, and hence harmonic on all of {U}, giving one direction of (ii); the other direction is proven similarly. \Box

Exercise 2 Establish Lemma 1(ii) by direct calculation, avoiding the use of holomorphic functions. (Hint: the calculations are cleanest if one uses Wirtinger derivatives, as per Exercise 27 of Notes 1.)

Exercise 3 Let {\phi: U \rightarrow V} be a complex diffeomorphism between two open subsets {U,V} of {{\bf C}}, let {z_0} be a point in {U}, let {m} be a natural number, and let {f: V \rightarrow {\bf C} \cup \{\infty\}} be holomorphic. Show that {f: V \rightarrow {\bf C} \cup \{\infty\}} has a zero (resp. a pole) of order {m} at {\phi(z_0)} if and only if {f \circ \phi: U \rightarrow {\bf C} \cup \{\infty\}} has a zero (resp. a pole) of order {m} at {z_0}.

From Lemma 1(ii) we can now define the notion of a harmonic function {u: M \rightarrow {\bf R}} on a Riemann surface {M}; such a function {u} is harmonic if, for every coordinate chart {\phi_\alpha: U_\alpha \rightarrow V_\alpha} in some atlas, the map {u \circ \phi_\alpha^{-1}: V_\alpha \rightarrow {\bf R}} is harmonic. Lemma 1(ii) ensures that this definition of harmonicity does not depend on the choice of atlas. Similarly, using Exercise 3 one can define what it means for a holomorphic map {f: M \rightarrow {\bf C} \cup \{\infty\}} on a Riemann surface {M} to have a pole or zero of a given order at a point {p_0 \in M}, with the definition being independent of the choice of atlas.

In view of Lemma 1, it is thus natural to ask which Riemann surfaces are complex diffeomorphic to each other, and more generally to understand the space of holomorphic maps from one given Riemann surface to another. We will initially focus attention on three important model Riemann surfaces:

  • (i) (Elliptic model) The Riemann sphere {{\bf C} \cup \{\infty\}};
  • (ii) (Parabolic model) The complex plane {{\bf C}}; and
  • (iii) (Hyperbolic model) The unit disk {D(0,1)}.

The designation of these model Riemann surfaces as elliptic, parabolic, and hyperbolic comes from Riemannian geometry, where it is natural to endow each of these surfaces with a constant curvature Riemannian metric which is positive, zero, or negative in the elliptic, parabolic, and hyperbolic cases respectively. However, we will not discuss Riemannian geometry further here.

All three model Riemann surfaces are simply connected, but none of them are complex diffeomorphic to any other; indeed, there are no non-constant holomorphic maps from the Riemann sphere to the plane or the disk, nor are there any non-constant holomorphic maps from the plane to the disk (although there are plenty of holomorphic maps going in the opposite directions). The complex automorphisms (that is, the complex diffeomorphisms from a surface to itself) of each of the three surfaces can be classified explicitly. The automorphisms of the Riemann sphere turn out to be the Möbius transformations {z \mapsto \frac{az+b}{cz+d}} with {ad-bc \neq 0}, also known as fractional linear transformations. The automorphisms of the complex plane are the linear transformations {z \mapsto az+b} with {a \neq 0}, and the automorphisms of the disk are the fractional linear transformations of the form {z \mapsto e^{i\theta} \frac{\alpha - z}{1 - \overline{\alpha} z}} for {\theta \in {\bf R}} and {\alpha \in D(0,1)}. Holomorphic maps {f: D(0,1) \rightarrow D(0,1)} from the disk {D(0,1)} to itself that fix the origin obey a basic but incredibly important estimate known as the Schwarz lemma: they are “dominated” by the identity function {z \mapsto z} in the sense that {|f(z)| \leq |z|} for all {z \in D(0,1)}. Among other things, this lemma gives guidance to determine when a given Riemann surface is complex diffeomorphic to a disk; we shall discuss this point further below.

It is a beautiful and fundamental fact in complex analysis that these three model Riemann surfaces are in fact an exhaustive list of the simply connected Riemann surfaces, up to complex diffeomorphism. More precisely, we have the Riemann mapping theorem and the uniformisation theorem:

Theorem 4 (Riemann mapping theorem) Let {U} be a simply connected open subset of {{\bf C}} that is not all of {{\bf C}}. Then {U} is complex diffeomorphic to {D(0,1)}.

Theorem 5 (Uniformisation theorem) Let {M} be a simply connected Riemann surface. Then {M} is complex diffeomorphic to {{\bf C} \cup \{\infty\}}, {{\bf C}}, or {D(0,1)}.

As we shall see, every connected Riemann surface can be viewed as the quotient of its simply connected universal cover by a discrete group of automorphisms known as deck transformations. This in principle gives a complete classification of Riemann surfaces up to complex diffeomorphism, although the situation is still somewhat complicated in the hyperbolic case because of the wide variety of discrete groups of automorphisms available in that case.

We will prove the Riemann mapping theorem in these notes, using the elegant argument of Koebe that is based on the Schwarz lemma and Montel’s theorem (Exercise 57 of Notes 4). The uniformisation theorem is however more difficult to establish; we discuss some components of a proof (based on the Perron method of subharmonic functions) here, but stop short of providing a complete proof.

The above theorems show that it is in principle possible to conformally map various domains into model domains such as the unit disk, but the proofs of these theorems do not readily produce explicit conformal maps for this purpose. For some domains we can just write down a suitable such map. For instance:

Exercise 6 (Cayley transform) Let {{\bf H} := \{ z \in {\bf C}: \mathrm{Im} z > 0 \}} be the upper half-plane. Show that the Cayley transform {\phi: {\bf H} \rightarrow D(0,1)}, defined by

\displaystyle  \phi(z) := \frac{z-i}{z+i},

is a complex diffeomorphism from the upper half-plane {{\bf H}} to the disk {D(0,1)}, with inverse map {\phi^{-1}: D(0,1) \rightarrow {\bf H}} given by

\displaystyle  \phi^{-1}(w) := i \frac{1+w}{1-w}.
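As a numerical sanity check of the two formulas in Exercise 6 (a sketch, not a proof), one can sample points of the upper half-plane and confirm that they land in the unit disk and that the stated inverse undoes the map. The function names and the sampling box are choices made for this illustration.

```python
import random

def cayley(z):
    """Cayley transform phi(z) = (z - i) / (z + i)."""
    return (z - 1j) / (z + 1j)

def cayley_inverse(w):
    """Stated inverse phi^{-1}(w) = i (1 + w) / (1 - w)."""
    return 1j * (1 + w) / (1 - w)

random.seed(0)
for _ in range(1000):
    # Random point with positive imaginary part (arbitrary sampling box).
    z = complex(random.uniform(-10, 10), random.uniform(0.01, 10))
    w = cayley(z)
    assert abs(w) < 1                         # image lies in D(0,1)
    assert abs(cayley_inverse(w) - z) < 1e-9  # round trip recovers z
```

The point {i} is sent to the origin, and points on the real axis are sent to the unit circle, consistent with the boundary of {{\bf H}} corresponding to the boundary of the disk.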

Exercise 7 Show that for any real numbers {a<b}, the strip {\{ z \in {\bf C}: a < \mathrm{Re}(z) < b \}} is complex diffeomorphic to the disk {D(0,1)}. (Hint: use the complex exponential and a linear transformation to map the strip onto the half-plane {{\bf H}}.)

Exercise 8 Show that for any real numbers {a<b<a+2\pi}, the sector {\{ re^{i\theta}: r>0, a < \theta < b \}} is complex diffeomorphic to the disk {D(0,1)}. (Hint: use a branch of either the complex logarithm, or of a complex power {z \mapsto z^\alpha}.)

We will discuss some other explicit conformal maps in this set of notes, such as the Schwarz-Christoffel maps that transform the upper half-plane {{\bf H}} to polygonal regions. Further examples of conformal mapping can be found in the text of Stein-Shakarchi.

— 1. Maps between the model Riemann surfaces —

In this section we study the various holomorphic maps, and conformal maps, between the three model Riemann surfaces {{\bf C} \cup \{\infty\}}, {{\bf C}}, and {D(0,1)}.

From Exercise 19 of Notes 4, we know that the only holomorphic maps {f: {\bf C} \cup \{\infty\} \rightarrow {\bf C} \cup \{\infty\}} from the Riemann sphere to itself take the form of a rational function {f(z) = P(z) / Q(z)} away from the zeroes of {Q} (and from {\infty}), with these singularities all being removable, and with {Q} not identically zero. We can of course reduce to lowest terms and assume that {P} and {Q} have no common factors. In particular, if {f} is to take values in {{\bf C}} rather than {{\bf C} \cup \{\infty\}}, then {Q} can have no roots (since {f} will have a pole at these roots) and so by the fundamental theorem of algebra {Q} is constant and {f} is a polynomial; in order for {f} to have no pole at infinity, {f} must then be constant. Thus the only holomorphic maps from {{\bf C} \cup \{\infty\}} to {{\bf C}} are the constants; in particular, the only holomorphic maps from {{\bf C} \cup \{\infty\}} to {D(0,1)} are the constants. Consequently, {{\bf C} \cup \{\infty\}} is not complex diffeomorphic to {{\bf C}} or {D(0,1)} (this is also topologically obvious since the Riemann sphere is compact, and {{\bf C}} and {D(0,1)} are not).

Exercise 9 More generally, show that if {M} is a compact Riemann surface and {N} is a connected non-compact Riemann surface, then the only holomorphic maps from {M} to {N} are the constants. (Hint: use the open mapping theorem, Theorem 37 of Notes 4.)

Now we consider complex automorphisms of the Riemann sphere {{\bf C} \cup \{\infty\}} to itself. There are some obvious examples of such automorphisms:

  • Translation maps {z \mapsto z + c} for some {c \in {\bf C}}, with the convention that {\infty} is mapped to {\infty};
  • Dilation maps {z \mapsto \lambda z} for some {\lambda \in {\bf C} \backslash \{0\}}, with the convention that {\infty} is mapped to {\infty}; and
  • The inversion map {z \mapsto 1/z}, with the convention that {\infty} is mapped to {0}.

More generally, given any complex numbers {a,b,c,d} with {ad-bc \neq 0}, we can define the Möbius transformation (or fractional linear transformation) {z \mapsto \frac{az+b}{cz+d}} for {z \neq \infty, -d/c}, with the convention that {-d/c} is mapped to {\infty} and {\infty} is mapped to {a/c} (where we adopt the further convention that {a/0=\infty} for non-zero {a}). For {c=0}, this is an affine transformation {z \mapsto \frac{a}{d} z + \frac{b}{d}}, which is clearly a composition of a translation and dilation map; for {c \neq 0}, this is a combination {z \mapsto \frac{a}{c} - \frac{ad-bc}{c(cz+d)}} of translations, dilations, and the inversion map. Thus all Möbius transformations are formed from composition of the translations, dilations, and inversions, and in particular are also automorphisms of the Riemann sphere; it is also easy to see that the Möbius transformations are closed under composition, and are thus the group generated by the translations, dilations, and inversions.

One can interpret the Möbius transformations as projective linear transformations as follows. Recall that the general linear group {GL_2({\bf C})} is the group of {2 \times 2} matrices {\begin{pmatrix} a & b \\ c & d \end{pmatrix}} with non-vanishing determinant {ad-bc}. Clearly every such matrix generates a Möbius transformation {z \mapsto \frac{az+b}{cz+d}}. However, two different elements of {GL_2({\bf C})} can generate the same Möbius transformation if they are scalar multiples of each other. If we define the projective linear group {PGL_2({\bf C})} to be the quotient group of {GL_2({\bf C})} by the group of scalar invertible matrices, then we may identify the set of Möbius transformations with {PGL_2({\bf C})}. The group {GL_2({\bf C})} acts on the space {{\bf C}^2} by the usual map

\displaystyle  \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} az+bw \\ cz+dw \end{pmatrix}.

If we let {{\bf CP}^1} be the complex projective line, that is to say the space of one-dimensional subspaces of {{\bf C}^2}, then {GL_2({\bf C})} acts on this space also, with the action of the scalars being trivial, so we have an action of {PGL_2({\bf C})} on {{\bf CP}^1}. We can identify the Riemann sphere {{\bf C} \cup \{\infty\}} with the complex projective line by identifying each {c \in {\bf C} \subset {\bf C} \cup \{\infty\}} with the one-dimensional subspace {\{ (cw,w): w \in {\bf C} \}} of {{\bf C}^2}, and identifying {\infty \in {\bf C} \cup \{\infty\}} with {\{ (z,0): z \in {\bf C}\}}. With this identification, one can check that the action of {PGL_2({\bf C})} on {{\bf CP}^1} has become identified with the action of the group of Möbius transformations on {{\bf C} \cup \{\infty\}}. (In particular, the group of Möbius transformations is isomorphic to {PGL_2({\bf C})}.)
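The correspondence between {2 \times 2} matrices and Möbius transformations can be checked numerically. The following minimal sketch (helper names are hypothetical, and the point at infinity is deliberately not handled) verifies at a sample point that composing two maps matches multiplying their matrices, and that a non-zero scalar multiple of a matrix gives the same map.

```python
def mobius(m, z):
    """Apply the Möbius map of matrix m = ((a, b), (c, d)) to a finite z.
    Assumes c*z + d != 0; the point at infinity is not handled here."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def scale(m, s):
    """Multiply every entry of the matrix m by the scalar s."""
    return tuple(tuple(s * x for x in row) for row in m)

# Two arbitrary invertible matrices and an arbitrary test point.
M = ((1, 2j), (3, 4))
N = ((2, -1), (1j, 1))
z = 0.3 + 0.7j

# Composition of the maps agrees with the matrix product.
assert abs(mobius(M, mobius(N, z)) - mobius(matmul(M, N), z)) < 1e-12
# A non-zero scalar multiple of M defines the same transformation.
assert abs(mobius(scale(M, 2 - 5j), z) - mobius(M, z)) < 1e-12
```

A full implementation would work in homogeneous coordinates {(z,w)} so that {\infty} is treated on the same footing as the other points; the sketch avoids this for brevity.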

There are enough Möbius transformations available that their action on the Riemann sphere is not merely transitive, but is in fact {3}-transitive:

Lemma 10 ({3}-transitivity) Let {z_1,z_2,z_3} be distinct elements of the Riemann sphere {{\bf C} \cup \{\infty\}}, and let {w_1,w_2,w_3} also be three distinct elements of the Riemann sphere. Then there exists a unique Möbius transformation {T} such that {T(z_j) = w_j} for {j=1,2,3}.

Proof: We first show existence. As the Möbius transformations form a group, it suffices to verify the claim for a single choice of {z_1,z_2,z_3}, for instance {z_1 = 0, z_2 = 1, z_3 = \infty}. If {w_3=\infty} then the affine transformation {z \mapsto w_1 + z(w_2-w_1)} will have the desired properties. If {w_3 \neq \infty}, we can use translation and inversion to find a Möbius transformation {S} that maps {w_3} to {\infty}; applying the previous case with {w_1,w_2,w_3} replaced by {S(w_1), S(w_2), S(w_3)} and then applying {S^{-1}}, we obtain the claim.

Now we prove uniqueness. By composing on the left and right with Möbius transforms we may assume that {z_1=w_1=0, z_2=w_2=1, z_3=w_3=\infty}. A Möbius transformation {z \mapsto \frac{az+b}{cz+d}} that fixes {0,1,\infty} must obey the constraints {b=0, a+b=c+d, c=0} and so must be the identity, as required. \Box
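The existence argument can be turned into a small computation. The sketch below is one possible way to do this, assuming all six points are finite (the case of {\infty} is omitted): it builds the matrix of the map sending a triple to {0,1,\infty}, and then composes one such map with the inverse of another. All function names are hypothetical.

```python
def matmul(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def adjugate(m):
    """Adjugate of a 2x2 matrix. It represents the inverse Möbius map,
    since scalar multiples of a matrix give the same transformation."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

def to_zero_one_inf(p1, p2, p3):
    """Matrix of z -> (p2 - p3)(z - p1) / ((p2 - p1)(z - p3)), which sends
    p1 -> 0, p2 -> 1, p3 -> infinity. Assumes distinct finite points."""
    return ((p2 - p3, -p1 * (p2 - p3)),
            (p2 - p1, -p3 * (p2 - p1)))

def three_point_map(zs, ws):
    """Matrix of the Möbius map T with T(zs[j]) = ws[j] for j = 0, 1, 2."""
    return matmul(adjugate(to_zero_one_inf(*ws)), to_zero_one_inf(*zs))

def mobius(m, z):
    """Apply the Möbius map with matrix m to a finite point z."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

# Arbitrary example triples.
zs = (0, 1, 2j)
ws = (1 + 1j, -3, 0.5)
T = three_point_map(zs, ws)
for z, w in zip(zs, ws):
    assert abs(mobius(T, z) - w) < 1e-9
```

By the uniqueness part of the lemma, any other construction of such a map must produce a scalar multiple of the same matrix.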

Möbius transformations are not 4-transitive, thanks to the invariant known as the cross-ratio:

Exercise 11 Define the cross-ratio {[z_1,z_2; z_3,z_4]} between four distinct points {z_1,z_2,z_3,z_4} on the Riemann sphere {{\bf C} \cup \{\infty\}} by the formula

\displaystyle  [z_1,z_2; z_3,z_4] = \frac{(z_1-z_3)(z_2-z_4)}{(z_2-z_3)(z_1-z_4)}

if all of {z_1,z_2,z_3,z_4} avoid {\infty}, and extended continuously to the case when one of the points equals {\infty} (e.g. {[z_1,z_2;z_3,\infty] = \frac{z_1-z_3}{z_2-z_3}}).

  • (i) Show that an injective map {T: {\bf C} \cup \{\infty\} \rightarrow {\bf C} \cup \{\infty\}} is a Möbius transform if and only if it preserves the cross-ratio, that is to say that {[T(z_1),T(z_2);T(z_3),T(z_4)] = [z_1,z_2;z_3,z_4]} for all distinct points {z_1,z_2,z_3,z_4 \in {\bf C} \cup \{\infty\}}. (Hint: for the “only if” part, work with the basic Möbius transforms. For the “if” part, reduce to the case when {T} fixes three points, such as {0,1,\infty}.)
  • (ii) If {z_1,z_2,z_3,z_4} are distinct points in {{\bf C} \cup \{\infty\}}, show that {z_1,z_2,z_3,z_4} lie on a common extended line (i.e., a line in {{\bf C}} together with {\infty}) or circle in {{\bf C}} if and only if the cross-ratio {[z_1,z_2;z_3,z_4]} is real. Conclude that a Möbius transform will map an extended line or circle to an extended line or circle.

As one quick application of Möbius transformations, we have

Proposition 12 {{\bf C} \cup \{\infty\}} is simply connected.

Proof: We have to show that any closed curve {\gamma} in {{\bf C} \cup \{\infty\}} is contractible to a point in {{\bf C} \cup \{\infty\}}. By deforming {\gamma} locally into line segments in either of the two standard coordinate charts of {{\bf C} \cup \{\infty\}} we may assume that {\gamma} is the concatenation of finitely many such line segments; in particular, {\gamma} cannot be a space-filling curve (as one can see from e.g. the Baire category theorem) and thus avoids at least one point in {{\bf C} \cup \{\infty\}}. If {\gamma} avoids {\infty} then it lies in {{\bf C}} and can thus be contracted to a point in {{\bf C}} (and hence in {{\bf C} \cup \{\infty\}}) since {{\bf C}} is convex. If {\gamma} avoids any other point {z_0}, then we can apply a Möbius transformation to move {z_0} to {\infty}, contract the transformed curve to a point, and then invert the Möbius transform to contract {\gamma} to a point in {{\bf C} \cup \{\infty\}}. \Box

Exercise 13 (Jordan curve theorem in the Riemann sphere) Let {\gamma: [a,b] \rightarrow {\bf C} \cup \{\infty\}} be a simple closed curve in the Riemann sphere. Show that the complement of {\gamma([a,b])} in {{\bf C} \cup \{\infty\}} is the union of two disjoint simply connected open subsets of {{\bf C} \cup \{\infty\}}. (Hint: one first has to exclude the possibility that {\gamma} is space-filling. Do this by verifying that {\gamma([a,b])} is homeomorphic to the unit circle.)

It turns out that there are no other automorphisms of the Riemann sphere than the Möbius transformations:

Proposition 14 (Automorphisms of Riemann sphere) Let {T: {\bf C} \cup \{\infty\} \rightarrow {\bf C} \cup \{\infty\}} be a complex diffeomorphism. Then {T} is a Möbius transformation.

Proof: By Lemma 10 and composing {T} with a Möbius transformation, we may assume without loss of generality that {T} fixes {0,1,\infty}. From Exercise 19 of Notes 4 we know that {T} is a rational function {T(z) = P(z)/Q(z)} (with all singularities removed); we may reduce terms so that {P,Q} have no common factors. Since {T} is bijective and fixes {\infty}, it has no poles in {{\bf C}}, and hence {Q} can have no roots; by the fundamental theorem of algebra, this makes {Q} constant. Similarly, {P} has no zeroes other than {0}, and so must be a monomial; as {T} also fixes {1}, it must be of the form {T(z) = z^n} for some natural number {n}. But this is only injective if {n=1}, in which case {T} is clearly a Möbius transformation. \Box

Now we look at holomorphic maps on {{\bf C}}. There are plenty of holomorphic maps from {{\bf C}} to {{\bf C}}; indeed, these are nothing more than the entire functions, of which there are many (indeed, an entire function is nothing more than a power series with an infinite radius of convergence). There are even more holomorphic maps from {{\bf C}} to {{\bf C} \cup \{\infty\}}, as these are just the meromorphic functions on {{\bf C}}. For instance, any ratio {f/g} of two entire functions, with {g} not identically zero, will be meromorphic on {{\bf C}}. On the other hand, from Liouville’s theorem (Theorem 28 of Notes 3) we see that the only holomorphic maps from {{\bf C}} to {D(0,1)} are the constants. In particular, {{\bf C}} and {D(0,1)} are not complex diffeomorphic (despite the fact that they are diffeomorphic over the reals, as can be seen for instance by using the projection {z \mapsto \frac{z}{\sqrt{1+|z|^2}}}).
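The real diffeomorphism mentioned at the end of the preceding paragraph can be checked directly. In the sketch below, the inverse formula {w \mapsto \frac{w}{\sqrt{1-|w|^2}}} is derived for this illustration rather than taken from the notes, and the function names are hypothetical.

```python
import math

def plane_to_disk(z):
    """The projection z -> z / sqrt(1 + |z|^2) from C into D(0,1)."""
    return z / math.sqrt(1 + abs(z) ** 2)

def disk_to_plane(w):
    """Candidate inverse w -> w / sqrt(1 - |w|^2); requires |w| < 1.
    (Derived for this sketch, not stated in the notes.)"""
    return w / math.sqrt(1 - abs(w) ** 2)

for z in (0, 1 + 2j, -30 + 4j, 1e3j):
    w = plane_to_disk(z)
    assert abs(w) < 1                                   # lands in the disk
    assert abs(disk_to_plane(w) - z) <= 1e-6 * (1 + abs(z))  # round trip
```

The map depends on {|z|} and so is not holomorphic, which is consistent with Liouville’s theorem ruling out a complex diffeomorphism.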

The affine maps {z \mapsto az+b} with {a \in {\bf C} \backslash \{0\}} and {b \in {\bf C}} are clearly complex automorphisms on {{\bf C}}. In analogy with Proposition 14, these turn out to be the only automorphisms:

Proposition 15 (Automorphisms of complex plane) Let {T: {\bf C} \rightarrow {\bf C}} be a complex diffeomorphism. Then {T} is an affine transformation {T(z) = az+b} for some {a \in {\bf C} \backslash \{0\}} and {b \in {\bf C}}.

Proof: By the open mapping theorem (Theorem 37 of Notes 4), {T(D(0,1))} is open, and hence {T} avoids the non-empty open set {T(D(0,1))} on {{\bf C} \backslash D(0,1)}. By the Casorati-Weierstrass theorem (Theorem 11 of Notes 4), we conclude that {T} does not have an essential singularity at infinity. Thus {T} extends to a holomorphic function from {{\bf C} \cup \{\infty\}} to {{\bf C} \cup \{\infty\}}, hence by Exercise 19 of Notes 4 is rational. As the only pole of {T} is at infinity, {T} is a polynomial; as {T} is a diffeomorphism, the derivative has no zeroes and is thus constant by the fundamental theorem of algebra. Thus {T} must be affine, and the claim follows. \Box

Exercise 16 Let {f: {\bf C} \rightarrow {\bf C} \cup \{\infty\}} be an injective holomorphic map. Show that {f} is a Möbius transformation (restricted to {{\bf C}}).

We remark that injective holomorphic maps are often referred to as univalent functions in the literature.

Finally, we consider holomorphic maps on {D(0,1)}. There are plenty of holomorphic maps from {D(0,1)} to {{\bf C}} (indeed, these are just the power series with radius of convergence at least {1}), and even more holomorphic maps from {D(0,1)} to {{\bf C} \cup \{\infty\}} (for instance, one can take the quotient of two holomorphic functions {f,g: D(0,1) \rightarrow {\bf C}} with {g} non-zero). There are also many holomorphic maps from {D(0,1)} to {D(0,1)}, for instance one can take any bounded holomorphic function {f: D(0,1) \rightarrow {\bf C}} and multiply it by a small constant. However, we have the following fundamental estimate concerning such functions, the Schwarz lemma:

Lemma 17 (Schwarz lemma) Let {f: D(0,1) \rightarrow D(0,1)} be a holomorphic map such that {f(0)=0}. Then we have {|f(z)| \leq |z|} for all {z \in D(0,1)}. In particular, {|f'(0)| \leq 1}.

Furthermore, if {|f(z)|=|z|} for some {z \in D(0,1) \backslash \{0\}}, or if {|f'(0)|=1}, then there exists a real number {\theta} such that {f(z) = e^{i \theta} z} for all {z \in D(0,1)}.

Proof: By the factor theorem (Corollary 22 of Notes 3), we may write {f(z) = z g(z)} for some holomorphic {g: D(0,1) \rightarrow {\bf C}}. On any circle {\{ z: |z| = 1-\varepsilon \}} with {0 < \varepsilon < 1}, we have {|f(z)| <1} and hence {|g(z)| < \frac{1}{1-\varepsilon}}; by the maximum principle we conclude that {|g(z)| \leq \frac{1}{1-\varepsilon}} for all {z \in D(0,1-\varepsilon)}. Sending {\varepsilon} to zero, we conclude that {|g(z)| \leq 1} for all {z \in D(0,1)}, and hence {|f(z)| \leq |z|} and {|f'(0)| = |g(0)| \leq 1}.

Finally, if {|f(z)|=|z|} for some {z \in D(0,1) \backslash \{0\}} or {|f'(0)|=1}, then {|g(z)|} equals {1} for some {z \in D(0,1)}, and hence by a variant of the maximum principle (see Exercise 18 below) we see that {g} is constant, giving the claim. \Box
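To see the lemma in action on a concrete example, one can take a map of the form {f(z) = z \frac{z-a}{1-\overline{a}z}} with {|a|<1}, which sends the disk to itself and fixes the origin, and test the inequality at random sample points. The choice of {a}, the sampling scheme, and the tolerances below are assumptions made for this sketch.

```python
import cmath
import random

A = 0.5 + 0.2j  # hypothetical parameter with |A| < 1

def f(z):
    """f(z) = z * (z - A) / (1 - conj(A) z): holomorphic on the disk,
    maps D(0,1) into itself, and satisfies f(0) = 0."""
    return z * (z - A) / (1 - A.conjugate() * z)

random.seed(1)
for _ in range(2000):
    r = random.uniform(0.0, 0.999)
    t = random.uniform(0.0, 2 * cmath.pi)
    z = r * cmath.exp(1j * t)
    assert abs(f(z)) <= abs(z) + 1e-12   # Schwarz bound |f(z)| <= |z|

# Difference quotient at the origin approximates |f'(0)| = |A| <= 1.
h = 1e-8
assert abs(abs(f(h) / h) - abs(A)) < 1e-6
```

Here {g(z) = \frac{z-a}{1-\overline{a}z}} is not constant, so the inequalities are strict away from the origin, matching the equality case of the lemma.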

Exercise 18 (Variant of maximum principle) Let {U} be a connected Riemann surface, and let {z_0} be a point in {U}.

  • (i) If {u: U \rightarrow {\bf R}} is a harmonic function such that {u(z) \leq u(z_0)} for all {z \in U}, then {u(z) = u(z_0)} for all {z \in U}.
  • (ii) If {f: U \rightarrow {\bf C}} is a holomorphic function such that {|f(z)| \leq |f(z_0)|} for all {z \in U}, then {f(z) = f(z_0)} for all {z \in U}.

(Hint: use Exercise 17 of Notes 3.)

One can think of the Schwarz lemma as follows. Let {{\mathcal H}_0} denote the collection of holomorphic functions {f: D(0,1) \rightarrow D(0,1)} with {f(0)=0}. Inside this collection we have the rotations {R_\theta: D(0,1) \rightarrow D(0,1)} for {\theta \in {\bf R}} defined by {R_\theta(z) :=e^{i\theta} z}. The Schwarz lemma asserts that these rotations “dominate” the remaining functions {f} in {{\mathcal H}_0} in the sense that {|f(z)| \leq |R_\theta(z)|} on {D(0,1) \backslash \{0\}}, and in particular {|f'(0)| \leq |R'_\theta(0)|}; furthermore these inequalities are strict as long as {f} is not one of the {R_\theta}.

As a first application of the Schwarz lemma, we characterise the automorphisms of the disk {D(0,1)}. For any {\alpha \in D(0,1)}, one can check that the Möbius transformation {z \mapsto \frac{z-\alpha}{1-\overline{\alpha} z}} preserves the boundary of the disk {D(0,1)} (since {1 - \overline{\alpha} z = z \overline{z-\alpha}} when {|z|=1}), and maps the point {\alpha} to the origin, and thus maps the disk {D(0,1)} to itself. More generally, for any {\alpha \in D(0,1)} and {\theta \in {\bf R}}, the Möbius transformation {z \mapsto e^{i\theta} \frac{z-\alpha}{1-\overline{\alpha} z}} is an automorphism of the disk {D(0,1)}. It turns out that these are the only such automorphisms:

Theorem 19 (Automorphisms of disk) Let {f: D(0,1) \rightarrow D(0,1)} be a complex diffeomorphism. Then there exists {\alpha \in D(0,1)} and {\theta \in {\bf R}} such that {f(z) = e^{i\theta} \frac{z-\alpha}{1-\overline{\alpha} z}} for all {z \in D(0,1)}. If furthermore {f(0)=0}, then we can take {\alpha=0}, thus {f(z) =e^{i\theta} z} for {z \in D(0,1)}.

Proof: First suppose that {f(0)=0}. By the Schwarz lemma applied to both {f} and its inverse {f^{-1}}, we see that {|f'(0)|, |(f^{-1})'(0)| \leq 1}. But by the inverse function theorem (or the chain rule), {(f^{-1})'(0) = 1/f'(0)}, hence {|f'(0)|=1}. Applying the Schwarz lemma again, we conclude that {f(z) = e^{i\theta} z} for some {\theta}, as required.

In the general case, there exists {\alpha \in D(0,1)} such that {f(\alpha) = 0}. If one then applies the previous analysis to {f \circ g^{-1}}, where {g: D(0,1) \rightarrow D(0,1)} is the automorphism {g(z) := \frac{z-\alpha}{1-\overline{\alpha} z}}, we obtain the claim. \Box
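The automorphisms appearing in Theorem 19 are easy to experiment with. The following sketch builds {z \mapsto e^{i\theta} \frac{z-\alpha}{1-\overline{\alpha} z}} together with an explicit inverse (the inverse formula {w \mapsto \frac{u+\alpha}{1+\overline{\alpha}u}} with {u = e^{-i\theta}w} is worked out for this illustration), and checks numerically that {\alpha} goes to the origin, that the unit circle is preserved, and that the two maps undo each other. Names and sample values are hypothetical.

```python
import cmath

def disk_automorphism(alpha, theta):
    """Return (f, f_inv) for f(z) = e^{i theta} (z - alpha) / (1 - conj(alpha) z).
    Assumes |alpha| < 1 and real theta."""
    alpha = complex(alpha)
    rot = cmath.exp(1j * theta)

    def f(z):
        return rot * (z - alpha) / (1 - alpha.conjugate() * z)

    def f_inv(w):
        u = w / rot  # undo the rotation first
        return (u + alpha) / (1 + alpha.conjugate() * u)

    return f, f_inv

alpha, theta = 0.3 - 0.4j, 1.1   # arbitrary example parameters
f, f_inv = disk_automorphism(alpha, theta)

assert abs(f(alpha)) < 1e-12                       # alpha is sent to 0
for t in (0.0, 0.7, 2.0, 5.5):
    z = cmath.exp(1j * t)
    assert abs(abs(f(z)) - 1) < 1e-12              # unit circle preserved
for z in (0, 0.5j, -0.8 + 0.1j):
    assert abs(f(z)) < 1                           # disk mapped into disk
    assert abs(f_inv(f(z)) - z) < 1e-12            # inverse recovers z
```

Setting `alpha = 0` recovers the rotations {z \mapsto e^{i\theta} z}, the case {f(0)=0} of the theorem.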

Exercise 20 (Automorphisms of half-plane) Let {f: {\bf H} \rightarrow {\bf H}} be a complex diffeomorphism from the upper half-plane {{\bf H} := \{z \in {\bf C}: \mathrm{Im}(z) > 0 \}} to itself. Show that there exist real numbers {a,b,c,d} with {ad-bc = 1} such that {f(z) = \frac{az+b}{cz+d}} for {z \in {\bf H}}. Conclude that the automorphism gr