Linear Algebra Done Right
Posted by David Corfield
– guest post by Tom Leinster –
One of the best things about the n-Category Café is the willingness of its clientele to re-examine notions that are usually thought of as too basic to bother with — things that you might think were beneath your dignity.
Another great thing about the n-Café is the generosity of its proprietors in letting me post here! Thanks to them.
I’m not going to talk about fancy linear algebra: just the good old theory of finite-dimensional vector spaces, like you probably met when you were an undergraduate. Or rather, not quite like what you met…
The inspiration is Sheldon Axler’s textbook Linear Algebra Done Right. Unfortunately you can’t download it, but you can download an earlier article making the same main points, Down with determinants!.
I’ve just given you a dirty great hint as to what’s behind that provocative first title. For Axler, ‘done right’ means ‘done without determinants’. He claims that determinants should be delayed until as late as possible in a linear algebra course - and he shows that ‘as late as possible’ is a lot later than most people think. He doesn’t introduce them himself until Chapter 10 of his 10-chapter book. In fact, he goes as far as to claim that the only part of undergraduate mathematics where determinants are needed is the change of variables formula for multi-variable integrals.
He makes his point in the Introduction:
The audacious title of this book deserves an explanation. Almost all linear algebra books use determinants to prove that every linear map from a finite-dimensional complex vector space to itself has an eigenvalue. Determinants are difficult, nonintuitive, and often defined without motivation. To prove the theorem about existence of eigenvalues on complex vector spaces, most books must define determinants, prove that a linear map is not invertible if and only if its determinant equals 0, and then define the characteristic polynomial. This tortuous (torturous?) path gives students little feeling for why eigenvalues must exist. In contrast, the determinant-free proof presented here […] is simpler and clearer and offers more insight.
I’ll give you the proof below.
I think most people would agree that Axler’s line of development is very nice and represents a good way of teaching. Nevertheless, perhaps rather fewer would join in wholeheartedly with the cry of ‘Down with determinants!’ After all, determinants are polynomials, and once you have a polynomial you can engage the power of algebraic geometry. For example, it’s useful to know that the invertible matrices are Zariski-dense in the space of all matrices, which follows from the fact that the invertible matrices are those whose determinant is nonzero. Determinants are also re-emerging in all sorts of fancy settings, such as quasideterminants, hyperdeterminants, and multidimensional determinants.
I’m interested to know what other readers think. What are determinants good for?
This is a long post, so I’ll split the rest into two sections. First I’ll tell you more about what Axler does. Then I’ll tell you why his book could also have been called Dynamics In A Finite-Dimensional Vector Space.
Axler’s Linear Algebra
First, here’s the proof that Axler mentioned. Our task is to show that if $V$ is a nontrivial, finite-dimensional vector space over $\mathbb{C}$, then every endomorphism $T$ of $V$ has an eigenvalue. Choose a nonzero vector $v \in V$. The vectors $v, T v, T^2 v, \ldots$ can’t all be linearly independent. So there’s a dependence relation, which is of the form $p(T)(v) = 0$ for some nonzero polynomial $p$. Since we’re over the complex numbers, $p$ splits into linear factors, and $p(T) = c (T - \lambda_1 I) \cdots (T - \lambda_k I)$ for some $c \neq 0$ and $\lambda_1, \ldots, \lambda_k \in \mathbb{C}$. Since $p(T)(v) = 0$ and $v \neq 0$, not all the maps $T - \lambda_i I$ can be injective: in other words, some $\lambda_i$ is an eigenvalue.
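If you’d like to see the argument in action, here’s a minimal NumPy sketch (mine, not Axler’s or the post’s): build the vectors $v, Tv, T^2v, \ldots$, read a dependence relation off a null vector, and check that at least one root of the resulting polynomial really is an eigenvalue of $T$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# The n+1 vectors v, Tv, ..., T^n v live in an n-dimensional space,
# so they satisfy some nontrivial linear dependence relation.
cols = [v]
for _ in range(n):
    cols.append(T @ cols[-1])
M = np.column_stack(cols)                       # n x (n+1)

# A null vector of M gives the coefficients of a nonzero polynomial p
# with p(T)(v) = 0; the last right singular vector does the job.
coeffs = np.linalg.svd(M)[2][-1].conj()         # coeffs[k] multiplies T^k v
assert np.allclose(M @ coeffs, 0, atol=1e-8)

# Over C the polynomial splits into linear factors; np.roots wants
# coefficients ordered from highest degree down.
roots = np.roots(coeffs[::-1])

# Not every root need be an eigenvalue, but at least one must be:
# some factor T - lambda*I fails to be injective.
eigenvalues = np.linalg.eigvals(T)
print(any(np.isclose(eigenvalues, r, atol=1e-6).any() for r in roots))   # True
```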
This is a ‘dynamical’ proof: it works by considering the trajectory of $v$ under iterated application of $T$. It’s similar to the proof that every element of a finite group has finite order.
If you like that, you’ll probably like Axler’s whole approach. Here’s some more. Given an endomorphism $T$ of a vector space $V$, its eventual kernel $\ker^\infty(T)$ is the union of the subspaces $\ker(T) \subseteq \ker(T^2) \subseteq \ker(T^3) \subseteq \cdots$, and its eventual image $\operatorname{im}^\infty(T)$ is the intersection of the subspaces $\operatorname{im}(T) \supseteq \operatorname{im}(T^2) \supseteq \operatorname{im}(T^3) \supseteq \cdots$. You can show that if $V$ is finite-dimensional then $V = \ker^\infty(T) \oplus \operatorname{im}^\infty(T)$. You might think this doesn’t say anything about the familiar concerns of linear algebra - but wait a moment.
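Here’s a small NumPy illustration (mine, just to make the definitions concrete): for a toy endomorphism assembled from a nilpotent block and an invertible block, the kernels of the powers grow until they stabilise, and a basis of the eventual kernel together with a basis of the eventual image spans the whole space.

```python
import numpy as np
from scipy.linalg import block_diag, null_space, orth

# A 3x3 nilpotent Jordan block glued to an invertible 2x2 block, so the
# kernels of T, T^2, T^3 genuinely grow for a few steps before stabilising.
N = np.diag([1.0, 1.0], k=1)                    # 3x3, N^3 = 0
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])                      # invertible
T = block_diag(N, A)                            # 5x5
n = T.shape[0]

# Dimensions of ker(T^k) for k = 1, ..., n: they increase and then stabilise.
print([null_space(np.linalg.matrix_power(T, k)).shape[1] for k in range(1, n + 1)])
# [1, 2, 3, 3, 3]

# Once stabilised (certainly by k = n), we have the eventual kernel and image.
Tn = np.linalg.matrix_power(T, n)
K = null_space(Tn)                              # basis of the eventual kernel
J = orth(Tn)                                    # basis of the eventual image

# V is the direct sum of the eventual kernel and the eventual image:
# the two bases together span the whole 5-dimensional space.
print(np.linalg.matrix_rank(np.hstack([K, J])) == n)   # True
```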
Fix an endomorphism $T$ of $V$, a finite-dimensional complex vector space. For $\lambda \in \mathbb{C}$, write $V_\lambda = \ker^\infty(T - \lambda I)$, and call its elements eventual $\lambda$-eigenvectors. (The ‘eventual’ terminology is mine, not Axler’s; he calls them ‘generalized eigenvectors’, which I don’t find very evocative.) Using the results above, you can show that $V = \bigoplus_{\lambda \in \mathbb{C}} V_\lambda$. In particular, this implies that the dimensions of the $V_\lambda$’s add up to $\dim V$: writing $m_\lambda = \dim V_\lambda$, $\sum_{\lambda \in \mathbb{C}} m_\lambda = \dim V$. In particular, $m_\lambda$ is nonzero for only finitely many values of $\lambda$. But $m_\lambda$ is nonzero if and only if $V_\lambda$ is nontrivial, if and only if $\lambda$ is an eigenvalue. So $T$ has only finitely many eigenvalues!
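As a quick numerical sanity check (mine, not from the post), you can compute each $m_\lambda$ as $\dim \ker\big((T - \lambda I)^{\dim V}\big)$ and confirm that the $m_\lambda$ really do add up to $\dim V$:

```python
import numpy as np
from scipy.linalg import null_space

# A matrix whose eigenvalue 2 has a 1-dimensional ordinary eigenspace but a
# 2-dimensional eventual eigenspace.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
n = T.shape[0]

eigenvalues = np.unique(np.round(np.linalg.eigvals(T).real, 8))
m = {float(lam): null_space(np.linalg.matrix_power(T - lam * np.eye(n), n)).shape[1]
     for lam in eigenvalues}

print(m)                        # {2.0: 2, 5.0: 1}
print(sum(m.values()) == n)     # True: the m_lambda add up to dim V
```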
I’m omitting the proofs, but it’s as easy to find them yourself as to look them up.
In fact, $m_\lambda$ is the mysterious quantity known as the algebraic multiplicity of $\lambda$. It’s usually defined as the power of $(x - \lambda)$ appearing in the characteristic polynomial $\det(x I - T)$, which doesn’t give much of an idea of what it really is.
Well, it’s not mysterious any more! The algebraic multiplicity of $\lambda$ is simply the dimension of $V_\lambda$, the space of eventual $\lambda$-eigenvectors. Clearly it’s at least as big as the dimension of $\ker(T - \lambda I)$, the space of ordinary $\lambda$-eigenvectors. This inequality is usually phrased as: algebraic multiplicity $\geq$ geometric multiplicity.
If you already know about Jordan normal form, you can see that the two descriptions of algebraic multiplicity are equivalent. In fact, you come naturally to Jordan normal form if you walk just a little further along the Axlerian path.
Despite Axler’s anti-determinant rhetoric, this approach does in fact give a very nice way into determinants. You simply define $\det(T) = \prod_{\lambda} \lambda^{m_\lambda}$. (Look, no matrices!) Similarly, the characteristic polynomial of $T$ is $\prod_{\lambda} (x - \lambda)^{m_\lambda}$, and the Cayley-Hamilton Theorem becomes a pushover.
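And here’s a quick check (mine, not from the post) that this definition agrees with the usual one: on a small upper-triangular example, $\prod_\lambda \lambda^{m_\lambda}$ matches NumPy’s determinant, and $\prod_\lambda (x - \lambda)^{m_\lambda}$ matches the usual characteristic polynomial.

```python
import numpy as np
from scipy.linalg import null_space

# Upper-triangular example: eigenvalues 3 (algebraic multiplicity 2), 2 and 7.
T = np.array([[3.0, 1.0, 0.0, 0.0],
              [0.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [0.0, 0.0, 0.0, 7.0]])
n = T.shape[0]

# m_lambda = dimension of the eventual lambda-eigenspace, ker((T - lambda*I)^n)
m = {lam: null_space(np.linalg.matrix_power(T - lam * np.eye(n), n)).shape[1]
     for lam in [3.0, 2.0, 7.0]}

# det(T) = product over lambda of lambda^(m_lambda)
print(np.isclose(np.prod([lam ** k for lam, k in m.items()]), np.linalg.det(T)))  # True

# characteristic polynomial = product over lambda of (x - lambda)^(m_lambda)
roots = [lam for lam, k in m.items() for _ in range(k)]
print(np.allclose(np.poly(roots), np.poly(T)))                                    # True
```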
Easy exercise: find a similar formula for trace.
Dynamics in a Finite-Dimensional Vector Space
A dynamical system is a system that changes through time. The central questions of dynamics are things like: what do the trajectories/orbits look like? Are there fixed points? Are there cycles? Do all trajectories ‘settle down’ or can there be chaos? If you start with points close together, do they necessarily exhibit similar long-term behaviour?
I want to talk about dynamical systems in some generality - so much generality that it won’t be clear what all of these questions mean. This will take us back to Linear Algebra Done Right! I’ll stick to discrete-time dynamical systems, where time is modelled on the integers, not the real numbers.
Sloganeering,
A dynamical system is an endomorphism.
Thus, we take a point in whatever space the endomorphism is defined on, repeatedly apply the endomorphism to it, and ask what the trajectory might look like.
For example, I have before me a paper that defines an ‘abstract dynamical system’ as a continuous map from a metric space to itself. (Here at the $n$-Category Café, we hardly notice the word ‘abstract’ tacked onto the front.)
For another example, complex dynamics is about what happens when you iterate a rational function (over $\mathbb{C}$), or more generally, what happens when you iterate a holomorphic self-map of a Riemann surface. This is ‘more generally’ because a rational function is the same thing as a holomorphic self-map of the Riemann sphere $\mathbb{C} \cup \{\infty\}$. Here we meet Julia sets, the Mandelbrot set, etc.
For another example, you might call combinatorial dynamics the study of iterating a function from a finite set to itself. There’s a surprising amount to be said here, and it’s curiously absent from common knowledge. I first noticed this when reading Milnor’s book Dynamics in One Complex Variable, where now and again he needs some chewy little result in combinatorial dynamics. For instance, did you know that for any endomorphism $f$ of a finite set, the set $\{f, f^2, f^3, \ldots\}$ contains precisely one idempotent? It’s not especially hard to prove - in fact, it’s among the most basic results in combinatorial dynamics. But what’s interesting is that so few people just know it off the top of their heads. Contrast the vast amount we all know about permutations of finite sets.
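If you don’t believe it, here’s a brute-force check in Python (mine, not Milnor’s): among the iterates $f, f^2, f^3, \ldots$ of any endomorphism $f$ of a 4-element set, there is exactly one idempotent.

```python
from itertools import product

def compose(f, g):
    """The composite f o g of endomorphisms of {0, ..., n-1}, stored as tuples."""
    return tuple(f[g[x]] for x in range(len(f)))

n = 4
for f in product(range(n), repeat=n):       # all n^n endomorphisms of an n-element set
    # The sequence f, f^2, f^3, ... is deterministic, so the iterates seen
    # before the first repetition are exactly all the distinct iterates.
    iterates, g = set(), f
    while g not in iterates:
        iterates.add(g)
        g = compose(f, g)
    idempotents = [h for h in iterates if compose(h, h) == h]
    assert len(idempotents) == 1            # precisely one idempotent among the iterates

print("checked all", n ** n, "endomorphisms of a 4-element set")
```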
And for another example, take dynamics in a finite-dimensional vector space! This is what Axler is really doing. You can see from the stuff above that we’re concerned with iterated applications of the endomorphism $T$, and the eventual behaviour of a vector $v$. That’s the changed point of view that allows him to do without determinants.
For instance, take the proof that every linear endomorphism has an eigenvalue. The first step is to show that every trajectory eventually ‘repeats itself’, in the weak sense that after a finite number of steps, all the points we obtain are in the linear span of those listed previously. This is a rough analogue of the topological statement ‘every trajectory is attracted to a cycle’.
I’d like to know whether it’s fruitful to take other standard statements in dynamics and formulate linear-algebraic analogues. In other words, can the analogy between dynamical systems and Axlerian linear algebra be pushed further?
Re: Linear Algebra Done Right
I’m really torn about Axler. My first course at the University of Maryland was the upper-level linear algebra, and we used Axler. Overall, I’d say that the theme is to resist using a basis, which I wholly support.
The problem then is how to do determinants without a basis, and I really dislike how Axler ends up handling them. The “right” way (in my opinion) is to look at the top degree of the exterior algebra rather than to introduce the characteristic polynomial and then back into determinants, but this is pretty sophisticated even for senior math majors. In fact, Dr. Adams tacked on a section from Lang’s Algebra at the end of the course to cover this approach and promptly lost all but three of us.