*Linear Algebra Done Right*

#### Posted by David Corfield

*Guest post by Tom Leinster*

One of the best things about the *n*-Category Café is the
willingness of its clientele to re-examine notions that are usually
thought of as too basic to bother with — things that you might think
were beneath your dignity.

Another great thing about the *n*-Café is the generosity of its
proprietors in letting me post here! Thanks to them.

I’m not going to talk about fancy linear algebra:
just the good old theory of finite-dimensional vector spaces, like you
probably met when you were an undergraduate. Or rather, not
*quite* like what you met…

The inspiration is Sheldon Axler’s textbook *Linear Algebra Done
Right*. Unfortunately you can’t download it, but you
*can* download an earlier article making the same main points,
‘Down with determinants!’.

I’ve just given you a dirty great hint as to what’s behind that provocative first title. For Axler, ‘done right’ means ‘done without determinants’. He claims that determinants should be delayed until as late as possible in a linear algebra course, and he shows that ‘as late as possible’ is a lot later than most people think. He doesn’t introduce them himself until Chapter 10 of his 10-chapter book. In fact, he goes as far as to claim that the only part of undergraduate mathematics where determinants are needed is the change of variables formula for multi-variable integrals.

He makes his point in the Introduction:

> The audacious title of this book deserves an explanation. Almost all linear algebra books use determinants to prove that every linear map from a finite-dimensional complex vector space to itself has an eigenvalue. Determinants are difficult, nonintuitive, and often defined without motivation. To prove the theorem about existence of eigenvalues on complex vector spaces, most books must define determinants, prove that a linear map is not invertible if and only if its determinant equals 0, and then define the characteristic polynomial. This tortuous (torturous?) path gives students little feeling for why eigenvalues must exist. In contrast, the determinant-free proof presented here […] is simpler and clearer and offers more insight.

I’ll give you the proof below.

I think most people would agree that Axler’s line of development is very nice and represents a good way of teaching. Nevertheless, perhaps rather fewer would join in wholeheartedly with the cry of ‘Down with determinants!’ After all, determinants are polynomials, and once you have a polynomial you can engage the power of algebraic geometry. For example, it’s useful to know that the invertible $n\times n$ matrices are Zariski-dense in the space of all $n\times n$ matrices, which follows from the fact that the invertible matrices are those whose determinant is nonzero. Determinants are also re-emerging in all sorts of fancy settings, such as quasideterminants, hyperdeterminants, and multidimensional determinants.

I’m interested to know what other readers think. What are determinants good for?

This is a long post, so I’ll split the rest into two sections. First
I’ll tell you more about what Axler does. Then I’ll tell you why
his book could also have been called *Dynamics In A
Finite-Dimensional Vector Space*.

*Axler’s Linear Algebra*

First, here’s the proof that Axler mentioned. Our task is to show that if $V$ is a nontrivial, finite-dimensional complex vector space, then every endomorphism $T$ of $V$ has an eigenvalue. Choose a nonzero vector $v$. The vectors $v, T(v), T^2(v), T^3(v), \ldots$ can’t all be linearly independent, since there are infinitely many of them. So there’s a dependence relation, which is of the form $p(T)(v) = 0$ for some nonzero polynomial $p$. Since we’re over the complex numbers, $p$ splits into linear factors, and $(T - \lambda_1) \cdots (T - \lambda_k) (v) = 0.$ Since $v \neq 0$, the composite on the left is not injective, so not all the maps $(T - \lambda_i)$ can be injective: in other words, some $\lambda_i$ is an eigenvalue.
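The proof can be run numerically. Here's a sketch using numpy, with a sample $2 \times 2$ matrix $T$ and vector $v$ of my own choosing: the trajectory $v, T(v), T^2(v)$ must be dependent, a null-space vector of the resulting matrix gives the polynomial $p$, and its roots turn out to contain a genuine eigenvalue.

```python
import numpy as np

# Sample data, not from Axler: T has eigenvalues 2 and 3.
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
v = np.array([0.0, 1.0])

# Three vectors in a 2-dimensional space must be linearly dependent.
K = np.column_stack([v, T @ v, T @ T @ v])   # 2x3 matrix

# A null-space vector of K gives coefficients c0, c1, c2 with
# (c0 I + c1 T + c2 T^2) v = 0, i.e. p(T)(v) = 0.
_, _, Vt = np.linalg.svd(K)
c = Vt[-1]

roots = np.roots(c[::-1])     # np.roots wants highest-degree coefficient first
eigs = np.linalg.eigvals(T)   # for comparison only

# At least one root of p is an eigenvalue of T (here, in fact, both are).
assert any(np.isclose(r, eigs).any() for r in roots)
```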

This is a ‘dynamical’ proof: it works by considering the trajectory of $v$ under iterated application of $T$. It’s similar to the proof that every element of a finite group has finite order.

If you like that, you’ll probably like Axler’s whole approach. Here’s
some more. Given an endomorphism $S$ of a vector space $V$, its
**eventual kernel** $evKer(S)$ is the union of the subspaces
$Ker(S) \subseteq Ker(S^2) \subseteq Ker(S^3) \subseteq \cdots,$
and its **eventual image** $evIm(S)$ is the intersection of the
subspaces
$Im(S) \supseteq Im(S^2) \supseteq Im(S^3) \supseteq \cdots.$
You can show that if $V$ is finite-dimensional then
$V = evKer(S) \oplus evIm(S).$
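Here's a minimal numerical sketch of that decomposition. Both chains stabilise by the $n$-th power (where $n = \dim V$), so a single matrix power computes both eventual subspaces. The matrix $S$ below is sample data of my own: a nilpotent block plus an invertible block, so that neither $Ker(S)$ nor $Im(S)$ alone decomposes the space.

```python
import numpy as np

# Sample data: a nilpotent 2x2 block and an invertible 1x1 block.
S = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])
n = S.shape[0]

# Ker(S) ⊆ Ker(S^2) ⊆ ... and Im(S) ⊇ Im(S^2) ⊇ ... both stabilise
# by the n-th power, so S^n computes the eventual kernel and image.
Sn = np.linalg.matrix_power(S, n)

dim_evIm = np.linalg.matrix_rank(Sn)    # evIm(S) = Im(S^n)
dim_evKer = n - dim_evIm                # evKer(S) = Ker(S^n), by rank-nullity

assert (dim_evKer, dim_evIm) == (2, 1)
assert dim_evKer + dim_evIm == n        # dimensions add, as the theorem says
```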
You might think this doesn’t say anything about the familiar concerns
of linear algebra, but wait a moment.

Fix an endomorphism $T$ of $V$, a finite-dimensional complex vector
space. For $\lambda \in \mathbb{C}$, write $E_\lambda = evKer(T -
\lambda)$, and call its elements **eventual
$\lambda$-eigenvectors**. (The ‘eventual’ terminology is mine, not
Axler’s; he calls them ‘generalized eigenvectors’, which I don’t find
very evocative.) Using the results above, you can show that
$V = \bigoplus_{\lambda \in \mathbb{C}} E_\lambda.$
In particular, this implies that the dimensions of the $E_\lambda$’s
add up to $\dim V$: writing $m(\lambda) = \dim(E_\lambda)$,
$\dim V = \sum_{\lambda \in \mathbb{C}} m(\lambda).$
It follows that $m(\lambda)$ is nonzero for only finitely many values
of $\lambda$. But $m(\lambda)$ is nonzero if and only if $evKer(T -
\lambda)$ is nontrivial, which happens if and only if $\lambda$ is an
eigenvalue. So $T$ has only finitely many eigenvalues!

I’m omitting the proofs, but it’s as easy to find them yourself as to look them up.

In fact, $m(\lambda)$ is the mysterious quantity known as the **algebraic
multiplicity** of $\lambda$. It’s usually defined as the power of
$(x - \lambda)$ appearing in the characteristic polynomial
$\chi_T(x)$, which doesn’t give much of an idea of what it really
*is*.

Well, it’s not mysterious any more! The algebraic multiplicity of $\lambda$ is simply the dimension of $evKer(T - \lambda)$, the space of eventual $\lambda$-eigenvectors. Clearly it’s at least as big as the dimension of $Ker(T - \lambda)$, the space of ordinary $\lambda$-eigenvectors. This inequality is usually phrased as $\text{algebraic multiplicity} \geq \text{geometric multiplicity}.$
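Both multiplicities can be computed directly as dimensions, with no characteristic polynomial in sight. A sketch, using a single $2 \times 2$ Jordan block as sample data (my choice, not Axler's): the ordinary eigenspace is one-dimensional, but the eventual one fills the whole space.

```python
import numpy as np

# Sample data: a single 2x2 Jordan block for λ = 5.
T = np.array([[5.0, 1.0],
              [0.0, 5.0]])
n, lam = T.shape[0], 5.0
S = T - lam * np.eye(n)

# Geometric multiplicity: dim Ker(T - λ), via rank-nullity.
geometric = n - np.linalg.matrix_rank(S)

# Algebraic multiplicity: dim evKer(T - λ) = dim Ker((T - λ)^n).
algebraic = n - np.linalg.matrix_rank(np.linalg.matrix_power(S, n))

assert (geometric, algebraic) == (1, 2)
assert geometric <= algebraic    # the inequality from the text
```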

If you already know about Jordan normal form, you can see that the two descriptions of algebraic multiplicity are equivalent. In fact, you come naturally to Jordan normal form if you walk just a little further along the Axlerian path.

Despite Axler’s anti-determinant rhetoric, this approach does in fact give a very nice way into determinants. You simply define $\det T = \prod_{\lambda \in \mathbb{C}} \lambda^{m(\lambda)}.$ (Look, no matrices!) Similarly, the characteristic polynomial is $\chi_T(x) = \prod_{\lambda \in \mathbb{C}} (x - \lambda)^{m(\lambda)},$ and the Cayley-Hamilton Theorem becomes a pushover: $T - \lambda$ is nilpotent on the $m(\lambda)$-dimensional space $E_\lambda$, so the factor $(T - \lambda)^{m(\lambda)}$ kills $E_\lambda$, and hence $\chi_T(T)$ kills all of $V$.
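One can check numerically that this definition agrees with the usual determinant. A sketch on a sample $3 \times 3$ matrix of my own choosing, computing each $m(\lambda)$ as $\dim evKer(T - \lambda)$ via ranks (numpy's `eigvals` is used only to enumerate the candidate $\lambda$'s):

```python
import numpy as np

# Sample data: eigenvalue 5 with algebraic multiplicity 2, eigenvalue 3
# with multiplicity 1, so the product formula gives 5^2 · 3 = 75.
T = np.array([[5.0, 1.0, 0.0],
              [0.0, 5.0, 0.0],
              [0.0, 0.0, 3.0]])
n = T.shape[0]

det_product = 1.0
for lam in np.unique(np.linalg.eigvals(T).round(8)):
    # m(λ) = dim evKer(T - λ) = dim Ker((T - λ)^n)
    S_n = np.linalg.matrix_power(T - lam * np.eye(n), n)
    m = n - np.linalg.matrix_rank(S_n)
    det_product *= lam ** m

assert np.isclose(det_product, np.linalg.det(T))
```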

Easy exercise: find a similar formula for trace.

*Dynamics in a Finite-Dimensional Vector Space*

A dynamical system is a system that changes through time. The central questions of dynamics are things like: what do the trajectories/orbits look like? Are there fixed points? Are there cycles? Do all trajectories ‘settle down’ or can there be chaos? If you start with points close together, do they necessarily exhibit similar long-term behaviour?

I want to talk about dynamical systems in some generality - so much
generality that it won’t be clear what all of these questions mean.
This *will* take us back to *Linear Algebra Done Right*!
I’ll stick to discrete-time dynamical systems, where time is modelled
on the integers, not the real numbers.

Sloganeering:

> A dynamical system is an endomorphism.

Thus, we take a point in whatever space the endomorphism is defined on, repeatedly apply the endomorphism to it, and ask what the trajectory might look like.

For example, I have before me a paper that defines an ‘abstract dynamical system’ as a continuous map from a metric space to itself. (Here at the $n$-Category Café, we hardly notice the word ‘abstract’ tacked onto the front.)

For another example, complex dynamics is about what happens when you iterate a rational function (over $\mathbb{C}$), or more generally, what happens when you iterate a holomorphic self-map of a Riemann surface. This is ‘more generally’ because a rational function is the same thing as a holomorphic self-map of the Riemann sphere $\mathbb{C} \cup \{\infty\}$. Here we meet Julia sets, the Mandelbrot set, etc.

For another example, you might call *combinatorial dynamics* the
study of iterating a function from a finite set to itself. There’s a
surprising amount to be said here, and it’s curiously absent from
common knowledge. I first noticed this when reading Milnor’s book
*Dynamics in One Complex Variable*,
where now and again he needs
some chewy little result in combinatorial dynamics. For instance,
did you know that for any endomorphism $f$ of a finite set, the set
$\{ f, f^2, f^3, \ldots \}$
contains precisely one idempotent? It’s not especially hard to prove;
in fact, it’s among the most basic results in combinatorial
dynamics. But what’s interesting is that so few people just know it
off the top of their heads. Contrast the vast amount we all know
about *permutations* of finite sets.
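For the sceptical, the idempotent fact is easy to check by brute force. A sketch in Python, representing a map $\{0,\ldots,n-1\} \to \{0,\ldots,n-1\}$ as the tuple of its values (the helper names are my own):

```python
import random

def compose(f, g):
    """(f ∘ g)(x) = f(g(x)), for maps stored as tuples of values."""
    return tuple(f[x] for x in g)

def iterates(f):
    """The distinct iterates f, f^2, f^3, ...; on a finite set the
    sequence must eventually repeat, so this loop terminates."""
    seen, g = [], f
    while g not in seen:
        seen.append(g)
        g = compose(f, g)
    return seen

random.seed(0)
n = 6
for _ in range(200):
    # A random endomorphism of a 6-element set.
    f = tuple(random.randrange(n) for _ in range(n))
    idempotents = [g for g in iterates(f) if compose(g, g) == g]
    assert len(idempotents) == 1   # exactly one idempotent among the iterates
```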

And for another example, take dynamics of a vector space! This is
what Axler is really doing. You can see from the stuff above that
we’re concerned with *iterated* applications of the endomorphism
$T$, and *eventual* behaviour of a vector $v$. That’s the
changed point of view that allows him to do without determinants.

For instance, take the proof that every linear endomorphism has an eigenvalue. The first step is to show that every trajectory $v, T(v), T^2(v), T^3(v), \ldots$ eventually ‘repeats itself’, in the weak sense that after a finite number of steps, all the points we obtain are in the linear span of those listed previously. This is a rough analogue of the topological statement ‘every trajectory is attracted to a cycle’.
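That first step is easy to watch happening numerically. In this sketch (random sample data of my own), the ranks of the Krylov matrices $[v, Tv, \ldots, T^k v]$ grow for a while and then freeze: once the trajectory ‘repeats itself’, it stays in the span forever.

```python
import numpy as np

# Random sample data.
rng = np.random.default_rng(1)
n = 4
T = rng.standard_normal((n, n))
v = rng.standard_normal(n)

# Track the rank of [v, Tv, ..., T^k v] as the trajectory grows.
cols, ranks = [v], []
for _ in range(2 * n):
    ranks.append(np.linalg.matrix_rank(np.column_stack(cols)))
    cols.append(T @ cols[-1])

# The ranks never decrease, and once they peak they stay put:
# the trajectory has settled into a T-invariant subspace.
assert all(a <= b for a, b in zip(ranks, ranks[1:]))
k = ranks.index(max(ranks))
assert all(r == ranks[k] for r in ranks[k:])
```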

I’d like to know whether it’s fruitful to take other standard statements in dynamics and formulate linear-algebraic analogues. In other words, can the analogy between dynamical systems and Axlerian linear algebra be pushed further?

## Re: Linear Algebra Done Right

I’m really torn about Axler. My first course at the University of Maryland was the upper-level linear algebra, and we used Axler. Overall, I’d say that the theme is to resist using a basis, which I wholly support. The problem then is how to do determinants without a basis, and I really dislike how Axler ends up handling them. The “right” way (in my opinion) is to look at the top degree of the exterior algebra rather than to introduce the characteristic polynomial and then back into determinants, but this is pretty sophisticated even for senior math majors. In fact, Dr. Adams tacked on a section from Lang’s *Algebra* at the end of the course to cover this approach and promptly lost all but three of us.