The Jordan normal form theorem states that every square matrix is similar to a Jordan normal form matrix, one of the form
$$J = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_p \end{pmatrix},$$
where each of the $J_i$ is square, of the form
$$J_i = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix}.$$
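For concreteness, a small illustrative example of such a matrix, with the eigenvalue $2$ split across two blocks and a simple eigenvalue $3$, would be
$$J = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}, \qquad J_1 = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\quad J_2 = \begin{pmatrix} 2 \end{pmatrix},\quad J_3 = \begin{pmatrix} 3 \end{pmatrix}.$$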
This is constructed via generalized eigenvectors. One can observe that each block corresponds to an invariant subspace, and that the generalized eigenvectors (of a matrix) form a set of chains, each of which spans its own invariant subspace.
We let $T$ be any linear transformation from $V$ to $V$, where $V$ is of course a finite-dimensional vector space.
It is more common knowledge that $\ker\,(T - \lambda I)$ is the mechanism used to solve for eigenvectors. Let us first observe that $v \in \ker\,(T - \lambda I)$ is such that also $Tv \in \ker\,(T - \lambda I)$, since $T$ commutes with $T - \lambda I$, and that this extends to $\ker\,(T - \lambda I)^k$ for any natural number $k$. This gives us a way to identify larger invariant subspaces.
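Spelled out in one line: if $(T - \lambda I)^k v = 0$, then
$$(T - \lambda I)^k (Tv) = T\,(T - \lambda I)^k v = T\,0 = 0,$$
so $Tv$ again lies in $\ker\,(T - \lambda I)^k$, i.e., each such kernel is an invariant subspace.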
Let $K_k = \ker\,(T - \lambda I)^k$. That $K_1 \subseteq K_2 \subseteq \cdots$ is obvious, and there will be some, and hence a smallest, $m$ at which $K_m = K_{m+1}$. Afterwards, the $K_k$ for $k \geq m$ must all be equal. If not, there would be in the intermediary, on iterating $T - \lambda I$ against a vector from a later kernel, an element of $K_{m+1}$ outside of $K_m$, from which $K_m = K_{m+1}$ is contradicted.
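Spelled out: if $K_m = K_{m+1}$ and $v \in K_{m+2}$, then $(T - \lambda I)v \in K_{m+1} = K_m$, so $v \in K_{m+1}$; hence $K_{m+2} = K_{m+1}$, and the same argument climbs to every higher power.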
Next, we observe that $K_m \cap \operatorname{im}\,(T - \lambda I)^m = \{0\}$. Suppose not. Then, we would have some nonzero $v = (T - \lambda I)^m w$ in $K_m$, so that $w$ is in $K_{2m}$ but not in $K_m$, contradicting the equality of the kernels from $K_m$ onward.
The rank–nullity theorem says that the remainder after pulling out $K_m = \ker\,(T - \lambda I)^m$ for some eigenvalue $\lambda$ is $\operatorname{im}\,(T - \lambda I)^m$, which is again invariant under $T$. We can run the same algorithm on that for another eigenvalue, so this is resolved by induction.
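In symbols, writing $W = \operatorname{im}\,(T - \lambda I)^m$: rank–nullity gives $\dim K_m + \dim W = \dim V$, and $K_m \cap W = \{0\}$ by the observation above, so
$$V = K_m \oplus W,$$
with $W$ invariant under $T$ because $T$ commutes with $(T - \lambda I)^m$.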
The result is that if $T$ has distinct eigenvalues $\lambda_1, \ldots, \lambda_r$, there are $m_1, \ldots, m_r$ such that the domain of $T$ is
$$V = \ker\,(T - \lambda_1 I)^{m_1} \oplus \cdots \oplus \ker\,(T - \lambda_r I)^{m_r}.$$
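As a quick computational check of this decomposition, here is a minimal sympy sketch; the matrix is just a small example of my own, chosen so that the eigenvalue $2$ has algebraic multiplicity $2$ but geometric multiplicity $1$:

```python
from sympy import Matrix, eye

# Illustrative matrix: eigenvalue 2 (algebraic multiplicity 2, geometric 1)
# and a simple eigenvalue 3.
A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])

K1 = (A - 2*eye(3)).nullspace()        # eigenvectors for 2: 1-dimensional
K2 = ((A - 2*eye(3))**2).nullspace()   # generalized eigenvectors for 2: 2-dimensional
L1 = (A - 3*eye(3)).nullspace()        # eigenvectors for 3: 1-dimensional

# The generalized eigenspaces fill out the whole domain: 2 + 1 = 3 = dim V.
print(len(K1), len(K2), len(L1))       # 1 2 1
```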
Now does $\ker\,(T - \lambda_i I)^{m_i}$ necessarily correspond to an irreducible invariant subspace? No, as there is a difference between algebraic and geometric multiplicity.
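For instance, for a single block $J_1 = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$, the eigenvalue $\lambda$ has algebraic multiplicity $2$ but only a one-dimensional eigenvector space, and $\ker\,(J_1 - \lambda I)^2$ is the entire two-dimensional space: a single chain rather than a direct sum of two eigenvector lines.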
Now we will show, as the second part of the proof, to be invoked on the components of the direct sum decomposition from the preceding first part of the proof, that if $N$ is nilpotent, meaning that $N^m = 0$ for some $m$, then there are vectors $v_1, \ldots, v_k$ and natural numbers $m_1, \ldots, m_k$ such that
$$\{v_1, Nv_1, \ldots, N^{m_1 - 1}v_1, \; v_2, Nv_2, \ldots, N^{m_2 - 1}v_2, \; \ldots, \; v_k, Nv_k, \ldots, N^{m_k - 1}v_k\}$$
is a basis (linearly independent by definition) for the domain of $N$, with $N^{m_j}v_j = 0$ for each $j$ and $m_1 + m_2 + \cdots + m_k = \dim V$. (Note that here $m$ is the smallest exponent with $N^m = 0$, and $m = \max_j m_j$.)
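As a small instance of this statement (an example of my own choosing): for
$$N = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
we have $N^2 = 0$, and we may take $k = 2$, $v_1 = e_2$, $v_2 = e_3$, $m_1 = 2$, $m_2 = 1$: then $\{v_1, Nv_1, v_2\} = \{e_2, e_1, e_3\}$ is a basis, $N^{m_1}v_1 = N^{m_2}v_2 = 0$, and $m_1 + m_2 = 3 = \dim V$.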
For any eigenvalue, there is an eigenvector space associated with it; for our nilpotent $N$ the eigenvalue is $0$ and that space is $\ker N$. Take its preimage with respect to $N$. Do this successively until nothing remains, which will be at the $m$th iteration. In particular, take $w_1, \ldots, w_s$ to be a basis of the eigenvector space. For each one of these that has non-empty preimage, we take a preimage element with the kernel component projected out. This accumulates a set of vectors of the format specified. It has to be exhaustive with respect to the vector space under the nilpotence assumption, from which termination is also guaranteed.

It remains to show that these are linearly independent. We can, using our eigenvector space as our base case, take an inductive hypothesis where the vectors accumulated prior to the $i$th iteration are linearly independent. Now we show that the vector set remains so after adding in the ones obtained from taking the preimage. We note first that the added ones are linearly independent among themselves (if a nontrivial linear combination gave zero, applying $N$ would violate our inductive hypothesis). There is also that a nontrivial linear combination of the newly added ones cannot equal a linear combination of the rest. To show this, assume otherwise, and apply $N$ just enough times ($i$ times) for the older side to vanish. The other side should then be a nontrivial linear combination with respect to our designated basis of the eigenvector space, which cannot vanish. This concludes our construction.
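Tracing this on the small $N$ above: the eigenvector space is $\ker N = \operatorname{span}\{e_1, e_3\}$. The preimage of $e_1$ under $N$ is $e_2$ (after projecting out the kernel), while $e_3$ has empty preimage since the image of $N$ is $\operatorname{span}\{e_1\}$; at the next iteration $e_2$ has empty preimage as well, so the process terminates with the chains $\{e_1, e_2\}$ and $\{e_3\}$, exactly the basis exhibited earlier.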
Essentially we have chains (of vectors) which terminate when an element is no longer found after our preimage operation. Applying $T$ to such a chain, we see that for each element $v$ in our chain, $Tv = \lambda v + w$, where $w$ is the previous element of the chain, with $w = 0$ signifying that we are at an eigenvector (non-generalized), at the front. Ones along the superdiagonal correspond to the coefficient of $w$ above.
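Concretely, for a chain of length two with ordered basis $(w, v)$, where $w = (T - \lambda I)v$ is the eigenvector at the front: $Tw = \lambda w$ and $Tv = \lambda v + w$, so the matrix of $T$ restricted to this invariant subspace is
$$\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix},$$
which is exactly the Jordan block form we began with.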