Linear algebra: eigenvalues and some decompositions
Introduction
In this chapter, we will show various results relating the form of the minimal polynomial of a \(T \in \mc{L}(V)\) to particular decompositions of \(V\) and \(T\). A decomposition of \(T\) is a set of linear transformations \(T_1, \ldots, T_k\) such that \(T = T_1 + \cdots + T_k\) and a decomposition of \(V\) is a set of subspaces \(V_i\) of \(V\) such that \(V = V_1 + \cdots + V_k\). The forms of these turn out to be related to the set of generalized eigenvectors of \(T\).
Preliminary results regarding polynomials
Definition: Let \(R\) be a commutative ring and \(f(x) \in R[x]\). An element \(a\) of \(R\) is said to be a root (or zero) of the polynomial \(f(x)\) iff \(f(a) = 0_R\), that is, if the induced function \(f: R \to R\) maps \(a\) to \(0_R\) (Hungerford, 2013, p. 106).
Theorem (the factor theorem): Let \(F\) be a field, \(f(x) \in F[x]\), and \(a \in F\). Then \(a\) is a root of the polynomial \(f(x)\) if and only if \(x - a\) is a factor of \(f(x)\) in \(F[x]\) (Hungerford, 2013, p. 107).
Definition: A field \(K\) is algebraically closed iff every non-constant polynomial over \(K\) splits into linear (monic) factors (Evans, 1999, p. 1).
Theorem (Bezout's Lemma): Let \(F\) be a field and \(a(x), b(x) \in F[x]\), not both zero. Then there is a unique greatest common divisor \(d(x)\) of \(a(x)\) and \(b(x)\). Furthermore, there are (not necessarily unique) polynomials \(u(x)\) and \(v(x)\) such that \(d(x) = a(x) u(x) + b(x) v(x)\) (Hungerford, 2013, p. 97).
We can generalize Bezout's Lemma.
Generalization of Bezout's Lemma
We now generalize Bezout's lemma to the case of \(k\) polynomials.
Theorem: Let \(f_1(x), \ldots, f_k(x)\) have a gcd \(d(x)\). Then there exist polynomials \(s_1(x), \ldots, s_k(x)\) such that \(s_1(x) f_1(x) + \cdots + s_k(x) f_k(x) = d(x)\).
Proof:
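We use induction on \(k\); the case \(k = 2\) is Bezout's Lemma. Assume the statement of the theorem holds for \(k - 1\) polynomials. Let \(f_1(x), \ldots, f_{k-1}(x)\) have gcd \(g(x)\). Then there exist \(t_1(x), \ldots, t_{k-1}(x)\) such that
\(t_1(x) f_1(x) + \cdots + t_{k-1}(x) f_{k-1}(x) = g(x). \quad (1)\)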
Let the gcd of \(f_1(x), \ldots, f_{k}(x)\) be \(d(x)\).
[Claim: \(d(x)\) is the gcd of \(g(x)\) and \(f_k(x)\).
Proof: \(d(x)\) divides \(f_k(x)\). Since \(g(x)\) is the greatest common divisor of \(f_1(x)\), ..., \(f_{k-1}(x)\), every polynomial that divides all of those polynomials divides \(g(x)\). Therefore since \(d(x)\) divides all those polynomials, it divides \(g(x)\). It remains to show that for any \(l(x)\) that divides \(g(x)\) and \(f_k(x)\), \(l(x)\) divides \(d(x)\). Assume \(l(x)\) divides \(g(x)\) and \(f_k(x)\). Since \(l(x)\) divides \(g(x)\) and \(g(x)\) divides all of \(f_1(x)\), ..., \(f_{k-1}(x)\), \(l(x)\) divides all of \(f_1(x)\), ..., \(f_{k-1}(x)\). Thus \(l(x)\) divides all of \(f_1(x), \ldots, f_k(x)\). Since \(d(x)\) is the gcd of \(f_1(x), \ldots, f_k(x)\), \(l(x) \mid d(x)\). Therefore \(d(x)\) is the gcd of \(g(x)\) and \(f_k(x)\).]
Thus applying Bezout's Lemma to \(g(x)\) and \(f_k(x)\), there exist polynomials \(a(x)\) and \(b(x)\) such that \(a(x) g(x) + b(x) f_k(x) = d(x)\).
Substituting in (1) for \(g(x)\), we have
\(a(x) [t_1(x) f_1(x) + \cdots + t_{k-1}(x) f_{k-1}(x)] + b(x) f_k(x) = d(x)\).
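Thus we have that there exist \(s_1(x), \ldots, s_k(x)\), namely \(s_i(x) = a(x) t_i(x)\) for \(i < k\) and \(s_k(x) = b(x)\), such that
\(s_1(x) f_1(x) + \cdots + s_k(x) f_k(x) = d(x)\),
and the theorem is proven. \(\square\)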
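The induction above can be turned into a short computation. The following is a minimal sketch over \(\mb{Q}\), assuming polynomials are stored as coefficient lists with the lowest degree first; the helper names (`poly`, `ext_gcd`, `bezout_many`) and the three example polynomials are illustrative choices, not taken from the cited sources. The loop in `bezout_many` mirrors the proof: it keeps the coefficients \(t_i\) for the gcd so far, then sets \(s_i = a \cdot t_i\) and \(s_k = b\).

```python
from fractions import Fraction

def poly(*coeffs):
    """Build a polynomial from coefficients, lowest degree first."""
    return trim([Fraction(c) for c in coeffs])

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Return (quotient, remainder) with p = quotient * q + remainder; q must be nonzero."""
    quot, rem = [], trim(p)
    while len(rem) >= len(q):
        term = [Fraction(0)] * (len(rem) - len(q)) + [rem[-1] / q[-1]]
        quot = add(quot, term)
        rem = add(rem, mul([-c for c in term], q))
    return quot, rem

def ext_gcd(a, b):
    """Extended Euclid: return (d, u, v) with d the monic gcd and d = a*u + b*v."""
    r0, r1 = trim(a), trim(b)
    u0, u1 = [Fraction(1)], []
    v0, v1 = [], [Fraction(1)]
    while r1:
        quo, rem = divmod_poly(r0, r1)
        neg = [-c for c in quo]
        r0, r1 = r1, rem
        u0, u1 = u1, add(u0, mul(neg, u1))
        v0, v1 = v1, add(v0, mul(neg, v1))
    scale = [1 / r0[-1]]  # normalise the gcd to be monic
    return mul(r0, scale), mul(u0, scale), mul(v0, scale)

def bezout_many(fs):
    """Return (d, [s_1, ..., s_k]) with d = s_1 f_1 + ... + s_k f_k a gcd of fs."""
    d, coeffs = trim(fs[0]), [[Fraction(1)]]
    for f in fs[1:]:
        d, a, b = ext_gcd(d, f)                      # d = a * (old gcd) + b * f
        coeffs = [mul(a, t) for t in coeffs] + [b]   # s_i = a * t_i, s_k = b
    return d, coeffs

# Example: (x-2)(x-3), (x-1)(x-3), (x-1)(x-2) have no common root, so their gcd is 1.
fs = [poly(6, -5, 1), poly(3, -4, 1), poly(2, -3, 1)]
d, coeffs = bezout_many(fs)
total = []
for s, f in zip(coeffs, fs):
    total = add(total, mul(s, f))
```

Here `total` recombines \(s_1 f_1 + s_2 f_2 + s_3 f_3\) and should equal `d`. Exact rational arithmetic via `Fraction` is used so that the Euclidean remainders vanish exactly rather than up to rounding.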
Notes on the polynomial of a linear transformation
Note on the identity transform appearing in place of 1 in a polynomial of a transformation
Note that \(1(x) = 1\), i.e. \(1(x) = a_0 x^0 + a_1 x^1 + \cdots\), where \(a_0 = 1\) and \(\forall\, i \geq 1 \quad a_i = 0\).
Using this, we have
\(1(T) = a_0 T^0 + 0 = a_0 I = 1 \cdot I = I\).
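To make the substitution \(x \mapsto T\) concrete, here is a small sketch that evaluates a polynomial at a matrix by Horner's rule. The function name `poly_at` and the \(2 \times 2\) matrix `T` are hypothetical choices made for illustration; the constant term is added as a multiple of the identity, which is exactly the point of the note above.

```python
def mat_mul(A, B):
    """Product of two square matrices of the same size (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def poly_at(coeffs, T):
    """Evaluate a_0 I + a_1 T + ... + a_d T^d, with coeffs = [a_0, ..., a_d], by Horner's rule."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        shifted = mat_mul(result, T)
        # Adding the scalar a means adding a * I, not a to every entry.
        result = [[shifted[i][j] + (a if i == j else 0) for j in range(n)] for i in range(n)]
    return result

T = [[2, 1], [0, 3]]                  # hypothetical example operator
one_of_T = poly_at([1], T)            # the constant polynomial 1(x) = 1
annihilated = poly_at([6, -5, 1], T)  # x^2 - 5x + 6 = (x - 2)(x - 3)
```

For this `T`, `one_of_T` is the identity matrix, and `annihilated` is the zero matrix because \((x - 2)(x - 3)\) happens to annihilate this particular operator.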
Preliminary definitions and theorems
Definition (Surowski, 1997, p. 43): Let \(\mb{F}\) be a field, let \(V\) be a vector space over \(\mb{F}\), and let \(T \in \mc{L}(V)\). \(\lambda \in \mb{F}\) is an eigenvalue with eigenvector \(v\) in \(V\) iff \(v \neq 0\) and \(T(v) = \lambda v\).
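Observe that \(T(v) = \lambda v \iff (T - \lambda I)(v) = 0\), so \(\lambda\) is an eigenvalue of \(T\) if and only if \(\text{ker}(T - \lambda I) \neq \{0\}\).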
Claim: Let \(A \in M_n(\mb{F})\). If \(\text{dim}(V) = n\), then \(\text{det}(xI - A)\) is a polynomial of degree \(n\).
Proof:
Let \((A)_{ij} = a_{ij}\) and write \(b_{ij} := (xI - A)_{ij}\). First note that
\(b_{ij} = x \delta_{ij} - a_{ij}\), where \(\delta_{ij} = 1\) if \(i = j\) and \(\delta_{ij} = 0\) otherwise.
Let \(S_n\) be the set of all permutations on \([n]\). For a permutation \(\sigma \in S_n\), let \(\varepsilon(\sigma)\) be defined as in Proposition 6.2 (Lang, 1987, p. 166).
Applying Theorem 7.2 of Lang (1987, p. 171, Chapter VI, Section 7),
\(\text{det}(xI - A) = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n b_{\sigma(i), i}\).
Let \(\sigma \in S_n\). If \(\sigma(i) = i\), \(b_{\sigma(i), i} = x - a_{ii}\). If \(\sigma(i) \neq i\), \(b_{\sigma(i), i} = - a_{\sigma(i), i}\).
So for every \(\sigma \in S_n\), the term \(\varepsilon(\sigma) \cdot \prod_{i=1}^n b_{\sigma(i), i}\) is either zero or a polynomial in \(x\) of degree at most \(n\).
Note that when \(\sigma = \text{id}\), the term is \(\prod_{i=1}^n (x - a_{ii})\), which has degree \(n\) and leading coefficient \(1\), and when \(\sigma \neq \text{id}\), \(\sigma\) moves at least two indices, so the term is zero or has degree at most \(n - 2 < n\).
Therefore the term \(x^n\) is not cancelled out in the summation by any other term.
Therefore, \(\text{det}(xI - A)\) is a polynomial of degree \(n\). \(\square\)
Definition: The characteristic polynomial of \(T \in \mc{L}(V)\) with respect to a basis \(\mc{A}\) is \(c_T(x) = \text{det}([T]_{\mc{A}} - x I)\).
Claim: The characteristic polynomial of \(T\) is the same for any basis.
Proof: This proof is from Lang, 1987, p. 205. First note that
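\([T]_{\mc{B}} = [\text{id}]_{\mc{B}\mc{A}} [T]_{\mc{A}} [\text{id}]_{\mc{A}\mc{B}} = P^{-1} [T]_{\mc{A}} P\), where \(P := [\text{id}]_{\mc{A}\mc{B}}\).
By the properties of the determinant,
\(\text{det}([T]_{\mc{B}} - xI) = \text{det}(P^{-1} ([T]_{\mc{A}} - xI) P) = \text{det}(P^{-1}) \, \text{det}([T]_{\mc{A}} - xI) \, \text{det}(P) = \text{det}([T]_{\mc{A}} - xI)\).
\(\square\)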
Theorem: \(\alpha\) is a root of \(c_T(x)\) if and only if \(\alpha\) is an eigenvalue of \(T\).
Proof: By the factor theorem, a factor \((x - \alpha)^f\) (for some \(f \in \mb{N}\)) appears in \(c_T(x)\) if and only if \(\alpha\) is a root of \(c_T(x)\), which occurs iff \(c_T(\alpha) = \text{det}([T]_{\mc{A}} - \alpha I) = 0\), which occurs iff \(T - \alpha I\) is not invertible, that is, if and only if \(\alpha\) is an eigenvalue of \(T\). \(\square\)
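As a small worked instance of this theorem, suppose \([T]_{\mc{A}}\) is the matrix \(A\) below. The matrix is a hypothetical example chosen for illustration and does not come from the cited sources. The roots of \(c_T(x)\) are \(2\) and \(3\), and each one has an eigenvector that can be checked directly:

```latex
\[
A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}, \qquad
c_T(x) = \det(A - xI) = (2 - x)(3 - x) = x^2 - 5x + 6 .
\]
\[
A \begin{pmatrix} 1 \\ 0 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
A \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ 1 \end{pmatrix} .
\]
```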
The primary decomposition theorem
The minimal polynomial of a linear transformation
Definition (Surowski, 1997, p. 55): Let \(V\) be a finite dimensional vector space over a field \(\mb{F}\). The minimal polynomial of \(T \in \mc{L}(V)\), denoted \(m_T(x)\), is the monic polynomial of least degree in the set of polynomials \(\{0 \neq g(x) \in \mb{F}[x] ;\, g(T) = 0\}\).
The minimal polynomial always exists (see Surowski, 1997, p. 59, Proposition 2.2.9).
We will now discuss the form of the minimal polynomial.
Eigenvalues and the minimal polynomial
Theorem (irreducible unique factorization of polynomials) (Surowski, 1997, p. 53): If \(f(x) \in \mb{F}[x]\) is monic and non-constant, then \(f(x)\) can be uniquely factored (up to the order of the factors) as
\(f(x) = p_1(x)^{e_1} \cdots p_r(x)^{e_r}\),
where \(p_1(x), \ldots, p_r(x)\) are distinct monic irreducible polynomials and \(e_1, \ldots, e_r\) are positive integers.
This Theorem is proved in abstract algebra.
By this theorem, we can write \(m_T(x) = p_1(x)^{e_1} \cdots p_k(x)^{e_k}\) for some \(k \in \mb{N}\), some distinct irreducible factors \(p_1, \ldots, p_k\), and some positive integers \(e_1, \ldots, e_k\).
We now make a claim about the form of some of these factors.
Claim: Let \((V, \mb{F})\) be a vector space and let \(\lambda_i\) be an eigenvalue of \(T: V \to V\). Then \(x - \lambda_i\) is one of the irreducible factors of \(m_T(x)\): after relabeling the factors, there exists some \(e_i \in \mb{N}\) such that \(p_i(x)^{e_i} = (x - \lambda_i)^{e_i}\).
Proof:
Note that for an eigenvalue \(\lambda_i\), there exists \(v_i \neq 0\) such that \(T v_i = \lambda_i v_i\). Let \(p(x) = a_0 + a_1 x + \cdots + a_d x^d \in \mb{F}[x]\). Since \(T v_i = \lambda_i v_i\), it follows that \(T^m v_i = \lambda_i^m v_i\) for all \(m \geq 0\). Therefore
\(p(T) v_i = \sum_{m=0}^{d} a_m T^m v_i = \sum_{m=0}^{d} a_m \lambda_i^m v_i = p(\lambda_i) v_i\),
so \(p(T) v_i = p(\lambda_i) v_i\).
This is true for all polynomials \(p(x)\). Since \(v_i \neq 0\), \(p(T) v_i = 0\) if and only if \(p(\lambda_i) = 0\). The monic polynomial of smallest degree for which this happens is \(x - \lambda_i\). Writing \(m_{T,v_i}(x)\) for the monic polynomial of least degree such that \(m_{T,v_i}(T) v_i = 0\), we therefore have \(m_{T,v_i}(x) = x - \lambda_i\). Since \(m_T(T) v_i = 0\), we have \(m_T(\lambda_i) = 0\), so by the factor theorem \(m_{T,v_i}(x) \mid m_T(x)\), and \(x - \lambda_i\) appears to some power in \(m_T(x)\). \(\square\)
Claim: Every root of \(m_T(x)\) is an eigenvalue of \(T\).
Proof: By the factor theorem, \(a\) is a root of \(m_T(x)\) if and only if \(x - a\) is a factor of \(m_T(x)\), so \(m_T(x) = (x - a) p(x)\) for some nonzero \(p(x)\) with \(\text{deg}(p) < \text{deg}(m_T)\). We have \(m_T(T) = (T - aI) p(T) = 0\). If \((T - aI)^{-1}\) existed, then we could multiply both sides of \((T - aI) p(T) = 0\) by \((T - aI)^{-1}\), and we would have \(p(T) = 0\) with \(\text{deg}(p) < \text{deg}(m_T)\), contradicting the minimality of \(m_T\). Therefore \(T - aI\) is not invertible, so \(\text{det}(T - aI) = 0\). Therefore \(a\) is an eigenvalue of \(T\). \(\square\)
We note that if \(\mb{F}\) is not algebraically closed, some irreducible factors may have degree greater than 1. The classic example is \(x^2 + 1\) in \(\mb{R}[x]\).
Summing up: for \((V, \mb{F})\) and \(T: V \to V\), \(m_T(x) = p_1(x)^{e_1} \cdots p_r(x)^{e_r}\), and every linear factor \(p_i(x)^{e_i} = (x - a_i)^{e_i}\) has \(a_i\) as an eigenvalue of \(T\). If \(\mb{F}\) is not algebraically closed, some factors may not be linear.
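A worked example may help separate the minimal polynomial from the characteristic polynomial. The matrix \(A\) below is a hypothetical choice for \([T]_{\mc{A}}\), made only for illustration. Its eigenvalues are \(2\) and \(3\), so \(m_T(x)\) must contain both \(x - 2\) and \(x - 3\). The product \((x - 2)(x - 3)\) does not annihilate \(A\), while \((x - 2)^2 (x - 3)\) does:

```latex
\[
A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \qquad
(A - 2I)(A - 3I) = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \neq 0, \qquad
(A - 2I)^2 (A - 3I) = 0 .
\]
\[
m_T(x) = (x - 2)^2 (x - 3), \qquad e_1 = 2, \quad e_2 = 1 .
\]
```

Since \(m_T(x)\) divides every polynomial that annihilates \(T\), it divides \((x - 2)^2 (x - 3)\); the first computation rules out the only smaller candidate with both roots.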
Defining subspaces
We will now define subspaces \(V_1, \ldots, V_k\) such that \(V = V_1 \oplus \cdots \oplus V_k\) and which will have nice properties relative to \(T\) and some idempotents \(P_1, \ldots, P_k\) we will define.
For all \(i \in [k]\), define \(V_i := \text{ker } p_i(T)^{e_i}\).
Claim: \(T(V_i) \subseteq V_i\).
Proof: Let \(x \in V_i\). Since \(T\) commutes with every polynomial in \(T\), \(p_i(T)^{e_i} Tx = T p_i(T)^{e_i} x = T(0) = 0\), so \(Tx \in V_i\). Therefore \(T(V_i) \subseteq V_i\). \(\square\)
Claim: \(V = V_1 + \cdots + V_k\)
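Proof: Define, for each \(i \in [k]\),
\(q_i(x) := m_T(x) / p_i(x)^{e_i} = \prod_{j; j \neq i} p_j(x)^{e_j}\).
The polynomials \(q_1(x), \ldots, q_k(x)\) have no common irreducible factor, so their gcd is \(1\), and by the generalization of Bezout's lemma there exist polynomials \(s_1(x), \ldots, s_k(x)\) such that \(\sum_{i=1}^{k} s_i(x) q_i(x) = 1\).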
Let \(v \in V\). Observe that \(p_i(T)^{e_i} q_i(T) v = m_T(T) v = 0\) since \(m_T(T) = 0\). Therefore \(q_i(T)v \in \text{ker } p_i(T)^{e_i}\).
Note that \(v = I_V(v) = \sum_{i=1}^{k} s_i(T) q_i(T) v\). Since \(q_i(T) v \in V_i\) and \(V_i\) is \(T\)-invariant, we have \(s_i(T) q_i(T) v \in V_i\). Thus
\(v \in V_1 + \cdots + V_k\).
Recall that for subspaces \(U, W \subseteq V\),
\(U + W := \{u + w : u \in U, w \in W\}\).
We have \(U + W \subseteq V\), and \(U + W\) is a subspace of \(V\). Note that if \(\forall\, v \in V \quad \exists\, u \in U, w \in W \quad v = u + w\), then \(V \subseteq U + W\) and we have \(V = U + W\).
Extending this definition to \(k\) subspaces, we see that \(V = V_1 + \cdots + V_k\). \(\square\)
Claim: \(V = V_1 \oplus \cdots \oplus V_k\).
Proof: See Surowski (1997, p. 64).
One detail in that proof is:
[Mini-result: Let \(w_i \in V_i\) for each \(i \in [k]\) be such that \(w_1 + \cdots + w_k = 0\). Then for all \(i \in [k]\) we have \(w_i = 0\).
Proof: Fix \(i\) and let \(j \neq i\). Then \(w_j \in V_j = \text{ker } p_j(T)^{e_j}\) and \(p_j(x)^{e_j}\) is a factor of \(q_i(x)\), so \(q_i(T) w_j = (\text{other terms}) \cdot p_j(T)^{e_j} w_j = 0\).
Therefore
\(0 = q_i(T)(w_1 + \cdots + w_k) = q_i(T) w_i\).
In the proof, Surowski showed that \(w_i = t(T) q_i(T) w_i\), so it follows that \(w_i = 0\).]
Let \(v = v_1 + \cdots + v_k\) and also \(v = v_1^\prime + \cdots + v_k^\prime\), with \(v_i, v_i^\prime \in V_i\) for all \(i\). Then
\(0 = v - v = (v_1 - v_1^\prime) + \cdots + (v_k - v_k^\prime)\).
By the above mini-result, we have for all \(i\), \(v_i - v_i^\prime = 0\), that is, \(v_i = v_i^\prime\), so the representation is unique. Therefore \(V = V_1 \oplus \cdots \oplus V_k\). \(\square\)
Defining idempotents
For all \(i \in [k]\), define \(P_i := s_i(T) q_i(T)\)
Claim: \(P_i\) is an idempotent, and for all \(i \neq j\), \(P_i P_j = 0\).
Proof:
Since \(\sum_{i=1}^k s_i(x) q_i(x) = 1\), the definition of \(P_i\) gives \(P_1 + \cdots + P_k = I_V\).
Let \(i \neq j\). Consider \(P_i P_j = s_i(T) q_i(T) s_j(T) q_j(T)\).
For all \(i \neq j\), \(q_i(x) q_j(x)\) has all factors of \(m_T(x)\), so \(m_T(x) \mid s_i(x) q_i(x) s_j(x) q_j(x)\), so since \(m_T(T) = 0\) and \(m_T(T)\) appears in \(s_i(T) q_i(T) s_j(T) q_j(T)\), we have \(P_i P_j = 0\).
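Note that
\(P_i = P_i I_V = P_i (P_1 + \cdots + P_k) = P_i^2 + \sum_{j; j \neq i} P_i P_j = P_i^2\).
Therefore for all \(i\), \(P_i\) is an idempotent. \(\square\)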
Claim: \(P_i(V)\) is \(T\)-invariant.
Proof: Let \(x \in P_i V\). Therefore \(\exists\, u \in V \quad x = P_i u\). Thus \(Tx = T P_i u = P_i Tu\). Since \(Tu \in V\), we have \(Tx \in P_i(V)\).
Therefore \(T(P_i(V)) \subseteq P_i(V)\).
Thus, \(P_i(V)\) is \(T\)-invariant.
Claim: \(P_1 V + \cdots + P_k V = V\)
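Proof: Let \(v \in V\). Since \(P_1 + \cdots + P_k = I_V\),
\(v = I_V v = P_1 v + \cdots + P_k v \in P_1 V + \cdots + P_k V\).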
Therefore \(V = P_1 V + \cdots + P_k V\).
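Claim: For all \(i \in [k]\), \(P_i V \cap \sum_{j; j \neq i} P_j V = \{0\}\).
Proof:
Assume \(v \in P_i V \cap \sum_{j; j \neq i} P_j V\). Then of course \(v \in \sum_{j; j \neq i} P_j(V)\). Let \(U = P_1 V + \cdots + P_{i-1} V + P_{i+1} V + \cdots + P_k V\). Then \(v \in U\) implies
\(v = \sum_{j; j \neq i} x_j\) with \(x_j \in P_j V\).
Then note that for each \(j \in \{1, \ldots, i-1, i+1, \ldots, k\}\) we have \(\exists\, u_j \in V \quad x_j = P_j u_j\).
Also, since \(v \in P_i V\), there exists \(w \in V\) with \(v = P_i w\), so \(P_i v = P_i^2 w = P_i w = v\). Therefore
\(v = P_i v = \sum_{j; j \neq i} P_i P_j u_j = 0\),
and the statement is proved.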
\(\square\)
Claim: \(v \in P_i V \implies v = P_i(v)\).
Proof: \(v \in P_i V\) so \(\exists\, w \in V \quad v = P_i w\).
\(P_i(v) = P_i(P_i w) = P_i^2(w) = P_i(w)\)
\(\therefore\, v = P_i(w) = P_i(v)\).
\(\square\)
Result: \(V = P_1 V \oplus \cdots \oplus P_k V\).
Proof: Surowski (1997, p. 65-66).
Result: \(\forall i \in [k] \quad P_i V = \text{ker } p_i(T)^{e_i}\).
Proof: Surowski (1997, p. 65-66).
Stating the primary decomposition theorem
Theorem (Surowski, 1997, p. 63-66): Let \(T \in \mc{L}(V)\), \(\text{dim }V = n\), \(m_T(x) = p_1(x)^{e_1} \cdots p_k(x)^{e_k}\) where \(k \leq n\) and \(p_1(x), \ldots, p_k(x)\) are distinct irreducible polynomials in \(\mb{F}[x]\). For each \(i \in [k]\) set \(V_i := \text{ker } p_i(T)^{e_i}\). Then each \(V_i\) is \(T\)-invariant and \(V = V_1 \oplus \cdots \oplus V_k\).
Furthermore, for all \(i \in [k]\) let \(q_i(x) := m_T(x) / p_i(x)^{e_i}\). Applying the generalization of Bezout's lemma yields polynomials \(s_1, \ldots, s_k\) such that \(\sum_{i=1}^k s_i(x) q_i(x) = 1\), and the operators \(P_i := s_i(T) q_i(T)\) are orthogonal idempotents commuting with \(T\) such that \(V = P_1 V \oplus \cdots \oplus P_k V\) and \(P_i V = \text{ker } p_i(T)^{e_i}\).
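The statements of the theorem can be checked numerically on a small example. The sketch below assumes a hypothetical \(3 \times 3\) matrix `A` with \(m_T(x) = (x - 2)^2 (x - 3)\), so that \(q_1(x) = x - 3\) and \(q_2(x) = (x - 2)^2\); the Bezout coefficients \(s_1(x) = -(x - 1)\) and \(s_2(x) = 1\) were found by hand and are specific to this example. All helper names are illustrative.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Product of two square matrices of the same size (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def shift(A, c):
    """Return A - c I."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

def negate(A):
    return [[-x for x in row] for row in A]

# Hypothetical example with m_T(x) = (x - 2)^2 (x - 3).
A = [[2, 1, 0], [0, 2, 0], [0, 0, 3]]

# Bezout identity for q_1 = x - 3 and q_2 = (x - 2)^2:
#   -(x - 1)(x - 3) + 1 * (x - 2)^2 = 1.
P1 = negate(mat_mul(shift(A, 1), shift(A, 3)))  # s_1(A) q_1(A)
P2 = mat_mul(shift(A, 2), shift(A, 2))          # s_2(A) q_2(A)

I3 = identity(3)
zero = [[0] * 3 for _ in range(3)]
checks = {
    "sum is identity": [[P1[i][j] + P2[i][j] for j in range(3)] for i in range(3)] == I3,
    "P1 idempotent": mat_mul(P1, P1) == P1,
    "P2 idempotent": mat_mul(P2, P2) == P2,
    "orthogonal": mat_mul(P1, P2) == zero and mat_mul(P2, P1) == zero,
    "commute with A": mat_mul(A, P1) == mat_mul(P1, A) and mat_mul(A, P2) == mat_mul(P2, A),
    "P1 V inside ker (A - 2I)^2": mat_mul(mat_mul(shift(A, 2), shift(A, 2)), P1) == zero,
    "P2 V inside ker (A - 3I)": mat_mul(shift(A, 3), P2) == zero,
}
```

In this example `P1` projects onto the span of the first two standard basis vectors and `P2` onto the span of the third, which matches \(V_1 = \text{ker }(T - 2I)^2\) and \(V_2 = \text{ker }(T - 3I)\). The last two checks only confirm the inclusions \(P_i V \subseteq \text{ker } p_i(T)^{e_i}\); equality is part of the theorem.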
Triangularization of operators
For various reasons, we seek a basis \(\mc{A}\) such that \([T]_{\mc{A}}\) has a "nice" form. This comes from finding bases for each of \(V_i\).
In the case where \(\mb{F}\) is not algebraically closed, we may have \(V_i\) that look like, for example, \(\text{ker } ((T^2 + I)^2)\). Since the irreducible factor \(x^2 + 1\) is not linear, this has nothing directly to do with eigenvalues. Even when \(\mb{F}\) is not algebraically closed, it is possible to find a basis leading to \([T]_{\mc{A}}\) being in what is called Frobenius normal form. We will not discuss this here.
We assume from this point on that \(\mb{F}\) is algebraically closed. We already know that (1) for every eigenvalue \(\lambda_i\) of \(T\) there is a factor \(p_i(x) = (x - \lambda_i)\) of \(m_T(x)\), (2) every root of \(m_T(x)\) is an eigenvalue, and (3) every root appears as a linear factor.
\(\mb{F}\) being algebraically closed means that every irreducible factor is linear. Thus every \(V_i\) looks something like \(V_i = \text{ker }(T - 3I)^2\), where \(p_i(x)^{e_i} = (x - \lambda_i)^{e_i}\) with \(\lambda_i = 3\) and \(e_i = 2\). We write \(v_i\) for an eigenvector corresponding to \(\lambda_i\).
Definition: For an algebraically closed field, if \(e_i = 1\) we call \(V_i\) the eigenspace of \(T\) corresponding to \(\lambda_i\). If \(e_i \geq 1\) we call \(V_i\) the generalized eigenspace of \(T\) corresponding to \(\lambda_i\) (Axler, 2024, p. 164, p. 308).
What are the dimensions of these generalized eigenspaces?
Theorem: The exponent of each \(x - \lambda_i\) in \(c_T(x)\) equals the dimension of the generalized eigenspace \(V_i\).
Proof:
First note that \(\sum_{i=1}^k \text{dim } V_i = \text{dim } V = n\). Denote \(T_i = T|_{V_i}\). Since \(V_i\) is \(T\)-invariant, \(T_i: V_i \to V_i\). By the definition of \(V_i\), for all \(v \in V_i\) we have \((T_i - \lambda_i I)^{e_i} v = 0\), thus \(T_i - \lambda_i I\) is nilpotent (definition: an operator \(T\) is nilpotent iff some power of it equals zero, see Axler 2024, p. 303). A nilpotent operator has only one eigenvalue: \(0\) (Axler, 2024, p. 304), so \(T_i - \lambda_i I\) has only one eigenvalue, \(0\). This holds iff the only eigenvalue of \(T_i\) is \(\lambda_i\). The characteristic polynomial of \(T_i\) is therefore \(c_{T_i}(x) = (x - \lambda_i)^{g_i}\) for some \(g_i \in \mb{N}\). Recall that the degree of the characteristic polynomial of a linear operator is the dimension of the vector space on which it operates. Therefore \(g_i = \text{dim } V_i\). We can perform this analysis for all \(i \in [k]\).
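For each \(i \in [k]\), choose a basis \(\mc{A}_i\) for \(V_i\). Consider \([T]_{\mc{A}}\) where \(\mc{A} = \mc{A}_1 \cup \cdots \cup \mc{A}_k\) and where the order is preserved. Then \([T]_{\mc{A}} = \text{diag}(A_1, \ldots, A_k)\) where \(A_i = [T_i]_{\mc{A}_i}\) are square matrices, i.e. \([T]_{\mc{A}}\) is block diagonal. By a result from Horn and Johnson (1985, p. 24) on determinants of block diagonal matrices,
\(c_T(x) = \text{det}([T]_{\mc{A}} - xI) = \prod_{i=1}^k \text{det}(A_i - xI) = \prod_{i=1}^k c_{T_i}(x)\).
Since the \(\lambda_i\) are distinct and \(c_{T_i}(x)\) is, up to sign, \((x - \lambda_i)^{g_i}\) with \(g_i = \text{dim } V_i\), the exponent of each \(x - \lambda_i\) in \(c_T(x)\) equals the dimension of the generalized eigenspace \(V_i\). \(\square\)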
The form of the matrix representation
We continue under our assumption that \(\mb{F}\) is algebraically closed. We note from the above proof of the theorem regarding dimensions of generalized eigenspaces that there was a list of ordered bases such that their "ordered union" \(\mc{A}\) resulted in a block-diagonal \([T]_{\mc{A}}\). In fact, we have the following Theorem:
Theorem: For \(\mb{F}\) algebraically closed, \(V\) a vector space over \(\mb{F}\) with \(\text{dim } V = n\), and \(T \in \mc{L}(V)\), there exists an ordered basis \(\mc{A} = (v_1, \ldots, v_n)\) such that \([T]_{\mc{A}}\) is upper-triangular.
Proof:
This proof comes roughly from Wang & Wong (2019, p. 65-66).
We use induction on \(\text{dim } V\).
If \(\text{dim } V = 1\), then any basis \(\mc{B}\) is of the form \(\mc{B} = \{v_1\}\). Thus for any \(T \in \mc{L}(V)\) there exists some \(c \in \mb{F}\) such that \([T]_{\mc{B}} = (c)\), which is upper-triangular.
Fix \(n \in \mb{N}\) with \(n \geq 2\). Now assume that the Theorem holds for all vector spaces of dimension less than \(n\). Let \(V\) be a vector space such that \(\text{dim } V = n\). Since \(\mb{F}\) is algebraically closed, \(c_T(x)\) splits into linear factors, so it has a root and there is at least one eigenvalue-eigenvector pair for \(T\): call it \((\lambda_1, v_1)\). Extend \(\{v_1\}\) to an ordered basis \((v_1, w_2, \ldots, w_n)\) of \(V\).
Let \(W = \langle v_1 \rangle\). It follows that \(\text{dim}(V / W) = n - 1\). There exists a well-defined \(\overline{T} \in \mc{L}(V / W)\), defined by \(\overline{T}(v + W) = Tv + W\). By the induction hypothesis, there exists an ordered basis \(\mc{B} = (\overline{v_2}, \ldots, \overline{v_n})\) of \(V / W\) such that \([\overline{T}]_{\mc{B}}\) is upper-triangular.
For each \(j \in \{2, \ldots, n\}\) choose \(v_j \in V\) such that \(v_j + W = \overline{v_j}\). We will show that \(v_1, \ldots, v_n\) is a basis of \(V\). Let \(\pi_W: V \to V / W\) be the quotient map \(\pi_W(v) = v + W\). Consider \(a_1 v_1 + \cdots + a_n v_n = 0\): call this our "original equation". Applying \(\pi_W\) to both sides we get \(a_1 \pi_W(v_1) + \cdots + a_n \pi_W(v_n) = 0_{V/W}\), where \(0_{V/W} = 0 + W\).
Since \(v_1 \in W\), \(\pi_W(v_1) = v_1 + W = 0_{V/W}\).
By the choice of \(v_j\), \(\pi_W(v_j) = v_j + W = \overline{v_j}\) for \(j \in \{2, \ldots, n\}\).
Therefore the equation becomes
\(a_2 \overline{v_2} + \cdots + a_n \overline{v_n} = 0_{V/W}\).
Since \((\overline{v_2}, \ldots, \overline{v_n}) = \mc{B}\) is a basis for \(V / W\), the vectors are linearly independent, which means \(a_2 = \cdots = a_n = 0\). Therefore our original equation becomes \(a_1 v_1 = 0\). Since \(v_1\) is an eigenvector, \(v_1 \neq 0\), implying \(a_1 = 0\). Therefore \((v_1, \ldots, v_n)\) are linearly independent and thus form a basis for \(V\), call it \(\mc{A}\).
We now verify that \([T]_{\mc{A}}\) is upper-triangular.
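Since \(Tv_1 = \lambda_1 v_1\), \([T(v_1)]_{\mc{A}} = [\lambda_1, 0, \ldots, 0]^\top\).
Let \(j \in \{2, \ldots, n\}\) and let \(a_{ij}\) denote the entries of \([\overline{T}]_{\mc{B}}\), with rows and columns indexed by \(2, \ldots, n\). Then
\(\overline{T}(\overline{v_j}) = \sum_{i=2}^{j} a_{ij} \overline{v_i}\),
where the sum stops at \(j\) due to the upper-triangular structure of \([\overline{T}]_{\mc{B}}\). From this, it follows that \(Tv_j - \left(\sum_{i=2}^{j} a_{ij} v_i\right) \in W\).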
Since \(W = \langle v_1 \rangle\), there exists \(c_j \in \mb{F}\) such that \(Tv_j - \sum_{i=2}^{j} a_{ij} v_i = c_j v_1\), so \(Tv_j = c_j v_1 + \sum_{i=2}^{j} a_{ij} v_i\).
Therefore \([T]_{\mc{A}}\) is upper-triangular.
\(\square\)
Theorem: For any \(\mc{A}\) such that \(A = [T]_{\mc{A}}\) is upper triangular, for all \(i \in [n]\), \((A)_{ii}\) is an eigenvalue of \(T\).
Proof: By an above result, \(c_T(x)\) is the same for any basis. Consider \(c_T(x) = \text{det } ([T]_{\mc{A}} - xI)\). Since \([T]_{\mc{A}} - xI\) is upper-triangular, \(\text{det } ([T]_{\mc{A}} - xI) = \prod_{j=1}^n (a_{jj} - x)\). We already know that \(c_T(x) = \prod_{i=1}^k p_i(x)^{f_i}\) where for all \(i\in [k]\) we have \(p_i(x) = x - \lambda_i\) where \(\lambda_i\) is an eigenvalue of \(T\). Therefore each \(a_{jj}\) equals \(\lambda_i\) for some \(i \in [k]\).
\(\square\)
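The two theorems above can be traced on a small example. The matrix \(A\) below is a hypothetical choice for \(T\) in the standard basis, made for illustration only. It has the single eigenvalue \(2\) with eigenvector \(v_1\); extending by \(v_2\) as in the induction step gives a basis in which \(T\) is upper-triangular with the eigenvalue on the diagonal:

```latex
\[
A = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix}, \qquad
\det(A - xI) = (1 - x)(3 - x) + 1 = (x - 2)^2 .
\]
\[
v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad A v_1 = 2 v_1, \qquad
v_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad A v_2 = \begin{pmatrix} 1 \\ 3 \end{pmatrix} = 1 \cdot v_1 + 2 \cdot v_2 .
\]
\[
[T]_{(v_1, v_2)} = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} .
\]
```

Here \(W = \langle v_1 \rangle\), and in the quotient \(V / W\) the induced map sends \(\overline{v_2}\) to \(2 \overline{v_2}\), which is the \(1 \times 1\) upper-triangular matrix supplied by the induction hypothesis.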
We note that in the above proof, we completely ignored the existence of the generalized eigenspaces we have been working with. It turns out there is an upper triangular form called the Jordan form that has some relationship to the generalized eigenspaces, but we will not examine it here.
Also note that this result is similar to the Schur decomposition (see Linear algebra: more decompositions).
Summing up
In this section we found that if \(\mb{F}\) is algebraically closed and \(\text{dim } V = n\), then for all \(T \in \mc{L}(V)\) there exists an ordered basis \(\mc{A}\) such that \([T]_{\mc{A}}\) is upper-triangular and where every diagonal element of \([T]_{\mc{A}}\) is an eigenvalue. The eigenvalues can repeat, depending on the values of the \(f_i\).
Diagonalization of operators
We have seen that for \(\mb{F}\) algebraically closed, any \(T \in \mc{L}(V)\) is triangularizable. We would like to push this further. We ask the question: when does there exist an ordered basis \(\mc{A}\) of \(V\) such that \([T]_{\mc{A}}\) is a diagonal matrix?
The form of the matrix \([T]_{\mc{A}}\) has depended on two things: the value of \(k\) (which is the number of distinct eigenvalues) and the powers \(f_i\) appearing in \(c_T(x)\).
We now ask: what if we stipulate that for all \(i \in [k]\), \(e_i = 1\)?
Theorem (Wang & Wong, 2019. p. 74): If \(m_T(x) = p_1(x)^{e_1} \cdots p_k(x)^{e_k}\) is such that for all \(i \in [k]\), \(e_i = 1\), then there exists a basis \(\mc{A}\) such that \([T]_{\mc{A}}\) is a diagonal matrix; furthermore, this basis \(\mc{A}\) is a basis of eigenvectors of \(T\), and the diagonal elements of \([T]_{\mc{A}}\) are all eigenvalues of \(T\).
Proof (Wang & Wong, 2019. p. 74):
Recall that \(V = V_1 \oplus \cdots \oplus V_k\) where \(V_i = \text{ker } p_i(T)^{e_i}\). Since \(\mb{F}\) is algebraically closed, \(p_i(x) = (x - \lambda_i)\) for some eigenvalue \(\lambda_i\). Further, since for all \(i \in [k]\), \(e_i = 1\), we have for all \(i \in [k]\) that \(V_i = \text{ker } (T - \lambda_i I)\) (i.e. \(V_i\) is an eigenspace of \(T\), namely the eigenspace of \(T\) corresponding to \(\lambda_i\)). Thus every nonzero vector in \(V_i\) is an eigenvector of \(T\) corresponding to \(\lambda_i\). Thus we can choose a basis \(\mc{A}_i\) for \(V_i\) that consists of eigenvectors of \(T\). Let \(\mc{A}\) be the ordered union of the \(\mc{A}_i\): it is a basis of \(V\) consisting of eigenvectors \(v_1, \ldots, v_n\) of \(T\). Now observe that \([T]_{\mc{A}} = [[T(v_1)]_{\mc{A}}, \ldots, [T(v_n)]_{\mc{A}}]\) where \([T(v_j)]_{\mc{A}} = [0, \ldots, 0, \lambda_i, 0, \ldots, 0]^\top\) where \(\lambda_i\) is the eigenvalue associated with \(v_j\), and where \(\lambda_i\) is in the \(j\)th position of the vector. Thus \([T]_{\mc{A}}\) is a diagonal matrix with eigenvalues as its diagonal entries. \(\square\)
Thus we have shown that if \(\mb{F}\) is algebraically closed and all \(e_i = 1\), \(T\) is diagonalizable, in fact is diagonalizable with an ordered basis of eigenvectors and having diagonal elements all equal to eigenvalues.
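For a concrete instance of this construction, take the hypothetical matrix \(A\) below as \(T\) in the standard basis (an illustrative choice, not from the cited sources). Its minimal polynomial is \((x - 2)(x - 3)\), so all \(e_i = 1\), and the change of basis to the eigenvectors \(v_1, v_2\) diagonalizes it:

```latex
\[
A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}, \qquad
v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \in \text{ker}(A - 2I), \qquad
v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \in \text{ker}(A - 3I) .
\]
\[
P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
P^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \qquad
P^{-1} A P = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} .
\]
```

The columns of \(P\) are the eigenvectors, and the diagonal entries of \(P^{-1} A P\) are the corresponding eigenvalues in the same order.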
A relationship between the form of generalized eigenspaces and the coefficients in the minimal and characteristic polynomials
Note that even if all \(e_i = 1\), if \(k < n\) then there exists at least one \(i \in [k]\) such that \(f_i > 1\), so there is at least one generalized eigenspace of dimension greater than \(1\).
If \(e_i = 1\) but \(f_i > 1\), we have that \(V_i = \text{ker } (T - \lambda_i I)\), so every nonzero vector in \(V_i\) is an eigenvector of \(T\). If \(e_i > 1\), we would have \(V_i = \text{ker } (T - \lambda_i I)^{e_i}\) so there are vectors in \(V_i\) that are not eigenvectors of \(T\). Summing up: for generalized eigenspace \(i\),
| $e_i$ | $f_i$ | contains |
|---|---|---|
| $1$ | $> 1$ | only eigenvectors |
| $> 1$ | $> 1$ | eigenvectors and other vectors |
So \(T\) is diagonalizable iff every generalized eigenspace contains only eigenvectors.
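The two rows of the table can be told apart computationally. The sketch below assumes a matrix `A` whose only eigenvalue is `lam` (so that \(A - \lambda I\) is nilpotent) and returns the smallest \(e\) with \((A - \lambda I)^e = 0\), which is the exponent \(e_i\) in the minimal polynomial. The function name and both example matrices are hypothetical illustrations.

```python
def mat_mul(A, B):
    """Product of two square matrices of the same size (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

def exponent_in_min_poly(A, lam):
    """Smallest e >= 1 with (A - lam I)^e = 0, assuming lam is the only eigenvalue of A."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    power, e = N, 1
    while not is_zero(power):
        if e == n:
            # A nilpotent n x n matrix satisfies N^n = 0, so the assumption failed.
            raise ValueError("lam is not the only eigenvalue of A")
        power, e = mat_mul(power, N), e + 1
    return e

scalar = [[2, 0], [0, 2]]   # every nonzero vector is an eigenvector
jordan = [[2, 1], [0, 2]]   # (0, 1) is in ker (A - 2I)^2 but is not an eigenvector

e_scalar = exponent_in_min_poly(scalar, 2)
e_jordan = exponent_in_min_poly(jordan, 2)
```

Both matrices have \(f_i = 2\) for \(\lambda_i = 2\). The scalar matrix falls in the first row of the table and the Jordan-type matrix in the second.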
Diagonalization in a further sense
If we stipulate further that all \(e_i = 1 = f_i\), we have not only that every generalized eigenspace contains only eigenvectors, but that the dimension of each generalized eigenspace is 1. Further, since the degree of \(c_T(x)\) is \(n\), we must have that \(k = n\), so there are \(n\) distinct eigenvalues.
Spectral decomposition
Corollary (of primary decomposition theorem) (spectral decomposition) (Surowski, 1997, p. 66): Let \(\mb{F}\) be an algebraically closed field, let \(\text{dim } V = n\), let \(T \in \mc{L}(V)\) be diagonalizable, and let \(m_T(x) = p_1(x)^{e_1} \cdots p_k(x)^{e_k}\). Then, given the definitions of the orthogonal idempotents \(P_1, \ldots, P_k\), and with \(\lambda_i\) the eigenvalue such that \(p_i(x) = x - \lambda_i\), we have \(T = \lambda_1 P_1 + \cdots + \lambda_k P_k\).
Proof: The primary decomposition theorem of course holds. \(T\) being diagonalizable means for all \(i \in [k]\), \(e_i = 1\). This implies for all \(i \in [k]\) that \(P_i V = \text{ker } p_i(T) = \text{ker } (T - \lambda_i I_V)\) for an eigenvalue \(\lambda_i\) of \(T\). Thus as explained above, every nonzero vector in each \(V_i = P_i V\) is an eigenvector of \(T\) with eigenvalue \(\lambda_i\).
Let \(v = v_1 + \cdots + v_k\), \(v_i \in P_i V\). We know that for each \(i \in [k]\), \(Tv_i = \lambda_i v_i\), so \(Tv = \lambda_1 v_1 + \cdots + \lambda_k v_k\).
[Claim: For all \(i \in [k]\), \(P_i v = v_i\)
Proof: Fix \(i\). For \(j = i\), \(v_j \in P_i V\), so \(\exists\, u \in V \quad v_j = P_i u\).
\(\therefore\, P_i v_j = P_i^2 u = P_i u = v_j\).
For \(j \neq i\), \(v_j \in P_j V\), so by the same argument \(v_j = P_j v_j\) and \(P_i v_j = P_i P_j v_j = 0\).
\(\therefore\, P_i v = P_i v_1 + \cdots + P_i v_k = v_i\). \(\square\)]
Using the claim and substituting \(v_i = P_i v\) for all \(i\), we have \(Tv = \lambda_1 P_1 v + \cdots + \lambda_k P_k v\).
Since we showed that this holds for an arbitrary \(v \in V\), we conclude that \(T = \lambda_1 P_1 + \cdots + \lambda_k P_k\).
\(\square\)
The formula \(T = \lambda_1 P_1 + \cdots + \lambda_k P_k\) is known as the "spectral decomposition" of the linear operator \(T\), since the set of eigenvalues of a linear operator \(T\) is sometimes referred to as the "spectrum" of \(T\).
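The decomposition can be verified numerically. The sketch below assumes \(T\) is diagonalizable with known distinct eigenvalues, in which case one valid choice of Bezout coefficients is the constant \(s_i = 1 / \prod_{j \neq i} (\lambda_i - \lambda_j)\) (Lagrange interpolation of the constant polynomial \(1\)), giving \(P_i = \prod_{j \neq i} (T - \lambda_j I) / (\lambda_i - \lambda_j)\). The function name `spectral_projections` and the example matrix are hypothetical.

```python
from fractions import Fraction

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Product of two square matrices of the same size (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def spectral_projections(A, eigenvalues):
    """P_i = prod_{j != i} (A - lam_j I) / (lam_i - lam_j), for distinct eigenvalues
    of a diagonalizable A (both are assumptions of this sketch)."""
    n = len(A)
    projections = []
    for lam_i in eigenvalues:
        P = identity(n)
        for lam_j in eigenvalues:
            if lam_j == lam_i:
                continue
            c = Fraction(1, lam_i - lam_j)
            factor = [[c * (A[r][s] - (lam_j if r == s else 0)) for s in range(n)]
                      for r in range(n)]
            P = mat_mul(P, factor)
        projections.append(P)
    return projections

A = [[2, 1], [0, 3]]   # hypothetical diagonalizable example, m_T(x) = (x - 2)(x - 3)
lams = [2, 3]
P1, P2 = spectral_projections(A, lams)

# Rebuild T as lambda_1 P_1 + lambda_2 P_2.
rebuilt = [[sum(lam * P[r][s] for lam, P in zip(lams, (P1, P2))) for s in range(2)]
           for r in range(2)]
```

For this example `P1` is \(-(A - 3I)\) and `P2` is \(A - 2I\), and `rebuilt` reproduces `A`. If some \(e_i > 1\), these constant coefficients no longer give the idempotents of the primary decomposition theorem, and the formula \(T = \sum_i \lambda_i P_i\) fails.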
References
Axler, Sheldon. (2024). Linear algebra done right (Fourth edition). Self-published.
Evans, Leonard. (1999). Infinite extensions. Self-published.
Horn, Roger A. and Charles R. Johnson. (1985). Matrix analysis. Cambridge University Press.
Hungerford, Thomas W. (2013). Abstract algebra: an introduction. Brooks/Cole.
Lang, Serge. (1987). Linear algebra (Third edition). Springer.
Surowski, David. (1997). Advanced Linear Algebra [Lecture notes].
Wang, Jie and Daniel Wong. (2019). MAT 3040 - Advanced Linear Algebra [Lecture notes]
How to cite this article
Wayman, Eric Alan. (2026). Linear algebra: eigenvalues and some decompositions. Eric Alan Wayman's technical notes. https://ericwayman.net/notes/linear-algebra-eigenvalues-decompositions/
@misc{wayman2026linear-algebra-eigenvalues-decompositions,
title={Linear algebra: eigenvalues and some decompositions},
author={Wayman, Eric Alan},
journal={Eric Alan Wayman's technical notes},
url={https://ericwayman.net/notes/linear-algebra-eigenvalues-decompositions/},
year={2026}
}