Linear algebra: more decompositions

Table of Contents

Introduction

We often use the term "factorization" to refer to representing a matrix as a product of other matricies, and the term "decomposition" to refer to representing a transformation as a sum of other transformations. However, the two can be used interchangeably.

Preliminary definition: positive operators

Definition (Axler, 2024, p. 251): \(T \in \mc{L}(V)\) is called positive iff \(T\) is self-adjoint and \(\forall\, v \in V \quad (v, Tv) \geq 0\).

An important property of positive operators is the following:

Lemma (Axler, 2024, p. 252): If \(T \in \mc{L}(V)\) is positive, then all eigenvalues of \(T\) are non-negative.

Proof (Axler, 2024, p. 252): Let \(\lambda\) be an eigenvalue of \(T\) with \(v\) its corresponding eigenvector. Observe that \(0 \leq (v, Tv) = (v, \lambda v) = \lambda (v, v)\). Since \((v, v) \geq 0\), \(\lambda \geq 0\). \(\square\)

Schur's theorem and the Schur factorization

Schur's theorem

Theorem (Axler, 2024, p. 204): Let \(V\) be a finite dimensional inner product space over \(\mb{C}\) and \(T \in \mc{L}(V)\). There then exists an orthonormal basis \(\mc{A}\) such that \([T]_{\mc{A}}\) is upper-triangular.

The Schur factorization of a matrix

Definition: \(G \in M_n(\mb{C})\) has a Schur factorization iff there exist unitary \(U\) and upper-triangular matrix \(H\) such that \(H = U G U^\ast\) (Trefethen and Bau, p. 187).

Theorem: Every matrix in \(M_n(\mb{C})\) has a Schur factorization.

Proof: Let \(G \in M_n(\mb{C})\). Let \(\mc{E} = (e_1, \ldots, e_n)\) be the standard basis of \(\mb{C}\). Let \(T:\mb{C}^n \to \mb{C}^n\) be such that \(G = [T]_{\mc{E}}\) (i.e. let \((G)_{ij} = c_{ij}\) be given. Define \(T\) by \(T(v_j) = \sum_{i=1}^n c_{ij} v_i\). Then \(G = [T]_{\mc{E}}\).). Since Schur's theorem applies to this \(T\), i.e. there exists an orthonormal ordered basis \(\mc{A}\) such that \([T]_{\mc{A}}\) is upper-triangular, we can write \([T]_{\mc{A}} = [\text{id}]_{\mc{A}\mc{E}} [T]_{\mc{E}} [\text{id}]_{\mc{E}\mc{A}}\).

By a Theorem from Linear algebra: inner product spaces, \([\text{id}]_{\mc{E}\mc{A}}\) is unitary; since \([\text{id}]_{\mc{E}\mc{A}}\) is unitary we have \([\text{id}]_{\mc{A}\mc{E}} = \inv{([\text{id}]_{\mc{E}\mc{A}})} = ([\text{id}]_{\mc{E}\mc{A}})^\ast\), so letting \(U = [\text{id}]_{\mc{E}\mc{A}}\) we have \(H = U G U^\ast\). \(\square\)

Process for finding the Schur factorization of a matrix

Say we begin with a matrix \(G\). Let \(\mc{E}\) be the standard basis for \(\mb{R}^n\). Interpret \(G\) as \(G = [T]_{\mc{E}}\) for some \(T\).

Find an eigenvector \((u_1)_{\mc{E}}\) of \([T]_{\mc{E}}\) either using exact methods or a numerical method such as the QR Algorithm (Trefethen & Bau, 1997, p. 216). Extend this list to an orthonormal basis \(\mc{A}_1 = (u_1, v_2, \ldots, v_n)\) using Gram-Schmidt. The first column of \([T]_{\mc{A}_1}\) is thus \((\lambda_1, 0, \ldots, 0)\). Consider \(T|_{\langle v_1 \rangle^\perp}\). Repeat the process. Upon finishing this process, we have obtained \(\mc{A} = (u_1, \ldots, u_n)\), a basis such that \([T]_\mc{A}\) is upper-triangular with \(\lambda_i\) on the diagonals.

We have found \((u_1)_{\mc{E}}, \ldots, (u_n)_{\mc{E}}\): note that these are the columns of \(U = [\text{id}]_{\mc{E}\mc{A}}\). Thus we have \(H = U G U^\ast\).

The QR and Cholesky factorizations of matrices

Assume \(\mb{F}\) equals \(\mb{C}\) or \(\mb{R}\).

If \(T\) is invertible, we have the QR factorization:

Let \(\mc{E}\) denote the standard basis. Assume a matrix \(A\) equals \([T]_\mc{E}\).

Then there exist unique matrices \(Q\) and \(R\) such that \(Q\) is unitary, \(R\) is upper triangular with only positive numbers on its diagonal, and \(A = QR\)

(Axler, 2024, p. 264)

The proof (Axler, 2024, p. 264) shows how the matrices can be found. It uses the Gram-Schmidt procedure.

If \(T\) is positive definite (and thus invertible), we have the Cholesky factorization:

Assume a matrix \(B\) equals \([T]_\mc{E}\).

Then there exists a unique upper-triangular matrix \(R\) with only positive numbers on its diagonal such that \(B = R^\ast R\)

(Axler, 2024, p. 267).

The QR factorization is used in finding the Cholesky factorization (in fact, since there exists an invertible square matrix \(A\) such that \(B = A^\ast A\), taking the QR factorization of \(A\), namely \(A = QR\), gives \(B = R^\ast Q^\ast Q R = R^\ast R\) (Axler, 2024, p. 267).

Singular value decomposition

Results regarding \(T^{\ast} T\)

Claim (Axler, 2024, p. 270): If \(T \in \mc{L}(V, W)\), then \(T^\ast T\) is a positive operator on \(V\).

Proof (Axler, 2024, p. 270): Observe by the above property of the adjoint that \((T^\ast T)^\ast = T^\ast T\), so \(T^\ast T\) is self-adjoint. Further, for \(v \in V\),

$$ (v, (T^\ast T)v) = ((T^\ast T)^\ast v, v) = ((T^\ast T) v, v) = (Tv, Tv) = \norm{Tv}^2 \geq 0. $$

Thus \(T^\ast T\) is positive. \(\square\)

We will now show some properties of \(T^{\ast} T\) which we will rely upon.

Theorem (Axler, 2024, p. 270): Let \(V\) and \(W\) be finite dimensional complex inner product spaces and let \(T \in \mc{L}(V, W)\). Then

  1. \(T^{\ast} T\) is a positive operator on \(V\)
  2. \(\text{ker } T^{\ast} T = \text{ker } T\)
  3. \(\text{Im } T^{\ast} T = \text{Im } T^{\ast}\)
  4. \(\text{dim } \text{Im } T = \text{dim } \text{Im } T^{\ast} = \text{dim } \text{Im } T^{\ast} T\)

Proof (Axler, 2024, p. 270):

The following is practically verbatim from Axler.

(1) \((T^{\ast} T)^{\ast} = T^{\ast} (T^{\ast})^{\ast} = T^{\ast} T\), so \(T\) is self-adjoint.

Let \(v \in V\). Then

$$ ((T^{\ast} T)v, v) = (T^{\ast}(Tv), v) = (Tv, Tv) = \norm{Tv}^2 \geq 0 $$

so \(T^{\ast} T\) is a positive operator on \(V\).

(2)

(\(\subseteq\))

Let \(v \in \text{ker } T^{\ast} T\). Then

$$ \norm{Tv}^2 = (Tv, Tv) = (T^{\ast} T v, v) = (0, v) = 0 $$

so \(Tv = 0\). Therefore \(v \in \text{ker } T\).

(\(\supseteq\))

Let \(v \in \text{ker } T\). Then \(T^{\ast} Tv = 0\), so \(v \in \text{ker } T^{\ast} T\).

(3)

Since \(T^{\ast} T\) is self-adjoint (see (1)), we have

$$ \text{Im } T^{\ast} T = (\text{ker } (T^{\ast} T)^{\ast})^{\perp} = (\text{ker } T^{\ast} T)^{\perp} = (\text{ker } T)^{\perp} = \text{Im } T^{\ast} $$

(4)

$$ \text{dim } \text{Im } T = (\text{ker } T^{\ast})^{\perp} = \text{dim } W - \text{dim } \text{ker } T^{\ast} = \text{dim } \text{Im } T^{\ast} $$

where the last step is by the rank-nullity theorem.

Since by (3) we have \(\text{Im } T^{\ast} T = \text{Im } T^{\ast}\), we have the second equality in the statement of this part of the theorem.

Singular values

Let \(V\) and \(W\) be inner product spaces over \(\mb{C}\) with \(\text{dim } V = n\).

Consider \(T^{\ast} T \in \mc{L}(V)\). Since \(T^\ast T\) is self-adjoint, it is diagonalizable in an orthonormal basis of eigenvectors \(\mc{A} = (v_1, \ldots, v_n)\). Recall that all eigenvalues of a positive operator are non-negative (real) numbers. Thus for each \(v_i\) we can write the corresponding eigenvalue as \(s_i^2\).

Permuting the ordering so that \(s_1 \geq s_2 \geq \cdots \geq s_n\), the values in the list \((s_1, \ldots, s_n)\) are called the singular values of \(T\) (Axler, 2024, p. 271).

Claim (Axler, 2024, p. 100e): The number of positive singular values of \(T\) equals \(\text{dim }\text{Im } T\).

Proof:

First observe that since \(T^{\ast} T\) is self-adjoint, there exists an orthonormal basis of \(V\) consisting of eigenvectors of \(T^{\ast} T\) (see Linear algebra: inner product spaces).

We have \(k \leq n\) distinct eigenvalues of \(T^{\ast} T\), call them \(\lambda_1, \ldots, \lambda_k\). We have \(n\) eigenvectors \(v_1, \ldots, v_n\). For \(i \in [n]\), denote the corresponding eigenvalue of \(v_i\) by \(\lambda^{(i)}\). Let \(R_0 = \{i \in [n] ;\, \lambda^{(i)} = 0\}\), and let \(R_1 = [n] \setminus R_0\).

For all \(v \in V\) we can write \(v = \sum_{i \in [n]} c_i v_i\). Thus

$$ \begin{align*} (T^{\ast} T) v = \sum_{i \in [n]} c_i (T^{\ast} T) v_i = \sum_{i \in [n]} c_i \lambda^{(i)} v_i = \sum_{i \in R_1} c_i \lambda^{(i)} v_i \end{align*} $$

So for all \(w \in \text{Im } (T^{\ast} T)\) there exist a set \(S = \{v_i ;\, i \in R_1\}\) of linearly independent vectors such that a linear combination of these vectors equals \(w\). Therefore \(\langle S \rangle = \text{Im } (T^{\ast} T)\), so \(|S| = \text{dim }\text{Im } (T^{\ast} T) = \text{dim }\text{Im } T\), which equals the number of non-zero (positive) eigenvalues of \(T^\ast T\).

\(\square\)

Let \(r\) be the number of positive singular values (where of course \(r \leq n\); "\(r\)" stands for the word "rank").

Defining the SVD

Since \(v_1, \ldots, v_n\) are eigenvectors of \(T^\ast T\), for each \(i \in [n]\) we have \(T^{\ast} T v_i = s_i^2 v_i\). For \(i > r\) we have \(T^{\ast} T v_i = 0\).

Now, for each \(i \in [r]\), define \(w_i = (T v_i) / s_i \in W\). Note that

$$ \begin{align*} (w_i, w_j) & = \frac{1}{s_i s_j} (Tv_i, Tv_j) = \frac{1}{s_i s_j} (v_i, T^{\ast} Tv_j) = \frac{1}{s_i s_j} (v_i, s_j^2 v_j) \\ & = \frac{s_j}{s_i} (v_i, v_j) \\ & = \delta_{ij} \end{align*} $$

Therefore \((w_1, \ldots, w_r)\) are orthonormal.

For \(v \in V\), by the usual decomposition we can write

$$ \begin{align*} Tv & = T((v_1, v) v_1 + \cdots (v_n, v) v_n) \\ & = (v_1, v) Tv_1 + \cdots + (v_n, v) Tv_n \\ & = s_1 (v_1, v) w_1 + \cdots + s_r (v_r, v) w_r \end{align*} $$

We thus have orthonormal vectors \((v_1, \ldots, v_r) \in V\) and \((w_1, \ldots, w_r) \in W\) such that

$$ T(\bullet) = T((v_1, \bullet) w_1 + \cdots (v_r, \bullet) w_r) $$

which is a decomposition of \(T\). We call this the singular value decomposition or SVD of \(T\).

The SVD of a matrix: two forms

We will show two different versions of "the" SVD of a matrix. We will begin with the one that comes directly from the SVD itself, and show a modified version that takes less storage space.

Assume \(M \in M_{m,n}(\mb{F})\) where \(\mb{F} = \mb{C}\) or \(\mb{R}\). Assume \(\text{rank } M = m \geq 1\). Let \(\mc{E} = (e_1, \ldots, e_n)\) be the standard basis for \(\mb{F}^n\) and \(\mc{F} = (f_1, \ldots, f_m)\) be the standard basis for \(\mb{F}^m\). Let \(T\) be the linear transformation such that \([T]_{\mc{F}\mc{E}} = M\), i.e. writing \((M)_{ij} = m_{ij}\), we define \(T\) by \(T(e_j) = \sum_{i=1}^{n} m_{ij} f_i\). Recall that \(\text{dim } \text{Im } T = \text{rank } [T]\) (Axler, 2024, p. 90) in whatever bases we choose to represent \(T\).

Note that \(r \leq m\) since the codomain of \(T\) is \(W\) and \(\text{dim } W = m\), and \(r \leq n\) since the codomain of \(T^{\ast} T\) is \(V\) and \(\text{dim }V = n\). Thus \(r \leq \min(m, n)\).

Let \(\mc{A} = (v_1, \ldots, v_n)\). If \(r < m\), extend \((w_1, \ldots, w_r)\) to \((w_1, \ldots, w_r, w_{r+1}, \ldots, w_m)\) to a basis of \(W\); either way, let \(\mc{B} = (w_1, \ldots, w_r, w_{r+1}, \ldots, w_m)\).

By the change of basis formula,

$$ [T]_{\mc{F}\mc{E}} = [\text{id}]_{\mc{F}\mc{B}} [T]_{\mc{B}\mc{A}} [\text{id}]_{\mc{A}\mc{E}} $$

It is hard to find \([\text{id}]_{\mc{A}\mc{E}}\) but much easier to find \([\text{id}]_{\mc{E}\mc{A}} = [(v_1)_{\mc{E}}, \ldots, (v_n)_{\mc{E}}]\), so we can find \([\text{id}]_{\mc{E}\mc{A}}\) and take its inverse.

Since \(\mc{A}\) and \(\mc{E}\) are both orthonormal bases, by an earlier theorem, \([\text{id}]_{\mc{E}\mc{A}}\) is unitary, so \(\inv{[\text{id}]_{\mc{E}\mc{A}}} = {[\text{id}]_{\mc{E}\mc{A}}}^{\ast}\).

Thus

$$ {[\text{id}]_{\mc{E}\mc{A}}}^{\ast} = \begin{pmatrix} — \overline{(v_1)_{\mc{E}}}^\top — \\ \vdots \\ — \overline{(v_n)_{\mc{E}}}^\top — \end{pmatrix} $$

Of course, \([\text{id}]_{\mc{F}\mc{B}} = [(w_1)_{\mc{F}}, \ldots, (w_m)_{\mc{F}}]\).

And, letting \(\Sigma = \text{diag}(s_1, \ldots, s_r)\), \([T]_{\mc{B}\mc{A}}\) is an \(m \times n\) matrix of the form

$$ \Sigma, \begin{pmatrix} \Sigma & O_{r \times (n-r)} \end{pmatrix}, \begin{pmatrix} \Sigma\\ O_{(m-r) \times r} \end{pmatrix}, \text{or } \begin{pmatrix} \Sigma & O_{r \times (n-r)} \\ O_{(m-r) \times r} & O_{(m-r) \times (n-r)} \end{pmatrix} $$

where the form \([T]_{\mc{B}\mc{A}}\) takes depends on the values of \(m\) and \(n\).

Therefore

$$ \begin{align*} M & = \begin{pmatrix}(w_1)_{\mc{F}}, \ldots, (w_m)_{\mc{F}}\end{pmatrix} \begin{pmatrix} \Sigma & O_{n > r} \\ O_{m > r} & O_{m > r \,\land\, n > r} \end{pmatrix} \begin{pmatrix} — \overline{(v_1)_{\mc{E}}}^\top — \\ \vdots \\ — \overline{(v_n)_{\mc{E}}}^\top — \end{pmatrix} \end{align*} $$

Assume for a moment that \(m > r\) and \(n > r\). Then

$$ \begin{align*} M & = [s_1 (w_1)_{\mc{F}}, \ldots, s_r (w_r)_{\mc{F}}, O] \begin{pmatrix} — \overline{(v_1)_{\mc{E}}}^\top — \\ \vdots \\ — \overline{(v_r)_{\mc{E}}}^\top — \end{pmatrix} \\ & = s_1 (w_1)_{\mc{F}} (v_1)_{\mc{E}}^{\top} + \cdots + s_m (w_m)_{\mc{F}} (v_m)_{\mc{E}}^{\top} \end{align*} $$

This is similar to the formula for the singular value decomposition itself. We note that the zero matrices have no real function and that we could eliminate them, writing from the beginning:

$$ \begin{align*} M & = \begin{pmatrix}(w_1)_{\mc{F}}, \ldots, (w_r)_{\mc{F}}\end{pmatrix} \Sigma \begin{pmatrix} — \overline{(v_1)_{\mc{E}}}^\top — \\ \vdots \\ — \overline{(v_r)_{\mc{E}}}^\top — \end{pmatrix} \end{align*} $$

Process for finding the SVD of a matrix

In Numpy, "the decomposition is performed using LAPACK routine _gesdd" (NumPy developers, 2026). Trefethen and Bau (1997, Lecture 31) explain how the choice of computation method can be made with performance considerations in mind.

LU factorization of a matrix

Let \(\mb{F}\) be a field. Let \(A \in M_n(\mb{F})\).

Using a permutation matrix \(P\) if necessary, there exist lower and upper triangular matrices \(L\) and \(U \in M_n(\mb{F})\) such that \(PA = LU\) (Trefethen & Bau, 1997, p. 147-160).

Assuming for a moment that no permutation of \(A\) is necessary. We perform Gaussian elimination on \(A\) to produce \(U\), i.e. \(L_n \cdots L_1 A = U\). Then letting \(L^{-1} = L_n \cdots L_1\) we have \(A = LU\).

If permutations along the way are needed, we will have something like \(L_n P_n L_{n-1} P_{n-1} \cdots L_1 P_1 A\). These will be re-ordered into \(L_n^\prime \cdots L_1^\prime P_n \cdots P_1\). Write \(L^{-1} P A = U\). Then we have \(PA = LU\).

References

Axler, Sheldon. (2024). Linear algebra done right (Fourth edition). Self-published.

NumPy developers. (2026). numpy.linalg.svd.

Trefethen, Lloyd N. and David Bau, III. (1997). Numerical linear algebra. Society for Industrial and Applied Mathematics.

How to cite this article

Wayman, Eric Alan. (2026). Linear algebra: more decompositions. Eric Alan Wayman's technical notes. https://ericwayman.net/notes/linear-algebra-decompositions/

@misc{wayman2026linear-algebra-decompositions,
  title={Linear algebra: more decompositions},
  author={Wayman, Eric Alan},
  journal={Eric Alan Wayman's technical notes},
  url={https://ericwayman.net/notes/linear-algebra-decompositions/},
  year={2026}
}

© 2025-2026 Eric Alan Wayman. All rights reserved.