Lecture from 13.12.2024 | Video: Videos ETHZ
Recap of Previous Results
Last few lectures, we established key results concerning eigenvalues, eigenvectors, and diagonalization:
-
Complete Set of Eigenvectors: We explored conditions under which a matrix possesses a complete set of eigenvectors that form a basis for the vector space.
- We proved that matrices with distinct eigenvalues always have a complete set.
- We also showed that diagonal matrices and projection matrices always have complete sets of eigenvectors.
- A crucial concept introduced was that for a matrix to have a complete set of eigenvectors, the geometric multiplicity of each eigenvalue (dimension of its eigenspace) must equal its algebraic multiplicity (number of times it appears as a root of the characteristic polynomial).
-
Diagonalization: If a matrix has a complete set of eigenvectors, it can be diagonalized as , where is the matrix whose columns are the eigenvectors, and is a diagonal matrix with the corresponding eigenvalues.
-
Similar Matrices: We defined similar matrices ( for some invertible ) and proved that similar matrices share the same eigenvalues.
Now, building upon these foundations, we delve into the specific and important case of symmetric matrices. We want to address the question: for which matrices do we have:
- All eigenvalues are real.
- The geometric multiplicity of each eigenvalue is equal to its algebraic multiplicity .
We’ve already seen that diagonal and projection matrices satisfy these conditions. While having distinct real eigenvalues guarantees real eigenvalues and equal algebraic and geometric multiplicities, this condition is not necessary. We are seeking a more general characterization. This leads us to the Singular Value Decomposition (SVD), a powerful factorization that applies to all matrices. However, before tackling the SVD, we’ll first analyze a crucial special case: symmetric matrices. Understanding the properties of symmetric matrices will provide essential building blocks for the SVD.
Symmetric Matrices: A Special Case
Definition: A matrix is symmetric if . That is, the entries of A are symmetric across the main diagonal:
Symmetric matrices possess remarkable properties that greatly simplify their eigenstructure and have profound implications in various applications.
Real Eigenvalues for Symmetric Matrices
Proposition: If is a real symmetric matrix ( and ), then all its eigenvalues are real.
2x2 Example:
Proof: Eigenvalues of Real Symmetric Matrices are Real
This is simpler than it seems! Here’s the key idea: we’ll use the properties of the conjugate transpose (denoted by ) to show that any eigenvalue must be equal to its own conjugate, which means it has to be real.
-
Conjugate Transpose: For any matrix (or vector) with complex entries, (also denoted as Hermitian of a matrix ) is formed by taking the transpose of and then taking the complex conjugate of each entry. For example:
If , then
Important properties of the conjugate transpose:
- For a real matrix ,
- For a scalar , is the complex conjugate
- For a vector , , the squared magnitude of
-
Setting Up: Let be an eigenvalue of with a corresponding eigenvector (which might be complex). This means:
-
Taking the Conjugate Transpose: Applying the conjugate transpose to both sides of the equation: (Using properties from step 1).
-
Using Symmetry: Since A is real and symmetric, . Substituting this into our equation:
-
The Trick: Now, let’s be clever. Left-multiply the original eigenvector equation () by :
-
Substituting: From step 4, we know . Substitute this into the equation from step 5:
-
Conclusion: Since is non-zero (it’s an eigenvector!), . We can safely divide both sides by :
This means equals its own conjugate. The only way a complex number can be equal to its conjugate is if it’s a real number! Therefore, must be real.
Orthogonal Eigenvectors for Distinct Eigenvalues of Symmetric Matrices
Proposition: If is a symmetric matrix and and are eigenvectors corresponding to distinct eigenvalues and , then and are orthogonal, i.e .
Proof
We have and .
-
Pre-multiply the first equation by : .
-
Transpose the second equation and post-multiply by : Then .
-
Since is symmetric, , so: .
-
Now we have two equations:
Subtracting the second equation from the first:
-
Since , we must have , which means and are orthogonal.
The Spectral Theorem
The Spectral Theorem is a cornerstone result in linear algebra, revealing the profound connection between symmetric matrices and orthonormal bases. It states that real symmetric matrices are always diagonalizable by an orthogonal matrix and have real eigenvalues.
Theorem (Spectral Theorem): Every symmetric matrix possesses the following properties:
- Real Eigenvalues: All eigenvalues of are real.
- Orthogonal Eigenvectors: Eigenvectors corresponding to distinct eigenvalues are orthogonal.
- Orthonormal Basis of Eigenvectors: There exists an orthonormal basis for composed entirely of eigenvectors of .
- Diagonalization by an Orthogonal Matrix: can be factored as:
where:
- is an orthogonal matrix () whose columns are the orthonormal eigenvectors of .
- is a diagonal matrix whose diagonal entries are the corresponding eigenvalues of .
Equivalent Formulation: Every symmetric matrix has real eigenvalues (counting multiplicities) and an orthonormal basis of formed by the eigenvectors of
Symmetric Matrices and Positive Definiteness... (MIT)
Proof of the Spectral Theorem (by Induction)
For every , there exist orthonormal eigenvectors of .
Base Case ()
We know that has at least one real eigenvalue (this was proven earlier). Let be a corresponding eigenvector. Since is non-zero, we can normalize it to obtain an eigenvector of unit length: . This gives us one orthonormal eigenvector.
The Inductive Step
Inductive Hypothesis
Assume that for some , where , there exist orthonormal eigenvectors of the symmetric matrix , with corresponding real eigenvalues .
Construction
-
Subspace and its Orthogonal Complement:
- Let be the subspace spanned by the first orthonormal eigenvectors.
- Let be the orthogonal complement of . Since , we have .
-
Orthonormal Basis for :
- Let be an orthonormal basis for .
-
The Key Idea: To extend to orthonormal eigenvectors, we must find a new eigenvector in . This ensures it’s orthogonal to the existing eigenvectors (by definition of ). The strategy involves showing that maps vectors from to which allows us to restrict the problem to a smaller subspace.
Showing that maps to
Crucial Lemma: If is symmetric and then
Proof: Since is in , it’s orthogonal to each of the vectors We must show that is also orthogonal to each Consider the dot product of with an arbitrary :
(Recall that A is symmetric so , and is an eigenvector so , and is orthogonal to so their dot product is 0). Since is orthogonal to every basis vector of , is in
Constructing the Matrix
-
Matrix : Form a matrix whose columns are the vectors . Note that is orthogonal since its columns form an orthornormal basis for Consequently
-
Matrix : Define .
Then since and , and and (because they are from orthogonal subspaces): where is a diagonal matrix with on the diagonal, and is an symmetric matrix representing the transformation of vectors in by A.
is diagonal because:
- . Then . Since these are scalars ( matrices), the transpose is equal to the original value: . Thus is symmetric.
Finding the ()-th Eigenvector
1. Eigenvalue and Eigenvector of
Since is a symmetric matrix, it has at least one real eigenvalue and a corresponding eigenvector (a consequence of being symmetric). This means:
2. Constructing
Now, construct a vector as follows:
where is the zero vector in . Notice that has its first components as 0 and its remaining components equal to the components of Importantly, since is an eigenvector of , it is non-zero. This means is non-zero.
3. Showing w is an Eigenvector of B
Let’s compute :
This demonstrates that is an eigenvector of with eigenvalue
4. Connecting back to
Recall that . Therefore, . Now multiply both sides of the equation by on the left:
Let The equation above becomes:
This shows that is an eigenvector of with eigenvalue !
5. Orthogonality of
Since , and , we have:
Since is a linear combination of the vectors which are in , this confirms that is orthogonal to all vectors in S and is therefore orthogonal to We can normalize to obtain an orthonormal set of eigenvectors.
Completing the Induction
We have successfully found a ()-th orthonormal eigenvector corresponding to a real eigenvalue. This completes the inductive step, proving that a symmetric matrix has orthonormal eigenvectors.
Conclusion
This inductive proof establishes the Spectral Theorem, demonstrating that real symmetric matrices have a complete set of orthonormal eigenvectors that form a basis for , and are therefore diagonalizable by an orthogonal matrix. This decomposition has significant implications for understanding the properties and behavior of symmetric matrices in various applications. The Spectral Theorem underlies many important techniques in data analysis, machine learning, physics, and engineering, highlighting its fundamental importance in linear algebra and related fields.
Consequences of the Spectral Theorem
Corollary:
-
Orthogonal Diagonalization: For any symmetric matrix , there exists an orthogonal matrix (whose columns are eigenvectors of ) such that , where is a diagonal matrix with the eigenvalues of on its diagonal. This is a direct consequence of the orthonormal basis property. Since the columns of Q are orthonormal, .
-
Spectral Decomposition: Let be a real symmetric matrix. Let be an orthonormal basis of eigenvectors of , and let be the associated eigenvalues. Then can be written as a sum of rank-1 projections:
This decomposition expresses as a weighted sum of projections onto the one-dimensional subspaces spanned by each eigenvector. This reveals how acts as a combination of scalings along orthogonal directions defined by its eigenvectors.
-
Rank and Eigenvalues: The rank of a real symmetric matrix is equal to the number of non-zero eigenvalues (counting multiplicities). This follows directly from the diagonalization: the rank of is the same as the rank of , which is the number of non-zero diagonal entries.
The Rayleigh Quotient
The Rayleigh quotient is a tool for analyzing the eigenvalues and eigenvectors of a symmetric matrix. It provides a way to estimate eigenvalues and characterize their extremal properties.
Definition
Let be a symmetric matrix (). The Rayleigh quotient is a scalar-valued function defined for any non-zero vector as:
The denominator is the squared Euclidean norm of ( ), representing the squared length of the vector. The numerator is a quadratic form associated with the matrix .
Properties of the Rayleigh Quotient
Boundedness (for Symmetric Matrices): For a symmetric matrix , the Rayleigh quotient is bounded by the minimum and maximum eigenvalues of . Let and be the minimum and maximum eigenvalues of , respectively. Then for any non-zero :
This is a crucial property. It tells us that the range of values the Rayleigh quotient can take is restricted by the eigenvalues of the matrix. We might prove this in the next lecture…
Continue here: 27 Positive (Semi)Definite Matrices, Gram Matrices, SVD