Lecture from 08.11.2024 | Video: Videos ETHZ
Orthogonal Complementary Subspaces
Recall that for a subspace $S$ of $\mathbb{R}^n$, its orthogonal complement is defined as:
$$S^\perp = \{v \in \mathbb{R}^n : v^T w = 0 \text{ for all } w \in S\}$$
$S^\perp$ is also a subspace of $\mathbb{R}^n$.
Key Properties of Orthogonal Complements:
If $\{v_1, \dots, v_k\}$ is a basis for $S$ and $\{w_1, \dots, w_m\}$ is a basis for $S^\perp$, then:
- Dimensionality: $k + m = n$ (the dimensions of a subspace and its orthogonal complement add up to the dimension of the ambient space).
- Basis of $\mathbb{R}^n$: $\{v_1, \dots, v_k, w_1, \dots, w_m\}$ forms a basis for $\mathbb{R}^n$.
- Unique Decomposition: Every vector $x \in \mathbb{R}^n$ can be uniquely expressed as $x = v + w$, where $v \in S$ and $w \in S^\perp$.
Also recall the crucial relationship between the nullspace and the row space: $N(A) = C(A^T)^\perp$.
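As a quick numerical sanity check of this relationship (a minimal sketch; the matrix `A` below is an arbitrary example, and `scipy.linalg.null_space` is used to get an orthonormal basis of the nullspace):

```python
import numpy as np
from scipy.linalg import null_space

# Arbitrary example matrix (rank 2, so its nullspace is nontrivial)
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])

N = null_space(A)  # columns form an orthonormal basis of N(A)

# Every row of A (spanning C(A^T)) is orthogonal to every nullspace basis vector
print(np.allclose(A @ N, 0))  # True

# Dimensions add up: dim C(A^T) + dim N(A) = n
print(np.linalg.matrix_rank(A) + N.shape[1] == A.shape[1])  # True
```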
Decomposition of $\mathbb{R}^n$
Lemma: Double Orthogonal Complement
For any subspace $S$ of $\mathbb{R}^n$, $(S^\perp)^\perp = S$. (The orthogonal complement of the orthogonal complement of a subspace is the subspace itself.)
Proof
Let $\{v_1, \dots, v_k\}$ and $\{w_1, \dots, w_m\}$ be bases for $S$ and $S^\perp$, respectively. We know $k + m = n$.
By definition of orthogonal complements, $v_i^T w_j = 0$ for all $i$ and $j$. Consider $(S^\perp)^\perp$: the set of all vectors orthogonal to every vector in $S^\perp$. Since $v_i^T w_j = 0$ for all $j$, each $v_i$ belongs to $(S^\perp)^\perp$. Therefore, $S \subseteq (S^\perp)^\perp$.
Now, $\dim(S) = k$ and $\dim((S^\perp)^\perp) = n - \dim(S^\perp) = n - m = k$. Since $S$ is contained within $(S^\perp)^\perp$ and they have the same dimension, we conclude $S = (S^\perp)^\perp$.
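A small numerical illustration of the argument (a sketch; the subspace is given as the column space of an arbitrary matrix `V`, and `scipy.linalg.null_space` builds bases of the complements):

```python
import numpy as np
from scipy.linalg import null_space

# S = column space of V, a 2-dimensional subspace of R^4 (arbitrary example)
V = np.array([[1., 0.],
              [1., 1.],
              [0., 1.],
              [2., 1.]])

W = null_space(V.T)  # columns form a basis of S^perp
U = null_space(W.T)  # columns form a basis of (S^perp)^perp

print(np.allclose(W.T @ V, 0))   # True: every basis vector of S lies in (S^perp)^perp
print(U.shape[1] == V.shape[1])  # True: dim((S^perp)^perp) = dim(S), so the spaces coincide
```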
Corollary: Decomposition of $\mathbb{R}^n$
For any subspace $S$ of $\mathbb{R}^n$, $\mathbb{R}^n = S \oplus S^\perp$. This means any vector in $\mathbb{R}^n$ can be written as the sum of a vector in $S$ and a vector in $S^\perp$.
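The decomposition can be computed explicitly (a sketch; `V` spans an arbitrary subspace $S$, and the coordinates of $x$ in the combined basis give the two components):

```python
import numpy as np
from scipy.linalg import null_space

# S = column space of V (arbitrary 2-dimensional subspace of R^4)
V = np.array([[1., 0.],
              [1., 1.],
              [0., 1.],
              [2., 1.]])
W = null_space(V.T)             # basis of S^perp
x = np.array([3., 1., 4., 1.])  # arbitrary vector to decompose

# [V | W] is a basis of R^4, so x has unique coordinates in it
B = np.hstack([V, W])
c = np.linalg.solve(B, x)
v = V @ c[:V.shape[1]]          # component in S
w = W @ c[V.shape[1]:]          # component in S^perp

print(np.allclose(v + w, x))    # True: x = v + w
print(np.allclose(V.T @ w, 0))  # True: w is orthogonal to S
```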
The Set of All Solutions to a System of Linear Equations
Corollary: Nullspace and Column Space of Transpose
For a matrix $A \in \mathbb{R}^{m \times n}$, $N(A)^\perp = C(A^T)$ and $C(A)^\perp = N(A^T)$.
Understanding Linear Systems
Consider the subspaces $N(A)$ (the nullspace of $A$) and $C(A^T)$ (the row space of $A$). These are orthogonal complements in $\mathbb{R}^n$. Consequently, any vector $x \in \mathbb{R}^n$ can be uniquely decomposed as $x = x_r + x_n$, where $x_r \in C(A^T)$ and $x_n \in N(A)$.
Theorem: Solution Set of $Ax = b$
The set of all solutions to the linear system $Ax = b$ is given by $\{x_p + x_n : x_n \in N(A)\}$, where $x_p$ is a particular solution satisfying $Ax_p = b$.
Proof
- ($\subseteq$) If $x$ is a solution, then $Ax = b = Ax_p$. This means $A(x - x_p) = 0$, so $x - x_p = x_n$ for some $x_n \in N(A)$, and thus $x = x_p + x_n$.
- ($\supseteq$) Conversely, if $x \in \{x_p + x_n : x_n \in N(A)\}$, then $x = x_p + x_n$ for some $x_n \in N(A)$. Then $Ax = A(x_p + x_n) = Ax_p + Ax_n = b + 0 = b$, showing that $x$ is indeed a solution.
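A numerical illustration (a sketch; the system below is an arbitrary consistent one with a nontrivial nullspace, `np.linalg.lstsq` supplies one particular solution, and `scipy.linalg.null_space` a basis of $N(A)$):

```python
import numpy as np
from scipy.linalg import null_space

# Arbitrary consistent system with rank 2 and n = 3, so dim N(A) = 1
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])
b = A @ np.array([1., 1., 1.])  # guarantees the system is solvable

x_p = np.linalg.lstsq(A, b, rcond=None)[0]  # one particular solution
N = null_space(A)                           # basis of N(A)

# x_p plus any nullspace vector is again a solution
for t in (0.0, 1.0, -3.5):
    x = x_p + t * N[:, 0]
    print(np.allclose(A @ x, b))  # True each time
```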
A Link Between the Nullspaces of $A$ and $A^TA$
Lemma: Nullspaces of $A$ and $A^TA$
For any matrix $A \in \mathbb{R}^{m \times n}$, the following holds:
- $N(A^TA) = N(A)$ (The nullspaces of $A^TA$ and $A$ are identical.)
- $C(A^TA) = C(A^T)$ (The column space of $A^TA$ is the same as the column space of $A^T$.)
Proof:
- $N(A^TA) = N(A)$:
  - $N(A) \subseteq N(A^TA)$: If $x$ is in the nullspace of $A$ ($Ax = 0$), then $A^TAx = A^T0 = 0$. This means $x$ is also in the nullspace of $A^TA$.
  - $N(A^TA) \subseteq N(A)$: If $x$ is in the nullspace of $A^TA$ ($A^TAx = 0$), pre-multiplying by $x^T$ gives $x^TA^TAx = \|Ax\|^2 = 0$. The squared norm of a vector is zero if and only if the vector itself is zero. Thus, $Ax = 0$, meaning $x$ is in the nullspace of $A$.
- $C(A^TA) = C(A^T)$: We know from previous results that $C(A^T) = N(A)^\perp$ (the row space of $A$ is the orthogonal complement of the nullspace of $A$) and, since $A^TA$ is symmetric, $C(A^TA) = C((A^TA)^T) = N(A^TA)^\perp$. Since we have just proven that $N(A^TA) = N(A)$, their orthogonal complements must also be equal. Therefore, $C(A^TA) = C(A^T)$.
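Both statements can be checked numerically (a sketch with an arbitrary rank-deficient matrix; ranks are used as a proxy for comparing column spaces):

```python
import numpy as np
from scipy.linalg import null_space

# Arbitrary 4x3 matrix of rank 2
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.],
              [0., 1., 1.]])
G = A.T @ A  # the Gram matrix A^T A

# Same nullspace: each matrix annihilates the other's nullspace basis
print(np.allclose(G @ null_space(A), 0))  # N(A) is contained in N(A^T A)
print(np.allclose(A @ null_space(G), 0))  # N(A^T A) is contained in N(A)

# Same column space: appending the columns of A^T to A^T A does not raise the rank
r = np.linalg.matrix_rank
print(r(np.hstack([G, A.T])) == r(G) == r(A.T))  # True
```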
Projections
Note for readers: This is a much more intuitive and simpler explanation of this part…
Definition: Projection of a Vector onto a Subspace
The projection of a vector $b \in \mathbb{R}^n$ onto a subspace $S \subseteq \mathbb{R}^n$ is the point in $S$ closest to $b$. Formally:
$$\Pi_S(b) = \arg\min_{p \in S} \|b - p\|$$
This is well-defined if the minimum exists and is unique.
The One-Dimensional Case
Lemma: Projection onto a Line
Let $a, b \in \mathbb{R}^n$ with $a \neq 0$. The projection of $b$ onto the line $S = \{\lambda a : \lambda \in \mathbb{R}\}$ is given by:
$$p = \frac{a^Tb}{a^Ta}\,a = \frac{aa^T}{a^Ta}\,b$$
Here's how we derive this formula, starting with the geometric intuition that the error vector ($b$ minus its projection) is orthogonal to $a$:
- Orthogonality Condition: The projection of $b$ onto the line spanned by $a$ is some scalar multiple of $a$. Let's call this scalar $\hat{x}$. So, the projection is $p = \hat{x}a$. The error vector is $e = b - \hat{x}a$. This error vector must be orthogonal to $a$. Mathematically, this orthogonality is expressed as:
  $$a^T(b - \hat{x}a) = 0$$
- Solving for $\hat{x}$: Expanding the dot product gives:
  $$a^Tb - \hat{x}\,a^Ta = 0$$
  Solving for $\hat{x}$:
  $$\hat{x} = \frac{a^Tb}{a^Ta}$$
- The Projection Formula: Substituting this value of $\hat{x}$ back into the expression for the projection ($p = \hat{x}a$) gives:
  $$p = \frac{a^Tb}{a^Ta}\,a = \frac{aa^T}{a^Ta}\,b$$
  The second form arises from recognizing that $\frac{a^Tb}{a^Ta}$ is a scalar, and scalar multiplication is commutative. $P = \frac{aa^T}{a^Ta}$ is often called the projection matrix (for projection onto the line spanned by $a$).
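The formula translates directly into a couple of lines of numpy (a sketch; the vectors `a` and `b` are arbitrary):

```python
import numpy as np

a = np.array([2., 1., 2.])
b = np.array([1., 1., 1.])

x_hat = (a @ b) / (a @ a)     # scalar coefficient a^T b / a^T a
p = x_hat * a                 # projection of b onto the line spanned by a

P = np.outer(a, a) / (a @ a)  # rank-1 projection matrix a a^T / a^T a
print(np.allclose(P @ b, p))  # True: same projection, written as a matrix-vector product
```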
Proof
Let $p \in S$, so $p = \lambda a$ for some scalar $\lambda$. Our goal is to find the value of $\lambda$ that minimizes the distance between $b$ and $p$, which is equivalent to minimizing the squared distance $\|b - \lambda a\|^2$.
- Expanding the Squared Distance: We expand the squared distance using the dot product:
  $$\|b - \lambda a\|^2 = (b - \lambda a)^T(b - \lambda a)$$
  Distributing the terms gives:
  $$b^Tb - \lambda\,a^Tb - \lambda\,b^Ta + \lambda^2\,a^Ta$$
  Since $a^Tb$ and $b^Ta$ are both scalars and equal to each other (the dot product is commutative), we can simplify this to:
  $$b^Tb - 2\lambda\,a^Tb + \lambda^2\,a^Ta$$
  Let's call this expression $f(\lambda)$.
- Minimizing $f(\lambda)$: To find the minimum value of $f(\lambda)$, we take its derivative with respect to $\lambda$ and set it equal to zero:
  $$f'(\lambda) = -2a^Tb + 2\lambda\,a^Ta = 0$$
- Solving for $\lambda$: Now we solve for $\lambda$:
  $$\lambda = \frac{a^Tb}{a^Ta}$$
  This value of $\lambda$ minimizes the squared distance. Let's call it $\hat{x}$.
- The Projection: Substitute $\hat{x}$ back into the expression for the projection $p = \hat{x}a$:
  $$p = \frac{a^Tb}{a^Ta}\,a$$
  This can also be written as:
  $$p = \frac{aa^T}{a^Ta}\,b$$
  because $a^Tb$ is a scalar and can be regrouped: $a\,(a^Tb) = (aa^T)\,b$. Here $aa^T$ is an $n \times n$ matrix, and $a^Ta$ is a scalar.
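To see the minimization argument concretely, one can compare the closed-form $\hat{x}$ with a brute-force search over $\lambda$ (a sketch; the vectors and the grid are arbitrary choices):

```python
import numpy as np

a = np.array([2., 1., 2.])
b = np.array([1., -2., 4.])

lam_star = (a @ b) / (a @ a)  # closed-form minimizer from setting f'(lambda) = 0

# Brute force: evaluate f(lambda) = ||b - lambda*a||^2 on a fine grid
lams = np.linspace(-5, 5, 100001)
f = ((b[None, :] - lams[:, None] * a[None, :])**2).sum(axis=1)

print(np.isclose(lams[np.argmin(f)], lam_star, atol=1e-3))  # True: grid minimum matches
```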
Intuition and Check
- Orthogonality of the Error Vector: The error vector $e = b - p$ is orthogonal to $a$. You can verify this by computing $a^T(b - p) = a^Tb - \frac{a^Tb}{a^Ta}\,a^Ta = 0$. This orthogonality is a fundamental property of projections.
- Projection of a Collinear Vector: If $b$ is already a multiple of $a$ (i.e., $b = ca$ lies on the line spanned by $a$), then the projection of $b$ onto the line should be $b$ itself. You can verify this using the formula: $\frac{a^T(ca)}{a^Ta}\,a = ca = b$. This makes intuitive sense, as the closest point on the line to a point already on the line is the point itself.
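Both checks are easy to reproduce numerically (a sketch; `a`, `b`, and the scalar multiple are arbitrary):

```python
import numpy as np

a = np.array([2., 1., 2.])
P = np.outer(a, a) / (a @ a)  # projection matrix onto the line spanned by a

# Check 1: the error vector b - Pb is orthogonal to a
b = np.array([1., -2., 4.])
print(np.isclose(a @ (b - P @ b), 0))  # True

# Check 2: a vector already on the line projects to itself
c = 3.0 * a
print(np.allclose(P @ c, c))  # True
```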
The General Case: Projection onto a Subspace
Lemma: Projection onto a Subspace
Let $S$ be an $m$-dimensional subspace of $\mathbb{R}^n$, and let $\{a_1, \dots, a_m\}$ be a basis for $S$. Form the matrix $A = [a_1 \; \cdots \; a_m] \in \mathbb{R}^{n \times m}$, so $S = C(A)$ (meaning $S$ is the column space of $A$). The projection of a vector $b \in \mathbb{R}^n$ onto $S$ is given by $p = A\hat{x}$, where $\hat{x}$ satisfies the normal equations:
$$A^TA\hat{x} = A^Tb$$
Proof
- Decomposition of $b$: We can decompose $b$ into two components: $b = p + e$, where $p$ is the projection of $b$ onto $S$ (so $p \in S$), and $e = b - p$ is the error vector, which is orthogonal to $S$ (so $e \in S^\perp$).
- Minimizing the Distance: The projection is the point in $S$ that is closest to $b$. This means we want to minimize the distance between $b$ and any arbitrary point $p'$ in $S$. We do this by minimizing the squared distance $\|b - p'\|^2$.
- Considering another point in $S$: Let $p'$ be any other point in $S$. Since both $p$ and $p'$ are in $S$, their difference $p - p'$ is also in $S$. Because $e = b - p$ is orthogonal to $S$, it is orthogonal to $p - p'$. This means their dot product is zero:
  $$(b - p)^T(p - p') = 0$$
- Expanding the Squared Distance: Now, let's expand the squared distance between $b$ and $p'$:
  $$\|b - p'\|^2 = \|(b - p) + (p - p')\|^2$$
  Using the Pythagorean theorem (which applies because $b - p$ and $p - p'$ are orthogonal), we get:
  $$\|b - p'\|^2 = \|b - p\|^2 + \|p - p'\|^2$$
  Since $\|p - p'\|^2$ is always non-negative, this entire expression is greater than or equal to $\|b - p\|^2$:
  $$\|b - p'\|^2 \geq \|b - p\|^2$$
- The Minimum Distance: The inequality above shows that the squared distance between $b$ and any point $p'$ in $S$ is always greater than or equal to $\|b - p\|^2$, the squared distance between $b$ and $p$. The minimum distance is achieved when $p' = p$.
- Expressing the Projection: Since $p$ is in the column space of $A$ ($p \in S = C(A)$), we can express $p$ as a linear combination of the columns of $A$: $p = A\hat{x}$ for some vector $\hat{x} \in \mathbb{R}^m$.
- Orthogonality and the Normal Equations: The error vector $e = b - A\hat{x}$ is orthogonal to $S$. This means $e$ is orthogonal to every vector in $S$, including the basis vectors that form the columns of $A$. So, for each column $a_i$:
  $$a_i^T(b - A\hat{x}) = 0$$
  This can be written compactly as:
  $$A^T(b - A\hat{x}) = 0$$
  Distributing gives us the normal equations:
  $$A^TA\hat{x} = A^Tb$$
  Solving this system of equations for $\hat{x}$ allows us to compute the projection $p = A\hat{x}$.
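In code, computing the projection amounts to one linear solve of the normal equations (a sketch; `A` is an arbitrary matrix with linearly independent columns, so $A^TA$ is invertible):

```python
import numpy as np

# Columns of A form a basis of a 2-dimensional subspace S of R^4 (arbitrary example)
A = np.array([[1., 0.],
              [1., 1.],
              [0., 1.],
              [2., 1.]])
b = np.array([3., 1., 4., 1.])

# Normal equations: A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat  # projection of b onto S = C(A)

print(np.allclose(A.T @ (b - p), 0))  # True: the error is orthogonal to every column of A
```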
Continue here: 17 Projections, Least Squares, Linear Regression