Lecture from 18.10.2024 | Video: Videos ETHZ
Calculating the Inverse
We can use Gauss-Jordan elimination to find the inverse of a matrix, if it exists. The method relies on the fact that if a matrix $A$ is invertible, then applying Gauss-Jordan elimination to the augmented matrix $[A \mid I]$ (where $I$ is the identity matrix) will result in $[I \mid A^{-1}]$.
Algorithm
- Augment the matrix: Form the augmented matrix $[A \mid I]$.
- Apply Gauss-Jordan elimination: Transform the left side of the augmented matrix ($A$) into the identity matrix using elementary row operations. Apply the same row operations to the right side ($I$).
- Extract the inverse: If the left side becomes $I$, then the right side is the inverse $A^{-1}$. If the left side cannot be transformed into $I$ (e.g., if a row of zeros appears on the left), then $A$ is not invertible.
Example
Let's find the inverse of the matrix:
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$$
- Augmented Matrix:
$$\left(\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array}\right)$$
- Gauss-Jordan Elimination:
  a. Divide row 1 by 2:
  $$\left(\begin{array}{cc|cc} 1 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 1 & 1 & 0 & 1 \end{array}\right)$$
  b. Subtract row 1 from row 2:
  $$\left(\begin{array}{cc|cc} 1 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} & -\tfrac{1}{2} & 1 \end{array}\right)$$
  c. Multiply row 2 by 2:
  $$\left(\begin{array}{cc|cc} 1 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 0 & 1 & -1 & 2 \end{array}\right)$$
  d. Subtract $\tfrac{1}{2}$ times row 2 from row 1:
  $$\left(\begin{array}{cc|cc} 1 & 0 & 1 & -1 \\ 0 & 1 & -1 & 2 \end{array}\right)$$
- Extract Inverse: The left side is now the identity matrix, so the right side is the inverse:
$$A^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$$
Why this works
The elementary row operations performed on the augmented matrix can be viewed as multiplying by a sequence of elementary matrices. If $A$ is invertible, the product of these elementary matrices will be $A^{-1}$. Thus, applying these operations transforms the left side to $A^{-1}A = I$ and the right side to $A^{-1}I = A^{-1}$.
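To make the algorithm concrete, here is a minimal Python/NumPy sketch of Gauss-Jordan inversion (the function name `gauss_jordan_inverse` is my own; it also adds a row swap for numerical robustness, which the hand example above did not need):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by reducing the augmented matrix [A | I] to [I | A^-1]."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])  # form [A | I]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        aug[[col, pivot]] = aug[[pivot, col]]  # swap pivot row into place
        aug[col] /= aug[col, col]              # scale so the pivot is 1
        for row in range(n):                   # clear the rest of the column
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]  # the right half is now A^-1

A = np.array([[2.0, 1.0], [1.0, 1.0]])
print(gauss_jordan_inverse(A))  # [[ 1. -1.], [-1.  2.]]
```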
LU and LUP Decomposition
LU and LUP decomposition are valuable tools for efficiently solving linear systems $Ax = b$ for multiple right-hand sides ($b$ vectors). Instead of performing Gaussian elimination for each $b$, we can decompose $A$ once and then use forward and backward substitution for each $b$.
LU Decomposition
LU decomposition factors a matrix $A$ into a lower triangular matrix $L$ and an upper triangular matrix $U$, such that $A = LU$. This decomposition is closely related to Gaussian elimination.
Connection to Gaussian Elimination
The entries of $L$ and $U$ are directly derived from the Gaussian elimination process. Specifically:
- U: $U$ is the upper triangular matrix that results after performing Gaussian elimination on $A$.
- L: $L$ is a lower triangular matrix where:
  - The diagonal entries are all 1s.
  - The entries below the diagonal are the multipliers ($l_{ij}$) used during elimination. $l_{ij}$ represents the multiple of row $j$ that was subtracted from row $i$ to eliminate the element in the $i$-th row and $j$-th column.
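A minimal NumPy sketch of this construction, assuming no row exchanges are needed (the helper name `lu_no_pivot` is mine, not from the script):

```python
import numpy as np

def lu_no_pivot(A):
    """LU decomposition via Gaussian elimination, assuming nonzero pivots."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)   # 1s on the diagonal; multipliers will go below it
    U = A.copy()    # will become the upper triangular elimination result
    for j in range(n - 1):
        if np.isclose(U[j, j], 0.0):
            raise ValueError("zero pivot: plain LU fails, pivoting needed")
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]  # multiplier l_ij
            U[i] -= L[i, j] * U[j]       # subtract l_ij * (row j) from row i
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))  # True
```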
Illustrative Example (3x3 case)
Let's illustrate with a 3x3 matrix $A$. Gaussian elimination involves multiplying $A$ by a series of elementary matrices (each representing a single row operation) to transform it into an upper triangular matrix $U$. These elementary matrices, when combined, form the inverse of $L$ ($L^{-1}$).
Consider the elementary matrices corresponding to the elimination steps, written with generic multipliers $l_{ij}$:
- Eliminating the first column (below the diagonal):
$$E_1 = \begin{pmatrix} 1 & 0 & 0 \\ -l_{21} & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad E_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -l_{31} & 0 & 1 \end{pmatrix}$$
- Eliminating the second column (below the diagonal):
$$E_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{32} & 1 \end{pmatrix}$$
Multiplying these elementary matrices gives us:
$$E_3 E_2 E_1 A = U$$
Therefore, $L^{-1} = E_3 E_2 E_1$.
Taking the inverse of $L^{-1}$ gives us $L$:
$$L = (E_3 E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} E_3^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix}$$
Thus, we arrive at the LU decomposition: $A = LU$.
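As a quick sanity check, the following NumPy snippet verifies that inverting the elementary matrices and multiplying them in reverse order stacks the multipliers below the diagonal (the concrete multiplier values are arbitrary, chosen only for illustration):

```python
import numpy as np

l21, l31, l32 = 2.0, 4.0, 3.0  # arbitrary example multipliers

E1 = np.array([[1, 0, 0], [-l21, 1, 0], [0, 0, 1]])
E2 = np.array([[1, 0, 0], [0, 1, 0], [-l31, 0, 1]])
E3 = np.array([[1, 0, 0], [0, 1, 0], [0, -l32, 1]])

# Each inverse just flips the sign of its multiplier; the reverse-order
# product collects the multipliers below the diagonal.
L = np.linalg.inv(E1) @ np.linalg.inv(E2) @ np.linalg.inv(E3)
print(L)
# [[1. 0. 0.]
#  [2. 1. 0.]
#  [4. 3. 1.]]
```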
General Case
This pattern holds for larger matrices as well. $L$ will always be a lower triangular matrix with 1s on the diagonal and the multipliers below the diagonal. $U$ will be the upper triangular result of Gaussian elimination. Here is a proof:
Theorem 3.13: Let $A$ be an $n \times n$ matrix on which Gaussian elimination succeeds without row exchanges, resulting in an upper triangular matrix $U$. Let $l_{ij}$ be the multiple of row $j$ that we subtract from row $i$ when we eliminate in column $j$. Then $A = LU$, where:
$$L_{ij} = \begin{cases} l_{ij} & \text{if } i > j \\ 1 & \text{if } i = j \\ 0 & \text{if } i < j \end{cases}$$
Proof (which I think is easier)
The proof relies on representing the row operations in Gaussian elimination as multiplication by elementary matrices.
- Elementary Matrices: Each row operation in Gaussian elimination (subtracting a multiple of one row from another) can be represented by an elementary matrix. For the operation of subtracting $l_{ij}$ times row $j$ from row $i$, the elementary matrix is the identity matrix with $-l_{ij}$ in the $(i, j)$ position.
- Gaussian Elimination as Matrix Multiplication: Gaussian elimination, without row exchanges, can be expressed as a sequence of multiplications by these elementary matrices. Let $E_1, E_2, \ldots, E_k$ be the elementary matrices representing the row operations performed to transform $A$ into $U$. Then:
$$E_k \cdots E_2 E_1 A = U$$
- Inverse of Elementary Matrices: The inverse of an elementary matrix is simply the identity matrix with $+l_{ij}$ in the $(i, j)$ position. This represents the reverse operation (adding $l_{ij}$ times row $j$ back to row $i$).
- Constructing L: The key insight is that the product of the inverses of the elementary matrices, in reverse order, forms the lower triangular matrix $L$:
$$L = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$$
Note: The reason we switch the order when inverting a product of matrices is simple: $(E_k \cdots E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$, because $(AB)^{-1} = B^{-1} A^{-1}$.
When we calculate this product, the multipliers "accumulate" in the appropriate positions below the diagonal of $L$, resulting in the structure defined in Theorem 3.13. The diagonal entries remain 1s because the diagonal entries of the inverse elementary matrices are also 1s.
- Completing the Proof: Substituting $L$ back into the equation for $U$:
$$L^{-1} A = U$$
Multiplying both sides by $L$ gives us:
$$A = LU$$
This proves that $A$ can be decomposed into the product of a lower triangular matrix $L$ (constructed from the Gaussian elimination multipliers) and an upper triangular matrix $U$ (the result of Gaussian elimination).
This proof demonstrates how the LU decomposition systematically captures the row operations performed during Gaussian elimination, allowing us to efficiently solve for multiple right-hand sides.
Proof (from the script)
- Focus on a single row: Consider a fixed row $i$ of $A$. Whenever we change row $i$, we subtract $l_{ij} u_j$ from it, for some previous row $j < i$. At this point, row $j$ is "finalized" (meaning it has been transformed into its final form $u_j$ in $U$).
- Expressing row $i$ of $A$: We can express row $i$ of $A$ at any point during Gaussian elimination as a linear combination of the finalized rows of $U$ (rows $1$ to $i-1$) and the current state of row $i$.
- Step-by-step breakdown:
  - Initially: Row $i$ of $A$ is just its initial value $a_i$.
  - Step 1: We subtract $l_{i1}$ times the first finalized row (row 1 of $U$) from row $i$ of $A$: $a_i - l_{i1} u_1$.
  - Step 2: We subtract $l_{i2}$ times the second finalized row (row 2 of $U$) from the current row $i$: $a_i - l_{i1} u_1 - l_{i2} u_2$.
  - And so on, until step $i-1$.
  - After step $i-1$: Row $i$ is finalized and becomes row $i$ of $U$:
$$u_i = a_i - l_{i1} u_1 - l_{i2} u_2 - \cdots - l_{i,i-1} u_{i-1}$$
- Rearranging the equation: We can rearrange the equation from step $i-1$ to express the original row $i$ of $A$ as:
$$a_i = l_{i1} u_1 + l_{i2} u_2 + \cdots + l_{i,i-1} u_{i-1} + u_i$$
- Matrix Notation: This linear combination can be written compactly using matrix notation:
$$a_i = \begin{pmatrix} l_{i1} & l_{i2} & \cdots & l_{i,i-1} & 1 & 0 & \cdots & 0 \end{pmatrix} U$$
This is a row vector multiplied by $U$.
- Constructing L: Stacking these row vector representations for all rows of $A$ (from $i = 1$ to $n$) forms the equation $A = LU$. The matrix $L$ is constructed as follows:
$$L_{ij} = \begin{cases} l_{ij} & \text{if } i > j \\ 1 & \text{if } i = j \\ 0 & \text{if } i < j \end{cases}$$
This construction of $L$, where the multipliers appear below the diagonal, directly reflects the row operations performed during Gaussian elimination. This completes the proof that $A = LU$ when Gaussian elimination succeeds without row exchanges.
Solving with LU
Once we have the decomposition $A = LU$, solving $Ax = b$ becomes a two-step process (see the sketch below):
- Forward Substitution: Solve $Ly = b$ for $y$. Since $L$ is lower triangular, this is done efficiently by solving for the first element, substituting it into the next equation to solve for the second, and so on.
- Backward Substitution: Solve $Ux = y$ for $x$. Since $U$ is upper triangular, this can similarly be solved efficiently, starting with the last variable.
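A minimal sketch of the two substitution steps (the function names are my own):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve Ly = b for lower triangular L, top row first."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        # Everything left of the diagonal is already known.
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_substitution(U, y):
    """Solve Ux = y for upper triangular U, bottom row first."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Everything right of the diagonal is already known.
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```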
Why LU Decomposition Works for Solving Ax = b
The LU decomposition provides an efficient way to solve $Ax = b$ because it breaks the problem down into two simpler steps that are easier to solve than the original system. Here's the breakdown:
- Decomposition: We decompose the original matrix $A$ into $L$ and $U$ such that $A = LU$. This step is analogous to the elimination phase of Gaussian elimination, where we transform $A$ into an upper triangular matrix. The matrix $L$ essentially stores the operations performed during this elimination process.
- Substitution: We now have the equation $LUx = b$. We introduce an intermediate vector $y$ such that $Ux = y$. This gives us two equations:
  - $Ly = b$: This is solved using forward substitution. Since $L$ is lower triangular, we can easily solve for $y_1$ from the first equation, substitute that value into the second equation to solve for $y_2$, and so on.
  - $Ux = y$: Once we have $y$, we solve for $x$ using backward substitution. Because $U$ is upper triangular, we can solve for $x_n$ from the last equation, substitute that value into the second-to-last equation to solve for $x_{n-1}$, and so on.
In essence, the LU decomposition restructures the Gaussian elimination process:
- The LU decomposition itself ($A = LU$) represents the elimination steps.
- The forward substitution ($Ly = b$) calculates the right-hand side of the simplified system after elimination.
- The backward substitution ($Ux = y$) performs the back-substitution phase of Gaussian elimination.
This method is more efficient when solving for multiple $b$ vectors because we only need to decompose $A$ once. For each subsequent $b$, we simply perform the forward and backward substitutions, which are significantly faster than repeating the entire Gaussian elimination process.
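Putting it together, reusing the hypothetical `lu_no_pivot`, `forward_substitution`, and `backward_substitution` sketches from above:

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

L, U = lu_no_pivot(A)  # decompose once: O(n^3)

# Each additional right-hand side costs only the two substitutions: O(n^2).
for b in (np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])):
    y = forward_substitution(L, b)
    x = backward_substitution(U, y)
    print(np.allclose(A @ x, b))  # True
```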
LUP Decomposition
When row exchanges are required during Gaussian elimination, a permutation matrix $P$ is introduced, giving the LUP decomposition: $PA = LU$.
Permutation Matrix Example
A permutation matrix is a square matrix used to represent row exchanges. Each row and column contains exactly one entry equal to $1$ and $0$s elsewhere. For example, if we swap the first and second rows of a $3 \times 3$ identity matrix $I$, the permutation matrix would be:
$$P = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Applying $P$ to a vector or matrix permutes its rows accordingly.
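For example, applying this $P$ in NumPy swaps the first two rows of a matrix:

```python
import numpy as np

P = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
print(P @ A)
# [[4 5 6]    <- rows 1 and 2 of A have been exchanged
#  [1 2 3]
#  [7 8 9]]
```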
Solving with LUP
- Permute $b$: Compute $b' = Pb$.
- Forward Substitution: Solve $Ly = b'$ for $y$.
- Backward Substitution: Solve $Ux = y$ for $x$ (see the SciPy sketch below).
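In practice, library routines bundle these steps. A sketch using SciPy's `lu_factor`/`lu_solve`, which compute an LUP factorization once and reuse it per right-hand side:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

lu, piv = lu_factor(A)  # one O(n^3) factorization, with row exchanges

for b in (np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])):
    x = lu_solve((lu, piv), b)    # O(n^2) per right-hand side
    print(np.allclose(A @ x, b))  # True
```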
When LU Decomposition Fails
LU decomposition (without pivoting) fails when a zero appears on the diagonal during elimination. This prevents the calculation of the multipliers needed for $L$. LUP decomposition addresses this by incorporating row exchanges.
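A concrete failure case, fed to the hypothetical `lu_no_pivot` sketch from earlier: the matrix below is invertible, yet its first pivot is zero, so elimination without row exchanges cannot even begin. Swapping the rows first makes the problem trivial.

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])  # invertible, but the (1,1) pivot is 0

try:
    lu_no_pivot(A)  # the no-pivot sketch from above
except ValueError as err:
    print(err)      # "zero pivot: plain LU fails, pivoting needed"

P = np.array([[0.0, 1.0],   # swap rows 1 and 2
              [1.0, 0.0]])
print(P @ A)  # the identity matrix: PA = LU with L = U = I
```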
Efficiency
- Decomposition: Both LU and LUP decomposition require $O(n^3)$ operations (or slightly faster using Strassen's algorithm and similar methods), similar to Gaussian elimination.
- Solving for multiple $b$: Once $L$, $U$, and $P$ are computed, solving $Ax = b$ for different $b$ vectors only takes $O(n^2)$ time (forward and backward substitution). This makes LU/LUP decomposition much more efficient than performing Gaussian elimination for each new $b$.
Continue here: 11 Gauss-Jordan, REF, RREF, Properties of REF