library(dasc2594)
library(tidyverse)
9 Matrix Inverses
For a nonzero scalar \(a\), the multiplicative inverse \(a^{-1} = \frac{1}{a}\) satisfies \[ a \frac{1}{a} = a a^{-1} = a^{-1} a = 1, \] where 1 is the multiplicative identity.
Definition 9.1 (Matrix Inverse) The \(n \times n\) square matrix \(\mathbf{A}\) is said to be invertible if there exists an \(n \times n\) matrix \(\mathbf{C}\) (which we call \(\mathbf{A}^{-1}\) once we verify the inverse exists) such that \[ \begin{aligned} \mathbf{C}\mathbf{A} = \mathbf{A} \mathbf{C} & = \mathbf{I} \\ \mathbf{A}^{-1} \mathbf{A} = \mathbf{A} \mathbf{A}^{-1} & = \mathbf{I} \end{aligned} \] where \(\mathbf{I}\) is the \(n \times n\) identity matrix (the matrix with 1s on the diagonal and zeros everywhere else).
In R, an \(n \times n\) identity matrix is easy to construct using the diag() function.
n <- 4
I <- diag(n)
I
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 1 0 0
[3,] 0 0 1 0
[4,] 0 0 0 1
Example 9.1 Check whether \(\mathbf{B}\) is the inverse of \(\mathbf{A}\), where
\[ \begin{aligned} \mathbf{A} = \begin{pmatrix} 1 & -1 \\ 2 & -3 \end{pmatrix} && \mathbf{B} = \begin{pmatrix} 3 & -1 \\ 2 & -1 \end{pmatrix} \end{aligned} \]
A <- matrix(c(1, 2, -1, -3), 2, 2)
B <- matrix(c(3, 2, -1, -1), 2, 2)
# check if B is the inverse of A
A %*% B
[,1] [,2]
[1,] 1 0
[2,] 0 1
# check the product in the other order
B %*% A
[,1] [,2]
[1,] 1 0
[2,] 0 1
Because \(\mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} = \mathbf{I}\), the matrix \(\mathbf{A}\) is invertible with inverse \(\mathbf{B} = \mathbf{A}^{-1}\).
Theorem 9.1 (Matrix Inverse for 2 by 2 matrix) Let \(\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\). If \(ad - bc \neq 0\) then \(\mathbf{A}\) is invertible and \[ \begin{aligned} \mathbf{A}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \end{aligned} \] If \(ad - bc = 0\), then the matrix is not invertible.
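To see Theorem 9.1 in action in R, here is a minimal sketch (the helper inverse_2x2 is our own name, not from any package) that applies the formula and stops when \(ad - bc = 0\):
inverse_2x2 <- function(A) {
  det_A <- A[1, 1] * A[2, 2] - A[1, 2] * A[2, 1]  # ad - bc
  if (det_A == 0) stop("matrix is singular")      # not invertible
  matrix(c(A[2, 2], -A[2, 1], -A[1, 2], A[1, 1]), 2, 2) / det_A
}
inverse_2x2(matrix(c(1, 2, -1, -3), 2, 2))  # recovers B from Example 9.1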
- Question: why is the matrix not invertible when \(ad - bc = 0\)?
- Have you heard of “singular” or “singularity” before?
- Black holes are called singularities. Why is this?
- Square matrices that are not invertible are called “singular.”
Definition 9.2 For the \(2 \times 2\) matrix \(\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\), the term \(ad - bc\) is called the determinant of the matrix \(\mathbf{A}\) and is written as \(\operatorname{det}(\mathbf{A})\). Sometimes the determinant is written as \(|\mathbf{A}|\).
A consequence of the above theorem is that a \(2 \times 2\) matrix is invertible if and only if its determinant is nonzero.
Example 9.2 Determine if the following \(2 \times 2\) matrix is invertible
\(\mathbf{A} = \begin{pmatrix} 4 & -4 \\ -1 & 2 \end{pmatrix}\)
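One way to check in R is with the built-in det() function (Definition 9.2):
A <- matrix(c(4, -1, -4, 2), 2, 2)
det(A)  # 4, which is nonzero, so A is invertible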
Theorem 9.2 If the \(n \times n\) matrix \(\mathbf{A}\) is invertible, then for each \(\mathbf{b} \in \mathcal{R}^n\), the matrix equation \[ \mathbf{A} \mathbf{x} = \mathbf{b} \] has the unique solution \(\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}\).
Example 9.3
Let \(\mathbf{A} = \begin{pmatrix} 4 & -4 & -2 \\ 5 & 2 & -5 \\ -4 & 6 & 1 \end{pmatrix}\) and \(\mathbf{b} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}\)
Find the solution to \(\mathbf{A} \mathbf{x} = \mathbf{b}\)
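In R, a sketch of the solution: solve(A) returns \(\mathbf{A}^{-1}\), and solve(A, b) solves the system directly.
A <- matrix(c(4, 5, -4, -4, 2, 6, -2, -5, 1), 3, 3)
b <- c(3, 1, 2)
solve(A) %*% b  # x = A^{-1} b
solve(A, b)     # the same solution without explicitly forming the inverse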
Theorem 9.3 (Invertible Matrix Theorem Again) Adding onto the Invertible Matrix Theorem (Theorem 9.5), we have:
1. If \(\mathbf{A}\) is an invertible matrix, then \(\mathbf{A}^{-1}\) is invertible and \((\mathbf{A}^{-1})^{-1} = \mathbf{A}\).
2. If \(\mathbf{A}\) and \(\mathbf{B}\) are \(n \times n\) invertible matrices, then \(\mathbf{A} \mathbf{B}\) is also an invertible matrix whose inverse is \[ (\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}, \] the product of the inverses in reverse order.
3. If \(\mathbf{A}\) is an invertible matrix, then the transpose \(\mathbf{A}'\) is also invertible and the inverse of \(\mathbf{A}'\) is the transpose of \(\mathbf{A}^{-1}\). Equivalently, \[ (\mathbf{A}')^{-1} = (\mathbf{A}^{-1})' \]
- Note: A consequence of Theorem 9.3 (2) is that the product of \(k\) invertible \(n \times n\) matrices \(\mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_k\) has inverse \(\mathbf{A}_k^{-1} \mathbf{A}_{k-1}^{-1} \cdots \mathbf{A}_1^{-1}\).
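These properties are easy to verify numerically in R; here is a small sketch (randomly generated matrices are invertible with probability one, an assumption we rely on here):
set.seed(2594)
A <- matrix(rnorm(9), 3, 3)
B <- matrix(rnorm(9), 3, 3)
all.equal(solve(A %*% B), solve(B) %*% solve(A))  # reverse-order property
all.equal(solve(t(A)), t(solve(A)))               # transpose property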
9.1 Elementary matrices
- Elementary matrices are matrices that perform basic row operations (i.e., we can write the reduced row echelon algorithm as a product of elementary matrices).
Recall the elementary row operations:
- swaps: swapping two rows.
- sums: replacing a row by the sum of itself and a multiple of another row.
- scalar multiplication: replacing a row by a scalar multiple times itself.
Example 9.4 Consider the \(3 \times 3\) matrix
A <- matrix(c(4, 5, 9, -2, -4, 1, 4, 6, -2), 3, 3)
\(\mathbf{A} = \begin{pmatrix} 4 & -2 & 4 \\ 5 & -4 & 6 \\ 9 & 1 & -2 \end{pmatrix}\)
- What is the elementary matrix (let’s call it \(\mathbf{E}_1\)) that swaps the first and second rows of \(\mathbf{A}\)?
E_1 <- matrix(c(0, 1, 0, 1, 0, 0, 0, 0, 1), 3, 3)
\(\mathbf{E}_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\)
A
[,1] [,2] [,3]
[1,] 4 -2 4
[2,] 5 -4 6
[3,] 9 1 -2
## left multiply A by E_1
E_1 %*% A
[,1] [,2] [,3]
[1,] 5 -4 6
[2,] 4 -2 4
[3,] 9 1 -2
Thus, the matrix \(\mathbf{E}_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\) is the matrix that swaps the first and second row.
- What is the elementary matrix (let’s call it \(\mathbf{E}_2\)) that adds two times the first row of \(\mathbf{A}\) to the third row of \(\mathbf{A}\)?
E_2 <- matrix(c(1, 0, 2, 0, 1, 0, 0, 0, 1), 3, 3)
\(\mathbf{E}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}\)
A
[,1] [,2] [,3]
[1,] 4 -2 4
[2,] 5 -4 6
[3,] 9 1 -2
## left multiply A by E_2
E_2 %*% A
[,1] [,2] [,3]
[1,] 4 -2 4
[2,] 5 -4 6
[3,] 17 -3 6
Thus, the matrix \(\mathbf{E}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}\) is the matrix that adds two times the first row of \(\mathbf{A}\) to the third row of \(\mathbf{A}\).
- What is the elementary matrix (let’s call it \(\mathbf{E}_3\)) that multiplies the second row of \(\mathbf{A}\) by 3?
E_3 <- matrix(c(1, 0, 0, 0, 3, 0, 0, 0, 1), 3, 3)
\(\mathbf{E}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}\)
A
[,1] [,2] [,3]
[1,] 4 -2 4
[2,] 5 -4 6
[3,] 9 1 -2
## left multiply A by E_3
E_3 %*% A
[,1] [,2] [,3]
[1,] 4 -2 4
[2,] 15 -12 18
[3,] 9 1 -2
Thus, the matrix \(\mathbf{E}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}\) is the matrix that multiplies the second row of \(\mathbf{A}\) by 3.
- Question: Do you see any patterns with how the example elementary matrices look?
\[ \begin{aligned} \mathbf{E_1} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} && \mathbf{E_2} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix} && \mathbf{E_3} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \end{aligned} \]
- The elementary matrices look like the identity matrix \(\mathbf{I}\) with an elementary row operation applied to \(\mathbf{I}\). In fact, this leads us to this general fact:
Fact: If an elementary row operation is applied to the \(m \times n\) matrix \(\mathbf{A}\), the result can be written as \(\mathbf{E} \mathbf{A}\), where \(\mathbf{E}\) is the \(m \times m\) identity matrix \(\mathbf{I}\) with that same elementary row operation applied to \(\mathbf{I}\).
Fact: Each elementary matrix \(\mathbf{E}\) is invertible, because every elementary row operation can be undone by another elementary row operation of the same type.
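We can check this in R with the elementary matrices from Example 9.4: each inverse is itself an elementary matrix that undoes the original row operation.
solve(E_1)  # a swap is its own inverse
solve(E_2)  # subtracts two times row 1 from row 3
solve(E_3)  # multiplies row 2 by 1/3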
Example 9.5 In class
The next theorem is quite important, as the result gives an algorithm for calculating the inverse of an \(n \times n\) matrix \(\mathbf{A}\), which also makes it possible to solve matrix equations \(\mathbf{A}\mathbf{x} = \mathbf{b}\).
Theorem 9.4 If an \(n \times n\) matrix \(\mathbf{A}\) is invertible, then \(\mathbf{A}\) is row-equivalent to \(\mathbf{I}\) (\(\mathbf{A} \sim \mathbf{I}\); row-equivalent means \(\mathbf{A}\) can be reduced to \(\mathbf{I}\) using elementary row operations). The row-equivalence implies that there is a series of elementary matrices \(\mathbf{E}_1, \ldots, \mathbf{E}_k\) that converts \(\mathbf{A}\) to \(\mathbf{I}\). In addition, applying these same elementary matrices to \(\mathbf{I}\) transforms \(\mathbf{I}\) into the matrix inverse \(\mathbf{A}^{-1}\).
- Proof: in class
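As a concrete illustration in R, here is a sketch using the \(2 \times 2\) matrix from Example 9.1 (the elementary matrices below are one possible choice of row reduction, not the only one):
A  <- matrix(c(1, 2, -1, -3), 2, 2)
E1 <- matrix(c(1, -2, 0, 1), 2, 2)  # R2 <- R2 - 2 R1
E2 <- matrix(c(1, 0, 0, -1), 2, 2)  # R2 <- -1 * R2
E3 <- matrix(c(1, 0, 1, 1), 2, 2)   # R1 <- R1 + R2
E3 %*% E2 %*% E1 %*% A              # reduces A to the identity I
E3 %*% E2 %*% E1                    # the same operations applied to I give A^(-1)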
9.2 Finding the inverse of \(\mathbf{A}\)
The previous theorem states that for an \(n \times n\) invertible matrix \(\mathbf{A}\), the elementary row operations that convert \(\mathbf{A}\) to \(\mathbf{I}\) also convert \(\mathbf{I}\) to \(\mathbf{A}^{-1}\). This suggests an algorithm for finding the inverse \(\mathbf{A}^{-1}\) of \(\mathbf{A}\):
Create the augmented matrix \(\begin{pmatrix} \mathbf{A} & \mathbf{I} \end{pmatrix}\) and row reduce it. If the row-reduced augmented matrix has the form \(\begin{pmatrix} \mathbf{I} & \mathbf{C} \end{pmatrix}\), then \(\mathbf{C} = \mathbf{A}^{-1}\). If the left block of the row-reduced augmented matrix is not the identity matrix \(\mathbf{I}\), then \(\mathbf{A}\) is not row equivalent to \(\mathbf{I}\) and is therefore not invertible.
Example 9.6 Let \(\mathbf{A} = \begin{pmatrix} -3 & -3 & -4 \\ -4 & 2 & -4 \\ 4 & -4 & 4 \end{pmatrix}\). Does \(\mathbf{A}\) have an inverse, and if so, what is it?
Using R
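A minimal sketch follows; solve() raises an error when the matrix is singular, mirroring row reduction failing to reach \(\mathbf{I}\).
A <- matrix(c(-3, -4, 4, -3, 2, -4, -4, -4, 4), 3, 3)
cbind(A, diag(3))  # the augmented matrix (A | I); row reducing it carries I to A^(-1)
solve(A)           # in practice, solve() computes the inverse directly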
9.3 The Invertible Matrix Theorem
Theorem 9.5 (The Invertible Matrix Theorem) Let \(\mathbf{A}\) be an \(n \times n\) matrix. Then the following statements are equivalent (i.e., they are all either simultaneously true or false).
1. \(\mathbf{A}\) is an invertible matrix.
2. \(\mathbf{A}\) is row equivalent to the \(n \times n\) identity matrix \(\mathbf{I}\) (\(\mathbf{A} \sim \mathbf{I}\)).
3. \(\mathbf{A}\) has \(n\) pivot columns.
4. The homogeneous matrix equation \(\mathbf{A} \mathbf{x} = \mathbf{0}\) has only the trivial solution \(\mathbf{x} = \mathbf{0}\).
5. The columns of \(\mathbf{A}\) are linearly independent.
6. The linear transformation \(T:\mathcal{R}^n \rightarrow \mathcal{R}^n\) given by the matrix transformation \(\mathbf{x} \rightarrow \mathbf{A}\mathbf{x}\) is one-to-one.
7. The inhomogeneous matrix equation \(\mathbf{A} \mathbf{x} = \mathbf{b}\) has a unique solution for all \(\mathbf{b} \in \mathcal{R}^n\).
8. The columns of \(\mathbf{A}\) span \(\mathcal{R}^n\).
9. The linear transformation \(\mathbf{x} \rightarrow \mathbf{A} \mathbf{x}\) maps \(\mathcal{R}^n\) onto \(\mathcal{R}^n\).
10. There is an \(n \times n\) matrix \(\mathbf{C}\) such that \(\mathbf{C}\mathbf{A} = \mathbf{I}\).
11. There is an \(n \times n\) matrix \(\mathbf{D}\) such that \(\mathbf{A}\mathbf{D} = \mathbf{I}\).
12. \(\mathbf{A}'\) is an invertible matrix.
A result of the invertible matrix theorem is that if \(\mathbf{A}\) and \(\mathbf{B}\) are \(n \times n\) matrices with \(\mathbf{A} \mathbf{B} = \mathbf{I}\) then \(\mathbf{A} = \mathbf{B}^{-1}\) and \(\mathbf{B} = \mathbf{A}^{-1}\).
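This is convenient computationally: checking one product suffices. A quick check in R with the matrices from Example 9.1:
A <- matrix(c(1, 2, -1, -3), 2, 2)
B <- matrix(c(3, 2, -1, -1), 2, 2)
all.equal(A %*% B, diag(2))  # TRUE, so B = A^(-1) without also checking B %*% A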
9.4 Invertible Linear Transformations
Definition 9.3 A linear transformation \(T:\mathcal{R}^n \rightarrow \mathcal{R}^n\) is said to be invertible if there exists a transformation \(S:\mathcal{R}^n \rightarrow \mathcal{R}^n\) such that
\[ \begin{aligned} S(T(\mathbf{x})) &= \mathbf{x} && \mbox{for all } \mathbf{x} \in \mathcal{R}^n \\ T(S(\mathbf{x})) &= \mathbf{x} && \mbox{for all } \mathbf{x} \in \mathcal{R}^n \end{aligned} \]
- Draw figure in class
Theorem 9.6 Let \(T:\mathcal{R}^n \rightarrow \mathcal{R}^n\) be a linear transformation and let \(\mathbf{A}\) be the matrix representing the transformation \(T\). Then the transformation \(T\) is invertible if and only if the matrix \(\mathbf{A}\) is invertible. In that case, the inverse transformation \(S:\mathcal{R}^n \rightarrow \mathcal{R}^n\) is unique and is represented by the matrix \(\mathbf{A}^{-1}\).
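In R, a brief sketch with the matrix from Example 9.1: applying \(T\) and then \(S\) returns the original vector.
A <- matrix(c(1, 2, -1, -3), 2, 2)  # represents T
x <- c(2, 5)
T_x <- A %*% x    # apply T
solve(A) %*% T_x  # applying S, represented by A^(-1), recovers x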