Chapter 9 Matrix Inverses
For a nonzero scalar \(a\), the multiplicative inverse satisfies \[ a \frac{1}{a} = a a^{-1} = a^{-1} a = 1 \] where \(a^{-1}\) is the inverse of \(a\) and 1 is the multiplicative identity.
Definition 9.1 (Matrix Inverse) The \(n \times n\) square matrix \(\mathbf{A}\) is said to be invertible if there exists an \(n \times n\) matrix \(\mathbf{C}\) (which we call \(\mathbf{A}^{-1}\) once we verify the inverse exists) such that \[ \begin{aligned} \mathbf{C}\mathbf{A} = \mathbf{A} \mathbf{C} & = \mathbf{I} \\ \mathbf{A}^{-1} \mathbf{A} = \mathbf{A} \mathbf{A}^{-1} & = \mathbf{I} \end{aligned} \] where \(\mathbf{I}\) is the \(n \times n\) identity matrix (the matrix with 1s on the diagonal and zeros everywhere else).
In R, an identity matrix is easy to construct: an \(n \times n\) identity matrix can be built with the diag() function, for example diag(4) for \(n = 4\):
## [,1] [,2] [,3] [,4]
## [1,] 1 0 0 0
## [2,] 0 1 0 0
## [3,] 0 0 1 0
## [4,] 0 0 0 1
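A minimal sketch of the diag() call, together with a check that multiplying by the identity leaves a matrix unchanged. The matrix `M` is a made-up illustration and does not come from the text.

```r
# diag(n) with a single positive integer n returns the n x n identity matrix
I4 <- diag(4)
I4

# M is an arbitrary 4 x 4 matrix, chosen only for illustration
M <- matrix(as.numeric(1:16), nrow = 4)

# multiplying by the identity on either side returns M
all.equal(I4 %*% M, M)
all.equal(M %*% I4, M)
```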
Example 9.1 \[ \begin{aligned} \mathbf{A} = \begin{pmatrix} 1 & -1 \\ 2 & -3 \end{pmatrix} && \mathbf{B} = \begin{pmatrix} 3 & -1 \\ 2 & -1 \end{pmatrix} \end{aligned} \]
## [,1] [,2]
## [1,] 1 0
## [2,] 0 1
## [,1] [,2]
## [1,] 1 0
## [2,] 0 1
Because \(\mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} = \mathbf{I}\), the matrix \(\mathbf{A}\) is invertible with inverse \(\mathbf{A}^{-1} = \mathbf{B}\).
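The two products in Example 9.1 can be reproduced with code along these lines. This is a sketch; the final comparison against `solve()`, base R's routine for computing an inverse, is an extra check that is not part of the example itself.

```r
# matrices from Example 9.1, entered row by row
A <- matrix(c(1, -1,
              2, -3), nrow = 2, byrow = TRUE)
B <- matrix(c(3, -1,
              2, -1), nrow = 2, byrow = TRUE)

# %*% is matrix multiplication (a plain * would multiply entry by entry)
A %*% B
B %*% A

# solve(A) with a single argument returns the inverse of A
all.equal(solve(A), B)
```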
Theorem 9.1 (Matrix Inverse for 2 by 2 matrix) Let \(\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\). If \(ad - bc \neq 0\) then \(\mathbf{A}\) is invertible and \[ \begin{aligned} \mathbf{A}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \end{aligned} \] If \(ad - bc = 0\), then the matrix is not invertible.
- Question: why is the matrix not invertible when \(ad - bc = 0\)?
- Have you heard of “singular” or “singularity” before?
- Black holes are called singularities. Why is this?
- Square matrices that are not invertible are called “singular”.
Definition 9.2 For the \(2 \times 2\) matrix \(\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\), the term \(ad - bc\) is called the determinant of the matrix \(\mathbf{A}\) and is written as \(\operatorname{det}(\mathbf{A})\). Sometimes the determinant is written as \(| \mathbf{A}|\).
A consequence of the above theorem is that a \(2 \times 2\) matrix is invertible if and only if its determinant is nonzero.
Example 9.2 Determine if the following \(2 \times 2\) matrix is invertible
\(\mathbf{A} = \begin{pmatrix} 4 & -4 \\ -1 & 2 \end{pmatrix}\)
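One way to work Example 9.2 is to apply Theorem 9.1 directly. The helper `inverse_2x2()` below is a hypothetical name introduced only for this sketch; it implements the \(2 \times 2\) formula and stops when \(ad - bc = 0\).

```r
# hypothetical helper: the 2 x 2 inverse formula from Theorem 9.1
inverse_2x2 <- function(M) {
  stopifnot(is.matrix(M), all(dim(M) == c(2, 2)))
  # determinant ad - bc
  det_M <- M[1, 1] * M[2, 2] - M[1, 2] * M[2, 1]
  if (det_M == 0) {
    stop("ad - bc = 0, so the matrix is not invertible")
  }
  # swap the diagonal entries and negate the off-diagonal entries
  swapped <- matrix(c( M[2, 2], -M[1, 2],
                      -M[2, 1],  M[1, 1]), nrow = 2, byrow = TRUE)
  swapped / det_M
}

# matrix from Example 9.2
A <- matrix(c( 4, -4,
              -1,  2), nrow = 2, byrow = TRUE)

# ad - bc = (4)(2) - (-4)(-1) = 4, which is nonzero, so A is invertible
A[1, 1] * A[2, 2] - A[1, 2] * A[2, 1]

A_inv <- inverse_2x2(A)
A_inv
A_inv %*% A
```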
Theorem 9.2 If the \(n \times n\) matrix \(\mathbf{A}\) is invertible, then for each \(\mathbf{b} \in \mathcal{R}^n\), the matrix equation \[ \mathbf{A} \mathbf{x} = \mathbf{b} \] has the unique solution \(\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}\).
Proof. There are two things to show …
show there is a solution
show the solution is unique
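One way to fill in the two steps, using only Definition 9.1, is sketched here; the vector \(\mathbf{u}\) is a name introduced for this sketch. For existence, substitute \(\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}\) into the equation: \[ \mathbf{A} (\mathbf{A}^{-1} \mathbf{b}) = (\mathbf{A} \mathbf{A}^{-1}) \mathbf{b} = \mathbf{I} \mathbf{b} = \mathbf{b} \] so \(\mathbf{A}^{-1} \mathbf{b}\) is a solution. For uniqueness, suppose \(\mathbf{u}\) is any solution, so that \(\mathbf{A} \mathbf{u} = \mathbf{b}\). Multiplying both sides on the left by \(\mathbf{A}^{-1}\) gives \[ \mathbf{u} = \mathbf{I} \mathbf{u} = \mathbf{A}^{-1} \mathbf{A} \mathbf{u} = \mathbf{A}^{-1} \mathbf{b} \] so every solution equals \(\mathbf{A}^{-1} \mathbf{b}\).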
Example 9.3 Let \(\mathbf{A} = \begin{pmatrix} 4 & -4 & -2 \\ 5 & 2 & -5 \\ -4 & 6 & 1 \end{pmatrix}\) and \(\mathbf{b} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}\)
Find the solution to \(\mathbf{A} \mathbf{x} = \mathbf{b}\)
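A sketch of how Example 9.3 might be solved in R. With two arguments, `solve(A, b)` solves \(\mathbf{A} \mathbf{x} = \mathbf{b}\) directly; with one argument it returns \(\mathbf{A}^{-1}\), which matches the formula in Theorem 9.2. Both calls stop with an error if \(\mathbf{A}\) is singular.

```r
# matrix and right-hand side from Example 9.3
A <- matrix(c( 4, -4, -2,
               5,  2, -5,
              -4,  6,  1), nrow = 3, byrow = TRUE)
b <- c(3, 1, 2)

# solve the system directly
x <- solve(A, b)
x

# the same solution written as x = A^{-1} b, as in Theorem 9.2
x_via_inverse <- solve(A) %*% b
x_via_inverse

# check: A x should reproduce b up to floating point error
A %*% x
```

Solving the system directly with `solve(A, b)` is generally preferred numerically over forming the inverse first.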
1. If \(\mathbf{A}\) is an invertible matrix, then \(\mathbf{A}^{-1}\) is invertible and \((\mathbf{A}^{-1})^{-1} = \mathbf{A}\).
2. If \(\mathbf{A}\) and \(\mathbf{B}\) are \(n \times n\) invertible matrices, then \(\mathbf{A} \mathbf{B}\) is also an invertible matrix whose inverse is \[ (\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1} \] which is the product of the inverses in reverse order.
3. If \(\mathbf{A}\) is an invertible matrix, then the transpose \(\mathbf{A}'\) is also invertible and the inverse of \(\mathbf{A}'\) is the transpose of \(\mathbf{A}^{-1}\). Equivalently, \[ (\mathbf{A}')^{-1} = (\mathbf{A}^{-1})' \]
Proof. Here we prove the three statements from the theorem above. All three statements rely on the definition of an invertible matrix in Definition 9.1
1. To show that \(\mathbf{A}^{-1}\) is invertible, we need a matrix \(\mathbf{C}\) such that \(\mathbf{C} \mathbf{A}^{-1} = \mathbf{A}^{-1} \mathbf{C} = \mathbf{I}\). Let \(\mathbf{C} = \mathbf{A}\). Then, by the definition of \(\mathbf{A}^{-1}\), we have \(\mathbf{A} \mathbf{A}^{-1} = \mathbf{A}^{-1} \mathbf{A} = \mathbf{I}\), which shows that \(\mathbf{A}^{-1}\) is invertible with \(\left(\mathbf{A}^{-1}\right)^{-1} = \mathbf{A}\).
2. First, multiply \(\mathbf{A}\mathbf{B}\) on the right by \(\mathbf{B}^{-1} \mathbf{A}^{-1}\): \((\mathbf{A}\mathbf{B}) (\mathbf{B}^{-1} \mathbf{A}^{-1}) = \mathbf{A} (\mathbf{B} \mathbf{B}^{-1}) \mathbf{A}^{-1} = \mathbf{A} \mathbf{I} \mathbf{A}^{-1} = \mathbf{A} \mathbf{A}^{-1} = \mathbf{I}\). Then multiply \(\mathbf{A}\mathbf{B}\) on the left by \(\mathbf{B}^{-1} \mathbf{A}^{-1}\): \((\mathbf{B}^{-1} \mathbf{A}^{-1}) (\mathbf{A}\mathbf{B}) = \mathbf{B}^{-1} (\mathbf{A}^{-1} \mathbf{A}) \mathbf{B} = \mathbf{B}^{-1} \mathbf{I} \mathbf{B} = \mathbf{B}^{-1} \mathbf{B} = \mathbf{I}\). Thus \(\mathbf{A}\mathbf{B}\) is invertible with inverse \(\mathbf{B}^{-1} \mathbf{A}^{-1}\).
3. Use the fact that \((\mathbf{A} \mathbf{B})' = \mathbf{B}' \mathbf{A}'\). Then, \((\mathbf{A}^{-1})' \mathbf{A}' = (\mathbf{A}\mathbf{A}^{-1})' = \mathbf{I}' = \mathbf{I}\). Similarly \(\mathbf{A}'(\mathbf{A}^{-1})' = (\mathbf{A}^{-1}\mathbf{A})' = \mathbf{I}' = \mathbf{I}\). Thus \(\mathbf{A}'\) is invertible with inverse \((\mathbf{A}^{-1})'\).
- Note: A consequence of statement 2 of the theorem above is that the product of \(k\) invertible \(n \times n\) matrices \(\mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_k\) has inverse \(\mathbf{A}_k^{-1} \mathbf{A}_{k-1}^{-1} \cdots \mathbf{A}_1^{-1}\).
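The three properties can be spot-checked numerically. This is an illustration, not a proof. The sketch reuses the matrix from Example 9.1 as `A` and the matrix from Example 9.2 as `B`; that pairing is an arbitrary choice made here.

```r
A <- matrix(c(1, -1,
              2, -3), nrow = 2, byrow = TRUE)   # Example 9.1
B <- matrix(c( 4, -4,
              -1,  2), nrow = 2, byrow = TRUE)  # Example 9.2

# 1. the inverse of the inverse is the original matrix
all.equal(solve(solve(A)), A)

# 2. the inverse of a product is the product of the inverses in reverse order
all.equal(solve(A %*% B), solve(B) %*% solve(A))

# 3. the inverse of the transpose is the transpose of the inverse
all.equal(solve(t(A)), t(solve(A)))

# the order in statement 2 matters: for this A and B the unreversed
# product solve(A) %*% solve(B) is a different matrix
isTRUE(all.equal(solve(A %*% B), solve(A) %*% solve(B)))
```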
9.1 Elementary matrices
- Elementary matrices are matrices that perform basic row operations (i.e., we can write the reduced row echelon algorithm as a product of elementary matrices).
Recall the elementary row operations:
- swaps: swapping two rows.
- sums: replacing a row by the sum of itself and a multiple of another row.
- scalar multiplication: replacing a row by a nonzero scalar multiple of itself.
Example: Consider a \(3 \times 3\) matrix
\(\mathbf{A} = \begin{pmatrix} 4 & -2 & 4 \\ 5 & -4 & 6 \\ 9 & 1 & -2 \end{pmatrix}\)
- What is the elementary matrix (let’s call it \(\mathbf{E}_1\)) that swaps the first and second rows of \(\mathbf{A}\)?
\(\mathbf{E}_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\)
## [,1] [,2] [,3]
## [1,] 4 -2 4
## [2,] 5 -4 6
## [3,] 9 1 -2
## [,1] [,2] [,3]
## [1,] 5 -4 6
## [2,] 4 -2 4
## [3,] 9 1 -2
Thus, the matrix \(\mathbf{E}_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\) is the matrix that swaps the first and second row.
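The swap can be reproduced with a few lines of R. This is a sketch in which the variable names `A` and `E1` simply mirror the notation in the text.

```r
# the 3 x 3 example matrix, entered row by row
A <- matrix(c(4, -2,  4,
              5, -4,  6,
              9,  1, -2), nrow = 3, byrow = TRUE)

# elementary matrix that swaps rows 1 and 2
E1 <- matrix(c(0, 1, 0,
               1, 0, 0,
               0, 0, 1), nrow = 3, byrow = TRUE)

A
# multiplying on the left by E1 swaps the first two rows of A
E1 %*% A
```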
- What is the elementary matrix (let’s call it \(\mathbf{E}_2\)) that adds two times the first row of \(\mathbf{A}\) to the third row of \(\mathbf{A}\)?
\(\mathbf{E}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}\)
## [,1] [,2] [,3]
## [1,] 4 -2 4
## [2,] 5 -4 6
## [3,] 9 1 -2
## [,1] [,2] [,3]
## [1,] 4 -2 4
## [2,] 5 -4 6
## [3,] 17 -3 6
Thus, the matrix \(\mathbf{E}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}\) is the matrix that adds two times the first row of \(\mathbf{A}\) to the third row of \(\mathbf{A}\).
- What is the elementary matrix (let’s call it \(\mathbf{E}_3\)) that multiplies the second row of \(\mathbf{A}\) by 3?
\(\mathbf{E}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}\)
## [,1] [,2] [,3]
## [1,] 4 -2 4
## [2,] 5 -4 6
## [3,] 9 1 -2
## [,1] [,2] [,3]
## [1,] 4 -2 4
## [2,] 15 -12 18
## [3,] 9 1 -2
Thus, the matrix \(\mathbf{E}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}\) is the matrix that multiplies the second row of \(\mathbf{A}\) by 3.
Question: Do you see any patterns with how the example elementary matrices look?
\[ \begin{aligned} \mathbf{E_1} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} && \mathbf{E_2} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix} && \mathbf{E_3} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \end{aligned} \]
- The elementary matrices look like the identity matrix \(\mathbf{I}\) with an elementary row operation applied to \(\mathbf{I}\). In fact, this leads us to this general fact:
Fact: If an elementary row operation is applied to the \(m \times n\) matrix \(\mathbf{A}\), the result can be written as \(\mathbf{E} \mathbf{A}\), where \(\mathbf{E}\) is the \(m \times m\) identity matrix \(\mathbf{I}\) with the same elementary row operation applied to it.
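This fact suggests a recipe for building elementary matrices in R: start from `diag(m)` and apply the row operation to it. The sketch below does this for the three operations used above and compares \(\mathbf{E} \mathbf{A}\) with the same operation applied directly to the rows of \(\mathbf{A}\). The names `A_swap`, `A_sum`, and `A_scale` are introduced only for this illustration.

```r
A <- matrix(c(4, -2,  4,
              5, -4,  6,
              9,  1, -2), nrow = 3, byrow = TRUE)
I3 <- diag(3)

# swap: reorder the rows of the identity
E1 <- I3[c(2, 1, 3), ]

# sum: add 2 times row 1 of the identity to row 3
E2 <- I3
E2[3, ] <- E2[3, ] + 2 * E2[1, ]

# scalar multiplication: multiply row 2 of the identity by 3
E3 <- I3
E3[2, ] <- 3 * E3[2, ]

# the same operations applied directly to A
A_swap  <- A[c(2, 1, 3), ]
A_sum   <- A; A_sum[3, ]   <- A[3, ] + 2 * A[1, ]
A_scale <- A; A_scale[2, ] <- 3 * A[2, ]

all.equal(E1 %*% A, A_swap)
all.equal(E2 %*% A, A_sum)
all.equal(E3 %*% A, A_scale)
```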
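Fact: Each elementary matrix \(\mathbf{E}\) is invertible, because every elementary row operation can be undone by another elementary row operation of the same type.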
Example: in class
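The class example is not reproduced here. As a sketch of what such an example might look like, the inverses of the three elementary matrices from above can be computed with `solve()`. Each inverse is itself an elementary matrix that undoes the original operation: swap back, subtract two times row 1 from row 3, and multiply row 2 by 1/3.

```r
E1 <- matrix(c(0, 1, 0,
               1, 0, 0,
               0, 0, 1), nrow = 3, byrow = TRUE)  # swap rows 1 and 2
E2 <- matrix(c(1, 0, 0,
               0, 1, 0,
               2, 0, 1), nrow = 3, byrow = TRUE)  # add 2 * row 1 to row 3
E3 <- matrix(c(1, 0, 0,
               0, 3, 0,
               0, 0, 1), nrow = 3, byrow = TRUE)  # multiply row 2 by 3

solve(E1)  # swapping twice restores the original order, so E1 is its own inverse
solve(E2)  # subtracts 2 * row 1 from row 3
solve(E3)  # multiplies row 2 by 1/3
```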
The next theorem is quite important, as the result gives an algorithm for calculating the inverse of an \(n \times n\) matrix \(\mathbf{A}\), which also makes it possible to solve matrix equations \(\mathbf{A}\mathbf{x} = \mathbf{b}\).
Theorem 9.3 If an \(n \times n\) matrix \(\mathbf{A}\) is invertible, then \(\mathbf{A}\) is row-equivalent to \(\mathbf{I}\) (\(\mathbf{A} \sim \mathbf{I}\); row-equivalent means \(\mathbf{A}\) can be reduced to \(\mathbf{I}\) using elementary row operations). Row equivalence implies that there is a sequence of elementary row operations (i.e., elementary matrices \(\mathbf{E}_1, \ldots, \mathbf{E}_k\)) that converts \(\mathbf{A}\) to \(\mathbf{I}\), so that \(\mathbf{E}_k \cdots \mathbf{E}_1 \mathbf{A} = \mathbf{I}\). In addition, applying these same elementary matrices to \(\mathbf{I}\) transforms \(\mathbf{I}\) into the matrix inverse: \(\mathbf{A}^{-1} = \mathbf{E}_k \cdots \mathbf{E}_1 \mathbf{I}\).
- Proof: in class
9.2 Finding the inverse of \(\mathbf{A}\)
The previous theorem states that for an \(n \times n\) invertible matrix \(\mathbf{A}\), the elementary row operations that convert \(\mathbf{A}\) to \(\mathbf{I}\) also convert \(\mathbf{I}\) to \(\mathbf{A}^{-1}\). This suggests an algorithm for finding the inverse \(\mathbf{A}^{-1}\) of \(\mathbf{A}\):
Create the augmented matrix \(\begin{pmatrix} \mathbf{A} & \mathbf{I} \end{pmatrix}\) and row reduce the augmented matrix. If the row-reduced augmented matrix is of the form \(\begin{pmatrix} \mathbf{I} & \mathbf{A}^{-1} \end{pmatrix}\) then \(\mathbf{A}^{-1}\) is the inverse of \(\mathbf{A}\). If the leading matrix in the augmented matrix is not the identity matrix \(\mathbf{I}\), then \(\mathbf{A}\) is not row equivalent to \(\mathbf{I}\) and is therefore not invertible.
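The algorithm above can be sketched in base R as follows. The function name `gauss_jordan_inverse()` and the tolerance `tol` are hypothetical choices made for this illustration. Picking the largest available pivot in each column (partial pivoting) is an added assumption for numerical stability that the text does not describe. Each step inside the loop is one of the three elementary row operations.

```r
# hypothetical helper: invert a square matrix by row reducing (A | I)
gauss_jordan_inverse <- function(A, tol = 1e-10) {
  stopifnot(is.matrix(A), nrow(A) == ncol(A))
  n <- nrow(A)
  aug <- cbind(A, diag(n))                 # augmented matrix (A | I)
  for (j in seq_len(n)) {
    # choose the row at or below row j with the largest entry in column j
    p <- which.max(abs(aug[j:n, j])) + j - 1
    if (abs(aug[p, j]) < tol) {
      stop("A is not row equivalent to I, so A is not invertible")
    }
    # swap: move the pivot row into position j
    if (p != j) aug[c(j, p), ] <- aug[c(p, j), ]
    # scalar multiplication: scale the pivot row so the pivot equals 1
    aug[j, ] <- aug[j, ] / aug[j, j]
    # sums: clear column j in every other row
    for (i in seq_len(n)[-j]) {
      aug[i, ] <- aug[i, ] - aug[i, j] * aug[j, ]
    }
  }
  # the left block is now I, so the right block is the inverse
  aug[, (n + 1):(2 * n), drop = FALSE]
}

# usage: the matrix from Example 9.1, whose inverse is known to be B
A <- matrix(c(1, -1,
              2, -3), nrow = 2, byrow = TRUE)
B <- matrix(c(3, -1,
              2, -1), nrow = 2, byrow = TRUE)
all.equal(gauss_jordan_inverse(A), B)
```

In practice `solve(A)` is the usual way to obtain an inverse in R; the loop above is only meant to make the row operations visible.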
Example 9.4 Let \(\mathbf{A} = \begin{pmatrix} -3 & -3 & -4 \\ -4 & 2 & -4 \\ 4 & -4 & 4 \end{pmatrix}\). Does \(\mathbf{A}\) have an inverse, and if so, what is it?
Using R
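One way to carry out Example 9.4 in base R, assuming `solve()` is an acceptable stand-in for row reducing the augmented matrix by hand. `solve()` stops with an error when the matrix is singular, so a successful call indicates that the inverse exists.

```r
# matrix from Example 9.4
A <- matrix(c(-3, -3, -4,
              -4,  2, -4,
               4, -4,  4), nrow = 3, byrow = TRUE)

# compute the inverse; an error here would mean A is singular
A_inv <- solve(A)
A_inv

# check against Definition 9.1: both products should be the identity
all.equal(A_inv %*% A, diag(3))
all.equal(A %*% A_inv, diag(3))
```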
9.3 The Invertible Matrix Theorem
Theorem 9.4 (The Invertible Matrix Theorem) Let \(\mathbf{A}\) be an \(n \times n\) matrix. Then the following statements are equivalent (i.e., they are all either simultaneously true or false).
1. \(\mathbf{A}\) is an invertible matrix.
2. \(\mathbf{A}\) is row equivalent to the \(n \times n\) identity matrix \(\mathbf{I}\) (\(\mathbf{A} \sim \mathbf{I}\)).
3. \(\mathbf{A}\) has \(n\) pivot columns.
4. The homogeneous matrix equation \(\mathbf{A} \mathbf{x} = \mathbf{0}\) has only the trivial solution \(\mathbf{x} = \mathbf{0}\).
5. The columns of \(\mathbf{A}\) are linearly independent.
6. The linear transformation \(T:\mathcal{R}^n \rightarrow \mathcal{R}^n\) given by the matrix transformation \(\mathbf{x} \rightarrow \mathbf{A}\mathbf{x}\) is one-to-one.
7. The inhomogeneous matrix equation \(\mathbf{A} \mathbf{x} = \mathbf{b}\) has a unique solution for all \(\mathbf{b} \in \mathcal{R}^n\).
8. The columns of \(\mathbf{A}\) span \(\mathcal{R}^n\).
9. The linear transformation \(\mathbf{x} \rightarrow \mathbf{A} \mathbf{x}\) maps \(\mathcal{R}^n\) onto \(\mathcal{R}^n\).
10. There is an \(n \times n\) matrix \(\mathbf{C}\) such that \(\mathbf{C}\mathbf{A} = \mathbf{I}\).
11. There is an \(n \times n\) matrix \(\mathbf{D}\) such that \(\mathbf{A}\mathbf{D} = \mathbf{I}\).
12. \(\mathbf{A}'\) is an invertible matrix.
Proof. In class
A result of the invertible matrix theorem is that if \(\mathbf{A}\) and \(\mathbf{B}\) are \(n \times n\) matrices with \(\mathbf{A} \mathbf{B} = \mathbf{I}\) then \(\mathbf{A} = \mathbf{B}^{-1}\) and \(\mathbf{B} = \mathbf{A}^{-1}\).
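A few of the equivalent statements can be compared numerically. The sketch below is only illustrative. `S` is a made-up singular matrix that does not appear in the text, and `qr()$rank` is a numerical rank estimate used here as a stand-in for counting pivot columns.

```r
A <- matrix(c(1, -1,
              2, -3), nrow = 2, byrow = TRUE)  # invertible (Example 9.1)
S <- matrix(c(1, 2,
              2, 4), nrow = 2, byrow = TRUE)   # hypothetical: row 2 is 2 times row 1

# numerical rank, used as a proxy for the number of pivot columns
rank_A <- qr(A)$rank
rank_S <- qr(S)$rank

# solve() succeeds for A and signals an error for the singular S
A_invertible <- !inherits(try(solve(A), silent = TRUE), "try-error")
S_invertible <- !inherits(try(solve(S), silent = TRUE), "try-error")

c(rank = rank_A, invertible = A_invertible)
c(rank = rank_S, invertible = S_invertible)

# S x = 0 has the nontrivial solution x = (2, -1), so the statement about
# the homogeneous equation fails for S along with invertibility
S %*% c(2, -1)
```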
9.4 Invertible Linear Transformations
Definition 9.3 A linear transformation \(T:\mathcal{R}^n \rightarrow \mathcal{R}^n\) is said to be invertible if there exists a transformation \(S:\mathcal{R}^n \rightarrow \mathcal{R}^n\) such that
\[ \begin{aligned} S(T(\mathbf{x})) = \mathbf{x} && \mbox{for all } \mathbf{x} \in \mathcal{R}^n \\ T(S(\mathbf{x})) = \mathbf{x} && \mbox{for all } \mathbf{x} \in \mathcal{R}^n \end{aligned} \]
- Draw figure in class
Theorem 9.5 Let \(T:\mathcal{R}^n \rightarrow \mathcal{R}^n\) be a linear transformation and let \(\mathbf{A}\) be the matrix representing the transformation \(T\). Then the transformation \(T\) is invertible if and only if the matrix \(\mathbf{A}\) is invertible. In that case, the inverse transformation \(S:\mathcal{R}^n \rightarrow \mathcal{R}^n\) of \(T\) is unique and is represented by the matrix \(\mathbf{A}^{-1}\).
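Theorem 9.5 can be illustrated with a small sketch. The function names `T_map` and `S_map` and the test vector `x` are made up for this example, and the matrix is the one from Example 9.1.

```r
A <- matrix(c(1, -1,
              2, -3), nrow = 2, byrow = TRUE)
A_inv <- solve(A)

# the transformation T and its candidate inverse S
T_map <- function(x) A %*% x
S_map <- function(x) A_inv %*% x

# an arbitrary test vector
x <- c(5, -2)

# both compositions should return x, as required by Definition 9.3
S_map(T_map(x))
T_map(S_map(x))
```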