Determinants
The determinant of a matrix is a scalar value that, like the rank, tells us something about the matrix. The determinant is only defined for square matrices and is denoted as \(det(A)\) or \(|A|\) for a matrix \(A \in \mathbb{R}^{n \times n}\).
So what does the determinant tell us about a matrix? The determinant measures how much the matrix scales areas or volumes under the transformation it represents. In 2D the standard basis vectors span the unit square and in 3D they span the unit cube; after the transformation they span a parallelogram or parallelepiped. The determinant tells us by what factor the area or volume of that unit square or cube has been scaled. Since the columns of the matrix of a linear transformation are exactly the images of the standard basis vectors, we can read off the edges of the transformed parallelogram or parallelepiped directly from the matrix.

If the determinant is positive, then the matrix has scaled the area or volume by that factor. If the determinant is negative, the magnitude is still the scaling factor, but the shape has been flipped, so the orientation of the basis has been reversed. If the determinant is 0, then the matrix has squished the basis down to a lower dimension, such as from a square to a line or from a cube to a plane.
The case where the determinant is 0 is the most interesting one. This is the case where the matrix does not have an inverse, so the matrix is singular. In other words, the rank of the matrix is less than \(n\) where \(n\) is the number of rows and columns of the matrix. For a system of linear equations, this means that the system of equations has no unique solution.
This is why the determinant is useful. It tells us if a matrix has an inverse and if a system of equations has a unique solution amongst other things.
Determinant of a 2x2 Matrix
From the definition of the determinant above and Pythagoras' theorem we can derive a formula for the determinant of a 2x2 matrix:
\[det(A) = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]
This can also be shown using the cross product: the signed area of the parallelogram spanned by the column vectors \(\mathbf{a}\) and \(\mathbf{b}\), where \(\theta\) is the angle between them, is
\[ad-bc = \|\mathbf{a}\| \|\mathbf{b}\| \sin(\theta) = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\frac{\pi}{2} - \theta) \]If we have the identity matrix \(\mathbf{I}\) then the determinant is 1:
\[\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1 \cdot 1 - 0 \cdot 0 = 1 \]This matches our intuition that the identity matrix does not change the area of the unit square and so the determinant is 1.
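To make this concrete, here is a quick numerical sanity check of the \(2 \times 2\) formula against NumPy's built-in determinant (the matrix values are arbitrary):

```python
import numpy as np

# Arbitrary 2x2 matrix [[a, b], [c, d]]
a, b, c, d = 3.0, 1.0, 2.0, 4.0
A = np.array([[a, b], [c, d]])

# Determinant from the formula det(A) = ad - bc
det_formula = a * d - b * c

# NumPy's determinant for comparison
det_numpy = np.linalg.det(A)

print(det_formula, det_numpy)            # both are 10 (up to floating point error)
assert np.isclose(det_formula, det_numpy)
```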
Determinant of a 3x3 Matrix
We can do the same for a 3x3 matrix, but the derivation of the formula is a bit more involved, so I will just give you the formula and you will have to trust me that it is correct.
\[det(A) = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + cdh - ceg - bdi - afh \]Now for most people this formula is not very useful as it is quite complicated to remember. So we can use the rule of Sarrus to easily calculate the determinant of a 3x3 matrix. The rule of Sarrus is a mnemonic for the formula above. The way it works is that it defines a specific pattern from which we can derive the formula for the determinant of a 3x3 matrix.
We start by writing the matrix out in a 3x3 grid and then we copy the first two columns to the right of the matrix. Then we draw three diagonals from the top left to the bottom right and three diagonals from the top right to the bottom left. We then multiply the numbers along the diagonals. If the diagonal goes from the top left to the bottom right then we add the products together and if the diagonal goes from the top right to the bottom left then we subtract the products.
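The rule of Sarrus also translates directly into a few lines of code. Here is a small sketch (with an arbitrary example matrix) that compares it against NumPy:

```python
import numpy as np

def sarrus(M):
    """Determinant of a 3x3 matrix via the rule of Sarrus:
    add the three top-left-to-bottom-right diagonal products,
    subtract the three top-right-to-bottom-left diagonal products."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*e*i + b*f*g + c*d*h - c*e*g - b*d*i - a*f*h

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 1.0],
              [5.0, 2.0, 1.0]])

print(sarrus(A), np.linalg.det(A))       # both are -51 (up to floating point error)
assert np.isclose(sarrus(A), np.linalg.det(A))
```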

The determinant of a 3x3 matrix can also be written as the scalar triple product of the column vectors \(\mathbf{a}\), \(\mathbf{b}\) and \(\mathbf{c}\):
\[det(A) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b}) \]
Properties of the Determinant
Now that we have the formula for the determinant of a 2x2 matrix we can start to explore some of the properties of the determinant. We will show the derivations for the properties in the 2D space but they can be generalized to higher dimensions. This will help us to understand why the determinant is defined the way it is and how we can generalize the calculation of the determinant to higher dimensions.
Transpose
The determinant of a matrix is the same as the determinant of the transpose of the matrix. The only thing that changes is the order of multiplication which does not change the result as multiplication is commutative. This can be seen by looking at the formula for the determinant of a 2x2 matrix and then applying the transpose operation to it:
\[\begin{align*} det(A) &= \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \\ det(A^T) &= \begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad - cb \end{align*} \]The same applies to the determinant of a 3x3 matrix and higher dimensions. The determinant of a matrix is equal to the determinant of its transpose.
A more general proof follows from the permutation formula for the determinant that we will see later on, which treats rows and columns symmetrically.
Scalar Multiplication
If a matrix is multiplied by a scalar, its determinant is scaled by that scalar raised to the power of the number of rows (or columns) of the matrix.
\[\begin{align*} det(kA) &= k^n det(A) \\ det(kA) &= \begin{vmatrix} ka & kb \\ kc & kd \end{vmatrix} = (ka)(kd) - (kb)(kc) = k^2(ad) - k^2(bc) = k^2(ad - bc) = k^2 det(A) \end{align*} \]
Swapping Rows
When two rows of a matrix are swapped, the determinant of the matrix changes sign. This means that performing Gaussian elimination on a matrix can change the sign of the determinant if row swaps are involved.
\[\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \quad \text{and} \quad \begin{vmatrix} c & d \\ a & b \end{vmatrix} = cb - ad = -(ad - bc) \]
Adding Matrices
As the example below shows, the determinant is additive in a single row (and one can similarly show that multiplying a single row by a scalar scales the determinant by that scalar). This actually makes the determinant linear with respect to the rows of the matrix. And because the determinant of the transpose is the same as that of the original matrix, we can also say that the determinant is linear with respect to the columns of the matrix.
If we have two matrices and we add them together, then the determinant of the resulting matrix is not equal to the sum of the determinants of the two matrices. In fact, we can’t say anything about the determinant of the sum of two matrices.
\[det(A + B) \neq f(det(A), det(B)) \]There is an exception to this rule: if all the rows of the two matrices are the same apart from one row, and we add only that one differing row together while keeping the shared rows once. In other words, we are adding something to a single row, as in the following example:
\[\begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a & b \\ x & y \end{vmatrix} = \begin{vmatrix} a & b \\ c + x & d + y \end{vmatrix} = a(d + y) - b(c + x) = ad + ay - bc - bx = (ad - bc) + (ay - bx) \]Then the determinant of the resulting matrix is equal to the sum of the determinants of the two matrices:
\[det(A + B) = det(A) + det(B) \]Note that \(A + B\) is a slight abuse of notation here: only the differing row is added and the shared rows appear once. Adding a multiple of one row to another is then a special case of this:
\[\begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a & b \\ \lambda a & \lambda b \end{vmatrix} = \begin{vmatrix} a & b \\ c + \lambda a & d + \lambda b \end{vmatrix} = a(d + \lambda b) - b(c + \lambda a) = ad + \lambda ab - bc - \lambda ab = ad - bc \]Then the determinant of the resulting matrix is equal to the determinant of the original matrix:
\[det(A + B) = det(A) \]The reason for this is that the second matrix has a row which is a multiple of another row, so it has a linearly dependent row and therefore its determinant is 0. We can see that adding a multiple of one row to another does not change the determinant of the matrix, meaning that performing Gaussian elimination to get a matrix into row echelon form does not change the value of the determinant. However, if we have to perform row swaps to get the matrix into row echelon form, then the determinant changes sign. As we will see later on, this is unfortunately not the case for Gauss-Jordan elimination, because there we are not only adding multiples of one row to another but also multiplying rows by scalars, which does change the determinant.
There is also a formula for the determinant of the sum of two matrices in the 2D case, but it is not very useful as it is quite complicated to remember and you are better off just calculating the determinant of the resulting matrix directly. The formula is:
\[det(A + B) = det(A) + det(B) + tr(A)tr(B) - tr(AB) \]Where \(tr(A)\) is the trace of the matrix \(A\) which is the sum of the diagonal elements of the matrix.
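All of the properties above are easy to verify numerically. Here is a small sketch, using arbitrary random matrices, that checks scalar multiplication, row swaps, adding a multiple of one row to another and the \(2 \times 2\) trace formula:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 2))
B = rng.random((2, 2))
k = 3.0
det = np.linalg.det

# det(kA) = k^n det(A) with n = 2
assert np.isclose(det(k * A), k**2 * det(A))

# Swapping two rows flips the sign of the determinant
assert np.isclose(det(A[::-1]), -det(A))

# Adding a multiple of one row to another leaves the determinant unchanged
E = A.copy()
E[1] += 5.0 * E[0]
assert np.isclose(det(E), det(A))

# 2x2 only: det(A + B) = det(A) + det(B) + tr(A)tr(B) - tr(AB)
lhs = det(A + B)
rhs = det(A) + det(B) + np.trace(A) * np.trace(B) - np.trace(A @ B)
assert np.isclose(lhs, rhs)
```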
Determinants of Triangular Matrices
For a triangular matrix the determinant is very easy to calculate. This applies to upper triangular, lower triangular and diagonal matrices alike. Let's take the general formula for a 3x3 determinant and see what happens when the matrix is, for example, a 3x3 upper triangular matrix, so certain elements are 0.
\[\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + cdh - ceg - bdi - afh \]So now if \(d = g = h = 0\) then the determinant simplifies to just the product of the diagonal elements \(aei\):
\[\begin{vmatrix} a & b & c \\ 0 & e & f \\ 0 & 0 & i \end{vmatrix} = aei + 0 + 0 - 0 - 0 - 0 = aei \]This is true for any triangular matrix. For example the same can be seen for a lower triangular matrix, where \(b = c = f = 0\), and the determinant is again just the product of the diagonal elements \(aei\):
\[\begin{vmatrix} a & 0 & 0 \\ d & e & 0 \\ g & h & i \end{vmatrix} = aei + 0 + 0 - 0 - 0 - 0 = aei \]So we can summarize that the determinant of a triangular matrix is the product of the diagonal elements of the matrix. This also applies to diagonal matrices and the identity matrix as they are also triangular matrices:
- \(det(\text{upper/lower triangular matrix}) = aei\)
- \(det(\text{diagonal matrix}) = aei\)
- \(det(\mathbf{I}) = aei = 1\)
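A quick numerical check of this, using an arbitrary upper triangular matrix:

```python
import numpy as np

# Arbitrary 3x3 upper triangular matrix
U = np.array([[2.0, 5.0, 1.0],
              [0.0, 3.0, 4.0],
              [0.0, 0.0, 6.0]])

# Product of the diagonal elements vs. the general determinant
print(np.prod(np.diag(U)), np.linalg.det(U))   # both are 36 (up to floating point error)
assert np.isclose(np.prod(np.diag(U)), np.linalg.det(U))
```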
Matrix Multiplication
The most important property of the determinant is that it is multiplicative. This means that the determinant of a matrix multiplied by another matrix is equal to the product of the determinants of the two matrices. This is a very useful property as it allows us to calculate the determinant of a product of matrices without having to calculate the determinant of the product directly:
\[\begin{align*} det(AB) &= det(A)det(B) \\ det(A) &= \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \\ det(B) &= \begin{vmatrix} e & f \\ g & h \end{vmatrix} = eh - fg \\ det(AB) &= \begin{vmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{vmatrix} = (ae + bg)(cf + dh) - (af + bh)(ce + dg) \\ &= (ae)(cf) + (ae)(dh) + (bg)(cf) + (bg)(dh) - (af)(ce) - (af)(dg) - (bh)(ce) - (bh)(dg) \\ &= acef + adeh + bcfg + bdgh - acef - adfg - bceh - bdgh \\ &= adeh + bcfg - adfg - bceh \\ &= (ad)(eh) + (bc)(fg) - (ad)(fg) - (bc)(eh) \\ &= (ad - bc)(eh - fg) \end{align*} \]This again shows us that performing Gaussian elimination on a matrix does not change the value of the determinant of the matrix. This is because we can always express Gaussian elimination as a product of elimination matrices which are lower triangular matrices with 1s on the diagonal. Therefore the determinant of these matrices is 1, so multiplying the original matrix by these matrices does not change the value of the determinant:
\[\mathbf{E}_1 \mathbf{E}_2 \cdots \mathbf{E}_k \mathbf{A} = \mathbf{B} \implies det(\mathbf{B}) = det(\mathbf{E}_1)det(\mathbf{E}_2) \cdots det(\mathbf{E}_k) det(\mathbf{A}) = 1 \cdot 1 \cdots 1 \cdot det(\mathbf{A}) = det(\mathbf{A}) \]When we perform Gauss-Jordan elimination however, we also multiply rows by scalars, which results in elimination matrices that are diagonal matrices with the scalar on the diagonal. This means that the determinant of these matrices is equal to the product of the scalars on the diagonal. So for example if we multiply the first row by a scalar \(k\) then the elimination matrix is:
\[\mathbf{E} = \begin{bmatrix} k & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \end{bmatrix} \]And the determinant of this matrix is \(k\) because the determinant of a diagonal matrix is the product of the diagonal elements. So if we multiply the original matrix by this elimination matrix, then the determinant of the resulting matrix is scaled by \(k\):
\[det(\mathbf{E}\mathbf{A}) = det(\mathbf{E})det(\mathbf{A}) = k \cdot det(\mathbf{A}) \]Meaning that performing Gauss-Jordan elimination can change the value of the determinant of the matrix. This gives us another explanation as to why the determinant of an invertible matrix must be non-zero. If the matrix is invertible, then at the end of the Gauss-Jordan elimination we have the identity matrix, which has a determinant of 1. The determinant of the original matrix multiplied by the determinants of the elimination matrices, which are all non-zero, must therefore equal 1. If the determinant of the original matrix were 0, this product would also be 0 and could never equal 1. So we can conclude that if the determinant of a matrix is 0, then the matrix is not invertible and therefore does not have an inverse.
This also means that if we have an LU decomposition of a matrix \(\mathbf{A}\) such that \(\mathbf{A} = \mathbf{LU}\) where \(\mathbf{L}\) is a lower triangular matrix and \(\mathbf{U}\) is an upper triangular matrix, then we can calculate the determinant of the matrix by just multiplying the diagonal elements of the matrices \(\mathbf{L}\) and \(\mathbf{U}\):
\[det(\mathbf{A}) = det(\mathbf{L})det(\mathbf{U}) = \prod_{i=1}^n l_{ii} \prod_{i=1}^n u_{ii} \]Where \(l_{ii}\) and \(u_{ii}\) are the diagonal elements of the matrices \(\mathbf{L}\) and \(\mathbf{U}\) respectively. However, remember that the matrix \(\mathbf{L}\) is simply a lower triangular matrix with 1s on the diagonal, so the determinant of \(\mathbf{L}\) is just 1. So we can simplify this to just the product of the diagonal elements of the matrix \(\mathbf{U}\):
\[det(\mathbf{A}) = det(\mathbf{U}) = \prod_{i=1}^n u_{ii} \]Again showing that the determinant of a matrix and the determinant of the matrix in row echelon form are the same. This is a lot more efficient than calculating the determinant directly from the definition and is how determinants are often computed in practice.
If we also allow for partial pivoting in the LU decomposition, then we also have to divide by the determinant of the permutation matrix, i.e. the sign of the permutation that was applied to the matrix. This is because swapping rows changes the sign of the determinant. The value of the determinant is then still the product of the diagonal elements of the matrices \(\mathbf{L}\) and \(\mathbf{U}\), just divided by the sign of the permutation.
\[\mathbf{P}\mathbf{A} = \mathbf{LU} \implies det(\mathbf{A}) = \frac{det(\mathbf{L})det(\mathbf{U})}{det(\mathbf{P})} = \frac{det(\mathbf{U})}{det(\mathbf{P})} \]Where \(\mathbf{P}\) is the permutation matrix that was applied to the matrix \(\mathbf{A}\) to get it into the form \(\mathbf{LU}\). Because the permutation matrix is orthogonal, we know that its inverse is equal to its transpose, and because the determinant of a matrix is equal to the determinant of its transpose, we have \(det(\mathbf{P}^{-1}) = det(\mathbf{P})\). So we can also just multiply by the determinant of the permutation matrix rather than dividing by it:
\[det(\mathbf{A}) = det(\mathbf{P}^{-1}) det(\mathbf{L}) det(\mathbf{U}) = det(\mathbf{P}^T) det(\mathbf{L}) det(\mathbf{U}) = det(\mathbf{P}) det(\mathbf{L}) det(\mathbf{U}) = det(\mathbf{P}) det(\mathbf{U}) \]
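This is also easy to try out in code. The sketch below uses SciPy's LU decomposition; note that scipy.linalg.lu returns factors with the convention \(\mathbf{A} = \mathbf{PLU}\) rather than \(\mathbf{PA} = \mathbf{LU}\), but since \(det(\mathbf{P}) = \pm 1\) the bookkeeping is the same:

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(1)
A = rng.random((4, 4))

# SciPy returns P, L, U with A = P @ L @ U, where L has 1s on the diagonal
P, L, U = lu(A)

# det(A) = det(P) * det(L) * det(U) = (+-1) * 1 * prod(diag(U))
sign = np.linalg.det(P)            # +1 or -1 depending on the row swaps
det_lu = sign * np.prod(np.diag(U))

print(det_lu, np.linalg.det(A))
assert np.isclose(det_lu, np.linalg.det(A))
```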
Determinant of Orthogonal Matrices
Because we know that the determinant of a matrix is multiplicative, we can also use this to show that the determinant of an orthogonal matrix is either 1 or -1. An orthogonal matrix is a square matrix whose rows and columns are orthonormal vectors. This means that the dot product of any two different rows or columns is 0 and the dot product of a row or column with itself is 1 which means that the following holds:
\[\mathbf{A}^T\mathbf{A} = \mathbf{I} \]And because we know that the determinant of the identity matrix is 1 and that the determinant of a product of matrices is the product of the determinants of the matrices, we can use this to show that the determinant of an orthogonal matrix is either 1 or -1.
\[1 = det(\mathbf{I}) = det(\mathbf{A}^T\mathbf{A}) = det(\mathbf{A}^T)det(\mathbf{A}) = det(\mathbf{A})^2 \implies det(\mathbf{A}) = \pm 1 \]
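For example, a rotation matrix has determinant 1 and a reflection has determinant -1; a quick check with NumPy:

```python
import numpy as np

theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[1.0, 0.0],
                       [0.0, -1.0]])   # mirror along the x-axis

# Both are orthogonal: Q^T Q = I, so det(Q) must be +1 or -1
for Q in (rotation, reflection):
    assert np.allclose(Q.T @ Q, np.eye(2))
    print(np.linalg.det(Q))            # 1.0 and -1.0
```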
Determinant of an Invertible Matrix
We can now also use the fact that the determinant is multiplicative to argue why the determinant of an invertible matrix cannot be 0. So we want to prove the following statement:
\[\mathbf{A} \text{ is invertible} \iff det(\mathbf{A}) \neq 0 \]So let’s start with the forward direction. If the matrix \(\mathbf{A}\) is invertible then \(\mathbf{AA}^{-1} = I\) and so \(det(\mathbf{A})det(\mathbf{A}^{-1}) = det(\mathbf{I}) = 1\). Therefore \(det(\mathbf{A}) \neq 0\) as we otherwise have zero times something equals 1 which is not possible. So we have shown that if \(\mathbf{A}\) is invertible then \(det(\mathbf{A}) \neq 0\).
Now we need to show the backwards direction, so that if \(det(\mathbf{A}) \neq 0\), then \(\mathbf{A}\) is invertible. So, let us assume that \(det(\mathbf{A}) \neq 0\). For a \(2 \times 2\) matrix this means that \(ad - bc \neq 0\). We can now construct the inverse of the matrix \(\mathbf{A}\):
\[\mathbf{A}^{-1} = \begin{bmatrix} w & x \\ y & z \end{bmatrix} \]such that
\[\mathbf{A}\mathbf{A}^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]Expanding this matrix multiplication, we get the following system of equations:
\[\begin{align*} aw + by &= 1 \\ ax + bz &= 0 \\ cw + dy &= 0 \\ cx + dz &= 1 \end{align*} \]Solving for \(w\), \(x\), \(y\), and \(z\), we find:
\[w = \frac{d}{ad - bc}, \quad x = \frac{-b}{ad - bc}, \quad y = \frac{-c}{ad - bc}, \quad z = \frac{a}{ad - bc}. \]We notice that the denominator is always the same so when we enter the terms back into the matrix we can simplify the matrix to:
\[\mathbf{A}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]This formula for the inverse of a \(2 \times 2\) matrix is valid as long as \(det(\mathbf{A}) = ad - bc \neq 0\) as otherwise we would be dividing by 0 which is not allowed. We can rewrite this to get a more compact form to quickly calculate the inverse of a \(2 \times 2\) matrix if it is invertible:
\[\mathbf{A}^{-1} = \frac{1}{det(\mathbf{A})} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]Now, we can use this result to examine the determinant of the inverse matrix. Recall that the determinant of a scalar multiple of a matrix scales by that scalar. Therefore, using the formula for \(\mathbf{A}^{-1}\) the determinant of \(\mathbf{A}^{-1}\) is given by:
\[\begin{align*} det(\mathbf{A}^{-1}) &= det\left(\frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\right) \\ &= \left(\frac{1}{ad - bc}\right)^2 det\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \\ &= \frac{1}{(ad - bc)^2} (ad - bc) \\ &= \frac{1}{ad - bc} \\ &= \frac{1}{det(\mathbf{A})} \end{align*} \]So the determinant of the inverse of a matrix is the reciprocal of the determinant of the original matrix. This holds for all invertible matrices. This again shows that if the determinant of the original matrix is 0, then the inverse does not exist as we cannot divide by 0 and can also not calculate the determinant of the inverse matrix as it doesn’t exist.
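Here is a quick check of both the \(2 \times 2\) inverse formula and the reciprocal relationship, using an arbitrary invertible matrix:

```python
import numpy as np

a, b, c, d = 4.0, 7.0, 2.0, 6.0
A = np.array([[a, b], [c, d]])
det_A = a * d - b * c                      # 10.0, so A is invertible

# Inverse from the 2x2 formula
A_inv = (1.0 / det_A) * np.array([[d, -b], [-c, a]])
assert np.allclose(A @ A_inv, np.eye(2))

# det(A^{-1}) = 1 / det(A)
assert np.isclose(np.linalg.det(A_inv), 1.0 / det_A)
```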
Determinant of Block Matrices
If we consider the following \(n \times n\) matrix \(\mathbf{M}\):
\[\mathbf{M} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{0} & \mathbf{C} \end{bmatrix} \]where we have that \(n > m\) and the following dimensions hold:
- \(\mathbf{A} \in \mathbb{R}^{m \times m}\)
- \(\mathbf{B} \in \mathbb{R}^{m \times (n - m)}\)
- \(\mathbf{C} \in \mathbb{R}^{(n - m) \times (n - m)}\)
- \(\mathbf{0} \in \mathbb{R}^{(n - m) \times m}\)
Then the determinant of this matrix can be calculated as follows:
\[det(\mathbf{M}) = det(\mathbf{A}) \cdot det(\mathbf{C}) \]This is because we can rewrite the matrix \(\mathbf{M}\) as follows:
\[\mathbf{M} = \begin{bmatrix} \mathbf{I} & \mathbf{B} \\ \mathbf{0} & \mathbf{C} \\ \end{bmatrix} \begin{bmatrix} \mathbf{A} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix} \]Now we notice that the first matrix is a block upper triangular matrix and the second matrix is a block diagonal matrix. The determinant of a block upper triangular matrix is the product of the determinants of the diagonal blocks, and the determinant of a block diagonal matrix is also the product of the determinants of the diagonal blocks. So the determinant of the left factor is simply \(det(\mathbf{C})\) and the determinant of the right factor is simply \(det(\mathbf{A})\). Therefore, we can conclude that the determinant of the matrix \(\mathbf{M}\) is simply the product of the determinants of the diagonal blocks as the determinant of a matrix multiplication is the product of the determinants of the matrices.
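Before working through an example by hand, here is a small numerical check of the block formula with arbitrary random blocks:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 2, 5                     # A is m x m, C is (n-m) x (n-m)
A = rng.random((m, m))
B = rng.random((m, n - m))
C = rng.random((n - m, n - m))

# Assemble M = [[A, B], [0, C]]
M = np.block([[A, B],
              [np.zeros((n - m, m)), C]])

# det(M) = det(A) * det(C)
assert np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(C))
```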
Let’s calculate the determinant of the following matrix:
\[\mathbf{M} = \begin{bmatrix} 2 & 0 & 0 & 4 & 0 & 7 \\ 9 & 0 & 0 & 0 & 0 & 4 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 3 & 2 & 0 & 5 & 0 & 7 \\ 2 & 3 & 1 & 5 & 0 & 2 \\ 8 & 8 & 7 & 3 & 2 & 1 \end{bmatrix} \]We can see that this isn't quite in the form that we want, so the idea is to use some of our tricks to get it into that form. We can do this by first taking the transpose, as the top rows have a lot of zeros which we want to have at the bottom left of the matrix:
\[\mathbf{M}^T = \begin{bmatrix} 2 & 9 & 1 & 3 & 2 & 8 \\ 0 & 0 & 0 & 2 & 3 & 8 \\ 0 & 0 & 0 & 0 & 1 & 7 \\ 4 & 0 & 0 & 5 & 5 & 3 \\ 0 & 0 & 0 & 0 & 0 & 2 \\ 7 & 4 & 0 & 7 & 2 & 1 \end{bmatrix} \]Now we also remember that swapping two rows changes the sign of the determinant, so if we then swap another two rows the sign changes back. We can therefore perform two swaps to get the matrix into the form that we want without changing the determinant:
\[\mathbf{M}' = \begin{bmatrix} 2 & 9 & 1 & 3 & 2 & 8 \\ 4 & 0 & 0 & 5 & 5 & 3 \\ 7 & 4 & 0 & 7 & 2 & 1 \\ 0 & 0 & 0 & 2 & 3 & 8 \\ 0 & 0 & 0 & 0 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 7 \end{bmatrix} \]Now our matrix is in the desired form, leaving us just having to calculate the determinant of the two blocks. The first block is a \(3 \times 3\) matrix and the second block is a \(3 \times 3\) matrix. So we can use the rule of Sarrus to calculate the determinant of both blocks:
\[det(\mathbf{M}') = det(\begin{bmatrix} 2 & 9 & 1 \\ 4 & 0 & 0 \\ 7 & 4 & 0 \end{bmatrix}) \cdot det(\begin{bmatrix} 2 & 3 & 8 \\ 0 & 0 & 2 \\ 0 & 1 & 7 \end{bmatrix}) \]This gives us:
\[det(\mathbf{M}) = 16 \cdot (-4) = -64 \]
Co-Factor Expansion for NxN Matrices
We have seen how we can calculate the determinant of a \(2 \times 2\) and \(3 \times 3\) matrix. But how can we calculate the determinant of a \(4 \times 4\) matrix or even higher? The answer is to use the co-factor expansion. The co-factor expansion is a recursive way of calculating the determinant of a matrix by expanding the matrix into smaller matrices.
Before we do this however we need to understand a bit more about permutations and the sign of a permutation. A permutation is a reordering of a set of numbers. So it is a bijective function from a set of numbers to itself:
\[\sigma: \{1, 2, 3, \ldots, n\} \rightarrow \{1, 2, 3, \ldots, n\} \]The reason why it is bijective is because an ordering of a set of numbers is a one-to-one mapping from the set of numbers to itself. This also means that there is an inverse permutation that maps the set of numbers back to the original ordering. From combinatorics we know that there are \(n!\) permutations of a set of \(n\) numbers.

The sign of a permutation is a bit weird. It can be either 1 or -1. The sign of a permutation counts the parity of the number of pairs of elements that are out of order after the permutation has been applied. So if the number of pairs of elements that are out of order is even then the sign of the permutation is 1 and if the number of pairs of elements that are out of order is odd then the sign of the permutation is -1.
\[\text{sign}(\sigma) = \begin{cases} 1 & \text{if the number of pairs } (i,j) \text{ with } i < j \text{ and } \sigma(i) > \sigma(j) \text{ is even} \\ -1 & \text{if the number of pairs } (i,j) \text{ with } i < j \text{ and } \sigma(i) > \sigma(j) \text{ is odd} \end{cases} \]For this it is best to look at an example. Let’s look at the permutation \(\sigma = (1, 3, 2, 4)\). The candidate pairs are \((1,2), (1,3), (1,4), (2,3), (2,4), (3,4)\). After the permutation has been applied the pairs are \((1,3), (1,2), (1,4), (3,2), (3,4), (2,4)\). So the number of pairs that are out of order, i.e. where the first element is now greater than the second element, is 1: the pair \((3,2)\). So the sign of the permutation is -1.
The sign of a permutation also corresponds to the determinant of the corresponding permutation matrix. Hence when we permute the rows of a matrix, the sign of the permutation tells us how the determinant of the matrix changes. If we swap two rows, then the sign of the permutation changes and so the determinant changes sign. If we apply an even number of swaps, then the sign of the permutation does not change and so the determinant does not change:
\[det(\sigma(A)) = det(P) det(A) = \text{sign}(\sigma) det(A) \]Where \(P\) is the permutation matrix corresponding to the permutation \(\sigma\).
The determinant of a matrix can then be calculated using the following formula where \(\Sigma_n\) is the set of all permutations of the set of numbers \(\{1, 2, 3, \ldots, n\}\):
\[det(A) = \sum_{\sigma \in \Sigma_n} \text{sign}(\sigma) \prod_{i=1}^n a_{i, \sigma(i)} \]So for a specific permutation \(\sigma\) we calculate the sign of the permutation and then multiply the elements of the matrix at the row \(i\) and column \(\sigma(i)\) for all \(i\). Let’s look at what this looks like for a \(2 \times 2\) matrix. We first define the 2 possible permutations and their signs:
- \(\sigma_1 = (1, 2)\), all in order so there are 0 pairs out of order which is even, \(\text{sign}(\sigma_1) = 1\)
- \(\sigma_2 = (2, 1)\), \(\text{sign}(\sigma_2) = -1\)
So the determinant of a \(2 \times 2\) matrix is then:
\[\begin{align*} det(A) &= \begin{vmatrix} a & b \\ c & d \end{vmatrix} \\ &= \text{sign}(\sigma_1) \prod_{i=1}^2 a_{i, \sigma_1(i)} + \text{sign}(\sigma_2) \prod_{i=1}^2 a_{i, \sigma_2(i)} \\ &= 1 \cdot a_{1, 1}a_{2, 2} + (-1) \cdot a_{1, 2}a_{2, 1} \\ &= ad - bc \end{align*} \]So we can see that this matches the formula for the determinant of a \(2 \times 2\) matrix that we derived earlier. Now let’s look at the formula for the determinant of a \(3 \times 3\) matrix.
\[\begin{align*} det(A) &= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \\ &= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} \end{align*} \]
Which can also be written as:
\[det(A) = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
So we can actually calculate the determinant of a \(3 \times 3\) matrix using the formula for the determinant of a \(2 \times 2\) matrix. This then further generalizes to higher dimensions so we could calculate the determinant of a \(4 \times 4\) matrix using the formula for the determinant of a \(3 \times 3\) matrix and so on.
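The permutation formula translates almost literally into code. The sketch below counts the out-of-order pairs to get the sign of each permutation and then sums the signed products; this is only feasible for small \(n\) since there are \(n!\) terms:

```python
import numpy as np
from itertools import permutations

def sign(sigma):
    """Sign of a permutation: +1 if the number of out-of-order pairs is even, -1 if odd."""
    inversions = sum(1 for i in range(len(sigma))
                       for j in range(i + 1, len(sigma))
                       if sigma[i] > sigma[j])
    return 1 if inversions % 2 == 0 else -1

def det_leibniz(A):
    """Determinant via the sum over all permutations (only feasible for small n)."""
    n = A.shape[0]
    return sum(sign(sigma) * np.prod([A[i, sigma[i]] for i in range(n)])
               for sigma in permutations(range(n)))

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

print(det_leibniz(A), np.linalg.det(A))   # both are -3 (up to floating point error)
assert np.isclose(det_leibniz(A), np.linalg.det(A))
```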
From the above example we can actually see a pattern. We are always multiplying an element \(a_{ij}\) with the determinant of the smaller matrix where the row \(i\) and column \(j\) have been removed. So if we are given a matrix \(\mathbf{A} \in \mathbb{R}^{n \times n}\) then we define the matrix without the \(i\)-th row and \(j\)-th column as \(\mathscr{A}_{ij} \in \mathbb{R}^{(n-1) \times (n-1)}\). We can then define a new matrix \(\mathbf{C}\) the so-called co-factor matrix where the element \(c_{ij}\) is defined as:
\[c_{ij} = (-1)^{i+j} det(\mathscr{A}_{ij}) \]We can then calculate the determinant of the matrix \(\mathbf{A}\) using the co-factor expansion recursively. We choose some \(i \in \{1, 2, 3, \ldots, n\}\) and then calculate the determinant of the matrix \(\mathbf{A}\) as:
\[det(\mathbf{A}) = \sum_{j=1}^n a_{ij} c_{ij} \]Let’s calculate the co-factor matrix for the following matrix and then calculate the determinant of the matrix using the co-factor expansion:
\[\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]Let’s start with the first co-factor \(c_{11}\). We remove the first row and first column from the matrix and then calculate the determinant of the resulting matrix:
\[det(\mathscr{A}_{11}) = \begin{vmatrix} 5 & 6 \\ 8 & 9 \end{vmatrix} = 5 \cdot 9 - 6 \cdot 8 = 45 - 48 = -3 \]The co-factor \(c_{11}\) is then:
\[c_{11} = (-1)^{1+1} \cdot det(\mathscr{A}_{11}) = 1 \cdot -3 = -3 \]If we continue this process for all the co-factors we get the following co-factor matrix:
\[\mathbf{C} = \begin{bmatrix} -3 & 6 & -3 \\ 6 & -12 & 6 \\ -3 & 6 & -3 \end{bmatrix} \]The determinant of the matrix \(\mathbf{A}\) is then depending on the row we choose:
\[\begin{align*} det(\mathbf{A}) &= \sum_{j=1}^3 a_{1j} c_{1j} \\ &= 1 \cdot -3 + 2 \cdot 6 + 3 \cdot -3 = -3 + 12 - 9 = 0 \text{ or} \\ det(\mathbf{A}) &= \sum_{j=1}^3 a_{2j} c_{2j} \\ &= 4 \cdot 6 + 5 \cdot -12 + 6 \cdot 6 = 24 - 60 + 36 = 0 \text{ or} \\ det(\mathbf{A}) &= \sum_{j=1}^3 a_{3j} c_{3j} \\ &= 7 \cdot -3 + 8 \cdot 6 + 9 \cdot -3 = -21 + 48 - 27 = 0 \end{align*} \]So we can see that the determinant of the matrix \(\mathbf{A}\) is 0, which means that the matrix is singular and does not have an inverse. We can also compare this to the formula for the determinant of a \(3 \times 3\) matrix that we derived earlier and see that it is the same.
\[\begin{align*} det(\mathbf{A}) &= aei + bfg + cdh - ceg - bdi - afh \\ &= (1 \cdot 5 \cdot 9) + (2 \cdot 6 \cdot 7) + (3 \cdot 4 \cdot 8) - (3 \cdot 5 \cdot 7) - (2 \cdot 4 \cdot 9) - (1 \cdot 6 \cdot 8) \\ &= 45 + 84 + 96 - 105 - 72 - 48 = 0 \end{align*} \]Although it isn’t very efficient, we can also calculate the inverse of a matrix using the co-factor expansion. We already saw this for the \(2 \times 2\) matrix, where we get the following as the co-factor matrix:
\[\mathbf{C} = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix} \]And we get the formula for the inverse of a \(2 \times 2\) matrix:
\[\mathbf{A}^{-1} = \frac{1}{det(\mathbf{A})} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \frac{1}{det(\mathbf{A})} \mathbf{C}^T \]The transpose of the co-factor matrix is called the adjugate matrix and is denoted as \(\text{adj}(\mathbf{A})\). So we can write the formula for the inverse of a general matrix as:
\[\mathbf{A}^{-1} = \frac{1}{det(\mathbf{A})} \mathbf{C}^T = \frac{1}{det(\mathbf{A})} \text{adj}(\mathbf{A}) \]
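Putting the co-factor machinery together, here is a small sketch that builds the co-factor matrix recursively and reads off both the determinant and, when it is non-zero, the inverse via the adjugate. The function names are just my own choices, and the recursion is only sensible for small matrices:

```python
import numpy as np

def minor(A, i, j):
    """The matrix A with row i and column j removed."""
    return np.delete(np.delete(A, i, axis=0), j, axis=1)

def det_cofactor(A):
    """Determinant via co-factor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    return sum((-1) ** j * A[0, j] * det_cofactor(minor(A, 0, j)) for j in range(n))

def inverse_adjugate(A):
    """Inverse via A^{-1} = adj(A) / det(A), where adj(A) is the transposed co-factor matrix."""
    n = A.shape[0]
    C = np.array([[(-1) ** (i + j) * det_cofactor(minor(A, i, j))
                   for j in range(n)] for i in range(n)])
    return C.T / det_cofactor(A)

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

print(det_cofactor(A))                      # -3.0, so A is invertible
assert np.allclose(inverse_adjugate(A), np.linalg.inv(A))
```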
Cramer’s Rule
We can also solve a system of linear equations using the determinant of a matrix. This is called Cramer’s rule. So let’s say we have a system of linear equations:
\[\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \]The idea then relies on the fact that we can then construct the following equation:
\[\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} x_1 & 0 & 0 \\ x_2 & 1 & 0 \\ x_3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix} \]We can then see that the middle matrix is a triangular matrix and so the determinant is the product of the diagonal elements, which in this case is just \(x_1\). So if we denote the special matrix on the right where the \(i\)-th column is replaced by the vector \(\mathbf{b}\) as \(\mathscr{B}_i\) and use the fact that the determinant of a matrix multiplied by another matrix is the product of the determinants of the two matrices we get:
\[\begin{align*} det(\mathbf{A}) \cdot x_1 = det(\mathscr{B}_1) \\ x_1 = \frac{det(\mathscr{B}_1)}{det(\mathbf{A})} \end{align*} \]In the same way we can then also calculate \(x_2\) and \(x_3\) by replacing the \(i\)-th column of the matrix \(\mathbf{A}\) with the vector \(\mathbf{b}\) and then calculating the determinant of the resulting matrix. So more generally we can solve a system of linear equations using Cramer’s rule as:
\[x_i = \frac{det(\mathscr{B}_i)}{det(\mathbf{A})} \]We can also see that if the system does not have a unique solution, i.e. the matrix \(\mathbf{A}\) isn’t full rank and doesn’t have an inverse, then the determinant of the matrix is 0 and we wouldn’t be able to do the division. So solving a system of linear equations using Cramer’s rule stays consistent with our other methods. Just like with the formula for calculating the inverse of a matrix using the co-factor expansion, this method is not very efficient as it requires calculating the determinant of multiple different matrices. However, it is a nice theoretical result that shows how the determinant can be used to solve systems of linear equations.
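To close, here is a minimal sketch of Cramer’s rule for a \(3 \times 3\) system with arbitrary numbers, compared against a standard solver:

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])

det_A = np.linalg.det(A)        # must be non-zero for a unique solution

# Replace column i of A by b and take the ratio of determinants
x = np.empty(3)
for i in range(3):
    B_i = A.copy()
    B_i[:, i] = b
    x[i] = np.linalg.det(B_i) / det_A

print(x)
assert np.allclose(x, np.linalg.solve(A, b))
```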