Linear Algebra/The Permutation Expansion

The prior subsection defines a function to be a determinant if it satisfies four conditions and shows that there is at most one $$n \! \times \! n$$ determinant function for each $$n$$. What is left is to show that for each $$n$$ such a function exists.

How could such a function not exist? After all, we have done computations that start with a square matrix, follow the conditions, and end with a number.

The difficulty is that, as far as we know, the computation might not give a well-defined result. To illustrate this possibility, suppose that we were to change the second condition in the definition of determinant to be that the value of a determinant does not change on a row swap. By Remark 2.2 we know that this conflicts with the first and third conditions. Here is an instance of the conflict: here are two Gauss' method reductions of the same matrix, the first without any row swap



\begin{pmatrix} 1 &2  \\ 3  &4 \end{pmatrix} \xrightarrow[]{-3\rho_1+\rho_2} \begin{pmatrix} 1 &2  \\ 0  &-2 \end{pmatrix} $$

and the second with a swap.



\begin{pmatrix} 1 &2  \\ 3  &4 \end{pmatrix} \xrightarrow[]{\rho_1\leftrightarrow\rho_2} \begin{pmatrix} 3 &4  \\ 1  &2 \end{pmatrix} \xrightarrow[]{-(1/3)\rho_1+\rho_2} \begin{pmatrix} 3 &4  \\ 0  &2/3 \end{pmatrix} $$

Following Definition 2.1 gives that both calculations yield the determinant $$-2$$ since in the second one we keep track of the fact that the row swap changes the sign of the result of multiplying down the diagonal. But if we follow the supposition and change the second condition then the two calculations yield different values, $$-2$$ and $$2$$. That is, under the supposition the outcome would not be well-defined &mdash; no function exists that satisfies the changed second condition along with the other three.

Of course, observing that Definition 2.1 does the right thing in this one instance is not enough; what we will do in the rest of this section is to show that there is never a conflict. The natural way to try this would be to define the determinant function with: "The value of the function is the result of doing Gauss' method, keeping track of row swaps, and finishing by multiplying down the diagonal". (Since Gauss' method allows for some variation, such as a choice of which row to use when swapping, we would have to fix an explicit algorithm.) Then we would be done if we verified that this way of computing the determinant satisfies the four properties. For instance, if $$T$$ and $$\hat{T}$$ are related by a row swap then we would need to show that this algorithm returns determinants that are negatives of each other. However, how to verify this is not evident. So the development below will not proceed in this way. Instead, in this subsection we will define a different way to compute the value of a determinant, a formula, and we will use this way to prove that the conditions are satisfied.
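The Gauss' method approach described above can be sketched in code. This is a minimal illustration (the function name, pivoting rule, and use of exact fractions are choices made here, not part of the text's development): it reduces the matrix to echelon form with row combinations and swaps, tracks the sign changes caused by swaps, and finishes by multiplying down the diagonal.

```python
from fractions import Fraction

def det_by_gauss(matrix):
    """Determinant via Gauss's method: reduce to echelon form,
    track the sign flips from row swaps, then multiply down the
    diagonal.  Exact fractions avoid floating-point drift."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    sign = 1
    for col in range(n):
        # find a row at or below the diagonal with a nonzero pivot entry
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                # condition (2): a swap flips the sign
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]               # multiply down the diagonal
    return result
```

On the matrix from the example above this returns $$-2$$ whether or not a swap happens along the way, because the sign tracking compensates.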

The formula that we shall use is based on an insight gotten from property (3) of the definition of determinants. This property shows that determinants are not linear.

Since scalars come out a row at a time, we might guess that determinants are linear a row at a time.

Multilinearity allows us to expand a determinant into a sum of determinants, each of which involves a simple matrix.

So an $$ n \! \times \! n $$ determinant expands into a sum of $$ n^n $$ determinants where each row of each summand contains a single entry from the starting matrix. However, many of these summand determinants are zero.
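For instance, in the $$2 \! \times \! 2$$ case (a smaller analogue, worked here for illustration; the text's own example is $$3 \! \times \! 3$$), splitting the first row by multilinearity and then splitting the second row of each result gives

$$ \begin{vmatrix} t_{1,1} &t_{1,2} \\ t_{2,1} &t_{2,2} \end{vmatrix} = \begin{vmatrix} t_{1,1} &0 \\ t_{2,1} &0 \end{vmatrix} + \begin{vmatrix} t_{1,1} &0 \\ 0 &t_{2,2} \end{vmatrix} + \begin{vmatrix} 0 &t_{1,2} \\ t_{2,1} &0 \end{vmatrix} + \begin{vmatrix} 0 &t_{1,2} \\ 0 &t_{2,2} \end{vmatrix} $$

The first and last summands are zero since each matrix is singular (it has a column of zeroes), and factoring the scalars out of the two survivors leaves

$$ t_{1,1}t_{2,2}\begin{vmatrix} 1 &0 \\ 0 &1 \end{vmatrix} + t_{1,2}t_{2,1}\begin{vmatrix} 0 &1 \\ 1 &0 \end{vmatrix} = t_{1,1}t_{2,2} - t_{1,2}t_{2,1} $$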

That example illustrates the key idea. We've applied multilinearity to a $$3 \! \times \! 3$$ determinant to get $$3^3$$ separate determinants, each with one distinguished entry per row. We can drop most of these new determinants because the matrices are singular, with one row a multiple of another. We are left with the one-entry-per-row determinants also having only one entry per column (one entry from the original determinant, that is). And, since we can factor scalars out, we can further reduce to only considering determinants of one-entry-per-row-and-column matrices where the entries are ones.

These are permutation matrices. Thus, the determinant can be computed in this three-step way: (Step 1) for each permutation matrix, multiply together the entries from the original matrix where that permutation matrix has ones; (Step 2) multiply that by the determinant of the permutation matrix; and (Step 3) do that for all permutation matrices and sum the results together.

To state this as a formula, we introduce a notation for permutation matrices. Let $$\iota_j$$ be the row vector that is all zeroes except for a one in its $$j$$-th entry, so that the four-wide $$\iota_2$$ is $$\begin{pmatrix} 0 &1 &0 &0 \end{pmatrix}$$. We can construct permutation matrices by permuting &mdash; that is, scrambling &mdash; the numbers $$1$$, $$2$$, ..., $$n$$, and using them as indices on the $$\iota$$'s. For instance, to get a $$ 4 \! \times \! 4 $$ permutation matrix, we can scramble the numbers from $$1$$ to $$4$$ into the sequence $$ \langle 3,2,1,4 \rangle $$ and take the corresponding row vector $$\iota$$'s.



$$ \begin{pmatrix} \iota_{3} \\ \iota_{2} \\ \iota_{1} \\ \iota_{4} \end{pmatrix}= \begin{pmatrix} 0 &0  &1  &0  \\ 0  &1  &0  &0  \\ 1  &0  &0  &0  \\ 0  &0  &0  &1 \end{pmatrix} $$
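Building such a matrix from a scrambled sequence is mechanical, and can be sketched as follows (the function name and the 1-based index convention here are illustrative choices, not from the text):

```python
def permutation_matrix(perm):
    """Stack the rows iota_{perm[0]}, iota_{perm[1]}, ... where
    iota_j is all zeroes except for a one in the j-th entry
    (1-based, matching the text's notation)."""
    n = len(perm)
    return [[1 if c == j - 1 else 0 for c in range(n)] for j in perm]
```

For example, `permutation_matrix([3, 2, 1, 4])` produces the $$ 4 \! \times \! 4 $$ matrix displayed above.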

The resulting formula, the permutation expansion, is often written in summation notation



$$ \left|T\right|= \sum_{\text{permutations }\phi}\!\!\!\! t_{1,\phi(1)}t_{2,\phi(2)}\cdots t_{n,\phi(n)} \left|P_{\phi}\right| $$

read aloud as "the sum, over all permutations $$ \phi $$, of terms having the form $$ t_{1,\phi(1)}t_{2,\phi(2)}\cdots t_{n,\phi(n)} \left|P_{\phi}\right| $$". This phrase is just a restating of the three-step process: (Step 1) for each permutation matrix, compute $$ t_{1,\phi(1)}t_{2,\phi(2)}\cdots t_{n,\phi(n)} $$; (Step 2) multiply that by $$ \left|P_{\phi}\right| $$; and (Step 3) sum all such terms together.
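Those three steps translate directly into a short program. This is an illustrative sketch (the names are choices made here, not the text's); it computes each $$ \left|P_{\phi}\right| $$ by counting inversions, using the standard fact that the determinant of a permutation matrix is $$+1$$ or $$-1$$ according to the parity of the permutation.

```python
from itertools import permutations

def perm_sign(phi):
    """|P_phi|: +1 or -1 according to the parity of the number
    of inversions (pairs that the permutation puts out of order)."""
    inversions = sum(1 for i in range(len(phi))
                       for j in range(i + 1, len(phi)) if phi[i] > phi[j])
    return -1 if inversions % 2 else 1

def det_by_expansion(t):
    """Permutation expansion: sum over all permutations phi of
    t[0][phi(0)] * ... * t[n-1][phi(n-1)] * |P_phi|.
    (Indices are 0-based in code, 1-based in the text.)"""
    n = len(t)
    total = 0
    for phi in permutations(range(n)):
        term = perm_sign(phi)           # Step 2: the factor |P_phi|
        for i in range(n):
            term *= t[i][phi[i]]        # Step 1: one entry per row and column
        total += term                   # Step 3: sum over all permutations
    return total
```

On the $$ 2 \! \times \! 2 $$ matrix used earlier in this section this gives $$-2$$, agreeing with the Gauss' method computation.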

Computing a determinant by permutation expansion usually takes longer than Gauss' method, since the expansion has $$n!$$ summands while Gauss' method needs on the order of $$n^3$$ arithmetic operations. However, here we are not trying to do the computation efficiently; we are instead trying to give a determinant formula that we can prove to be well-defined. While the permutation expansion is impractical for computations, it is useful in proofs. In particular, we can use it for the result that we are after.

The proof is deferred to the following subsection. The proof of the next result also appears there (the two proofs share some features).

The consequence of this theorem is that, while we have so far stated results in terms of rows (e.g., determinants are multilinear in their rows, row swaps change the sign, etc.), all of the results also hold in terms of columns. The final result gives examples.

We finish with a summary (although the final subsection contains the unfinished business of proving the two theorems). Determinant functions exist, are unique, and we know how to compute them. As for what determinants are about, perhaps these lines help make it memorable.

Determinant none,
Solution: lots or none.
Determinant some,
Solution: just one.

Exercises
These summarize the notation used in this book for the $$2$$-permutations and $$3$$-permutations. $$\begin{array}{c|cc} i         &1      &2    \\ \hline \phi_1(i) &1      &2     \\ \phi_2(i) &2      &1 \end{array} \qquad \begin{array}{c|ccc} i         &1     &2   &3    \\ \hline \phi_1(i) &1     &2   &3    \\ \phi_2(i) &1     &3   &2    \\ \phi_3(i) &2     &1   &3    \\ \phi_4(i) &2     &3   &1    \\ \phi_5(i) &3     &1   &2    \\ \phi_6(i) &3     &2   &1 \end{array} $$
