unnamed lin alg website

unnamed lin alg website: Simple linear algebra explanations!

This website is a work in progress!

Created by Eldrick Chen, creator of calculusgaming.com. Based on A First Course in Linear Algebra by Robert A. Beezer

Website Update History (Last update: )

Important: You might have to refresh the tab to view the latest updates to this website.

2025-09-26: “Eigenvalues and Linear Transformations” Update

This update features an introduction to eigenvalues, eigenvectors, and linear transformations. It also features new sections that summarize equivalent conditions for a matrix to be nonsingular/invertible.

New Content

New sections added:

Sections dedicated to equivalent properties of nonsingular matrices:
Intro to Eigenvalues and Eigenvectors
Intro to Linear Transformations
Injective and Surjective Linear Transformations

Section Improvements

Unit 1: Reduced Row-Echelon Form: reworded explanation of condition 4 of reduced row-echelon form; also added more examples of matrices in reduced row-echelon form

2025-09-07: The Summer Update

This large update contains all of the content I’ve added over the summer of 2025. Learn more about many of the fundamental objects of linear algebra: vectors, matrices, and vector spaces!

New Content

Three new units have been added:

Section Improvements

Unit 1: Vectors and Matrices: explained notation for vectors and matrices and their entries
Unit 1: Consistent Systems of Equations and Free/Dependent Variables: fixed error: a consistent system has infinitely many solutions if there are more variables than equations, not if there are more equations than variables
Leading 1s in matrices in reduced row-echelon form are now boxed
Unit 1: Singular Matrices section renamed to “Singular and Nonsingular Matrices”

2025-05-17: Initial Release

This is the first version of this website to be released. It features sections from the first unit of A First Course in Linear Algebra (Systems of Equations).

Website Content

Unit 1 sections:

Website Settings

Switch to a dark theme for those of you studying late at night! (This setting does not affect any of the images on this page, so they will stay bright.)

Dark Mode

If the bright images in dark mode bother you, you can invert the colors of graphs using this setting. Warning: this will change the colors of points and curves on each graph, making graph captions inaccurate in some cases.

Invert Graph Colors

Scientific Notation Format

Control the way very large and small numbers are displayed on this website. (Primarily intended for those of you who enjoy incremental games!)

Font Settings

Change this website’s font to a font of your choice! (Note: Font must be installed on your device)

Enter font name:

Font size multiplier (scale the font size by this amount):

Color Settings

Background color:

Text color:

Background Image (or GIF)

Background image size: pixels

Background image horizontal offset: pixels

Background image vertical offset: pixels

Background opacity: 30%

What Is This Website?

A note about links on this page: Internal links (links that bring you to another spot on this page) are colored in light blue. External links (links that open a different website) are colored in dark blue. External links will always open in a new tab.

This is one of the websites in the “unnamed ____ website” series (you can find the rest at calculusgaming.com). For more information about these websites, read the “What Is This Website?” section of unnamed calc website.

The lessons on this page are based on A First Course in Linear Algebra, a free online linear algebra resource. I strongly recommend viewing this page for more detailed explanations and proofs of linear algebra concepts!

Unit	Progress
Systems of Linear Equations	5/6
Vectors	5/6
Matrices	5/6
Vector Spaces	5/6
Determinants	2/2
Eigenvalues	1/3
Linear Transformations	3/4
Representations	0/4
All Units

Unit 1: Systems of Linear Equations

A First Course in Linear Algebra link: http://linear.pugetsound.edu/html/chapter-SLE.html

Intro to Systems of Linear Equations

In algebra, you’ve studied linear systems of equations before. A huge part of linear algebra involves the study of linear systems, so let’s review them.

A linear equation is an equation of the form \(a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b\), where \(a_1\) through \(a_n\) are constant coefficients and \(b\) is a constant. A linear system of equations is a set of linear equations.

When we solve a linear equation, that means finding the values of \(x_1\), \(x_2\), ..., \(x_n\) that makes every equation in a system true at the same time.

The set of all solutions to a linear system of equations is known as its solution set. There are three possibilities for the solution set of a linear system of equations.

Linear systems with one solution

Here’s an example of this type of system:

\[ 2x_1 + 3x_2 = 5 \] \[ x_1 - x_2 = 5 \]

The only solution to this system of equations is \(x_1 = 4\) and \(x_2 = -1\).

We can visually represent this system of equations as two lines: the first equation can be represented by the line \(2x + 3y = 5\) and the second equation by the line \(x - y = 5\). These two lines intersect at exactly one point, which is our solution.

The lines intersect at one point, so the system has one solution.

Linear systems with infinitely many solutions

Here’s an example of this type of system:

\[ x_1 + 3x_2 = 2 \] \[ 2x_1 + 6x_2 = 4 \]

Notice how the second equation is just the first equation multiplied by 2, so these two equations are really asking for the same thing! Therefore, there are infinitely many pairs \((x_1, x_2)\) which satisfy both equations.

Visually, we can represent this system of equations with the lines \(x + 3y = 2\) and \(2x + 6y = 4\). These lines are the exact same, so they have infinitely many intersection points!

The lines overlap, so the system has infinitely many solutions.

Linear systems with no solutions

Here’s an example of this type of system:

\[ x_1 + 3x_2 = 2 \] \[ 2x_1 + 6x_2 = 5 \]

There are no possible values of \(x_1\) and \(x_2\) that will make both of these equations simultaneously true.

Visually, we can represent this as two lines \(x + 3y = 2\) and \(2x + 6y = 5\). These lines are parallel to each other but not the same line, so there are no intersection points.

The lines are parallel to each other and never overlap, so the system has no solutions.

Equation operations

When we have a system of equations, there are three operations we can perform on them without changing the solution set.

Swapping the order of two equations
Multiplying an equation by a nonzero constant
Adding a constant multiple of one equation to another equation

Here’s an example with this system of equations. We’re going to perform each equation operation once on this system.

\[ 2x_1 + 3x_2 + 5x_3 = 7\] \[ 3x_1 - 6x_2 + 9x_3 = 12 \] \[ x_1 + x_3 = 2 \]

An example of the first operation is swapping the order of the first and second equations to get:

\[ \class{red}{3x_1 - 6x_2 + 9x_3 = 12} \] \[ \class{red}{2x_1 + 3x_2 + 5x_3 = 7} \] \[ x_1 + x_3 = 2 \]

We have swapped the order of the highlighted equations.

An example of the second operation is multiplying the third equation by 2 to get:

\[ 3x_1 - 6x_2 + 9x_3 = 12 \] \[ 2x_1 + 3x_2 + 5x_3 = 7 \] \[ \class{red}{2x_1 + 2x_3 = 4} \]

We have multiplied the highlighted equation by 2.

An example of the third operation is adding 3 times the second equation to the third equation to get:

\[ 3x_1 - 6x_2 + 9x_3 = 12 \] \[ \class{blue}{2x_1 + 3x_2 + 5x_3 = 7} \] \[ \class{red}{8x_1 + 9x_2 + 17x_3 = 25} \]

We have added 3 times the blue equation to the red equation.

Vectors and Matrices

In the section after this one, we will learn how to represent systems of linear equations using vectors and matrices. But let’s first talk about what vectors and matrices even are.

Vectors

A vector is a list of numbers. They can be represented in multiple ways: they can be written out like coordinates (e.g. \((1, 2, 3)\)), or as a column vector, where the numbers are stacked vertically:

\[ \mathbf{v} = \begin{bmatrix} 1\\ 2\\ 3 \end{bmatrix} \]

Vectors are usually denoted by lowercase letters in bold.

There is a special type of vector known as a zero vector, and it’s a vector that only contains zeros. Here’s an example of a zero vector:

\[ \mathbf{0} = \begin{bmatrix} 0\\ 0\\ 0 \end{bmatrix}\]

The zero vector is normally denoted by the digit 0 in bold.

I will use the notation \([\mathbf{v}]_i\) to refer to the \(i\)th entry of \(\mathbf{v}\). For example, if \(\mathbf{v} = (2, 3, 4)\), then \([\mathbf{v}]_1 = 2\), \([\mathbf{v}]_2 = 3\), and \([\mathbf{v}]_3 = 4\).

Matrices

A matrix is a 2-dimensional grid of numbers. You can think of them as a group of column vectors stacked side by side. Here’s an example:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]

A matrix with \(m\) rows and \(n\) columns is known as an \(m \times n\) matrix. For this example, because our matrix has 3 rows and 3 columns, it is a \(3 \times 3\) matrix.

Matrices are usually denoted by uppercase letters.

I will use the notation \([A]_{i, j}\) to refer to the entry at the \(i\)th row and \(j\)th column of \(A\). For example, for the above matrix \(A\), \([A]_{1, 1} = 1\), \([A]_{2, 3} = 6\), and \([A]_{3, 1} = 7\).

In future sections, we will learn how we can use vectors and matrices to describe systems of equations and discover their properties. We will also eventually learn about the operations we can perform on vectors and matrices.

Representing Systems of Linear Equations

We can represent a system of linear equations using a matrix.

The coefficient matrix

The coefficient matrix is a way to represent the coefficients of a linear system of equations. Each row in the coefficient matrix represents the coefficients in one equation. For example, consider the following system of equations:

\[ \class{red}{2}x_1 + \class{red}{3}x_2 + \class{red}{4}x_3 = 5 \] \[ \class{red}{8}x_1 \class{red}{- 7}x_2 + \class{red}{6}x_3 = 5 \] \[ \class{red}{-3}x_1 \class{red}{- 6}x_2 \class{red}{- 9}x_3 = 12 \]

The coefficients of each equation are highlighted in red. If we put these coefficients into a matrix, we get the coefficient matrix. The coefficient matrix \(A\) for this system of equations is:

\[ A = \begin{bmatrix} \class{red}{2} & \class{red}{3} & \class{red}{4}\\ \class{red}{8} & \class{red}{-7} & \class{red}{6}\\ \class{red}{-3} & \class{red}{-6} & \class{red}{-9}\\ \end{bmatrix} \]

Don’t forget about the signs of each coefficient!

The vector of constants

The vector of constants holds the constants that each linear expression in our system of equations equals. Typically, these are the constants on the right-hand side of each equation. Let’s go back to our system of equations:

\[ {2}x_1 + {3}x_2 + {4}x_3 = \class{blue}{5} \] \[ {8}x_1 - 7x_2 + {6}x_3 = \class{blue}{5} \] \[ -3x_1 - 6x_2 - 9x_3 = \class{blue}{12} \]

This time, I’ve highlighted the constants in this system. Putting these vectors into a column vector gives us the vector of constants. The vector of constants \(\mathbf{b}\) for this system is:

\[ \mathbf{b} = \begin{bmatrix} \class{blue}{5} \\ \class{blue}{5} \\ \class{blue}{12} \end{bmatrix} \]

The augmented matrix

If we add another column to the right-hand side of the coefficient matrix and fill it up with the vector of constants, we get the augmented matrix for a system of equations. In this example, the augmented matrix is:

\[ \begin{bmatrix} \class{red}{2} & \class{red}{3} & \class{red}{4} & \class{blue}{5}\\ \class{red}{8} & \class{red}{-7} & \class{red}{6} & \class{blue}{5}\\ \class{red}{-3} & \class{red}{-6} & \class{red}{-9} & \class{blue}{12}\\ \end{bmatrix} \]

The benefit of using a coefficient matrix is that we no longer have to worry about what our variables are called; it provides a compact way to record all of the information about a system of equations in one neat package.

Reduced Row-Echelon Form

Now that we know how to represent systems of equations with matrices, how can we use this knowledge to actually solve them? To do this, we need to simplify our systems down into a form that’s simpler. One way to do this is to convert a system’s augmented matrix into a simpler form known as reduced row-echelon form.

A matrix is in reduced row-echelon form (abbreviated RREF) when it meets these conditions:

If a row only contains zeros (this is known as a zero row), it is below all rows that aren’t zero rows.
The leftmost nonzero number of every row is a 1 (unless the row is a zero row); this 1 is known as a leading 1.
If a column has a leading 1, it is the only nonzero number in that column.
Consider any two leading 1s in the matrix. If one of the leading 1s is in a lower row than the other, then it must be to the right of the other leading 1.
- In symbols: Let’s say the first leading 1 is in row \(r_1\) and column \(c_1\) and the second leading 1 is in row \(r_2\) and column \(c_2\). It must always be true that if \(r_2 \gt r_1\), then \(c_2 \gt c_1\).

Here’s a matrix that is not in reduced row-echelon form because it violates condition 1:

\[ \begin{bmatrix} 1 & 0 & 3\\ \class{red}{0} & \class{red}{0} & \class{red}{0}\\ 0 & 1 & 2 \end{bmatrix}\]

The highlighted row is not below all other rows with nonzero terms.

Here’s a matrix that violates condition 2:

\[ \begin{bmatrix} \class{red}{2} & 0 & 3\\ 0 & 1 & 2\\ 0 & 0 & 0\\ \end{bmatrix}\]

The highlighted entry is a leading nonzero term of the first row but is not 1.

Here’s a matrix that violates condition 3:

\[ \begin{bmatrix} 1 & \class{red}{2} & 3\\ 0 & \class{red}{1} & 2\\ 0 & 0 & 0\\ \end{bmatrix}\]

The leading 1 in the second row is not the only nonzero entry in its column (as shown by the highlighted entries).

And finally, here’s a matrix that violates condition 4:

\[ \begin{bmatrix} 1 & 0 & 3\\ 0 & \class{red}{1} & 2\\ \class{blue}{1} & 0 & 0\\ \end{bmatrix}\]

The blue leading 1 is below the red leading 1, but the blue leading 1 is to the left of the red leading 1. (The row number of the blue leading 1 is greater than the row number of the red leading 1, but the column number of the blue leading 1 is less than the row number of the red leading 1.)

Here are some examples of matrices in reduced row-echelon form:

\[ \begin{bmatrix} 1 & 0 & 3\\ 0 & 1 & 2\\ 0 & 0 & 0\\ \end{bmatrix}\] \[ \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}\] \[ \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 3 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ \end{bmatrix}\]

Notice the staircase-like pattern that the leading 1s make in matrices that are in reduced row-echelon form.

Reduced row-echelon form is useful because once we get an augmented matrix into reduced row-echelon form (you will learn how to do this in the next section), it’s easy to find the solutions to the corresponding system of equations.

For example, the corresponding system of equations to the above matrix is:

\[ 1x_1 + 0x_2 = 3 \] \[ 0x_1 + 1x_2 = 2 \] \[ 0x_1 + 0x_2 = 0 \]

The last line simplifies to \(0 = 0\), so it is always true no matter what the values of \(x_1\) and \(x_2\) are. Therefore, we can disregard that equation. The other two lines directly give us the values of \(x_1\) and \(x_2\): \(x_1 = 3\) and \(x_2 = 2\).

For a matrix in reduced row-echelon form, a column with a leading 1 is known as a pivot column. In this example, column 1 and column 2 are pivot columns.

\[ \begin{bmatrix} \class{red}{1} & 0 & 3\\ 0 & \class{red}{1} & 2\\ 0 & 0 & 0\\ \end{bmatrix}\]

A column with a leading 1 is a pivot column.

I will sometimes use the term “row-reducing” to refer to analyzing the reduced row-echelon form of a matrix (without actually changing the original matrix).

From now on, I will box the leading 1s in every row-reduced matrix for clarity. Here’s an example of what that might look like:

\[ \begin{bmatrix} \boxed{1} & 0 & 3 & 4\\ 0 & \boxed{1} & 2 & 3\\ 0 & 0 & 0 & 0 \end{bmatrix} \]

Gauss-Jordan Elimination

Gauss-Jordan elimination is a systematic way to turn a matrix into reduced row-echelon form. The basic idea is to go through our matrix column by column and perform row operations to turn this matrix into reduced row-echelon form.

Row operations

There are three row operations we can perform on a matrix without changing the solution set of the corresponding system of equations:

Swap the order of two rows
Multiply every entry in a row by a nonzero constant
Add a constant multiple of one row to another row (i.e. multiply every entry of a row by a constant multiple and add it to the entries of another row without changing the original row)

Notice the similarities between the row operations and equation operations mentioned in Intro to Systems of Linear Equations. When one matrix can be transformed into another matrix through these row operations, the matrices are known as row-equivalent.

Let’s go through the process of Gauss-Jordan elimination with the following matrix:

\[ A = \begin{bmatrix} 1 & 1 & 1 & 2\\ 2 & -3 & 1 & -9\\ -4 & 0 & 5 & -14 \end{bmatrix}\]

We first need to define some variables to keep track of where we are in the process. We will define the variables \(j\) and \(r\) and set them both to 0. \(j\) will serve as a counter to keep track of what column we’re on. In addition, we’ll define \(m\) as the number of rows in the matrix \(A\) and \(n\) as the number of columns (i.e. \(A\) is an \(m \times n\) matrix).

The first column (\(j = 1\))

We start off by increasing \(j\) by 1. The variable \(j\) is now 1, meaning that we’re working on the first column.

\[\begin{bmatrix} \class{red}{1} & 1 & 1 & 2\\ \class{red}{2} & -3 & 1 & -9\\ \class{red}{-4} & 0 & 5 & -14 \end{bmatrix}\]

Now we look at the entries of \(A\) in this column (in this case the first column). If all of the entries in this column from row \(r + 1\) to \(m\) are zero, then we skip this column. \(r + 1\) is currently 1 in this case, so we need to look at all of the entries in this column. These entries are not all zero, so we proceed.

Our goal now is to convert column 1 to all zeros, except for one entry which will end up being a 1 (this will be a leading 1 in our final row-reduced matrix). Here’s how we’ll do that:

We choose a row from rows \(r + 1\) to \(m\) such that the entry in column \(j\) is nonzero. We’ll call the index of this row \(i\). In this case, we can choose any of the rows, so I’ll choose row 1.

\[\begin{bmatrix} \class{blue}{1} & \class{blue}{1} & \class{blue}{1} & \class{blue}{2}\\ 2 & -3 & 1 & -9\\ -4 & 0 & 5 & -14 \end{bmatrix}\]

This is row \(i = 1\), the row we’re focusing on.

Now we increase \(r\) by 1 (after incrementing, \(r\) is 1 now). If \(i\) and \(r\) are different, we swap rows \(i\) and \(r\) of the matrix. In this case, because \(i\) and \(r\) are both 1, we don’t need to do anything here.

Now, we multiply row \(r\) by a constant to make the entry at column \(j\) 1. In this case, the entry at row 1 and column 1 is already 1, so we skip this step.

\[\begin{bmatrix} \class{purple}{1} & 1 & 1 & 2\\ 2 & -3 & 1 & -9\\ -4 & 0 & 5 & -14 \end{bmatrix}\]

Because the entry at row \(r\) and column \(j\) is already 1, we don’t do anything. Otherwise, we would multiply row \(r\) by a constant to turn this entry into a 1.

We then add constant multiples of row \(r\) to all other rows to make all other entries of column \(j\) (in this case, the first column) zero.

Adding -2 times row 1 to row 2:

\[ \begin{bmatrix} 1 & 1 & 1 & 2\\ 0 & -5 & -1 & -13\\ -4 & 0 & 5 & -14 \end{bmatrix} \]

Adding 4 times row 1 to row 3:

\[ \begin{bmatrix} \boxed{1} & 1 & 1 & 2\\ 0 & -5 & -1 & -13\\ 0 & 4 & 9 & -6 \end{bmatrix} \]

After completing each column, I will box that column’s leading 1 if it exists for clarity.

Now we can move on to the next column.

The second column (\(j = 2\))

We add 1 to \(j\), so \(j\) is currently 2. This means that we’re focusing on the second column.

\[ \begin{bmatrix} \boxed{1} & \class{red}{1} & 1 & 2\\ 0 & \class{red}{-5} & -1 & -13\\ 0 & \class{red}{4} & 9 & -6 \end{bmatrix} \]

Now we need to choose a row from row \(r + 1\) (which is currently 2) to \(m\) (which is 3) with a nonzero entry in this column (i.e. column \(j = 2\)). I’ll choose row 2 for this example, so \(i = 2\).

Now we increase \(r\) by 1, so it’s currently 2. Because we chose row 2 and \(r = 2\), we don’t need to swap any rows.

Now we need to set the entry at row \(r\) and column \(j\) to a 1 by multiplying row \(r\) by a constant. To do this, we multiply row 2 by \(-1/5\) to turn the entry at row 2 and column 2 to a 1.

\[ \begin{bmatrix} \boxed{1} & 1 & 1 & 2\\ 0 & \class{purple}{1} & 1/5 & 13/5\\ 0 & 4 & 9 & -6 \end{bmatrix} \]

The purpose of this row operation is to turn the entry at row \(r\) and column \(j\) into a 1.

Now we have to zero out the other entries of column 2 by adding constant multiples of row 2 to every other row. Let’s start by adding -1 times row 2 to row 1:

\[ \begin{bmatrix} \boxed{1} & 0 & 4/5 & -3/5\\ 0 & 1 & 1/5 & 13/5\\ 0 & 4 & 9 & -6 \end{bmatrix} \]

Now let’s add -4 times row 2 to row 3:

\[ \begin{bmatrix} \boxed{1} & 0 & 4/5 & -3/5\\ 0 & \boxed{1} & 1/5 & 13/5\\ 0 & 0 & 41/5 & -82/5 \end{bmatrix} \]

Now we move on to column 3.

The third column (\(j = 3\))

We need to choose a row from \(r + 1 = 3\) to \(m = 3\). Our only choice in this case is the 3rd row. We then increase \(r\) by 1, so \(r = 3\) now. Therefore, we don’t need to swap any rows.

Now we want the entry at row 3 and column 3 to be a 1, so we multiply row 3 by \(5/41\).

\[ \begin{bmatrix} \boxed{1} & 0 & 4/5 & -3/5\\ 0 & \boxed{1} & 1/5 & 13/5\\ 0 & 0 & 1 & -2 \end{bmatrix} \]

Now we just have to zero out the other entries of column 3. Adding \(-1/5\) times the 3rd row to the 2nd row:

\[ \begin{bmatrix} \boxed{1} & 0 & 4/5 & -3/5\\ 0 & \boxed{1} & 0 & 3\\ 0 & 0 & 1 & -2 \end{bmatrix} \]

Adding \(-4/5\) times row 3 to row 1:

\[ \begin{bmatrix} \boxed{1} & 0 & 0 & 1\\ 0 & \boxed{1} & 0 & 3\\ 0 & 0 & \boxed{1} & -2 \end{bmatrix} \]

Our matrix is now in reduced row-echelon form! In this form, we can easily read out the solutions to the corresponding system of equations. Let’s translate this matrix into its corresponding system of equations now:

\[ 1x_1 + 0x_2 + 0x_3 = 1 \] \[ 0x_1 + 1x_2 + 0x_3 = 3 \] \[ 0x_1 + 0x_2 + 1x_3 = -2 \]

In this form, we can easily tell that the solution is \(x_1 = 1\), \(x_2 = 3\), and \(x_3 = -2\).

Note that every matrix has only one row-equivalent matrix in reduced row-echelon form.

Another example

Here’s another example of Gauss-Jordan elimination with this matrix:

\[ \begin{bmatrix} 1 & 2 & 4 & 0\\ 2 & 4 & 8 & 0\\ 3 & 5 & 7 & 0\\ \end{bmatrix} \]

We start off by initializing our variables: \(r = 0\) and \(j = 0\).

The first column (\(j = 1\))

We need to look at the entries in column \(j = 1\) from \(r + 1\) to \(m\) (the number of rows). Because not all of the entries are zero, we don’t skip this column. We choose a row from rows \(r + 1\) to \(m\) with a nonzero entry in column \(j = 1\). I’ll choose row 1.

Now we add 1 to \(r\). We don’t need to swap rows because the row we picked is row \(r\).

We also don’t need to multiply row 1 by a constant because the entry at row \(r = 1\) and column \(j = 1\) is already 1. Now we can zero out the other entries of column \(j = 1\).

Adding -2 times row 1 to row 2:

\[ \begin{bmatrix} 1 & 2 & 4 & 0\\ 0 & 0 & 0 & 0\\ 3 & 5 & 7 & 0\\ \end{bmatrix} \]

Adding -3 times row 1 to row 3:

\[ \begin{bmatrix} \boxed{1} & 2 & 4 & 0\\ 0 & 0 & 0 & 0\\ 0 & -1 & -5 & 0\\ \end{bmatrix} \]

The second column (\(j = 2\))

We now look at the entries in column 2 from rows \(r + 1 = 2\) to \(m = 3\).

\[ \begin{bmatrix} \boxed{1} & \class{red}{2} & 4 & 0\\ 0 & \class{red}{0} & 0 & 0\\ 0 & \class{red}{-1} & -5 & 0\\ \end{bmatrix} \]

Our only choice here is row 3, so we set the variable \(i\) to 3.

We need to add 1 to \(r\) now, so \(r = 2\). Because \(r\) is not equal to \(i\), we need to swap rows \(r\) and \(i\):

\[ \begin{bmatrix} \boxed{1} & 2 & 4 & 0\\ 0 & -1 & -5 & 0\\ 0 & 0 & 0 & 0\\ \end{bmatrix} \]

The matrix after swapping rows \(r = 2\) and \(i = 3\).

Now we convert the entry at row \(r = 2\) and column \(j = 2\) to a 1 by multiplying row 2 by -1:

\[ \begin{bmatrix} \boxed{1} & 2 & 4 & 0\\ 0 & 1 & 5 & 0\\ 0 & 0 & 0 & 0\\ \end{bmatrix} \]

Finally, we zero out the entry at row 1 and column 2 by adding -2 times row 2 to row 1:

\[ \begin{bmatrix} \boxed{1} & 0 & -6 & 0\\ 0 & \boxed{1} & 5 & 0\\ 0 & 0 & 0 & 0\\ \end{bmatrix} \]

This matrix is in reduced row-echelon form now! Here’s what it looks like when we translate this matrix back into a system of equations:

\[ 1x_1 + 0x_2 - 6x_3 = 0\] \[ 0x_1 + 1x_2 + 5x_3 = 0\] \[ 0x_1 + 0x_2 + 0x_3 = 0\]

After some simplification, this system becomes:

\[ x_1 - 6x_3 = 0 \] \[ x_2 + 5x_3 = 0 \] \[ 0 = 0 \]

We can disregard the \(0 = 0\) equation because it will always be true. Let’s rearrange the other two equations to solve for \(x_1\) and \(x_2\):

\[ x_1 = 6x_3 \] \[ x_2 = -5x_3 \]

Therefore, any solution to our original system of equations is of the form \((x_1, x_2, x_3) = (6x_3, -5x_3, x_3)\).

Consistent Systems of Equations and Free/Dependent Variables

Some systems of equations have no solutions, and some systems of equations have one or infinitely many solutions. When a system of equations has at least one solution, we call it a consistent system. A system of equations with no solutions is an inconsistent system.

A simple example of a consistent system is the following:

\[ x_1 + x_2 = 3\] \[ x_1 - x_2 = 1 \]

This system has a solution \(x_1 = 2\) and \(x_2 = 1\). (Note: Consistent systems can also have infinitely many solutions.)

A simple example of an inconsistent system is:

\[ x_1 + x_2 = 3\] \[ 2x_1 + 2x_2 = 7 \]

This system has no solution, so it is inconsistent.

An example of a system with infinitely many solutions

Consider the following augmented matrix:

\[ \begin{bmatrix} 1 & 1 & 1 & 2 \\ 2 & 2 & 2 & 4\\ 3 & 2 & 4 & 6 \end{bmatrix} \]

When we convert this matrix into reduced row-echelon form, we get:

\[ \begin{bmatrix} \boxed{1} & 0 & 2 & 2 \\ 0 & \boxed{1} & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\]

The corresponding system of equations is:

\[ 1x_1 + 0x_2 + 2x_3 = 2 \] \[ 0x_1 + 1x_2 - 1x_3 = 0 \] \[ 0x_1 + 0x_2 + 0x_3 = 0 \]

We can write this system more simply as:

\[ x_1 + x_3 = 2 \] \[ x_2 - x_3 = 0 \] \[ 0 = 0 \]

The last equation \(0 = 0\) is always true, so we can disregard it. But notice what happens with the other two equations. We can rewrite them as follows:

\[ x_1 = 2 - x_3\] \[ x_2 = x_3\]

Notice how when we write the solutions in this way, \(x_3\) is free to take on any value, and the values of \(x_1\) and \(x_2\) depend on \(x_3\). Because \(x_3\) can take on any value, there are infinitely many solutions.

We can describe the solution set of this system as all ordered pairs of the form \((x_1, x_2, x_3) = (2 - x_3, x_3, x_3)\) where \(x_3\) is any real number (or even any complex number). Typically, I will write the solution set as a column vector, like this:

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2-x_3 \\ x_3 \\ x_3 \end{bmatrix} \]

Dependent and free variables

If we have \(A\), the augmented matrix of a system of equations, and \(B\), a row-equivalent matrix in reduced row-echelon form, then if column \(j\) of \(B\) is a pivot column, then the variable \(x_j\) is known as a dependent variable. All other variables are known as free variables.

Let’s look back at our previous example. Here is the matrix in reduced row-echelon form:

\[ \begin{bmatrix} \boxed{1} & 0 & 2 & 2 \\ 0 & \boxed{1} & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\]

Notice that columns 1 and 2 are pivot columns. This means that in the corresponding system of equations, the variables \(x_1\) and \(x_2\) are dependent, and \(x_3\) is free. This is related to how \(x_3\) was able to take on any value in the solution of our system of equations, while the values of \(x_1\) and \(x_2\) depended on the value of \(x_3\).

Determining consistency of systems

We can tell if a system is consistent by looking at the reduced row-echelon form of its corresponding augmented matrix. If this row-reduced matrix has a pivot column at column \(n + 1\), where \(n\) is the number of variables in the system, then the system is inconsistent (i.e. the system is inconsistent if the last column of the row-reduced matrix is a pivot column). Otherwise it is consistent.

For example, consider this system:

\[ x_1 + x_2 + x_3 = 1 \] \[ 2x_1 + 2x_2 + 2x_3 = 3 \] \[ x_1 + 2x_2 + 3x_3 = 4 \]

The augmented matrix of this system row-reduces to:

\[ \begin{bmatrix} \boxed{1} & 0 & -1 & 0\\ 0 & \boxed{1} & 2 & 0\\ 0 & 0 & 0 & \boxed{\class{red}{1}} \end{bmatrix}\]

Because column \(n + 1 = 4\) of this matrix is a pivot column (as indicated by the leading 1 highlighted in red), this system is inconsistent. (Notice how it is impossible for the equations \(x_1 + x_2 + x_3 = 1\) and \(2x_1 + 2x_2 + 2x_3 = 3\) to be true at the same time! In addition, notice how the last row of the row-reduced matrix translates to \(0 = 1\), a false statement.)

Determining the number of solutions of a system of equations

If the row-reduced matrix of a consistent system has \(r\) pivot columns and \(n\) variables, it is guaranteed that \(r \le n\) (i.e. a system cannot have more pivot columns than variables). If \(r \lt n\), then the system has infinitely many solutions, and if \(r = n\), then the system has exactly one solution.

A consistent system with at least one free variable has infinitely many solutions (because there are infinitely many choices for the free variables), and a consistent system with no free variables has exactly one solution.

If a consistent system has more variables than it has equations, then the system has infinitely many solutions. Here’s an example:

\[ x_1 + 2x_2 + 3x_3 = 0 \] \[ 4x_1 - 5x_2 + 6x_3 = 0 \]

We can tell this system has infinitely many solutions because:

This system is consistent: \(x_1 = 0\), \(x_2 = 0\), and \(x_3 = 0\) is a solution.
There are more variables than equations, since this system has 3 variables and 2 equations.

In conclusion, there are three possibilities for a linear system with \(n\) variables with augmented matrix \(A\):

If column \(n + 1\) (the right-most column) of the reduced row-echelon form of \(A\) is a pivot column, the system is inconsistent and has no solutions.
Otherwise:
- If the reduced row-echelon form of \(A\) has the same number of pivot columns as it has variables (i.e. \(r = n\)), the system has one solution.
- If the reduced row-echelon form of \(A\) has fewer pivot columns than it has variables (i.e. \(r \lt n\)), the system has infinitely many solutions.

The following augmented matrix represents a system with no solution:

\[ \begin{bmatrix} 1 & 7 & 8 & 1\\ 2 & 5 & 7 & 0\\ 2 & -2 & 0 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 1 & 0\\ 0 & \boxed{1} & 1 & 0\\ 0 & 0 & 0 & \boxed{1} \end{bmatrix} \]

The following augmented matrix represents a system with exactly one solution:

\[ \begin{bmatrix} 1 & 7 & 7 & 0\\ 2 & 5 & 7 & 0\\ 2 & -2 & 0 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & 0\\ 0 & \boxed{1} & 0 & 0\\ 0 & 0 & \boxed{1} & 0 \end{bmatrix} \]

The following augmented matrix represents a system with infinitely many solutions:

\[ \begin{bmatrix} 1 & 7 & 8 & 0\\ 2 & 5 & 7 & 0\\ 2 & -2 & 0 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 1 & 0\\ 0 & \boxed{1} & 1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix} \]

Counting free variables

We can tell how many free variables a consistent system has by looking at its corresponding matrix in reduced row-echelon form. If this system has \(n\) variables and the matrix in reduced row-echelon form has \(r\) nonzero rows (rows that don’t only contain zeros), then we can describe the solutions of the system with \(n - r\) free variables. Note that \(r\) is also the number of pivot columns and the number of leading 1s.

Going back to our first example, the matrix in reduced row-echelon form has two nonzero rows, so \(r = 2\). The system of equations has 3 variables \(x_1\), \(x_2\), and \(x_3\), so \(n = 3\). Therefore, the system of equations has \(n - r = 3 - 2 = 1\) free variable.

\[ \begin{bmatrix} \boxed{1} & 0 & 2 & 2 \\ 0 & \boxed{1} & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\]

This matrix in reduced row-echelon form has 2 pivot columns (as indicated by the two leading 1s). Since the system has 3 variables, there is \(3 - 2 = 1\) free variable in the system.

To contrast this example, let’s say a system has augmented matrix \(A\) and the reduced row-echelon form of \(A\) is:

\[ \begin{bmatrix} \boxed{1} & 0 & 0 & 0\\ 0 & \boxed{1} & 0 & 0\\ 0 & 0 & \boxed{1} & 0 \end{bmatrix} \]

In this case, the system has 3 variables while the reduced matrix has 3 pivot columns, so there are \(3 - 3 = 0\) free variables.

Homogeneous Systems of Equations and Null Spaces

A homogeneous system of equations is a special type of linear system of equations where all of the constants are zero.

Here’s an example of a homogeneous system:

\[ \begin{flalign} x_1 + x_2 + x_3 &= \class{red}{0}\\ 2x_1 - x_2 + 4x_3 &= \class{red}{0} \\ -x_1 + 2x_2 &= \class{red}{0} \end{flalign} \]

Notice how the constants (highlighted in red) are zero.

Homogeneous systems are always consistent: you can always find a solution to a homogeneous system by setting all of the variables to zero (this is known as the trivial solution).

In this case, setting \(x_1\), \(x_2\), and \(x_3\) all to zero results in all three equations becoming \(0 = 0\).

Are there any other solutions to this system? We can find that out by row-reducing the augmented matrix to get:

\[ \begin{bmatrix} 1 & 1 & 1 & 0\\ 2 & -1 & 4 & 0\\ -1 & 2 & 0 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & 0\\ 0 & \boxed{1} & 0 & 0\\ 0 & 0 & \boxed{1} & 0 \end{bmatrix}\]

Converting this back into a system of equations, we have:

\[ \begin{flalign} x_1 = 0 \\ x_2 = 0 \\ x_3 = 0 \end{flalign} \]

Therefore, the trivial solution to this system is also the only solution.

If a homogeneous system has more variables than equations, it has infinitely many solutions (this is because homogeneous systems are always consistent and a consistent system with more variables than equations has infinitely many solutions).

Null spaces of matrices

The null space of a matrix \(A\) is the solution set of the system of equations with \(A\) as its coefficient matrix and the zero vector as its vector of constants (i.e. all of the constants are zero).

Let’s look at the coefficient matrix for our previous system of equations:

\[ \begin{bmatrix} 1 & 1 & 1\\ 2 & -1 & 4\\ -1 & 2 & 0 \end{bmatrix}\]

The null space of this matrix is the solution set to our previous system of equations. In this case, the null space only contains the zero vector, since that’s the only solution to our system of equations.

Now let’s look at another matrix:

\[ \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 0 & 2\\ \end{bmatrix} \]

The full augmented matrix for the homogeneous system of equations for this matrix is:

\[ \begin{bmatrix} 1 & 2 & 3 & 0\\ -1 & -2 & -3 & 0\\ 2 & 0 & 2 & 0\\ \end{bmatrix} \]

Row-reducing this matrix results in:

\[ \begin{bmatrix} \boxed{1} & 0 & 1 & 0\\ 0 & \boxed{1} & 1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix} \]

Because there are 2 pivot columns and 3 variables, there is \(3 - 2 = 1\) free variable. Therefore, there are infinitely many solutions to the system. As a result, the null space of our original matrix \(A\) has infinitely many elements, with each element being a solution to the system.

Singular and Nonsingular Matrices

Before we talk about singular matrices, let’s define some special types of matrices.

Square matrices

A square matrix is a matrix with the same number of rows and columns.

Here’s an example of a square matrix:

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}\]

This matrix has 3 rows and 3 columns, so it is square.

Here’s an example of a non-square matrix:

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{bmatrix}\]

This matrix has 3 rows and 4 columns, so it is not square.

Identity matrices

An identity matrix is a square matrix with all 1s on the main diagonal and 0s everywhere else. More formally, the \(n \times n\) identity matrix, denoted by \(I_n\), is defined by:

\[ [I_n]_{i,j} = \begin{cases} 1 \;\text{ if } i = j\\ 0 \;\text{ if } i \ne j\\ \end{cases} \quad \text{ for } 1\le i \le n,\, 1 \le j \le n\]

In simple words, the entry at row \(i\) and column \(j\) is 1 if \(i\) and \(j\) are equal and 0 otherwise.

Here are some examples of identity matrices:

\[ I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \] \[ I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \] \[ I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

Singular matrices

A square matrix \(A\) is singular if the system of equations with \(A\) as its coefficient matrix and all-zero constants has infinitely many solutions (equivalently, the system has non-trivial solutions). If the system of equations has only the trivial solution, the matrix \(A\) is nonsingular.

In the previous example, we looked at this matrix:

\[ \begin{bmatrix} 1 & 1 & 1\\ 2 & -1 & 4\\ -1 & 2 & 0 \end{bmatrix}\]

We found that the only solution to the corresponding homogeneous system of equations was the trivial solution where all of the variables were set to zero. Therefore, this matrix is nonsingular.

\[ \begin{flalign} x_1 + x_2 + x_3 &= 0 \\ 2x_1 - x_2 + 4x_3 &= 0 \\ -x_1 + 2x_2 &= 0 \end{flalign} \]

This is the corresponding homogeneous system. The system only has the trivial solution, so the coefficient matrix is nonsingular.

We also looked at this matrix:

\[ \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 0 & 2\\ \end{bmatrix} \]

We found that the corresponding homogeneous system had infinitely many solutions, so this matrix is singular.

\[ \begin{flalign} x_1 + 2x_2 + 3x_3 &= 0 \\ -x_1 - 2x_2 - 3x_3 &= 0 \\ 2x_1 + 2x_3 &= 0 \end{flalign} \]

This is the corresponding homogeneous system. The system has infinitely many solutions, so the coefficient matrix is singular.

An interesting fact about nonsingular matrices is that reducing any nonsingular matrix to reduced row-echelon form always results in an identity matrix. More formally, a matrix is nonsingular if and only if the matrix in reduced row-echelon form is an identity matrix.

Here are the reduced row-echelon forms of the previous two matrices:

\[ \begin{bmatrix} 1 & 1 & 1\\ 2 & -1 & 4\\ -1 & 2 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0\\ 0 & \boxed{1} & 0\\ 0 & 0 & \boxed{1} \end{bmatrix} \]

This matrix is nonsingular, so it row-reduces to an identity matrix.

\[ \begin{bmatrix} 1 & 2 & 3\\ -1 & -2 & -3\\ 2 & 0 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 1\\ 0 & \boxed{1} & 1\\ 0 & 0 & 0 \end{bmatrix} \]

This matrix is singular, so it does not row-reduce to an identity matrix.

The null space of a nonsingular matrix contains just one element: the zero vector. A square matrix \(A\) is nonsingular if and only if its null space only contains the zero vector.

In addition, a square matrix \(A\) is nonsingular if and only if the system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) has a single unique solution for any possible choice for \(\mathbf{b}\).

To summarize, for any matrix \(A\), the following properties are equivalent (meaning for every property, all other properties are true if and only if that property is true):

\(A\) is a nonsingular matrix.
The reduced row-echelon form of \(A\) is equal to the \(n \times n\) identity matrix.
The null space of \(A\) contains only the zero vector.
The linear system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) has a single unique solution for every possible choice of \(\mathbf{b}\).

Nonsingular Matrix Equivalences, Part 1

The concept of a nonsingular matrix is one that will appear many times throughout linear algebra. So far, we have collected a few conditions that are equivalent to a matrix being nonsingular.

If \(A\) is an \(n \times n\) square matrix, these conditions are equivalent:

\(A\) is a nonsingular matrix.
The reduced row-echelon form of \(A\) is equal to the \(n \times n\) identity matrix.
The null space of \(A\) contains only the zero vector.
The linear system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) has a single unique solution for every possible choice of \(\mathbf{b}\).

In future units, I will add more equivalences to this list.

Unit 2: Vectors

A First Course in Linear Algebra link: http://linear.ups.edu/html/chapter-V.html

Vector Operations

Now we will study vectors in more detail. To start, here are some of the operations we can perform involving vectors. (Specifically, these are properties for column vectors. We will study other types of “vectors” later on.)

Vector equality

Two vectors of the same size \(\mathbf{u}\) and \(\mathbf{v}\) are equal if each entry in \(\mathbf{u}\) equals its corresponding entry in \(\mathbf{v}\). More formally, if \(\mathbf{u}\) and \(\mathbf{v}\) have size \(m\), then \(\mathbf{u} = \mathbf{v}\) if:

\[ [\mathbf{u}]_i = [\mathbf{v}]_i \, \text{ for } 1 \le i \le m \]

This definition of vector equality allows us to compactly describe systems of equations as an equality of two vectors. For example, consider the following system:

\[ 2x_1 + 3x_2 + 5x_3 = 7\] \[ 3x_1 - 6x_2 + 9x_3 = 12 \] \[ x_1 + x_3 = 2 \]

We can write this system as follows:

\[ \begin{bmatrix} 2x_1 + 3x_2 + 5x_3 \\ 3x_1 - 6x_2 + 9x_3 \\ x_1 + x_3 \end{bmatrix} = \begin{bmatrix} 7 \\ 12 \\ 2 \end{bmatrix} \]

Vector addition/subtraction

We can add two vectors of the same size by adding the corresponding entries. If \(\mathbf{u}\) and \(\mathbf{v}\) are of size \(m\):

\[ [\mathbf{u} + \mathbf{v}]_i = [\mathbf{u}]_i + [\mathbf{v}]_i \, \text{ for } 1 \le i \le m \]

Similarly, we can subtract two vectors by subtracting their corresponding entries. If \(\mathbf{u}\) and \(\mathbf{v}\) are of size \(m\):

\[ [\mathbf{u} - \mathbf{v}]_i = [\mathbf{u}]_i - [\mathbf{v}]_i \, \text{ for } 1 \le i \le m \]

Scalar-vector multiplication

We can multiply a vector by a scalar by multiplying each of the vector’s components by the scalar. If \(\alpha\) is a scalar and \(\mathbf{v}\) is a vector of size \(m\):

\[ [\alpha\mathbf{u}]_i = \alpha[\mathbf{u}]_i \, \text{ for } 1 \le i \le m \]

Examples of column vector operations

\[ \begin{bmatrix} 1 \\ 5 \\ 9 \end{bmatrix} + \begin{bmatrix} 2 \\ 8 \\ -3 \end{bmatrix} = \begin{bmatrix} 1 + 2 \\ 5 + 8 \\ 9 + (-3) \end{bmatrix} = \begin{bmatrix} 3 \\ 13 \\ 6 \end{bmatrix} \] \[ 3\begin{bmatrix} 1 \\ 5 \\ 9 \end{bmatrix} = \begin{bmatrix} 3(1) \\ 3(5) \\ 3(9) \end{bmatrix} = \begin{bmatrix} 3 \\ 15 \\ 27 \end{bmatrix} \]

Column vector properties

Here are some properties of column vectors.

If \(\mathbf{u}\) and \(\mathbf{v}\) are column vectors of size \(m\), then \(\mathbf{u} + \mathbf{v}\) is a column vector of size \(m\).
If \(\alpha\) is a scalar (any complex number) and \(\mathbf{u}\) is a column vector, then \(\alpha \mathbf{u}\) is a column vector.
If \(\mathbf{u}\) and \(\mathbf{v}\) are column vectors, then \(\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}\).
If \(\mathbf{u}\), \(\mathbf{v}\), and \(\mathbf{w}\) are column vectors, then \(\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}\).
There exists a vector \(\mathbf{0}\) known as the zero vector such that \(\mathbf{u} + \mathbf{0} = \mathbf{u}\) for all possible column vectors \(\mathbf{u}\).
For every possible column vector \(\mathbf{u}\), there exists a vector \(-\mathbf{u}\) such that \(\mathbf{u} + (-\mathbf{u}) = \mathbf{0}\).
If \(\alpha\) and \(\beta\) are scalars and \(\mathbf{u}\) is a column vector, then \(\alpha(\beta \mathbf{u}) = (\alpha \beta) \mathbf{u}\).
If \(\alpha\) is a scalar and \(\mathbf{u}\) and \(\mathbf{v}\) are column vectors, then \(\alpha(\mathbf{u} + \mathbf{v}) = \alpha \mathbf{u} + \alpha \mathbf{v}\).
If \(\alpha\) and \(\beta\) are scalars and \(\mathbf{u}\) is a column vector, then \((\alpha + \beta)\mathbf{u} = \alpha \mathbf{u} + \beta \mathbf{u}\).
For all column vectors \(\mathbf{u}\), \(1\mathbf{u} = \mathbf{u}\).

Sets and Set Notation

Before we continue talking about vectors, let’s take a moment to review sets, which we will be using a lot in linear algebra.

A set is a collection of objects, which are called elements. Typically these objects will be numbers, but there’s no restriction on the type of objects that a set can contain (they can contain vectors, matrices, or anything else you can think of). Sets can contain a finite or infinite number of elements.

The notation for sets involves curly brackets. For example, the set containing the numbers 1 and 2 can be written as \(\{1, 2\}\), and the set containing the letters A, B, and C can be written as \(\{\text{A}, \text{B}, \text{C}\}\).

Sets are unordered, meaning the order of elements does not matter, e.g. \(\{1, 2, 3\}\) and \(\{3, 1, 2\}\) are the same set.

Sets are typically denoted using capital letters, like \(S\). If an object \(x\) is in the set \(S\), we write \(x \in S\), and if \(x\) is not in \(S\), we write \(x \notin S\).

Subsets and set equality

A subset \(T\) of a set \(S\) is a set such that every element in \(T\) is also in \(S\). That is, if \(x \in T\), then \(x \in S\). This relationship is denoted \(T \subseteq S\).

Note that the definition of a subset allows \(S\) and \(T\) to be equal to each other. A proper subset \(T\) of a set \(S\) is a subset of \(S\) that is not equal to \(S\), denoted by \(T \subset S\). That is, if \(T \subseteq S\) and \(T \neq S\), then \(T \subset S\).

We still haven’t formally defined set equality yet. Two sets \(S\) and \(T\) are equal, denoted by \(S = T\), if \(S\) and \(T\) are subsets of each other; that is, \(S \subseteq T\) and \(T \subseteq S\).

Common sets of numbers

Some sets of numbers are used so often that they get their own symbols:

\(\mathbb{Z}\): set of integers
\(\mathbb{Q}\): set of rational numbers
\(\mathbb{R}\): set of real numbers
\(\mathbb{C}\): set of complex numbers

Defining sets

We can define a set by listing its elements, e.g. \(S = \{2, 4, 8, 16\}\), but we also could define a set by specifying a condition for an element to be in the set. For example:

\[ S = \{x \in \mathbb{Z} | x \text{ is even}\} \]

The vertical bar means “such that”, so \(S\) is the set of all even integers. Sometimes a colon instead of a vertical bar is used to mean “such that”.

Here’s another example:

\[ T = \left\{ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \bigg| x_1 + x_2 = 1 \right\}\]

\(T\) contains all column vectors of size 2 whose entries sum to 1.

Linear Combinations

Now that we’ve defined vector operations, we can create linear combinations of vectors. A linear combination of the scalars \(\alpha_1, \alpha_2, ..., \alpha_n\) and the vectors \(\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\) is:

\[ \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \cdots + \alpha_n\mathbf{u}_n \]

We can represent systems of linear equations as linear combinations. For example, consider this system:

\[ \begin{flalign} x_1 + x_2 + x_3 &= 2 \\ 2x_1 - 3x_2 + x_3 &= -9 \\ -4x_1 + 5x_3 &= -14 \end{flalign} \]

We can use vector equality to write these three equations as a single equation:

\[ \begin{bmatrix} x_1 + x_2 + x_3\\ 2x_1 - 3x_2 + x_3\\ -4x_1 + 5x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} \]

We can write the left-hand side as a sum of vectors:

\[ \begin{bmatrix} x_1 \\ 2x_1 \\ -4x_1\\ \end{bmatrix} + \begin{bmatrix} x_2 \\ -3x_2 \\ 0 \end{bmatrix} + \begin{bmatrix} x_3 \\ x_3 \\ 5x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} \]

And we can do some factoring:

\[ x_1\begin{bmatrix} 1 \\ 2 \\ -4 \\ \end{bmatrix} + x_2\begin{bmatrix} 1 \\ -3 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} \]

Now we have written our system of equations as an equality of a linear combination of the scalars \(x_1\), \(x_2\), and \(x_3\) and the columns of the system’s coefficient matrix.

We previously found that a solution to this system of equations is \(x_1 = 1\), \(x_2 = 3\), and \(x_3 = -2\). We can check this by plugging in these values:

\[ 1\begin{bmatrix} 1 \\ 2 \\ -4 \\ \end{bmatrix} + 3\begin{bmatrix} 1 \\ -3 \\ 0 \end{bmatrix} + (-2)\begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} \] \[ \begin{bmatrix} 1 \\ 2 \\ -4 \\ \end{bmatrix} + \begin{bmatrix} 3 \\ -9 \\ 0 \end{bmatrix} + \begin{bmatrix} -2 \\ -2 \\ -10 \end{bmatrix} = \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} \] \[ \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} = \begin{bmatrix} 2 \\ -9 \\ -14 \end{bmatrix} \]

The importance of writing systems as linear combinations is that it gives us a different perspective of what it means to solve a linear system of equations.

Solutions to linear systems as linear combinations

Consider a system of equations with an \(m \times n \) coefficient matrix \(A\) and vector of constants \(\mathbf{b}\). If \(\mathbf{A}_1, \mathbf{A}_2, ..., \mathbf{A}_n\) are the columns of \(A\), then the vector \(\mathbf{x}\) is a solution the the system if and only if:

\[ [\mathbf{x}]_1\mathbf{A}_1 + [\mathbf{x}]_2\mathbf{A}_2 + \cdots + [\mathbf{x}]_n\mathbf{A}_n = \mathbf{b} \]

We can use this to write all of the solutions of a linear system of equations as a linear combination. For example, consider this system:

\[ \begin{flalign} x_1 + 2x_2 + 3x_3 + 4x_4 &= 5\\ 6x_1 + 7x_2 + 8x_3 + 9x_4 &= 10\\ 11x_1 + 12x_2 + 13x_3 + 14x_4 &= 15 \end{flalign}\]

Let’s write the corresponding augmented matrix in reduced row-echelon form:

\[\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 6 & 7 & 8 & 9 & 10 \\ 11 & 12 & 13 & 14 & 15 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & -1 & -2 & -3 \\ 0 & \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\]

Converting this matrix back into a system of equations, we get:

\[ \begin{flalign} x_1 - x_3 - 2x_4 = -3 \\ x_2 + 2x_3 + 3x_4 = 4 \\ 0 = 0 \end{flalign} \]

We can disregard the \(0 = 0\) equation, and solve for \(x_1\) and \(x_2\) in terms of the free variables \(x_3\) and \(x_4\) to get:

\[ x_1 = -3 + x_3 + 2x_4 \] \[ x_2 = 4 - 2x_3 - 3x_4 \]

We can write this in vector form as:

\[ \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ \end{bmatrix} = \begin{bmatrix} -3 + x_3 + 2x_4\\ 4 - 2x_3 - 3x_4\\ x_3\\ x_4\\ \end{bmatrix} \]

This is one way we can write the solutions to this system. We can also decompose the right-hand side as follows:

\[ \begin{flalign} \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ \end{bmatrix} &= \begin{bmatrix} -3\\ 4\\ 0\\ 0\\ \end{bmatrix} + \begin{bmatrix} x_3\\ -2x_3\\ x_3\\ 0\\ \end{bmatrix} + \begin{bmatrix} 2x_4\\ -3x_4\\ 0\\ x_4\\ \end{bmatrix}\\ &= \begin{bmatrix} -3\\ 4\\ 0\\ 0\\ \end{bmatrix} + x_3\begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4\begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix} \end{flalign} \]

This is a compact way to describe all of the solutions of this linear system. We can choose any values of \(x_3\) and \(x_4\) and this equation will give us a solution. For example, here’s what we get when \(x_3 = 1\) and \(x_4 = 1\):

\[ \begin{flalign} \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ \end{bmatrix} &= \begin{bmatrix} -3\\ 4\\ 0\\ 0\\ \end{bmatrix} + 1\begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + 1\begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix}\\ &= \begin{bmatrix} 0 \\ -1 \\ 1 \\ 1 \end{bmatrix} \end{flalign} \]

And here’s what we get when \(x_3 = 5\) and \(x_4 = -10\):

\[ \begin{flalign} \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ \end{bmatrix} &= \begin{bmatrix} -3\\ 4\\ 0\\ 0\\ \end{bmatrix} + 5\begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + (-10)\begin{bmatrix} 2 \\ -3 \\ 0 \\ 1 \end{bmatrix}\\ &= \begin{bmatrix} -18 \\ 24 \\ 5 \\ -10 \end{bmatrix} \end{flalign} \]

With the power of linear combinations, we can use another technique to find solutions to linear systems. If we know one of the solutions of a consistent system, we can find all of the other solutions. Here’s how:

If \(\mathbf{w}\) is the solution to a system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\), then \(\mathbf{w} + \mathbf{z}\) is also a solution if and only if \(\mathbf{z}\) is in the null space of \(A\).

Let’s say I want to find all of the solutions to the following system of equations:

\[ 2x_1 + x_2 - x_3 = 3 \] \[ 5x_1 + 2x_2 + x_3 = 5 \] \[ 3x_1 + x_2 + 2x_3 = 2 \]

And let’s say I gave you one of the solutions to this system already: \(x_1 = -1\), \(x_2 = 5\), and \(x_3 = 0\).

We can use this theorem to find the other solutions. To do so, we first find the null space of the coefficient matrix. The first step is to row-reduce the coefficient matrix:

\[ \begin{bmatrix} 2 & 1 & -1 \\ 5 & 2 & 1 \\ 3 & 1 & 2 \\ \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 3 \\ 0 & \boxed{1} & -7 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Now let’s create a homogeneous system using this reduced matrix.

\[ x_1 + 3x_3 = 0 \] \[ x_2 - 7x_3 = 0 \]

From here, we can write all of the solutions of this system as linear combinations:

\[ x_1 = -3x_3 \] \[ x_2 = 7x_3 \]

Writing this as a vector equality:

\[ \begin{flalign} \begin{bmatrix} x_1\\x_2\\x_3 \end{bmatrix} &= \begin{bmatrix} -3x_3 \\ 7x_3 \\ x_3 \end{bmatrix}\\ &= x_3\begin{bmatrix} -3 \\ 7 \\ 1 \end{bmatrix}\\ \end{flalign}\]

Every element of the null space of the coefficient matrix is of this form. Therefore, we can write every solution of our initial system of equations as:

\[ \begin{bmatrix} -1\\5\\0 \end{bmatrix} + x_3\begin{bmatrix} -3\\7\\1 \end{bmatrix}\]

Spanning Sets

Now that we can write solution sets as linear combinations, I’m going to introduce another concept: the span of a set of vectors.

The span \(\langle S \rangle\) of the \(p\) vectors \(S = \{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_p \}\) is the set of possible linear combinations of the \(\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_p\). More formally, for every possible choice of scalars \(\alpha_1, \alpha_2, ..., \alpha_p\), the span contains the following linear combination:

\[ \alpha_1\mathbf{u}_1 + \alpha_2 \mathbf{u}_2 + \cdots + \alpha_p \mathbf{u}_p\]

For example, consider this set of vectors:

\[ S = \left\{ \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}, \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix}, \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} \right\}\]

The span of \(S\) consists of all possible linear combinations of these vectors. Here are some of the elements of \(S\):

\[ \begin{flalign} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} &= \begin{bmatrix} 6 \\ 12 \\ 18 \end{bmatrix}\\ 2\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + 5\begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + 8\begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} &= \begin{bmatrix} 36 \\ 66 \\ 96 \end{bmatrix} \\ (-4)\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + (-2)\begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + 0\begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} &= \begin{bmatrix} -8 \\ -20 \\ -32 \end{bmatrix} \end{flalign}\]

How can we test whether a specific element is in this span? For example, let’s say we wanted to know if these vectors are in the span:

\[ \mathbf{u} = \begin{bmatrix} 11 \\ 23 \\ 35 \end{bmatrix} \] \[ \mathbf{v} = \begin{bmatrix} 11 \\ 20 \\ 30 \end{bmatrix} \]

Let’s focus on \(\mathbf{u}\) first. If \(\mathbf{u}\) is in the span, this means that there are scalars \(x_1\), \(x_2\), and \(x_3\) such that:

\[ x_1 \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + x_3 \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} = \begin{bmatrix} 11 \\ 23 \\ 35 \end{bmatrix} \]

We want to find a set of values for \(x_1\), \(x_2\), and \(x_3\) that will satisfy this equality. Let’s see what happens when we expand the left-hand side:

\[ \begin{bmatrix} x_1 \\ 3x_1 \\ 5x_1 \end{bmatrix} + \begin{bmatrix} 2x_2 \\ 4x_2 \\ 6x_2 \end{bmatrix} + \begin{bmatrix} 3x_3 \\ 5x_3 \\ 7x_3 \end{bmatrix} = \begin{bmatrix} 11 \\ 23 \\ 35 \end{bmatrix} \] \[ \begin{bmatrix} x_1 + 2x_2 + 3x_3 \\ 3x_1 + 4x_2 + 5x_3 \\ 5x_1 + 6x_2 + 7x_3 \end{bmatrix} = \begin{bmatrix} 11 \\ 23 \\ 35 \end{bmatrix} \] \[ x_1 + 2x_2 + 3x_3 = 11 \] \[ 3x_1 + 4x_2 + 5x_3 = 23 \] \[ 5x_1 + 6x_2 + 7x_3 = 35 \]

It turns out that this is just a system of equations in disguise! So now we just need to see if there are solutions to this system of equations. We can do this by row-reducing the augmented matrix:

\[ \begin{bmatrix} 1 & 2 & 3 & 11\\ 3 & 4 & 5 & 23\\ 5 & 6 & 7 & 35 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & -1 & 1\\ 0 & \boxed{1} & 2 & 5\\ 0 & 0 & 0 & 0 \end{bmatrix}\]

Because the last column is not a pivot column, this system is consistent. Therefore, our original vector \(\mathbf{u}\) is in the span of \(S\).

Now let’s test if \(\mathbf{v}\) is in the span of \(S\). As a reminder, this is what \(\mathbf{v}\) looks like:

\[ \mathbf{v} = \begin{bmatrix} 11 \\ 20 \\ 30 \end{bmatrix} \]

Asking whether \(\mathbf{v}\) is in the span is equivalent to asking whether the following system of equations has a solution:

\[ x_1 + 2x_2 + 3x_3 = 11 \] \[ 3x_1 + 4x_2 + 5x_3 = 20 \] \[ 5x_1 + 6x_2 + 7x_3 = 30 \]

To test if there are solutions, we row-reduce the augmented matrix:

\[ \begin{bmatrix} 1 & 2 & 3 & 11\\ 3 & 4 & 5 & 20\\ 5 & 6 & 7 & 30 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & -1 & 0\\ 0 & \boxed{1} & 2 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}\]

Because the last column is a pivot column, the system is inconsistent, and so \(\mathbf{v}\) is not part of the span of \(S\).

Writing null spaces as spans

We can write the null spaces of matrices as spans. For example, consider the following matrix:

\[ A = \begin{bmatrix} -2 & 4 & 6 & 2\\ -1 & 2 & 3 & 1\\ 2 & 0 & 2 & 2 \end{bmatrix} \]

We want to consider the system of equations with \(A\) as its coefficient matrix and the vector of constants \(\mathbf{0}\). The solutions to this system make up the null space of \(A\).

To find these solutions, we first row-reduce the augmented matrix:

\[ \begin{bmatrix} -2 & 4 & 6 & 2 & 0\\ -1 & 2 & 3 & 1 & 0\\ 2 & 0 & 2 & 2 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 1 & 1 & 0\\ 0 & \boxed{1} & 2 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\]

Writing this reduced matrix as a system of equations gives us:

\[ x_1 + x_3 + x_4 = 0 \] \[ x_2 + 2x_3 + x_4 = 0 \]

Rearranging to solve for the dependent variables \(x_1\) and \(x_2\) in terms of the free variables \(x_3\) and \(x_4\):

\[ x_1 = -x_3 - x_4 \] \[ x_2 = -2x_3 - x_4 \]

We can write the solutions in vector form as:

\[ \begin{flalign} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} &= \begin{bmatrix} -x_3 - x_4 \\ -2x_3 - x_4 \\ x_3 \\ x_4 \end{bmatrix}\\ &= \begin{bmatrix} -x_3 \\ -2x_3 \\ x_3 \\ 0 \end{bmatrix} + \begin{bmatrix} -x_4 \\ -x_4 \\ 0 \\ x_4 \end{bmatrix} \\ &= x_3 \begin{bmatrix} -1 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -1 \\ -1 \\ 0 \\ 1 \end{bmatrix} \\ \end{flalign} \]

We can write every solution as a linear combination of these two vectors. Therefore, the solution set (which is the null space of \(A\)) can also be described as the span of these two vectors. The null space of \(A\) is thus:

\[ \left\langle \left\{ \begin{bmatrix} -1 \\ -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ -1 \\ 0 \\ 1 \end{bmatrix} \right\} \right\rangle \]

Linear Dependence and Independence

Let’s say we’re finding the solutions to a homogeneous system of equations with coefficient matrix \(A\). If we label the columns of \(A\) as \(\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\), then solving this system of equations is equivalent to finding coefficients \(\alpha_1, \alpha_2, ..., \alpha_n\) such that:

\[ \alpha_1 \mathbf{u}_1 + \alpha_2 \mathbf{u}_2 + \cdots + \alpha_n \mathbf{u}_n = \mathbf{0} \]

A true equation of this form is known as a relation of linear dependence on the set of vectors \(\{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\).

Here’s an example of a relation of linear dependence:

\[ 2\begin{bmatrix}1 \\ -2 \end{bmatrix} + 4 \begin{bmatrix} -2 \\ 4 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ -4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\]

Note that for any set of vectors \(\{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\), the above equation can be trivially be made true by setting the coefficients \(\alpha_1, \alpha_2, ..., \alpha_n\) all to zero. This is known as a trivial relation of linear dependence.

Linearly independent sets of vectors

A set of vectors \(S = \{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\) is linearly independent if the only relation of linear dependence on \(S\) is trivial. If there are non-trivial relations of linear dependence on \(S\), then \(S\) is linearly dependent.

Example 1

For example, consider this set of vectors:

\[ S = \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix} \right\} \]

Is this set of vectors linearly independent? To find out, we need to determine if there is a non-trivial relation of linear dependence on \(S\). This is equivalent to finding values of \(\alpha_1\), \(\alpha_2\), and \(\alpha_3\) such that:

\[ \alpha_1\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + \alpha_2\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} + \alpha_3\begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix} = \mathbf{0}\]

Notice that this is the same as solving the following system of equations:

\[ 1\alpha_1 + 1\alpha_2 + 1\alpha_3 = 0 \] \[ 2\alpha_1 - 1\alpha_2 + 4\alpha_3 = 0 \] \[ -1\alpha_1 + 2\alpha_2 + 0\alpha_3 = 0 \]

To solve this system, we row-reduce the augmented matrix:

\[ \begin{bmatrix} 1 & 1 & 1 & 0 \\ 2 & -1 & 4 & 0 \\ -1 & 2 & 0 & 0 \\ \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & 0 \\ 0 & \boxed{1} & 0 & 0 \\ 0 & 0 & \boxed{1} & 0 \end{bmatrix} \]

This row-reduced matrix tells us that the only solution to the system is \(\alpha_1 = 0\), \(\alpha_2 = 0\), and \(\alpha_3 = 0\). Therefore, the only relation of linear dependence on \(S\) is the trivial one, and so \(S\) is linearly independent.

Notice how the columns of the augmented matrix we used here are the vectors of \(S\) with a zero vector added on as the rightmost column.

In addition, notice that the rightmost column of zeros does not change when row-reducing the matrix. Therefore, we don’t actually need to keep track of this column.

Using these observations, we can create a faster way to determine if a set of vectors are linearly independent.

Example 2

Let’s say we want to determine if these vectors are linearly independent:

\[ S = \left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ -3 \\ 2 \end{bmatrix} \right\} \]

We first create a matrix with the vectors of \(S\) as columns:

\[ \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 0 & 2 \end{bmatrix} \]

Now we row-reduce this matrix:

\[ \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 0 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 1 \\ 0 & \boxed{1} & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Because this row-reduced matrix has less pivot columns (2 columns in this case) than its corresponding homogeneous system has variables (3 variables in this case, one for each column in the matrix), the system has infinitely many solutions.

Therefore, because there can only be at most one trivial relation of linear dependence on \(S\), there must be non-trivial relations of linear dependence on \(S\), meaning that \(S\) is linearly dependent.

To summarize, if \(S\) is a set of vectors and \(A\) is a matrix with columns that are the vectors in \(S\), then \(S\) is linearly independent if and only if the homogeneous system of equations with coefficient matrix \(A\) has a unique solution (the trivial solution).

As a consequence, the columns of a square matrix \(A\) are linearly independent if and only if \(A\) is nonsingular.

Looking back at the row-reduced matrix, we were able to deduce that there are infinitely many solutions because the number of pivot columns was less than the number of total columns.

This gives us a simple test to tell if a set of vectors \(S\) is linearly dependent: first create a matrix, with each column being one of the vectors in \(S\). Then convert this matrix to reduced row-echelon form, and count the number of pivot columns.

If there are fewer pivot columns than total columns, then \(S\) is linearly dependent (because the homogeneous system corresponding to the matrix has infinitely many solutions).
If there are the same number of pivot columns as total columns, then \(S\) is linearly independent (because the homogeneous system corresponding to the matrix has a unique solution).

If we have a set \(S\) of vectors and there are more vectors in the set than entries in each vector, then the set is linearly dependent. This is because when we create the matrix from the vectors in \(S\), there will be more columns than rows, and so there must be less pivot columns than total columns (as each row can only have one leading 1).

Example 3

For example, consider this set of vectors:

\[ S = \left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ -3 \\ 2 \end{bmatrix}, \begin{bmatrix} 4 \\ -5 \\ 4 \end{bmatrix} \right\} \]

Without having to do any row-reducing, we can notice that \(S\) contains 4 vectors and each vector has 3 entries. Because the number of vectors is greater than the number of entries in each vector, we can immediately tell that \(S\) is linearly dependent.

Here’s what the matrix formed from the vectors in \(S\) looks like:

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \\ -1 & -2 & -3 & -5 \\ 2 & 0 & 2 & 4 \end{bmatrix} \]

If we were to row-reduce the matrix, because this matrix has 3 rows, there can be at most 3 pivot columns. However, there are 4 columns in total, so the number of pivot columns must be less than the number of total columns, meaning that the set \(S\) must be linearly dependent.

Linear Dependence/Independence and Spans

An interesting property of linearly dependent sets of vectors is that in any linearly dependent set, at least one of the vectors can be written as a linear combination of some of the others.

We can use this fact to describe spans more efficiently. If we have a span described with a linearly dependent set of vectors, we can rewrite the same span using one less vector.

For example, let’s look at this set of vectors:

\[ S = \left\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \right\} = \left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ -3 \\ 2 \end{bmatrix} \right\} \]

We found previously that this set is linearly dependent. This means there is a relation of the form:

\[ x_1 \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 3 \\ -3 \\ 2 \end{bmatrix} = \mathbf{0}\]

We can find a solution by row-reducing the augmented matrix for this system:

\[ \begin{bmatrix} 1 & 2 & 3 & 0\\ -1 & -2 & -3 & 0\\ 2 & 0 & 2 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 1 & 0\\ 0 & \boxed{1} & 1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix} \]

This reduced matrix corresponds to:

\[ x_1 = -x_3 \] \[ x_2 = -x_3 \]

By choosing \(x_3 = 1\), we get the solution \(x_1 = -1\), \(x_2 = -1\), and \(x_3 = 1\), meaning that we can write the following relation of linear dependence:

\[ (-1)\mathbf{v}_1 + (-1)\mathbf{v}_2 + 1\mathbf{v}_3 = \mathbf{0}\]

By solving for \(\mathbf{v}_3\), we can write \(\mathbf{v}_3\) as a linear combination of \(\mathbf{v}_1\) and \(\mathbf{v}_2\):

\[ \mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2 \]

By the definition of a span, any member \(\mathbf{u}\) of the span of \(S\) can be written as a linear combination as follows:

\[ \mathbf{u} = \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \alpha_3 \mathbf{v}_3 \]

Using our equation for \(\mathbf{v}_3\), any linear combination of this form can be rewritten as:

\[ \mathbf{u} = \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \alpha_3 (\mathbf{v}_1 + \mathbf{v}_2) \] \[ \mathbf{u} = \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \alpha_3 \mathbf{v}_1 + \alpha_3 \mathbf{v}_2 \] \[ \mathbf{u} = (\alpha_1 + \alpha_3)\mathbf{v}_1 + (\alpha_2 + \alpha_3)\mathbf{v}_2 \]

We have successfully written an arbitrary vector \(\mathbf{u}\) in the span of \(S\) as a linear combination of the two vectors \(\mathbf{v}_1\) and \(\mathbf{v}_2\). Therefore, because any vector in the span of \(S\) can be written as a linear combination of \(\mathbf{v}_1\) and \(\mathbf{v}_2\), the span of \(S\) can be written with just two vectors:

\[ \langle S \rangle = \langle \left\{\mathbf{v}_1, \mathbf{v}_2 \right\} \rangle = \left\langle \left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ -2 \\ 0 \end{bmatrix} \right\} \right\rangle\]

The reason we were able to reduce this spanning set down from three to two vectors is because our original spanning set \(S\) was linearly dependent. Because of this, we were able to write one of the vectors \(\mathbf{v}_3\) as a linear combination of the other two vectors, allowing us to reduce the spanning set.

A quicker way to reduce spanning sets

There is a more systematic way to reduce spanning sets. Let’s start with this span:

\[ S = \left\{ \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ 9 \\ 6 \end{bmatrix}, \begin{bmatrix} -1 \\ 4 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 11 \\ 2 \end{bmatrix} \right\}\]

We will put these vectors into a matrix and row-reduce:

\[ \begin{bmatrix} 1 & 3 & -1 & -1 \\ 3 & 9 & 4 & 11 \\ 2 & 6 & 0 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 3 & 0 & 1 \\ 0 & 0 & \boxed{1} & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

The pivot columns of the row-reduced matrix are the 1st and 3rd column. So let’s take the 1st and 3rd columns of the original matrix:

\[ \left\{ \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}, \begin{bmatrix} -1 \\ 4 \\ 0 \end{bmatrix} \right\}\]

The span of this set is the same as the span of the set we started off with!

Orthogonality and More Vector Operations

Note: This section is incomplete.

Complex conjugates of vectors

The complex conjugate of a complex number \(\alpha\) is the number obtained by swapping the sign of the imaginary part of \(\alpha\). For example, the conjugate of \(5 + 2i\) is \(5 - 2i\). The conjugate of \(\alpha\) is denoted by \(\overline{\alpha}\).

We can find the complex conjugate of a vector by conjugating every entry. More formally, the conjugate of a vector \(\mathbf{u}\), denoted by \(\overline{\mathbf{u}}\), is defined by \([\overline{\mathbf{u}}]_i = \overline{[\mathbf{u}]_i}\) for \(1 \le i \le m\), where \(m\) is the size of \(\mathbf{u}\).

Properties of complex conjugates

If \(\mathbf{x}\) and \(\mathbf{y}\) are vectors, then \(\overline{\mathbf{x} + \mathbf{y}} = \overline{\mathbf{x}} + \overline{\mathbf{y}}\).
If \(\alpha\) is a scalar and \(\mathbf{x}\) is a vector, then \(\overline{\alpha\mathbf{x}} = (\overline{\alpha})(\overline{\mathbf{x}})\).

The inner product

The inner product of two vectors \(\mathbf{u}\) and \(\mathbf{v}\) of size \(m\) is defined as:

\[ \langle \mathbf{u}, \mathbf{v} \rangle = \overline{[\mathbf{u}]_1}[\mathbf{v}]_1 + \overline{[\mathbf{u}]_2}[\mathbf{v}]_2 + \cdots + \overline{[\mathbf{u}]_m}[\mathbf{v}]_m \]

Here’s an example of an inner product:

\[ \mathbf{u} = \begin{bmatrix} 2 + 5i \\ 3 \\ 6 - 2i \end{bmatrix} \] \[ \mathbf{v} = \begin{bmatrix} 11 \\ 2 - 7i \\ -3 + i \end{bmatrix} \] \[ \begin{flalign} \langle \mathbf{u}, \mathbf{v} \rangle &= (\overline{2+5i})(11) + (\overline{3})(2-7i) + (\overline{6-2i})(-3+i)\\ &= (2-5i)(11) + 3(2-7i) + (6+2i)(-3+i)\\ &= (22-55i) + (6-21i) + (-20)\\ &= 8 - 76i \end{flalign}\]

Note that when both vectors have only real entries, the inner product is equivalent to the dot product.

Properties of inner products

\(\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle\)
\(\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle\)
\(\langle \alpha\mathbf{u}, \mathbf{v} \rangle = \overline{\alpha} \langle \mathbf{u}, \mathbf{v} \rangle\)
\(\langle \mathbf{u}, \alpha\mathbf{v} \rangle = \alpha \langle \mathbf{u}, \mathbf{v} \rangle\)
\(\langle \mathbf{u}, \alpha\mathbf{v} \rangle = \overline{\langle \mathbf{v}, \mathbf{u} \rangle}\)

Norms of vectors

For a vector \(\mathbf{u}\) of size \(m\), its norm is calculated as follows:

\[ ||\mathbf{u}|| = \sqrt{|[\mathbf{u}]_1|^2 + |[\mathbf{u}]_2|^2 + \cdots + |[\mathbf{u}]_m|^2} \]

If \(\mathbf{u}\) only contains real number entries, the norm of \(\mathbf{u}\) is equal to its magnitude: the length of the vector when plotted on the coordinate plane.

Note that because the entries of \(\mathbf{u}\) can be complex, we need to take the absolute value of each entry in order to assure that the norm is never negative.

One interesting property about the norm is that \(||\mathbf{u}||^2 = \langle \mathbf{u}, \mathbf{u} \rangle\). In addition, the inner product \(\langle \mathbf{u}, \mathbf{u} \rangle\) is always greater than or equal to zero, and is equal to zero if and only if \(\mathbf{u}\) is the zero vector.

Orthogonal vectors

Two vectors \(\mathbf{u}\) and \(\mathbf{v}\) are orthogonal if the inner product \(\langle \mathbf{u}, \mathbf{v} \rangle\) equals 0.

An orthogonal set of vectors is a set of vectors where any two distinct vectors from that set will always be orthogonal. More formally, a set \(S = \{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\) is orthogonal if \(\langle \mathbf{u}_i, \mathbf{u}_j \rangle = 0\) whenever \(i \ne j\).

Nonsingular Matrix Equivalences, Part 2

With our discussion of linearly independent sets, we can add another equivalence to our list of nonsingular matrix equivalences.

If \(A\) is an \(n \times n\) square matrix, these conditions are equivalent:

\(A\) is a nonsingular matrix.
The reduced row-echelon form of \(A\) is equal to the \(n \times n\) identity matrix.
The null space of \(A\) contains only the zero vector.
The linear system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) has a single unique solution for every possible choice of \(\mathbf{b}\).
The columns of \(A\) are linearly independent.

Unit 3: Matrices and Determinants

A First Course in Linear Algebra link: http://linear.ups.edu/html/chapter-M.html (Matrices), http://linear.ups.edu/html/chapter-D.html (Determinants)

Matrix Operations

There are some operations we can perform on matrices.

Matrix equality

Two matrices are equal if they are the same size and if every corresponding entry is equal. More formally, the \(m \times n\) matrices \(A\) and \(B\) are equal if \([A]_{i,j} = [B]_{i,j}\) for \(1 \le i \le m\) and \(1 \le j \le n\).

Matrix addition

We can add two matrices by adding their corresponding entries. For example:

\[ \begin{flalign} \begin{bmatrix} -5 & 10\\ 2 & 4 \end{bmatrix} + \begin{bmatrix} 12 & -7\\ -8 & 3 \end{bmatrix} &= \begin{bmatrix} -5 + 12 & 10 + (-7)\\ 2 + (-8) & 4 + 3 \end{bmatrix} \\ &= \begin{bmatrix} 7 & 3\\ -6 & 7 \end{bmatrix} \\ \end{flalign}\]

Here’s the formal definition of the sum of two \(m \times n \) matrices \(A\) and \(B\):

\[ [A+B]_{i, j} = [A]_{i, j} + [B]_{i, j} \; \text{ for } 1 \le i \le m, 1 \le j \le n \]

Multiplying matrices by scalars

We can multiply a matrix by a scalar by multiplying the individual entries by that scalar. For example:

\[ \begin{flalign} 3\begin{bmatrix} -5 & 10\\ 2 & 4 \end{bmatrix} &= \begin{bmatrix} 3(-5) & 3(10) \\ 3(2) & 3(4) \end{bmatrix} \\ &= \begin{bmatrix} -15 & 30 \\ 6 & 12 \end{bmatrix} \\ \end{flalign}\]

Here’s the formal definition of scalar multiplication for an \(m \times n\) matrix \(A\):

\[ [\alpha A]_{i, j} = \alpha[A]_{i, j} \; \text{ for } 1 \le i \le m, 1 \le j \le n \]

Matrix properties

If \(A\) and \(B\) are \(m \times n\) matrices, then \(A+B\) is an \(m \times n\) matrix.
If \(\alpha\) is a scalar and \(A\) is an \(m \times n\) matrix, then \(\alpha A\) is an \(m \times n\) matrix.
If \(A\) and \(B\) are matrices, then \(A + B = B + A\).
If \(A\), \(B\), and \(C\) are matrices, then \(A + (B + C) = (A + B) + C\).
There exists a matrix \(\mathrm{o}\) known as the zero matrix such that \(A + \mathrm{o} = A\) for all matrices \(A\).
For every possible matrix \(\mathrm{A}\), there exists a matrix \(-A\) such that \(A + (-A) = \mathrm{o}\).
If \(\alpha\) and \(\beta\) are scalars and \(A\) is a matrix, then \(\alpha(\beta A) = (\alpha \beta)A\).
If \(\alpha\) is a scalar and \(A\) and \(B\) are matrices, then \(\alpha(A + B) = \alpha A + \alpha B\).
If \(\alpha\) and \(\beta\) are scalars and \(A\) is a matrix, then \((\alpha + \beta)A = \alpha A + \beta A\).
For all matrices \(A\), \(1A = A\).

Notice the similarities between these matrix properties and the vector properties mentioned in Vector Operations.

In addition, the zero matrix \(\mathrm{o}\) is simply a matrix filled with only zeros.

Matrix transposes

The transpose of a matrix \(A\), denoted by \(A^t\), is defined as the matrix formed by swapping \(A\)’s rows and columns. More formally, if \(A\) is an \(m \times n\) matrix:

\[ [A^t]_{i, j} = [A]_{j, i} \; \text{ for } 1 \le i \le m, 1 \le j \le n \]

Here’s an example:

\[ \begin{bmatrix} \class{red}{-2} & \class{red}{3} & \class{red}{7} \\ \class{blue}{5} & \class{blue}{-4} & \class{blue}{0} \\ \class{purple}{-11} & \class{purple}{-6} & \class{purple}{2} \end{bmatrix}^t = \begin{bmatrix} \class{red}{-2} & \class{blue}{5} & \class{purple}{-11} \\ \class{red}{3} & \class{blue}{-4} & \class{purple}{-6} \\ \class{red}{7} & \class{blue}{0} & \class{purple}{2} \end{bmatrix}\]

You can also view transposing a matrix as flipping the entries across the main diagonal (the diagonal that starts in the upper-left corner and goes towards the lower-right corner).

Symmetric matrices

A matrix is known as symmetric if it equals its transpose. Here’s an example of a symmetric matrix:

\[ \begin{bmatrix} 1 & 2 & 3\\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix} \]

Note that symmetric matrices must also be square matrices. A non-square matrix cannot be symmetric because transposing it will change its dimensions (for example, the transpose of a \(4 \times 3\) matrix is a \(3 \times 4\) matrix).

Properties of matrix transposes

\((A+B)^t = A^t + B^t\) for any matrices \(A\) and \(B\)
\((\alpha A)^t = \alpha A^t\) for any scalar \(\alpha\) and matrix \(A\)
\((A^t)^t = A\) for any matrix \(A\)

Complex conjugates

As a reminder, the complex conjugate of a complex number \(\alpha\), denoted by \(\overline{\alpha}\), is the number obtained by switching the sign of the imaginary part. For example, the complex conjugate of \(2 + 4i\) is \(2 - 4i\).

We can find the complex conjugate of a matrix \(A\), denoted by \(\overline{A}\), by calculating the complex conjugate of every entry. More formally, for an \(m \times n\) matrix \(A\):

\[ \left[\overline{A}\right]_{i, j} = \overline{[A]_{i,j}}\; \text{ for } 1 \le i \le m, 1 \le j \le n\]

Here’s an example:

\[ A = \begin{bmatrix} 1 + 3i & 5\\ 7i & 2 - 10i \end{bmatrix} \] \[ \overline{A} = \begin{bmatrix} 1 - 3i & 5\\ -7i & 2 + 10i \end{bmatrix} \]

Matrix conjugation properties

\(\overline{A+B} = \overline{A}+\overline{B}\) for any two matrices \(A\) and \(B\)
\(\overline{\alpha A} = \overline{\alpha}\overline{A}\) for any matrix \(A\) and scalar \(\alpha\)
\(\overline{\left(\overline{A}\right)} = A\) for any matrix \(A\)
\(\overline{(A^t)} = \left(\overline{A}\right)^t\) for any matrix \(A\)

Adjoints of matrices

The adjoint of a matrix \(A\), denoted by \(A^*\), is the transpose of the conjugate of \(A\).

\[ A^* = \left(\overline{A}\right)^t \]

Here’s an example:

\[ A = \begin{bmatrix} 1 + 3i & 5\\ 7i & 2 - 10i \end{bmatrix} \] \[ \overline{A} = \begin{bmatrix} 1 - 3i & 5\\ -7i & 2 + 10i \end{bmatrix} \] \[ A^* = \begin{bmatrix} 1 - 3i & -7i\\ 5 & 2 + 10i \end{bmatrix} \]

Matrix adjoint properties

\((A+B)^* = A^* + B^*\) for any two matrices \(A\) and \(B\)
\((\alpha A)^* = \overline{\alpha}A^*\) for any matrix \(A\) and scalar \(\alpha\)
\((A^*)^* = A\) for any matrix \(A\)

Matrix Multiplication

Matrix-vector products

Now we’re going to look at how to multiply a matrix with a vector. This definition may seem strange at first, but it does have some nice properties that we can takek advantage of later on.

The product of an \(m \times n\) matrix \(A\) with columns \(\mathbf{A}_1, \mathbf{A}_2, ..., \mathbf{A}_n\) and a vector \(\mathbf{u}\) of size \(n\) is:

\[ \begin{flalign} A\mathbf{u} &= [\mathbf{u}]_1\mathbf{A}_1 + [\mathbf{u}]_2 \mathbf{A}_2 + [\mathbf{u}]_3 \mathbf{A}_3 + \cdots + [\mathbf{u}]_n \mathbf{A}_n\\ &= \sum_{i=1}^n [\mathbf{u}]_i\mathbf{A}_i \end{flalign} \]

In other words, to multiply \(A\) by \(\mathbf{u}\), we calculate the linear combination of the entries of \(\mathbf{u}\) and the columns of \(A\).

Here’s an example:

\[ A = \begin{bmatrix} \class{red}{-9} & \class{blue}{4} & \class{purple}{-7} \\ \class{red}{-5} & \class{blue}{-7} & \class{purple}{6} \\ \class{red}{2} & \class{blue}{9} & \class{purple}{-9} \\ \class{red}{4} & \class{blue}{10} & \class{purple}{-4} \end{bmatrix} \] \[ \mathbf{u} = \begin{bmatrix} \class{red}{3} \\ \class{blue}{-5} \\ \class{purple}{10} \end{bmatrix} \] \[ \begin{flalign} A\mathbf{u} &= \class{red}{3} \begin{bmatrix} \class{red}{-9} \\ \class{red}{-5} \\ \class{red}{2} \\ \class{red}{4} \end{bmatrix} + (\class{blue}{-5})\begin{bmatrix} \class{blue}{4} \\ \class{blue}{-7} \\ \class{blue}{9} \\ \class{blue}{10} \end{bmatrix} + \class{purple}{10}\begin{bmatrix} \class{purple}{-7} \\ \class{purple}{6} \\ \class{purple}{-9} \\ \class{purple}{-4} \end{bmatrix}\\ &= \begin{bmatrix} -27 \\ -15 \\ 6 \\ 12 \end{bmatrix} + \begin{bmatrix} -20 \\ 35 \\ -45 \\ -50 \end{bmatrix} + \begin{bmatrix} -70 \\ 60 \\ -90 \\ -40 \end{bmatrix}\\ &= \begin{bmatrix} -117 \\ 80 \\ -129 \\ -78 \end{bmatrix} \end{flalign}\]

The nice thing about this definition of matrix-vector multiplication is that it allows us to represent an entire system of equations in just one very short equation.

Remember that if \(\mathbf{A}_1, \mathbf{A}_2, ..., \mathbf{A}_n\) are the columns of an \(m \times n\) matrix \(A\), a solution \(\mathbf{x}\) to the system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) can be represented by the solutions to:

\[ [\mathbf{x}]_1\mathbf{A}_1 + [\mathbf{x}]_2\mathbf{A}_2 + \cdots + [\mathbf{x}]_n\mathbf{A}_n = \mathbf{b} \]

Using our definition of matrix-vector multiplication, this can be simplified all the way down to:

\[ A\mathbf{x} = \mathbf{b} \]

The solutions to this system of equations are simply the vectors \(\mathbf{x}\) that make the equation \(A\mathbf{x} = \mathbf{b}\) true.

Matrix multiplication

We can extend this definition further to allow us to multiply two matrices together. When we multiply an \(m \times n\) matrix \(A\) by an \(n \times p\) matrix \(B\), we get an \(m \times p\) matrix \(AB\). For \(1 \le i \le p\), column \(i\) of \(AB\) is equal to the matrix-vector product \(A\mathbf{B}_i\), where \(\mathbf{B}_i\) is column \(i\) of the matrix \(B\).

\[ AB = A[\mathbf{B}_1|\mathbf{B}_2|...|\mathbf{B}_p] = [A\mathbf{B}_1|A\mathbf{B}_2|...|A\mathbf{B}_p] \]

The notation \([\mathbf{v}_1|\mathbf{v}_2|...|\mathbf{v}_n]\) means creating a matrix with columns \(\mathbf{v}_1, \mathbf{v}_2, ..., \mathbf{v}_n\).

In order to multiply two matrices, the number of columns in the first matrix must equal the number of rows in the second matrix.

Here’s an example of matrix multiplication. Let’s say we have these two matrices:

\[ A = \begin{bmatrix} 4 & 6 & -2\\ 1 & -4 & 0\\ 2 & 5 & -3 \\ 1 & 3 & 6 \end{bmatrix} \] \[ B = \begin{bmatrix} 7 & -1\\ 2 & 0\\ -2 & -3 \end{bmatrix} \]

To multiply these matrices, we need to find \(A\mathbf{B}_1\) and \(A\mathbf{B}_2\) first. Here are the calculations for these two products:

\[ \begin{flalign} A\mathbf{B}_1 &= \begin{bmatrix} 4 & 6 & -2\\ 1 & -4 & 0\\ 2 & 5 & -3 \\ 1 & 3 & 6 \end{bmatrix}\begin{bmatrix} 7 \\ 2 \\ -2 \end{bmatrix}\\ &= 7 \begin{bmatrix} 4 \\ 1 \\ 2 \\ 1 \end{bmatrix} + 2 \begin{bmatrix} 6 \\ -4 \\ 5 \\ 3 \end{bmatrix} + (-2) \begin{bmatrix} -2 \\ 0 \\ -3 \\ 6 \end{bmatrix}\\ &= \begin{bmatrix} 44 \\ -1 \\ 30 \\ 1 \end{bmatrix} \end{flalign} \] \[ \begin{flalign} A\mathbf{B}_2 &= \begin{bmatrix} 4 & 6 & -2\\ 1 & -4 & 0\\ 2 & 5 & -3 \\ 1 & 3 & 6 \end{bmatrix}\begin{bmatrix} -1 \\ 0 \\ -3 \end{bmatrix}\\ &= (-1) \begin{bmatrix} 4 \\ 1 \\ 2 \\ 1 \end{bmatrix} + 0 \begin{bmatrix} 6 \\ -4 \\ 5 \\ 3 \end{bmatrix} + (-3) \begin{bmatrix} -2 \\ 0 \\ -3 \\ 6 \end{bmatrix}\\ &= \begin{bmatrix} 2 \\ -1 \\ 7 \\ -19 \end{bmatrix} \end{flalign} \]

We can put these vectors together to get the overall matrix product:

\[ \begin{flalign} AB &= A[\mathbf{B}_1|\mathbf{B_2}]\\ &= [A\mathbf{B}_1|A\mathbf{B}_2] \\ &= \begin{bmatrix} 44 & 2\\ -1 & -1\\ 30 & 7\\ 1 & -19 \end{bmatrix} \end{flalign}\]

There is another equivalent way to define matrix multiplication: for an \(m \times n\) matrix \(A\) and an \(n \times p\) matrix \(B\), each entry of the \(m \times p\) matrix \(AB\) is given by:

\[ \begin{flalign} [AB]_{i, j} &= [A]_{i,1}[B]_{1,j} + [A]_{i,2}[B]_{2,j} + \cdots + [A]_{i,n}[B]_{n,j}\\ &= \sum_{k=1}^n [A]_{i,k}[B]_{k,j} \end{flalign}\]

Let’s work out the previous example this way. As a reminder, here are the two matrices we multiplied previously:

\[ A = \begin{bmatrix} 4 & 6 & -2\\ 1 & -4 & 0\\ 2 & 5 & -3 \\ 1 & 3 & 6 \end{bmatrix} \] \[ B = \begin{bmatrix} 7 & -1\\ 2 & 0\\ -2 & -3 \end{bmatrix} \]

The top-left entry of \(AB\) is given by:

\[ \begin{flalign} [AB]_{\class{red}{1}, \class{blue}{1}} &= [A]_{\class{red}{1},1}[B]_{1,\class{blue}{1}} + [A]_{\class{red}{1},2}[B]_{2,\class{blue}{1}} + [A]_{\class{red}{1},3}[B]_{3,\class{blue}{1}}\\ &= 4(7) + 6(2) + (-2)(-2)\\ &= 44 \end{flalign}\]

The entry at row 1 and column 2 is given by:

\[ \begin{flalign} [AB]_{\class{red}{1}, \class{blue}{2}} &= [A]_{\class{red}{1},1}[B]_{1,\class{blue}{2}} + [A]_{\class{red}{1},2}[B]_{2,\class{blue}{2}} + [A]_{\class{red}{1},3}[B]_{3,\class{blue}{2}}\\ &= 4(-1) + 6(0) + (-2)(-3)\\ &= 2 \end{flalign}\]

We can calculate the rest of the entries this way.

Properties of matrix multiplication

\(A\mathrm{o}_{n \times p} = \mathrm{o}_{m \times p}\) for any \(m \times n\) matrix \(A\)
\(\mathrm{o}_{p \times m}A = \mathrm{o}_{p \times n}\) for any \(m \times n\) matrix \(A\)
\(AI_n = I_m A = A\) for any \(m \times n\) matrix \(A\)
\(A(B+C) = AB+AC\) for any \(m \times n\) matrix A and \(n \times p\) matrices \(B\) and \(C\)
\((B+C)D = BD+CD\) for any \(n \times p\) matrices \(B\) and \(C\) and \(p \times s\) matrix \(D\)
\(\alpha(AB) = (\alpha A)B = A(\alpha B)\) for any scalar \(\alpha\), \(m \times n\) matrix \(A\), and \(n \times p\) matrix \(B\)
\(A(BD) = (AB)D\) for any \(m \times n\) matrix \(A\), \(n \times p\) matrix \(B\), and \(p \times s\) matrix \(D\). In other words, matrix multiplication is associative.
Matrix multiplication is not commutative: \(AB\) does not necessarily equal \(BA\).
\(\overline{A}\overline{B} = \overline{AB}\) for any \(m \times n\) matrix \(A\) and \(n \times p\) matrix \(B\)
\((AB)^t = B^t A^t\) for any \(m \times n\) matrix \(A\) and \(n \times p\) matrix \(B\)
\((AB)^* = B^* A^*\) for any \(m \times n\) matrix \(A\) and \(n \times p\) matrix \(B\)

Matrix Inverses

Matrix multiplication allows us to define an interesting operation on matrices.

In the real numbers, we can define the reciprocal of a number \(x\) as the number that when multiplied by \(x\) gives 1. For example, \(1/4\) is the reciprocal of 4 because \(4 \times 1/4 = 1\).

We can define a similar idea for matrices. We define a matrix \(B\) to be the inverse of an \(n \times n\) matrix \(A\) if \(AB = BA = I_n\), where \(I_n\) is the \(n \times n\) identity matrix. (You can think of \(I_n\) as being the matrix version of the number 1, as multiplying a matrix by the identity matrix does not change it.)

We can use this idea of a matrix inverse to solve some systems of linear equations. For example, consider this system:

\[ \begin{flalign} 2x_1 + 3x_2 + 4x_3 = 7 \\ x_1 - x_2 + 4x_3 = 10 \\ x_1 + 3x_3 = -2 \end{flalign} \]

We can represent this system as the equation \(A\mathbf{x} = \mathbf{b}\), where \(A\) is the coefficient matrix and \(\mathbf{b}\) is the vector of constants. Here are the values of \(A\) and \(\mathbf{b}\):

\[ A = \begin{bmatrix} 2 & 3 & 4 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix} \] \[ \mathbf{b} = \begin{bmatrix} 7 \\ 10 \\ -2 \end{bmatrix} \]

If we had a matrix \(B\) that was the inverse of the matrix \(A\), then we could write \(\mathbf{x}\) as follows:

\[ \begin{flalign} \mathbf{x} &= I_3\mathbf{x} \\ &= (BA)\mathbf{x}\\ &= B(A\mathbf{x})\\ &= B\mathbf{b} \end{flalign}\]

In this case, the inverse of \(A\) is:

\[ B = \begin{bmatrix} -3 & -9 & 16\\ 1 & 2 & -4\\ 1 & 3 & -5 \end{bmatrix}\]

I will discuss how to calculate inverses like this one later. For now, we can use this inverse \(B\) to find the solution \(\mathbf{x}\) to the system of equations:

\[ \begin{flalign} \mathbf{x} &= B\mathbf{b}\\ &= \begin{bmatrix} -3 & -9 & 16\\ 1 & 2 & -4\\ 1 & 3 & -5 \end{bmatrix} \begin{bmatrix} 7 \\ 10 \\ -2 \end{bmatrix}\\ &= \begin{bmatrix} -143 \\ 35 \\ 47 \end{bmatrix} \end{flalign}\]

Looking back at this example, if we multiply \(A\) and \(B\) together, we get:

\[ \begin{flalign} AB &= \begin{bmatrix} 2 & 3 & 4 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix}\begin{bmatrix} -3 & -9 & 16\\ 1 & 2 & -4\\ 1 & 3 & -5 \end{bmatrix}\\ &= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{bmatrix} \end{flalign} \]

Now let’s find the product \(BA\):

\[ \begin{flalign} BA &= \begin{bmatrix} -3 & -9 & 16\\ 1 & 2 & -4\\ 1 & 3 & -5 \end{bmatrix}\begin{bmatrix} 2 & 3 & 4 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix}\\ &= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ \end{bmatrix} \end{flalign} \]

Both of these products are the identity matrix, which is what we should always get when we multiply a matrix by its inverse or vice versa.

Finding matrix inverses

Now let’s go through a procedure on how we can find matrix inverses.

To find the inverse of a matrix \(A\), we need to find another matrix \(B\) such that \(AB = I_n\). We can use the definition of matrix multiplication to expand this:

\[ AB = I_n \] \[ A[\mathbf{B}_1|\mathbf{B}_2|...|\mathbf{B}_n] = I_n \] \[ [A\mathbf{B}_1|A\mathbf{B}_2|...|A\mathbf{B}_n] = [\mathbf{e}_1|\mathbf{e}_2|...|\mathbf{e}_n] \]

The vectors \(\mathbf{e}_1, \mathbf{e_2}, ..., \mathbf{e}_n\) are the columns of the identity matrix \(I_n\).

Now we can set the columns of both matrices equal to each other, to get the equations \(A\mathbf{B}_1 = \mathbf{e}_1\), \(A\mathbf{B}_2 = \mathbf{e}_2\), ..., \(A\mathbf{B}_n = \mathbf{e}_n\).

Let’s go over an example. We will define \(A\) as the matrix we used previously:

\[ A = \begin{bmatrix} 2 & 3 & 4 \\ 1 & -1 & 4 \\ 1 & 0 & 3 \end{bmatrix} \]

Now we solve for the columns of \(B\): \(\mathbf{B}_1\), \(\mathbf{B}_2\), and \(\mathbf{B}_3\). We can do this by using the equations I previously mentioned: \(A\mathbf{B}_1 = \mathbf{e}_1\), \(A\mathbf{B}_2 = \mathbf{e}_2\), and \(A\mathbf{B}_3 = \mathbf{e}_3\). Let’s start with the first one:

\[ A\mathbf{B}_1 = \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\]

This is the same as solving the system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{e}_1\). We can do this by row-reducing the augmented matrix:

\[ \begin{bmatrix} 2 & 3 & 4 & 1 \\ 1 & -1 & 4 & 0 \\ 1 & 0 & 3 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -3\\ 0 & 1 & 0 & 1\\ 0 & 0 & 1 & 1 \end{bmatrix} \]

From here, we can read out that the solution is \(x_1 = -3\), \(x_2 = 1\), and \(x_3 = 1\). Therefore:

\[ \mathbf{B}_1 = \begin{bmatrix} -3 \\ 1 \\ 1 \end{bmatrix} \]

The next equation to solve for is \(A\mathbf{B}_2 = \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\). Here’s the row-reduction for this equation:

\[ \begin{bmatrix} 2 & 3 & 4 & 0 \\ 1 & -1 & 4 & 1 \\ 1 & 0 & 3 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -9\\ 0 & 1 & 0 & 2\\ 0 & 0 & 1 & 3 \end{bmatrix} \] \[ \mathbf{B}_2 = \begin{bmatrix} -9 \\ 2 \\ 3 \end{bmatrix} \]

And finally, we have \(A\mathbf{B}_3 = \mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1\end{bmatrix}\). Here’s the row-reduction for this equation:

\[ \begin{bmatrix} 2 & 3 & 4 & 0 \\ 1 & -1 & 4 & 0 \\ 1 & 0 & 3 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 16\\ 0 & 1 & 0 & -4\\ 0 & 0 & 1 & -5 \end{bmatrix} \] \[ \mathbf{B}_3 = \begin{bmatrix} 16 \\ -4 \\ -5 \end{bmatrix} \]

Using these three vectors \(\mathbf{B}_1\), \(\mathbf{B}_2\), and \(\mathbf{B}_3\), we can construct the inverse matrix \(B\):

\[ \begin{flalign} B &= [\mathbf{B}_1|\mathbf{B}_2|\mathbf{B}_3]\\ &= \begin{bmatrix} -3 & -9 & 16\\ 1 & 2 & -4\\ 1 & 3 & -5 \end{bmatrix} \end{flalign}\]

A faster way to perform this process is to create a new matrix formed by combining the columns of \(A\) with the identity matrix \(I_n\). Here’s what that would look like in this case:

\[ \begin{bmatrix} \class{red}{2} & \class{red}{3} & \class{red}{4} & \class{blue}{1} & \class{blue}{0} & \class{blue}{0}\\ \class{red}{1} & \class{red}{-1} & \class{red}{4} & \class{blue}{0} & \class{blue}{1} & \class{blue}{0}\\ \class{red}{1} & \class{red}{0} & \class{red}{3} & \class{blue}{0} & \class{blue}{0} & \class{blue}{1} \end{bmatrix}\]

The columns of matrix \(A\) are highlighted in red, and the columns of the identity matrix \(I_3\) are highlighted in blue.

Then, we row-reduce this matrix:

\[ \begin{bmatrix} 2 & 3 & 4 & 1 & 0 & 0\\ 1 & -1 & 4 & 0 & 1 & 0\\ 1 & 0 & 3 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -3 & -9 & 16\\ 0 & 1 & 0 & 1 & 2 & -4\\ 0 & 0 & 1 & 1 & 3 & -5 \end{bmatrix} \]

The inverse of \(A\) is then obtained by reading out the rightmost \(n\) columns of this matrix. In this case, the rightmost 3 columns are:

\[ B = \begin{bmatrix} -3 & -9 & 16\\ 1 & 2 & -4\\ 1 & 3 & -5 \end{bmatrix}\]

Inverse of a \(2\times 2\) matrix

For \(2 \times 2\) matrices specifically, there is a simple formula for the inverse of \(A\):

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \] \[ A^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

Note that this formula doesn’t work if \(ad-bc = 0\).

Invertible matrices

The general notation for an inverse of a matrix \(A\) is \(A^{-1}\). This is similar to how the reciprocal of a real number \(x\) is \(x^{-1}\).

Just like how not every real number has a reciprocal (zero doesn’t have a reciprocal), only some matrices have an inverse. A matrix that has an inverse is known as invertible. Specifically, only nonsingular matrices have an inverse, so the properties “nonsingular” and “invertible” are equivalent.

If \(A\) is nonsingular, the system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) has a unique solution given by \(A^{-1}\mathbf{b}\).

Consider a \(2 \times 2\) matrix \(A\) of this form:

\[ A = \begin{bmatrix} a & b\\ c & d \end{bmatrix} \]

The matrix \(A\) is only invertible if \(ad - bc \ne 0\).

Properties of matrix inverses

Any invertible matrix \(A\) has a unique inverse.
If \(A\) and \(B\) are invertible, then \(AB\) is invertible and \((AB)^{-1} = A^{-1}B^{-1}\).
For any invertible matrix \(A\), \(A^{-1}\) is invertible and \((A^{-1})^{-1} = A\).
For any invertible matrix \(A\), \(A^t\) is invertible and \((A^t)^{-1} = (A^{-1})^t\).
For any invertible matrix \(A\) and scalar \(\alpha\), \(\alpha A\) is invertible and \((\alpha A)^{-1} = \frac{1}{\alpha}A^{-1}\).

Column and Row Spaces

Column spaces

The column space of a matrix is the span of the columns of the matrix (that is, the set of all possible linear combinations of the columns of a matrix).

For example, consider this matrix:

\[ A = \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 11 \end{bmatrix} \]

The column space of this matrix is the following span:

\[ \left\langle \left\{ \begin{bmatrix} 1 \\ -2 \\ 7\ \end{bmatrix}\\, \begin{bmatrix} 3 \\ 4 \\ -3 \end{bmatrix}\\, \begin{bmatrix} 5 \\ 0 \\ 11 \end{bmatrix} \right\} \right\rangle \]

How can we determine what vectors are in the column space of \(A\)? Is every possible vector of size 3 in \(A\)’s column space?

For example, let’s try to determine if the following vector is in \(A\)’s column space:

\[ \mathbf{b}_1 = \begin{bmatrix} 9 \\ 12 \\ -9 \end{bmatrix} \]

This is the same as asking if \(\mathbf{b}_1\) can be written as a linear combination of \(A\)’s columns. In other words, we are looking for scalars \(x_1\), \(x_2\), and \(x_3\) such that:

\[ x_1 \begin{bmatrix} 1 \\ -2 \\ 7 \end{bmatrix} + x_2 \begin{bmatrix} 3 \\ 4 \\ -3 \end{bmatrix} + x_3 \begin{bmatrix} 5 \\ 0 \\ 11 \end{bmatrix} = \begin{bmatrix} 9 \\ 12 \\ -9 \end{bmatrix} \]

This, in turn, is the same as solving the following system of equations:

\[ x_1 + 3x_2 + 5x_3 = 9 \] \[ -2x_1 + 4x_2 = 12 \] \[ 7x_1 - 3x_2 + 11x_3 = -9 \]

We can solve this system by row-reducing the following matrix, as we usually would:

\[ \begin{bmatrix} 1 & 3 & 5 & 9\\ -2 & 4 & 0 & 12\\ 7 & -3 & 11 & -9 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 2 & 0 \\ 0 & \boxed{1} & 1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}\]

Because the last column isn’t a pivot column, there is a solution to our system of equations (in fact, there are infinitely many of them). This means that the vector \(\mathbf{b}_1\) is a member of the column space of \(A\).

Let’s do another example, this time with the following vector:

\[ \mathbf{b}_2 = \begin{bmatrix} 9 \\ 12 \\ 3 \end{bmatrix} \]

To find if this vector is in the column space of \(A\), we row-reduce the augmented matrix \([A|\mathbf{b}_2]\):

\[ \begin{bmatrix} 1 & 3 & 5 & 9\\ -2 & 4 & 0 & 12\\ 7 & -3 & 11 & 3 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 2 & 0 \\ 0 & \boxed{1} & 1 & 0 \\ 0 & 0 & 0 & \boxed{1} \end{bmatrix}\]

Because the last column is a pivot column, the corresponding system is inconsistent, so \(\mathbf{b}_2\) is not in the column space of \(A\).

To summarize this process, the vector \(\mathbf{b}\) is in the column space of \(A\) if the system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) is consistent (i.e. if \(A\mathbf{x} = \mathbf{b}\) is consistent).

In the last example, \(\mathbf{b}_1\) was in the column space of \(A\), but not \(\mathbf{b}_2\). This also means that not every possible vector of size 3 is in the column space of \(A\). Is there a \(3 \times 3\) matrix with a column space that contains every possible vector of size 3?

Let’s take our matrix \(A\) and make a small adjustment to create a new matrix \(B\):

\[ B = \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 12 \end{bmatrix} \]

Let’s bring back our vectors \(\mathbf{b}_1\) and \(\mathbf{b}_2\) from before and check if they’re in \(B\)’s column space.

\[ \begin{flalign} \mathbf{b}_1 = \begin{bmatrix} 9 \\ 12 \\ -9 \end{bmatrix} \quad & \mathbf{b}_2 = \begin{bmatrix} 9 \\ 12 \\ 3 \end{bmatrix} \end{flalign} \]

\[ \begin{bmatrix} 1 & 3 & 5 & 9\\ -2 & 4 & 0 & 12\\ 7 & -3 & 12 & -9 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & 0 \\ 0 & \boxed{1} & 0 & 3 \\ 0 & 0 & \boxed{1} & 0 \end{bmatrix}\] \[ \begin{bmatrix} 1 & 3 & 5 & 9\\ -2 & 4 & 0 & 12\\ 7 & -3 & 12 & 3 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & -24 \\ 0 & \boxed{1} & 0 & -9 \\ 0 & 0 & \boxed{1} & 12 \end{bmatrix}\]

Both \(\mathbf{b}_1\) and \(\mathbf{b}_2\) are in \(B\)’s column space. But it turns out, any possible vector of size 3 is in \(B\)’s column space! To find out why, let’s look closer at \(B\). What happens if we row-reduce \(B\) by itself?

\[ \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 12 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 \\ 0 & \boxed{1} & 0 \\ 0 & 0 & \boxed{1} \end{bmatrix}\]

It turns out that \(B\) is a nonsingular matrix (since it row-reduces to the identity matrix). And what this means is that the system \(B\mathbf{x} = \mathbf{b}\) will always have a single solution no matter what \(\mathbf{b}\) is, so every possible vector of size 3 is in the column space of \(B\).

In general, if \(A\) is an \(n \times n\) nonsingular matrix, then its column space will consist of all possible vectors of size \(n\).

Simplifying a column space

Let’s go back to the matrix \(A\) and look at its column space:

\[ A = \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 11 \end{bmatrix} \]

We could describe the column space as the span of \(A\)’s columns:

\[ C(A) = \left\langle \left\{ \begin{bmatrix} 1 \\ -2 \\ 7 \end{bmatrix}, \begin{bmatrix} 3 \\ 4 \\ -3 \end{bmatrix}, \begin{bmatrix} 5 \\ 0 \\ 11 \end{bmatrix} \right\} \right\rangle\]

I’m using \(C(A)\) to denote the column space of \(A\).

However, since these vectors aren’t linearly independent, we can simplify our description of the span. Let’s try reducing this spanning set by row-reducing the matrix \(A\):

\[ \begin{bmatrix} 1 & 3 & 5\\ -2 & 4 & 0\\ 7 & -3 & 11 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 2 \\ 0 & \boxed{1} & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Because the 1st and 2nd columns are pivot columns, we can describe this span using only the first two columns from \(A\):

\[ C(A) = \left\langle \left\{ \begin{bmatrix} 1 \\ -2 \\ 7 \end{bmatrix}, \begin{bmatrix} 3 \\ 4 \\ -3 \end{bmatrix} \right\} \right\rangle \]

Row spaces

The row space of a matrix \(A\) is the column space of its transpose \(A^t\). The row space of \(A\) can also be seen as the span of the rows of \(A\).

Here’s the row space of our matrix from before:

\[ A = \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 11 \end{bmatrix} \] \[ A^t = \begin{bmatrix} 1 & -2 & 7\\ 3 & 4 & -3\\ 5 & 0 & 11 \end{bmatrix} \] \[ R(A) = \left\langle \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}, \begin{bmatrix} -2 \\ 4 \\ 0 \end{bmatrix}, \begin{bmatrix} 7 \\ -3 \\ 11 \end{bmatrix} \right\rangle \]

I’m using \(R(A)\) to denote the row space of \(A\).

Row spaces are interesting because performing a row operation on a matrix does not have any effect on its row space. What this means is that if we row-reduce a matrix, the row space stays the same, allowing us to get a simpler spanning set.

Let’s row-reduce the matrix \(A\) to demonstrate this:

\[ \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 11 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 2\\ 0 & \boxed{1} & 1\\ 0 & 0 & 0 \end{bmatrix} \]

Taking the rows of this reduced matrix and writing them as column vectors, we get that the row space of \(A\) is:

\[ R(A) = \left\langle \left\{ \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1\end{bmatrix} \right\}\right\rangle \]

Note that we don’t need to include the zero vector, since including the zero vector doesn’t change the span at all.

Now let’s do the same for \(B\):

\[ \begin{bmatrix} 1 & 3 & 5 \\ -2 & 4 & 0 \\ 7 & -3 & 12 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 \\ 0 & \boxed{1} & 0 \\ 0 & 0 & \boxed{1} \end{bmatrix}\] \[ R(B) = \left\langle \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle \]

This also give us another way to find the column space of a matrix.

Column space from row operations

We can first transpose the matrix, then find its row space by row-reducing the matrix.

Elementary Matrices

Elementary matrices are matrices that are slight variations of the identity matrix. Specifically, they are the result of applying a row operation to the identity matrix. Here are the three types of elementary matrices:

For \(i \ne j\), \(E_{i, j}\) is the identity matrix with rows \(i\) and \(j\) swapped.
For \(\alpha \ne 0\), \(E_i(\alpha)\) is the identity matrix with the entry at row \(i\) and column \(i\) replaced with \(\alpha\).
For \(i \ne j\), \(E_{i, j}(\alpha)\) is the identity matrix with the entry at row \(j\) and column \(i\) replaced with \(\alpha\).

Here are some examples of \(3 \times 3\) elementary matrices:

\[ E_{1, 2} = \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix} \] \[ E_{2}(4) = \begin{bmatrix} 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 1 \end{bmatrix}\] \[ E_{1, 3}(-1) = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -1 & 0 & 1 \end{bmatrix}\]

Elementary matrices are special because multiplying an elementary matrix by a matrix \(A\) has the same effect as performing a row operation on \(A\). Specifically:

\(E_{i,j}A\) is equal to the matrix obtained by swapping row \(i\) and \(j\) of \(A\).
\(E_i(\alpha) A\) is equal to the matrix obtained by multiplying row \(i\) of \(A\) by \(\alpha\).
\(E_{i,j}(\alpha) A\) is equal to the matrix obtained by adding \(\alpha\) times row \(i\) to row \(j\) of \(A\).

In addition, elementary matrices are nonsingular, and any nonsingular matrix can be written as a product of elementary matrices.

Determinants of Matrices

The determinant is a special property of a square matrix, and it is a scalar. As we will see in a bit, its main use is to determine if a square matrix is singular or nonsingular. It is calculated as follows:

If \(A\) is a \(1 \times 1\) matrix, then the determinant of \(A\), denoted \(\det(A)\) or \(|A|\), is the value of the only entry of \(A\) (i.e. \([A]_{1, 1}\)).

If \(A\) is a \(2 \times 2\) matrix as follows:

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

then \(\det(A) = ad-bc\).

If \(A\) is larger, then the calculation of a determinant is more complicated. Let’s go through an example for the following \(3 \times 3\) matrix:

\[ A = \begin{bmatrix} 2 & 3 & 1\\ -2 & 0 & 0\\ 4 & 2 & 3 \end{bmatrix}\]

We start by removing the first row and first column to get a smaller matrix, known as a submatrix:

\[ \begin{bmatrix} 0 & 0\\ 2 & 3\\ \end{bmatrix}\]

Then we multiply the determinant of this matrix by the first entry of the first row of \(A\) (i.e. \([A]_{1, 1}\)). By doing this, we get:

\[ \begin{flalign} [A]_{1,1}\det\begin{bmatrix} 0 & 0\\ 2 & 3\\ \end{bmatrix} &= 2\det\begin{bmatrix} 0 & 0\\ 2 & 3\\ \end{bmatrix}\\ &= 2(0(3) - 0(2))\\ &= 0 \end{flalign}\]

Now, we remove the first row and second column from \(A\), and multiply the determinant by the negative of the second entry of the first row of \(A\):

\[ \begin{flalign} -[A]_{1, 2}\det\begin{bmatrix} -2 & 0\\ 4 & 3\\ \end{bmatrix} &= -3\det\begin{bmatrix} -2 & 0\\ 4 & 3\\ \end{bmatrix}\\ &= -3((-2)(3) + 0(4))\\ &= 18 \end{flalign}\]

Now we remove the first row and third column from \(A\), and multiply the determinant by the third entry of the first row of \(A\).

\[ \begin{flalign} [A]_{1, 3}\det\begin{bmatrix} -2 & 0\\ 4 & 2\\ \end{bmatrix} &= 1\det\begin{bmatrix} -2 & 0\\ 4 & 2\\ \end{bmatrix}\\ &= 1((-2)(2) + 0(4))\\ &= -4 \end{flalign}\]

To find the overall determinant of \(A\), we add these three values together.

\[ \det(A) = 0 + 18 + (-4) = 14\]

Notice how we took the negative of \([A]_{1, 2}\). In general, if the column number \(j\) is even, we will need to take the negative of the matrix entry \([A]_{1, j}\).

To describe this process in a formula, we first need to discuss submatrices.

Submatrices

The submatrix \(A(i|j)\) is the matrix obtained by removing row \(i\) and column \(j\) from \(A\).

For example, consider this matrix:

\[ A = \begin{bmatrix} 1 & 2 & 3\\ -4 & -5 & -6\\ 7 & 8 & 9 \end{bmatrix} \]

Here are some examples of submatrices:

\[ A(1|1) = \begin{bmatrix} -5 & -6\\ 8 & 9 \end{bmatrix}\] \[ A(2|3) = \begin{bmatrix} 1 & 2\\ 7 & 8 \end{bmatrix}\]

Determinant formula

With this submatrix notation, we can write down the formula for a determinant of a \(2 \times 2\) or larger square matrix:

\[ \begin{flalign} \det(A) &= [A]_{1,1}\det(A(1|1)) - [A]_{1, 2}\det(A(1|2)) + \cdots + (-1)^{n+1}[A]_{1,n}\det(A(1|n))\\ &= \sum_{j=1}^n (-1)^{j+1}[A]_{1,j}\det(A(1|j)) \end{flalign} \]

This is known as the expansion about row 1. The reason is that we can actually perform this type of expansion about any row in the matrix, and we will still get the same determinant! In general, the expansion about row \(i\) is:

\[ \begin{flalign} \det(A) &= (-1)^{i+1}[A]_{i,1}\det(A(i|1)) + (-1)^{i+2}[A]_{i, 2}\det(A(i|2)) + \cdots + (-1)^{i+n}[A]_{i,n}\det(A(i|n))\\ &= \sum_{j=1}^n (-1)^{i+j}[A]_{i,j}\det(A(i|j)) \end{flalign}\]

We can also perform this expansion about any column of the matrix:

\[ \begin{flalign} \det(A) &= (-1)^{1+j}[A]_{1,j}\det(A(1|j)) + (-1)^{2+j}[A]_{2,j}\det(A(2|j)) + \cdots + (-1)^{n+j}[A]_{n,j}\det(A(n|j))\\ &= \sum_{i=1}^n (-1)^{i+j}[A]_{i,j}\det(A(i|j)) \end{flalign} \]

Using this information, we can cleverly choose certain rows or columns to expand about in order to calculuate determinants faster. Let’s go back to our previous matrix:

\[ A = \begin{bmatrix} 2 & 3 & 1\\ -2 & 0 & 0\\ 4 & 2 & 3 \end{bmatrix} \]

This time, we’re going to calculate the determinant of \(A\) by expanding about the second row:

\[ \begin{flalign} \det(A) &= (-1)^{2+1}[A]_{2,1}\det(A(2|1)) + (-1)^{2+2}[A]_{2,2}\det(A(2|2)) + (-1)^{2+3}[A]_{2,3}\det(A(2|3))\\ &= -(-2)\det\begin{bmatrix} 3 & 1 \\ 2 & 3\end{bmatrix} + 0\det\begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix} + (-0) \det\begin{bmatrix} 2 & 3 \\ 4 & 2\end{bmatrix} \\ &= 2\det\begin{bmatrix} 3 & 1 \\ 2 & 3\end{bmatrix}\\ &= 2(3(3) - 1(2))\\ &= 14 \end{flalign}\]

The advantage of calculating the determinant this way is that two of the entries in the second row are 0, meaning we can skip over calculating two \(2 \times 2\) determinants.

Properties of determinants

A square matrix with a row or column that only contains zeros has a determinant of 0.
A square matrix with two equal rows or two equal columns has a determinant of 0.
If \(B\) is a matrix obtained by swapping two rows or columns of the square matrix \(A\), then \(\det(B) = -\det(A)\).
If \(B\) is a matrix obtained by multiplying a row or column of the square matrix \(A\) by a scalar \(\alpha\), then \(\det(B) = \alpha\det(A)\).
If \(B\) is a matrix obtained by adding \(\alpha\) times a row of the square matrix \(A\) to another row of \(A\), or adding \(\alpha\) times a column of \(A\) to another column of \(A\), then \(\det(B) = \det(A)\).
If \(A\) and \(B\) are square matrices of the same size, then \(\det(AB) = \det(A)\det(B)\).
The square matrix \(A\) is a singular matrix if and only if \(\det(A) = 0\).

The last property is a large part of why determinants are useful. We now have another way to check if a matrix is singular: using its determinant. If the determinant of a square matrix \(A\) is zero, then \(A\) is singular.

Determinants of elementary matrices

\(\det(I_n) = 1\)

The following can be derived from the properties of determinants and the determinant of an identity matrix:

\(\det(E_{i,j}) = -1\)
\(\det(E_i(\alpha)) = \alpha\)
\(\det(E_{i,j}(\alpha)) = 1\)

Nonsingular Matrix Equivalences, Part 3

Now that we’ve gone over inverse matrices, column spaces, determinants, we can add three more nonsingularity conditions to our list.

If \(A\) is an \(n \times n\) square matrix, these conditions are equivalent:

\(A\) is a nonsingular matrix.
The reduced row-echelon form of \(A\) is equal to the \(n \times n\) identity matrix.
The null space of \(A\) contains only the zero vector (i.e. the system \(A\mathbf{x} = \mathbf{0}\) has a single solution, the zero vector).
The linear system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) (i.e. the system \(A\mathbf{x} = \mathbf{b}\)) has a single unique solution for every possible choice of \(\mathbf{b}\).
The columns of \(A\) are linearly independent.
\(A\) is invertible (i.e. \(A\) has an inverse).
The column space of \(A\) consists of all possible vectors of size \(n\).
The determinant of \(A\) is nonzero (i.e. \(\det(A) \ne 0\)).

Unit 4: Vector Spaces

A First Course in Linear Algebra link: http://linear.ups.edu/html/chapter-VS.html

Intro to Vector Spaces

Did you notice that the basic properties of column vectors and matrices are very similar? (See the Vector Operations and Matrix Operations sections to review them.) It turns out that many other mathematical objects also share these properties, so we can use the power of linear algebra with them!

Mathematicians have a special name for sets of objects that behave like how we would expect vectors to behave: vector spaces.

A vector space is made of a set \(V\) of objects along with definitions of these two operations: vector addition (denoted by a plus sign) and scalar multiplication (denoted by juxtaposition, i.e. writing a scalar and a vector next to each other). \(V\) is a vector space if these ten properties (known as axioms) hold:

Additive Closure: If \(\mathbf{u}\) and \(\mathbf{v}\) are in \(V\), then \(\mathbf{u} + \mathbf{v}\) is in \(V\).
Scalar Closure: If \(\alpha\) is a scalar and \(\mathbf{u}\) is in \(V\), then \(\alpha\mathbf{u}\) is in \(V\).
Commutativity: If \(\mathbf{u}\) and \(\mathbf{v}\) are in \(V\), then \(\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}\).
Additive Associativity: If \(\mathbf{u}\), \(\mathbf{v}\), and \(\mathbf{w}\) are in \(V\), then \((\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})\)
Zero Vector: There is a zero vector, i.e. a vector \(\mathbf{0}\) such that \(\mathbf{u} + \mathbf{0} = \mathbf{u}\) for all \(\mathbf{u}\) in \(V\).
Additive Inverses: For every \(\mathbf{u}\) in \(V\), there is a vector \(-\mathbf{u}\) in \(V\) such that \(\mathbf{u} + (-\mathbf{u}) = \mathbf{V}\).
Scalar Multiplication Associativity: If \(\alpha\) and \(\beta\) are scalars and \(\mathbf{u}\) is in \(V\), then \(\alpha(\beta \mathbf{u}) = (\alpha\beta)\mathbf{u}\).
Distributivity Across Vector Addition: If \(\alpha\) is a scalar and \(\mathbf{u}\) and \(\mathbf{v}\) are in \(V\), then \(\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}\).
Distributivity Across Scalar Addition: If \(\alpha\) and \(\beta\) are scalars and \(\mathbf{u}\) is in \(V\), then \((\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}\).
Scalar Multiplication By One: If \(\mathbf{u}\) is in \(V\), then \(1\mathbf{u} = \mathbf{u}\).

The objects in \(V\) are known as vectors. Note that in the context of vector spaces, the term “vector” can refer to any type of object, not just the column vectors we’re used to using.

In addition, vector addition and scalar multiplication can be defined however we want as long as these 10 axioms hold. This means that depending on your definitions, vector addition and scalar multiplication might not behave as you would expect!

Keep in mind that if even one of these axioms doesn’t hold, then the set and its operations is not a vector space.

Here are some examples of vector spaces:

The vector space of column vectors

The set of all column vectors of size \(m\), denoted \(\mathbb{C}^m\) (or \(\mathbb{R}^m\) if you’re only considering column vectors with real entries), is a vector space. As we’ve shown in the Vector Operations section, the ten properties hold when we’re using the standard definitions of vector addition and scalar multiplication.

The vector space of matrices

The set of all matrices of size \(m \times n\), also denoted \(M_{mn}\), is also a vector space. The ten properties required are met with the standard definitions of matrix addition and scalar multiplication, as shown by the properties in the Matrix Operations section.

The vector space of polynomials

The set of all polynomials of degree \(n\) or less, denoted \(P_n\), is a vector space if these properties are defined as such:

Vector equality: Two polynomials \(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n\) and \(b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n\) are equal if \(a_i = b_i\) for \(0 \le i \le n\) (i.e. the coefficients and constants of both polynomials are equal).
Vector addition: The sum of two polynomials \(\mathbf{u} = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n\) and \(\mathbf{v} = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n\) is \(\mathbf{u} + \mathbf{v} = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n\).
Scalar multiplication: If \(\mathbf{u} = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n\), then \(\alpha \mathbf{u} = (\alpha a_0) + (\alpha a_1)x + \cdots + (\alpha a_n)x^n\).

Additional examples of vector spaces include the vector space of continuous functions and the vector space of infinite sequences.

A vector space with nonstandard operations

I previously said that we could define the operations of vector addition and scalar multiplication however we wanted as long as the vector space axioms are satisfied. Here’s an example of a vector space with some strange operations.

The vectors consist of all possible ordered pairs \((x_1, x_2)\). The two operations are defined as follows:

Vector addition: \((x_1, x_2) + (y_1, y_2) = (x_1 + y_1 + 1, x_2 + y_2 + 1)\)
Scalar multiplication: \(\alpha(x_1, x_2) = (\alpha x_1 + \alpha - 1, \alpha x_2 + \alpha - 1)\)

Surprisingly enough, this forms a vector space! It turns out that all 10 vector space axioms are satisfied even with these operations.

Strangely, the zero vector in this vector space is not \((0, 0)\), but instead \(\mathbf{0} = (-1, -1)\). To see why, let’s add the zero vector to a general vector:

\[ \begin{flalign} (x_1, x_2) + \mathbf{0} &= (x_1, x_2) + (-1, -1)\\ &= (x_1 - 1 + 1, x_2 - 1 + 1)\\ &= (x_1, x_2) \end{flalign} \]

Remember that the zero vector is defined as the vector such that \(\mathbf{u} + \mathbf{0} = \mathbf{u}\) for all vectors \(\mathbf{u}\) in the vector space. Because of this, \((-1, -1)\) is the zero vector in this vector space, since the sum of the zero vector and another vector \(\mathbf{u}\) is just \(\mathbf{u}\).

Properties of vector spaces

This is a list of properties that apply to vectors in vector spaces:

The zero vector \(\mathbf{0}\) is unique (i.e. there can only be one zero vector in a vector space).
Additive inverses are unique, i.e. for every vector \(\mathbf{u}\) in a vector space \(V\), there is only one additive inverse \(-\mathbf{u}\).
\(0\mathbf{u} = \mathbf{0}\) for every \(\mathbf{u}\) in a vector space \(V\).
\(\alpha\mathbf{0} = \mathbf{0}\) for every scalar \(\alpha\).
\(-\mathbf{u} = (-1)\mathbf{u}\) for every \(\mathbf{u}\) in a vector space \(V\).
- Note: \(-\mathbf{u}\) denotes the additive inverse of \(\mathbf{u}\), while \((-1)\mathbf{u}\) is the vector \(\mathbf{u}\) multiplied by the scalar -1. These are different concepts, but this property ensures that they will always be equal to each other.
If \(\alpha \mathbf{u} = \mathbf{0}\), then either \(\alpha = 0\) or \(\mathbf{u} = \mathbf{0}\).

Vector Spaces: Subspaces

A subspace \(W\) of a vector space \(V\) is a subset of \(V\) that is also a vector space and has the same definitions of vector addition and scalar multiplication as \(W\).

For example, consider the following subset of \(\mathbb{C}^2\), the set of column vectors of size 2:

\[ W = \left\{\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \bigg| x_1 - 3x_2 = 0 \right\}\]

This is the set of all column vectors of size 2 such that \(x_1 - 3x_2 = 0\). For example, these column vectors are in \(W\):

\[ \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \begin{bmatrix} -6 \\ -2 \end{bmatrix}, \begin{bmatrix} 10 \\ 10/3 \end{bmatrix} \]

And these vectors are not in \(W\):

\[ \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \begin{bmatrix} -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 3/2 \\ 4/3 \end{bmatrix} \]

Is \(W\) a subspace? That is, using the definitions of vector addition and scalar multiplication in \(\mathbb{C}^2\), is \(W\) a vector space? To test if this subset is a subspace, we need to verify that the ten properties (axioms) of vector spaces hold.

First up, additive closure. If \(\mathbf{u}\) and \(\mathbf{v}\) are in \(W\), does that mean \(\mathbf{u} + \mathbf{v}\) is in \(W\)?

Let’s write these two vectors as follows:

\[ \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \, \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \]

We can write the sum of these two vectors as:

\[ \mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix} \]

Now we need to know if this column vector satisfies the requirement to be in \(W\), \(x_1 - 3x_2 = 0\). For this vector, the first entry \(x_1\) is \(u_1 + v_1\), and the second entry is \(x_2 = u_2 + v_2\). Therefore, we must test if \((u_1 + v_1) - 3(u_2 + v_2) = 0\). Note that because \(\mathbf{u}\) and \(\mathbf{v}\) are in \(W\), we can ensure that \(u_1 - 3u_2 = 0\) and \(v_1 - 3v_2 = 0\).

\[ \begin{flalign} (u_1 + v_1) - 3(u_2 + v_2) &= u_1 + v_1 - 3u_2 - 3v_2\\ &= (u_1 - 3u_2) + (v_1 - 3v_2)\\ &= 0 + 0\\ &= 0 \end{flalign}\]

Therefore, if \(\mathbf{u}\) and \(\mathbf{v}\) are in \(W\), then \(\mathbf{u}+\mathbf{v}\) also meets the requirement to be in \(W\).

Now let’s test scalar closure. If \(\mathbf{u}\) is in \(W\), then is \(\alpha\mathbf{u}\) guaranteed to be in \(W\)? We can write \(\alpha\mathbf{u}\) as:

\[ \alpha\mathbf{u} = \alpha\begin{bmatrix}u_1 \\ u_2\end{bmatrix} = \begin{bmatrix}\alpha u_1 \\ \alpha u_2 \end{bmatrix} \]

To see if this vector \(\alpha\mathbf{u}\) is in \(W\), we need to test if \((\alpha u_1) - 3(\alpha u_2) = 0\). Once again, we know for sure that \(u_1 - 3u_2 = 0\), as \(\mathbf{u}\) is in \(W\).

\[ \begin{flalign} (\alpha u_1) - 3(\alpha u_2) &= \alpha (u_1 - 3u_2)\\ &= \alpha (0)\\ &= 0 \end{flalign}\]

Therefore, if \(\mathbf{u}\) is in \(W\), then \(\alpha\mathbf{u}\) is in \(W\).

Now let’s check if every vector \(\mathbf{u}\) in \(W\) has an inverse \(-\mathbf{u}\) that is in \(W\). The inverse of \(\mathbf{u}\) must be the following:

\[ -\mathbf{u} = \begin{bmatrix} -u_1 \\ -u_2 \end{bmatrix} \]

This time, we need to check if \((-u_1) - 3(-u_2) = 0\):

\[ \begin{flalign} (-u_1) - 3(-u_2) &= -u_1 + 3u_2 \\ &= -(u_1 - 3u_2)\\ &= -0\\ &= 0 \end{flalign}\]

Therefore, every vector \(\mathbf{u}\) in \(W\) has an inverse that is also in \(W\).

Is there a zero vector in \(W\)? The only candidate is the zero vector in \(\mathbb{C}^2\), and it meets the requirement to be in \(W\), as \(0 - 3(0) = 0\).

The other six properties (commutativity, associativity, etc.) hold for all vectors in \(W\), because any subset of column vectors from \(\mathbb{C}^2\) will retain these properties. Therefore, we don’t need to check them.

Because all ten properties of vector spaces hold, we can say that \(W\) is a vector space.

Testing if a subset is a subspace

There is a simpler way to check if a subset of a vector space is itself a vector space. A subset \(W\) of a vector space \(V\) is a subspace if it meets the following conditions:

\(W\) is nonempty.
If \(\mathbf{u}\) and \(\mathbf{v}\) are in \(W\), then \(\mathbf{u} + \mathbf{v}\) is in \(W\).
If \(\mathbf{u}\) is in \(W\) and \(\alpha\) is a scalar, then \(\alpha\mathbf{u}\) is in \(W\).

Specific subsets that are always subspaces

Given any vector space \(V\), the subsets \(V\) and \(\{\mathbf{0}\}\) will always be subspaces. These are known as the trivial subspaces of \(V\).

Null spaces and subspaces

The null space of any \(m \times n\) matrix \(A\) is a subspace of \(\mathbb{C}^n\).

For example, consider this subset:

\[ W = \left.\left\{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right| \begin{flalign} &4x_1 - 2x_2 + 7x_3 = 0, \\ &5x_1 + 3x_2 - x_3 = 0 \end{flalign} \right\}\]

This set is really just the solutions to the following homogeneous system of equations:

\[ 4x_1 - 2x_2 + 7x_3 = 0 \] \[ 5x_1 + 3x_2 - x_3 = 0 \]

The solution set to this system is the null space of the following matrix:

\[ \begin{bmatrix} 4 & -2 & 7\\ 5 & 3 & -1 \end{bmatrix}\]

Because the set \(W\) is a null space of a matrix, we immediately know that \(W\) is a subspace of \(\mathbb{C}^3\).

Vector Spaces: Linear Combinations and Spans

We can define linear combinations and spans for vector spaces in general the same way we did for column vectors. A linear combination of \(n\) vectors \(\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\) and \(n\) scalars \(\alpha_1, \alpha_2, ..., \alpha_n\) is:

\[ \alpha_1 \mathbf{u}_1 + \alpha_2 \mathbf{u}_2 + \cdots + \alpha_n \mathbf{u}_n \]

The span of a set of vectors \(S = \{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\), where \(\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\) are vectors from a vector space \(V\), is the set of all possible linear combinations using these vectors.

Given a set of vectors \(S\) from a vector space \(V\), the span of \(S\) will always be a subspace.

As an example, let’s look at the following span in the vector space of \(2 \times 2\) matrices:

\[ \left\langle \left\{\begin{bmatrix}1 & 4\\ 2 & 1\end{bmatrix}, \begin{bmatrix}-2 & 1\\ 2 & 5\end{bmatrix}, \begin{bmatrix}2 & 2\\ 3 & 3\end{bmatrix}, \begin{bmatrix}-1 & 2\\ 1 & 2\end{bmatrix} \right\} \right\rangle \]

Is the matrix \(A = \begin{bmatrix}4 & -2 \\ 1 & 3 \end{bmatrix}\) in this span? If this matrix is in the span, then there are scalars \(\alpha_1\), \(\alpha_2\), \(\alpha_3\), and \(\alpha_4\) such that:

\[ \alpha_1\begin{bmatrix}1 & 4\\ 2 & 1\end{bmatrix} + \alpha_2\begin{bmatrix}-2 & 1\\ 2 & 5\end{bmatrix} + \alpha_3\begin{bmatrix}2 & 2\\ 3 & 3\end{bmatrix} + \alpha_4\begin{bmatrix}-1 & 2\\ 1 & 2\end{bmatrix} = \begin{bmatrix}4 & -2 \\1 & 3\end{bmatrix} \]

By equating matrix entries, we can turn this question into a system of linear equations:

\[ \begin{flalign} \alpha_1 - 2\alpha_2 + 2\alpha_3 - \alpha_4 &= 4\\ 4\alpha_1 + \alpha_2 + 2\alpha_3 + 2\alpha_4 &= -2\\ 2\alpha_1 + 2\alpha_2 + 3\alpha_3 + \alpha_4 &= 1\\ \alpha_1 + 5\alpha_2 + 3\alpha_3 + 2\alpha_4 &= 3 \end{flalign} \]

As usual, we can solve this system by row-reducing the augmented matrix:

\[ \begin{bmatrix} 1 & -2 & 2 & -1 & 4\\ 4 & 1 & 2 & 2 & -2\\ 2 & 2 & 3 & 1 & 1\\ 1 & 5 & 3 & 2 & 3 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & 0 & -10\\ 0 & \boxed{1} & 0 & 0 & -8\\ 0 & 0 & \boxed{1} & 0 & 7\\ 0 & 0 & 0 & \boxed{1} & 16 \end{bmatrix} \]

By doing this, we find a solution: \(\alpha_1 = -10\), \(\alpha_2 = -8\), \(\alpha_3 = 7\), and \(\alpha_4 = 16\). Therefore, the matrix \(A\) is in our span.

Notice how we were able to convert this question of whether the matrix \(A\) is in a span into a question of whether a system of linear equations is consistent. This idea of turning questions about vector spaces into questions about linear systems is an important technique that we will continue to encounter in the future.

Vector Spaces: Linear Independence

Just like with column vectors, we can also define what it means for a set of vectors from a vector space to be linearly independent.

Given scalars \(\alpha_1, \alpha_2, ..., \alpha_n\) and vectors \(\{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\) from a vector space \(V\), a relation of linear dependence is a true equation of the form:

\[ \alpha_1 \mathbf{u}_1 + \alpha_2 \mathbf{u}_2 + \cdots + \alpha_n \mathbf{u}_n = \mathbf{0} \]

If all of the scalars are zero, then this is a trivial relation of linear dependence.

A set of vectors \(S = \{\mathbf{u}_1, \mathbf{u}_2, ..., \mathbf{u}_n\}\) is linearly independent if the only relation of linear dependence involving these vectors is the trivial one. Otherwise, \(S\) is linearly dependent.

Spanning Sets

A set \(S\) is known as a spanning set of a vector space \(V\) if \(S\) is a subset of \(V\) and the span of \(S\) is equal to \(V\) (i.e. \(\langle S \rangle = V\)). In this case, \(S\) is said to span \(V\).

For example, let’s consider the following span in \(P_2\), the vector space of polynomials of degree 2 or less:

\[ S_1 = \{3 - 2x + 5x^2, -7 + 6x - 4x^2, -1 + 2x + 6x^2\} \] \[ \langle S_1 \rangle = \langle \{3 - 2x + 5x^2, -7 + 6x - 4x^2, -1 + 2x + 6x^2\} \rangle\]

Does the set \(S_1\) span \(P_2\)? Let’s first check if some polynomials are in the span or not. For example, is \(1 + 2x + 17x^2\) in the span? This is equivalent to asking whether a linear combination of this form exists:

\[ \begin{flalign} \alpha_1(3-2x+5x^2) + \alpha_2(-7+6x-4x^2) + \alpha_3(-1+2x+6x^2) = 1 + 2x + 17x^2\\ (3\alpha_1 - 7\alpha_2 - \alpha_3) + (-2\alpha_1 + 6\alpha_2 + 2\alpha_3)x + (5\alpha_1 - 4\alpha_2 + 6\alpha_3)x^2 = 1 + 2x + 17x^2\\ \end{flalign} \]

By equating coefficients of the polynomial (i.e. setting the constant terms on both sides equal, as well as the \(x\) and \(x^2\) coefficients on both sides), we get the following system of equations:

\[ \begin{flalign} 3\alpha_1 - 7\alpha_2 - \alpha_3 = 1\\ -2\alpha_1 + 6\alpha_2 + 2\alpha_3 = 2\\ 5\alpha_1 - 4\alpha_2 + 6\alpha_3 = 17 \end{flalign}\]

Row-reducing the augmented matrix will tell us if there is a solution or not.

\[ \begin{bmatrix} 3 & -7 & -1 & 1\\ -2 & 6 & 2 & 2\\ 5 & -4 & 6 & 17 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 2 & 5\\ 0 & \boxed{1} & 1 & 2\\ 0 & 0 & 0 & 0 \end{bmatrix}\]

Note: Notice how each column in the first three columns of our original augmented matrix represents a polynomial in \(S_1\). For example, the first column represents the polynomial \(3 - 2x + 5x^2\). The last column represents the polynomial we are checking, which is \(1 + 2x + 17x^2\).

Because the last column is not a pivot column, there is a solution to the system of equations, meaning \(1 + 2x + 17x^2\) is in the span of \(S_1\).

Now let’s try doing the same thing for the polynomial \(1 + x + x^2\). We get the following system of equations and we want to know if it has a solution:

\[ \begin{flalign} 3\alpha_1 - 7\alpha_2 - \alpha_3 = 1\\ -2\alpha_1 + 6\alpha_2 + 2\alpha_3 = 1\\ 5\alpha_1 - 4\alpha_2 + 6\alpha_3 = 1 \end{flalign}\]

Let’s row-reduce the augmented matrix now:

\[ \begin{bmatrix} 3 & -7 & -1 & 1\\ -2 & 6 & 2 & 1\\ 5 & -4 & 6 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 2 & 0\\ 0 & \boxed{1} & 1 & 0\\ 0 & 0 & 0 & \boxed{1} \end{bmatrix}\]

Because the last column is a pivot column, the system is inconsistent, so \(1 + x + x^2\) is not in the span of \(S_1\).

Now that we’ve found a polynomial in \(P_2\) that isn’t in the span of \(S\), we can conclude that \(S\) does not span \(P_2\), since not every polynomial in \(P_2\) can be written as a linear combination of the polynomials in \(S_1\).

Now let’s look at this span:

\[ S_2 = \{2 + 5x + x^2, -1 + 2x + 3x^2, x + 2x^2\} \] \[ \langle S_2 \rangle = \langle \{2 + 5x + x^2, -1 + 2x + 3x^2, x + 2x^2\} \rangle\]

Let’s once again test if a specific polynomial is in the span, such as \(1 + 6x + 2x^2\).

When testing if this polynomial is in the span, we end up with the following system:

\[ \begin{flalign} 2\alpha_1 - \alpha_2 = 1\\ 5\alpha_1 + 2\alpha_2 + \alpha_3 = 6\\ \alpha_1 + 3\alpha_2 + 2\alpha_3 = 2 \end{flalign}\]

By row-reducing the augmented matrix, we get:

\[ \begin{bmatrix} 2 & -1 & 0 & 1\\ 5 & 2 & 1 & 6\\ 1 & 3 & 2 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & 1\\ 0 & \boxed{1} & 0 & 1\\ 0 & 0 & \boxed{1} & -1 \end{bmatrix}\]

The system is consistent, and so \(1 + 6x + x^2\) is part of the span of \(S_2\).

As it turns out, no matter what polynomial in \(P_2\) we choose, we will always be able to find the scalars \(\alpha_1\), \(\alpha_2\), and \(\alpha_3\) necessary to write the polynomial as a linear combination of the three polynomials in \(S_2\). To see why, let’s look at the coefficient matrix we’ve been using:

\[ A = \begin{bmatrix} 2 & -1 & 0\\ 5 & 2 & 1\\ 1 & 3 & 2 \end{bmatrix} \]

By row-reducing this matrix or finding its determinant (which is 11), we can conclude that this matrix is nonsingular, meaning that no matter what the vector of constants \(\mathbf{b}\) is, we will be able to find a solution to the system \(A\mathbf{x} = \mathbf{b}\). Therefore, every polynomial in \(P_2\) is part of the span of \(S_2\), and so \(S_2\) spans \(P_2\).

Vector Spaces: Bases

In this section, we’ll be looking at a special type of spanning set known as a basis. A basis of a vector space \(V\) is a set of vectors that is linearly independent and spans \(V\).

Remember that when we have a spanning set that is not linearly independent, we can write at least one of the vectors as a linear combination of the others, allowing us to create an equivalent span with fewer vectors. (This was shown in the Linear Dependence/Independence and Spans section.)

However, a linearly independent spanning set cannot be reduced any further. Therefore, a basis can be thought of as a spanning set that contains the least vectors possible.

For example, consider the span of the following set of vectors:

\[ \left\{ \begin{bmatrix} 3 \\ 2 \\ 7 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}, \begin{bmatrix} 6 \\ 3 \\ 4 \end{bmatrix}, \begin{bmatrix} -9 \\ -1 \\ -3 \end{bmatrix} \right\}\]

Let’s check to see if this set is linearly indpendent by row-reducing the matrix formed by combining these column vectors:

\[ \begin{bmatrix} 3 & 1 & 6 & -9\\ 2 & 2 & 3 & -1\\ 7 & 5 & 4 & -3 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & -2\\ 0 & \boxed{1} & 0 & 3\\ 0 & 0 & \boxed{1} & -1 \end{bmatrix}\]

By taking the columns of the original matrix that correspond to the pivot columns of the row-reduced matrix, we can find that this spanning set can be written more simply as:

\[ \left\{ \begin{bmatrix} 3 \\ 2 \\ 7 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}, \begin{bmatrix} 6 \\ 3 \\ 4 \end{bmatrix} \right\}\]

At this point, this set of vectors is linearly independent and spans \(\mathbb{C}^3\), so we have a basis of \(\mathbb{C}^3\).

The standard basis

The simplest example of a basis in \(\mathbb{C}^3\) is the following:

\[ \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} \]

These are known as the standard unit vectors for \(\mathbb{C}^3\). Any vector in \(\mathbb{C}^3\) can be easily written as a combination of these three vectors, so the set spans \(\mathbb{C}^3\). Here are a few examples:

\[ \begin{flalign} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} &= 1\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 2\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \\ \begin{bmatrix} -10 \\ 6 \\ -2 \end{bmatrix} &= (-10)\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 6 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + (-2) \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \end{flalign} \]

The set is also linearly independent since none of the vectors can be written as a linear combination of the other two vectors.

A similar standard basis can be made for the vector space \(\mathbb{C}^n\) by creating \(n\) vectors such that the \(i\)th standard unit vector has a 1 as the \(i\)th entry and zeros for all other entries. For example, the standard unit vectors for \(\mathbb{C}^4\) are:

\[ \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0\end {bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1\end{bmatrix} \right\} \]

Notice that these vectors are just the columns of the \(4 \times 4\) identity matrix. In general, the \(i\)th standard basis vector of \(\mathbb{C}^n\) is the \(i\)th column of the \(n \times n\) identity matrix \(I_n\).

Because the columns of an \(n \times n\) nonsingular matrix are linearly independent and also span \(\mathbb{C}^n\), the columns of any nonsingular matrix form a basis for \(\mathbb{C}^n\).

Vector Spaces: Dimension

It turns out that all possible bases of a vector space have the same number of vectors. This means that the size of any basis of a vector space is a property of that vector space.

For example, since the vector space \(\mathbb{C}^3\) has a basis with 3 vectors (the standard basis), \(\mathbb{C}^3\) has a dimension of 3. In addition, all other possible bases of \(\mathbb{C}^3\) have 3 vectors.

In general, the vector space \(\mathbb{C}^n\) is \(n\), since the columns of the \(n \times n\) identity matrix (i.e. the \(n\) standard basis vectors) form a basis for \(\mathbb{C}^n\).

The vector space of all polynomials of degree \(n\) or less, \(P_n\), has dimension \(n + 1\), since the following set of polynomials is linearly independent and spans \(P_n\):

\[ \{1, x, x^2, ..., x^n\} \]

The vector space of \(m \times n\) matrices, \(M_{mn}\) has dimension \(mn\). For example, the vector space of \(2 \times 2\) matrices has the following basis of size 4:

\[ \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}\]

In general, for any vector space \(M_{mn}\), you can create a basis with \(mn\) matrices, each with one entry equal to 1 and the rest of the entries equal to 0.

One thing to keep in mind is that a vector space that only contains the zero vector has dimension 0.

Rank and nullity of a matrix

The dimensions of the column space and null space of a matrix are given special names: the rank and nullity respectively.

For example, consider the following matrix that we looked at in the Vector Spaces: Bases section:

\[ A = \begin{bmatrix} 3 & 1 & 6 & -9\\ 2 & 2 & 3 & -1\\ 7 & 5 & 4 & -3 \end{bmatrix}\]

What is the rank and nullity of this matrix? First, we need to determine the column space of this matrix. In the previous section, we did this by row-reducing the matrix:

By taking columns of the original matrix that correspond to the pivot columns, we get that the column space is the span of three vectors:

\[ C(A) = \left\langle \left\{ \begin{bmatrix} 3 \\ 2 \\ 7 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}, \begin{bmatrix} 6 \\ 3 \\ 4 \end{bmatrix} \right\} \right\rangle \]

Therefore, the rank of \(A\) is 3. Now let’s find the null space of \(A\). We can do that by row-reducing the augmented matrix for \(A\)’s homogeneous system:

\[ \begin{bmatrix} 3 & 1 & 6 & -9 & 0\\ 2 & 2 & 3 & -1 & 0\\ 7 & 5 & 4 & -3 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 0 & 0 & -2 & 0\\ 0 & \boxed{1} & 0 & 3 & 0\\ 0 & 0 & \boxed{1} & -1 & 0 \end{bmatrix}\]

Translating the row-reduced matrix back into equations:

\[ \begin{flalign} x_1 - 2x_4 = 0 &\to x_1 = 2x_4 \\ x_2 + 3x_4 = 0 &\to x_2 = -3x_4 \\ x_3 - x_4 = 0 &\to x_3 = x_4 \end{flalign} \]

Writing the solution set as a span:

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 2x_4 \\ -3x_4 \\ x_4 \\ x_4 \end{bmatrix} = x_4\begin{bmatrix} 2 \\ -3 \\ 1 \\ 1 \end{bmatrix}\]

\[ N(A) = \left\langle \left\{ \begin{bmatrix} 2 \\ -3 \\ 1 \\ 1 \end{bmatrix} \right\} \right\rangle \]

Because the null space of \(A\) can be represented as the span of one vector (and a set with one nonzero vector is linearly independent), the nullity of \(A\) is 1.

The sum of a matrix’s rank and nullity will always equal the number of columns of the matrix. In this case, the rank of \(A\), which is 3, plus the nullity of \(A\), which is 1, equals the number of columns that \(A\) has, which is 4.

Since nonsingular matrices row-reduce to the identity matrix (which has \(n\) pivot columns, where \(n\) is the number of columns of the original matrix), a nonsingular matrix will always have a rank equal to the number of columns it has.

In addition, a nonsingular matrix will always have a nullity of zero, since for a matrix with \(n\) columns, the rank (which is \(n\) for a nonsingular matrix) plus nullity must equal \(n\).

Another way to show this is that the null space of a nonsingular matrix only contains the zero vector, meaning that its nullity must be zero.

Nonsingular Matrix Equivalences, Part 4

Let’s update our nonsingular matrix equivalences list again using what we know about vector spaces.

\(A\) is a nonsingular matrix.
The reduced row-echelon form of \(A\) is equal to the \(n \times n\) identity matrix.
The null space of \(A\) contains only the zero vector (i.e. the system \(A\mathbf{x} = \mathbf{0}\) has a single solution, the zero vector).
The linear system of equations with coefficient matrix \(A\) and vector of constants \(\mathbf{b}\) (i.e. the system \(A\mathbf{x} = \mathbf{b}\)) has a single unique solution for every possible choice of \(\mathbf{b}\).
The columns of \(A\) are linearly independent.
\(A\) is invertible (i.e. \(A\) has an inverse).
The column space of \(A\) is equal to \(\mathbb{C}^n\), the set of all possible vectors of size \(n\).
The determinant of \(A\) is nonzero (i.e. \(\det(A) \ne 0\)).
The columns of \(A\) form a basis for \(\mathbb{C}^n\).
The rank of \(A\) is \(n\).
The nullity of \(A\) is zero.

(If we are only considering vectors and matrices with real number entries, replace all occurrences of \(\mathbb{C}^n\) with \(\mathbb{R}^n\).)

Unit 5: Eigenvalues and Eigenvectors

A First Course in Linear Algebra link: http://linear.ups.edu/html/chapter-E.html (Eigenvalues)

Intro to Eigenvalues and Eigenvectors

Sometimes when we multiply a matrix by a vector, it has the same effect as multiplying that vector by a scalar. For example, consider this matrix and vector:

\[ A = \begin{bmatrix} 6 & 1\\ 4 & 6 \end{bmatrix},\, \mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \] \[ A\mathbf{v} = \begin{bmatrix} 6 & 1\\ 4 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ 16 \end{bmatrix} = 8\mathbf{v} \]

Notice how \(A\mathbf{v}\) is equal to 8 times \(\mathbf{v}\). In this case, we call \(\mathbf{v}\) an eigenvector of \(A\) with eigenvalue 8.

In general, if \(A\mathbf{x} = \lambda \mathbf{x}\), then \(\mathbf{x}\) is known as an eigenvector of \(A\) with eigenvalue \(\lambda\). (Note that an eigenvector \(\mathbf{x}\) cannot be the zero vector, because that would trivially satisfy the equation \(A\mathbf{x} = \lambda \mathbf{x}\) no matter what \(A\) and \(\lambda\) are.)

How can we find the eigenvalues and eigenvectors of a matrix? We can first consider the equation \(A\mathbf{x} = \lambda \mathbf{x}\):

\[ A\mathbf{x} = \lambda\mathbf{x} \] \[ A\mathbf{x} - \lambda\mathbf{x} = \mathbf{0} \] \[ A\mathbf{x} - \lambda I_n\mathbf{x} = \mathbf{0} \] \[ (A - \lambda I_n)\mathbf{x} = \mathbf{0} \]

There are two cases here:

\(A - \lambda I_n\) is nonsingular, meaning that the system \((A - \lambda I_n)\mathbf{x} = \mathbf{0}\) has a single solution (the trivial one). This means that \(\mathbf{x} = \mathbf{0}\), but we are looking for values of \(\mathbf{x}\) that are nonzero (since eigenvectors cannot be the zero vector).
\(A - \lambda I_n\) is singular, meaning that the system \((A - \lambda I_n)\mathbf{x} = \mathbf{0}\) has infinitely many solutions. This means that there are nontrivial (nonzero) values of \(\mathbf{x}\) that satisfy the system, and these are the eigenvectors of \(A\) with eigenvalue \(\lambda\).

We want \(A - \lambda I_n\) to be singular, so we want to find the values of \(\lambda\) such that \(\det(A - \lambda I_n) = 0\).

The value of \(\det(A - \lambda I_n)\) is known as the characteristic polynomial, and the eigenvalues of \(A\) are the roots of this polynomial.

Going back to our previous example, the characteristic polynomial of \(A\) is:

\[ \begin{flalign} \det(A - \lambda I_n) &= \det\left(\begin{bmatrix} 6 & 1 \\ 4 & 6 \end{bmatrix} - \lambda\begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}\right)\\ &= \det\left(\begin{bmatrix} 6 & 1 \\ 4 & 6 \end{bmatrix} - \begin{bmatrix} \lambda & 0\\ 0 & \lambda \end{bmatrix}\right)\\ &= \det \begin{bmatrix} 6 - \lambda & 1 \\ 4 & 6 - \lambda \end{bmatrix}\\ &= (6-\lambda)^2 - 4\\ &= 36 - 12\lambda + \lambda^2 - 4\\ &= \lambda^2 - 12\lambda + 32 \end{flalign}\]

Now let’s find the roots of this polynomial:

\[ \begin{flalign} &\lambda^2 - 12\lambda + 32 = 0\\ &(\lambda - 4)(\lambda - 8) = 0\\ &\lambda = 4, \lambda = 8 \end{flalign}\]

These are the eigenvalues of \(A\).

Now let’s find the eigenvectors associated with each eigenvalue. To do this, we need to solve the equation \((A - \lambda I_n)\mathbf{x} = \mathbf{0}\) for each eigenvalue.

Let’s start with \(\lambda = 4\):

\[ \begin{flalign} &\lambda = 4: \\ &(A - 4 I_n) \mathbf{x} = \mathbf{0}\\ &\begin{bmatrix} 6 - 4 & 1\\ 4 & 6 - 4 \end{bmatrix}\mathbf{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\\ &\begin{bmatrix} 2 & 1\\ 4 & 2 \end{bmatrix}\mathbf{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{flalign}\] \[ \begin{bmatrix} 2 & 1 & 0 \\ 4 & 2 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & 1/2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \] \[ x_1 + \frac{1}{2}x_2 = 0 \to x_1 = -\frac{1}{2}x_2 \] \[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -(1/2)x_2 \\ x_2 \end{bmatrix} = x_2 \begin{bmatrix} -1/2 \\ 1 \end{bmatrix} \] \[ \text{eigenvectors for } \lambda = 4: \left\langle \begin{bmatrix} -1/2 \\ 1 \end{bmatrix} \right\rangle \]

Now let’s do the same with \(\lambda = 8\):

\[ \begin{flalign} &\lambda = 8: \\ &(A - 8 I_n) \mathbf{x} = \mathbf{0}\\ &\begin{bmatrix} 6 - 8 & 1\\ 4 & 6 - 8 \end{bmatrix}\mathbf{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\\ &\begin{bmatrix} -2 & 1\\ 4 & -2 \end{bmatrix}\mathbf{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{flalign}\] \[ \begin{bmatrix} -2 & 1 & 0 \\ 4 & -2 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} \boxed{1} & -1/2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \] \[ x_1 - \frac{1}{2}x_2 = 0 \to x_1 = \frac{1}{2}x_2 \] \[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} (1/2)x_2 \\ x_2 \end{bmatrix} = x_2 \begin{bmatrix} 1/2 \\ 1 \end{bmatrix} \] \[ \text{eigenvectors for } \lambda = 8: \left\langle \begin{bmatrix} 1/2 \\ 1 \end{bmatrix} \right\rangle \]

The set of eigenvectors for an matrix \(A\) and an eigenvalue \(\lambda\) is known as the eigenspace of \(A\) for \(\lambda\).

The algebraic multiplicity of an eigenvalue is the highest power of \(x - \lambda\) that appears in the factored form of the characteristic polynomial.

The geometric multiplicity of an eigenvalue is the dimension of the eigenspace for \(\lambda\).

In our example, the algebraic and geometric multiplicity of \(\lambda = 4\) and \(\lambda = 8\) is 1.

Unit 6: Linear Transformations

A First Course in Linear Algebra link: http://linear.ups.edu/html/chapter-LT.html (Linear Transformations)

Intro to Linear Transformations

Let’s see what actually happens when we multiply a matrix \(A\) by a vector \(\mathbf{x}\). Here’s an example:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]

This is what we get when we multiply \(A\) by an arbitrary vector \(\mathbf{x}\) with components \(x_1\), \(x_2\), and \(x_3\):

\[ \begin{flalign} A\mathbf{x} &= \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\\ &= \begin{bmatrix} x_1 + 2x_2 + 3x_3\\ 4x_1 + 5x_2 + 6x_3\\ 7x_1 + 8x_2 + 9x_3 \end{bmatrix} \end{flalign}\]

Notice how the entries of the final result “feel linear”, that is, they look like linear expressions (they purely involve addition and multiplication without division or exponents).

There is a name for functions that behave this way: linear transformations. A linear transformation (also known as a linear map) is a function \(T(\mathbf{x})\) that meets these two requirements:

\(T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})\)
\(T(\alpha\mathbf{u}) = \alpha T(\mathbf{u})\)

The first property requires that \(T\) behaves linearly with vector addition. Applying \(T\) to two vectors \(\mathbf{x}\) and \(\mathbf{y}\) individually then adding the results gives the same result as adding the vectors first then applying \(T\) to the sum.

The second property requires that \(T\) behavies linearly with scalar multiplication. Multiplying \(\mathbf{u}\) by a scalar \(\alpha\) then applying \(T\) is the same as applying \(T\) to \(\mathbf{u}\) then multiplying by \(\alpha\).

Consider the following function:

\[ T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 1 \\ x_2 + 1 \end{bmatrix} \]

Is \(T\) a linear transformation? We can test the first property:

\[ \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix},\, \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \] \[ \begin{flalign} T(\mathbf{u} + \mathbf{v}) &= T\left(\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}\right)\\ &= T\left(\begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}\right)\\ &= \begin{bmatrix} u_1 + v_1 + 1 \\ u_2 + v_2 + 1\end{bmatrix} \end{flalign}\] \[ \begin{flalign} T(\mathbf{u}) + T(\mathbf{v}) &= T\left(\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}\right) + T\left(\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}\right)\\ &= \begin{bmatrix} u_1 + 1\\ u_2 + 1 \end{bmatrix} + \begin{bmatrix} v_1 + 1\\ v_2 + 1 \end{bmatrix}\\ &= \begin{bmatrix} u_1 + v_1 + 2 \\ u_2 + v_2 + 2\end{bmatrix} \end{flalign}\]

Because \(T\) doesn’t meet the requirements, \(T\) is not a linear transformation.

One property of linear transformations is that they must take the zero vector to the zero vector, i.e. \(T(\mathbf{0}) = \mathbf{0}\). (Think about it: If \(T(\mathbf{0}) \ne \mathbf{0}\), how would the requirements for a linear transformation be violated?)

A function that multiplies a matrix by a vector always ends up being a linear transformation.

Injective and Surjective Linear Transformations

Some linear transformations have special properties that I will discuss here.

Function notation

In this section, I will be using a special notation for functions. The notation \(f: A \to B\) represents a function \(f\) with domain \(A\) and codomain \(B\).

The domain of a function is the set of all possible inputs of a function. The codomain of a function is a set that includes all possible outputs of the function. Note that this is different than the function’s range, which is the set of all values that can actually come out of the function. The range of a function is always a subset of the codomain.

For example, consider the function \(f: \mathbb{R} \to \mathbb{R}\) defined by \(f(x) = x^2\).

The domain of \(f\) is \(\mathbb{R}\), meaning that we are only considering real number inputs.
The codomain of this function is \(\mathbb{R}\) (all real numbers). (We can specify the codomain to be any set we want as long as it contains all possible outputs of \(f\).)
The range of the function is all non-negative real numbers (since it’s impossible to get a negative number as an output of this function). Note that this is not the same as the codomain!

I will be using the same notation for linear transformations. For example, \(T: U \to V\) represents a linear transformation \(T\) with domain \(U\) and codomain \(V\).

Injective and surjective functions

An injective function is a function \(f\) such that if \(f(x) = f(y)\), then \(x = y\). In other words, no two different inputs correspond to the same output of \(f\).

The function \(f: \mathbb{R} \to \mathbb{R}\) defined by \(f(x) = x^3\) is injective because every output has only one input.
The function \(f: \mathbb{R} \to \mathbb{R}\) defined by \(f(x) = x^2\) is not injective because multiple inputs can correspond to the same output. For example, \(f(2) = f(-2) = 4\).

A surjective function is a function \(f\) such that for every \(y\) in the codomain of \(f\), there is an \(x\) such that \(f(x) = y\).

The function \(f: \mathbb{R} \to \mathbb{R}\) defined by \(f(x) = x^3\) is surjective because every element in the codomain (i.e. every real number) is a possible output of \(f\).
The function \(f: \mathbb{R} \to \mathbb{R}\) defined by \(f(x) = x^2\) is not surjective because negative numbers are not part of the possible outputs of \(f\). For example, there is no \(x\) in the domain such that \(f(x) = -1\).
The function \(f: \mathbb{R} \to [0, \infty)\) defined by \(f(x) = x^2\) is surjective because element in the codomain (i.e. every non-negative number) is a possible output of \(f\). (Notice that we’ve excluded negative numbers from the codomain of \(f\).)

An injective (or one-to-one) linear transformation is a transformation \(T: U \to V\) such that if \(T(\mathbf{x}) = T(\mathbf{y})\), then \(\mathbf{x} = \mathbf{y}\). In other words, no two distinct input vectors result in the same output vector when fed into \(T\).

A surjective (or onto) linear transformation is a transformation \(T: U \to V\) such that for every \(\mathbf{y}\) in \(V\), there is an \(\mathbf{x}\) in \(U\) such that \(f(\mathbf{x}) = \mathbf{y}\). In other words, every element of the codomain is a possible output of \(T\).

The kernel and range of linear transformations

The kernel of a linear transformation \(T: U \to V\) is the set of vectors that \(T\) takes to the zero vector, that is, the set of vectors \(\mathbf{u}\) in \(U\) such that \(T(\mathbf{u}) = \mathbf{0}\). In set notation, for a linear transformation \(T: U \to V\):

\[ \text{kernel of } T = \{\mathbf{u}\in U \,|\, T(\mathbf{u}) = \mathbf{0}\} \]

The range of a linear transformation \(T: U \to V\) is the set of all possible outputs of \(T\), that is, a vector \(\mathbf{v}\) in \(V\) is in the range of \(T\) if there is a \(\mathbf{u}\) in \(U\) such that \(T(\mathbf{u}) = \mathbf{v}\). In set notation:

\[ \text{range of }T = \{T(\mathbf{u}) \,|\, \mathbf{u} \in U\}\]

Injective/surjective linear transformations and their kernels/ranges

A linear transformation is injective if and only if its kernel consists only of the zero vector.

A linear transformation is surjective if and only if its range is equal to its codomain.

Credits / Special Thanks

All code, diagrams, and explanations (except those in the “Guest Explanations” section) were created by Eldrick Chen (also known as “calculus gaming”). This page is open-source - view the GitHub repository here.

Feel free to modify this website in any way! If you have ideas for how to improve this website, feel free to make those changes and publish them yourself, as long as you follow the terms of the GNU General Public License v3.0 (scroll down to view this license).

👋 Hello! I’m Eldrick, and I originally started making educational math websites as a passion project to help people at my school.

Despite being (mostly) the only one to directly work on this project, it wouldn’t have been possible if it wasn’t for the work of many others. Here are some people and organizations I want to credit for allowing me to build this website in the first place.

Tools used to create this page

Tools used to create the images and animations on this page:
- Desmos Graphing Calculator for 2D graphs
- Desmos 3D Graphing Calculator for 3D graphs
- Google Drawings for extra annotations, other types of diagrams, and sidebar icons
- imgonline.tools to make the website logo transparent
JavaScript libraries used to make this website do cool things (these are all licensed under Apache License 2.0):

MathJax to display mathematical formulas and expressions
Math.js for additional math functions

Fonts used on this page

The font used for this page is Monospace Typewriter, was created by Manfred Klein.
The font used in some of my diagrams is Open Sans.

Special thanks

Thanks to A First Course in Linear Algebra by Robert A. Beezer for allowing me to learn linear algebra for free! The content on my page is based on this resource.
Thanks to the creators of the Fast Inverse Square Root algorithm. I was inspired to create calculus websites after creating a page explaining this algorithm. Researching this algorithm ignited my passion for calculus and inspired me to create educational math websites!
- Thanks to YouTuber Nemean for introducing me to the Fast Inverse Square Root algorithm with his amazing video on the subject.
Thanks to Hevipelle and many others for creating the awesome game Antimatter Dimensions. This game inspired me to use this font for this website and also inspired the idea of naming each update in the update notes with a word or phrase rather than with a version number. Both of these things are done in Antimatter Dimensions and I’ve decided to do them on my website too!

Legal information

View Monospace Typewriter (this font)’s license here.
Open Sans is licensed under the Open Font License. License details on Google Fonts

Math.js copyright notice:

math.js
https://github.com/josdejong/mathjs

Copyright (C) 2013-2023 Jos de Jong <wjosdejong@gmail.com>

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

This page is licensed under the GNU General Public License v3.0. Details:

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

          GNU GENERAL PUBLIC LICENSE
              Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
                  Preamble

The GNU General Public License is a free, copyleft license for
software and other kinds of works.

The licenses for most software and other practical works are designed
to take away your freedom to share and change the works.  By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.  We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors.  You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights.  Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received.  You must make sure that they, too, receive
or can get the source code.  And you must show them these terms so they
know their rights.

Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software.  For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so.  This is fundamentally incompatible with the aim of
protecting users' freedom to change the software.  The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable.  Therefore, we
have designed this version of the GPL to prohibit the practice for those
products.  If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary.  To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and
modification follow.
              TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this
License.  Each licensee is addressed as "you".  "Licensees" and
"recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy.  The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based
on the Program.

To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy.  Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other
parties to make or receive copies.  Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License.  If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work
for making modifications to it.  "Object code" means any non-source
form of a work.

A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form.  A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities.  However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work.  For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

The Corresponding Source for a work in source code form is that
same work.

2. Basic Permissions.

All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met.  This License explicitly affirms your unlimited
permission to run the unmodified Program.  The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work.  This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force.  You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright.  Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under
the conditions stated below.  Sublicensing is not allowed; section 10
makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

  a) The work must carry prominent notices stating that you modified
  it, and giving a relevant date.

  b) The work must carry prominent notices stating that it is
  released under this License and any conditions added under section
  7.  This requirement modifies the requirement in section 4 to
  "keep intact all notices".

  c) You must license the entire work, as a whole, under this
  License to anyone who comes into possession of a copy.  This
  License will therefore apply, along with any applicable section 7
  additional terms, to the whole of the work, and all its parts,
  regardless of how they are packaged.  This License gives no
  permission to license the work in any other way, but it does not
  invalidate such permission if you have separately received it.

  d) If the work has interactive user interfaces, each must display
  Appropriate Legal Notices; however, if the Program has interactive
  interfaces that do not display Appropriate Legal Notices, your
  work need not make them do so.

A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit.  Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

  a) Convey the object code in, or embodied in, a physical product
  (including a physical distribution medium), accompanied by the
  Corresponding Source fixed on a durable physical medium
  customarily used for software interchange.

  b) Convey the object code in, or embodied in, a physical product
  (including a physical distribution medium), accompanied by a
  written offer, valid for at least three years and valid for as
  long as you offer spare parts or customer support for that product
  model, to give anyone who possesses the object code either (1) a
  copy of the Corresponding Source for all the software in the
  product that is covered by this License, on a durable physical
  medium customarily used for software interchange, for a price no
  more than your reasonable cost of physically performing this
  conveying of source, or (2) access to copy the
  Corresponding Source from a network server at no charge.

  c) Convey individual copies of the object code with a copy of the
  written offer to provide the Corresponding Source.  This
  alternative is allowed only occasionally and noncommercially, and
  only if you received the object code with such an offer, in accord
  with subsection 6b.

  d) Convey the object code by offering access from a designated
  place (gratis or for a charge), and offer equivalent access to the
  Corresponding Source in the same way through the same place at no
  further charge.  You need not require recipients to copy the
  Corresponding Source along with the object code.  If the place to
  copy the object code is a network server, the Corresponding Source
  may be on a different server (operated by you or a third party)
  that supports equivalent copying facilities, provided you maintain
  clear directions next to the object code saying where to find the
  Corresponding Source.  Regardless of what server hosts the
  Corresponding Source, you remain obligated to ensure that it is
  available for as long as needed to satisfy these requirements.

  e) Convey the object code using peer-to-peer transmission, provided
  you inform other peers where the object code and Corresponding
  Source of the work are being offered to the general public at no
  charge under subsection 6d.

A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling.  In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage.  For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product.  A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source.  The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information.  But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed.  Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

7. Additional Terms.

"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law.  If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it.  (Additional permissions may be written to require their own
removal in certain cases when you modify the work.)  You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

  a) Disclaiming warranty or limiting liability differently from the
  terms of sections 15 and 16 of this License; or

  b) Requiring preservation of specified reasonable legal notices or
  author attributions in that material or in the Appropriate Legal
  Notices displayed by works containing it; or

  c) Prohibiting misrepresentation of the origin of that material, or
  requiring that modified versions of such material be marked in
  reasonable ways as different from the original version; or

  d) Limiting the use for publicity purposes of names of licensors or
  authors of the material; or

  e) Declining to grant rights under trademark law for use of some
  trade names, trademarks, or service marks; or

  f) Requiring indemnification of licensors and authors of that
  material by anyone who conveys the material (or modified versions of
  it) with contractual assumptions of liability to the recipient, for
  any liability that these contractual assumptions directly impose on
  those licensors and authors.

All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10.  If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term.  If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly
provided under this License.  Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License.  If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or
run a copy of the Program.  Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance.  However,
nothing other than this License grants you permission to propagate or
modify any covered work.  These actions infringe copyright if you do
not accept this License.  Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License.  You are not responsible
for enforcing compliance by third parties with this License.

An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations.  If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License.  For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

11. Patents.

A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based.  The
work thus licensed is called the contributor's "contributor version".

A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version.  For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement).  To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients.  "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License.  You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.

12. No Surrender of Others' Freedom.

If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all.  For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work.  The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number.  If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation.  If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.

If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

Later license versions may give you additional or different
permissions.  However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.

If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
            END OF TERMS AND CONDITIONS
  How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

  <one line to give the program's name and a brief idea of what it does.>
  Copyright (C) <year>  <name of author>

  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:

  <program>  Copyright (C) <year>  <name of author>
  This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
  This is free software, and you are welcome to redistribute it
  under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License.  Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".

You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.

The GNU General Public License does not permit incorporating your program
into proprietary programs.  If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library.  If this is what you want to do, use the GNU Lesser General
Public License instead of this License.  But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.

Keyboard Shortcuts

Search This Website

Table of Contents Hide

Website Info (start here!)

Units

Credits

Website Update History (Last update: ) Show

2025-09-26: “Eigenvalues and Linear Transformations” Update Hide

New Content

Section Improvements

2025-09-07: The Summer Update Hide

New Content

Section Improvements

2025-05-17: Initial Release Show

Website Content

Website Settings Show

What Is This Website? Hide

Unit 1: Systems of Linear Equations Show

Intro to Systems of Linear Equations Show

Vectors and Matrices Show

Representing Systems of Linear Equations Show

Reduced Row-Echelon Form Show

Gauss-Jordan Elimination Show

Consistent Systems of Equations and Free/Dependent Variables Show

Homogeneous Systems of Equations and Null Spaces Show

Singular and Nonsingular Matrices Show

Nonsingular Matrix Equivalences, Part 1 Show

Unit 2: Vectors Show

Vector Operations Show

Sets and Set Notation Show

Linear Combinations Show

Spanning Sets Show

Linear Dependence and Independence Show

Linear Dependence/Independence and Spans Show

Orthogonality and More Vector Operations Show

Nonsingular Matrix Equivalences, Part 2 Show

Unit 3: Matrices and Determinants Show

Matrix Operations Show

Matrix Multiplication Show

Matrix Inverses Show

Column and Row Spaces Show

Elementary Matrices Show

Determinants of Matrices Show

Nonsingular Matrix Equivalences, Part 3 Show

Unit 4: Vector Spaces Show

Intro to Vector Spaces Show

Vector Spaces: Subspaces Show

Vector Spaces: Linear Combinations and Spans Show

Vector Spaces: Linear Independence Show

Vector Spaces: Bases Show

Vector Spaces: Dimension Show

Nonsingular Matrix Equivalences, Part 4 Show

Unit 5: Eigenvalues and Eigenvectors Show

Intro to Eigenvalues and Eigenvectors Show

Unit 6: Linear Transformations Show

Intro to Linear Transformations Show

Injective and Surjective Linear Transformations Show

Credits / Special Thanks Hide

Tools used to create this page

Fonts used on this page

Special thanks

Legal information

Table of Contents

Website Update History (Last update: )

2025-09-26: “Eigenvalues and Linear Transformations” Update

2025-09-07: The Summer Update

2025-05-17: Initial Release

Website Settings

What Is This Website?

Unit 1: Systems of Linear Equations

Intro to Systems of Linear Equations

Vectors and Matrices

Representing Systems of Linear Equations

Reduced Row-Echelon Form

Gauss-Jordan Elimination

Consistent Systems of Equations and Free/Dependent Variables

Homogeneous Systems of Equations and Null Spaces

Singular and Nonsingular Matrices

Nonsingular Matrix Equivalences, Part 1

Unit 2: Vectors

Vector Operations

Sets and Set Notation

Linear Combinations

Spanning Sets

Linear Dependence and Independence

Linear Dependence/Independence and Spans

Orthogonality and More Vector Operations

Nonsingular Matrix Equivalences, Part 2

Unit 3: Matrices and Determinants

Matrix Operations

Matrix Multiplication

Matrix Inverses

Column and Row Spaces

Elementary Matrices

Determinants of Matrices

Nonsingular Matrix Equivalences, Part 3

Unit 4: Vector Spaces

Intro to Vector Spaces

Vector Spaces: Subspaces

Vector Spaces: Linear Combinations and Spans

Vector Spaces: Linear Independence

Vector Spaces: Bases

Vector Spaces: Dimension

Nonsingular Matrix Equivalences, Part 4

Unit 5: Eigenvalues and Eigenvectors

Intro to Eigenvalues and Eigenvectors

Unit 6: Linear Transformations

Intro to Linear Transformations

Injective and Surjective Linear Transformations

Credits / Special Thanks