Notes on Linear Algebra & Learning from Data, Strang
note: in progress!
Pt. 1: Highlights of Linear Algebra
Basic problems studied in part 1:
- Minimize
- Factor
1.1: Multiplication Ax Using Columns of A
We can interpret matrix-vector multiplication $Ax$ by rows (the traditional method) or by columns: the entries of $Ax$ can be interpreted as the inner products of the rows of $A$ with $x$. On the other hand, $Ax = x_1 a_1 + x_2 a_2$ is a combination of the columns $a_1$ and $a_2$ of $A$, weighted by the entries of $x$. Both ways give the same result. The first is computational but unhelpful for intuition. With the vector approach, we understand $Ax$ as a linear combination of the columns of $A$. The combinations of the columns fill out the column space of $A$.
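A quick NumPy sketch of the two pictures (the 2×2 matrix and the vector here are made-up numbers, not from the book):

```python
import numpy as np

A = np.array([[2.0, 5.0],
              [3.0, 1.0]])
x = np.array([4.0, 7.0])

# Row picture: each entry of Ax is an inner product of a row of A with x.
row_picture = np.array([A[0] @ x, A[1] @ x])

# Column picture: Ax is a combination of the columns of A,
# weighted by the entries of x.
column_picture = x[0] * A[:, 0] + x[1] * A[:, 1]

assert np.allclose(row_picture, A @ x)
assert np.allclose(column_picture, A @ x)
print(A @ x)   # [43. 19.]
```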
Independence: scanning the columns of $A$ from left to right, we can construct a matrix $C$ by taking each nonzero column that is independent of the columns already in $C$ and adding it to $C$. For a matrix $A$ with $n$ columns, we will end up with $C$ having $r \le n$ columns, forming a basis for the column space of $A$. The number of columns $r$ in $C$ is the rank of $A$ and the dimension of the column space of $A$ (and of $C$). Hence, the rank of a matrix is the dimension of its column space.
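A minimal sketch of that left-to-right scan (my own helper `build_C`, using `np.linalg.matrix_rank` as the independence test; the 3×3 matrix anticipates the example below):

```python
import numpy as np

def build_C(A, tol=1e-10):
    """Keep each column of A (left to right) only if it is independent
    of the columns already kept; the kept columns form C."""
    kept = []
    for j in range(A.shape[1]):
        candidate = kept + [A[:, j]]
        # Independent iff adding the column raises the rank to len(candidate).
        if np.linalg.matrix_rank(np.column_stack(candidate), tol=tol) == len(candidate):
            kept.append(A[:, j])
    return np.column_stack(kept)

A = np.array([[1.0, 3.0, 8.0],
              [1.0, 2.0, 6.0],
              [0.0, 1.0, 2.0]])
C = build_C(A)
print(C.shape[1], np.linalg.matrix_rank(A))   # both are r = 2
```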
We can connect $C$ to $A$ with a third matrix $R$ s.t. $A = CR$, with shapes $(m \times n) = (m \times r)(r \times n)$.
For example,
$$A = \begin{bmatrix} 1 & 3 & 8 \\ 1 & 2 & 6 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \end{bmatrix} = CR.$$
Notice the multiplication from the column perspective: $C$ times the first column of $R$ is column 1 of $A$, and similarly for the second column. When $C$ multiplies the third column of $R$, we get 2(column 1) + 2(column 2), which is column 3 of $A$.
In fact, $R = \operatorname{rref}(A)$, the row-reduced echelon form of $A$ (without zero rows).
Note also, from the shapes of $C$ and $R$ above, that the number of independent columns equals the number of independent rows: column rank = row rank.
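A sanity check of $A = CR$ on the example above, using SymPy's `rref` for the row reduction (this verification code is mine, not the book's):

```python
import sympy as sp

A = sp.Matrix([[1, 3, 8],
               [1, 2, 6],
               [0, 1, 2]])

# rref() returns the reduced row echelon form and the pivot column indices.
rref_A, pivot_cols = A.rref()

C = A.extract(list(range(A.rows)), list(pivot_cols))   # independent columns of A
R = rref_A[:len(pivot_cols), :]                        # rref of A without zero rows

assert C * R == A      # A = CR
print(C)               # Matrix([[1, 3], [1, 2], [0, 1]])
print(R)               # Matrix([[1, 0, 2], [0, 1, 2]])
```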
The SVD of $A$ is the analogous factorization $A = (U\Sigma)(V^T)$, in which the first factor has orthogonal columns and the second factor has orthogonal rows.
1.2: Matrix-Matrix Multiplication AB
Inner products produce each of the numbers in $AB$. For example, row 2 of $A$ and column 3 of $B$ give the entry $(AB)_{23}$ in $AB$:
$$(AB)_{23} = a_{21}b_{13} + a_{22}b_{23} + \cdots + a_{2n}b_{n3} = \sum_{k=1}^{n} a_{2k}\,b_{k3}.$$
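A one-line NumPy check of that entry (random matrices, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# Entry (2, 3) of AB -- index (1, 2) with zero-based indexing --
# is the inner product of row 2 of A with column 3 of B.
assert np.isclose((A @ B)[1, 2], A[1, :] @ B[:, 2])
```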
The other way to multiply $AB$ is columns of $A$ times rows of $B$.
Note that a column $a$ times a row $b^T$ produces a matrix $ab^T$ of rank one, all columns being multiples of $a$ and all rows being multiples of $b^T$. This is called the outer product. The column space of $ab^T$ is the line in the direction of $a$, and the row space is the line in the direction of $b$.
The full product $AB$ using columns of $A$ times rows of $B$: let $a_1, \dots, a_n$ be the columns of $A$. Then $B$ must have $n$ rows $b_1^T, \dots, b_n^T$. The product $AB$ is the sum of columns $a_k$ times rows $b_k^T$, i.e., the sum of rank 1 matrices:
$$AB = a_1 b_1^T + a_2 b_2^T + \cdots + a_n b_n^T.$$
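The same product assembled column-times-row, with `np.outer` forming each rank 1 piece (again just a sketch on random matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# Sum over k of (column k of A) times (row k of B): n rank 1 matrices.
outer_sum = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

assert np.allclose(outer_sum, A @ B)
assert np.linalg.matrix_rank(np.outer(A[:, 0], B[0, :])) == 1   # each piece has rank 1
```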
The outer product approach is essential to data science because we want to break $A$ into pieces (i.e., rank 1 matrices). A dominant theme in applied linear algebra is to factor $A$ into a product of matrices and look at the pieces of that factorization. Factoring takes longer than multiplying, especially if the pieces involve eigenvalues or singular values.
Five important factorizations are:
- $A = LU$ comes from elimination. Combinations of rows take $A$ to $U$ and $U$ back to $A$. $L$ is lower triangular and $U$ is upper triangular.
- $A = QR$ comes from orthogonalizing the columns of $A$ as in Gram-Schmidt. $Q$ has orthonormal columns ($Q^T Q = I$) and $R$ is upper triangular.
- $S = Q \Lambda Q^T$ comes from the eigenvalues of a symmetric matrix $S = S^T$. Eigenvalues of $S$ on the diagonal of $\Lambda$. Orthonormal eigenvectors of $S$ in the columns of $Q$.
- $A = X \Lambda X^{-1}$ is diagonalization when $A$ is $n \times n$ with $n$ independent eigenvectors. Eigenvalues of $A$ on the diagonal of $\Lambda$. Eigenvectors of $A$ in the columns of $X$.
- $A = U \Sigma V^T$ is the Singular Value Decomposition of any matrix $A$, with singular values in $\Sigma$ and orthonormal singular vectors in the columns of $U$ and $V$.
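A hedged numerical sketch of all five, using NumPy/SciPy routines on a random matrix (note that scipy.linalg.lu returns a permutation $P$ that plain elimination notation leaves out):

```python
import numpy as np
from scipy.linalg import lu, qr, svd

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
S = A + A.T                        # a symmetric matrix for S = Q Lambda Q^T

P, L, U = lu(A)                    # A = P L U   (P handles row exchanges)
Q, R = qr(A)                       # A = Q R     (Q^T Q = I, R upper triangular)
lam, Qs = np.linalg.eigh(S)        # S = Qs diag(lam) Qs^T
evals, X = np.linalg.eig(A)        # A = X diag(evals) X^{-1}  (X invertible here)
Uu, sigma, Vt = svd(A)             # A = U Sigma V^T

assert np.allclose(P @ L @ U, A)
assert np.allclose(Q @ R, A)
assert np.allclose(Qs @ np.diag(lam) @ Qs.T, S)
assert np.allclose(X @ np.diag(evals) @ np.linalg.inv(X), A)
assert np.allclose(Uu @ np.diag(sigma) @ Vt, A)
```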
1.3: The Four Fundamental Subspaces
Every $m \times n$ matrix $A$ of rank $r$ leads to four subspaces: two subspaces of $\mathbf{R}^m$ and two more of $\mathbf{R}^n$.
- Column space $\mathbf{C}(A)$, dim $r$, subspace of $\mathbf{R}^m$
- Row space $\mathbf{C}(A^T)$, dim $r$, subspace of $\mathbf{R}^n$
- Nullspace $\mathbf{N}(A)$, dim $n - r$, subspace of $\mathbf{R}^n$
- Left nullspace $\mathbf{N}(A^T)$, dim $m - r$, subspace of $\mathbf{R}^m$
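A sketch that reads off all four dimensions for the rank 2 example matrix from Section 1.1 (scipy.linalg.null_space gives an orthonormal basis of each nullspace):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 3.0, 8.0],
              [1.0, 2.0, 6.0],
              [0.0, 1.0, 2.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

print("C(A)    dim", r,                        "  subspace of R^m, m =", m)
print("C(A^T)  dim", r,                        "  subspace of R^n, n =", n)
print("N(A)    dim", null_space(A).shape[1],   "  = n - r =", n - r)
print("N(A^T)  dim", null_space(A.T).shape[1], "  = m - r =", m - r)
```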
The ranks of $AB$ and $A + B$:
- Rank of $AB$ $\le$ rank of $A$, rank of $AB$ $\le$ rank of $B$
- Rank of $A + B$ $\le$ (rank of $A$) + (rank of $B$)
- Rank of $A^T A$ = rank of $A A^T$ = rank of $A$ = rank of $A^T$.
- If $A$ is $m \times r$ and $B$ is $r \times n$, both with rank $r$, then $AB$ has rank $r$.
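A quick randomized check of these rank facts (random matrices only illustrate the statements, they don't prove them; `rank` below is just np.linalg.matrix_rank):

```python
import numpy as np

rank = np.linalg.matrix_rank
rng = np.random.default_rng(3)

A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 6))   # 5 x 6, rank 3
B = rng.standard_normal((6, 4))
A4, B4 = A[:, :4], B[:5, :]                                     # same shape, for A + B

assert rank(A @ B) <= min(rank(A), rank(B))
assert rank(A4 + B4) <= rank(A4) + rank(B4)
assert rank(A.T @ A) == rank(A @ A.T) == rank(A) == rank(A.T)

# If A is m x r and B is r x n, both with rank r, then AB has rank r.
m, r, n = 5, 3, 6
Am = rng.standard_normal((m, r))   # rank r with probability 1
Bn = rng.standard_normal((r, n))   # rank r with probability 1
assert rank(Am @ Bn) == r
```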