The Column-Row Factorization A = CR
A new start for linear algebra
Gilbert Strang
MIT
Linear Algebra for Everyone (2020)
\[
A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 3 & 7 \\ 1 & 3 & 5 \end{bmatrix}
\qquad m = 3 \text{ rows}, \quad n = 3 \text{ columns}
\]
Are the columns independent? Go left to right.
Column 1: OK. Column 2: OK. Column 3: ?
Column 3 = 2 (Column 1) + 1 (Column 2): dependent.
Column 3 is in the plane of Columns 1 and 2
[Figure: the vectors a1 = (1, 2, 1), 2a1 = (2, 4, 2), a2, and a3 = 2a1 + a2, all lying in one plane]
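Matrix C of independent columns in A:
\[
C = \begin{bmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{bmatrix}
\quad \text{from} \quad
A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 3 & 7 \\ 1 & 3 & 5 \end{bmatrix}
\]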
The matrix A has column rank r = 2
The column space of A is a plane in R3
The column space contains all combinations of the columns
Column space of A = Column space of C (but A ≠ C)
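The left-to-right scan can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `rank_of_vectors` and `independent_columns` are hypothetical, and exact fractions are used so that "dependent" means exactly dependent, with no rounding tolerance.

```python
from fractions import Fraction

def rank_of_vectors(vectors):
    """Number of independent vectors, found by exact elimination."""
    m = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(m[0]) if m else 0):
        # look for a nonzero entry in this position, at or below row r
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent_columns(A):
    """Go left to right; keep a column only if it is not a
    combination of the columns already kept."""
    cols = list(zip(*A))
    kept = []
    for j, col in enumerate(cols):
        candidate = [cols[k] for k in kept] + [col]
        if rank_of_vectors(candidate) == len(candidate):
            kept.append(j)
    return kept

A = [[1, 3, 5],
     [2, 3, 7],
     [1, 3, 5]]
kept = independent_columns(A)
C = [[row[j] for j in kept] for row in A]
print(kept)  # [0, 1]
print(C)     # [[1, 3], [2, 3], [1, 3]]
```

The number of kept columns is the column rank r, here 2.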
Express the steps by multiplications Ax and CR
Ax = matrix times vector = combination of columns of A
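\[
Ax = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 3 & 7 \\ 1 & 3 & 5 \end{bmatrix}
\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}
= 2\,(\text{Column 1}) + 1\,(\text{Column 2}) - 1\,(\text{Column 3})
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
(The three zeros are also the dot products of x with the rows of A.)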
CR = Matrix times matrix = C times each column of R
Use dot products (low level) or take combinations of the columns of C
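\[
\begin{bmatrix} 1 & 3 & 5 \\ 2 & 3 & 7 \\ 1 & 3 & 5 \end{bmatrix}
=
\begin{bmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix}
\quad \text{is } A = CR
\]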
Check C times each column of R
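\[
\begin{bmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}
\]
\[
\begin{bmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} 2 \\ 1 \end{bmatrix}
= 2\,(\text{Column 1}) + (\text{Column 2})
= \begin{bmatrix} 5 \\ 7 \\ 5 \end{bmatrix}
\qquad 2a_1 + a_2 = a_3
\]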
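The column view of multiplication can be checked directly. The sketch below is illustrative only; the function names `combine_columns` and `multiply_by_columns` are made up for this example. It builds CR one column at a time, each column being a combination of the columns of C.

```python
def combine_columns(C, x):
    """C times x, computed as a combination of the columns of C."""
    result = [0] * len(C)
    for j, weight in enumerate(x):
        column = [row[j] for row in C]
        result = [acc + weight * entry for acc, entry in zip(result, column)]
    return result

def multiply_by_columns(C, R):
    """CR, where column j of CR is C times column j of R."""
    product_columns = [combine_columns(C, col) for col in zip(*R)]
    return [list(row) for row in zip(*product_columns)]

A = [[1, 3, 5],
     [2, 3, 7],
     [1, 3, 5]]
C = [[1, 3],
     [2, 3],
     [1, 3]]
R = [[1, 0, 2],
     [0, 1, 1]]

print(multiply_by_columns(C, R) == A)   # True
print(combine_columns(C, [2, 1]))       # [5, 7, 5]
print(combine_columns(A, [2, 1, -1]))   # [0, 0, 0]
```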
How to find CR for every A ? Elimination !
A = CR is (m by n) = (m by r) (r by n)
R = [I  F] P and A = CR = [C  CF] P
In reality we compute R before C !! The columns of I in R tell us
the independent columns of A in C.
The permutation P puts those columns in the right places (if they are not
the first r columns of A)
R = reduced row echelon form rref(A) (zero rows removed)
Here are the steps to establish A = CR
We know EA = rref(A) and A = E^{-1} rref(A), where E is m × m.
Remove the m − r zero rows from rref(A) and the m − r columns of E^{-1} that multiply them.
This leaves A = C [I  F] P = CR. The dependent columns of A are CF.
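A minimal sketch of these steps, assuming exact arithmetic with fractions and a hand-written Gauss-Jordan elimination (the names `rref` and `cr_factor` are chosen for this example and are not taken from any library): compute rref(A), drop its zero rows to get R, and read off C as the pivot columns of A.

```python
from fractions import Fraction

def rref(A):
    """Reduced row echelon form by exact Gauss-Jordan elimination.
    Returns the reduced matrix and the list of pivot column indices."""
    M = [[Fraction(x) for x in row] for row in A]
    pivots = []
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue                      # no pivot: a dependent column
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][col] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
        if r == len(M):
            break
    return M, pivots

def cr_factor(A):
    """R = rref(A) with zero rows removed; C = pivot columns of A."""
    reduced, pivots = rref(A)
    R = reduced[:len(pivots)]
    C = [[row[j] for j in pivots] for row in A]
    return C, R

A = [[1, 3, 5],
     [2, 3, 7],
     [1, 3, 5]]
C, R = cr_factor(A)
assert C == [[1, 3], [2, 3], [1, 3]]
assert R == [[1, 0, 2], [0, 1, 1]]
```

For a matrix such as [[1, 2, 3], [2, 4, 7]] the pivot columns are the first and third, so the identity columns in R are not adjacent; that is the case the permutation P accounts for.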
C has r independent columns. R has r independent rows.
Rows of A = CR are combinations of the rows of R
Row space of A = Row space of R !
If A has 2 independent columns in C then A has 2 independent rows in R.
Column rank = Row rank = r. GREAT THEOREM
Look at A = CR both ways: combine columns of C, combine rows of R.
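r = 1: Rank one matrix A = (1 column) (1 row)
\[
\begin{bmatrix} 1 & 2 & 10 & 100 \\ 2 & 4 & 20 & 200 \\ 1 & 2 & 10 & 100 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 10 & 100 \end{bmatrix}
= CR
\]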
If the column space is a line in 3-dimensional space
then the row space is a line in 4-dimensional space
A adds up (Column k of C) (Row k of R) = New way to multiply CR
Rank r matrix = Sum of r matrices of rank 1
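The "columns times rows" way to multiply can be sketched as follows. The helper names `outer` and `multiply_by_rank_one_sum` are hypothetical; each term (Column k of C)(Row k of R) is a rank one matrix, and the r terms add up to A.

```python
def outer(column, row):
    """Rank one matrix (column)(row)."""
    return [[c * r for r in row] for c in column]

def multiply_by_rank_one_sum(C, R):
    """CR as the sum over k of (Column k of C)(Row k of R)."""
    m, n = len(C), len(R[0])
    total = [[0] * n for _ in range(m)]
    for k in range(len(R)):
        column_k = [row[k] for row in C]
        piece = outer(column_k, R[k])
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, piece)]
    return total

# Rank one example: one column times one row
print(outer([1, 2, 1], [1, 2, 10, 100]))
# [[1, 2, 10, 100], [2, 4, 20, 200], [1, 2, 10, 100]]

# Rank two example: two rank one pieces add up to A
C = [[1, 3], [2, 3], [1, 3]]
R = [[1, 0, 2], [0, 1, 1]]
print(multiply_by_rank_one_sum(C, R))
# [[1, 3, 5], [2, 3, 7], [1, 3, 5]]
```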
Geometry of A : Four Fundamental Subspaces
Column space C(A) = all combinations of columns = all Ax (dimension r)
Row space C(A^T) = all combinations of columns of A^T = all A^T y (dimension r)
Nullspace N(A) = all solutions x to Ax = 0 (dimension n − r)
Nullspace of A^T N(A^T) = all solutions y to A^T y = 0 (dimension m − r)
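Row space is orthogonal to nullspace!
\[
\begin{bmatrix} \text{row } 1 \\ \vdots \\ \text{row } m \end{bmatrix} x
= \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}
\]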
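As a small check of the dimension count and the orthogonality, the sketch below builds nullspace vectors from R. It assumes R and its pivot columns are supplied by hand (here from the 3 by 3 example A = CR), and `special_solutions` is a name invented for this illustration: each free column gets the value 1, and the pivot entries are filled in so that Rx = 0.

```python
def special_solutions(R, pivots):
    """One nullspace vector per free column of R (zero rows removed)."""
    n = len(R[0])
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for j in free:
        x = [0] * n
        x[j] = 1                      # free variable set to 1
        for i, p in enumerate(pivots):
            x[p] = -R[i][j]           # pivot variables cancel column j
        basis.append(x)
    return basis

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 3, 5],
     [2, 3, 7],
     [1, 3, 5]]
R = [[1, 0, 2],
     [0, 1, 1]]
basis = special_solutions(R, pivots=[0, 1])
print(basis)                                      # [[-2, -1, 1]]
print([dot(row, basis[0]) for row in A])          # [0, 0, 0]
```

Here n − r = 3 − 2 = 1, so the nullspace is a line, and its vector is perpendicular to every row of A.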
m rows and n columns, r independent rows and columns
[Figure: BIG PICTURE OF LINEAR ALGEBRA. The row space and the nullspace of A (Ax = 0) lie in R^n; the column space (all Ax) and the nullspace of A^T lie in R^m. Ax = b takes x to b in the column space.]
Square invertible matrices m = n = r
Nullspaces = zero vector only
Magic factorization A = C W^{-1} R∗
C = r independent columns of A R∗ = r independent rows of A
W = r × r matrix = intersection of columns in C and rows in R∗
The factorization is just block elimination on A. The block pivot is W .
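\[
A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 3 & 7 \\ 1 & 3 & 5 \end{bmatrix}
=
\begin{bmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{bmatrix}
\begin{bmatrix} 1 & 3 \\ 2 & 3 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & 3 & 5 \\ 2 & 3 & 7 \end{bmatrix}
\]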
W is invertible and W R = R∗ from r rows of CR = A
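The 3 by 3 example can be verified numerically. This is a sketch under the assumption that r = 2, so the explicit 2 by 2 inverse formula is enough; `inverse_2x2` and `matmul` are helper names made up for the illustration, and fractions keep the arithmetic exact.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary matrix product using dot products of rows and columns."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inverse_2x2(W):
    """Exact inverse of a 2 by 2 matrix with nonzero determinant."""
    (a, b), (c, d) = W
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det],
            [-c / det, a / det]]

A = [[1, 3, 5],
     [2, 3, 7],
     [1, 3, 5]]
C = [row[:2] for row in A]        # first two columns of A
R_star = A[:2]                    # first two rows of A
W = [row[:2] for row in R_star]   # where those columns and rows intersect

R = matmul(inverse_2x2(W), R_star)        # W^{-1} R* recovers R
assert R == [[1, 0, 2], [0, 1, 1]]
assert matmul(W, R) == R_star             # W R = R*
assert matmul(C, R) == A                  # A = C W^{-1} R*
```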
Randomized linear algebra A ≈ C W^{-1} R∗
Large matrices / thin samples “Skeleton factors”
References to CUR
R. Penrose (1956), On best approximate solutions of linear matrix equations, Math. Proc. Cambridge Phil. Soc. 52, 17-19.
Hamm and Huang (2020), Perspectives on CUR Decompositions, arXiv 1907.12668 and ACHA 48.
Goreinov, Tyrtyshnikov, and Zamarashkin (1997), Pseudoskeleton approximation, LAA 261.
Martinsson and Tropp (2020), Randomized numerical linear algebra: Foundations and Algorithms, Acta Numerica and arXiv:2002.01387.
Randomized Numerical Linear Algebra A ≈ CUR
Famous Factorizations of a Matrix
A = LU = (lower triangular L) (upper triangular U)
A = QR = (orthogonal columns in Q) (upper triangular R)
S = QΛQ^T = (eigenvectors in Q) (eigenvalues in Λ)
A = UΣV^T = (singular vectors in U and V) (singular values in Σ)
A v_k = σ_k u_k (orthogonal vectors v mapped to orthogonal vectors u)
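\[
\begin{bmatrix} 3 & 0 \\ 4 & 5 \end{bmatrix}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 9 \end{bmatrix}
\qquad
\begin{bmatrix} 3 & 0 \\ 4 & 5 \end{bmatrix}
\begin{bmatrix} -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} -3 \\ 1 \end{bmatrix}
\]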
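The 2 by 2 example can be checked with a few lines of floating point arithmetic. This sketch assumes the input vectors are normalized to unit length (the slide shows them unnormalized), and the helper names are hypothetical. It confirms that orthogonal v's map to orthogonal u's and reads off the singular values as the lengths of Av.

```python
import math

A = [[3, 0],
     [4, 5]]

def apply(M, v):
    """Matrix times vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

s = 1 / math.sqrt(2)
v1, v2 = [s, s], [-s, s]             # unit vectors along (1, 1) and (-1, 1)
Av1, Av2 = apply(A, v1), apply(A, v2)
sigma1, sigma2 = norm(Av1), norm(Av2)
u1 = [x / sigma1 for x in Av1]       # unit vector along (3, 9)
u2 = [x / sigma2 for x in Av2]       # unit vector along (-3, 1)

print(round(sigma1, 4), round(sigma2, 4))       # 6.7082 2.2361
print(math.isclose(dot(u1, u2), 0, abs_tol=1e-12))   # True
```

The two values are √45 and √5, and their product 15 equals the determinant of A.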
Full rank r = m = n     r = n indep. columns        r = m indep. rows
A is invertible         A^T A is invertible         A A^T is invertible
Solve Ax = b            A^T A x̂ = A^T b             A A^T y = b → x = A^T y
x exact solution        x̂ least squares solution    x minimum norm solution
The minimum norm solution x has no nullspace component / use the
pseudoinverse x = A^+ b
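The second and third cases can be tried on small matrices. In this sketch the matrices and right-hand sides are illustrative choices, not taken from the slides, and `solve` is a hand-written exact Gauss-Jordan solver that assumes its square matrix is invertible.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def solve(M, b):
    """Solve Mx = b exactly; M is assumed square and invertible."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)]
           for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * c for a, c in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

# r = n independent columns: least squares via A^T A x_hat = A^T b
A_tall = [[1, 0], [1, 1], [1, 2]]
b_tall = [6, 0, 0]
At = transpose(A_tall)
x_hat = solve(matmul(At, A_tall), matvec(At, b_tall))
assert x_hat == [5, -3]

# r = m independent rows: minimum norm via A A^T y = b, then x = A^T y
A_wide = [[1, 0, 1], [0, 1, 1]]
b_wide = [3, 3]
y = solve(matmul(A_wide, transpose(A_wide)), b_wide)
x_min = matvec(transpose(A_wide), y)
assert x_min == [1, 1, 2]
assert matvec(A_wide, x_min) == b_wide
```

In the wide example the nullspace is the line through (1, 1, −1), and x = (1, 1, 2) is perpendicular to it, which is what "no nullspace component" means.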
Double Descent of Error
[Figure: double descent of error across the regimes m > n, m = n, and m < n]
Deep learning has found that overfitting can help ! A big question in the
theory of neural networks using ReLU
Video Lectures [Link]/courses/mathematics YouTube/mitocw
Math 18.06 Linear Algebra (including 2020 Vision)
Math 18.065 Deep Learning
Books
Introduction to Linear Algebra (2016) [Link]/linearalgebra
Linear Algebra & Learning from Data (2019) [Link]/learningfromdata
Linear Algebra for Everyone (2020) [Link]/everyone