2200 Module NOTES
Zilore Mumba
Part-Time Lecturer
Department of Mathematics and Statistics
University of Zambia
April, 2016
Contents
Preface
Introduction
1 Linear Equations
2 Matrices
2.1 Introduction
3 Determinants
4 Vector Spaces
4.10 Subspaces
4.18 Coordinates
5.1.1 Preliminaries
6.1 Introduction
6.2 Preliminaries
Preface
This MAT2200 course module, An Introduction to Linear Algebra, was prepared to be used as a
study aid for students taking the course in the Department of Mathematics and Statistics of
the University of Zambia during the 2015 academic year. This module does not contain any
original work by the author, nor is it intended to replace the recommended textbook for
the course.
The material content was extracted from two main sources: Linear Algebra, an Introduction,
by A. Morris (Van Nostrand, 2004) and Linear Algebra, 4th edition, by S. Lipschutz and M.
Lipson (McGraw-Hill, 2009), together with a little material from other sources. I tried as
much as possible to confine the material to (the best of my understanding of) the MAT2200
syllabus as stipulated by the Department of Mathematics and Statistics.
Each chapter of this module starts with a list of expected outputs of that chapter. This is
intended to help students identify the knowledge level expected of them at the end of the
chapter, and therefore hopefully assist them in their study.
I have spent a good amount of time proofreading the document to ensure error-free content
in the text as well as in the solutions to problems. However, I cannot guarantee that there are
no errors, especially when a document is proofread only by its author. I would therefore
appreciate being informed of any errors so that I can correct them.
Zilore Mumba
Part-Time Lecturer
Department of Mathematics and Statistics
University of Zambia
Introduction
Linear algebra is the study of linear transformations (any operation that transforms an input
to an output) and their algebraic properties. A transformation is linear if (a) every
amplification of the input causes a corresponding amplification of the output (e.g. doubling of
the input causes a doubling of the output), and (b) adding inputs together leads to adding of
their respective outputs.
A good understanding of linear algebra is essential for research in almost all areas, not only
in physics and mathematics (e.g. in engineering, image compression, linear programming,
to mention only a few) but also in finance, economics and sociology. It forms the core of
research in several modern applications, such as quantum information theory. Efficient
methods for handling large matrices lie at the heart of fast algorithms for retrieval of data
and information.
Many abstract objects which will be encountered in various topics of linear algebra, such as
"change of basis", "linear transformations", "bilinear forms", etc., can conveniently be
represented by matrices.
We can understand the behavior of linear transformations in terms of matrix multiplication.
This is not quite saying that linear transformations are the same as matrices, for two
reasons: i) the correspondence only works for finite dimensional spaces; and ii) the matrix
we get depends on the basis we choose, e.g. a single linear transformation can correspond
to many different matrices depending on what bases we pick.
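The two defining properties of linearity can be checked concretely for the map T(x) = Ax. A minimal Python sketch (not part of the original notes; the matrix A and the test vectors are arbitrary choices):

```python
from fractions import Fraction

def mat_vec(A, x):
    """Apply the map T(x) = Ax, with the matrix A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2], [3, -4]]        # any fixed matrix defines a map T
x, y = [1, 2], [3, 5]
k = Fraction(7, 2)

# (a) amplifying the input amplifies the output: T(kx) = k T(x)
assert mat_vec(A, [k * xi for xi in x]) == [k * ti for ti in mat_vec(A, x)]

# (b) adding inputs adds the outputs: T(x + y) = T(x) + T(y)
assert mat_vec(A, [xi + yi for xi, yi in zip(x, y)]) == \
    [s + t for s, t in zip(mat_vec(A, x), mat_vec(A, y))]
```

Both assertions hold for every choice of A, which is the sense in which matrix maps are linear.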
But due to the convenience of representing linear transformations by matrices, we will look at
manipulations on matrices in a little more detail, and use matrix representation throughout
the course.
In linear algebra we shall manipulate not just scalars, but also vectors, vector spaces,
matrices, and linear transformations. These manipulations will include familiar operations
such as addition, multiplication, and reciprocal (multiplicative inverse), but also new
operations such as span, dimension, transpose, determinant, trace, eigenvalue, eigenvector,
and characteristic polynomial.
Chapter 1: Linear Equations
Outputs
1. Understanding of two methods of finding solutions to systems of linear equations:
i) Cramer's rule
ii) the Gauss Elimination Method
Solutions
We can solve these systems of equations using:
a) Determinants (Cramer's rule), or
b) the Gauss Elimination Method
e.g., for system i)
x1 − 2x2 + x3 = 1
2x1 − x2 + x3 = 2
4x1 + x2 − x3 = 1
using Cramer's rule
We define:
      | 1 −2  1 |
  ∆ = | 2 −1  1 | = {1(1 − 1) − (−2)(−2 − 4) + 1(2 + 4)} = −6
      | 4  1 −1 |

        | 1 −2  1 |
  ∆x1 = | 2 −1  1 | = {1(1 − 1) − (−2)(−2 − 1) + 1(2 + 1)} = −3
        | 1  1 −1 |

        | 1 1  1 |
  ∆x2 = | 2 2  1 | = {1(−2 − 1) − 1(−2 − 4) + 1(2 − 8)} = −3
        | 4 1 −1 |

        | 1 −2 1 |
  ∆x3 = | 2 −1 2 | = {1(−1 − 2) − (−2)(2 − 8) + 1(2 + 4)} = −9
        | 4  1 1 |

Hence:
  x1 = ∆x1/∆ = −3/−6 = 1/2
  x2 = ∆x2/∆ = −3/−6 = 1/2
  x3 = ∆x3/∆ = −9/−6 = 3/2
Using the Gauss Elimination Method, we reduce the systems i) -iii) to simpler systems of
equations by a process of elimination.
solution
to i)
x1 − 2x2 + x3 = 1
2x1 − x2 + x3 = 2
4x1 + x2 − x3 = 1
Equation 2 → Equation 2 − 2×Equation 1: 3x2 − x3 = 0
Equation 3 → Equation 3 − 4×Equation 1: 9x2 − 5x3 = −3
Equation 3 → Equation 3 − 3×Equation 2: −2x3 = −3
Back-substitution then gives x3 = 3/2, x2 = 1/2, x1 = 1/2, in agreement with Cramer's rule.
solution to ii)
x1 + x2 = 2
2x1 + 2x2 = 3
Equation 2 → Equation 2 − 2×Equation 1:
x1 + x2 = 2
0 = −1
This system is inconsistent, giving the false statement 0 = −1.
So the system has no solution.
solution to iii)
x1 − x2 + x3 = 1
2x1 + x2 + x3 = 2
Equation 2 → Equation 2 − 2×Equation 1:
x1 − x2 + x3 = 1
3x2 − x3 = 0
From this simpler system we see that x2 = (1/3)x3.
Whatever value we assign to x3 will give a value of x2, and by substitution into equation 1, a
value of x1.
Let x3 = t:
if t = 0: x1 = 1, x2 = 0 = x3;
if t = 3: x1 = −1, x2 = 1, x3 = 3.
From the above three examples we note that a system of linear equations can have:
i) no solution
ii) a unique solution
iii) more than one solution
We shall develop the solution process for systems of linear equations using the Gauss
Elimination Method because it is more efficient.
We shall also want to understand:
i) when does the solution exist?
ii) when is the solution the unique solution?
We shall carry out our study mainly through the use of matrices.
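The elimination-and-back-substitution procedure can be sketched in code. A small Python illustration using exact rational arithmetic (the helper name gauss_solve is ours, not from the notes), applied to system i):

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve a square system Ax = b by Gaussian elimination with
    back-substitution, using exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(A, b)]
    for i in range(n):
        # choose a row with a nonzero pivot in column i and swap it up
        p = next(r for r in range(i, n) if M[r][i] != 0)
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):          # eliminate below the pivot
            f = M[r][i] / M[i][i]
            M[r] = [x - f * y for x, y in zip(M[r], M[i])]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):         # back-substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# system i): x1 - 2x2 + x3 = 1, 2x1 - x2 + x3 = 2, 4x1 + x2 - x3 = 1
A = [[1, -2, 1], [2, -1, 1], [4, 1, -1]]
b = [1, 2, 1]
print(gauss_solve(A, b))   # [Fraction(1, 2), Fraction(1, 2), Fraction(3, 2)]
```

The result (1/2, 1/2, 3/2) agrees with the value found by Cramer's rule above.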
Chapter 2: Matrices
Outputs
Ability to:
2. Carry out elementary row/column operations to reduce matrices to echelon and to row
canonical form
Preamble
We can list some of the subsets of the number system on which we can perform the usual
operations of addition and multiplication.
the set of real (rational and irrational) numbers → the real field R
the set of rational numbers → the rational field Q
the set of complex numbers → the complex field C
These subsets will form the basis of this course. We will frequently refer to an arbitrary field
K, which may be any of the fields above.
2.1 Introduction
Let K be an arbitrary field. A rectangular array of elements of K

a11 a12 . . . a1n
a21 a22 . . . a2n
 .   .  . . .  .
am1 am2 . . . amn

where the aij are scalars in K, is called a matrix. The above matrix is also denoted
(aij), i = 1, 2, ..., m, j = 1, 2, ..., n, or simply (aij).
The element aij , called the ij entry or ij component appears in the ith row and jth column. A
matrix with m rows and n columns is called an m by n or mxn matrix.
e.g.
a)
1 0 −3 1
2 1  3 1
1 0  1 1
is a 3x4 matrix, b)
 1 2 1
 0 2 0
−3 3 1
 1 1 1
is a 4x3 matrix, and c)
1 3 1
2 1 4
4 7 6
is a 3x3 or 3-square matrix.
A matrix, e.g. the matrix in c) above, could be considered as the coefficient matrix of the
system of homogeneous linear equations
x + 3y + z = 0
2x + y + 4z = 0
4x + 7y + 6z = 0
or
as the augmented matrix of the system of non-homogeneous linear equations
x + 3y = 1
2x + y = 4
4x + 7y = 6
We shall see how matrices may be used to find solutions to these systems.
Matrices are usually denoted by capital letters, A, B, etc., and elements of the field K by
lower case letters, a, b, etc.
Two matrices A and B are equal, denoted A=B, if they have the same shape and the
corresponding elements are equal.
A matrix with one row is also called a row vector and a matrix with one column is called a
column vector.
e.g. (1 2 3) is a 1x3 matrix or row vector, and
1
2
0
is a 3x1 matrix or column vector.
A matrix whose entries are all zero is called a zero matrix and is denoted by 0.
e.g.
0 0 . . . 0
0 0 . . . 0
. .  . . .
0 0 . . . 0
is a zero matrix.
a) The Diagonal (or main diagonal) of the n-square matrix A = (aij) consists of the
elements with the same subscripts, that is, a11, a22, ..., ann.
e.g. A =
1 2 3
4 5 6
7 8 9
is a 3-square matrix with diagonal elements 1, 5, 9.
b) A Diagonal Matrix is a square matrix whose non-diagonal entries are all zero, i.e.:
A =
a11  0  . . .  0
 0  a22 . . .  0
 .   .  . . .  .
 0   0  . . . ann
is a diagonal matrix.
e.g. A =
1 0 0
0 1 0
0 0 2
is a diagonal matrix.
c) An Upper Triangular Matrix (or simply triangular matrix) is a square matrix whose
entries below the main diagonal are all zero, i.e.:
A =
a11 a12 . . . a1n
 0  a22 . . . a2n
 .   .  . . .  .
 0   0  . . . ann
is an upper triangular matrix.
e.g. A =
1 2 3
0 1 3
0 0 2
is upper triangular.
d) A Lower Triangular Matrix is a square matrix whose entries above the main diagonal are
all zero, i.e.:
A =
a11  0  . . .  0
a21 a22 . . .  0
 .   .  . . .  .
an1 an2 . . . ann
is a lower triangular matrix.
e.g. A =
1 0 0
2 1 0
3 3 2
is lower triangular.
e) The trace of A, written tr(A), is the sum of the diagonal elements. Namely,
tr(A) = a11 + a22 + a33 + . . . + ann
The following theorem applies.
Theorem 2.2.1. Suppose A = (aij) and B = (bij) are n-square matrices and k is a scalar.
Then
i) tr(A + B) = tr(A) + tr(B)
ii) tr(kA) = ktr(A)
iii) tr(AT ) = tr(A)
iv) tr(AB) = tr(BA).
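Parts i)–iv) of Theorem 2.2.1 can be verified numerically. A rough Python check (matrices as lists of rows; the sample matrices are arbitrary choices, not from the notes):

```python
def matmul(A, B):
    """Matrix product, with matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def tr(A):
    """Trace: the sum of the diagonal elements."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
B = [[2, 0, 1], [1, 3, 5], [0, 4, 2]]
k = 3

add = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
assert tr(add) == tr(A) + tr(B)                               # i)
assert tr([[k * x for x in row] for row in A]) == k * tr(A)   # ii)
assert tr(transpose(A)) == tr(A)                              # iii)
assert tr(matmul(A, B)) == tr(matmul(B, A))                   # iv), even though AB != BA
```

Part iv) is the most striking: the products AB and BA are different matrices, yet their traces agree.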
Theorem 2.3.1. i) (A + B) + C = A + (B + C)
ii) A + 0 = A
iii) A + (−A) = 0
iv) A + B = B + A
v) k1 (A + B) = k1 A + k1 B
vi) (k1 + k2)A = k1 A + k2 A
vii) (k1 k2 )A = k1 (k2 A)
viii) 1A = A and 0A = 0
Using vi) and viii) above, we can also show that A + A = 2A, A + A + A = 3A, etc.
Amn Bnq = Cmq, where Cij = ai1 b1j + ai2 b2j + ... + ain bnj = Σ_{k=1}^{n} aik bkj
Writing S = AB and T = BC, and multiplying S by C (i.e. AB by C), the element in the ith row
and the lth column of (AB)C is
si1 c1l + si2 c2l + ... + sin cnl = Σ_{k=1}^{n} sik ckl = Σ_{k=1}^{n} Σ_{j=1}^{m} (aij bjk) ckl
On the other hand, multiplying A by T (i.e. A by BC), the element in the ith row and the lth
column of A(BC) is
ai1 t1l + ai2 t2l + ... + aim tml = Σ_{j=1}^{m} aij tjl = Σ_{j=1}^{m} Σ_{k=1}^{n} aij (bjk ckl)
The above sums are equal.
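The equality of the two sums is exactly the associative law (AB)C = A(BC), which can be checked directly. A small Python sketch (arbitrary sample matrices, with shapes chosen so that both products are defined):

```python
def matmul(A, B):
    """Entry (i, l) of AB is the sum over k of a_ik * b_kl, as in the text."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# shapes: (2x3)(3x2)(2x2), so both groupings make sense
A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
C = [[1, -1], [2, 0]]

assert matmul(matmul(A, B), C) == matmul(A, matmul(B, C))   # (AB)C = A(BC)
```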
Transpose
The transpose of a matrix A, written AT is the matrix obtained by writing the rows of A, in
order, as columns.
e.g. if A =
1 2 1
3 4 7
then AT =
1 3
2 4
1 7
If A is an mxn matrix, AT is an nxm matrix. The transpose operation on matrices satisfies the
following properties
2. The Inverse Matrix of a square matrix A is a matrix B, denoted A−1, with the property that
AB = BA = I, where I is the identity matrix. When it exists, the inverse is unique.
Proof
Let B be an inverse of A, so AB = BA = I, and let C be another inverse of A; then
AC = CA = I
and
C = CI = C(AB) = (CA)B = IB = B.
Example 2.5.1. Find the inverse of A =
2 5
1 3
Solution
We seek a matrix B =
a b
c d
such that AB = I.
AB =
2a + 5c  2b + 5d     1 0
 a + 3c   b + 3d  =  0 1
i.e.
2a + 5c = 1    2b + 5d = 0
 a + 3c = 0     b + 3d = 1
Solving for a, b, c, d gives B = A−1 =
 3 −5
−1  2
Example 2.5.2. Find the inverse of A =
1 3 3
1 4 3
1 3 4
Solution
We seek a matrix B =
r s t
u v w
x y z
such that AB = I, or
1 3 3   r s t     1 0 0
1 4 3   u v w  =  0 1 0
1 3 4   x y z     0 0 1
If a matrix has an inverse it is said to be invertible (or nonsingular, and det A ≠ 0).
Otherwise the matrix is called noninvertible (or singular).
If A and B are invertible nxn matrices, then AB is invertible and (AB)−1 = B−1 A−1.
Proof
If A and B are invertible, then
AA−1 = I = A−1 A, and
BB−1 = I = B−1 B, so
(AB)(B−1 A−1) = A(BB−1)A−1 = I and (B−1 A−1)(AB) = I. Hence AB is invertible, and since
the inverse is unique,
(AB)−1 = B−1 A−1.
By induction, (A1 A2 ... An)−1 = An−1 A(n−1)−1 ... A1−1.
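The rule (AB)−1 = B−1 A−1 can be checked on 2x2 matrices, where the inverse has a closed form. A Python sketch (inv2 uses the standard 2x2 adjoint formula; the matrix B below is an arbitrary extra example, not from the notes):

```python
from fractions import Fraction

def inv2(M):
    """Inverse of a 2x2 matrix via the adjoint formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 5], [1, 3]]      # the matrix of Example 2.5.1
B = [[1, 2], [1, 3]]      # an arbitrary second invertible matrix

assert inv2(A) == [[3, -5], [-1, 2]]                     # matches the example
assert inv2(matmul(A, B)) == matmul(inv2(B), inv2(A))    # (AB)^-1 = B^-1 A^-1
```

Note the reversal of the order of the factors on the right-hand side.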
In a skew-symmetric matrix (AT = −A) the diagonal elements are all zero, since the condition
forces a11 = −a11, a22 = −a22, a33 = −a33, and so each aii = 0.
6. A Normal Matrix A = (aij) is one which commutes with its transpose AT, i.e.
AAT = AT A. If A is symmetric, orthogonal, or skew-symmetric, then A is normal.
There are also other normal matrices.
e.g. Let A =
6 −3
3  6
Then AAT =
6 −3   6  3     45  0
3  6  −3  6  =   0 45
and AT A =
 6 3   6 −3     45  0
−3 6   3  6  =   0 45
Because AAT = AT A, the matrix A is normal.
Example 2.6.1.
Let A =
1  2
3 −4
If f(x) = 2x² − 3x + 5, then
f(A) = 2A² − 3A + 5I = 2 ( 7 −6 ; −9 22 ) − 3 ( 1 2 ; 3 −4 ) + 5 ( 1 0 ; 0 1 )
     = ( 16 −18 ; −27 61 )
If g(x) = x² + 3x − 10, then
g(A) = A² + 3A − 10I = ( 7 −6 ; −9 22 ) + 3 ( 1 2 ; 3 −4 ) − 10 ( 1 0 ; 0 1 )
     = ( 0 0 ; 0 0 )
Thus A is a zero (root) of the polynomial g(x).
Exercises 2.6
5. Let (rxs) denote a matrix with shape rxs. Determine the shape of the following products,
(if the product is defined):
i) A2x3 B3x4 ii) A4x1 B1x2 iii) A3x4 B3x4 iv) A5x2 B2x3
8. Find a) AAT and b) AT A, if
i) A =
1  2 0
3 −1 4
ii) A = ( 2 −1 3 ), and iii) A =
 2 6 −3
 3 2  6
−6 3  2
9. Matrices A and B are said to commute if AB = BA. Find all matrices
x y
z w
which commute with
1 1
0 1
10. If A(α) =
1 α α²/2
0 1  α
0 0  1
show that A(α)A(β) = A(α + β). Hence find the inverse of A(α). Show that
A(3α) − 3A(2α) + 3A(α) = I, and hence find a cubic equation satisfied by A(α).
e.g.
A =
1 2 −3 0  1
0 0  5 2 −4
0 0  0 7  3
B =
0 1 7 −5 0
0 0 0  0 1
0 0 0  0 0
C =
1 0 5 0 2
0 1 2 0 4
0 0 0 1 7
Matrices A, B, C are in echelon form. The leading nonzero entries are called the
distinguished elements (or pivots) of the echelon matrix.
Definition 2.7.2. A matrix A is said to be in row canonical form (or row-reduced echelon
form) if it is an echelon matrix, that is, if it satisfies properties (1) and (2) above , and if it
satisfies the following additional two properties:
1) Each pivot (leading nonzero entry) is equal to 1.
2) Each pivot is the only nonzero entry in its column.
e.g.:
a) A =
1 0 −1 2
0 1  1 3
0 0  0 1
is in echelon form, but not row reduced (the third pivot is not the only nonzero entry in its
column).
b) B =
1 0 0 3
0 1 0 2
0 0 1 1
is in row reduced echelon form (all distinguished elements equal 1, and each leading 1 is the
only nonzero entry in its column).
c) C =
0 1 0 2
1 0 2 0
0 0 0 0
is not an echelon matrix.
The zero matrix of any size and the identity matrix I of any size are important special
examples of matrices in row canonical form.
Definition 2.8.2 (Row Equivalence, Rank of a Matrix). If A and B are two mxn matrices, then
A is said to be row equivalent to B, written A ∼ B , if B can be obtained from A by a
sequence of elementary row operations. In the case that B is also an echelon matrix, B is
called an echelon form of A.
e.g. A =
 1 0 −1 1
 2 1  0 1
−1 1  0 2
is row equivalent to B =
 1 0 −1 1
−1 1  0 2
 0 3  0 5
since applying R2 → R2 + 2R3 to A gives
 1 0 −1 1
 0 3  0 5
−1 1  0 2
and then R2 ↔ R3 gives
 1 0 −1 1
−1 1  0 2
 0 3  0 5
which is B.
Definition 2.8.3. The rank of a matrix A, written rank(A), is equal to the number of pivots in
an echelon form of A.
Theorem 2.8.1. Every mxn matrix is row equivalent to an mxn reduced echelon matrix.
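Definition 2.8.3 suggests a direct computation: reduce to echelon form and count the pivots. A hedged Python sketch (the helper name rank is ours; exact rational arithmetic avoids round-off), applied to the row-equivalent matrices A and B of the example above, which necessarily have the same rank:

```python
from fractions import Fraction

def rank(A):
    """Rank = number of pivots after reduction to echelon form."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0                          # index of the next pivot row
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue               # no pivot in this column
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 0, -1, 1], [2, 1, 0, 1], [-1, 1, 0, 2]]
B = [[1, 0, -1, 1], [-1, 1, 0, 2], [0, 3, 0, 5]]
print(rank(A), rank(B))   # 3 3
```

Row-equivalent matrices always share the same rank, since elementary row operations preserve it.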
e.g.
1 2 −3 1  2
0 0  1 2  3
0 0  0 0 −2
is in echelon form. Applying R3 → −(1/2)R3 and R1 → R1 + 3R2 gives
1 2 0 7 11
0 0 1 2  3
0 0 0 0  1
and then R2 → R2 − 3R3 and R1 → R1 − 11R3 give
1 2 0 7 0
0 0 1 2 0
0 0 0 0 1
which is in row reduced echelon form.
Exercises 2.7
(A|B) =
a11 a12 . . . a1n | b1
a21 a22 . . . a2n | b2
 .   .  . . .  .  |  .
am1 am2 . . . amn | bm
The matrices A = (aij)mxn and (A|B) = (aij | bi)mx(n+1) are called the coefficient matrix
and the augmented matrix, respectively, of the system of linear equations.
Definition 2.9.1. An n-tuple (x1 , x2 , ..., xn ) which satisfies each of the m equations in the
system is called a solution of the system. Two systems of equations are equivalent if every
solution of one system is a solution of the other system and vice versa. A system of linear
equations is called a homogeneous system if bi = 0(i = 1, 2, ..., m). A system with at least
one solution is called a consistent system. The solution (0, 0, ..., 0) of a homogeneous
system is called the trivial solution.
Theorem 2.9.1. If (A′|B′) = (a′ij | b′i) is an mx(n + 1) matrix obtained from the mx(n + 1)
matrix (A|B) by an elementary row operation, then the systems
Σ_{j=1}^{n} aij xj = bi (i = 1, 2, ..., m) and Σ_{j=1}^{n} a′ij xj = b′i (i = 1, 2, ..., m)
are equivalent.
After elementary row operations the augmented matrix takes an echelon form such as
a11 a12 · · · a1n  | b1
 0  a′22 · · · a′2n | b′2
 .    .  · · ·   .  |  .
 0    0  · · · a′mn | b′m
and the pattern of pivots and right-hand sides distinguishes the three cases:
i) Trivial Solution ii) Unique Solution iii) More than one Solution
The solution procedure can be summarised as a flowchart:
Start: a system of linear equations, m equations, n unknowns.
1. Reduce the augmented matrix to echelon form.
2. Is R(A) = R(A:B)?
   No → Inconsistent: no solution. Stop.
   Yes → Consistent; continue.
3. Are there as many equations as unknowns after reduction?
   No (fewer equations than unknowns) → Many Solutions.
   Yes → Is b = 0? Yes → Trivial Solution. No → Unique Solution.
Exercises 2.9
2. Determine the values of λ for which the following systems of equations are consistent, and
for those values of λ find the complete solutions.
i)
5x + 2y − z = 1
2x + 3y + 4z = 7
4x − 5y + λz = λ − 5
ii)
x + 5y + 3 = 0
5x + y − λ = 0
x + 2y + λ = 0
Thus there are three types of elementary matrices corresponding to the three elementary
row operations:
i) Ri ↔ Rj, e.g. applying R1 ↔ R2 to the identity matrix gives
E1 =
0 1 0
1 0 0
0 0 1
ii) Ri → kRi, e.g. applying R2 → kR2 gives
E2 =
1 0 0
0 k 0
0 0 1
iii) Ri → kRj + Ri, e.g. applying R2 → R2 + kR3 gives
E3 =
1 0 0
0 1 k
0 0 1
e.g. Find the inverse of A =
1 3 3
1 4 3
1 3 4
First form the (block) matrix M = [A, I] and row reduce M to echelon form:
1 3 3 | 1 0 0
1 4 3 | 0 1 0
1 3 4 | 0 0 1
Applying R2 → R2 − R1 and R3 → R3 − R1:
1 3 3 |  1 0 0
0 1 0 | −1 1 0
0 0 1 | −1 0 1
In echelon form, the left half of M is in triangular form; hence, A has an inverse. Next we
further row reduce M to its row canonical form, applying R1 → R1 − 3R2 and R1 → R1 − 3R3:
1 0 0 |  7 −3 −3
0 1 0 | −1  1  0
0 0 1 | −1  0  1
The identity matrix is now in the left half of the final matrix; hence, the right half is A−1.
In other words,
A−1 =
 7 −3 −3
−1  1  0
−1  0  1
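The [A | I] row-reduction method just carried out can be automated. A Python sketch (the helper name inverse is our own; exact rational arithmetic keeps the entries clean):

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row reducing the block matrix [A | I]
    to [I | A^-1]; returns None if A has no inverse."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(1 if i == j else 0) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            return None                      # no pivot: A is noninvertible
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]   # make the pivot equal to 1
        for i in range(n):
            if i != c and M[i][c] != 0:      # clear the rest of column c
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]

A = [[1, 3, 3], [1, 4, 3], [1, 3, 4]]
assert inverse(A) == [[7, -3, -3], [-1, 1, 0], [-1, 0, 1]]   # as found above
```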
Exercises 2.10
Example 2.12.1. Let A =
2 −1 0  1
1  2 1 −1
2  9 4 −5
How do we find P?
Answer: row reduce the block matrix (A|I):
2 −1 0  1 | 1 0 0
1  2 1 −1 | 0 1 0
2  9 4 −5 | 0 0 1
Applying R1 → R1 − 2R2 and R3 → R3 − 2R2:
0 −5 −2  3 | 1 −2 0
1  2  1 −1 | 0  1 0
0  5  2 −3 | 0 −2 1
Applying R3 → R3 + R1 and R1 → −(1/5)R1:
0 1 2/5 −3/5 | −1/5 2/5 0
1 2  1   −1  |  0    1  0
0 0  0    0  |  1   −4  1
Applying R1 ↔ R2, then R1 → R1 − 2R2:
1 0 1/5  1/5 |  2/5 1/5 0
0 1 2/5 −3/5 | −1/5 2/5 0
0 0  0    0  |  1   −4  1
Hence P =
 2/5 1/5 0
−1/5 2/5 0
 1   −4  1
and PA =
1 0 1/5  1/5
0 1 2/5 −3/5
0 0  0    0
To find Q we reduce PA to normal form by elementary column operations:
C3 → C3 − (1/5)C1, C4 → C4 − (1/5)C1, C3 → C3 − (2/5)C2, C4 → C4 + (3/5)C2.
Applying the same operations to I gives
Q =
1 0 −1/5 −1/5
0 1 −2/5  3/5
0 0  1    0
0 0  0    1
and PAQ =
1 0 0 0
0 1 0 0
0 0 0 0
Definition 2.12.2 (Equivalent Matrices). Two mxn matrices A and B are equivalent if there
exist invertible matrices P and Q such that PAQ = B.
If PAQ = B then P−1 B Q−1 = A,
and if PAQ = B and P′BQ′ = C then (P′P)A(QQ′) = C.
Equivalently, every mxn matrix A is equivalent to a matrix with the simple form
N =
Ir 0
0  0
where r denotes the number of nonzero rows in the reduced echelon form of A. N may be
regarded as the canonical form of A under this equivalence relation. N is called the normal
form of A.
Thus, the determinant of a 1x1 matrix A = (a11) is the scalar a11; that is, det(a11) = a11.
The determinant of order two may easily be remembered by using the following diagram:
| a11 a12 |
| a21 a22 | = a11 a22 − a12 a21
(the product along the main diagonal carries a + sign, the product along the other diagonal
a − sign).
Determinants of nxn matrices with n > 2 are calculated through a process of reduction and
expansion using minors and cofactors.
        |  7  6 |
|M11| = | −8  0 | = (7)(0) − (6)(−8) = 48

        | −2 6 |
|M12| = |  5 0 | = (−2)(0) − (6)(5) = −30

        | −2  7 |
|M13| = |  5 −8 | = (−2)(−8) − (7)(5) = −19.
The Cofactor of the entry aij of an nxn matrix A, denoted Aij, is defined in terms of its
associated minor as:
Aij = (−1)i+j |Mij|.
The cofactors corresponding to the minors above are:
A11 = (−1)1+1 (48) = 48
A12 = (−1)1+2 (−30) = 30
A13 = (−1)1+3 (−19) = −19
Note that the signs (−1)i+j accompanying the minors form a checkerboard pattern with +'s
along the main diagonal:
+ − + ...
− + − ...
+ − + ...
...
We can as well choose to expand along a column of A; e.g. expanding along column 1,
det A = a11 A11 + a21 A21 + a31 A31, which gives the same value.
Choosing to expand along a row or column having many zeros, if it exists (e.g. column 3),
greatly reduces the number of computations required to compute det A. This can be
facilitated by reducing the matrix by Elementary Row/Column Operations, see section 3.5.1
below.
vii) If a determinant has two proportional rows or two proportional columns, the determinant
is 0, |A| = 0 .
viii) if A is lower or upper triangular (i.e. has zeroes above or below the main diagonal), then
|A| = product of diagonal elements. Thus in particular |I| = 1, where I is the identity matrix.
ix) If A has an inverse, then |A−1| = 1/|A|.
Theorem 3.4.1. The determinant of a product of two matrices A and B is the product of
their determinants; that is,
det(AB) = det(A)det(B)
This can be used to prove property ix) above, since |A||A−1| = |AA−1| = |I| = 1.
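Theorem 3.4.1 can be checked numerically with a determinant computed by cofactor expansion along the first row, as described above. A Python sketch (arbitrary sample matrices, not from the notes):

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3], [4, -2, 3], [0, 5, -1]]
B = [[2, 0, 1], [1, 3, 5], [0, 4, 2]]

assert det(matmul(A, B)) == det(A) * det(B)   # det(AB) = det(A) det(B)
```

The recursive expansion is convenient for small matrices; for large ones the row-reduction method of section 3.5.1 is far more efficient.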
Theorem 3.4.2. Let A be a square matrix. Then the following are equivalent:
i) A is invertible; that is, A has an inverse A−1 .
ii) AX = 0 has only the zero solution.
iii) The determinant of A is not zero; that is, det(A) ≠ 0.
Thus using the above properties, we can evaluate determinants by the methods given in the
following sections.
Example 3.5.1. i) Find the determinant of A =
x 2m 1
3  1 1
x  m 1
Solution
| x 2m 1 |                 | 0 m 0 |
| 3  1 1 |  R1 → R1 − R3:  | 3 1 1 | = −m(3 − x) = m(x − 3).
| x  m 1 |                 | x m 1 |
ii) Find the determinant of B =
1 2 3
4 5 6
7 8 9
Solution
| 1 2 3 |                               | −3 −3 −3 |
| 4 5 6 |  R1 → R1 − R2, R2 → R2 − R3:  | −3 −3 −3 | = 0,
| 7 8 9 |                               |  7  8  9 |
since two rows are identical.
iii) Find the determinant of C =
1 1+m −1
3 3+m −3
5  m  −1
Solution
| 1 1+m −1 |                | 0 1+m −1 |
| 3 3+m −3 | C1 → C1 + C3:  | 0 3+m −3 | = 4{(1 + m)(−3) − (−1)(3 + m)}
| 5  m  −1 |                | 4  m  −1 |
= 4{−3 − 3m + 3 + m} = 4(−2m) = −8m.
Example 3.5.2. i) Find the determinant of A =
 1   3  −2
 2   4   5
−4 −12   8
Solution
|  1   3  −2 |                          | 1 3 −2 |
|  2   4   5 |  factor −4 from row 3:  −4 | 2 4  5 | = 0,
| −4 −12   8 |                          | 1 3 −2 |
since rows 1 and 3 are identical.
ii) Find the determinant of B =
a a² a³
b b² b³
c c² c³
Solution
| a a² a³ |                                | 1 a a² |
| b b² b³ |  factor a, b, c from the rows: (abc) | 1 b b² |
| c c² c³ |                                | 1 c c² |
Applying R1 → R1 − R2 and R2 → R2 − R3:
      | 0 a−b a²−b² |
(abc) | 0 b−c b²−c² |
      | 1  c    c²  |
Factoring (a − b) from row 1 and (b − c) from row 2:
                    | 0 1 a+b |
(abc)(a − b)(b − c) | 0 1 b+c |
                    | 1 c  c² |
Expanding along column 1:
= (abc)(a − b)(b − c){(b + c) − (a + b)} = abc(a − b)(b − c)(c − a).
Example 3.5.3. Find the determinant of A =
0 2 2
1 0 3
2 1 1
Solution
We reduce A to triangular form, tracking a running value D (initially D = 1): a row swap
changes D to −D; Ri → kRi changes D to (1/k)D; adding a multiple of one row to another
leaves D unchanged. Then det A = D × det(final matrix).
| 0 2 2 |           | 1 0 3 |
| 1 0 3 | R1 ↔ R2:  | 0 2 2 |   D → (−1)D = −1
| 2 1 1 |           | 2 1 1 |

| 1 0 3 |                | 1 0  3 |
| 0 2 2 | R3 → R3 − 2R1: | 0 2  2 |   D → D = −1, no change to D
| 2 1 1 |                | 0 1 −5 |

| 1 0  3 |                 | 1 0  3 |
| 0 2  2 | R2 → (1/2)R2:   | 0 1  1 |   D → (2)D = (2)(−1) = −2
| 0 1 −5 |                 | 0 1 −5 |

| 1 0  3 |               | 1 0  3 |
| 0 1  1 | R3 → R3 − R2: | 0 1  1 |   D → D = −2, no change to D
| 0 1 −5 |               | 0 0 −6 |

| 1 0  3 |                  | 1 0 3 |
| 0 1  1 | R3 → −(1/6)R3:   | 0 1 1 |   D → (−6)D = (−6)(−2) = 12
| 0 0 −6 |                  | 0 0 1 |

The final matrix is triangular with determinant 1, so det A = 12.
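The bookkeeping of Example 3.5.3 — track a factor D while row reducing — can be written as a small routine. A Python sketch (det_by_elimination is our own name), checked against the value 12 found above:

```python
from fractions import Fraction

def det_by_elimination(A):
    """Reduce A to upper triangular form, tracking how each row operation
    changes the determinant (swap: sign flip; Ri -> Ri - f*Rj: no change)."""
    M = [[Fraction(v) for v in row] for row in A]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            return Fraction(0)        # no pivot available: determinant is 0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d                    # a row swap changes the sign
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[c])]  # det unchanged
    for i in range(n):
        d *= M[i][i]                  # det of a triangular matrix
    return d

assert det_by_elimination([[0, 2, 2], [1, 0, 3], [2, 1, 1]]) == 12
```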
A 3x3 determinant can also be evaluated directly by repeating the first two columns and
taking diagonal products (the rule of Sarrus), e.g.:
| 2  3 −4 | 2  3
| 0 −4  2 | 0 −4
| 1 −1  5 | 1 −1
= (down-diagonal products) − (up-diagonal products)
= (−40 + 6 + 0) − (16 − 4 + 0) = −34 − 12
= −46
Exercises 3.5
1. Using the methods in the preceding sections, evaluate the following determinants
i)
 2 1 2
−1 1 5
−1 2 3
ii)
2 3 1
1 0 1
3 3 2
iii)
a b c
c a b
b c a
iv)
1/2 −1 −1/3
3/4  2  −1
 1  −4   1
v)
t+3 −1   1
 5  t−3  1
 6  −6  t+4
vi)
a b+c a²
b c+a b²
c a+b c²
Example 3.6.1. For the matrix A =
1 2 3
2 3 2
3 3 4
A11 = 6, A12 = −2, A13 = −3, A21 = 1, A22 = −5, A23 = 3, A31 = −5, A32 = 4, A33 = −1.
The matrix of cofactors is (Aij) =
 6 −2 −3
 1 −5  3
−5  4 −1
and the adjoint is adj A = (Aij)T =
 6  1 −5
−2 −5  4
−3  3 −1
Theorem 3.6.1. Let A be any square matrix. Then A(adj A) = (adj A)A = |A|I, where I is
the identity matrix. Thus, if |A| ≠ 0, A−1 = (1/|A|)(adj A).
Here det(A) = −7. Thus A has an inverse by Theorem 3.4.2, and by Theorem 3.6.1 the
inverse is
A−1 = −(1/7)
 6  1 −5
−2 −5  4
−3  3 −1
Exercises 3.6
1. Let A =
1 1 1
2 3 4
5 8 9
Find a) adj A, b) A−1 using adj A.
2. For what values of t will the matrix A =
1 t  0
0 1 −1
t 0  1
be noninvertible? For all other values find the inverse.
3. Find the adjoint of A, where A =
x+1  0   −1
 0  x+1  −2
 1   1  x−2
4. Show that the matrix A =
 1 −a b
 a  1 2
−b  0 1
is invertible for any real values of a and b.
Theorem 3.7.1 (Cramer's Rule). The system of linear equations AX = b has a unique
solution if and only if |A| ≠ 0, and the unique solution is
x1 = ∆1/∆, x2 = ∆2/∆, . . . , xn = ∆n/∆.
The above theorem applies only when there are the same number of equations as
unknowns, and gives a solution when ∆ 6= 0.
If ∆ = 0 the theorem does not say whether a solution exists, except in the case of a
homogeneous system of equations.
Exercises 3.7
Chapter 4: Vector Spaces
1. We consider vectors with tail at the origin. In this case the vectors are completely
determined by their components, i.e. we write v = [a, b, c] meaning that the 3 components
of v are a, b, c, or v = ai + bj + ck.
In other words, if v is a vector with tail at the origin, the components of v are the same as
the co-ordinates of the head of v.
2. Two vectors, u and v, are equal, written u = v , if they have the same number of
components and if the corresponding components are equal.
If (x − y, x + y, z − 1) = (4, 2, 3), then x − y = 4, x + y = 2 and z − 1 = 3, giving
x = 3, y = −1, z = 4.
(Figures: the position vector of the point P(a, b, c); two vectors u and v; and the scalar
multiples 2v and −v of a vector v.)
3. There are two important operations we can carry out on vectors. These are i) addition, and
ii) scalar multiplication.
i) addition
The vector u+v can be obtained by placing the initial point of v on the terminal point of u
and joining the initial point of u to the terminal point of v. This is called the parallelogram
law of vector addition, i.e. it is the diagonal of the parallelogram formed by u and v.
If (a, b, c) and (a′, b′, c′) are the endpoints of the vectors u and v, then (a + a′, b + b′, c + c′)
is the endpoint of the vector u + v.
e.g. if u = (3 − 2i, 4i, 1 + 6i) and v = (5 + i, 2 − 3i, 5), then
(1 − 2i)u + (3 + i)v
= ((1 − 2i)(3 − 2i), (1 − 2i)4i, (1 − 2i)(1 + 6i)) + ((3 + i)(5 + i), (3 + i)(2 − 3i), (3 + i)5)
= (13, 17 − 3i, 28 + 9i)
e.g. find x, y, z if (2, −3, 4) = (x + y + z, x + y, x). Setting the corresponding components
equal to each other gives the system
x + y + z = 2
x + y = −3
x = 4
so x = 4, y = −7, z = 5.
Basic properties of vectors under the operations of vector addition and scalar multiplication
are given as
Exercises 4.2
1. Find
a) < u, v >, and b) < v, u >, if
i) u = (1, 2, 4), v = (3, 5, 1)
ii) u = (5, 4, 1), v = (3, −4, 1)
iii) u = (3 − 2i, 4i, 1 + 6i), v = (5 + i, 2 − 3i, 7 + 2i)
A vector u is called a unit vector if ||u|| = 1 or, equivalently, if <u, u> = 1. For any nonzero
vector v in Rn, the vector v̂ = (1/||v||)v = v/||v|| is the unique unit vector in the same
direction as v. The process of finding v̂ from v is called normalizing v.
e.g. let v = (1, 2, −2, 4). Then
||v||² = 1² + 2² + (−2)² + 4² = 1 + 4 + 4 + 16 = 25,
so ||v|| = √25 = 5 and v̂ = (1/5, 2/5, −2/5, 4/5).
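Normalizing can be sketched in a couple of lines of Python (the helper names are ours), using the vector of the example above:

```python
import math

def norm(v):
    """Euclidean norm ||v|| = sqrt(<v, v>)."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Return the unit vector v / ||v|| in the same direction as v."""
    n = norm(v)
    return [x / n for x in v]

v = [1, 2, -2, 4]
print(norm(v))                  # 5.0
print(normalize(v))             # [0.2, 0.4, -0.4, 0.8]
assert math.isclose(norm(normalize(v)), 1.0)
```

The final assertion uses math.isclose because floating-point rounding can leave the norm a hair away from exactly 1.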
The following formula is known as the Schwarz inequality (or Cauchy-Schwarz inequality),
used in many branches of mathematics:
|<u, v>| ≤ ||u|| ||v||.
Using the above inequality, we can prove the triangle (or Minkowski) inequality:
||u + v|| ≤ ||u|| + ||v||.
The triangle inequality states that the length of a side of a triangle is less than the sum of
the lengths of the other two sides.
(Figures: the parallelogram law for the sum u + v, and the projection u* of a vector u onto a
nonzero vector v.)
The projection of a vector u onto a nonzero vector v is the vector denoted and defined by
proj(u, v) = (<u, v>/||v||²)v = (<u, v>/<v, v>)v
iii) proj(u, v) = (9/45)(2, 4, 5) = (1/5)(2, 4, 5)
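The projection formula can be sketched in Python with exact arithmetic. Note that the vector u below is a hypothetical choice (the original u of part iii) is not shown in this extract); it is picked so that <u, v> = 9, matching the worked value 9/45:

```python
from fractions import Fraction

def dot(u, v):
    """Inner product <u, v> on R^n."""
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """proj(u, v) = (<u, v> / <v, v>) v, the projection of u onto v."""
    c = Fraction(dot(u, v), dot(v, v))
    return [c * x for x in v]

u = [2, 0, 1]          # hypothetical: chosen so that <u, v> = 9
v = [2, 4, 5]
assert dot(u, v) == 9 and dot(v, v) == 45
assert proj(u, v) == [Fraction(2, 5), Fraction(4, 5), 1]   # (1/5)(2, 4, 5)
```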
Exercises 4.4
1. Find x, y, z, w if
  x y     x   6      4  x+y
3 z w  = −1  2w  +  z+w  3
2. a) Find i) 3u − 2v ii) 5u + 3v − 4w, if u = (1, 3, −4), v = (2, 1, 5), w = (3, −2, 6).
5. Find k so that
a) u and v are orthogonal, if u = (3, k, −2), v = (6, −4, −3)
b) ||u|| = √39, where u = (1, k, −2, 5)
1. We dealt with R2 , a vector space of all vectors of the form (x, y), where x and y are real
numbers.
We will write the set of all vectors in R2 as V2 (R) = {(x, y)|x, y ∈ R}.
For instance, (−4, 3.5) is a vector in V2 (R).
We can add two vectors in V2 (R) by adding their components separately, thus for
instance (1, 2) + (3, 4) = (4, 6).
We can multiply a vector in V2 (R) by a scalar by multiplying each component separately,
thus for instance 3(1, 2) = (3, 6).
Among all the vectors in V2 (R) is the zero vector (0, 0).
Vectors in V2 (R) are used for many physical quantities in two dimensions; they can be
represented graphically by arrows in a plane, with addition represented by the
parallelogram law and scalar multiplication by scaling.
2. The vector space V3(R) is the space of all vectors of the form (x, y, z), where x, y, z are
real numbers: V3(R) = {(x, y, z) | x, y, z ∈ R}.
Addition and scalar multiplication proceed similarly to V2(R):
e.g. (1, 2, 3) + (4, 5, 6) = (5, 7, 9), and 4(1, 2, 3) = (4, 8, 12).
Among all the vectors in V3 (R) is the zero vector (0, 0, 0).
Vectors in V3 (R) are used for many physical quantities in three dimensions, such as
velocity, momentum, current, electric and magnetic fields, force, acceleration, and
displacement; they can be represented by arrows in space.
However, addition of a vector in V2 (R) to a vector in V3 (R) is not defined; e.g.
(1, 2) + (3, 4, 5) doesn’t make sense.
3. One can similarly define the vector spaces V4 (R), V5 (R), etc.
Vectors in these spaces are not often used to represent physical quantities, and are more
difficult to represent graphically, but are useful for describing populations in biology,
portfolios in finance, or many other types of quantities which need several numbers to
describe them completely.
Instead of restricting ourselves to the field of real numbers R only, we can work over any
arbitrary field K (e.g. the field of complex numbers C or the rational numbers Q).
iii) Suppose ku = 0 and k ≠ 0. Then there exists a scalar k1 such that k1 k = 1. Thus
u = 1u = (k1 k)u = k1 (ku) = k1 0 = 0
A philosophical perspective:
we never say exactly what vectors are, only what vectors do. This is an example of
abstraction, which appears everywhere in mathematics (but especially in algebra): the exact
substance of an object is not important, only its properties and functions. (For instance,
when using the number “3” in mathematics, it is not important whether we refer to 3 rocks, 3
sheep, or whatever; what is important is how to add, multiply, and otherwise manipulate
these numbers, and what properties these operations have). This is tremendously powerful:
it means that we can use a single theory (linear algebra) to deal with many very different
subjects (physical vectors, population vectors in biology, portfolio vectors in finance,
probability distributions in probability, functions in analysis, etc.). [A similar philosophy
underlies “object-oriented programming” in computer science.] Of course, even though
vector spaces can be abstract, it is often very helpful to keep concrete examples of vector
spaces such as R2 and R3 handy, as they are of course much easier to visualize.
A set of r linearly independent vectors in Rn is called a basis of the subspace S. Any other
vector in this subspace is then a linear combination of the r vectors of this basis.
Theorem 4.8.1. Let V be a vector space such that one basis has m elements and another
basis has n elements. Then m = n.
If V has a basis with n elements, Theorem 4.8.1 tells us that all bases of V have the same
number of elements, so this definition is well defined.
Accordingly, the vectors form a basis of Kn called the usual or standard basis of Kn .
Thus, as expected, Kn has dimension n. In particular, any other basis of Kn has n elements.
All such matrices form a basis of Mrxs called the usual or standard basis of Mrxs .
Accordingly, dimMrxs = rs.
Theorem 4.9.3. Let V be a vector space of finite dimension and let S = (u1 , u2 , . . . , ur ) be a
set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be
extended to a basis of V.
4.10 Subspaces
Definition 4.10.1. A nonempty subset W of V is called a subspace of V if W is itself a vector
space over K with the same definitions of vector addition and scalar multiplication as in V.
By inspecting the defining axioms of a vector space, we note that most of the axioms hold
automatically in W (they are inherited from V), except the following, which must be
verified:
i) That W is closed under addition, i.e. if u, v ∈ W then u + v ∈ W .
ii) 0 ∈ W
iii) For each v ∈ W , −v ∈ W
iv) if α ∈ K, v ∈ W , then αv ∈ W .
Properties i)–iv) may be combined into the following equivalent single statement:
For every u, v ∈ W and a, b ∈ K , the linear combination au + bv ∈ W .
0 ∈ V and if α ∈ K, X, Y ∈ V , then
A(αX + Y ) = αAX + AY = 0, and so
αX + Y ∈ V
Example 4.11.1. Find an R-basis for the solution space of the homogeneous system of lin-
ear equations
−w + x − 2y + 3z = 0
 w + 2x + y − z = 0
which gives
x + (1/5)z + (1/5)w = 0
y − (7/5)z + (3/5)w = 0
The solution space is < (−1, 7, 5, 0), (−1, −3, 0, 5) > (coordinates in the order (x, y, z, w)), and these two vectors form an R-basis.
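As a quick check (our own, not part of the notes), each basis vector of the solution space should satisfy both equations of the system:

```python
# Verify that the two basis vectors of the solution space satisfy both
# equations. Coordinates are ordered (x, y, z, w).
def satisfies_system(x, y, z, w):
    eq1 = -w + x - 2*y + 3*z    # -w + x - 2y + 3z = 0
    eq2 = w + 2*x + y - z       #  w + 2x + y -  z = 0
    return eq1 == 0 and eq2 == 0

basis = [(-1, 7, 5, 0), (-1, -3, 0, 5)]
print(all(satisfies_system(x, y, z, w) for (x, y, z, w) in basis))  # True
```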
We can state the following theorem on the solution space of a homogeneous system.
Example 4.11.2. Please use the following two examples in Exercises 4.6 which follow
1. Determine whether the subset U = {(a, b, c) | a = 2b = 3c} of V3 (R) is a subspace
Solution
2. Show that the subset U = {(a, b, c) | a + b + c ≥ 0} of V3 (R) is not a subspace
Solution
We have to show that one of the properties of theorem 4.10.1 does not hold.
Exercises 4.6
If the system were inconsistent (no solution), v would not be a linear combination of the ui
Example 4.12.1. 1) The set {(1, 3, −1), (2, 0, 1), (1, −1, 1)} is linearly dependent, since if
(α + 2β + γ, 3α − γ, −α + β + γ) = (0, 0, 0)
row reducing the coefficient matrix gives

[ 1  2  1 ]  R2 → −(1/2)(R2 − 3R1)  [ 1  2  1 ]  R3 → R3 − R2  [ 1  2  1 ]
[ 3  0 −1 ]  R3 → R3 + R1           [ 0  3  2 ]  ───────────>  [ 0  3  2 ]
[−1  1  1 ]  ─────────────────────> [ 0  3  2 ]                [ 0  0  0 ]

The zero row shows the system has nontrivial solutions, so the vectors are linearly dependent.
2) If
(α + 2β − γ, α + β + γ, −α + 2γ) = (0, 0, 0)
row reducing the coefficient matrix gives

[ 1  2 −1 ]  R2 → R2 − R1  [ 1  2 −1 ]  R3 → R3 + 2R2  [ 1  2 −1 ]
[ 1  1  1 ]  R3 → R3 + R1  [ 0 −1  2 ]  ────────────>  [ 0 −1  2 ]
[−1  0  2 ]  ────────────> [ 0  2  1 ]                 [ 0  0  5 ]

The system has only the trivial solution α = β = γ = 0, so the vectors are linearly independent.
taking α = 1, β = −1 + i
then α + (α + β)i = 0
Evidently α = 0 and α + β = 0 or α = β = 0.
Solution
We must show that any vector in R2 can be written as a linear combination of u and v
(a, b) = xu + yv
Exercises 4.12
4. Suppose the vectors u, v and w are linearly independent. Show that the vectors u + v ,
5. Show that the vectors u = (1 + i, 2i) and w = (1, 1 + i) in C 2 are linearly dependent
over the complex field C, but linearly independent over the real field R.
6. Show that the subset {(1, 0, 1, 1), (1, 0, 2, 4)} of V4 (R) is linearly independent over R and
extend it to an R-basis of V4 (R).
7. Show that
i) {e^t , sin t, t² }
ii) {e^t , sin t, cos t}
are independent over R.
8. True or False:
a) {(1, 2, 0), (1, 1, 3), (2, 3, 3)} is a spanning set for R3 .
b) {(1, 0, 1), (2, 1, 1), (0, 1, −1)} is a linearly independent set of vectors in R3 .
c) Any five 2x2 matrices must be linearly dependent
d) (3, 2, −4) is the coordinate vector of −3 + x − 4x2 , relative to the ordered basis {x +
1, x − 1, 1 + x + x2 } of P2 .
Analogously, the columns of A may be viewed as vectors in Km called the column space of
A and denoted colsp(A). Observe that colsp(A) = rowsp(AT ).
Theorem 4.15.1. If A = (aij ) ∈ Mn (K) and cj = (c1j , c2j , . . . , cnj ) (j = 1, 2, . . . , n) are
the columns of A, then the set {c1 , c2 , . . . , cn } is linearly dependent over K if and only if
det A = 0.
Corollary [Link]. The rows of a matrix A ∈ Mn (K) are linearly dependent over K if and
only if det A=0.
Solution
The number of linearly independent vectors in the subspace must be equal to the dimen-
sion of the subspace.
Solution
Solution
The number of linearly independent vectors (independent spanning set) in this set is the
dimension of the subspace.
Example
S = {(1, 2, −1, 3, 4), (2, 4, −2, 6, 8), (1, 3, 2, 2, 6), (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)}.
Row reducing the matrix whose rows are these vectors, the nonzero rows of the echelon matrix M are (1, 2, −1, 3, 4), (0, 1, 1, −1, 2), (0, 0, 2, 2, 1).
They form a basis of the row space of A and hence of W. Thus, in particular, dim W = 3.
Question 2: Can we extend the above basis to a basis of the whole space R5 , i.e find 5 lin-
early independent vectors?
Answer: Yes, we can add vectors, e.g. from the standard basis, (0, 0, 0, 1, 0) and (0, 0, 0, 0, 1)
{(1, 2, −1, 3, 4), (0, 1, 1, −1, 2), (0, 0, 2, 2, 1), (0, 0, 0, 1, 0), (0, 0, 0, 0, 1)}
Form the matrix M whose columns are the given vectors and row reduce:

    [ 1  2  1  1  2 ]   [ 1  2  1  1  2 ]   [ 1  2  1  1  2 ]
    [ 2  4  3  4  7 ]   [ 0  0  1  2  3 ]   [ 0  0  1  2  3 ]
M = [−1 −2  2  5  3 ] ∼ [ 0  0  3  6  5 ] ∼ [ 0  0  0  0 −4 ]
    [ 3  6  2  1  3 ]   [ 0  0 −1 −2 −3 ]   [ 0  0  0  0  0 ]
    [ 4  8  6  8  9 ]   [ 0  0  2  4  1 ]   [ 0  0  0  0  0 ]

The pivot positions are in columns C1, C3, C5. Hence, the corresponding vectors
{u1 , u3 , u5 } = {(1, 2, −1, 3, 4), (1, 3, 2, 2, 6), (2, 7, 3, 3, 9)} form a basis of W, and dim W = 3.
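The pivot-column step above can be automated. The sketch below (our own illustration, not from the notes) row reduces exactly, using `fractions`, and reports the pivot columns; for the matrix M above these are columns 1, 3 and 5 (indices 0, 2, 4):

```python
from fractions import Fraction

# Row reduce a matrix exactly and return the pivot-column indices; the
# original vectors in those columns form a basis of the span.
def pivot_columns(rows):
    m = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        pr = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pr is None:
            continue                      # no pivot in this column
        m[r], m[pr] = m[pr], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:   # eliminate the column elsewhere
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return pivots

vectors = [(1, 2, -1, 3, 4), (2, 4, -2, 6, 8), (1, 3, 2, 2, 6),
           (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)]
M = [list(col) for col in zip(*vectors)]  # vectors as columns
print(pivot_columns(M))  # [0, 2, 4] -> columns C1, C3, C5
```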
Example 4.16.4. Given that the reduced row-echelon form of the matrix

    [ 1  1  0  1  0 ]          [ 1  0  1  0  0 ]
A = [−1  2  3  4 −1 ]  is  R = [ 0  1  2  2  0 ]
    [ 2  2  6  4  2 ]          [ 0  0  0  0  1 ]
    [ 3  4 11  8  4 ]          [ 0  0  0  0  0 ]

Find
Solution
a) rank A = 3 (the dimension of the row space)
b) i) A basis for the row space of A is (1, 0, 1, 0, 0), (0, 1, 2, 2, 0), (0, 0, 0, 0, 1)
iii) A basis for the null space of A will be given by the solution to the system:

[ 1  0  1  0  0 ] [ x ]   [ 0 ]
[ 0  1  2  2  0 ] [ y ]   [ 0 ]
[ 0  0  0  0  1 ] [ z ] = [ 0 ]
[ 0  0  0  0  0 ] [ s ]   [ 0 ]
                  [ t ]

We note that t = 0, and z and s are free variables (they do not appear with a nonzero entry as the first
entry in any row).
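A null-space basis can be read off the RREF mechanically: set each free variable to 1 in turn and back-substitute. A small sketch of this (our own, not from the notes), using the matrix R above and variable order (x, y, z, s, t):

```python
# Read a basis of the null space off an RREF matrix by letting each free
# variable be 1 in turn.
R = [[1, 0, 1, 0, 0],
     [0, 1, 2, 2, 0],
     [0, 0, 0, 0, 1]]
n = 5
pivots = {row.index(1): row for row in R}        # pivot column -> its row
free = [j for j in range(n) if j not in pivots]  # columns of z and s: [2, 3]

basis = []
for f in free:
    v = [0] * n
    v[f] = 1
    for p, row in pivots.items():  # pivot variable = -(entry in free column)
        v[p] = -row[f]
    basis.append(v)

print(basis)  # [[-1, -2, 1, 0, 0], [0, -2, 0, 1, 0]]
```

Each basis vector satisfies R·v = 0, and there is one per free variable, so the nullity here is 2.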
We emphasize that in the first algorithm we form a matrix whose rows are the given vectors,
whereas in the second algorithm we form a matrix whose columns are the given vectors
Exercises 4.8
1. Prove that {(1, 2, 0), (0, 5, 7), (−1, 1, 3)} is an R-basis for V3 (R) and find the coordinates
of (0, 13, 17) and (2, 3, 1) relative to this basis.
2. Let M = { [ a  b ; −b  c ] | a, b, c ∈ R } and N = { [ x  0 ; y  0 ] | x, y ∈ R } be subspaces of M2 (R).
Find R-bases for M, N, M ∩ N and M + N .
3. Let U be the subspace of V3 (R) generated by the two vectors u1 = (1, 2, 3) and u2 =
(3, −5, 1).
5. Are the vectors u = (3, −1, 0, −1) and v = (1, 0, 4, −1) in the subspace of V4 (R) gen-
erated by {(2, −1, 3, 2), (−1, 1, 1, −3), (1, 1, 9, 5)}?
Hence determine two R-bases for V4 (R), one containing u and one containing v
6. Let W be the subspace of R4 spanned by the vectors u1 = (1, −2, 5, −3), u2 = (2, 3, 1, −4), u3 =
(3, 8, −3, −5). Find
a) a basis and dimension of W
b) Extend the basis of W to a basis of R4
9. Find the dimension and a basis of the solution space W of each homogeneous system:

a) x + 2y + z − 2t = 0      b) x + y + 2z = 0       c) x − 2y + z − 3t = 0
   2x + 4y + 4z − 3t = 0       2x + 3y + 3z = 0
   3x + 6y + 7z − 4t = 0       x + 3y + 5z = 0
Example 4.17.2. Let V = M2x2 , the vector space of 2x2 matrices. Let U consist of those
matrices whose second row is zero, and let W consist of those matrices whose second col-
umn is zero. Then

U = [ a  b ],   W = [ a  0 ],   and   U + W = [ a  b ],   U ∩ W = [ a  0 ]
    [ 0  0 ]        [ c  0 ]                  [ c  0 ]            [ 0  0 ]

That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W con-
sists of those matrices whose second row and second column are zero.
Theorem 4.17.3. The vector space V is the direct sum of its subspaces U and W if and only
if: i) V = U + W , ii) U ∩ W = {0}.
Example 4.17.3. a) Let U be the xy-plane and let W be the yz-plane; i.e., U = {(a, b, 0) | a, b ∈
R} and W = {(0, b, c) | b, c ∈ R}. Then R3 = U + W .
However, R3 is not the direct sum of U and W, because such sums are not unique.
For example, (3, 5, 7) = (3, 1, 0) + (0, 4, 7) and also (3, 5, 7) = (3, −4, 0) + (0, 9, 7).
b) Now let U be the xy-plane as before and let W be the z-axis, i.e. W = {(0, 0, c) | c ∈ R}.
Then any vector (a, b, c) ∈ R3 can be written as the sum of a vector in U and a vector in W
in one and only one way: (a, b, c) = (a, b, 0) + (0, 0, c). Accordingly, R3 is the direct sum
of U and W; that is, R3 = U ⊕ W.
4.18 Coordinates
Let V be an n-dimensional vector space over K with basis S = {v1 , v2 , . . . , vn }.Then any
vector v ∈ V can be expressed uniquely as a linear combination of the basis vectors in S ,
say as
v = α1 v1 + α2 v2 + . . . + αn vn = Σ_{i=1}^{n} αi vi , where αi ∈ K .
The n-tuple (αi ) ∈ Vn (K) are called coordinates of v relative to the K-basis S , and denoted
[v]S
Example 4.18.1. Given the K-basis S = {(1, 0, 1), (2, −1, 1), (4, 1, 1)} for V3 (K), find the coordinates of v = (1, 2, 3) relative to S.
Solving v = α1 (1, 0, 1) + α2 (2, −1, 1) + α3 (4, 1, 1) gives α1 = 1/2, α2 = 3/4, α3 = −1/4,
and so [v]S = (1/2, 3/4, −1/4)^T
Example 4.18.2. Consider the vector space P2 (x) of polynomials of degree ≤ 2. The poly-
nomials:
Setting the coefficients of the same powers of t on LHS and RHS equal to each other, we
obtain the system
z=2
x + y − 2z = −5
x−y+z =9
2. Ability to:
Substituting k = 0 into condition (2), we obtain T (0) = 0. Thus, every linear mapping takes
the zero vector into the zero vector. Now for any scalars a, b ∈ K and any vector v, w ∈ V ,
we obtain T (av + bw) = T (av) + T (bw) = aT (v) + bT (w)
More generally, for any scalars ai ∈ K and any vectors vi ∈ V , we obtain the following basic
property of linear mappings:
T (a1 v1 + a2 v2 + . . . + am vm ) = a1 T (v1 ) + a2 T (v2 ) + . . . + am T (vm ).
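The basic linearity property above can be illustrated numerically. In the sketch below the particular map T is our own example (not from the notes); any linear map would do:

```python
# For a linear T, T(a1*v1 + a2*v2) equals a1*T(v1) + a2*T(v2).
def T(v):  # an example linear map R^3 -> R^2
    x, y, z = v
    return (x + 2*y, 3*z - y)

def add(u, v):   return tuple(a + b for a, b in zip(u, v))
def scale(k, v): return tuple(k * a for a in v)

v1, v2, a1, a2 = (1, 0, 2), (4, -1, 3), 5, -2
lhs = T(add(scale(a1, v1), scale(a2, v2)))      # T(a1 v1 + a2 v2)
rhs = add(scale(a1, T(v1)), scale(a2, T(v2)))   # a1 T(v1) + a2 T(v2)
print(lhs == rhs)  # True
```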
5.1.1 Preliminaries
A mapping f : A → B is said to be one-to-one (or 1-1 or injective) if different elements of A
have distinct images; that is, if f (a) = f (a′ ) implies a = a′ .
CHAPTER 5. LINEAR MAPPINGS ON VECTOR SPACES
Figure 5.1: a) One-to-one (Injective) b) Onto (Surjective) c) Neither 1-1 nor Onto
The mapping T : V → V defined by T(v)=v, that is, the function that assigns to each element
in V, itself, is called the identity mapping. It is usually denoted by I . Thus, for any v ∈ V , we
have I(v) = v .
T is not linear
= T (u) + kT (a, b, 0)
= kT (u)
T is linear
Example 5.2.2. Please use the following two examples in Exercises 5.2 which follow
Exercises 5.2
ImT = {T (v)|v ∈ V }
Note
Both kernel and image are sets of vectors. The kernel is the set of (input) vectors from the
domain of T, and the image is the set of all functional values (output vectors) in the range of
T.
Example 5.3.1. Consider T : R3 → R3 , the projection of a vector v into the xy-plane,
that is,
T (x, y, z) = (x, y, 0)
Clearly the image of T is the entire xy-plane, that is, points (or vectors) of the form (x, y, 0).
Moreover, the kernel of T is the z-axis, that is, points (or vectors) of the form (0, 0, c). That
is,
Theorem 5.4.1. Let V be of finite dimension, and let T : V → U be a linear mapping. Then
dimV = dim(Ker T ) + dim(Im T ) = nullity(T ) + rank(T )
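Theorem 5.4.1 can be sanity-checked on the projection T(x, y, z) = (x, y, 0) from Example 5.3.1. The counting shortcut below works only because this particular T maps each standard basis vector either to itself or to zero; it is an illustration, not a general algorithm:

```python
# Rank-nullity check for the projection onto the xy-plane:
# Ker T is the z-axis (dim 1), Im T is the xy-plane (dim 2), 1 + 2 = 3.
def T(v):
    x, y, z = v
    return (x, y, 0)

std = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
in_kernel = [e for e in std if T(e) == (0, 0, 0)]
images = [T(e) for e in std if T(e) != (0, 0, 0)]

nullity = len(in_kernel)    # 1: the z-axis direction
rank = len(images)          # 2: (1,0,0) and (0,1,0) are independent
print(rank + nullity == 3)  # True: dim V = rank(T) + nullity(T)
```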
Solution
(1, 0, 0, 0) → (1, 1, 0)
(0, 1, 0, 0) → (−1, 2, 3)
(0, 0, 0, 1) → (1, 1, 0)
then we form the matrix with the generators of R3 as rows and row reduce
S = [ 1  1  0 ]   [ 1  1  0 ]
    [−1  2  3 ] ∼ [ 0  1  1 ]
    [ 1 −1 −2 ]   [ 0  0  0 ]
Thus, (1, 1, 0) and (0, 1, 1) form a basis of ImT . Hence, dim(ImT ) = 2 and rank(T ) =
2.
Set corresponding components equal to each other to form the following homogeneous sys-
tem
α1 − α2 + α3 + α4 = 0
α1 + 2α2 − α3 + α4 = 0
3α2 − 2α3 =0
Row reducing the coefficient matrix gives the echelon rows (1, −1, 1, 1), (0, 3, −2, 0), (0, 0, 0, 0).
Note that the transpose of the matrix of the generators (normal basis) of R4 is called the
matrix of the linear transformation with respect to the standard basis ei , i.e
T (α1 , α2 , α3 , α4 ) = (α1 − α2 + α3 + α4 , α1 + 2α2 − α3 + α4 , 3α2 − 2α3 ).
can be written as T (α1 , α2 , α3 , α4 ) = Te (α1 , α2 , α3 , α4 )^T , where

     [ 1 −1  1  1 ]                              [ α1 ]
Te = [ 1  2 −1  1 ]  and  (α1 , α2 , α3 , α4 )^T = [ α2 ]
     [ 0  3 −2  0 ]                              [ α3 ]
                                                 [ α4 ]
T : R2 → R2 , defined by
T (x, y) = (2y, 3x − y)
T (1, 0) = (0, 3),  T (0, 1) = (2, −1)

[T ]std = [ 0  2 ]  is the matrix we would usually use to represent T by matrix multiplication.
          [ 3 −1 ]
We call [T ]std the matrix representation of T relative to the standard basis of R2 .
When we use the standard basis, the rows of [T ]std are the coefficients of x, y in the
components of T (x, y).
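Equivalently, the images of the standard basis vectors form the columns of [T]std. A sketch (our own, not from the notes) for T(x, y) = (2y, 3x − y):

```python
# Build [T]_std column by column from T applied to the standard basis.
def T(v):
    x, y = v
    return (2*y, 3*x - y)

cols = [T(e) for e in [(1, 0), (0, 1)]]    # (0, 3) and (2, -1)
T_std = [list(row) for row in zip(*cols)]  # transpose: columns -> rows
print(T_std)  # [[0, 2], [3, -1]]

def apply(M, v):  # matrix-vector product reproduces T
    return tuple(sum(a*b for a, b in zip(row, v)) for row in M)
print(apply(T_std, (1, 2)) == T((1, 2)))  # True
```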
Example 5.5.2. Let V be the vector space of functions with basis S = (sin t, cos t, e^{3t} ),
and let D : V → V be the differential operator defined by D(f (t)) = df (t)/dt.
The matrix representing D in the basis S is:

       [ 0 −1  0 ]
[D]S = [ 1  0  0 ]
       [ 0  0  3 ]

Note that the coordinates of D(sin t), D(cos t), D(e^{3t}) form the columns, not the rows, of [D]S .
Example 5.5.3. Now consider a new basis S = {(1, 3), (2, 5)} for our earlier linear trans-
formation T : R2 → R2 , defined by T (x, y) = (2y, 3x − y)
We consider the images of these basis vectors, and write them as linear combinations of
the given basis vectors.
gives
2) Write the image of u1 , i.e. v = (1, 0, 3) as a linear combination of the three basis vectors
(1, 0, 3) = α(1, 0, 1) + β(−2, 1, 1) + γ(1, −1, 1), or
(1, 0, 3) = (α − 2β + γ, 0α + β − γ, α + β + γ)
and
solve the system
α − 2β + γ = 1
0α + β − γ = 0 for α, β and γ
α+ β+γ =3
[ 1 −2  1 | 1 ]                 [ 1 −2  1 | 1 ]                  [ 1 −2  1 | 1 ]
[ 0  1 −1 | 0 ]  R3 → R3 − R1   [ 0  1 −1 | 0 ]  R3 → R3 − 3R2   [ 0  1 −1 | 0 ]
[ 1  1  1 | 3 ]  ─────────────> [ 0  3  0 | 2 ]  ──────────────> [ 0  0  3 | 2 ]

R1 → R1 + 2R2   [ 1  0 −1 | 1 ]  R2 → 3R2 + R3   [ 3  0  0 | 5 ]
─────────────>  [ 0  1 −1 | 0 ]  R1 → 3R1 + R3   [ 0  3  0 | 2 ]
                [ 0  0  3 | 2 ]  ─────────────>  [ 0  0  3 | 2 ]

giving
(1, 0, 3) = (5/3)(1, 0, 1) + (2/3)(−2, 1, 1) + (2/3)(1, −1, 1)
Working similarly, we find the images of u2 = (−2, 1, 1) and u3 = (1, −1, 1) under the
transformation, and write each image as a linear combination of the three basis vectors:
(−3, −1, −2) = −(11/3)(1, 0, 1) + (1/3)(−2, 1, 1) + (4/3)(1, −1, 1)
(2, −2, 2) = (12/3)(1, 0, 1) + (0/3)(−2, 1, 1) + (6/3)(1, −1, 1)
and

[T ]S = (1/3) [ 5 −11 12 ]
              [ 2   1   0 ]
              [ 2   4   6 ]
and P = [ −8 −11 ]
        [  3   4 ]

To find the change-of-basis matrix Q from the "new" basis S′ to the "old" basis S,
we write each of the old basis vectors u1 and u2 of S as a linear combination of the basis
vectors v1 and v2 of S′.
We have
u1 = αv1 + βv2 , i.e. (1, 2) = α(1, −1) + β(1, −2) = (α + β, −α − 2β)
u2 = αv1 + βv2 , i.e. (3, 5) = α(1, −1) + β(1, −2) = (α + β, −α − 2β)
Solving each system gives
u1 = 4v1 − 3v2
u2 = 11v1 − 8v2

and Q = [  4  11 ]
        [ −3  −8 ]
Q is the inverse of P, so we can find Q by forming the matrix M = (P : I) and row reducing
M to row canonical form:
[ −8 −11 | 1 0 ]  R2 → 8R2 + 3R1   [ −8 −11 | 1 0 ]
[  3   4 | 0 1 ]  ───────────────> [  0  −1 | 3 8 ]

R1 → R1 − 11R2   [ −8  0 | −32 −88 ]
──────────────>  [  0 −1 |   3   8 ]

R1 → −(1/8)R1   [ 1 0 |  4  11 ]
R2 → −R2        [ 0 1 | −3  −8 ]

Thus Q = P^{−1} = [  4  11 ]
                  [ −3  −8 ]
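The Gauss-Jordan step above generalises directly. A sketch (our own, exact arithmetic via `fractions`) that row reduces (P | I) to (I | P⁻¹) and confirms Q:

```python
from fractions import Fraction

# Invert a matrix by row reducing the augmented matrix (P | I).
def inverse(P):
    n = len(P)
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(P)]
    for c in range(n):
        pr = next(i for i in range(c, n) if M[i][c] != 0)  # pivot row
        M[c], M[pr] = M[pr], M[c]
        M[c] = [x / M[c][c] for x in M[c]]                 # scale pivot to 1
        for i in range(n):
            if i != c:                                     # clear the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]

P = [[-8, -11], [3, 4]]
print(inverse(P) == [[4, 11], [-3, -8]])  # True: Q = P^{-1}
```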
A given linear transformation can be represented by matrices with respect to many choices
of bases for the domain and range. Finding the matrix of a linear transformation relative to
a given basis, e.g. [T ]std turns out to be easy, whereas finding the matrix of T relative to other
bases is more difficult. Here’s how you can use change-of-basis matrices to make things
simpler
Suppose you have a linear transformation T : U → V with bases S and S′ for U and V respec-
tively, and you want the matrix representation of T relative to these bases, i.e. [T ]^{S′}_S .
Find
1) [T ]^std_std , the matrix representation of T relative to the standard bases
2) [T ]^std_S and [T ]^std_{S′} , the change-of-basis matrices
(basis elements written in terms of the standard basis, used as the columns of the matrix)
3) [T ]^{S′}_std = ([T ]^std_{S′} )^{−1}
Then [T ]^{S′}_S = [T ]^{S′}_std [T ]^std_std [T ]^std_S
1 1 −2 1 0 1
Therefore
4 −1 −2 1 1 −1 " # 7 7
0
2 1
[T ]SS = −2 1 1 1 2 0 = −2 −3
1 1
1 0 1 1 1 −2 2 2
Schematically: U ⇒ R2 via A, V ⇒ R3 via B, and B^{−1} [T ] A is the matrix of T relative to these bases.
We calculate
i) [T ]^std_S from
(1, 2) = α(1, 0) + β(0, 1) = 1(1, 0) + 2(0, 1)
(3, 5) = α(1, 0) + β(0, 1) = 3(1, 0) + 5(0, 1)

Whence [T ]^std_S = [ 1 3 ]
                    [ 2 5 ]
ii) [T ]^std_{S′} from
(1, −1) = α(1, 0) + β(0, 1) = 1(1, 0) − 1(0, 1)
(1, −2) = α(1, 0) + β(0, 1) = 1(1, 0) − 2(0, 1)

Whence [T ]^std_{S′} = [  1  1 ]
                       [ −1 −2 ]
Then

[T ]^{S′}_S = [  1  1 ]^{−1} [ 1 0 ] [ 1 3 ]  =  [  4  11 ]
              [ −1 −2 ]      [ 0 1 ] [ 2 5 ]     [ −3  −8 ]
Exercises 5.5
3. If {u1 , u2 } and {v1 , v2 , v3 } are R-bases for V2 (R) and V3 (R) respectively and if a linear
transformation T from V2 (R) into V3 (R) is defined by
T u1 = v1 + 2v2 − v3
T u2 = v1 − v2
find the matrix of T relative to these bases. Find also the matrix of T relative to the R-bases
{−u1 + u2 , 2u1 − u2 } and {v1 , v1 + v2 , v1 + v2 + v3 } for V2 (R) and V3 (R) respectively.
What is the relationship between these two matrices?
6. Find the rank and nullity of a linear transformation from V4 (R) to V3 (R) defined by
T (α, β, γ, δ) = (α − γ + 2δ, 2α + β + 2γ, β + 4γ)
Is h(x) = 2g(x) + 3 injective?
Take x, y ∈ R and assume that h(x) = h(y), i.e. 2g(x) + 3 = 2g(y) + 3. We can cancel the 3 and divide by 2, and we get
g(x) = g(y).
Since g is bijective it is injective, so x = y. Hence h is injective.
Is h surjective?
Take some y ∈ R; we want to find x such that y = h(x), that is, y = 2g(x) + 3.
Subtracting 3 and dividing by 2, this says g(x) = (y − 3)/2.
Let w = (y − 3)/2; since g is surjective there is some x such that g(x) = w. Then h(x) = 2w + 3 = y, as required.
Example 5.7.2. Please use the following two examples in Exercises 5.7 which follow
Solution
Hence, F is nonsingular.
To find G−1 :
Method 1
G−1 = (2x − y, x − y)
Method 2
Find the matrix representation of G relative to the given basis (in this case, the standard basis):

[G] = [ 1 −1 ]  and the inverse  [G]^{−1} = [ 2 −1 ]
      [ 1 −2 ]                              [ 1 −1 ]

Then G^{−1}(x, y) = [ 2 −1 ] [ x ] = (2x − y, x − y)
                    [ 1 −1 ] [ y ]
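We can confirm the rule for G⁻¹ numerically. Here we assume G(x, y) = (x − y, x − 2y), which is the map consistent with the matrix [G] above (the original definition of G is not shown in this excerpt):

```python
# Check that the rule found for G^{-1} really inverts G.
def G(v):
    x, y = v
    return (x - y, x - 2*y)     # assumed from [G] = [[1, -1], [1, -2]]

def G_inv(v):
    x, y = v
    return (2*x - y, x - y)     # the rule derived above

for v in [(1, 0), (0, 1), (3, -7), (5, 2)]:
    assert G_inv(G(v)) == v and G(G_inv(v)) == v
print("G_inv inverts G")        # printed only if every check passes
```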
Exercises 5.7
1. Show that
a) the linear transformations defined below are nonsingular, and
b) give a rule for T^{−1} like the one which defines T
i) T : R2 → R3 defined by T (x, y) = (x + y, x − 2y, 3x + y).
ii) T : R3 → R3 defined by T (α, β, γ) = (3α − β, α − β + γ, −α + 2β − γ)
2. If {v1 , v2 , v3 , v4 } is an R-basis for the vector space V, for what values of λ is the linear trans-
formation defined by
T v1 = v1 + λv2
T vi = 2vi−1 + λvi (i=2,3,4)
nonsingular?
G(x, y, z) = (x − z, y).
a) F + G, b) 3F and c) 2F - 5G.
Solution
The collection of all linear mappings from V into U with the above operations of addition
and scalar multiplication forms a vector space over K, denoted by Hom(V, U) (note: "Hom"
stands for "homomorphism").
Solution
b) The mapping F ◦ G is not defined, because the image of G is not contained in the do-
main of F.
2. Ability to:
6.1 Introduction
There are further concepts in the structure of vector spaces that did not appear in our
investigation in chapter 4, such as "length", "angle" between two vectors, "orthogonality" of
vectors, etc. (although some of these concepts did appear in chapter 4, sections 4.1, 4.2 and
4.4). Here we place an additional structure on a vector space V to obtain an inner product
space.
Also, we will adopt the notation used for vector spaces in chapter 4, i.e.:
u, v, w are vectors in V
a, b, c, k are scalars in K
Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise
stated or implied
6.2 Preliminaries
Recall that:
a) Referring to fig. 6.1a), the length or norm of a vector v = (α, β, γ) ∈ V3 (R), denoted ‖v‖,
is defined as ‖v‖ = √(OR² + PR²) = √(OQ² + QR² + PR²) = √(α² + β² + γ²)
and
i) if λ ∈ K , ‖λv‖ = |λ|‖v‖
ii) if ‖v‖ = 1, v is called a unit vector (normalised);
every nonzero vector u can be normalised by setting v = u/‖u‖
iii) ‖v‖ = 0 iff v = 0.
b)
If u and v are two vectors in V3 , represented by the points P and Q respectively (fig. 6.1b),
the distance between the two vectors u, v, denoted d(u, v), is the distance between P and Q, i.e.
length PQ = length OP′,
where P′ is the point (α − α′, β − β′, γ − γ′). Then
d(u, v) = ‖(α − α′, β − β′, γ − γ′)‖ = √((α − α′)² + (β − β′)² + (γ − γ′)²) = ‖u − v‖
CHAPTER 6. INNER PRODUCT SPACES
Note
i) ‖v‖ = √(v, v)
ii) Two vectors u, v are perpendicular, or orthogonal, if cos θ = 0
Exercises 6.2
i) (1, 2, 1)
ii) (-3, 2, 5)
4. Find the angle between each pair of the following pairs of vectors
i) (3, -2, 1) and (1, -1, 1) ii) (2, 1, -1) and (1, 0, 2)
We can now state the properties of the inner product defined above, as a basis for the
definition of inner product spaces.
The vector space V with an inner product is called an inner product space. A real inner
product space is called a Euclidean space. A complex inner product space is called a unitary
space.
Henceforth and unless specified otherwise, V will denote an inner product space.
Example 6.4.1. Please use the following two examples in Exercises 6.4 which follow
Solution
i) Method 1
Let w = (z1 , z2 )
u + w = (x1 + z1 , x2 + z2 )
= x1 y1 + z1 y1 − x2 y1 − z2 y1 − x1 y2 − z1 y2 + 3x2 y2 + 3z2 y2
= x1 y1 − x2 y1 − x1 y2 + 3x2 y2 + z1 y1 − z2 y1 − z1 y2 + 3z2 y2
= (u, v) + (w, v)
(u, u) = x1 x1 − x2 x1 − x1 x2 + 3x2 x2
= x1² − 2x1 x2 + x2² + 2x2² = (x1 − x2 )² + 2x2² ≥ 0, with equality only when u = 0
ii) Method 2
Because A is real and symmetric (u, v) = (v, u), we need only show that A is positive def-
inite. The diagonal elements 1 and 3 are positive, and the determinant |A| = 3−1 = 2 is
positive. A is positive definite (refer chapter 7, section 7.8.5).
Solution
The two functions differ in the terms: e.g. α2 β1 is negative in (u, v), but positive in (v, u).
Exercises 6.4
3. Which of the following are inner products on C[−1, 1], the vector space of real valued con-
tinuous functions defined on [−1, 1], where f, g ∈ C[−1, 1]?
i) (f, g) = ∫_{−1}^{1} f (x)g(x) dx
ii) (f, g) = ∫_{−1}^{1} x² f (x)g(x) dx
4. a) Verify that the following is an inner product on R2 , where u = (x1 , x2 ) and v = (y1 , y2 )
(u, v) = x1 y1 − 2x1 y2 − 2x2 y1 + 5x2 y2
b) Consider the vectors u = (1, −3) and v = (2, 5) in R2
Find
i) (u, v) with respect to the standard inner product in R2
ii) (u, v) with respect to the inner product in R2 in a) above
5. Show that each of the following is not an inner product on R3 , where u = (x1 , x2 , x3 ) and
v = (y1 , y2 , y3 ):
a) (u, v) = x1 y1 + x2 y2
b) (u, v) = x1 y2 x3 + y1 x2 y3 .
6. Find the values of k so that the following is an inner product on R2 , where u = (x1 , x2 )
and v = (y1 , y2 )
Solution
(u, v) = 2·1 + 1·0 + (−1)·2 = 0
Conclusion: u and v are orthogonal.
Definition 6.5.1. If u, v ∈ V and (u, v) = 0, then u and v are said to be orthogonal (or
perpendicular) to each other. A subset S of V is called an orthogonal set if the elements of S
are mutually orthogonal. An orthogonal set is called an orthonormal set if each vector has
unit length, i.e. if kvk = 1.
Example 6.5.1. 1) Refer to section 4.9 (page 39) for examples of bases.
The standard bases for Vn (R) and Vn (C) are orthonormal relative to the standard inner
product.
2) Exercise
In V4 (R) with standard inner product, find the vectors orthogonal to u = (1, 1, 2, −1)
Solution
Let v = (α1 , α2 , α3 , α4 ). Then
(u, v) = (1, 1, 2, −1) · (α1 , α2 , α3 , α4 ) = 0, i.e.
α1 + α2 + 2α3 − α4 = 0
The set of all vectors v orthogonal to u is given by all the solutions to this linear equation;
e.g.
{(−1, 1, 0, 0), (0, −2, 1, 0), (0, 1, 0, 1)} is an R-basis for the solution space of the linear equa-
tion, and all R-linear combinations of these vectors are orthogonal to u.
Lemma [Link]. An orthogonal set of nonzero vectors in an inner product space V is linearly
independent.
Given an arbitrary basis of an inner product space V, is it possible to find an orthonormal ba-
sis {u1 , u2 , . . . , un } of V?
Answer
Consider the set {v1 = (1, 1, 1), v2 = (0, 1, 1), v3 = (0, 0, 1)}
u1 = v1 /‖v1 ‖ = (1, 1, 1)/√3 = (1/√3, 1/√3, 1/√3)
Next we set w2 = v2 − (v2 , u1 )u1
= (0, 1, 1) − (2/√3)(1/√3, 1/√3, 1/√3) = (−2/3, 1/3, 1/3)
We then normalise w2 , i.e.
u2 = w2 /‖w2 ‖ = (−2/√6, 1/√6, 1/√6)
Finally we set w3 = v3 − (v3 , u1 )u1 − (v3 , u2 )u2
= (0, 0, 1) − (1/√3)(1/√3, 1/√3, 1/√3) − (1/√6)(−2/√6, 1/√6, 1/√6) = (0, −1/2, 1/2)
Then we normalise w3 :
u3 = w3 /‖w3 ‖ = (0, −1/√2, 1/√2)
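The computation above is the Gram-Schmidt process. A short sketch of it (our own, in floating point), run on the same basis {(1,1,1), (0,1,1), (0,0,1)}:

```python
from math import sqrt

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

# Gram-Schmidt: subtract projections onto earlier u's, then normalise.
def gram_schmidt(vs):
    us = []
    for v in vs:
        w = list(v)
        for u in us:
            c = dot(v, u)
            w = [wi - c*ui for wi, ui in zip(w, u)]
        norm = sqrt(dot(w, w))
        us.append([wi / norm for wi in w])
    return us

us = gram_schmidt([(1, 1, 1), (0, 1, 1), (0, 0, 1)])
# us[2] should be (0, -1/sqrt(2), 1/sqrt(2)), as in the worked example
print(all(abs(dot(us[i], us[j]) - (i == j)) < 1e-12
          for i in range(3) for j in range(3)))  # True: orthonormal
```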
Theorem 6.6.2. Let {v1 , v2 , . . . , vn } be an arbitrary basis of an inner product space V.
Then there exists an orthonormal basis {u1 , u2 , . . . , un } of V such that the transition matrix
from the vi to the ui is triangular, i.e. for i = 1, 2, . . . , n, ui = αi1 v1 + αi2 v2 + · · · + αii vi .
Proof. Set u1 = v1 /‖v1 ‖; then {u1 } is normalised.
Next set w2 = v2 − (v2 , u1 )u1 and u2 = w2 /‖w2 ‖;
by lemma [Link], w2 (and hence u2 ) is orthogonal to u1 , so {u1 , u2 } is orthonormal.
Next set w3 = v3 − (v3 , u2 )u2 − (v3 , u1 )u1 and u3 = w3 /‖w3 ‖;
by lemma [Link], w3 (and so u3 ) is orthogonal to u1 , u2 , so {u1 , u2 , u3 } is orthonormal.
In general, after getting {u1 , u2 , . . . , ui },
set wi+1 = vi+1 − (vi+1 , ui )ui − (vi+1 , ui−1 )ui−1 − · · · − (vi+1 , u1 )u1 and ui+1 = wi+1 /‖wi+1 ‖.
Note that wi+1 ≠ 0 since vi+1 ∉ L(v1 , v2 , . . . , vi ).
As above, the set {u1 , u2 , . . . , ui+1 } is orthonormal.
Example 6.6.1. Extend the orthonormal set {v1 = (1/3)(2, 0, −1, 2), v2 = (1/3)(2, 1, 0, −2)} to
give an orthonormal basis for V4 (R)
Solution
{v1 = (1/3)(2, 0, −1, 2), v2 = (1/3)(2, 1, 0, −2), v3 = (1, 0, 0, 0), v4 = (0, 0, 0, 1)} is an R-basis
for V4 (R)
We set u1 = v1 and u2 = v2 (they are already orthonormal).
Set w3 = v3 − (v3 , u2 )u2 − (v3 , u1 )u1 = (1/9)(1, −2, 2, 0)
w4 = v4 − (v4 , u3 )u3 − (v4 , u2 )u2 − (v4 , u1 )u1 = (1/9)(0, 2, 2, 1)
Normalising, {(1/3)(2, 0, −1, 2), (1/3)(2, 1, 0, −2), (1/3)(1, −2, 2, 0), (1/3)(0, 2, 2, 1)} is an orthonormal basis for V4 (R).
Exercises 6.6
4. Let V be the subspace of C[0,1] containing real polynomials of degree at most 3. Apply
the Gram–Schmidt orthogonalization process to the R-basis {1, x, x2 , x3 } for V.
2) Ability to:
ii) use eigenvectors and eigenvalues to diagonalise orthogonal and unitary matrices
Recall from section 2.6, page 11, that we can form polynomials in the matrix A: given
f (x) = a0 + a1 x + ... + an xⁿ , where the ai are scalars,
we define f (A) to be the matrix f (A) = a0 I + a1 A + ... + an Aⁿ
In the case where f (A) is the zero matrix, A is called a zero or root of the polynomial
f (x).
Now suppose that T : V → V is a linear operator on a vector space V. We can define f(T) in
the same way as we did for matrices:
f (T ) = an T n + . . . + a1 T + a0 I , where I is now the identity mapping.
We also say that T is a zero or root of f (t) if f (T ) = 0; the zero mapping.
Find f (T )(x, y)
Solution
In general, linear transformations are determined by what they do to a basis. The theory of
eigenvectors and eigenvalues helps us understand to what extent (and how) a linear
transformation can be understood as a scaling in various directions.
CHAPTER 7. CHARACTERISTIC ROOTS AND VECTORS
Note
a) each eigenvector has a unique eigenvalue associated to it.
b) each eigenvalue has more than one eigenvector associated to it.
c) eigenvector (German: eigen = own): a vector which keeps its own direction when acted
upon by T. Other names: principal value, proper value, characteristic value.
The set of all vectors v (including 0) such that T v = λv is called the eigenspace of T
corresponding to λ.
A scalar λ is an eigenvalue of an nxn matrix A when det(A − λI) = 0.
Recall: for a square matrix A − λI , dim(ker(A − λI)) ≠ 0 if and only if A − λI is not invertible, i.e.
if and only if its determinant is 0.
Thus to find the eigenvalues of A, we are looking for scalars λ such that
det(A − λI) = 0
det(A − λI) will always be a polynomial of degree n in λ.
         [ a11 − λ   a12      . . .  a1n     ]
A − λI = [ a21       a22 − λ  . . .  a2n     ]
         [ . . .                             ]
         [ an1       an2      . . .  ann − λ ]
Its determinant, ∆(λ) = det(A − λI), which is a polynomial in λ, is called the characteristic
polynomial of the matrix A.
We also call ∆(λ) = det(A − λI) = 0 the characteristic equation of A.
Once we know the eigenvalues of a matrix A, we can find bases for the kernel, ker(A−λI).
ker(A − λI) is called the eigenspace corresponding to the eigenvalue λ.
Eigenvalues tell us that the linear transformation is scaling by the amount λ. Eigenvectors
tell us where the scalings are done.
For the above example, we want bases for the eigenspaces for ker(A−2I) and ker(A+
I).
We set

E2 : (A − 2I)X = [ 0  3 ] [ x ] = 0,  or  { 0x + 3y = 0
                 [ 0 −3 ] [ y ]           { 0x − 3y = 0

so y = 0, and { (1, 0) } is a basis for the eigenspace E2 corresponding to the eigenvalue λ = 2.

E−1 : (A + I)X = [ 3  3 ] [ x ] = 0,  or  3x + 3y = 0,
                 [ 0  0 ] [ y ]

so { (−1, 1) } is a basis for the eigenspace E−1 corresponding to the eigenvalue λ = −1.
2) Let B = [ 3 0 −5 ],  det(B − λI) = | 3−λ   0    −5  |
           [ 0 3  0 ]                 |  0   3−λ    0  | = (3 − λ)²(0 − λ)
           [ 0 0  0 ]                 |  0    0   0−λ  |

and this polynomial has zeros λ = 3, with algebraic multiplicity 2, and λ = 0 with alge-
braic multiplicity 1.
E3 : (B − 3I)X = [ 0 0 −5 ] [ x ]   [ 0 0 1 ] [ x ]
                 [ 0 0  0 ] [ y ] ∼ [ 0 0 0 ] [ y ]  or  0x + 0y + 1z = 0
                 [ 0 0 −3 ] [ z ]   [ 0 0 0 ] [ z ]

This gives z = 0, with x and y as free variables, so we take x = 1, y = 0 and x = 0, y = 1:

E3 = { (1, 0, 0), (0, 1, 0) }

E0 : (B − 0I)X ∼ [ 1 0 −5/3 ] [ x ]
                 [ 0 1   0  ] [ y ]   E0 = { (5/3, 0, 1) }
                 [ 0 0   0  ] [ z ]
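A quick numeric check (our own, not from the notes) that each basis vector above really is an eigenvector of B for the claimed eigenvalue, i.e. that Bv = λv:

```python
from fractions import Fraction

B = [[3, 0, -5], [0, 3, 0], [0, 0, 0]]

def matvec(M, v):
    return [sum(a*b for a, b in zip(row, v)) for row in M]

# (eigenvalue, eigenvector) pairs read off the eigenspaces E3 and E0
pairs = [(3, [1, 0, 0]), (3, [0, 1, 0]), (0, [Fraction(5, 3), 0, 1])]
print(all(matvec(B, v) == [lam * x for x in v] for lam, v in pairs))  # True
```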
Note
λ = 1 with algebraic multiplicity 2, but

C − I = [ 0 1 ]   and   E1 = { (1, 0) }
        [ 0 0 ]
Theorem 7.1.3. Nonzero eigenvectors belonging to distinct eigenvalues are linearly inde-
pendent.
Exercises 7.1
2. a) True or False:
If A and B are similar matrices, say B = P −1 AP , where P is invertible, then A and B have
the same characteristic polynomial
b) matrices A and B can have different characteristic polynomials (and so be nonsimilar
matrices) but may have the same minimal polynomial.
They must be similar ’to some degree’ because they represent the same linear
transformation, only with respect to different bases.
It is intuitive that they have the same: rank, nullity, determinant, eigenvalues.
We shall say that nxn matrices A and B are similar if they represent the same linear transformation with respect to different bases.
Question
When is a square matrix diagonalisable?
Answer
An nxn matrix A is diagonalisable if and only if there is a basis R consisting of the
eigenvectors of A.
Theorem 7.3.2. An n-square matrix A is similar to a diagonal matrix B if and only if A has n
linearly independent eigenvectors. In this case the diagonal elements of B are the
corresponding eigenvalues.
Exercises 7.3
1. [ 1 0 −1 ]
   [ 1 2  1 ]
   [ 2 2  3 ]

2. [ 1 −1 −1 ]
   [ 1 −1  0 ]
   [ 1  0 −1 ]

3. A = [ 2 1 0 0 ]
       [ 2 1 0 0 ]
       [ 0 0 0 3 ]
       [ 2 1 3 0 ]
If a matrix A has distinct eigenvalues, then the minimum and characteristic polynomial of A
coincide. A matrix A is diagonalisable if and only if its minimum polynomial factors into
distinct linear factors.
We saw in section 7.1, page 81, that the characteristic equation of A is (3 − λ)²(−λ) = 0,
Exercises 7.4.1
2. If a matrix A has the characteristic polynomial (λ − 1)³(λ + 1)²(λ − 3), find all the
possible minimum polynomials of A.
3. By calculating the minimum polynomial, determine which of the following matrices are di-
agonalisable

i) [ 4 −1 ]   ii) [ 1 −1 −1 ]   iii) [  2  0 −1 ]   iv) [ 0 0 3 ]
   [ 1  2 ]       [ 0  3  2 ]        [ −1  2  2 ]       [ 0 4 0 ]
                  [ 0 −1  0 ]        [  1 −1 −1 ]       [ 3 0 0 ]

v) [ 0 0 1 ]   vi) [ 1/2 0 1/2 ]
   [ 0 1 0 ]       [  0  1  0  ]
   [ 1 0 0 ]       [ 1/2 0 1/2 ]
Example 7.5.1. Determine A∗ for the matrix A = [ 3 + 7i    0   ]
                                               [   2i    4 − i ]
Solution

Ā = [ 3 − 7i    0   ]   and so   A∗ = Ā^T = [ 3 − 7i   −2i  ]
    [  −2i    4 + i ]                       [   0     4 + i ]
D = [ −1  2  3 ]
    [  2  0 −1 ]  is Hermitian. All real symmetric matrices are Hermitian.
    [  3 −1  4 ]
Exercises 7.6
3. ...exists an orthogonal matrix P, such that P^T AP is diagonal; ...exists a unitary matrix U, such that U∗AU is diagonal
Solution
Step 1
Step 2
Step 3
normalise {v1 , v2 } to get P = (1/√2) [ 1 −1 ]
                                       [ 1  1 ]
2) Let D = [ 2 i ] . Verify that D is normal. Find a unitary matrix P such that P∗DP is di-
           [ i 2 ]
agonal. Find P∗DP .
Solution
Step 1
λ = 2 + i, 2 − i
Step 2
Step 3
Exercises 7.7
2. True or False:
Every symmetric matrix is diagonalizable.
2) q(x, y, z) = x2 + 3xy − 4xz − 15z 2 is a quadratic form because the degree of each
term is 2.
Note
Example 7.8.3. Find the quadratic form associated with the matrix

A = [ −3 −5 ]
    [ −5  4 ]

Solution

Let X = [ x ],  X^T = ( x  y )
        [ y ]

then X^T AX = ( x  y ) [ −3 −5 ] [ x ] = −3x² − 10xy + 4y²
                       [ −5  4 ] [ y ]
Note
• in both examples above, A is a symmetric matrix, i.e. A = AT ;
• the general quadratic form is X^T AX = ( x  y ) [ a h ] [ x ] = ax² + 2hxy + by² ;
                                                  [ h b ] [ y ]
Example 7.8.4. Find the quadratic form in 3 variables x,y,z associated with the matrix
A = [ 1 4 7 ]
    [ 4 2 5 ]
    [ 7 5 3 ]
Solution
Let X^T = (x, y, z)

X^T AX = ( x  y  z ) [ 1 4 7 ] [ x ]
                     [ 4 2 5 ] [ y ] = x² + 8xy + 14zx + 2y² + 10zy + 3z²
                     [ 7 5 3 ] [ z ]
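A numeric spot check (our own, not from the notes) that the matrix product X^T A X really reproduces the polynomial just computed:

```python
# Compare X^T A X, expanded as a double sum, against the polynomial above.
A = [[1, 4, 7], [4, 2, 5], [7, 5, 3]]

def quad_form(A, X):
    return sum(A[i][j] * X[i] * X[j] for i in range(3) for j in range(3))

def poly(x, y, z):
    return x*x + 8*x*y + 14*z*x + 2*y*y + 10*z*y + 3*z*z

for X in [(1, 0, 0), (1, 2, 3), (-2, 5, 1)]:
    assert quad_form(A, X) == poly(*X)
print("X^T A X matches the polynomial")
```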
X^T AX = ( x  y  z ) [ a d e ] [ x ]
                     [ d b f ] [ y ]
                     [ e f c ] [ z ]

= ( ax + dy + ez,  dx + by + f z,  ex + f y + cz ) [ x ]
                                                   [ y ]
                                                   [ z ]

= ax² + dyx + ezx + dxy + by² + f zy + exz + f yz + cz²
Solution
• there are no xy, yz and xz terms (the remaining entries in matrix = 0).
x² + y² + z² = ( x  y  z ) [ 1 0 0 ] [ x ]
                           [ 0 1 0 ] [ y ] = ( x  y  z ) I ( x  y  z )^T
                           [ 0 0 1 ] [ z ]
Answer: Recall that a symmetric matrix is orthogonally diagonalizable, i.e. there exists an
orthogonal matrix P such that P^T AP is diagonal.
Question
How can we do this?
Answer
Let X = P Y , where X = (x1 , x2 ), Y = (y1 , y2 )
Let D be the diagonal matrix with leading diagonal entries being the eigenvalues of A.
Then X T AX = Y T DY
Here's why:
X^T AX = (P Y )^T A(P Y )
= Y^T P^T AP Y
= Y^T (P^T AP )Y
= Y^T DY
Solution
det(A - λI) = \begin{vmatrix} 10-λ & -4 \\ -4 & 4-λ \end{vmatrix} = (10-λ)(4-λ) - 16 = λ^2 - 14λ + 24 = 0
λ = 2, 12
E_2: (A - 2I)X = \begin{pmatrix} 8 & -4 \\ -4 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, i.e. 8x_1 - 4x_2 = 0 and -4x_1 + 2x_2 = 0, and so 2x_1 = x_2
E_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, normalised to \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ 2 \end{pmatrix}
E_{12}: (A - 12I)X = \begin{pmatrix} -2 & -4 \\ -4 & -8 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, i.e. -x_1 - 2x_2 = 0 and so x_1 = -2x_2
E_{12} = \begin{pmatrix} 2 \\ -1 \end{pmatrix}, normalised to \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ -1 \end{pmatrix}
3) the normalised matrix P = (E_2, E_{12}) = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix}
4) the diagonal matrix D = P^T AP = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix}\begin{pmatrix} 10 & -4 \\ -4 & 4 \end{pmatrix}\frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ 2 & -1 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 12 \end{pmatrix}
5) f(Y) = Y^T DY = (y_1 \; y_2)\begin{pmatrix} 2 & 0 \\ 0 & 12 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = (2y_1 \; 12y_2)\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = 2y_1^2 + 12y_2^2
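The worked example above can be confirmed numerically (a sketch with the matrix and eigenvectors of the text):

```python
import numpy as np

# The eigenvectors (1,2) and (2,-1), normalised by 1/sqrt(5),
# should diagonalise A to diag(2, 12).
A = np.array([[10.0, -4.0],
              [-4.0, 4.0]])
P = np.array([[1.0, 2.0],
              [2.0, -1.0]]) / np.sqrt(5)

assert np.allclose(P.T @ P, np.eye(2))               # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag([2.0, 12.0]))
```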
Proof. X^T AX = Y^T DY = (y_1 \; y_2 \; \dots \; y_n)\begin{pmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & λ_n \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = λ_1 y_1^2 + λ_2 y_2^2 + \dots + λ_n y_n^2
Figure 7.1: a) Ellipse b) Hyperbola
Conics in Standard Form
and the ellipse or hyperbola is rotated. Converting Q(X) to diagonal form transforms Q(X) into standard form: Q(X) = a'_{11}x'^2 + a'_{22}y'^2 = C
Example 7.8.7. Identify and sketch the graph of the conic given by the equation 5x^2 - 6xy + 5y^2 = 8
Solution
det(A - λI) = \begin{vmatrix} 5-λ & -3 \\ -3 & 5-λ \end{vmatrix} = (5 - λ)^2 - 9 = 0 or λ^2 - 10λ + 16 = 0; λ = 2, 8
E_2: (A - 2I)X = \begin{pmatrix} 3 & -3 \\ -3 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} or 3x_1 - 3x_2 = 0 and so x_1 = x_2
Figure 7.2: Rotated Conics: a) Ellipse b) Hyperbola
" # " #
1 1
E2 = , normalised to √1
1 2 1
! ! !
−3 −3 x1 0
E8 : (A − 8λ = = or −3x1 − 3x2 = 0 and so x2 = −x1
−3 −3 x2 0
" # " #
1 1 1
E8 = , normalised to √
−1 2 −1
!
1 1 1
the normalised matrix P = (E1 , E8 ) = √
2 1 −1
Taking new axes 0x', 0y' in the directions u = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}), v = (\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}) (i.e. putting x = \frac{1}{\sqrt{2}}(x' + y'), y = \frac{1}{\sqrt{2}}(x' - y')),
the equation becomes 2x'^2 + 8y'^2 = 8 or \frac{x'^2}{4} + \frac{y'^2}{1} = 1, i.e. \frac{x'^2}{2^2} + \frac{y'^2}{1^2} = 1
This is an ellipse with semi-axis of length 2 on the x'-axis and length 1 on the y'-axis.
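A numerical sanity check of Example 7.8.7 (a sketch; the rotation x = (x'+y')/\sqrt{2}, y = (x'-y')/\sqrt{2} is the one derived above):

```python
import numpy as np

# The matrix of 5x^2 - 6xy + 5y^2 should have eigenvalues 2 and 8.
A = np.array([[5.0, -3.0],
              [-3.0, 5.0]])
assert np.allclose(np.linalg.eigvalsh(A), [2.0, 8.0])

# A point on the standard-form ellipse 2x'^2 + 8y'^2 = 8, rotated back,
# should satisfy the original equation.
xp, yp = 2.0, 0.0
x, y = (xp + yp)/np.sqrt(2), (xp - yp)/np.sqrt(2)
assert np.isclose(5*x*x - 6*x*y + 5*y*y, 8.0)
```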
Example 7.8.8. Identify and sketch the graph of the conic given by the equation 2x^2 - 2xy + 2y^2 - 2\sqrt{2}x + 4\sqrt{2}y = 8
Solution
Convert the first three terms to X^T AX, where X = \begin{pmatrix} x \\ y \end{pmatrix}
The terms -2\sqrt{2}x, 4\sqrt{2}y are not quadratic, but we can write them in matrix form as
-2\sqrt{2}x + 4\sqrt{2}y = (-2\sqrt{2}, \; 4\sqrt{2})\begin{pmatrix} x \\ y \end{pmatrix}
Let B = (-2\sqrt{2}, \; 4\sqrt{2}), then 2x^2 - 2xy + 2y^2 - 2\sqrt{2}x + 4\sqrt{2}y = 8 can be written as
X^T AX + BX = 8
det(A - λI) = \begin{vmatrix} 2-λ & -1 \\ -1 & 2-λ \end{vmatrix}; λ^2 - 4λ + 3 = 0 or λ = 1, 3
E_1: (A - I)X = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}X = 0 and x_1 = x_2; E_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
E_3: (A - 3I)X = \begin{pmatrix} -1 & -1 \\ -1 & -1 \end{pmatrix}X = 0 and x_2 = -x_1; E_3 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}
P = (E_1, E_3) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
and the equation of the conic is x'^2 + 3y'^2 + BX = 8
What is BX?
Recall that X = \begin{pmatrix} x \\ y \end{pmatrix}, but we need BX in terms of Y = \begin{pmatrix} x' \\ y' \end{pmatrix}
BX = B(P Y). Recall (the Principal Axes Theorem says) that X = P Y, where Y = \begin{pmatrix} x' \\ y' \end{pmatrix}
= (-2\sqrt{2} \;\; 4\sqrt{2})\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x' \\ y' \end{pmatrix} = (2 \;\; -6)\begin{pmatrix} x' \\ y' \end{pmatrix} = 2x' - 6y' and so
x'^2 + 3y'^2 + 2x' - 6y' = 8
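The reduction of the linear term is easy to get a sign wrong on, so a quick numerical check is worthwhile (a sketch using B and P as defined above):

```python
import numpy as np

# With B = (-2*sqrt(2), 4*sqrt(2)) and P the normalised eigenvector
# matrix, BX = B(PY) should reduce to the row vector (2, -6),
# i.e. 2x' - 6y'.
B = np.array([-2*np.sqrt(2), 4*np.sqrt(2)])
P = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)
assert np.allclose(B @ P, [2.0, -6.0])
```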
Figure 7.3: 2x^2 - 2xy + 2y^2 - 2\sqrt{2}x + 4\sqrt{2}y = 8
Figure: a) x^2 + y^2 b) x^2 - y^2 c) -x^2 - y^2
A quadratic form q is:
• positive definite if q(X) > 0, ∀X ∈ R^n, X ≠ 0
• negative definite if q(X) < 0, ∀X ∈ R^n, X ≠ 0
• positive semidefinite if q(X) ≥ 0, ∀X ∈ R^n
• negative semidefinite if q(X) ≤ 0, ∀X ∈ R^n
• indefinite if q(X) assumes both positive and negative values
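Since q(X) = Y^T DY = Σ λ_i y_i^2 after diagonalisation, the class of a form is determined by the signs of the eigenvalues of its symmetric matrix. The sketch below (a hypothetical helper, not from the text) makes that test concrete:

```python
import numpy as np

def classify(A):
    """Classify the quadratic form of symmetric A by eigenvalue signs."""
    evals = np.linalg.eigvalsh(A)
    if np.all(evals > 0):
        return "positive definite"
    if np.all(evals < 0):
        return "negative definite"
    if np.all(evals >= 0):
        return "positive semidefinite"
    if np.all(evals <= 0):
        return "negative semidefinite"
    return "indefinite"

assert classify(np.eye(2)) == "positive definite"        # x^2 + y^2
assert classify(np.diag([1.0, -1.0])) == "indefinite"    # x^2 - y^2
assert classify(-np.eye(2)) == "negative definite"       # -x^2 - y^2
```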
Exercises 7.8
3. Find the principal axes, centre and sketch the graph of the following conics
i) xy = 2
ii) 3x2 − 2y 2 + 12xy = 42
iii) 3x2 − 2y 2 + 12xy = 42
iv) 7x2 + 4y 2 − 4xy = 24
1. i) \begin{pmatrix} 4 & -3 & 3 \\ 2 & -5 & -1 \end{pmatrix}, ii) not defined (ND), iii) 3A + 4B - 2C = \begin{pmatrix} 3 & 10 & -25 & -5 \\ -4 & 7 & -2 & 10 \end{pmatrix}
2. x = 2, y = 4, z = 1, w = 3
2. x = 2, y = 4, z = 1, w = 3
3. A^2 = \begin{pmatrix} 10 & 2 \\ 3 & 7 \end{pmatrix}, a) A^3 = \begin{pmatrix} 26 & 18 \\ 27 & -1 \end{pmatrix}
i) f(A) = A^3 - 3A^2 - 2A + 4I = \begin{pmatrix} -4 & 8 \\ 12 & -16 \end{pmatrix}, ii) g(A) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
4. u = \begin{pmatrix} 3 \\ 5 \end{pmatrix}
5. i) A_{2×3}B_{3×4} = C_{2×4}, ii) A_{4×1}B_{1×2} = C_{4×2}, iii) A_{3×4}B_{3×4} = ND, iv) A_{5×2}B_{2×3} = C_{5×3}
6. a) AB = \begin{pmatrix} -1 & -8 & -10 \\ 1 & -2 & -5 \\ 9 & 22 & 15 \end{pmatrix}, BA = \begin{pmatrix} 15 & -21 \\ 10 & -3 \end{pmatrix}
b) AB = \begin{pmatrix} 6 & 1 & -3 \end{pmatrix}, BA = ND
c) AB = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, BA = \begin{pmatrix} 5 & 5 \\ -5 & -5 \end{pmatrix}
7. A^t = \begin{pmatrix} 1 & 2 & 4 \\ 0 & 3 & 4 \\ 1 & 4 & 4 \\ 0 & 5 & 4 \end{pmatrix}, i) B^t = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & -5 \\ 3 & -5 & 0 \end{pmatrix}
8. a) AA^t = \begin{pmatrix} 5 & 1 \\ 1 & 26 \end{pmatrix}, A^t A = \begin{pmatrix} 10 & -1 & 12 \\ -1 & 5 & -4 \\ 12 & -4 & 16 \end{pmatrix}
b) AA^t = \begin{pmatrix} 4 & -2 & 6 \\ -2 & 1 & -3 \\ 6 & -3 & 9 \end{pmatrix}, A^t A = (14)
c) AA^t = \begin{pmatrix} 49 & 0 & 0 \\ 0 & 49 & 0 \\ 0 & 0 & 49 \end{pmatrix}, A^t A = \begin{pmatrix} 49 & 0 & 0 \\ 0 & 49 & 0 \\ 0 & 0 & 49 \end{pmatrix}
9. A = \begin{pmatrix} x & y \\ 0 & x \end{pmatrix}, or \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}
1. i) A = \begin{pmatrix} 1 & 0 & 0 & \frac{5}{8} \\ 0 & 1 & 0 & -\frac{1}{8} \\ 0 & 0 & 1 & \frac{1}{8} \end{pmatrix}, ii) B = \begin{pmatrix} 1 & -2 & 3 & 1 \\ 0 & 1 & -5 & 6 \\ 0 & 0 & 0 & 0 \end{pmatrix}, iii) C = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 0 \end{pmatrix}
iv) D = \begin{pmatrix} 1 & -2 & 3 & -1 \\ 0 & 3 & -4 & 4 \\ 0 & 0 & 7 & -10 \end{pmatrix}, v) E = \begin{pmatrix} 1 & -1 & 2 & 1 \\ 0 & 1 & -3 & -1 \\ 0 & 0 & -4 & -1 \end{pmatrix},
vi) F = \begin{pmatrix} 1 & 0 & 1+i & 1 \\ 0 & 1 & \frac{1+i}{2} & \frac{1-i}{2} \\ 0 & 0 & 0 & 0 \end{pmatrix}
2. i) Yes, ii) No
3. True
ii) λ = 1, (\frac{1}{3}, -\frac{2}{3})
1. \frac{1}{4}\begin{pmatrix} -2 & 4 & 6 \\ 1 & 2 & -1 \\ 1 & -2 & -1 \end{pmatrix}
2. \frac{1}{3}\begin{pmatrix} 11 & -9 & 1 \\ -7 & 9 & -2 \\ 2 & -3 & 1 \end{pmatrix}
3. \begin{pmatrix} -3 & 2 & 1 \\ 2 & -1 & 0 \\ 5 & -2 & -1 \end{pmatrix}
chapter 3
2. 0, b - c, \frac{a+b+c}{2}
1. Adj A = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{pmatrix}, A^{-1} = \begin{pmatrix} -\frac{1}{2} & -1 & -1 \\ -1 & -2 & 1 \\ -\frac{1}{2} & \frac{3}{2} & -\frac{1}{2} \end{pmatrix}
2. t = ±1, \frac{1}{1 - t^2}\begin{pmatrix} 1 & -t & -t \\ -t & 1 & 1 \\ -t & t^2 & 1 \end{pmatrix}
3. Adj A = \begin{pmatrix} x^2 - x & -1 & x+1 \\ -2 & x^2 - x - 1 & 2(x+1) \end{pmatrix}
1. a) ∆ = 5, ∆_x = 20, ∆_y = -10, ∆_z = 15;
x = 4, y = -2, z = 3;
b) x = 3, y = -1, z = 2
chapter 4
3. ‖u‖ = 13
4. k = ±3
5. i) k = \frac{1}{4}, ii) k = -\frac{7}{13}
1. x = 2, y = 4, z = 1, w = 3
2. a) i) \begin{pmatrix} -1 \\ 7 \\ -22 \end{pmatrix}, ii) \begin{pmatrix} 13 \\ 24 \\ -29 \end{pmatrix}
5. a) u and v are orthogonal if u.v = 0, i.e. ⟨(3, k, -2), (6, -4, -3)⟩ = 0;
18 - 4k + 6 = 0, or k = 6
b) ‖u‖ = \sqrt{39}, where ⟨u, u⟩ = ⟨(1, k, -2, 5), (1, k, -2, 5)⟩ = 30 + k^2 = 39, or k = ±3
6. a) \hat{u} = \frac{1}{\sqrt{74}}(5, -7), b) \hat{v} = \frac{1}{5}(1, 2, -2, 4), c) \hat{u} = \frac{1}{\sqrt{133}}(6, -4, 9)
3. M = 2A + 3B - C = \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}
5. Hint: Show that u, w can be written as a linear combination over the complex field C, but not over the real field R.
7. i) Hint: Recall that the set {u_1, u_2, u_3} is linearly independent if au_1 + bu_2 + cu_3 = 0 ⇒ a = b = c = 0.
If ae^t + b\sin t + ct^2 = 0 for all t:
letting t = 0: a(1) + b(0) + c(0) = 0 ⇒ a = 0;
letting t = π: 0e^π + b(0) + c(π^2) = 0 ⇒ c = 0;
letting t = \frac{π}{2}: 0e^{π/2} + b(1) + 0(\frac{π^2}{4}) = 0 ⇒ b = 0.
Hence ae^t + b\sin t + ct^2 = 0 ⇒ a = b = c = 0. Accordingly, u, v, w are linearly independent.
ii) similarly for {e^t, \sin t, \cos t}
3. −4u1 + 3u2 .
4. u - No, e.g. {(2, −1, 3, 2), (−1, 1, 1, 3), (3, −1, 0, −1), (1, 0, 0, 0)}
v - Yes, e.g. {(1, 0, 4, −1), (2, −1, 3, 2), (1, 0, 0, 0), (0, 1, 0, 0)}
5. a) (1,-2,5,-3) and (0,7,-9,2) form a basis of the row space of A and dim W = 2.
b) (1,-2,5,-3), (0,7,-9,2), (0, 0, 1, 0), and (0, 0, 0, 1) are linearly independent (they form an
echelon matrix), and so they form a basis of R4 , which is an extension of the basis of W.
chapter 5
1 −1 0 5 −11 12
1
1. i) 1 2 −1, ii) 2 2 1 0
2 1 1 2 4 −6
3 0 2
2. 0 −1 0
−2 0 −2
3. λ ≠ \frac{1}{8}
chapter 6
2. (0, 2, -1)
3. i) \sqrt{6}, ii) \sqrt{38}, iii) \sqrt{2}
4. i) \cos^{-1}(\frac{6}{\sqrt{42}}), ii) \frac{π}{2}
1. i) ii)
ii) i)
iii) i) and ii)
i) ii) iii) -13 iv) -71
2. k > 9
chapter 7
iv) a) True
Using λI = P^{-1}(λI)P, we have
∆_B(λ) = det(λI - B) = det(λI - P^{-1}AP) = det(P^{-1}(λI)P - P^{-1}AP)
= det(P^{-1}(λI - A)P) = det(P^{-1}) det(λI - A) det(P)
determinants are scalars and commute, and so det(P^{-1}) det(P) = 1
Hence ∆_B(λ) = det(λI - A) = ∆_A(λ)
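The argument can be illustrated numerically; a sketch with an arbitrary example matrix A and an arbitrary invertible P (neither taken from the text):

```python
import numpy as np

# Similar matrices B = P^{-1} A P have the same characteristic
# polynomial, so det(lam*I - A) = det(lam*I - B) for every lam.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 2.0]])
P = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])   # any invertible matrix (det = 3)
B = np.linalg.inv(P) @ A @ P

I = np.eye(3)
for lam in (0.0, 1.0, 2.5):
    assert np.isclose(np.linalg.det(lam*I - A), np.linalg.det(lam*I - B))
```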
b) True
Exercises 7.3, page 87
i) P = \begin{pmatrix} 1 & 2 & 1 \\ -1 & -1 & -1 \\ 0 & -2 & -2 \end{pmatrix}, D = diag(1, 2, 3)
ii) P = \begin{pmatrix} 0 & 1+i & 1-i \\ 1 & 1 & 1 \\ -1 & 1 & 1 \end{pmatrix}, D = diag(-1, i, -i)
iii) P = \begin{pmatrix} 0 & -1 & 0 & 1 \\ 0 & 2 & 0 & 1 \\ -1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{pmatrix}, D = diag(-3, 0, 3, 3), ∆(λ) = λ(3 + λ)(3 - λ)^2
1. i) \begin{pmatrix} -i & 2 \\ i & -3i \end{pmatrix}, ii) \begin{pmatrix} 1-2i & 1 \\ 2-i & 1 \end{pmatrix}
2. AA^{*} = I.
Exercises 7.7, page 92
i) P = \begin{pmatrix} -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} & 0 \\ 0 & 0 & 1 \\ \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} & 0 \end{pmatrix}, ii) P = \frac{1}{3}\begin{pmatrix} 2 & 2 & 1 \\ 2 & -1 & -2 \\ -1 & 2 & -2 \end{pmatrix}, iii) P = \frac{1}{7}\begin{pmatrix} 3 & 2 & 6 \\ -6 & 3 & 2 \\ -2 & -6 & 3 \end{pmatrix}
1. i) \begin{pmatrix} 0 & i \\ i & 1 \end{pmatrix}, ii) \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, iii) \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, iv) \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}
2. i) \frac{2}{3}x^2 + \frac{1}{2}y^2, ii) \frac{3}{2}x^2 + \frac{1}{2}y^2