UNIVERSITY OF NAIROBI
FACULTY OF SCIENCE
SCIENCE PROGRAMME LECTURE NOTES
SMA 204: LINEAR ALGEBRA II
WRITTEN BY :
REVIEWED BY:
EDITED BY:
Copyright © 2011 Benz, Inc. All rights reserved. Printed in Kenya.
Contents

Preface

1 DETERMINANTS
  1.1 Introduction
  1.2 Matrices
    1.2.1 Some Special Matrices
  1.3 Determinants
    1.3.1 Synopsis on Determinants
    1.3.2 Determinants by Cofactor Expansion
    1.3.3 Gaussian Elimination
    1.3.4 Determinants by Row/Column Reduction
  1.4 Determinants of Block Triangular Matrices
  1.5 Properties of the Determinant Function
  1.6 Applications of Determinants
    1.6.1 Finding the Inverse of a Matrix
    1.6.2 Equivalent Conditions and Systems of Linear Equations
    1.6.3 Cramer's Rule
    1.6.4 Area, Volume and Equations of Lines and Planes
  1.7 Solved Exercises
  1.8 Exercises

2 EIGENVALUES AND EIGENVECTORS
  2.1 Introduction
    2.2.1 Applications of the Eigenvalue Problem
    2.2.2 Finding Eigenvalues and Eigenvectors
    2.2.3 Characteristic Polynomial of a Square Matrix
  2.3 Polynomial Matrices
  2.4 Algebraic Multiplicity and Geometric Multiplicity of an Eigenvalue
    2.4.1 Characteristic Polynomials of Block Triangular Matrices
  2.5 Similarity and Diagonalization
    2.5.1 Similar Matrices
    2.5.2 Diagonalization of Square Matrices
  2.6 Orthonormal Diagonalization
    2.6.1 Diagonalization of Symmetric Matrices
  2.7 Solved Exercises
  2.8 Exercises

4 LINEAR FUNCTIONALS
  4.1 Introduction
  4.2 Dual Space
  4.3 Dual Basis
    4.3.1 Determining a Dual Basis given a Basis for a Vector Space
  4.4 Solved Exercises
  4.5 Exercises

    5.2.1 Bilinear Forms and Matrices
    5.2.2 Symmetric Bilinear Forms, Quadratic Forms
    5.2.3 Quadratic Forms
    5.2.4 Classification of Real Symmetric Bilinear Forms
  5.3 Testing for Positive Definiteness
    5.3.1 Change of Variable in a Quadratic Form
    5.3.2 Geometric View of Principal Axes
  5.4 Constrained Optimization
  5.5 Solved Exercises
  5.6 Exercises

Bibliography
Preface
The study of linear algebra is indispensable for a prospective student of pure mathematics, statistics or applied mathematics. Linear algebra is one of the fundamental areas of mathematics, and is the foundation for the study of many advanced topics, not only in mathematics, but also in engineering and the physical sciences. A thorough understanding of the concepts of linear algebra has also become increasingly important for the study of advanced topics in economics and the social sciences. Topics such as the eigenvalue problem, determinants, bilinear and quadratic forms, and linear operators are fundamental in mathematics and physics as well as engineering, economics, and many other areas.
The only absolute prerequisites for mastering the material in the book are SMA 203: Linear Algebra I, an interest in mathematics, and a willingness occasionally to suspend disbelief when a familiar idea occurs in an unfamiliar guise. But only an exceptional student would profit from reading the book unless he/she has previously acquired a fair working knowledge of vector spaces and matrices.
This book is a development of various courses designed for second year students of mathematics and humanities and third year students of education at the University of Nairobi, whose preparation has been some rudimentary knowledge of matrices and vector spaces. The lectures have been designed to facilitate learning by outreach and distance learning students on their own.
Objectives
At the end of this course unit the learner will be able to:
• Define a linear functional; find the dual basis of a given vector space
• Comprehend the concepts of a bilinear form and a quadratic form; give a matrix representation of bilinear and quadratic forms; classify real quadratic forms; test for positive definiteness; apply a change of variable in a quadratic form; apply these concepts in constrained optimization
• Comprehend the notions of unitary and orthogonal matrices; apply these concepts in actions such as rotations and reflections.
The learner will leave with an understanding of the basic results of linear algebra and an appreciation of the beauty and utility of mathematics. The learner will also be fortified with the mathematical maturity required for subsequent courses in abstract algebra, real analysis, and elementary topology.
The organization of the book is as follows:
In Chapter one, we introduce the notion of a determinant and study its properties. We explore two methods of computing determinants of matrices. We also explore some applications of determinants.
In Chapter two, we give a general theory of the eigenvalue problem. We spend some time showing how to find eigenvalues and eigenvectors of a matrix, and how to find the characteristic polynomial of a square matrix. We also investigate similarity and diagonalization, and show that similar matrices have the same characteristic polynomial.
The fourth chapter is devoted to the study of linear functionals, which are scalar-valued linear transformations. The most important skill here is finding the dual basis for a given vector space.
In Chapter five, we generalize the notions of linear mappings and linear functionals. We introduce the notion of a bilinear form, which gives rise to a quadratic form. We give some applications of these concepts in constrained optimization.
Finally, in Chapter six, we continue our study of orthogonal matrices and extend it to orthogonal linear operators. We look at some properties of unitary and orthogonal matrices and operators. Last, but not least, we study some special orthogonal matrices in two and three dimensions and their applications in rotations and reflections.
Each chapter has several examples that are solved in detail. The idea is to remove the mystery and show the student how to solve problems. Exercises at the end of each chapter have been designed to correspond to the solved problems in the text so that the student can reinforce ideas learned while reading the chapter. A considerable effort was expended in designing the exercises to ensure an appropriate level of difficulty. The material in all these chapters constitutes the course unit Linear Algebra II, a course designed mainly for undergraduate students in the programmes of Bachelor of Education (both Science and Arts) and Bachelor of Science by distance learning. It can also be used by undergraduate students in Bachelor of Science and Bachelor of Arts.
This book has been presented and developed with all the rigour linear algebra requires at this level. Every definition is stated carefully, every theorem is stated precisely, and many of the theorems have complete proofs. I have enjoyed writing this book and it is my hope that you will enjoy reading it. I hope that students find the time they spend reading the book profitable, and that instructors find it flexible enough to fit the needs of their courses. I also hope that everyone will send me comments and suggestions, which can help in improving this edition.
Acknowledgements
This book could not have been written without direct and indirect assistance from many sources. I had the good fortune of interacting with a number of special people and institutions, and I seize this opportunity to express my gratitude to all of them. It is a pleasure to acknowledge the encouragement I received for this project from my former students and colleagues at the University of Nairobi and beyond. I have been fortunate to meet my mentors at Syracuse University, especially Prof. Jack Ucci, Prof. A. Lutoborski and Prof. Steven Diaz. Their enthusiasm and intellectual integrity were important ingredients in making me continue. This experience has greatly influenced my work and my thinking, and I hope some of it is reflected in this text. To them I owe a special debt of gratitude. I am indebted to Dr. James K. Katende and Mwalimu Claudio Achola for their constructive comments when they were reviewing the manuscript, which have led to improvements including corrections, revisions and additional exercises. I wish to acknowledge, with thanks, my colleagues who have encouraged me to acquire expert knowledge of the typesetting software LaTeX. As always, I am grateful to the University of Nairobi, whose facilities have made it possible to write this book. They have also provided an environment that helped me develop academically and professionally and given me the resources needed to make this book a success. I am also thankful to the School of Mathematics for giving me the opportunity to teach linear algebra for years, which has indeed helped this project take shape. I would also like to thank the Open and Distance Learning (ODL) Programme for giving me the opportunity, at several review sessions at Multimedia University, Mbagathi and Kenya Wildlife Training Institute, Naivasha, to organize and typeset this book. In these circumstances it would have been extraordinarily slothful not to produce a book. I read the galleys myself and checked the calculations in the text. If the book contains any errors, stylistic misjudgements or shortcomings, they are entirely mine!
Last, but not least, I am grateful to my family for their continuing and unwavering support. To them I dedicate this book.
Notation and Symbols
The following symbols and notation will be used throughout the book.
Chapter 1
DETERMINANTS
1.1 Introduction
Determinants are among the most useful topics of linear algebra, with numerous applica-
tions in engineering, physics, economics, mathematics, and other sciences. In geometry
they offer a natural setting for writing very elegant formulas that compute areas and vol-
umes, as well as equations of geometric objects such as lines, circles, planes, spheres, etc.
Objectives
At the end of this lecture, you should be able to:
In order to discuss determinants of square matrices, we need to introduce some defini-
tions, terminologies and notations frequently used in linear algebra.
1.2 Matrices
A field K could be the set of real numbers R, the set of complex numbers C, etc.
A matrix A is usually presented in the form
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.
\]
The element $a_{ij}$, called the $ij$-entry or $ij$-element, appears in row $i$ and column $j$. We will denote such a matrix by simply writing $A = [a_{ij}]$.
A matrix with m rows and n columns is called an m by n matrix, written m × n. The
pair of numbers m and n is called the size or order of the matrix. In denoting the size
of a matrix we always list the number of rows first and the number of columns second.
Two matrices A and B are equal, written A = B, if they have the same size and if the
corresponding elements are equal.
A matrix with only one row is a row matrix or row vector, and a matrix with only one
column is called a column matrix, or column vector. A matrix whose entries are zero is
called a zero matrix and will be denoted by 0.
By interchanging the rows and columns of A, the transpose matrix AT is generated.
Namely, if
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\quad\text{then}\quad
A^{T} = \begin{pmatrix}
a_{11} & a_{21} & \cdots & a_{m1}\\
a_{12} & a_{22} & \cdots & a_{m2}\\
\vdots & \vdots & \ddots & \vdots\\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{pmatrix}.
\]
The rows of $A^T$ are the columns of A, and the ijth element of $A^T$ is $a_{ji}$, i.e., $(A^T)_{ij} = a_{ji}$.
If A is an m × n matrix, then AT is an n × m matrix.
Matrices whose entries are all real numbers are called real matrices and said to be
matrices over R. Analogously, matrices whose entries are all complex numbers are
called complex matrices and are said to be matrices over C.
An m × n matrix A is said to be square if it has the same number of rows as columns (i.e. if m = n). Square matrices figure importantly in applications of linear algebra, but non-square matrices are also encountered in common physical applications, especially in least squares data analysis.
Remark 1.1
Often when dealing with 1 × 1 matrices, for instance, A = [−2], we will drop the
surrounding brackets and just write A = −2.
There are a lot of notational issues that we are going to have to get used to in this
course, especially when reading books by different authors. We will stick to a particular
convention:
Upper case letters denote matrices and lower case letters denote numbers or entries of
a matrix. The entry in the ith row and the jth column of a matrix A is denoted by aij
or (A)ij . The first (leftmost) subscript will always give the row the entry is in and the
second (rightmost) subscript will always give the column the entry is in. We denote the
determinant of a square matrix A by det(A) or |A|.
Definition 1.2 In an $n \times n$ matrix
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix},
\]
the entries $a_{11}, a_{22}, \ldots, a_{nn}$ constitute the main diagonal of A.
2. Identity matrix: a matrix with 1's on the main diagonal and 0's elsewhere (off the main diagonal). That is,
\[
I = \begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 1
\end{pmatrix}.
\]
In this course we will be mainly concerned with such real square matrices.
1.3 Determinants
1.3.1 Synopsis on Determinants
For any square array of numbers, i.e., a square matrix, we can define a determinant: a
scalar number, real or complex. In this chapter we will give the fundamental definition
of a determinant and use it to prove several elementary properties. These properties
include: determinant addition, scalar multiplication, row and column addition or sub-
traction, and row and column interchange. As we will see, the elementary properties
often enable easy evaluation of a determinant, which otherwise could require an exceed-
ingly large number of multiplication and addition operations. Every determinant has
cofactors, which are also determinants but of lower order (if the determinant corresponds
to an n × n array, its cofactors correspond to (n − 1) × (n − 1) arrays). We will show how
determinants can be evaluated as linear expansions of cofactors. We will then use these
cofactor expansions to prove that a system of linear equations has a unique solution if
the determinant of the coefficients in the linear equations is not 0. This result is known
as Cramer’s rule, which gives the analytic solution to the linear equations in terms of
ratios of determinants. The properties of determinants established in this chapter will
play (in the chapters to follow) a big role in the theory of eigenvalues and eigenvectors,
inverses of square matrices, analytic geometry, and in the theory of matrices as linear
operators in vector spaces.
There are two main methods to compute determinants of square matrices. These are:
1. Cofactor Expansion.
2. Row/Column Reduction (via elementary operations).
Let $B = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix}$. The determinant of B can be written in terms of $2 \times 2$ determinants:
\[
\det(B) = a_{11}\det\begin{pmatrix} a_{22} & a_{23}\\ a_{32} & a_{33} \end{pmatrix} - a_{12}\det\begin{pmatrix} a_{21} & a_{23}\\ a_{31} & a_{33} \end{pmatrix} + a_{13}\det\begin{pmatrix} a_{21} & a_{22}\\ a_{31} & a_{32} \end{pmatrix}
\]
or explicitly as
\[
\det(B) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}).
\]
In the same manner we can define 4 × 4, 5 × 5,..., etc, determinants. We can continue
similarly and define n × n determinants in terms of (n − 1) × (n − 1) determinants called
minors.
Definition 1.3 Let A be a square matrix. The minor of $a_{ij}$, or the (i, j) minor,
denoted by Mij , of A is the determinant of the sub-matrix obtained by deleting the ith
row and the jth column of A.
Remark 1.2
In the examples above, we have introduced what is known as the cofactor expansion
of a determinant about the first row. Each entry of the first row was multiplied by
the corresponding minor. Each such product was multiplied by ±1, depending on the
position of the entry. The signed products were added together. In fact there is nothing
special about the choice of the first row in the computation of the determinant. We
could have used any other row or column. Here is how.
Let A be a square matrix. First we assign a sign to each entry of A according to a
checkerboard pattern of pluses and minuses.
\[
\begin{pmatrix}
+ & - & + & \cdots\\
- & + & - & \cdots\\
+ & - & + & \cdots\\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\]
Then we pick any row or column and multiply each signed entry by the corresponding
minor. Finally, we add all these products. Note that the sign of the (i, j) position in
the checkerboard pattern is given by (−1)i+j .
We try to expand a determinant about the row or column with the most zeros. This
avoids the computation of some of the minors.
Definition 1.4 Let A be a square matrix. The cofactor of $a_{ij}$, or the (i, j) cofactor, denoted by $C_{ij}$, of A is the signed (i, j) minor. That is, $C_{ij} = (-1)^{i+j}M_{ij}$.
det(A) = a11 M11 − a12 M12 + a13 M13 = a11 C11 + a12 C12 + a13 C13 (1.1)
Note that the definition is recursive. For example, to process the determinant of a square
matrix of size n = 4, we must process the determinant for a matrix of size n = 3, n = 2,
and n = 1. We can generalize (1.1) to any n × n matrix, n ≥ 3.
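The recursion is easy to express in code. The sketch below is not part of the original notes; it is a minimal Python implementation of cofactor expansion along the first row, with the function name det_cofactor chosen for illustration.

```python
# A minimal recursive implementation of cofactor expansion along the
# first row; illustrative only, since its cost grows like n!.
def det_cofactor(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor M_1j: delete row 1 and column j
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        # Cofactor sign (-1)^(1+j) is (-1)**j with 0-based indexing
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total

print(det_cofactor([[4, 2, 1], [-2, -6, 3], [-7, 5, 0]]))  # -154
```

As the remark on operation counts later in this chapter shows, this method is fine for small matrices but hopeless for large ones.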
Remark 1.3
In fact there is nothing special about the choice of the first row in the computation of
the determinant of A. We could have used any other row or column of A.
Thus
\[
\det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + a_{i3}C_{i3} + \cdots + a_{in}C_{in} \quad \text{(along row } i)
\]
\[
= a_{1s}C_{1s} + a_{2s}C_{2s} + a_{3s}C_{3s} + \cdots + a_{ns}C_{ns} \quad \text{(along column } s), \qquad 1 \le i \le n, \; 1 \le s \le n.
\]
This says that the determinant of a square matrix is independent of the row or column
used to compute it in cofactor expansion.
Definition 1.6 The process of computing the determinant of a square matrix A across a row or down a column is called cofactor expansion of the determinant.
Example 1.3 Let $A = \begin{pmatrix} 4 & 2 & 1\\ -2 & -6 & 3\\ -7 & 5 & 0 \end{pmatrix}$. Find det(A) using the given cofactor expansion:
(a). Expand along the first row of A.
(b). Expand along the third row.
(c). Expand along the second column.
Solution
(a). The minors along the first row are
\[
M_{11} = \begin{vmatrix} -6 & 3\\ 5 & 0 \end{vmatrix} = -15, \quad
M_{12} = \begin{vmatrix} -2 & 3\\ -7 & 0 \end{vmatrix} = 21, \quad
M_{13} = \begin{vmatrix} -2 & -6\\ -7 & 5 \end{vmatrix} = -52,
\]
and the corresponding cofactors are
\[
C_{11} = (-1)^{1+1}M_{11} = -15, \quad C_{12} = (-1)^{1+2}M_{12} = -21, \quad C_{13} = (-1)^{1+3}M_{13} = -52,
\]
respectively. Thus, the determinant of A by cofactor expansion along the first row is
\[
\det(A) = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} = 4(-15) + 2(-21) + 1(-52) = -154.
\]
(b). The minors along the third row are
\[
M_{31} = \begin{vmatrix} 2 & 1\\ -6 & 3 \end{vmatrix} = 12, \quad
M_{32} = \begin{vmatrix} 4 & 1\\ -2 & 3 \end{vmatrix} = 14, \quad
M_{33} = \begin{vmatrix} 4 & 2\\ -2 & -6 \end{vmatrix} = -20,
\]
and the corresponding cofactors are
\[
C_{31} = (-1)^{3+1}M_{31} = 12, \quad C_{32} = (-1)^{3+2}M_{32} = -14, \quad C_{33} = (-1)^{3+3}M_{33} = -20,
\]
respectively. Thus, the determinant of A by cofactor expansion along the third row is
\[
\det(A) = a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33} = (-7)(12) + 5(-14) + 0(-20) = -154.
\]
(c). The minors along the second column are
\[
M_{12} = \begin{vmatrix} -2 & 3\\ -7 & 0 \end{vmatrix} = 21, \quad
M_{22} = \begin{vmatrix} 4 & 1\\ -7 & 0 \end{vmatrix} = 7, \quad
M_{32} = \begin{vmatrix} 4 & 1\\ -2 & 3 \end{vmatrix} = 14,
\]
and the corresponding cofactors are
\[
C_{12} = (-1)^{1+2}M_{12} = -21, \quad C_{22} = (-1)^{2+2}M_{22} = 7, \quad C_{32} = (-1)^{3+2}M_{32} = -14,
\]
respectively. Thus, the determinant of A by cofactor expansion along the second column is
\[
\det(A) = a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32} = 2(-21) + (-6)(7) + 5(-14) = -154.
\]
Proposition 1.1 Suppose that a square matrix A has a zero row or a zero column.
Then det(A) = 0.
Proof . We simply use cofactor expansion along the zero row or the zero column.
Suppose A is an n × n matrix and row i of A is a row of zeroes. We compute det(A) via
cofactor expansion about row i.
\[
\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij} = \sum_{j=1}^{n} a_{ij} C_{ij} = \sum_{j=1}^{n} 0 = 0.
\]
The proof for the case of a zero column is entirely similar and could also be derived by
employing the transpose of the matrix.
Definition 1.7 Consider an $n \times n$ matrix $A = \begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$. If $a_{ij} = 0$ whenever $i > j$, then A is called an upper triangular matrix. If $a_{ij} = 0$ whenever $i < j$,
then A is called a lower triangular matrix. We say that A is triangular if it is
either upper triangular or lower triangular.
Example 1.4 $A = \begin{pmatrix} 1 & 2 & 3\\ 0 & 4 & 5\\ 0 & 0 & 7 \end{pmatrix}$ is upper triangular; $B = \begin{pmatrix} 1 & 0 & 0\\ 2 & 3 & 0\\ 4 & 5 & 6 \end{pmatrix}$ is lower triangular.
Definition 1.8 A square matrix A is diagonal if it is both upper and lower triangular.
Example 1.5 $A = \begin{pmatrix} 2 & 0 & 0 & 0\\ 0 & -3 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 8 \end{pmatrix}$ is diagonal.
Theorem 1.2 If A is a triangular matrix of order (size) n, then det(A) is the product of the entries on the main diagonal; that is, $\det(A) = a_{11}a_{22}\cdots a_{nn}$.
Note that $C_{kk} = (-1)^{k+k}M_{kk} = (-1)^{2k}M_{kk} = M_{kk}$, where $M_{kk}$ is the determinant of the upper triangular matrix formed by deleting the kth row and kth column of A. Since this matrix is of size $k - 1$, we can apply our induction assumption (hypothesis) to write
\[
\det(A) = a_{kk}M_{kk} = a_{kk}\,(a_{11}a_{22}\cdots a_{(k-1)(k-1)}) = a_{11}a_{22}\cdots a_{kk},
\]
as required.
Example 1.6 If $A = \begin{pmatrix} 9 & 2 & 3 & 8\\ 0 & -3 & 0 & 5\\ 0 & 0 & 1 & 6\\ 0 & 0 & 0 & 7 \end{pmatrix}$, then $\det(A) = 9(-3)(1)(7) = -189$.
Remark 1.4
Note that
1. A reduced row echelon form matrix is always in echelon form.
2. A zero row of a matrix is a row that consists entirely of zeros and a nonzero row
is a row that has at least one nonzero entry.
3. The first nonzero entry of a nonzero row is called a leading entry.
4. If a leading entry happens to be 1, we call it a leading 1.
Step 3: Obtain zeros below the leading entry by adding suitable multiples of the top
row to the rows below that.
Step 4: Temporarily cover (ignore) the top row and repeat the same process starting with Step 1 applied to the leftover submatrix. Repeat this process with the rest of the rows. (At this stage, the matrix is already in echelon form.)
Step 5: Starting with the last nonzero row, work upward: for each row, obtain a leading 1 and introduce zeros above it by adding suitable multiples of that row to the rows above it.
A matrix B obtained from A by this process is said to be in Reduced Row Echelon Form.
\[
M = \begin{pmatrix} -1 & 0 & 1 & 2\\ 0 & 0 & -3 & 4\\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
N = \begin{pmatrix} 1 & -6 & 0 & 2\\ 0 & 0 & 1 & 4\\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
P = \begin{pmatrix} 0 & 2 & -6\\ 0 & -1 & 1 \end{pmatrix}
\]
M and N are in echelon form. N is also in RREF. M is not in RREF because condition
3 fails. Matrix P is not in echelon form, because condition 2 fails.
Matrices A and B are not in echelon form and hence are not in RREF. Matrix C is in
RREF.
Definition 1.11 The following operations, performed on the rows of a matrix, are called
elementary row operations:
1. Interchanging two rows: Ri ←→ Rj
2. Adding a constant multiple of one row to another: Ri := Ri +cRj , where c is a scalar.
3. Multiplying a row by a nonzero scalar: Ri := cRi .
Definition 1.12 Two matrices are (row) equivalent if one can be obtained from the other by a finite sequence of elementary row operations. Sometimes we use the abbreviation A ∼ B for the statement "matrix A is equivalent to B".
Proposition 1.3 Let A be a square matrix.
(a). If B is obtained from A by interchanging two rows of A, then
det(B) = −det(A)
(b). If B is obtained from A by adding a multiple of one row of A to another row, then
det(B) = det(A)
(c). If B is obtained from A by multiplying one row of A by a nonzero constant c, then
det(B) = c·det(A)
Proof
(a). The proof is by induction on n, the size of matrix A. It is easily checked that the
result holds when n = 2. When n > 2, we use cofactor expansion by a third row, say
row i. Then
\[
\det(B) = \sum_{j=1}^{n} a_{ij}(-1)^{i+j}\det(B_{ij}),
\]
where the (n − 1) × (n − 1) matrices Bij are obtained from the matrices Aij by inter-
changing two rows of Aij , so that det(Bij ) = −det(Aij ). It follows that
\[
\det(B) = -\sum_{j=1}^{n} a_{ij}(-1)^{i+j}\det(A_{ij}) = -\det(A),
\]
as required.
(b). Again, the proof is by induction on n. It is easily checked that the result holds
when n = 2. When n > 2, we use cofactor expansion by a third row, say row i. Then
\[
\det(B) = \sum_{j=1}^{n} a_{ij}(-1)^{i+j}\det(B_{ij}),
\]
where the (n − 1) × (n − 1) matrices Bij are obtained from the matrices Aij by adding
a multiple of one row of Aij to another row, so that det(Bij ) = det(Aij ). It follows that
\[
\det(B) = \sum_{j=1}^{n} a_{ij}(-1)^{i+j}\det(A_{ij}) = \det(A),
\]
as required.
(c). This is simpler. Suppose that the matrix B is obtained from the matrix A by
multiplying row i of A by a nonzero constant c. Then
\[
\det(B) = \sum_{j=1}^{n} c\,a_{ij}(-1)^{i+j}\det(B_{ij}).
\]
Note now that Bij = Aij since row i has been removed respectively from B and A. It
follows that
\[
\det(B) = \sum_{j=1}^{n} c\,a_{ij}(-1)^{i+j}\det(A_{ij}) = c\,\det(A),
\]
as required.
Proposition 1.4 Let A be a square matrix.
(a). If B is obtained from A by interchanging two columns of A, then
det(B) = −det(A)
(b). If B is obtained from A by adding a multiple of one column of A to another column, then
det(B) = det(A)
(c). If B is obtained from A by multiplying one column of A by a nonzero constant c, then
det(B) = c·det(A)
Proof. This follows from the proof of Proposition 1.3 by replacing "row" with "column".
Remark 1.5
Elementary row and column operations can be combined with cofactor expansion to
calculate the determinant of a given square matrix.
Example 1.9 Consider the matrix $A = \begin{pmatrix} 2 & 3 & 2 & 5\\ 1 & 4 & 1 & 2\\ 5 & 4 & 4 & 5\\ 2 & 2 & 0 & 4 \end{pmatrix}$. Adding $-1\cdot C_3$ to $C_1$, we have
\[
\det(A) = \det\begin{pmatrix} 0 & 3 & 2 & 5\\ 0 & 4 & 1 & 2\\ 1 & 4 & 4 & 5\\ 2 & 2 & 0 & 4 \end{pmatrix}.
\]
Adding $-\tfrac{1}{2}R_4$ to $R_3$, we have
\[
\det(A) = \det\begin{pmatrix} 0 & 3 & 2 & 5\\ 0 & 4 & 1 & 2\\ 0 & 3 & 4 & 3\\ 2 & 2 & 0 & 4 \end{pmatrix}.
\]
Using cofactor expansion by column $C_1$, we have
\[
\det(A) = 2(-1)^{4+1}\det\begin{pmatrix} 3 & 2 & 5\\ 4 & 1 & 2\\ 3 & 4 & 3 \end{pmatrix} = -2\det\begin{pmatrix} 3 & 2 & 5\\ 4 & 1 & 2\\ 3 & 4 & 3 \end{pmatrix}.
\]
Adding $-R_1$ to $R_3$, we have
\[
\det(A) = -2\det\begin{pmatrix} 3 & 2 & 5\\ 4 & 1 & 2\\ 0 & 2 & -2 \end{pmatrix}.
\]
Adding $1\cdot C_2$ to $C_3$, we have
\[
\det(A) = -2\det\begin{pmatrix} 3 & 2 & 7\\ 4 & 1 & 3\\ 0 & 2 & 0 \end{pmatrix}.
\]
Using cofactor expansion by $R_3$, we have
\[
\det(A) = -2\cdot 2(-1)^{3+2}\det\begin{pmatrix} 3 & 7\\ 4 & 3 \end{pmatrix} = 4\det\begin{pmatrix} 3 & 7\\ 4 & 3 \end{pmatrix}.
\]
Using the formula for the determinant of $2 \times 2$ matrices, we conclude that $\det(A) = 4(9 - 28) = -76$.
Example 1.10
\[
\begin{vmatrix} 1 & 2 & 3\\ 10 & 30 & 40\\ 3 & 1 & 1 \end{vmatrix}
= 10\begin{vmatrix} 1 & 2 & 3\\ 1 & 3 & 4\\ 3 & 1 & 1 \end{vmatrix}
= 10\begin{vmatrix} 1 & 2 & 3\\ 0 & 1 & 1\\ 3 & 1 & 1 \end{vmatrix}
= 10\begin{vmatrix} 1 & 2 & 3\\ 0 & 1 & 1\\ 0 & -5 & -8 \end{vmatrix}
= 10(-8 + 5) = 10(-3) = -30.
\]
Remark 1.6
Example 1.11 Use elementary row operations to evaluate the determinant of
$A = \begin{pmatrix} 0 & 1 & 5\\ 3 & -6 & 9\\ 2 & 6 & 1 \end{pmatrix}$.
Solution
Interchanging $R_1$ and $R_2$ changes the sign of the determinant, so
\[
\det(A) = -\begin{vmatrix} 3 & -6 & 9\\ 0 & 1 & 5\\ 2 & 6 & 1 \end{vmatrix}
= (-1)(3)\begin{vmatrix} 1 & -2 & 3\\ 0 & 1 & 5\\ 2 & 6 & 1 \end{vmatrix}
= (-1)(3)\begin{vmatrix} 1 & -2 & 3\\ 0 & 1 & 5\\ 0 & 10 & -5 \end{vmatrix}
= (-1)(3)\begin{vmatrix} 1 & -2 & 3\\ 0 & 1 & 5\\ 0 & 0 & -55 \end{vmatrix}
= (-1)(3)(-55) = 165.
\]
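Row reduction translates directly into an algorithm. The following Python sketch is ours, not part of the original notes; it uses plain floating-point Gaussian elimination, tracks the sign changes from row interchanges, and multiplies the resulting pivots.

```python
# Determinant by row reduction: forward elimination with partial
# pivoting by "first nonzero entry"; each row swap flips the sign.
def det_row_reduction(A):
    A = [row[:] for row in A]        # work on a copy
    n, sign = len(A), 1
    for k in range(n):
        # find a nonzero pivot in column k, swapping rows if needed
        p = next((i for i in range(k, n) if A[i][k] != 0), None)
        if p is None:
            return 0                 # a zero column forces determinant 0
        if p != k:
            A[k], A[p] = A[p], A[k]
            sign = -sign             # each swap flips the sign
        for i in range(k + 1, n):    # clear entries below the pivot
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
    prod = sign
    for k in range(n):
        prod *= A[k][k]              # determinant = sign * product of pivots
    return prod

print(det_row_reduction([[0, 1, 5], [3, -6, 9], [2, 6, 1]]))  # 165.0
```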
Remark 1.7
If a square matrix has two proportional rows (in particular, two identical rows), then its determinant is zero. This result is stated and proved in the following theorem.
Theorem 1.6 Let A be an n × n matrix. If the jth row (or column) of A is a multiple of the kth row (or column) of A, then det(A) = 0.
Proof. We prove the result for columns. Let $A = [A_1, A_2, \ldots, A_n]$, where $A_1, A_2, \ldots, A_n$ denote the n column vectors of A, and suppose that $A_j = cA_k$. Define B to be the matrix obtained from A by replacing the jth column $A_j$ with $A_k$, so that $B = [A_1, \ldots, A_k, \ldots, A_k, \ldots, A_n]$, and observe that $\det(A) = c\,\det(B)$. Now if we interchange the jth and kth columns of B, then the matrix B remains the same, but the determinant changes sign. This [det(B) = −det(B)] can happen only if det(B) = 0; and since $\det(A) = c\,\det(B)$, then det(A) = 0.
Theorem 1.7 If A, B and C are n × n matrices that are equal except that the sth column (or row) of A is equal to the sum of the sth columns (or rows) of B and C, then det(A) = det(B) + det(C).
Combined with Theorem 1.6, this gives another proof that adding a multiple of one column to another leaves the determinant unchanged: let $A = [A_1, \ldots, A_j, \ldots, A_k, \ldots, A_n]$ and $B = [A_1, \ldots, A_j + cA_k, \ldots, A_n]$. By Theorem 1.7, $\det(B) = \det(A) + \det(Q)$, where $Q = [A_1, \ldots, cA_k, \ldots, A_k, \ldots, A_n]$. But by Theorem 1.6, det(Q) = 0; so det(B) = det(A).
Remark 1.8
              Cofactor Expansion           Row Reduction
order n    Additions  Multiplications   Additions  Multiplications
   3             5            9              5          10
   5           119          205             30          45
  10     3,628,799    6,235,300            285         339
The number of operations for the cofactor expansion of an n × n matrix grows like n!. Since 30! ≈ 2.65 × 10^32, a 30 × 30 matrix would require over 10^32 operations! If a computer could do one trillion operations per second, it would still take over one trillion years to compute the determinant of such a matrix using cofactor expansion (assuming the matrix has no zero entries), yet row reduction would take only a few seconds!
Definition 1.13 Let $A = [A_{ij}]$ be a square block matrix. If the non-diagonal blocks below (or above) the diagonal blocks are all zero matrices, then A is called a block triangular matrix. If all the non-diagonal blocks are zero matrices, then A is called a block diagonal matrix.
Example 1.12 Let $A = \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 0 & 0 & 9 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 2 & 0\\ 3 & 4 & 0\\ 0 & 0 & 7 \end{pmatrix}$. Then A is a block triangular matrix while B is a block diagonal matrix.
Theorem 1.9 Let A be a square matrix. If $A = \begin{pmatrix} A_1 & B\\ 0 & A_2 \end{pmatrix}$, then $\det(A) = \det(A_1)\det(A_2)$. If B = 0, i.e. a zero matrix, then still $\det(A) = \det(A_1)\det(A_2)$.
Note that in the theorem above, the zero matrix denoted by 0 and B need not be square.
We generalize our result to a finite number of diagonal blocks:
Theorem 1.10 Suppose A is an upper (lower) triangular block matrix with diagonal blocks $A_1, A_2, \ldots, A_n$. Then $\det(A) = \det(A_1)\det(A_2)\cdots\det(A_n)$.
Example 1.13 Find |M| where $M = \begin{pmatrix} 2 & 3 & 4 & 7 & 8\\ -1 & 5 & 3 & 2 & 1\\ 0 & 0 & 2 & 1 & 5\\ 0 & 0 & 3 & -1 & 4\\ 0 & 0 & 5 & 2 & 6 \end{pmatrix}$.
Solution. M is block triangular with diagonal blocks $M_1 = \begin{pmatrix} 2 & 3\\ -1 & 5 \end{pmatrix}$ and $M_2 = \begin{pmatrix} 2 & 1 & 5\\ 3 & -1 & 4\\ 5 & 2 & 6 \end{pmatrix}$. Since $|M_1| = 13$ and $|M_2| = 29$, Theorem 1.10 gives $|M| = |M_1|\cdot|M_2| = 13 \times 29 = 377$.
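As an illustrative check (not part of the original notes, and assuming NumPy is available), Theorem 1.10 can be verified numerically on Example 1.13:

```python
# For a block triangular matrix, det(M) = det(M1) * det(M2).
import numpy as np

M = np.array([[2, 3, 4, 7, 8],
              [-1, 5, 3, 2, 1],
              [0, 0, 2, 1, 5],
              [0, 0, 3, -1, 4],
              [0, 0, 5, 2, 6]])
M1, M2 = M[:2, :2], M[2:, 2:]
print(np.linalg.det(M))                       # ~377.0
print(np.linalg.det(M1) * np.linalg.det(M2))  # ~377.0 (13 * 29)
```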
Let $M_{n\times n}$ denote the vector space of all n × n matrices over a field K. Then the determinant function $|\ | : M_{n\times n} \longrightarrow K$ (with $K = \mathbb{C}$ or $\mathbb{R}$) is defined by
\[
A \longmapsto \det(A).
\]
Assuming that the sizes of the matrices are such that the operations can be performed(i.e.
are compatible), the transpose operation has the following properties:
(i). $(A^t)^t = A$
(ii). $(A + B)^t = A^t + B^t$
(iii). $(cA)^t = cA^t$, for any scalar c
(iv). $(AB)^t = B^tA^t$
Proposition 1.12 If A is a square matrix, then $\det(A^t) = \det(A)$.
Note that Proposition 1.12 says that we can effectively replace "row" by "column" in most results on determinants.
Proposition 1.13 If A and B are n × n matrices, then $\det(AB) = \det(A)\det(B)$.
Proof. We prove the case for diagonal matrices. The case when A and B are not diagonal can be proved similarly by first reducing them to triangular form.
Let $A = \begin{pmatrix} a_{11} & & 0\\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix}$ and $B = \begin{pmatrix} b_{11} & & 0\\ & \ddots & \\ 0 & & b_{nn} \end{pmatrix}$. Then $AB = \begin{pmatrix} a_{11}b_{11} & & 0\\ & \ddots & \\ 0 & & a_{nn}b_{nn} \end{pmatrix}$.
Therefore
\[
\det(AB) = (a_{11}b_{11})(a_{22}b_{22})\cdots(a_{nn}b_{nn}) = (a_{11}a_{22}\cdots a_{nn})(b_{11}b_{22}\cdots b_{nn}) = \det(A)\det(B).
\]
Proposition 1.14 Let R be an n × n matrix in reduced row echelon form, and let R
contain no row of zeros. Then R = In , where In denotes the n × n identity matrix.
Remark 1.9
No row of zeros in Proposition 1.14 implies that each column of R contains a leading 1.
So the leading 1s are on the main diagonal. Hence the result.
Definition 1.17 An elementary matrix E is a simple matrix which differs from the identity matrix in a minimal way (i.e. by a single elementary row or column operation).
Example 1.15 Let $A = \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}$. Applying the elementary operation $R_2 := R_2 - 3R_1$ on A gives $\widehat{A} = \begin{pmatrix} 1 & 2\\ 0 & -2 \end{pmatrix}$. It is easy to check that this elementary row operation is represented by the matrix $E_1 = \begin{pmatrix} 1 & 0\\ -3 & 1 \end{pmatrix}$. Clearly, $E_1 A = \widehat{A}$.
Theorem A square matrix A is invertible if and only if det(A) ≠ 0.
Proof. (=⇒) If A is invertible, then there exists $A^{-1}$ such that $AA^{-1} = I$. Taking determinants on both sides of this equation and using the fact that det(I) = 1, we have
\[
1 = \det(I) = \det(AA^{-1}) = \det(A)\det(A^{-1}).
\]
So neither determinant on the right is zero. Thus det(A) ≠ 0.
(⇐=) Conversely, assume det(A) ̸= 0. Then we shall show that A is row-equivalent
to the identity matrix I. Let R be the Row Reduced Echelon Form (RREF) of A.
That is Ek ...E3 E2 E1 A = R, where E1 , E2 , ..., Ek are the matrices corresponding to the
elementary row operations in the reduction process. Note that each of these matrices has an inverse. Hence
A = E1−1 E2−1 ...Ek−1 R.
Therefore
det(A) = det(E1−1 )det(E2−1 )...det(Ek−1 )det(R).
Since R is in reduced row echelon form, it must be the identity matrix I or it must have at least one row of zeros. But if R has a row of zeros, then det(R) = 0, which would imply that det(A) = 0. This contradicts our assumption that det(A) ≠ 0. So R cannot have any zero rows, and hence, by Proposition 1.14, R must be I. We therefore conclude
that A is row equivalent to I and so A is invertible.
Corollary If A is invertible, then $\det(A^{-1}) = \dfrac{1}{\det(A)}$.
Proof. Since A is invertible, there exists $A^{-1}$ such that $AA^{-1} = I$ and det(A) ≠ 0. Thus $\det(AA^{-1}) = \det(I)$. Equivalently, $\det(A)\det(A^{-1}) = 1$, from which we obtain that
\[
\det(A^{-1}) = \frac{1}{\det(A)}.
\]
Example 1.16 Find $|A^{-1}|$ for the matrix $A = \begin{pmatrix} 1 & 0 & 3\\ 0 & -1 & 2\\ 2 & 1 & 0 \end{pmatrix}$.
Solution. One way to solve this problem is to find $A^{-1}$, then evaluate $\det(A^{-1})$. It is simpler, however, to apply the corollary above.
It is easy to check that $|A| = \begin{vmatrix} 1 & 0 & 3\\ 0 & -1 & 2\\ 2 & 1 & 0 \end{vmatrix} = 4$. Thus $|A^{-1}| = \dfrac{1}{|A|} = \dfrac{1}{4}$.
1.6 Applications of Determinants
Definition 1.18 Let $A = [a_{ij}]$ be an n × n matrix over a field K and let $C_{ij}$ denote the cofactor of $a_{ij}$. The classical adjoint of A, denoted by adj(A), is the transpose of the matrix of cofactors of A, namely
\[
\mathrm{adj}(A) = [C_{ij}]^t.
\]
Example 1.17 Let $A = \begin{pmatrix} 2 & 3 & -4\\ 0 & -4 & 2\\ 1 & -1 & 5 \end{pmatrix}$. The cofactors of the nine elements of A are:
\[
C_{11} = +\begin{vmatrix} -4 & 2\\ -1 & 5 \end{vmatrix} = -18, \quad
C_{12} = -\begin{vmatrix} 0 & 2\\ 1 & 5 \end{vmatrix} = 2, \quad
C_{13} = +\begin{vmatrix} 0 & -4\\ 1 & -1 \end{vmatrix} = 4,
\]
\[
C_{21} = -\begin{vmatrix} 3 & -4\\ -1 & 5 \end{vmatrix} = -11, \quad
C_{22} = +\begin{vmatrix} 2 & -4\\ 1 & 5 \end{vmatrix} = 14, \quad
C_{23} = -\begin{vmatrix} 2 & 3\\ 1 & -1 \end{vmatrix} = 5,
\]
\[
C_{31} = +\begin{vmatrix} 3 & -4\\ -4 & 2 \end{vmatrix} = -10, \quad
C_{32} = -\begin{vmatrix} 2 & -4\\ 0 & 2 \end{vmatrix} = -4, \quad
C_{33} = +\begin{vmatrix} 2 & 3\\ 0 & -4 \end{vmatrix} = -8.
\]
The matrix of cofactors is $[C_{ij}] = \begin{pmatrix} -18 & 2 & 4\\ -11 & 14 & 5\\ -10 & -4 & -8 \end{pmatrix}$. The transpose of this matrix of cofactors yields the classical adjoint of A. That is,
\[
\mathrm{adj}(A) = \begin{pmatrix} -18 & -11 & -10\\ 2 & 14 & -4\\ 4 & 5 & -8 \end{pmatrix}.
\]
Theorem Let A be a square matrix. Then $A\,\mathrm{adj}(A) = \det(A)\,I$, where I is the identity matrix. Thus, if det(A) ≠ 0, then
\[
A^{-1} = \frac{1}{\det(A)}\bigl(\mathrm{adj}(A)\bigr).
\]
Proof. By looking at the product of a matrix A with its adjoint,
\[
A\,\mathrm{adj}(A) = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{i1} & a_{i2} & \cdots & a_{in}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix}
C_{11} & C_{21} & \cdots & C_{j1} & \cdots & C_{n1}\\
C_{12} & C_{22} & \cdots & C_{j2} & \cdots & C_{n2}\\
\vdots & \vdots & & \vdots & & \vdots\\
C_{1n} & C_{2n} & \cdots & C_{jn} & \cdots & C_{nn}
\end{pmatrix},
\]
we see that the entry in the ith row and jth column of this product is
\[
a_{i1}C_{j1} + a_{i2}C_{j2} + \cdots + a_{in}C_{jn}.
\]
If i = j, then this sum is simply the cofactor expansion of A along the ith row, which means that the sum is det(A).
If i ≠ j, then the sum is zero: To see this, consider the following matrix B in which the jth row of A has been replaced with the ith row of A:
\[
B = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{i1} & a_{i2} & \cdots & a_{in}\\
\vdots & \vdots & & \vdots\\
a_{i1} & a_{i2} & \cdots & a_{in}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\]
Expanding det(B) along its jth row gives precisely the sum $a_{i1}C_{j1} + a_{i2}C_{j2} + \cdots + a_{in}C_{jn}$.
Since B has two identical rows, we have that det(B) = 0.
Therefore $A\,\mathrm{adj}(A)$ has the form
\[
A\,\mathrm{adj}(A) = \begin{pmatrix}
|A| & 0 & \cdots & 0\\
0 & |A| & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & |A|
\end{pmatrix} = |A|\,I.
\]
Therefore
\[
A\Bigl[\frac{1}{|A|}\,\mathrm{adj}(A)\Bigr] = I,
\]
which implies that
\[
A^{-1} = \frac{1}{\det(A)}\bigl(\mathrm{adj}(A)\bigr).
\]
Example 1.18 Find $A^{-1}$ for $A = \begin{pmatrix} 2 & 3 & -4\\ 0 & -4 & 2\\ 1 & -1 & 5 \end{pmatrix}$.
Solution. It is easy to check that det(A) = −46 ≠ 0. Thus A does have an inverse. The adjoint of A has been computed in Example 1.17. Thus
\[
A^{-1} = \frac{1}{\det(A)}\bigl(\mathrm{adj}(A)\bigr)
= -\frac{1}{46}\begin{pmatrix} -18 & -11 & -10\\ 2 & 14 & -4\\ 4 & 5 & -8 \end{pmatrix}
= \begin{pmatrix} \frac{9}{23} & \frac{11}{46} & \frac{5}{23}\\[2pt] -\frac{1}{23} & -\frac{7}{23} & \frac{2}{23}\\[2pt] -\frac{2}{23} & -\frac{5}{46} & \frac{4}{23} \end{pmatrix}.
\]
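A hedged code sketch of the adjoint method, not from the original notes: the helper name adjugate below is our own, and in practice one would simply call np.linalg.inv.

```python
# Inverse via the classical adjoint, as in Examples 1.17-1.18.
import numpy as np

def adjugate(A):
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # (i, j) minor: delete row i and column j
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T          # adj(A) is the transpose of the cofactor matrix

A = np.array([[2, 3, -4], [0, -4, 2], [1, -1, 5]], dtype=float)
A_inv = adjugate(A) / np.linalg.det(A)   # valid since det(A) = -46 != 0
print(np.round(A @ A_inv, 10))           # identity matrix
```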
The notion of a determinant can be used to check whether or not a system of n linear
equations in n unknowns has a unique solution, more than one solution or no solution,
without even having to solve the system.
Example 1.19 Without solving them, determine whether the following systems have a unique solution.
(a).
2x_2 − x_3 = −1
3x_1 − 2x_2 + x_3 = 4
3x_1 + 2x_2 − x_3 = −4
(b).
2x_2 − x_3 = −1
3x_1 − 2x_2 + x_3 = 4
3x_1 + 2x_2 + x_3 = −4
Solution. The determinants of the respective coefficient matrices are
(a). $\begin{vmatrix} 0 & 2 & -1\\ 3 & -2 & 1\\ 3 & 2 & -1 \end{vmatrix} = 0$, so the system does not have a unique solution;
(b). $\begin{vmatrix} 0 & 2 & -1\\ 3 & -2 & 1\\ 3 & 2 & 1 \end{vmatrix} = -12 \neq 0$, so the system has a unique solution.

1.6.3 Cramer's Rule
Theorem (Cramer's Rule) Let Ax = b be a system of n linear equations in n unknowns with det(A) ≠ 0. Then the system has a unique solution, given by
\[
x_i = \frac{\det(A_i)}{\det(A)}, \qquad i = 1, 2, \ldots, n,
\]
where $A_i$ is the matrix obtained from A by replacing its ith column with the column of constants b in the system of equations.
Proof. Let the system be represented by Ax = b. Since |A| ≠ 0, we can write
\[
x = A^{-1}b = \frac{1}{|A|}\bigl(\mathrm{adj}(A)\bigr)b = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}.
\]
The ith entry of this vector is $x_i = \frac{1}{|A|}(b_1C_{1i} + b_2C_{2i} + \cdots + b_nC_{ni})$, and the bracketed sum is exactly the cofactor expansion of $\det(A_i)$ along its ith column, so $x_i = \det(A_i)/\det(A)$.
Example 1.20 Use Cramer's Rule to solve the following system of linear equations for x, y and z.
−x + 2y − 3z = 1
2x + z = 0
3x − 4y + 4z = 2
Solution. The determinant of the coefficient matrix is
\[
|A| = \begin{vmatrix} -1 & 2 & -3\\ 2 & 0 & 1\\ 3 & -4 & 4 \end{vmatrix} = 10.
\]
Since |A| ≠ 0, we know that the solution exists and is unique, and Cramer's Rule may be applied to solve for x, y and z as follows:
\[
x = \frac{1}{10}\begin{vmatrix} 1 & 2 & -3\\ 0 & 0 & 1\\ 2 & -4 & 4 \end{vmatrix} = \frac{4}{5}, \qquad
y = \frac{1}{10}\begin{vmatrix} -1 & 1 & -3\\ 2 & 0 & 1\\ 3 & 2 & 4 \end{vmatrix} = -\frac{3}{2}, \qquad
z = \frac{1}{10}\begin{vmatrix} -1 & 2 & 1\\ 2 & 0 & 0\\ 3 & -4 & 2 \end{vmatrix} = -\frac{8}{5}.
\]
(a). Area of a Triangle
The area of the triangle whose vertices are $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ is given by
\[
\text{Area} = \pm\frac{1}{2}\det\begin{pmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1 \end{pmatrix} \tag{1.2}
\]
where the sign (±) is chosen to give a positive area.
Proof. We prove the case $y_i > 0$. Assume $x_1 \le x_3 \le x_2$, and that $(x_3, y_3)$ lies above the line segment connecting $(x_1, y_1)$ and $(x_2, y_2)$ (see figure below).
The area of the given triangle is equal to the sum of the areas of the first two trapezoids less the area of the third. Therefore
\[
\text{Area of Triangle} = \tfrac{1}{2}(y_1 + y_3)(x_3 - x_1) + \tfrac{1}{2}(y_3 + y_2)(x_2 - x_3) - \tfrac{1}{2}(y_1 + y_2)(x_2 - x_1)
\]
\[
= \tfrac{1}{2}(x_1y_2 + x_2y_3 + x_3y_1 - x_1y_3 - x_2y_1 - x_3y_2)
= \tfrac{1}{2}\begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1 \end{vmatrix}.
\]
If the vertices do not occur in the order x1 ≤ x3 ≤ x2 or if the vertex (x3 , y3 ) is not
above the line segment connecting the other two vertices, then the formula may give the
negative of the area. So the area will be the absolute value of this area.
Example 1.21 Find the area of the triangle whose vertices are (1, 0), (2, 2) and (4, 3).
Solution. We compute
\[
\frac{1}{2}\begin{vmatrix} 1 & 0 & 1\\ 2 & 2 & 1\\ 4 & 3 & 1 \end{vmatrix} = -\frac{3}{2}.
\]
Therefore Area $= \left|-\frac{3}{2}\right| = \frac{3}{2}$.
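For completeness, formula (1.2) is a one-liner with NumPy; this sketch is ours, not part of the original notes.

```python
# The area formula (1.2), applied to the triangle of Example 1.21.
import numpy as np

def triangle_area(p1, p2, p3):
    M = np.array([[*p1, 1], [*p2, 1], [*p3, 1]], dtype=float)
    return abs(np.linalg.det(M)) / 2   # absolute value picks the +/- sign

print(triangle_area((1, 0), (2, 2), (4, 3)))   # 1.5
```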
Remark 1.10
Suppose the three points in part (a) are collinear (i.e. they lie on the same line). Then
the determinant in (1.2) would be zero. This can be used to find the equation of a
straight line.
Consider the collinear points (0, 1), (2, 2) and (4, 3). The determinant giving the area of the "triangle" having these three points as vertices is
\[
\frac{1}{2}\begin{vmatrix} 0 & 1 & 1\\ 2 & 2 & 1\\ 4 & 3 & 1 \end{vmatrix} = 0.
\]
Generalization: Three points $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ are collinear if and only if
\[
\det\begin{pmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1 \end{pmatrix} = 0.
\]
Two-point Form of the Equation of a Straight Line: The equation of the line through the distinct points $(x_1, y_1)$ and $(x_2, y_2)$ is given by
\[
\det\begin{pmatrix} x & y & 1\\ x_1 & y_1 & 1\\ x_2 & y_2 & 1 \end{pmatrix} = 0.
\]
Example 1.22 Use a determinant to find the equation of the line through the points (2, 4) and (−1, 3).
Solution. Applying the determinant formula for the equation of the line passing through these two points we have
\[
\begin{vmatrix} x & y & 1\\ 2 & 4 & 1\\ -1 & 3 & 1 \end{vmatrix} = 0.
\]
Expanding by cofactor expansion along the top row we obtain
\[
x\begin{vmatrix} 4 & 1\\ 3 & 1 \end{vmatrix} - y\begin{vmatrix} 2 & 1\\ -1 & 1 \end{vmatrix} + 1\begin{vmatrix} 2 & 4\\ -1 & 3 \end{vmatrix} = x - 3y + 10 = 0,
\]
that is, x − 3y = −10.
Remark 1.11
We can use the determinant method to find equations of curves defined by more complicated equations, for instance, the equation of a circle passing through three points $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$.
Example 1.23 Find the equation of the circle passing through the three points (1, 1), (−2, 0) and (0, 1).
Solution
Instead of working with the specific equation of the form
\[
x^2 + y^2 + ax + by + c = 0,
\]
we need to have coefficients in all terms so as to create a system of equations with more than one solution. We try an equation of the form
\[
r_1(x^2 + y^2) + r_2x + r_3y + r_4 = 0.
\]
Combining this general equation with the specific equations that must hold in order for the given points to lie on the curve leads to the system
r_1(x^2 + y^2) + r_2x + r_3y + r_4 = 0
r_1(1^2 + 1^2) + r_2(1) + r_3(1) + r_4 = 0
r_1((−2)^2 + 0^2) + r_2(−2) + r_3(0) + r_4 = 0
r_1(0^2 + 1^2) + r_2(0) + r_3(1) + r_4 = 0
Since this system must have more than one solution, the determinant of the coefficient matrix is zero. Writing out this equation we obtain the coefficients $r_i$ of the equation for the circle. That is,
\[
\begin{vmatrix} x^2 + y^2 & x & y & 1\\ 2 & 1 & 1 & 1\\ 4 & -2 & 0 & 1\\ 1 & 0 & 1 & 1 \end{vmatrix} = 0.
\]
Expanding, we have $-(x^2 + y^2) + x - 5y + 6 = 0$, or $x^2 + y^2 - x + 5y - 6 = 0$, which upon completing squares becomes $(x - \frac{1}{2})^2 + (y + \frac{5}{2})^2 = \frac{25}{2}$: a circle centred at $(\frac{1}{2}, -\frac{5}{2})$ with radius $\frac{5\sqrt{2}}{2}$.
(c). Volume of a Tetrahedron
The volume of a tetrahedron whose vertices are $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$, $(x_3, y_3, z_3)$ and $(x_4, y_4, z_4)$ is given by
\[
\text{Volume} = \pm\frac{1}{6}\det\begin{pmatrix} x_1 & y_1 & z_1 & 1\\ x_2 & y_2 & z_2 & 1\\ x_3 & y_3 & z_3 & 1\\ x_4 & y_4 & z_4 & 1 \end{pmatrix} \tag{1.3}
\]
where the ± sign is chosen to give a positive volume.
Example 1.24 Find the volume of the tetrahedron whose vertices are (0, 4, 1), (4, 0, 0), (3, 5, 2) and (2, 2, 5).
Solution. We compute
\[
\frac{1}{6}\begin{vmatrix} 0 & 4 & 1 & 1\\ 4 & 0 & 0 & 1\\ 3 & 5 & 2 & 1\\ 2 & 2 & 5 & 1 \end{vmatrix} = \frac{1}{6}(-72) = -12,
\]
so the volume is |−12| = 12.
(d). Test for Coplanar Points in Space
Theorem 1.20 If four points in 3-dimensional space happen to lie in the same plane,
then the determinant in the formula for volume (1.3) turns out to be zero. Thus, four
points (x1 , y1 , z1 ), (x2 , y2 , z2 ), (x3 , y3 , z3 ) and (x4 , y4 , z4 ) are coplanar if and only if
\[
\det\begin{pmatrix} x_1 & y_1 & z_1 & 1\\ x_2 & y_2 & z_2 & 1\\ x_3 & y_3 & z_3 & 1\\ x_4 & y_4 & z_4 & 1 \end{pmatrix} = 0.
\]
This test provides the determinant form for the equation of a plane passing through
three points in space.
The equation of the plane passing through the distinct points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ and $(x_3, y_3, z_3)$ is given by
\[
\det\begin{pmatrix} x & y & z & 1\\ x_1 & y_1 & z_1 & 1\\ x_2 & y_2 & z_2 & 1\\ x_3 & y_3 & z_3 & 1 \end{pmatrix} = 0.
\]
Example 1.25 Find the equation of the plane passing through the points (0, 1, 0), (−1, 3, 2) and (−2, 0, 1).
Solution. Using the determinant form of the equation of the plane passing through three distinct points produces
\[
\det\begin{pmatrix} x & y & z & 1\\ 0 & 1 & 0 & 1\\ -1 & 3 & 2 & 1\\ -2 & 0 & 1 & 1 \end{pmatrix} = 0.
\]
Expanding, we obtain 4x − 3y + 5z = −3.
1.7 Solved Exercises
1. Show that $\begin{vmatrix} a+b+2c & a & b\\ c & b+c+2a & b\\ c & a & c+a+2b \end{vmatrix} = 2(a+b+c)^3$.
Solution. Applying the row operations $R_1 := R_1 - R_2$ and $R_2 := R_2 - R_3$, we get
\[
\begin{vmatrix} a+b+2c & a & b\\ c & b+c+2a & b\\ c & a & c+a+2b \end{vmatrix}
= \begin{vmatrix} a+b+c & -(a+b+c) & 0\\ 0 & a+b+c & -(a+b+c)\\ c & a & c+a+2b \end{vmatrix}
= (a+b+c)^2 \begin{vmatrix} 1 & -1 & 0\\ 0 & 1 & -1\\ c & a & c+a+2b \end{vmatrix}
= (a+b+c)^2(c+a+2b+a+c) = 2(a+b+c)^3.
\]
2. Show that $\begin{vmatrix} 1 & 1 & 1\\ x & y & z\\ x^2 & y^2 & z^2 \end{vmatrix} = (x-y)(y-z)(z-x)$.
Solution. Applying the column operations $C_2 := C_2 - C_1$ and $C_3 := C_3 - C_1$, we get
\[
\begin{vmatrix} 1 & 1 & 1\\ x & y & z\\ x^2 & y^2 & z^2 \end{vmatrix}
= \begin{vmatrix} 1 & 0 & 0\\ x & y-x & z-x\\ x^2 & y^2-x^2 & z^2-x^2 \end{vmatrix}
= \begin{vmatrix} y-x & z-x\\ y^2-x^2 & z^2-x^2 \end{vmatrix}
= (y-x)(z-x)\begin{vmatrix} 1 & 1\\ y+x & z+x \end{vmatrix}
= (y-x)(z-x)(z-y) = (x-y)(y-z)(z-x).
\]
3. Show that $\begin{vmatrix} 1 & a & b+c\\ 1 & b & c+a\\ 1 & c & a+b \end{vmatrix} = 0$.
Solution. Adding $C_2$ to $C_3$, we get
\[
\begin{vmatrix} 1 & a & b+c\\ 1 & b & c+a\\ 1 & c & a+b \end{vmatrix}
= \begin{vmatrix} 1 & a & a+b+c\\ 1 & b & a+b+c\\ 1 & c & a+b+c \end{vmatrix}
= (a+b+c)\begin{vmatrix} 1 & a & 1\\ 1 & b & 1\\ 1 & c & 1 \end{vmatrix}
= (a+b+c)\cdot 0 = 0,
\]
since the last determinant has two identical columns.
4. Consider the system
kx + y + z = 1
x + ky + z = 1
x + y + kz = 1
Use determinants to find those values of k for which the system has:
(a). a unique solution
(b). more than one solution
(c). no solution
Solution. (a). The system has a unique solution when the determinant det(A) of the coefficient matrix A is not equal to zero. It is easy to check that
\[
\det(A) = \begin{vmatrix} k & 1 & 1\\ 1 & k & 1\\ 1 & 1 & k \end{vmatrix} = k^3 - 3k + 2 = (k-1)^2(k+2).
\]
The system has a unique solution when
\[
(k-1)^2(k+2) \neq 0, \quad \text{i.e. when } k \neq 1 \text{ and } k \neq -2. \tag{1.4}
\]
(b). Using (1.4), the system has more than one solution when k = 1.
(c). Using (1.4), the system has no solution (i.e. is inconsistent) when k = −2.
5. Use Cramer's rule to solve the system
x + 3y − z = 4
2x − y + z = 3
3x − 2y + 2z = 5
Solution. The coefficient matrix for the system is $A = \begin{pmatrix} 1 & 3 & -1\\ 2 & -1 & 1\\ 3 & -2 & 2 \end{pmatrix}$ and $b = \begin{pmatrix} 4\\ 3\\ 5 \end{pmatrix}$. A simple computation gives det(A) = −2. Since det(A) is not zero, we can apply Cramer's rule. We first need to compute $\det(A_i)$, i = 1, 2, 3, where $A_i$ is the matrix obtained by replacing the ith column of A by the vector b:
\[
\det(A_1) = \begin{vmatrix} 4 & 3 & -1\\ 3 & -1 & 1\\ 5 & -2 & 2 \end{vmatrix} = -2, \quad
\det(A_2) = \begin{vmatrix} 1 & 4 & -1\\ 2 & 3 & 1\\ 3 & 5 & 2 \end{vmatrix} = -4, \quad
\det(A_3) = \begin{vmatrix} 1 & 3 & 4\\ 2 & -1 & 3\\ 3 & -2 & 5 \end{vmatrix} = -6.
\]
Hence $x = \frac{|A_1|}{|A|} = \frac{-2}{-2} = 1$, $y = \frac{|A_2|}{|A|} = \frac{-4}{-2} = 2$ and $z = \frac{|A_3|}{|A|} = \frac{-6}{-2} = 3$.
6. Find det(A) by elementary operations, where
\[
A = \begin{pmatrix} a_1 & x & x & \cdots & x\\ x & a_2 & x & \cdots & x\\ x & x & a_3 & \cdots & x\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ x & x & x & \cdots & a_n \end{pmatrix}.
\]
Solution. Subtracting the first row from each of the other rows, we get
\[
\det(A) = \begin{vmatrix} a_1 & x & x & \cdots & x\\ x-a_1 & a_2-x & 0 & \cdots & 0\\ x-a_1 & 0 & a_3-x & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ x-a_1 & 0 & 0 & \cdots & a_n-x \end{vmatrix}.
\]
Taking out $(a_1-x)$ from column 1, $(a_2-x)$ from column 2, and so on, we obtain
\[
\det(A) = (a_1-x)(a_2-x)\cdots(a_n-x)
\begin{vmatrix} \frac{a_1}{a_1-x} & \frac{x}{a_2-x} & \cdots & \frac{x}{a_n-x}\\ -1 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ -1 & 0 & \cdots & 1 \end{vmatrix}.
\]
Putting $\frac{a_1}{a_1-x} = 1 + \frac{x}{a_1-x}$ and adding all the other columns to the first one, we get
\[
\det(A) = (a_1-x)(a_2-x)\cdots(a_n-x)
\begin{vmatrix} 1+\frac{x}{a_1-x}+\cdots+\frac{x}{a_n-x} & \frac{x}{a_2-x} & \cdots & \frac{x}{a_n-x}\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{vmatrix}.
\]
The last matrix is upper triangular and its determinant is the product of the diagonal entries:
\[
\Bigl(1 + \frac{x}{a_1-x} + \cdots + \frac{x}{a_n-x}\Bigr)\cdot 1^{\,n-1}.
\]
Thus
\[
\det(A) = (a_1-x)(a_2-x)\cdots(a_n-x)\Bigl(1 + \frac{x}{a_1-x} + \cdots + \frac{x}{a_n-x}\Bigr).
\]
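Since the closed form above is easy to get wrong by a sign, here is an illustrative numerical spot-check (ours, not from the notes; NumPy assumed):

```python
# Numerical spot-check of the closed form in Solved Exercise 6,
# using randomly chosen a_i and a fixed x.
import numpy as np

rng = np.random.default_rng(0)
a, x = rng.standard_normal(5), 0.3
A = np.full((5, 5), x) + np.diag(a - x)          # a_i on the diagonal, x elsewhere
closed = np.prod(a - x) * (1 + np.sum(x / (a - x)))
print(np.isclose(np.linalg.det(A), closed))      # True
```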
7. Determine whether the following three planes intersect at a common point:
x + 2y − z = 4
2x − y + 3z = 3
4x + 3y − 2z = 5
Solution. The coefficient matrix is $A = \begin{pmatrix} 1 & 2 & -1\\ 2 & -1 & 3\\ 4 & 3 & -2 \end{pmatrix}$ and det(A) = 15. Since det(A) ≠ 0, we can use Cramer's rule to solve the system. It is easy to check that $\det(A_1) = 0$, $\det(A_2) = 45$ and $\det(A_3) = 30$, where $A_i$, i = 1, 2, 3, is the matrix obtained from A by replacing the ith column of A with the vector $b = (4, 3, 5)^t$.
Thus $x = \frac{|A_1|}{|A|} = 0$, $y = \frac{|A_2|}{|A|} = 3$ and $z = \frac{|A_3|}{|A|} = 2$. Thus the three planes intersect at the point P(0, 3, 2).
8. Find $A^{-1}$ by the classical adjoint (i.e. adjugate) method, where $A = \begin{pmatrix} 2 & 4\\ 6 & 8 \end{pmatrix}$.
Solution. Clearly det(A) = −8 ≠ 0. Therefore $A^{-1}$ exists. We compute all four cofactors of A:
\[
M_{11} = 8,\; C_{11} = 8; \quad M_{12} = 6,\; C_{12} = -6; \quad M_{21} = 4,\; C_{21} = -4; \quad M_{22} = 2,\; C_{22} = 2.
\]
Therefore the matrix of cofactors is $[C_{ij}] = \begin{pmatrix} C_{11} & C_{12}\\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} 8 & -6\\ -4 & 2 \end{pmatrix}$, and it is easy to see that $\mathrm{adj}(A) = \begin{pmatrix} 8 & -4\\ -6 & 2 \end{pmatrix}$.
Therefore
\[
A^{-1} = \frac{1}{\det(A)}\,\mathrm{adj}(A) = -\frac{1}{8}\begin{pmatrix} 8 & -4\\ -6 & 2 \end{pmatrix} = \begin{pmatrix} -1 & \frac{1}{2}\\ \frac{3}{4} & -\frac{1}{4} \end{pmatrix}.
\]
1.8 Exercises
1.(a). Evaluate the determinant $\begin{vmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{vmatrix}$.
2.(a). Show that $\begin{vmatrix} 1 & 1 & 1 & 1\\ 1 & x & 1 & 1\\ 1 & 1 & x & 1\\ 1 & 1 & 1 & x \end{vmatrix} = (x-1)^3$.
(b). Evaluate the determinant $\begin{vmatrix} 1 & 2 & 3 & 4 & 5\\ 2 & 3 & 4 & 5 & 1\\ 3 & 4 & 5 & 1 & 2\\ 4 & 5 & 1 & 2 & 3\\ 5 & 1 & 2 & 3 & 4 \end{vmatrix}$.
3.(a). Show that $\begin{vmatrix} b+c & c & b\\ c & c+a & a\\ b & a & a+b \end{vmatrix} = 4abc$.
(b). Prove that if all the entries of a 3 × 3 matrix A are equal to ±1, then det(A) is
an even number.
4. Show that:
(a). $\begin{vmatrix} a_1 & b_1 & a_1x + b_1y + c_1\\ a_2 & b_2 & a_2x + b_2y + c_2\\ a_3 & b_3 & a_3x + b_3y + c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1\\ a_2 & b_2 & c_2\\ a_3 & b_3 & c_3 \end{vmatrix}$
(b). $\begin{vmatrix} 1 & a & bc\\ 1 & b & ca\\ 1 & c & ab \end{vmatrix} = (b-a)(c-a)(c-b)$
(c). $\begin{vmatrix} 1 & a & a^3\\ 1 & b & b^3\\ 1 & c & c^3 \end{vmatrix} = (a+b+c)\begin{vmatrix} 1 & a & a^2\\ 1 & b & b^2\\ 1 & c & c^2 \end{vmatrix}$
(d). $\begin{vmatrix} 1+c_1 & 1 & 1 & 1\\ 1 & 1+c_2 & 1 & 1\\ 1 & 1 & 1+c_3 & 1\\ 1 & 1 & 1 & 1+c_4 \end{vmatrix} = c_1c_2c_3c_4\Bigl(1 + \frac{1}{c_1} + \frac{1}{c_2} + \frac{1}{c_3} + \frac{1}{c_4}\Bigr)$
(ii). $\begin{vmatrix} \alpha^2+1 & \alpha\beta & \alpha\zeta\\ \alpha\beta & \beta^2+1 & \beta\zeta\\ \alpha\zeta & \beta\zeta & \zeta^2+1 \end{vmatrix}$
(iii). $\begin{vmatrix} \sin\alpha & \cos\alpha & 1\\ \sin\beta & \cos\beta & 1\\ \sin\zeta & \cos\zeta & 1 \end{vmatrix}$.
Show that
\[
\begin{vmatrix} 0 & 1 & 1 & a\\ 1 & 0 & 1 & b\\ 1 & 1 & 0 & c\\ a & b & c & d \end{vmatrix} = a^2 + b^2 + c^2 - 2ab - 2bc - 2ac + 2d.
\]
5.(a). Consider the two matrices below, and suppose you have already computed det(A) = −120. What is det(B)? Why?
\[
A = \begin{pmatrix} 0 & 8 & 3 & -4\\ -1 & 2 & -2 & 5\\ -2 & 8 & 4 & 3\\ 0 & -4 & 2 & 3 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 8 & 3 & -4\\ 0 & -4 & 2 & 3\\ -2 & 8 & 4 & 3\\ -1 & 2 & -2 & 5 \end{pmatrix}
\]
(b). Solve the equation
\[
\begin{vmatrix} 1 & 1 & 1 & \cdots & 1\\ 1 & 1-x & 1 & \cdots & 1\\ 1 & 1 & 2-x & \cdots & 1\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & 1 & 1 & \cdots & n-x \end{vmatrix} = 0.
\]
Show that $g = g(x_1, x_2, \ldots, x_n) = (-1)^n V_{n-1}(x)$, where $x = x_n$ and $V_{n-1}$ is the Vandermonde determinant (named after the French mathematician and musician Alexandre-Théophile Vandermonde (1735–1796), one of the founders of the theory of determinants) defined by
\[
V_{n-1} = \begin{vmatrix} 1 & 1 & \cdots & 1 & 1\\ x_1 & x_2 & \cdots & x_{n-1} & x\\ x_1^2 & x_2^2 & \cdots & x_{n-1}^2 & x^2\\ \vdots & \vdots & & \vdots & \vdots\\ x_1^{n-1} & x_2^{n-1} & \cdots & x_{n-1}^{n-1} & x^{n-1} \end{vmatrix}.
\]
7. Using Cramer’s rule, find x, y, and z for each of the following systems of equations.
3x − 4y + 2z = 1
2x + 3y − 3z = −1
5x − 5y + 4z = 7.
3x + 4y − 2z = 3
2x + 2y − 3z = 1
−x + y − 2z = −2.
4x + 7y − z = 7
3x + 2y + 2z = 9
x + 5y − 3z = 3
x + 3y = 0
2x + 6y + 4z = 0
−x + 2z = 0.
\[
\frac{6}{x} - \frac{2}{y} + \frac{1}{z} = 4, \qquad
\frac{2}{x} + \frac{5}{y} - \frac{2}{z} = \frac{3}{4}, \qquad
\frac{5}{x} - \frac{1}{y} + \frac{3}{z} = \frac{63}{4}.
\]
x+y+z = 1
x + 1.0001y + 2z = 2
x + 2y + 2z = 1
8. Do the following planes intersect? Give a reason for your answer(Hint: Do not use
Cramer’s rule.)
x+y+z = 2
2x + 2y + 2z = 4
\[
\frac{x}{2} + \frac{y}{2} + \frac{z}{2} = 1
\]
9. Let $A = \begin{pmatrix} -2 & 1 & 0\\ 2 & 6 & 2\\ 1 & 8 & 4 \end{pmatrix}$.
(c). Find det(A). Does $A^{-1}$ exist? Give a reason for your answer. If so, find $A^{-1}$.
(a). $\begin{vmatrix} \frac{1+\alpha_1}{\alpha_1} & 1 & \cdots & 1\\ 1 & \frac{1+\alpha_2}{\alpha_2} & \cdots & 1\\ \vdots & \vdots & \ddots & \vdots\\ 1 & 1 & \cdots & \frac{1+\alpha_n}{\alpha_n} \end{vmatrix}_{n\times n} = \dfrac{1 + \sum \alpha_i}{\prod \alpha_i}$.
(b). $\begin{vmatrix} \alpha & \beta & \beta & \cdots & \beta\\ \beta & \alpha & \beta & \cdots & \beta\\ \beta & \beta & \alpha & \cdots & \beta\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ \beta & \beta & \beta & \cdots & \alpha \end{vmatrix}_{n\times n} = \begin{cases} (\alpha-\beta)^n\bigl(1 + \frac{n\beta}{\alpha-\beta}\bigr), & \text{if } \alpha \neq \beta\\ 0, & \text{if } \alpha = \beta \end{cases}$
(c). $\begin{vmatrix} 1+\alpha_1 & \alpha_2 & \cdots & \alpha_n\\ \alpha_1 & 1+\alpha_2 & \cdots & \alpha_n\\ \vdots & \vdots & \ddots & \vdots\\ \alpha_1 & \alpha_2 & \cdots & 1+\alpha_n \end{vmatrix}_{n\times n} = 1 + \alpha_1 + \alpha_2 + \cdots + \alpha_n$.
14.(a). Find det(A), where $A = \begin{pmatrix} 0 & 3 & 2\\ 1 & 5 & 1\\ -4 & 2 & -1 \end{pmatrix}$.
(b). With no further computation, what is $\det\begin{pmatrix} 0 & 3 & 2\\ 10 & 50 & 10\\ -4 & 2 & -1 \end{pmatrix}$?
(c). With no further computation, what is $\det\begin{pmatrix} 0 & 3 & 2\\ 10 & 56 & 14\\ -4 & 2 & -1 \end{pmatrix}$?
Chapter 2
EIGENVALUES AND
EIGENVECTORS
2.1 Introduction
We welcome you to the second lecture, on eigenvalues and eigenvectors of square matrices. In this chapter we will prove that every square matrix has at least one eigenvalue and an eigenvector to go with it. We will also determine the maximum number of eigenvalues a square matrix may have. The determinant function will be a powerful tool here. However, it is possible, with some more advanced machinery, to compute eigenvalues without ever making use of determinants.
Objectives
At the end of this lecture, you should be able to:
Remark 2.1
Note that we omit the case x = 0 since A0 = λ0 is true for all values of λ. An eigenvalue
λ = 0, however, is possible. The equation
Ax = λx
is equivalent to
(λI − A)x = 0, x ̸= 0, (2.1)
where I is the n × n identity matrix. If equation (2.1) is to have non-zero solutions, then
λ must be chosen so that the n × n matrix λI − A is singular. That is, det(λI − A) = 0.
Therefore the eigenvalue problem consists of two parts:
1. Find all scalars λ such that the matrix A − λI is singular, i.e. det(λI − A) = 0.
2. Given that λI − A is singular, find all the non-zero vectors x such that
(λI − A)x = 0.
Clearly if we know an eigenvalue of A, then the elementary row operation techniques
provide an efficient way to find the eigenvectors.
Many problems in the sciences lead to the eigenvalue problem. Eigenvalue analysis is used in solving systems of differential equations, solving optimization problems, and diagonalizing linear transformations. Eigenvalue analysis is also used in the design of car stereo systems so that the sounds are directed correctly for listening; in the study of vibration in bridges, multi-storey buildings, aircraft, suspension systems of cars and aerospace vehicles such as rockets and missiles; and in describing the evolution of "discrete-time systems".
Solution. $Ax_1 = \begin{pmatrix} 2 & 0\\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1\\ 0 \end{pmatrix} = \begin{pmatrix} 2\\ 0 \end{pmatrix} = 2\begin{pmatrix} 1\\ 0 \end{pmatrix}$.
Remark 2.2
From Proposition 2.2 and Proposition 2.3, we conclude that the set of all eigenvectors
of a given eigenvalue λ, together with the zero vector, is a vector subspace of Rn . This
special subspace of Rn is called the eigenspace of A, and is usually denoted EA (λ).
Clearly EA (λ) = Ker(λI − A).
Proof
First note that 0 ∈ EA (λ) by definition and 0 ∈ Ker(λI − A). Now consider any nonzero
vector x ∈ Rn or Cn . Then
\[
x \in E_A(\lambda) \iff \lambda x = Ax \iff \lambda x - Ax = 0 \iff \lambda Ix - Ax = 0 \iff (\lambda I - A)x = 0 \iff x \in \mathrm{Ker}(\lambda I - A).
\]
Definition 2.1 tells us that the eigenvalues of an n × n matrix A correspond to the roots of the characteristic polynomial of A. That is, they are the solutions of the characteristic equation.
Example 2.2 Find the eigenvalues and corresponding eigenvectors of $A = \begin{pmatrix} 2 & -12\\ 1 & -5 \end{pmatrix}$.
Solution. The characteristic equation is
\[
\chi_A(\lambda) = |\lambda I - A| = \begin{vmatrix} \lambda-2 & 12\\ -1 & \lambda+5 \end{vmatrix} = (\lambda-2)(\lambda+5) - (-12) = \lambda^2 + 3\lambda - 10 + 12 = \lambda^2 + 3\lambda + 2 = (\lambda+1)(\lambda+2) = 0,
\]
so the eigenvalues of A are $\lambda_1 = -1$ and $\lambda_2 = -2$.
• For $\lambda = \lambda_1 = -1$, we have $-I - A = \begin{pmatrix} -3 & 12\\ -1 & 4 \end{pmatrix}$, which reduces to $\begin{pmatrix} 1 & -4\\ 0 & 0 \end{pmatrix}$. Solving $\begin{pmatrix} 1 & -4\\ 0 & 0 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}$, we have $x_1 - 4x_2 = 0$. Letting $x_2 = t \neq 0$, we conclude that every eigenvector of $\lambda_1$ is of the form
\[
v_1 = \begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 4t\\ t \end{pmatrix} = t\begin{pmatrix} 4\\ 1 \end{pmatrix}, \quad t \neq 0.
\]
In particular, $v_1 = \begin{pmatrix} 4\\ 1 \end{pmatrix}$ is an eigenvector of A corresponding to the eigenvalue $\lambda = -1$.
• For $\lambda = \lambda_2 = -2$, we have $-2I - A = \begin{pmatrix} -4 & 12\\ -1 & 3 \end{pmatrix}$, which reduces to $\begin{pmatrix} -1 & 3\\ 0 & 0 \end{pmatrix}$. Solving $\begin{pmatrix} -1 & 3\\ 0 & 0 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix}$, we have $-x_1 + 3x_2 = 0$. Letting $x_2 = t \neq 0$, we conclude that every eigenvector of $\lambda_2$ is of the form
\[
v_2 = \begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 3t\\ t \end{pmatrix} = t\begin{pmatrix} 3\\ 1 \end{pmatrix}, \quad t \neq 0.
\]
In particular, $v_2 = \begin{pmatrix} 3\\ 1 \end{pmatrix}$ is an eigenvector of A corresponding to the eigenvalue $\lambda = -2$.
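In practice one computes eigenvalues and eigenvectors numerically. The sketch below (ours, not from the notes) checks Example 2.2 with NumPy; note that np.linalg.eig returns unit-norm eigenvectors as columns, so they agree with (4, 1) and (3, 1) only up to scaling.

```python
# Numerical check of Example 2.2.
import numpy as np

A = np.array([[2, -12], [1, -5]], dtype=float)
w, V = np.linalg.eig(A)
print(w)                                 # [-1. -2.] (order may vary)
for lam, v in zip(w, V.T):               # columns of V are the eigenvectors
    print(np.allclose(A @ v, lam * v))   # True, True
```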
Example 2.3 Find the eigenvalues and corresponding eigenvectors of $A = \begin{pmatrix} 2 & 1 & 0\\ 0 & 2 & 0\\ 0 & 0 & 2 \end{pmatrix}$.
Solution. The characteristic equation is
\[
\chi_A(\lambda) = |\lambda I - A| = \begin{vmatrix} \lambda-2 & -1 & 0\\ 0 & \lambda-2 & 0\\ 0 & 0 & \lambda-2 \end{vmatrix} = (\lambda-2)^3 = 0,
\]
so $\lambda = 2$ is the only eigenvalue. Solving $(2I - A)x = 0$ gives $x_2 = 0$, with $x_1$ and $x_3$ free. Setting $s = x_1$, $t = x_3$, we find that the eigenvectors of $\lambda = 2$ are of the form
\[
v = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} s\\ 0\\ t \end{pmatrix} = s\begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix} + t\begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}, \quad s, t \text{ not both } 0.
\]
In particular, $v_1 = \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}$ and $v_2 = \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}$ are eigenvectors of A corresponding to the eigenvalue $\lambda = 2$.
Since $\lambda = 2$ has two linearly independent eigenvectors, the dimension of its eigenspace is 2.
Example 2.4 Find the eigenvalues and eigenspaces of $A = \begin{pmatrix} 3 & -2 & 0\\ -2 & 3 & 0\\ 0 & 0 & 5 \end{pmatrix}$.
Solution. It is easy to show that the characteristic polynomial of A is $(\lambda - 1)(\lambda - 5)^2$. Therefore the eigenvalues of A are $\lambda = 1$ and $\lambda = 5$.
• $\lambda = 1$: $\begin{pmatrix} -2 & 2 & 0\\ 2 & -2 & 0\\ 0 & 0 & -4 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}$, with solutions $x_3 = 0$, $x_2 = t$, $x_1 = t$, $t \neq 0$. So
\[
v_1 = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} t\\ t\\ 0 \end{pmatrix} = t\begin{pmatrix} 1\\ 1\\ 0 \end{pmatrix}.
\]
So the eigenvector basis for $\lambda = 1$ is $\Bigl\{\begin{pmatrix} 1\\ 1\\ 0 \end{pmatrix}\Bigr\}$. That is, the eigenspace for $\lambda = 1$ is the span of $\begin{pmatrix} 1\\ 1\\ 0 \end{pmatrix}$.
• $\lambda = 5$: $\begin{pmatrix} 2 & 2 & 0\\ 2 & 2 & 0\\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}$. Solving, we have $x_1 = -x_2$, with $x_2$ and $x_3$ free. Thus
\[
v = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} -x_2\\ x_2\\ x_3 \end{pmatrix} = x_2\begin{pmatrix} -1\\ 1\\ 0 \end{pmatrix} + x_3\begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}.
\]
Letting $x_2 = s$ and $x_3 = t$ (not both zero), we get $v = s\begin{pmatrix} -1\\ 1\\ 0 \end{pmatrix} + t\begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}$. Thus v can be expressed as a linear combination of two linearly independent vectors, and a basis for the eigenspace of $\lambda = 5$ is $\Bigl\{\begin{pmatrix} -1\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}\Bigr\}$.
Given a polynomial $f(t) = a_0 + a_1t + \cdots + a_nt^n$ and a square matrix A, we define $f(A) = a_0I + a_1A + \cdots + a_nA^n$, where I is the identity matrix of the same size as A. We say that A is a root of f(t) if f(A) = 0, the zero matrix.
Example. Let $p(x) = 14 + 19x - 3x^2 - 7x^3 + x^4$, and $A = \begin{pmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{pmatrix}$.
We will compute p(A). First we compute the necessary powers of A. Note that $A^0$ is defined as the identity matrix. It is easy to show that
\[
A^2 = \begin{pmatrix} -2 & -1 & -6\\ 5 & 1 & 0\\ 1 & -8 & -7 \end{pmatrix}, \quad
A^3 = \begin{pmatrix} 19 & -12 & -8\\ -4 & 15 & 8\\ 12 & -4 & 11 \end{pmatrix}, \quad
A^4 = \begin{pmatrix} -7 & 49 & 54\\ -5 & -4 & -30\\ -49 & 47 & 43 \end{pmatrix}.
\]
Then
\[
p(A) = 14I + 19A - 3A^2 - 7A^3 + A^4 = \begin{pmatrix} -139 & 193 & 166\\ 27 & -98 & -124\\ -193 & 118 & 20 \end{pmatrix}.
\]
Note that p(x) factors as $p(x) = 14 + 19x - 3x^2 - 7x^3 + x^4 = (x-2)(x-7)(x+1)^2$. Therefore
\[
p(A) = (A - 2I)(A - 7I)(A + I)^2.
\]
This example shows that it is natural to evaluate a polynomial with a matrix, and that
the factored form of the polynomial is as good as (or may be better than) the expanded
form.
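A quick numerical confirmation of this example (an illustrative sketch, not part of the notes; NumPy assumed):

```python
# Evaluating p(A) = 14I + 19A - 3A^2 - 7A^3 + A^4 in both forms.
import numpy as np

A = np.array([[-1, 3, 2], [1, 0, -2], [-3, 1, 1]], dtype=float)
I = np.eye(3)
p_expanded = 14*I + 19*A - 3*(A @ A) - 7*np.linalg.matrix_power(A, 3) \
             + np.linalg.matrix_power(A, 4)
p_factored = (A - 2*I) @ (A - 7*I) @ np.linalg.matrix_power(A + I, 2)
print(np.allclose(p_expanded, p_factored))  # True
print(p_expanded)   # [[-139 193 166], [27 -98 -124], [-193 118 20]]
```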
Definition 2.2 Suppose A is an n×n matrix. Let χA (λ) be the characteristic polynomial
of A. The algebraic multiplicity of an eigenvalue λ0 is the number of times it appears
in the factorization χA (λ) = (λ − λ1 )(λ − λ2 )...(λ − λn ) of the characteristic polynomial.
That is, the highest power of (λ − λ0 ) that divides the characteristic polynomial χA (λ).
The geometric multiplicity of λ0 is the dimension of the eigenspace EA (λ0 ) of λ0 .
That is, the number of linearly independent eigenvectors that span EA (λ0 ).
Theorem 2.8 Suppose A is an n × n matrix. Then A cannot have more than n distinct
eigenvalues.
Solution. The characteristic polynomial of A is $\chi_A(\lambda) = (\lambda - 1)^3$, and thus the only eigenvalue of A is $\lambda = 1$. This eigenvalue has algebraic multiplicity 3. Solving for the eigenspace using $(I - A)x = 0$ gives $x_2 = 0$, $x_3 = 0$, and $x_1$ is a free variable. Thus v is in the eigenspace $E_1$ if and only if v is of the form
\[
v = \begin{pmatrix} x_1\\ 0\\ 0 \end{pmatrix} = x_1\begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}.
\]
Thus the geometric multiplicity of the eigenvalue $\lambda = 1$ is 1.
Example 2.7 Determine the algebraic and geometric multiplicities of the eigenvalues of $B = \begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}$.
\[
|\lambda I - A| = \begin{vmatrix} \lambda I - A_1 & -B\\ 0 & \lambda I - A_2 \end{vmatrix} = |\lambda I - A_1|\cdot|\lambda I - A_2| = \chi_{A_1}(\lambda)\,\chi_{A_2}(\lambda).
\]
That is, the characteristic polynomial of A is the product of the characteristic polynomials of the diagonal blocks $A_1$ and $A_2$.
By induction, we have the following result.
Theorem 2.9 Suppose A is a block triangular matrix with square diagonal blocks A1 , A2 , ..., Ar .
Then the characteristic polynomial of A is the product of the characteristic polynomials
of the diagonal blocks Ai , i = 1, 2, ..., r. That is,
\[
\chi_A(\lambda) = \chi_{A_1}(\lambda)\,\chi_{A_2}(\lambda)\cdots\chi_{A_r}(\lambda).
\]
Example 2.8 Find the characteristic polynomial of $M = \begin{pmatrix} 9 & -1 & 5 & 7\\ 8 & 3 & 2 & -4\\ 0 & 0 & 3 & 6\\ 0 & 0 & -1 & 8 \end{pmatrix}$.
Solution. Check that M is a block triangular matrix with diagonal blocks $M_1 = \begin{pmatrix} 9 & -1\\ 8 & 3 \end{pmatrix}$ and $M_2 = \begin{pmatrix} 3 & 6\\ -1 & 8 \end{pmatrix}$, with $\chi_{M_1}(\lambda) = \lambda^2 - 12\lambda + 35 = (\lambda-5)(\lambda-7)$ and $\chi_{M_2}(\lambda) = \lambda^2 - 11\lambda + 30 = (\lambda-5)(\lambda-6)$. Thus the characteristic polynomial of M is the product
\[
\chi_M(\lambda) = \chi_{M_1}(\lambda)\,\chi_{M_2}(\lambda) = (\lambda-5)^2(\lambda-6)(\lambda-7).
\]
We know that two linear systems of equations have the same solution if their augmented
matrices are row equivalent. We now identify classes of matrices that have the same
eigenvalues.
Remark 2.3
Theorem 2.10 If A and B are similar square matrices, then A and B have the same
characteristic polynomials and hence the same eigenvalues. Moreover, these eigenvalues
have the same algebraic multiplicity.
Proof. Since A and B are similar, there exists a non-singular matrix P such that
B = P −1 AP . To establish the above fact, observe that
\[
\chi_B(\lambda) = |\lambda I - B| = |\lambda I - P^{-1}AP| = |\lambda P^{-1}P - P^{-1}AP| = |P^{-1}(\lambda I - A)P| = |P^{-1}|\,|\lambda I - A|\,|P| = |P^{-1}|\,|P|\,|\lambda I - A| = |\lambda I - A| = \chi_A(\lambda).
\]
Remark 2.4
Note that although similar matrices always have the same characteristic polynomial, it
is not true that two matrices with the same characteristic polynomials are necessarily
similar.
For example, let I be the 2 × 2 identity matrix and let A be any 2 × 2 matrix other than I with characteristic polynomial $(\lambda - 1)^2$, for instance $A = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix}$. Now $\chi(\lambda) = (\lambda - 1)^2$ is the characteristic polynomial for both A and I; so A and I have the same set of eigenvalues. If A and I were similar, however, there would be a 2 × 2 matrix P such that $I = P^{-1}AP$. But the equation $I = P^{-1}AP$ is equivalent to $P = AP$, which is in turn equivalent to $PP^{-1} = A$, or $I = A$. Thus I and A cannot be similar.
Remark 2.5
Two matrices can have exactly the same characteristic polynomial without being similar,
so similarity leads to a more finely detailed way to distinguish matrices. Although similar
matrices have the same eigenvalues, they do not generally have the same eigenvectors.
For example, if B = P −1 AP and if Bx = λx, then P −1 AP x = λx or A(P x) = λ(P x).
Thus if x is an eigenvector for B corresponding to λ, then P x is an eigenvector for A
corresponding to λ.
Dk = (P −1 AP )k = P −1 Ak P.
Since D is diagonal, it is easy to form the power Dk . Once the matrix Dk has been
computed, the matrix Ak can be recovered easily by forming P Dk P −1 :
P Dk P −1 = P (P −1 Ak P )P −1 = Ak .
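As a quick illustration of this power trick, here is a hedged sketch in Python with NumPy (assumed only for this illustration), applied to a diagonalizable 2 × 2 matrix with eigenvalues 2 and −1:

    import numpy as np

    A = np.array([[5.0, -6.0],
                  [3.0, -4.0]])

    vals, P = np.linalg.eig(A)          # columns of P are eigenvectors
    Dk = np.diag(vals ** 10)            # powers of a diagonal matrix are cheap
    Ak = P @ Dk @ np.linalg.inv(P)      # A^k = P D^k P^{-1}

    assert np.allclose(Ak, np.linalg.matrix_power(A, 10))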
Diagonalization Algorithm
Given an n × n matrix A,
Step 1: Find the characteristic polynomial χA (λ) of A.
Step 2: Find all the n roots of χA (λ) to obtain the eigenvalues of A.
Step 3: Find all the eigenvectors v i corresponding to each eigenvalue λi .
Step 4: Consider the collection S = {v 1 , v 2 , ..., v m } of all eigenvectors obtained in Step 3.
Step 5: If S contains n linearly independent eigenvectors, form P = [v 1 |v 2 |...|v n ]; then P −1 AP = D is diagonal, with the eigenvalues of A on its diagonal. Otherwise, A is not diagonalizable.
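The algorithm can be sketched numerically as follows (Python with NumPy, an assumption of this illustration; numpy.linalg.eig carries out Steps 1-3 at once, and the rank test is a rough numerical stand-in for the independence check of Steps 4-5):

    import numpy as np

    def diagonalize(A):
        vals, P = np.linalg.eig(A)            # Steps 1-3
        n = A.shape[0]
        if np.linalg.matrix_rank(P) < n:      # Steps 4-5: independence check
            raise ValueError("A is not diagonalizable")
        return P, np.diag(vals)

    A = np.array([[5.0, -6.0], [3.0, -4.0]])
    P, D = diagonalize(A)
    assert np.allclose(np.linalg.inv(P) @ A @ P, D)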
Theorem 2.11 Let A be an n × n matrix. Suppose that v 1 , v 2 , ..., v n are non-zero eigen-
vectors of A belonging to distinct eigenvalues λ1 , λ2 , ..., λn . Then v 1 , v 2 , ..., v n are linearly
independent.
Proof. We prove by contradiction. Suppose the theorem is not true. Let v 1 , v 2 , ..., v s
be a minimal set of vectors for which the theorem is not true. We have s > 1, since
v 1 ̸= 0. Also, by the minimality condition, v 2 , v 3 , ..., v s are linearly independent. Thus
v 1 is a linear combination of v 2 , v 3 , ..., v s , say
v 1 = a2 v 2 + a3 v 3 + ... + as v s (2.2)
where some ak ≠ 0. Applying A to both sides of (2.2) and using the linearity of A yields
λ1 v 1 = a2 λ2 v 2 + a3 λ3 v 3 + ... + as λs v s .   (2.3)
Multiplying (2.2) by λ1 gives
λ1 v 1 = a2 λ1 v 2 + a3 λ1 v 3 + ... + as λ1 v s .   (2.4)
Subtracting (2.3) from (2.4), we obtain
0 = a2 (λ1 − λ2 )v 2 + a3 (λ1 − λ3 )v 3 + ... + as (λ1 − λs )v s .   (2.5)
Since v 2 , v 3 , ..., v s are linearly independent, the coefficients in (2.5) must all be zero. That is,
a2 (λ1 − λ2 ) = 0, a3 (λ1 − λ3 ) = 0, ..., as (λ1 − λs ) = 0.
Since the eigenvalues λi are distinct, λ1 − λk ≠ 0 for k = 2, ..., s, and hence a2 = a3 = ... = as = 0. This contradicts the fact that some ak ≠ 0 in (2.2), and the theorem follows.
Theorem 2.12 If an n × n matrix A has n distinct eigenvalues, then A is diagonalizable.
Proof. Since A has n distinct eigenvalues, by Theorem 2.11 A has a set of n linearly
independent eigenvectors. Thus A is diagonalizable.
Not every square matrix is diagonalizable. Consider, for example, the non-zero nilpotent matrix A = ( 0 1; 0 0 ), which is not diagonalizable. Observe that A2 = 0. If there exists a non-singular matrix P such
that P −1 AP = D, where D is diagonal then D2 = P −1 A2 P = 0 implies that D = 0,
which implies that A = 0, which is false. Thus A, as well as any other nonzero nilpotent
matrix, is not diagonalizable. Non-zero nilpotent matrices are not the only ones that
can't be diagonalized but, as we will see, nilpotent matrices play a central role in the theory of non-diagonalizability.
We give necessary and sufficient conditions for a square matrix to be diagonalizable.
Theorem 2.13 An n × n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors.
Proof.
(⇐=): Let S = {v 1 , v 2 , ..., v n } be a linearly independent set of eigenvectors of A corre-
sponding to the eigenvalues λ1 , λ2 , ..., λn of A. Define P = [v 1 |v 2 |...|v n ]. That is, P is a
matrix whose columns are the eigenvectors of A. Define
D = diag(λ1 , λ2 , ..., λn ) = [λ1 e1 |λ2 e2 |...|λn en ],
where ei denotes the i-th standard unit vector, so that v i = P ei .
The columns of P are the eigenvectors of the linearly independent set S and so P is
non-singular. We know P −1 exists and
P −1 AP = P −1 A[v 1 |v 2 |...|v n ]
= P −1 [Av 1 |Av 2 |...|Av n ]
= P −1 [λ1 v 1 |λ2 v 2 |...|λn v n ]
= P −1 [λ1 P e1 |λ2 P e2 |...|λn P en ]
= P −1 [P (λ1 e1 )|P (λ2 e2 )|...|P (λn en )]
= P −1 P [λ1 e1 |λ2 e2 |...|λn en ]
= ID
= D
(=⇒): Conversely, suppose A is diagonalizable. Then there exist a non-singular matrix T = [y 1 |y 2 |...|y n ] and a diagonal matrix E = diag(d1 , d2 , ..., dn ) = [d1 e1 |d2 e2 |...|dn en ] such that T −1 AT = E. Then consider
AT = T E, that is, [Ay 1 |Ay 2 |...|Ay n ] = [d1 y 1 |d2 y 2 |...|dn y n ].
Thus we conclude that the individual columns are equal vectors. That is, Ay i = di y i ,
for 1 ≤ i ≤ n. That is, y i is an eigenvector of A corresponding to the eigenvalue di .
Since T is non-singular, the set S consisting of the columns of T is linearly independent. So the set S has all the required properties.
It is clear that diagonalizable matrices have full eigenspaces.
Theorem 2.15 A square matrix A is diagonalizable if and only if the algebraic multi-
plicity of each eigenvalue is the same as its geometric multiplicity. That is, for every
eigenvalue λ of A, GA (λ) = αA (λ).
Example Diagonalize the matrix A = ( 5 −6; 3 −4 ).
Solution. It is easy to verify that A has eigenvalues λ1 = 2 and λ2 = −1 with corresponding eigenvectors v 1 = (2, 1)ᵗ and v 2 = (1, 1)ᵗ. Forming P = [v 1 v 2 ], we obtain
P = ( 2 1; 1 1 ),  P −1 = ( 1 −1; −1 2 ).
It is easy to check that
P −1 AP = ( 2 0; 0 −1 ) = D.
Definition 2.6 can be rephrased as follows: A square matrix Q is orthogonal if and only
if Qt Q = I. Another useful description of orthogonal matrices can be obtained from the
above relation. Suppose Q = [q1 , q2 , ..., qn ] is an n × n matrix. Since the i-th row of Qᵗ
is equal to qit , the definition of matrix multiplication tells us that: the ij-th entry of Qt Q
is equal to qit qj . Thus a matrix Q is orthogonal if and only if the columns {q1 , q2 , ..., qn }
of Q, form an orthonormal set of vectors.
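Both descriptions are easy to test numerically. A small sketch (Python with NumPy, assumed only for illustration) on a rotation matrix:

    import numpy as np

    t = np.pi / 6
    Q = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])

    assert np.allclose(Q.T @ Q, np.eye(2))   # Q^t Q = I
    for i in range(2):                       # equivalently, the columns of Q
        for j in range(2):                   # form an orthonormal set
            assert np.isclose(Q[:, i] @ Q[:, j], 1.0 if i == j else 0.0)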
Example 2.12 Let A = ( 1 2; 2 3 ) and B = ( 1 2; 1 2 ). Then A is symmetric while B is not symmetric.
Symmetric matrices A have the property that αA (λ) = GA (λ) and hence are examples
of perfect matrices.
Theorem 2.17 If A is an n × n real symmetric matrix, then all its eigenvalues are real.
Proof. Let A be any n × n real symmetric matrix, and suppose that Av = λv, v ≠ 0, where we allow the possibility that v is a complex vector and λ a complex number. Write v̄ and λ̄ for their complex conjugates. To isolate λ, we first note that
v̄ᵗ(Av) = v̄ᵗ(λv) = λ(v̄ᵗv).
Since A is real, we also know that Av̄ = λ̄v̄, and this, together with A = Aᵗ, gives
λ(v̄ᵗv) = v̄ᵗ(Av) = (Aᵗv̄)ᵗv = (Av̄)ᵗv = (λ̄v̄)ᵗv = λ̄(v̄ᵗv).
Since v ≠ 0, v̄ᵗv = Σ|vi |² > 0, and hence λ = λ̄. Thus λ is real.
In many physical problems, a matrix of interest will be real and symmetric. If the
eigenvalues are to represent physical quantities of interest, then they need to be real
numbers.
We have seen some conditions for a square matrix to be similar to a diagonal matrix.
Notice that a similarity transformation is a change of basis on a matrix representation.
So we can now discuss the choice of basis used to build a matrix representation, and
decide if some bases are better than others.
2.6.1 Diagonalization of Symmetric Matrices
To build the orthogonal matrix Q we need orthonormal sets of eigenvectors, so we recall the Gram-Schmidt orthogonalization process. Given a basis {v 1 , v 2 , ..., v n } of an inner product space, set
w1 = v 1 ,
w2 = v 2 − (⟨v 2 , w1 ⟩/⟨w1 , w1 ⟩) w1 ,
w3 = v 3 − (⟨v 3 , w1 ⟩/⟨w1 , w1 ⟩) w1 − (⟨v 3 , w2 ⟩/⟨w2 , w2 ⟩) w2 ,
...
wn = v n − (⟨v n , w1 ⟩/⟨w1 , w1 ⟩) w1 − (⟨v n , w2 ⟩/⟨w2 , w2 ⟩) w2 − ... − (⟨v n , wn−1 ⟩/⟨wn−1 , wn−1 ⟩) wn−1 .
Then {w1 , w2 , ..., wn } is an orthogonal set, and the normalized vectors ui = wi /∥wi ∥ form an orthonormal set. A short computational sketch follows.
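The following sketch transcribes the process directly (Python with NumPy, assumed only for this illustration, with the standard dot product on Rⁿ):

    import numpy as np

    def gram_schmidt(vectors):
        ws = []
        for v in vectors:
            w = v.astype(float)
            for u in ws:                        # subtract the projections onto
                w = w - (v @ u) / (u @ u) * u   # the earlier w_1, ..., w_{k-1}
            ws.append(w)
        return [w / np.linalg.norm(w) for w in ws]   # normalize: u_i = w_i/||w_i||

    u1, u2, u3 = gram_schmidt([np.array([1.0, 2.0, -1.0]),
                               np.array([0.0, 1.0, -1.0]),
                               np.array([3.0, -7.0, 1.0])])
    # u2 = (-1/sqrt(2), 0, -1/sqrt(2)) and u3 = (1/sqrt(3))(1, -1, -1),
    # matching Example 2.13 below.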
Example 2.13 Use the Gram-Schmidt process to construct an orthonormal basis from
v 1 = (1, 2, −1)ᵗ , v 2 = (0, 1, −1)ᵗ , v 3 = (3, −7, 1)ᵗ .
Solution. Take w1 = v 1 = (1, 2, −1)ᵗ , so that u1 = w1 /∥w1 ∥ = (1/√6)(1, 2, −1)ᵗ . Next, ⟨v 2 , w1 ⟩ = 3 and ⟨w1 , w1 ⟩ = 6, so w2 = v 2 − (1/2)w1 = (−1/2, 0, −1/2)ᵗ . Normalizing, we have
u2 = w2 /√⟨w2 , w2 ⟩ = √2 (−1/2, 0, −1/2)ᵗ = (−1/√2, 0, −1/√2)ᵗ .
Finally, the third vector is found from
w3 = v 3 − (⟨w1 , v 3 ⟩/⟨w1 , w1 ⟩) w1 − (⟨w2 , v 3 ⟩/⟨w2 , w2 ⟩) w2 .
But ⟨w1 , v 3 ⟩ = −12 and ⟨w2 , v 3 ⟩ = −2, so
w3 = (3, −7, 1)ᵗ + 2 (1, 2, −1)ᵗ + 4 (−1/2, 0, −1/2)ᵗ = (3, −3, −3)ᵗ .
Normalizing, ⟨w3 , w3 ⟩ = 27, and so
u3 = w3 /∥w3 ∥ = (1/√27)(3, −3, −3)ᵗ = (1/√3)(1, −1, −1)ᵗ .
We now show that every symmetric matrix can be diagonalized by an orthogonal matrix.
We demonstrate this by first stating the following theorem.
Theorem 2.18 (Schur’s Theorem) Let A be an n × n matrix, where A has only real
eigenvalues. Then there is an n × n orthogonal matrix Q such that Qᵗ AQ = T , where T is an n × n
upper triangular matrix.
Theorem 2.19 A real n × n matrix A is orthogonally diagonalizable if and only if A is symmetric.
Proof. (a). Suppose A is symmetric. Then A has only real eigenvalues. Thus, by Schur's Theorem, there exists
an orthogonal matrix Q such that Qt AQ = M , where M is an upper triangular matrix.
Using the transpose operation on the above equality and also using the fact that At = A,
we obtain
M ᵗ = (Qᵗ AQ)ᵗ = Qᵗ Aᵗ Q = Qᵗ AQ = M.
Thus M is both upper triangular and symmetric, and hence M is a diagonal matrix; that is, A is orthogonally diagonalizable.
(b). Conversely, suppose that Qᵗ AQ = D for some orthogonal matrix Q and diagonal matrix D. Since D is diagonal, we know that Dᵗ = D. Thus, using the transpose operation on this equality, we obtain
we obtain
Qt AQ = D = Dt = (Qt AQ)t = Qt At Q.
or
(QQᵗ)A(QQᵗ) = (QQᵗ)Aᵗ(QQᵗ), that is, A = Aᵗ. Thus A is symmetric.
Remark 2.6
Theorem 2.19 above states that every real symmetric matrix A is orthogonally diag-
onalizable. That is, there exists an orthogonal matrix Q such that Qt AQ = D, where
D is diagonal. The eigenvalues of A are the diagonal entries of D and eigenvectors of A
can be chosen as the columns of Q. Since the columns of Q form an orthonormal set,
we have the following result.
Theorem 2.20 Suppose A is a symmetric matrix and x and y are two distinct eigen-
vectors of A corresponding to distinct eigenvalues. Then x and y are orthogonal vectors.
Proof. Suppose Ax = λx and Ay = ρy, where λ ≠ ρ. Then
⟨x, y⟩ = (1/(λ − ρ)) (λ − ρ)⟨x, y⟩
= (1/(λ − ρ)) (λ⟨x, y⟩ − ρ⟨x, y⟩)
= (1/(λ − ρ)) (⟨λx, y⟩ − ⟨x, ρy⟩)
= (1/(λ − ρ)) (⟨Ax, y⟩ − ⟨x, Ay⟩)
= (1/(λ − ρ)) (⟨Ax, y⟩ − ⟨Ax, y⟩)   (since A is symmetric, ⟨x, Ay⟩ = ⟨Ax, y⟩)
= (1/(λ − ρ)) (0)
= 0.
Thus x is orthogonal to y.
Corollary 2.21 Let A be an n × n symmetric matrix. It is possible to choose eigenvec-
tors v 1 , v 2 , ..., v n for A such that {v 1 , v 2 , ..., v n } is an orthonormal basis for Rn .
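In floating point, this orthonormal eigenbasis is exactly what numpy.linalg.eigh computes for a symmetric matrix. A hedged sketch (NumPy assumed only for this illustration), using the matrix of Solved Exercise 14 below:

    import numpy as np

    A = np.array([[0.0, 2.0, 0.0],
                  [2.0, 0.0, 2.0],
                  [0.0, 2.0, 0.0]])

    vals, Q = np.linalg.eigh(A)                     # Q has orthonormal columns
    assert np.allclose(Q.T @ Q, np.eye(3))
    assert np.allclose(Q.T @ A @ Q, np.diag(vals))  # Q^t A Q is diagonal
    print(vals)   # approximately [-2*sqrt(2), 0, 2*sqrt(2)]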
2.7 Solved Exercises
1. Let V = P2 and T (x) = T (x(t)) = (1 + t²)x″(t) + x′(t) + x(t), where ′ = d/dt and P2 is the vector space of polynomials of degree less than or equal to 2. Find the eigenvalues and eigenvectors of T .
Solution. One must verify that T : V −→ V , and this can be done in the process
of the computations. Pick an ordered basis {1, t, t2 } for V and compute each T (ti ) to
find the matrix A of T . If the image under T of each basis member lies in V , then so
does the image T (x) for every x ∈ V , by the linearity of T . Hence the construction
of A by expressing each T (ti ) as a linear combination of {1, t, t2 } can succeed only if
T : V −→ V . From T (1) = 1, T (t) = 1 + t and T (t2 ) = 2(1 + t2 ) + 2t + t2 , we have
A = [T ] = ( 1 1 2; 0 1 2; 0 0 3 ).
Hence χA (λ) = (λ − 1)²(λ − 3) is the characteristic polynomial of A. Thus the eigenvalues of T are 1 and
3, with algebraic multiplicities 2 and 1, respectively.
• For λ1 = 1: (A − I) after row-reduction gives ( 0 1 0; 0 0 1; 0 0 0 ). A basis for the eigenspace is { (1, 0, 0)ᵗ }.
• For λ2 = 3: (A − 3I) after row-reduction yields ( −2 0 3; 0 1 −1; 0 0 0 ), which gives { (1.5, 1, 1)ᵗ } as a basis.
Returning to P2, the eigenvalue-eigenspace pairs for T are λ1 = 1, E1 = span{e1 }, with e1 = 1, and λ2 = 3, E2 = span{e2 }, with e2 = 1.5 + t + t². Since all the eigenvectors have been found and there are only two independent ones, there is no basis of eigenvectors for T .
2. Let A = ( 3 1 1; 1 3 1; 1 1 3 ).
3. Verify the Cayley-Hamilton Theorem for the matrix A = ( 5 2; 9 2 ).
Solution. The characteristic equation of A is χA (λ) = λ² − 7λ − 8 = 0. The Cayley-Hamilton Theorem tells us that A² − 7A − 8I = 0. Note that
A² = ( 43 14; 63 22 ), 7A = ( 35 14; 63 14 ), 8I = ( 8 0; 0 8 ).
Putting these together, we obtain
A² − 7A − 8I = ( 43 14; 63 22 ) − ( 35 14; 63 14 ) − ( 8 0; 0 8 ) = ( 0 0; 0 0 ).
4. Suppose that λ and ρ are two different eigenvalues of a square matrix A. Prove
that the intersection of the eigenspaces for these two eigenvalues is trivial. That is,
EA (λ) ∩ EA (ρ) = {0}.
Solution.
It suffices to show that the two sets are equal. First, note that {0} ⊆ EA (λ) ∩ EA (ρ).
Choose x ∈ {0}. Then x = 0. Eigenspaces are subspaces, so both EA (λ) and EA (ρ)
contain the zero vector, and therefore x ∈ EA (λ) ∩ EA (ρ). That is, {0} ⊆ EA (λ) ∩ EA (ρ).
To show that EA (λ)∩EA (ρ) ⊆ {0}, suppose x ∈ EA (λ)∩EA (ρ). Then x is an eigenvector
of A for both λ and ρ and so
x = 1x
= (1/(λ − ρ)) (λ − ρ)x   (λ ≠ ρ, so λ − ρ ≠ 0)
= (1/(λ − ρ)) (λx − ρx)
= (1/(λ − ρ)) (Ax − Ax)
= (1/(λ − ρ)) (0)
= 0.
Thus EA (λ) ∩ EA (ρ) ⊆ {0}, and hence EA (λ) ∩ EA (ρ) = {0}.
5. Suppose that λ is an eigenvalue of a square matrix A and that q(t) is a polynomial. Prove that q(λ) is an eigenvalue of the matrix q(A).
Solution.
Let x ≠ 0 be an eigenvector of A corresponding to λ, and write q(t) = a0 + a1 t + a2 t² + ... + am tᵐ. Then Aᵏx = λᵏx for every k ≥ 0, and so
q(A)x = a0 x + a1 Ax + ... + am Aᵐ x = a0 x + a1 λx + ... + am λᵐ x = q(λ)x.
Thus q(λ) is an eigenvalue of q(A).
6. Suppose that A is a square matrix. Prove that the constant term of the characteristic
polynomial of A is equal to det(A).
Solution. Suppose the characteristic polynomial is χA (λ) = a0 + a1 λ + ... + an λⁿ. Then, with the convention χA (λ) = det(A − λI), the constant term is a0 = χA (0) = det(A − 0 · I) = det(A). (With the convention χA (λ) = det(λI − A), the constant term is det(−A) = (−1)ⁿ det(A).)
7. Suppose A is a square matrix. Prove that a single vector may not be an eigenvector
of A corresponding to two different eigenvalues.
Solution. Suppose x ̸= 0 is an eigenvector of A corresponding to two eigenvalues λ and
ρ, where λ ̸= ρ. Then λ − ρ ̸= 0, and so we also have
0 = Ax − Ax = λx − ρx = (λ − ρ)x.
Since λ − ρ ≠ 0, this forces x = 0, which contradicts x ≠ 0. Thus a single vector cannot be an eigenvector for two different eigenvalues.
8. Suppose A and B are similar matrices. Prove that A3 and B 3 are similar matrices.
Generalize.
Solution
We know that there is a non-singular matrix P such that A = P −1 BP . Then
A3 = (P −1 BP )3
= (P −1 BP )(P −1 BP )(P −1 BP )
= P −1 BBBP
= P −1 B 3 P
More generally, if A is similar to B and k is any non-negative integer, then Ak is similar
to B k . This can be proved by mathematical induction on k.
10. Find all the eigenvalues and the corresponding eigenvectors of the matrix A = ( −2 0 3; 0 4 0; −6 0 7 ). Does there exist a vector v ∈ R3×1 such that Av = 2v?
Solution The characteristic polynomial is χA (λ) = −(λ − 1)(λ − 4)² and the eigenvalues of A are λ = 1 and λ = 4 (of algebraic multiplicity 2).
• If λ = 1, (A − I) = ( −3 0 3; 0 3 0; −6 0 6 ) ∼ ( 1 0 −1; 0 1 0; 0 0 0 ). The rank is 2, so the eigenspace has dimension 3 − 2 = 1, generated by the eigenvector v 1 = (1, 0, 1)ᵗ .
• If λ = 4, (A − 4I) = ( −6 0 3; 0 0 0; −6 0 3 ) ∼ ( 2 0 −1; 0 0 0; 0 0 0 ). The rank is 1, so the eigenspace has dimension 3 − 1 = 2, generated by the two linearly independent eigenvectors v 2 = (1, 0, 2)ᵗ and v 3 = (0, 1, 0)ᵗ .
The answer is "no": if such a v existed, then λ = 2 would be an eigenvalue of A, which it is not.
11. Let f : R³ −→ R³ be the linear map which in the usual basis of R³ has the matrix A = ( 2 0 −3; 0 5 0; 4 0 9 ).
(a). Find all eigenvalues and the corresponding eigenvectors of f .
(b). Is there a basis for R3 such that the matrix of f with respect to this basis is a
diagonal matrix?
Solution
(a). It is easy to show that χA (λ) = −(λ − 5)2 (λ − 6) and hence the eigenvalues of A
are λ = 5 (with algebraic multiplicity 2) and λ = 6.
• If λ = 5, there are two linearly independent eigenvectors v 1 = (1, 0, −1)ᵗ and v 2 = (0, 1, 0)ᵗ .
• If λ = 6, we have v 3 = (3, 0, −4)ᵗ as its corresponding eigenvector.
(b). Since we have 3 (= dim R³) linearly independent eigenvectors, they must form a basis for R³, and the matrix of f with respect to {v 1 , v 2 , v 3 } is the diagonal matrix
D = ( 5 0 0; 0 5 0; 0 0 6 ).
12. Prove that the matrices A = ( 2 2 1; 1 3 1; 1 2 2 ) and B = ( 2 1 −1; 0 2 −1; −3 −2 3 ) have the same characteristic polynomial, and yet they are not similar.
Solution A simple computation shows that χA (λ) = −(λ − 1)2 (λ − 5) and χB (λ) =
−(λ − 1)2 (λ − 5). That is χA (λ) = χB (λ).
To show that they are not similar, we show that λ = 1, which has algebraic multiplicity 2, has different geometric multiplicities for A and B. For λ = 1 we reduce:
A − I = ( 1 2 1; 1 2 1; 1 2 1 ) ∼ ( 1 2 1; 0 0 0; 0 0 0 )  and
B − I = ( 1 1 −1; 0 1 −1; −3 −2 2 ) ∼ ( 1 0 0; 0 1 −1; 0 0 0 ).
Since A − I has rank 1, and B − I has rank 2, the matrices A and B cannot be similar.
13. Given that A = ( 1 0 0; 1 1 1; 1 0 2 ),
(a). find all the eigenvalues and the corresponding eigenvectors of A.
(b). find a matrix P such that P −1 AP = D.
Solution
A simple computation gives χA (λ) = (1 − λ)²(2 − λ); the eigenvalues are λ = 1 (algebraic multiplicity 2) and λ = 2, with corresponding eigenvectors
v 1 = (1, 0, −1)ᵗ and v 2 = (0, 1, 0)ᵗ for λ = 1, and v 3 = (0, 1, 1)ᵗ for λ = 2.
It is immediately seen that
P = [v 1 |v 2 |v 3 ] = ( 1 0 0; 0 1 1; −1 0 1 )  and  D = ( 1 0 0; 0 1 0; 0 0 2 ).
14. Diagonalize the matrix A = ( 0 2 0; 2 0 2; 0 2 0 ).
Solution Notice that since A is symmetric, the eigenvectors of A form a basis for R³. Solving the characteristic polynomial gives {0, −2√2, 2√2} as the eigenvalues of A, with corresponding normalized eigenvectors
v 1 = (1/√2)(1, 0, −1)ᵗ , v 2 = (1/2)(1, −√2, 1)ᵗ , v 3 = (1/2)(1, √2, 1)ᵗ .
The matrix that orthogonally diagonalizes A is
Q = ( 1/√2  1/2  1/2; 0  −√2/2  √2/2; −1/√2  1/2  1/2 ).
Note that Qᵗ AQ = diag(0, −2√2, 2√2).
15. Find two matrices that are of the same size and have the same determinant but are not similar. (Hint: keep this simple. Look at 2 × 2 diagonal matrices.)
Solution Let A = ( 1 0; 0 1 ) and B = ( 2 0; 0 1/2 ).
Clearly det(A) = det(B) = 1. But A and B are not similar. For suppose they are similar. Then B = P −1 AP = P −1 P = I, which is false.
2.8 Exercises
1. Given the matrix A = ( 4 −2 4; −2 7 2; 4 2 4 ) and v = (2, 1, −2)ᵗ ,
(a). Prove that v is an eigenvector of A.
(b). Solve the equation v T x = 0 and prove that every x ∈ R3 which satisfies the equation
is an eigenvector of A.
(c). Find an orthogonal matrix Q and a diagonal matrix D such that QT AQ = D.
2.(a). Find the eigenvalues and eigenvectors of A = ( 2 −1 −1; −1 2 −1; −2 2 −1 ).
(b). Determine the algebraic and geometric multiplicities of each eigenvalue of A.
3. Consider the matrix A = ( 2 −2 3; 1 1 1; 1 3 −1 ).
(a). Find the characteristic polynomial of A.
(b). Find all the eigenvalues of A and their corresponding eigenvectors.
(c). Show by a direct calculation that A obeys the Cayley-Hamilton Theorem.
(d). Compute A4 using the Cayley-Hamilton Theorem.
(e). Compute A−1 .
(f). Show that the eigenvectors of A are linearly independent.
4. Suppose A is the 2 × 2 matrix A = ( −5 8; −4 7 ).
(a). Find the eigenvalues of A.
(b). Find the eigenspaces of A.
(c). For the polynomial p(x) = 3x2 − x + 2, compute p(A).
(d). Find a diagonal matrix similar to A.
5. Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for the matrix B = ( −1 2; −6 6 ).
8. Suppose A and B are similar matrices with A non-singular. Prove that B is non-
singular and that A−1 is similar to B −1 .
9. Prove that the eigenvectors of the matrix A = ( 1 α; 0 1 ), where α ≠ 0, generate a one-dimensional space. Find a basis for this space.
10. Prove that the eigenvectors of the matrix A = ( 2 2 0; 0 2 0; 0 0 2 ) generate a 2-dimensional space, and find a basis for this space.
11. Find the eigenvalues and eigenvectors of each of the following matrices:
(a). A = ( 1 1 1; 0 1 1; 0 0 1 )
(b). B = ( 1 1 0; 0 1 1; 0 0 1 )
(c). ( 0 2; −2 0 )
(Hint: In (c) expect complex eigenvalues and eigenvectors.)
12. Diagonalize the matrices
(a). A = ( −1 3 −1; −3 5 −1; −3 3 1 )
(b). ( 1 1 1 1; 1 1 −1 −1; 1 −1 1 −1; 1 −1 −1 1 ).
13. Find the eigenvalues and eigenvectors of the linear operator d/dt in the space of polynomials Pn of degree at most n with real coefficients.
14. Let A = ( 0 1 −3; −1 −3 3; −1 −1 0 ).
(a). Find the eigenvalues and corresponding eigenvectors of A.
(b). Determine the algebraic and geometric multiplicities of each eigenvalue of A.
(c). Explain why A is not similar to a diagonal matrix.
15. Show that λ = 2 is an eigenvalue of A = ( 3 2 1; 0 2 0; −2 −3 0 ).
Find αA (λ) and GA (λ). Can you conclude anything about the diagonalizability of A
from these results?
16. Compute limn→∞ Aⁿ for A = ( 7/5 1/5; −1 1/2 ).
17.(a). Find a matrix with eigenvalues 2 and 5.
(b). Prove that a square matrix is singular if and only if 0 is one of its eigenvalues.
(c). Suppose v is an eigenvector of an n × n matrix A associated with the eigenvalue λ.
Suppose P is a non-singular n × n matrix. Show that P −1 v is an eigenvector of P −1 AP
associated with the eigenvalue λ.
(d). Suppose that A and B are n × n matrices. Suppose v is an eigenvector of A asso-
ciated with the eigenvalue λ. Suppose v is also an eigenvector of B associated with the
eigenvalue ρ. Show that v is an eigenvector of A + B associated with λ + ρ.
19. Consider the linear operator T : C(R) −→ C(R) defined by T (f (x)) = f (x − 1),
where C(R) denotes the set of all continuous functions from R to R.
(a). Show that the function f (x) = ex is an eigenvector of T associated with the
eigenvalue e−1 .
(b). Show that the function f (x) = e2x is an eigenvector of T . What is the associated
eigenvalue?
(c). Show that any negative number is an eigenvalue of T .
(d). Show that 0 is not an eigenvalue of T .
Chapter 3
MINIMAL POLYNOMIAL OF A
SQUARE MATRIX
3.1 Introduction
We continue our study with a particular polynomial extracted from the characteristic polynomial of a square matrix: the minimal polynomial.
Objectives
At the end of this lecture, you should be able to:
• Determine whether two square matrices of the same size and in the same field
are similar.
Recall that a polynomial is said to be monic if the coefficient of its highest-degree term equals 1.
Let A be a square matrix. Let J(A) denote the collection of all polynomials f (t) for
which A is a root, i.e., for which f (A) = 0. The set J(A) is not empty, since the Cayley-
Hamilton Theorem tells us that the characteristic polynomial χA (t) of A belongs to J(A).
Let mA (t) denote the monic polynomial of lowest degree in J(A). Such a polynomial
exists and is unique. We call mA (t) the minimal polynomial of the matrix A. That is,
the minimal polynomial mA (t) of A is a monic polynomial of smallest degree for which
mA (A) = 0. The Cayley-Hamilton Theorem guarantees the existence of a minimal polynomial. Clearly, its degree is less than or equal to the degree of the characteristic polynomial.
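Numerically, the definition suggests a direct search: find the smallest d for which Aᵈ is a linear combination of I, A, ..., A^(d−1). A hedged sketch in Python with NumPy (assumed only for this illustration):

    import numpy as np

    def minimal_polynomial(A, tol=1e-8):
        n = A.shape[0]
        powers = [np.eye(n)]
        for d in range(1, n + 1):
            powers.append(powers[-1] @ A)
            B = np.column_stack([p.ravel() for p in powers[:-1]])
            c, *_ = np.linalg.lstsq(B, powers[-1].ravel(), rcond=None)
            if np.linalg.norm(B @ c - powers[-1].ravel()) < tol:
                # m(t) = t^d - c_{d-1} t^{d-1} - ... - c_0, highest degree first
                return np.concatenate(([1.0], -c[::-1]))
        raise RuntimeError("unreachable: Cayley-Hamilton bounds the degree by n")

    print(minimal_polynomial(np.diag([2.0, 2.0, 7.0])))
    # [ 1. -9. 14.], i.e. m(t) = t^2 - 9t + 14 = (t - 2)(t - 7)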
Example 3.1 f (t) = t2 + 3t + 1 and g(t) = t10 − 5t6 + 7t2 + 5 are monic polynomials.
Theorem 3.1 The minimal polynomial of a square matrix A is unique.
Proof Let p′ (x) be another monic polynomial that has the same (least) degree as p(x) = mA (x) and such that p′ (A) = 0. Then p(x) − p′ (x) is a polynomial annihilating A and having degree strictly less than deg p, since the leading terms cancel. By the minimality of deg p, p(x) − p′ (x) = 0, and it follows that p′ (x) = p(x).
By the division algorithm, given polynomials q(x) and p(x) ≠ 0, there exist polynomials s(x) and r(x) such that q(x) = s(x)·p(x) + r(x), where r(x) = 0 or deg r(x) < deg p(x). The polynomial r(x) in this case is called the remainder. We say that a
polynomial p(x) divides another polynomial q(x) if there exists a polynomial s(x) such
that q(x) = s(x).p(x). In other words, p(x) divides q(x) if we can take the remainder
r(x) to be zero.
Example 3.2 Let p(x) = x − 2 and q(x) = x² − 4x + 4. Clearly p(x) divides q(x), since q(x) = (x − 2)(x − 2).
Theorem 3.2 The minimal polynomial mA (t) of a matrix A divides every polynomial
that has A as a root. In particular, mA (t) divides the characteristic polynomial χA (t) of
A. Moreover, the minimal polynomial mA (t) of A has A as a root. That is, mA (A) = 0.
Proof. Suppose that f (t) is a polynomial for which f (A) = 0. By the division al-
gorithm, there exist polynomials q(t) and r(t) for which f (t) = mA (t)q(t) + r(t) and
r(t) = 0 or deg r(t) < deg mA (t). Substituting t = A in this equation and using that
f (A) = 0 and mA (A) = 0, we obtain r(A) = 0. If r(t) ̸= 0, then r(t) is a polynomial of
degree less than mA (t) that has A as a root. This contradicts the definition of the mini-
mal polynomial. Thus r(t) = 0, and so f (t) = mA (t)q(t), that is, mA (t) divides f (t).
There is an even stronger relationship between the characteristic polynomial and the
minimal polynomials of A.
Theorem 3.3 The characteristic polynomial χA (t) and the minimal polynomial mA (t)
of A have the same irreducible factors.
Proof. Suppose that f (t) is an irreducible polynomial. If f (t) divides mA (t), then f (t)
( )
divides χA (t) (since mA (t) divides χA (t)). On the other hand, suppose f (t) divides χA (t). One can show that χA (t) divides [mA (t)]ⁿ; hence f (t) also divides [mA (t)]ⁿ. But f (t) is irreducible; hence f (t) also divides mA (t).
Thus mA (t) and χA (t) have the same irreducible factors.
Remark 3.1
Note that an irreducible factor is a factor that cannot be expressed as the product of
two or more non-trivial factors. Irreducibility depends on the field K: a polynomial may
be irreducible over some fields but reducible over others. For instance, p(x) = x2 + 1 is
irreducible over R but reducible over C. The polynomial q(x) = x² − 2 is irreducible over Q but reducible over R. Clearly, all polynomials of degree 1 are irreducible.
Note that Theorem 3.3 does not say that mA (t) = χA (t), only that any irreducible
factor of one must divide the other. In particular, since a linear factor is irreducible,
mA (t) and χA (t) have the same linear factors. Hence they have the same roots. Thus
we have the following theorem.
Theorem 3.4 A scalar λ is an eigenvalue of a square matrix A if and only if λ is a
root of the minimal polynomial of A.
Thus
f (λ) = mA (λ) = (λ − 1)(λ − 3) = λ2 − 4λ + 3
3.2 Minimal polynomial of a Block Triangular Matrix
We investigate the case of a diagonal block matrix. The case of a block triangular matrix
is similar.
Theorem 3.5 Suppose that M is a block diagonal matrix with square diagonal blocks A1 , A2 , ..., Ar . Then the minimal polynomial of M is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks A1 , A2 , ..., Ar .
Proof. We prove the theorem for the case r = 2; the general result follows easily by induction. Suppose M = ( A 0; 0 B ), where A and B are square matrices. We need
to show that the minimal polynomial mM (t) of M is the least common multiple of the
minimal polynomials mA (t) and mB (t) of A and B, respectively. Since mM (t) is the
minimal polynomial of M ,
mM (M ) = ( mM (A) 0; 0 mM (B) ) = 0,
and hence mM (A) = 0 and mM (B) = 0; that is, A and B are roots of mM (t). Since mA (t) is the minimal polynomial of A, it follows from Theorem 3.2 that mA (t) divides mM (t). Similarly, mB (t) divides mM (t). Thus mM (t) is a multiple of
mA (t) and mB (t). Now let f (t) be another common multiple of mA (t) and mB (t). Then f (A) = 0 and f (B) = 0, and so
f (M ) = ( f (A) 0; 0 f (B) ) = ( 0 0; 0 0 ).
But mM (t) is the minimal polynomial of M ; hence mM (t) divides f (t). Thus mM (t) is
the least common multiple of mA (t) and mB (t).
Example 3.5 Find the characteristic polynomial and the minimal polynomial of the block diagonal matrix
A = ( 2 5 0 0 0; 0 2 0 0 0; 0 0 4 2 0; 0 0 3 5 0; 0 0 0 0 7 ).
Solution. Note that A = diag(A1 , A2 , A3 ), where A1 = ( 2 5; 0 2 ), A2 = ( 4 2; 3 5 ), A3 = [7]. The characteristic polynomial χA (λ) of A is the product of the characteristic polynomials χA1 (λ), χA2 (λ) and χA3 (λ) of A1 , A2 and A3 , respectively. One can show that χA1 (λ) = (λ − 2)², χA2 (λ) = (λ − 2)(λ − 7) and χA3 (λ) = (λ − 7). Thus χA (λ) = (λ − 2)³(λ − 7)² is the characteristic polynomial of A.
It is easy to check that the minimal polynomials mA1 (λ), mA2 (λ) and mA3 (λ) of A1 , A2 and A3 , respectively, are equal to their characteristic polynomials. That is, mA1 (λ) = (λ − 2)², mA2 (λ) = (λ − 2)(λ − 7) and mA3 (λ) = (λ − 7). But mA (λ) is equal to the least common multiple of these three polynomials, i.e. mA (λ) = LCM{ mA1 (λ), mA2 (λ), mA3 (λ) }. Thus mA (λ) = (λ − 2)²(λ − 7).
Remark 3.2
Note that Theorem 3.6 says that A is diagonalizable with eigenvalues λ1 , λ2 , ..., λk if and only if its minimal polynomial has the form
mA (λ) = (λ − λ1 )^α1 (λ − λ2 )^α2 ... (λ − λk )^αk with α1 = α2 = ... = αk = 1,
i.e. mA (λ) is a product of distinct linear factors.
In Chapter 2 we saw that similar matrices have the same characteristic polynomial and that the converse is not true: we gave examples of matrices with the same characteristic polynomial which are not similar. In this section we prove that similar matrices also share the same minimal polynomial. Note, on the other hand, that two matrices with the same characteristic polynomial need not have the same minimal polynomial.
Theorem 3.7 Similar matrices have the same minimal polynomials.
Proof Suppose A and B are similar, say B = P −1 AP for some non-singular matrix P . For any polynomial f (x) we have f (B) = f (P −1 AP ) = P −1 f (A)P . In particular, mA (B) = P −1 mA (A)P = P −1 0 P = 0, so B is a root of mA (x), and hence, by Theorem 3.2, mB (x) divides mA (x). Reversing the roles of A and B (using A = P BP −1 ), we see that mA (x) divides mB (x). Since the two polynomials are monic and divide each other, mA (x) = mB (x).
Remark 3.3
The converse of Theorem 3.7 is not true in general: two matrices with the same minimal polynomial need not be similar (see Remark 3.4 below). Note also that two matrices having the same characteristic polynomial does not generally imply that they have the same minimal polynomial.
Solution
(a). By inspection, χA (λ) = χB (λ) = (λ − 2)3 . A simple computation shows that
mA (λ) = (λ − 2)³ and mB (λ) = (λ − 2)².
(b). Clearly A and B are not diagonalizable since their minimal polynomials are not
products of distinct linear factors.
(c). A and B are not similar. Although they are both non-diagonalizable, they do not
have the same minimal polynomials.
Remark 3.4
Note that two matrices of the same size can have the same minimal and characteristic
polynomials without being similar.
3.5 Solved Exercises
2. Let A = ( 0 0 1; 0 1 0; 0 0 1 ). (a). Find the characteristic polynomial of A. (b). Find the minimal polynomial of A. (c). Is A diagonalizable?
Solution
(a). A simple computation gives the characteristic polynomial χA (λ) = λ(λ − 1)2 .
(b). We extract the minimal polynomial of A from the characteristic polynomial: it
must divide the characteristic polynomial, is of least degree, it is monic, has same lin-
ear(irreducible) factors and hence same roots and has A as a root. Thus the minimal
polynomial mA (λ) is either f (λ) = λ(λ − 1)² or g(λ) = λ(λ − 1). We only check g, since f , being the characteristic polynomial, satisfies all the above conditions.
A(A − I) = ( 0 0 1; 0 1 0; 0 0 1 ) ( −1 0 1; 0 0 0; 0 0 0 ) = ( 0 0 0; 0 0 0; 0 0 0 ).
Thus g(λ) = mA (λ) = λ(λ − 1) is the minimal polynomial of A.
(c). Since the minimal polynomial of A is a product of distinct linear (irreducible) factors, A is diagonalizable.
3. Let A = ( 2 1 0 0 0; 0 2 0 0 0; 0 0 2 0 0; 0 0 0 −1 1; 0 0 0 0 −1 ).
Note that the minimal polynomial is unique and does not depend on how we choose the diagonal blocks. In Problem (3), there are several other possible choices of the diagonal blocks.
3.6 Exercises
1. Let A = ( −1 −2 −2; 1 2 1; −1 −1 0 ). (a). Find the characteristic polynomial of A.
(b). Find the minimal polynomial of A.
2. Let T : P2 −→ P2 be defined by T p(x) = (1 − x²)p″(x) − xp′(x) + 2p(x), where P2 denotes the vector space of polynomials of degree less than or equal to 2, and p′(x) and p″(x) denote the first and second derivatives of p(x), respectively.
(a). Find a matrix representation A for T .
(b). Find the characteristic polynomial and minimal polynomial of A.
(c). Is A diagonalizable? Give a reason for your answer.
3. Let B = ( 1 1 1 1; 1 1 1 1; 1 1 1 1; 1 1 1 1 ).
(a). Find the characteristic polynomial of B.
(b). Find the minimal polynomial of B.
(c). Is B diagonalizable? Give a reason for your answer.
4. Let M = ( 1 2 0 0; 0 −1 0 0; 0 0 1 0; 0 0 −1 2 ).
(a). Find the characteristic polynomial of M .
(b). Find the minimal polynomial of M .
(c). Is M diagonalizable? Give a reason for your answer.
5. Let N = ( 2 0 1 0 0; 0 2 3 0 0; 0 0 1 0 0; 0 0 0 1 1; 0 0 0 1 1 ).
(a). Find the characteristic polynomial of N .
(b). Find the minimal polynomial of N .
(c). Is N diagonalizable? Give a reason for your answer.
9. Explain why similar matrices have the same minimal and characteristic polynomials.
Chapter 4
LINEAR FUNCTIONALS
4.1 Introduction
In this chapter we will study linear mappings from a vector space V into its field of
scalars K. Here, we view K as a vector space over itself.
Objectives
At the end of this lecture, you should be able to:
Remark 4.1
The following are some examples of linear functionals:
1. Let T : Rⁿ −→ R be defined by T x = x1 + x2 + ... + xn , where x = (x1 , x2 , ..., xn ). Then T is a linear functional on V = Rⁿ.
2. Let T : C([0, 1]) −→ R be defined by T x = ∫₀¹ x(t)e⁻ᵗ dt, where V = C([0, 1]) denotes
the vector space of continuous functions on the closed interval [0, 1]. Then T is a linear
functional on V .
3. Let T : C([0, 1]) −→ R be defined by T x = x(t0 ), where V = C([0, 1]) denotes the
vector space of continuous functions on the closed interval [0, 1] and t0 is a fixed number
in [0, 1]. This is called the evaluation map. It is easy to check that T is a linear
functional on V = C([0, 1]).
The set of linear functionals on a vector space V over a field K is also a vector space
over K, with addition and scalar multiplication defined by
(ϕ + ψ)(v) = ϕ(v) + ψ(v)
and
(αϕ)(v) = αϕ(v),
where ϕ and ψ are linear functionals on V and α ∈ K. This space is called the dual
space of V , and is denoted by V ∗ .
4.3 Dual Basis
Suppose V is a vector space of dimension n over K. The dual space V ∗ has dimension
n (since K is of dimension 1 over itself). In fact, each basis of V determines a basis for
V ∗.
Theorem 4.1 Suppose B = {v1 , v2 , ..., vn } is a basis of a vector space V over a scalar
field K. Let ϕ1 , ϕ2 , ..., ϕn ∈ V ∗ be linear functionals as defined by
ϕi (vj ) = δij , where δij = 1 if i = j and δij = 0 if i ≠ j.
Then B ∗ = {ϕ1 , ϕ2 , ..., ϕn } is a basis of V ∗ .
The above basis B ∗ is termed the basis dual to B, or the dual basis. The above formula is a short way of writing the conditions
ϕ1 (v1 ) = 1, ϕ1 (v2 ) = 0, ..., ϕ1 (vn ) = 0,
ϕ2 (v1 ) = 0, ϕ2 (v2 ) = 1, ..., ϕ2 (vn ) = 0,
and so on, ending with ϕn (v1 ) = 0, ..., ϕn (vn ) = 1.
Example 4.1 Consider the basis B = { v1 = (2, 1)ᵗ , v2 = (3, 1)ᵗ } of R². Find the dual basis B ∗ = {ϕ1 , ϕ2 }.
Solution. We seek linear functionals ϕ1 (x, y) = ax + by and ϕ2 (x, y) = cx + dy such that
ϕ1 (v1 ) = 1, ϕ1 (v2 ) = 0,
ϕ2 (v1 ) = 0, ϕ2 (v2 ) = 1.
These four conditions lead to the following two systems of linear equations:
ϕ1 (v1 ) = ϕ1 (2, 1) = 2a + b = 1,
ϕ1 (v2 ) = ϕ1 (3, 1) = 3a + b = 0
and
ϕ2 (v1 ) = ϕ2 (2, 1) = 2c + d = 0,
ϕ2 (v2 ) = ϕ2 (3, 1) = 3c + d = 1
with solutions a = −1, b = 3 and c = 1, d = −2. Hence ϕ1 (x, y) = −x + 3y and
ϕ2 (x, y) = x − 2y form the dual basis. Therefore B ∗ = {−x + 3y, x − 2y}.
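In matrix form the computation is one inversion. If the basis vectors are the columns of a matrix V, the conditions ϕi(vj) = δij say exactly that the coefficient rows of the dual functionals form V⁻¹. A sketch (Python with NumPy, assumed only for illustration):

    import numpy as np

    V = np.array([[2.0, 3.0],
                  [1.0, 1.0]])      # columns: v1 = (2, 1), v2 = (3, 1)

    Phi = np.linalg.inv(V)          # row i holds the coefficients of phi_i
    print(Phi[0])   # [-1.  3.]  ->  phi1(x, y) = -x + 3y
    print(Phi[1])   # [ 1. -2.]  ->  phi2(x, y) =  x - 2y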
Example 4.2 Find the dual basis B ∗ = {ϕ1 , ϕ2 , ϕ3 } of the basis { v 1 = (1, −1, 3)ᵗ , v 2 = (0, 1, −1)ᵗ , v 3 = (0, 3, −2)ᵗ } of R³.
Solution. Write ϕ1 (x, y, z) = a1 x + a2 y + a3 z, ϕ2 (x, y, z) = b1 x + b2 y + b3 z and ϕ3 (x, y, z) = c1 x + c2 y + c3 z, and impose the conditions ϕi (v j ) = δij . Solving the first system of equations gives a1 = 1, a2 = 0, a3 = 0, so that
ϕ1 (x, y, z) = x.
Solving the second system of equations gives b1 = 7, b2 = −2, b3 = −3. Thus
ϕ2 (x, y, z) = 7x − 2y − 3z.
Similarly, solving the third system gives
ϕ3 (x, y, z) = −2x + y + z.
Example 4.3 Let V be the vector space of polynomials f (t) = a + bt of degree ≤ 1, and let ϕ1 and ϕ2 be the linear functionals on V defined by
ϕ1 (f (t)) = ∫₀¹ f (t) dt and ϕ2 (f (t)) = ∫₀² f (t) dt.
Find the basis {v 1 , v 2 } of V that is dual to {ϕ1 , ϕ2 }, i.e. such that
ϕ1 (v 1 ) = 1, ϕ1 (v 2 ) = 0
and
ϕ2 (v 1 ) = 0, ϕ2 (v 2 ) = 1.
Thus, writing v 1 = a + bt and v 2 = c + dt,
ϕ1 (v 1 ) = ∫₀¹ (a + bt) dt = a + ½b = 1,
ϕ2 (v 1 ) = ∫₀² (a + bt) dt = 2a + 2b = 0
and
ϕ1 (v 2 ) = ∫₀¹ (c + dt) dt = c + ½d = 0,
ϕ2 (v 2 ) = ∫₀² (c + dt) dt = 2c + 2d = 1.
Solving each system yields a = 2, b = −2 and c = −½, d = 1. Thus {v 1 = 2 − 2t, v 2 = −½ + t} is the basis of V that is dual to {ϕ1 , ϕ2 }.
4.5 Exercises
{ }
1. Find the dual basis of the basis (1, 0, 0), (0, 1, 0), (0, 0, 1) of R3 .
2. Let V = P2 , the vector space of polynomials over R of degree ≤ 2. Let ϕ1 , ϕ2 , ϕ3 be
the linear functionals on V defined by
ϕ1 (f (t)) = ∫₀¹ f (t) dt, ϕ2 (f (t)) = f ′(1), ϕ3 (f (t)) = f (0).
Here f (t) = a + bt + ct2 ∈ V and f ′ (t) denotes the derivative of f (t). Find the basis
{ } { }
f1 (t), f2 (t), f3 (t) of V that is dual to ϕ1 , ϕ2 , ϕ3 .
4. Find the dual basis of the basis {(1, −2, 3), (1, −1, 1), (2, −4, 7)}.
Chapter 5
BILINEAR AND QUADRATIC FORMS
5.1 Introduction
In this chapter we generalize the notions of linear mappings and linear functionals. We
introduce the notion of a bilinear form, which gives rise to a quadratic form. Quadratic
forms occur frequently in applications of linear algebra to engineering (in design crite-
ria and optimization) and signal processing (as output noise power). They also arise
in physics (as potential and kinetic energy), differential geometry (as normal curvature
of surfaces), economics (as utility functions), and statistics (in confidence ellipsoids).
Some of the mathematical background for such applications flows easily from our work
on symmetric matrices.
Objectives
At the end of this lecture, you should be able to:
• Determine the symmetric matrix associated with a quadratic form and vice
versa.
Definition 5.1 Let V be a vector space of finite dimension over a field K. A bilin-
ear form on V is a mapping f : V × V −→ K such that, for all α, β ∈ K, and all
u1 , u2 , v 1 , v 2 , u, v ∈ V the following two conditions are satisfied:
(1). f (αu1 + βu2 , v) = αf (u1 , v) + βf (u2 , v), i.e. f is "linear in the first variable";
(2). f (u, αv 1 + βv 2 ) = αf (u, v 1 ) + βf (u, v 2 ), i.e. f is "linear in the second variable".
Example 5.1 Let f be the dot product on Rⁿ. For u = (u1 , u2 , ..., un ) and v = (v1 , v2 , ..., vn ), we have
f (u, v) = u · v = u1 v1 + u2 v2 + ... + un vn .
Then f is a bilinear form on Rⁿ.
5.2.1 Bilinear Forms and Matrices
Theorem 5.1 Let f be a bilinear form. Then f is alternating if and only if it is skew-
symmetric.
We now investigate the notions of symmetric bilinear forms and quadratic forms and
their representation by means of symmetric matrices.
Definition 5.3 Let f be a bilinear form on V . Then f is said to be symmetric if, for
every u, v ∈ V , f (u, v) = f (v, u).
Theorem 5.2 f is a symmetric bilinear form if and only if any matrix representation
A of f is a symmetric matrix.
5.2.3 Quadratic Forms
Quadratic forms are heavily used in calculus to check the second order conditions in
optimization problems. They have a particular use in econometrics, as well.
This means that the quadratic polynomial representing q has no ”cross-product” terms.
Indeed, every quadratic form has such a representation.
Remark 5.1
5.2.4 Classification of Real Symmetric Bilinear Forms
Example 5.2 Let f be the dot product on Rⁿ. Clearly f is a symmetric bilinear form on Rⁿ. Note that f is positive definite: for any u = (u1 , u2 , ..., un ) ≠ 0 in Rⁿ,
f (u, u) = u1² + u2² + ... + un² > 0.
Example 5.3 Let u = (x1 , x2 , x3 ), v = (y1 , y2 , y3 ) and
f (u, v) = 3x1 y1 − 2x1 y3 + 5x2 y1 + 7x2 y2 − 8x2 y3 + 4x3 y2 − 6x3 y3 .
Express f in matrix notation.
Solution. Let A = [aij ], where aij is the coefficient of xi yj . Then f (u, v) = Xᵗ AY , where X = (x1 , x2 , x3 )ᵗ , Y = (y1 , y2 , y3 )ᵗ and A = ( 3 0 −2; 5 7 −8; 0 4 −6 ).
Example 5.4 Find the symmetric matrix that corresponds to each of the following
quadratic forms:
(a). q(x, y, z) = 3x2 + 4xy − y 2 + 8xz
(b). q(x, y, z) = 2x2 − 5y 2 − 7z 2
Solution
(a). Let X = (x, y, z)ᵗ . Then q(x, y, z) = Xᵗ AX with
A = ( 3 2 4; 2 −1 0; 4 0 0 ).
(b). Proceeding as in (a), it is clear that
A = ( 2 0 0; 0 −5 0; 0 0 −7 ).
Example 5.5 Find the quadratic form q(X) that corresponds to the following symmetric
matrix A = ( 5 −3; −3 8 ).
Solution Let X = (x, y)ᵗ . Then
q(x, y) = Xᵗ AX = 5x² − 6xy + 8y².
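Evaluating a quadratic form from its symmetric matrix is a single matrix product, as in this small sketch (Python with NumPy, assumed only for illustration):

    import numpy as np

    A = np.array([[5.0, -3.0],
                  [-3.0, 8.0]])

    def q(x):
        return x @ A @ x            # q(X) = X^t A X

    # 5x^2 - 6xy + 8y^2 at (x, y) = (1, 2): 5 - 12 + 32 = 25
    assert np.isclose(q(np.array([1.0, 2.0])), 25.0)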
There are several ways to determine whether or not a real quadratic form is positive
definite.
Recall: Principal minors are the determinants of the principal submatrices A(k) of A. Suppose A = [aij ] is an n × n matrix. Then
A(1) = [a11 ], A(2) = ( a11 a12; a21 a22 ), ..., A(n) = A.
Note that Sylvester’s test is named after the famous English and American mathemati-
cian James Joseph Sylvester (1814-1897).
Theorem 5.4 (Principal Axis Theorem). A quadratic form q is positive definite
if and only if all the eigenvalues of A, the symmetric matrix representing q in some basis of V, are strictly positive.
Example 5.6 Make a change of variable that transforms the quadratic form
q(x, y) = x2 − 8xy − 5y 2
into a quadratic form with no cross-product term.
Solution The matrix of the quadratic form is A = ( 1 −4; −4 −5 ). The first step is to orthogonally diagonalize A. Its eigenvalues turn out to be λ = 3 and λ = −7. The corresponding unit (normalized) eigenvectors are
λ = 3 : v 1 = (2/√5, −1/√5)ᵗ ;  λ = −7 : v 2 = (1/√5, 2/√5)ᵗ .
These vectors are orthogonal and so provide an orthonormal basis for R². Let
P = [v 1 v 2 ] = ( 2/√5  1/√5; −1/√5  2/√5 ),  D = ( 3 0; 0 −7 ).
Then A = P DP −1 and D = P −1 AP = Pᵗ AP . A suitable change of variable is X = P u, where X = (x, y)ᵗ and u = (u1 , u2 )ᵗ .
Then
q(x, y) = x² − 8xy − 5y² = Xᵗ AX = (P u)ᵗ A(P u) = uᵗ Pᵗ AP u = uᵗ Du = 3u1² − 7u2².
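The same change of variable can be carried out numerically with numpy.linalg.eigh, as in this hedged sketch (NumPy assumed only for illustration):

    import numpy as np

    A = np.array([[1.0, -4.0],
                  [-4.0, -5.0]])

    vals, P = np.linalg.eigh(A)     # columns of P: the principal axes
    print(vals)                     # approximately [-7.  3.]

    X = np.array([2.0, 1.0])
    u = P.T @ X                     # u = P^t X, since P is orthogonal
    assert np.isclose(X @ A @ X, vals @ (u * u))   # sum of lambda_i u_i^2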
Remark 5.2
The columns of P in Example 5.6 are called the principal axes of the quadratic form q(X) = Xᵗ AX. The vector u is the coordinate vector of X relative to the orthonormal
basis of Rn given by these principal axes. Quadratic forms are important in statistics in
the form of the covariance matrix. Here the change of basis to diagonalize the matrix
leads to a decomposition of a multivariate distribution as a product of independent
distributions in the direction of the new basis vectors.
5.3.2 Geometric View of Principal Axes
Suppose q(X) = Xᵗ AX, where A is an invertible 2 × 2 symmetric matrix, and let c be a constant. It can be shown that the set of all X in R² that satisfy
q(X) = Xᵗ AX = c
is an ellipse (or circle), a hyperbola, two intersecting lines, a single point, or contains no points at all; when A is a diagonal matrix, the graph is in standard position. More generally, let A be an n × n symmetric matrix with eigenvalues λ1 , ..., λn . Then the quadratic form q(X) = Xᵗ AX is:
(i). positive definite if and only if the eigenvalues of A are all positive, i.e λi > 0 for
all i.
(ii). positive (or positive semidefinite or nonnegative semidefinite) if and only
if the eigenvalues of A are non-negative, i.e λi ≥ 0 for all i.
(iii). negative definite on V if and only if the eigenvalues of A are negative, i.e.
λi < 0 for all i.
(iv). negative on V if and only if the eigenvalues of A are non-positive, i.e. λi ≤ 0 for
all i.
(v). indefinite on V if and only if A has both positive and negative eigenvalues.
Proof. By the Principal Axes Theorem, there exists an orthogonal change of variable
X = P u such that
q(X) = Xᵗ AX = uᵗ Du = λ1 u1² + λ2 u2² + ... + λn un²   (5.2)
where λ1 , ..., λn are the eigenvalues of A. Since P is invertible, there exists a one-to-
one correspondence between all nonzero X and all nonzero u. Thus the values of q(X)
for X ̸= 0 coincide with the values of the expression on the right hand side of (5.2),
which is obviously controlled by the signs of the eigenvalues λ1 , ..., λn in all the five cases
described in the theorem.
Engineers, economists, scientists and mathematicians often need to find the maximum
or minimum value of a quadratic form q(X) for X in some specified set. Typically, the
problem can be arranged so that X varies over the set of unit vectors. This is called a constrained optimization problem.
When a quadratic form q(X) has no cross-product terms, it is easy to find the maximum and minimum of q(X) for Xᵗ X = 1.
Example 5.8 Find the maximum and minimum values of q(x1 , x2 , x3 ) = 9x21 +4x22 +3x23
subject to the constraint x21 + x22 + x23 = 1.
Solution Since x2² and x3² are non-negative, note that 4x2² ≤ 9x2² and 3x3² ≤ 9x3², and hence
q(x1 , x2 , x3 ) = 9x1² + 4x2² + 3x3² ≤ 9x1² + 9x2² + 9x3² = 9(x1² + x2² + x3²) = 9
whenever x1² + x2² + x3² = 1. So the maximum value of q(X) cannot exceed 9 when X is a unit vector. Furthermore, q(X) = 9 when X = (1, 0, 0)ᵗ . Thus 9 is the maximum value of q(X) for Xᵗ X = 1.
To find the minimum value of q(X), observe that 9x1² ≥ 3x1² and 4x2² ≥ 3x2², and hence
q(X) ≥ 3x1² + 3x2² + 3x3² = 3(x1² + x2² + x3²) = 3
whenever x1² + x2² + x3² = 1. Also, q(X) = 3 when x1 = 0, x2 = 0, and x3 = 1. So 3 is the minimum value of q(X) when Xᵗ X = 1.
Remark 5.3
It is easy to see from Example 5.8 that the matrix of the quadratic form has eigenvalues
9, 4, and 3 and the largest and smallest eigenvalues equal, respectively, the (constrained)
maximum and minimum of q(X). The same holds for any quadratic form.
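A quick numerical sketch of this fact (Python with NumPy, assumed only for illustration), using the matrix of Example 5.8:

    import numpy as np

    A = np.diag([9.0, 4.0, 3.0])    # q = 9x1^2 + 4x2^2 + 3x3^2

    w = np.linalg.eigvalsh(A)       # eigenvalues, sorted ascending
    print(w[-1], w[0])              # 9.0 3.0: the constrained max and min

    x = np.random.default_rng(0).standard_normal(3)
    x /= np.linalg.norm(x)          # a random unit vector
    assert w[0] - 1e-9 <= x @ A @ x <= w[-1] + 1e-9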
2. Find a change of variable that removes the cross-product term in
5x1² − 4x1 x2 + 5x2² = 48.
Solution A = ( 5 −2; −2 5 ). The eigenvalues of A are 3 and 7, with corresponding unit (normalized) eigenvectors
v 1 = (1/√2, 1/√2)ᵗ , v 2 = (−1/√2, 1/√2)ᵗ .
Let P = [v 1 v 2 ] = ( 1/√2  −1/√2; 1/√2  1/√2 ). Then P orthogonally diagonalizes A, so the change of variable X = P u produces the quadratic form
3u1² + 7u2² = 48.
The new axes for this change of variable are shown on Fig 5.2(a).
5.6 Exercises
2. Classify the quadratic form. Then make a change of variable X = P u that transforms
the quadratic form into one with no cross-product term. Write the new quadratic form.
(a). q(x1 , x2 ) = 3x21 − 3x1 x2 + 6x22
(b). q(x1 , x2 ) = 9x21 − 8x1 x2 + 3x22
(c). q(x1 , x2 ) = x21 − 6x1 x2 + 9x22
(d). q(x1 , x2 ) = 8x21 + 6x1 x2
(e). q(x1 , x2 ) = x21 + 10x1 x2 + x22
It can be shown that the eigenvalues of A are 3, 9, and 15. Find an orthogonal matrix P such that the change of variable X = P u transforms Xᵗ AX into a quadratic form with no cross-product term. Give P and the new quadratic form.
5. Let A be an n × n invertible matrix. Show that if the quadratic form q(X) = Xᵗ AX is positive definite, then so is the quadratic form q̂(X) = Xᵗ A−1 X.
Chapter 6
ORTHOGONAL AND UNITARY MATRICES
6.1 Introduction
In Chapter 2, we introduced the notion of orthogonal matrices and the role they play
in orthogonally diagonalizing real symmetric matrices. In this lecture, we continue the
study on orthogonal matrices, studying more properties and how they define orthogonal
operators in vector spaces.
Recall that a real matrix P is orthogonal if P is nonsingular and P −1 = Pᵗ; equivalently, if P Pᵗ = Pᵗ P = I. When the underlying field is complex, the corresponding notion is that of a unitary matrix U, which satisfies U ∗ U = U U ∗ = I. That is, U ∗ = U −1, where U ∗ = (Ū)ᵗ is the adjoint (conjugate transpose) of U.
Objectives
At the end of this lecture, you should be able to:
6.2 Unitary and Orthogonal Matrices
We note that unitary and orthogonal matrices have special features, one of which is the
fact that they have easily computed inverses. We study some properties of orthogo-
nal/unitary matrices.
Example 6.1 The matrix A = ( 1/2  √3/2; −√3/2  1/2 ) is orthogonal.
2
Lemma 6.2 Suppose A and B are n × n matrices such that AB = I. Then also BA = I.
Proof The matrix I is nonsingular. If B were singular, then AB = I would imply that I is
singular, a contradiction. So B must be nonsingular. Now that we know that B is
nonsingular, there exists an n × n matrix C such that BC = I. Now
BA = (BA)I
= (BA)(BC)
= B(AB)C
= BIC
= BC
= I
Lemma 6.2 says that if AB = I, then the matrix B is both a right inverse and a left inverse for A, so A is invertible and A−1 = B.
Theorem 6.1 If Q is an orthogonal matrix, then Q is nonsingular and Q−1 = Qᵗ.
Proof of Theorem 6.1. By definition we know that Qᵗ Q = I. If either Qᵗ or Q were singular, then this equation would have us conclude that I is singular. This is a contradiction, since I is nonsingular. So Q and Qᵗ are both nonsingular. By Lemma 6.2, QQᵗ = I, and hence Q−1 = Qᵗ.
Theorem 6.1 also holds for a unitary matrix: simply replace Qᵗ with Q∗.
Theorem 6.3 An n × n matrix A is orthogonal/unitary if and only if its columns {A1 , A2 , ..., An } form an orthonormal set.
Proof. The proof revolves around recognizing that a typical entry of the product (Ā)ᵗ A is an inner product of columns of A. To support this claim, note that
[(Ā)ᵗ A]ij = Σₖ [(Ā)ᵗ]ik [A]kj = Σₖ [Ā]ki [A]kj = Σₖ [A]kj [Ā]ki = ⟨Aj , Ai ⟩,
where Ak denotes the k-th column of A and the sums run over k = 1, ..., n. Thus the columns of A form an orthonormal set
⟺ ⟨Aj , Ai ⟩ = δij for all i, j
⟺ (Ā)ᵗ A = In
⟺ A is an orthogonal/unitary matrix.
Theorem 6.4 Suppose Q is an orthogonal matrix and u, v are vectors. Then ⟨Qu, Qv⟩ = ⟨u, v⟩ and ∥Qv∥ = ∥v∥; that is, Q preserves inner products and norms.
Proof.
⟨Qu, Qv⟩ = (Qu)ᵗ (Qv) = uᵗ Qᵗ Qv = uᵗ In v = uᵗ v = ⟨u, v⟩.
The second conclusion is just a specialization of the first:
∥Qv∥ = √(∥Qv∥²) = √⟨Qv, Qv⟩ = √⟨v, v⟩ = √(∥v∥²) = ∥v∥.
Theorem 6.5 Let P be a real matrix. Then the following are equivalent:
(a). P is orthogonal
(b). the rows of P form an orthonormal set
(c). the columns of P form an orthonormal set.
Example. Let T : R3 −→ R3 be the linear operator that rotates each vector v about
the z-axis by a fixed angle θ. That is, T (x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z).
It is easy to show that T is orthogonal.
Definition 6.3 Let V be an n-dimensional vector space. Mappings T : V −→ V that
preserve a metric are called isometries.
Note that isometries are called unitary operators when V is complex (i.e. when the underlying field is C) and orthogonal operators when V is real (i.e. when the underlying field is R). Thus the isometries on Rⁿ are precisely the orthogonal matrices, and the isometries on
Cn are the unitary matrices. The set U(n) ⊂ M(n, C) of unitary n × n matrices is a
group under matrix multiplication. It is called the unitary group.
The set O(n) ⊂ M(n, R) of orthogonal n × n matrices is a group under matrix multi-
plication. It is called the orthogonal group.
Note that orthogonal equivalence implies unitary equivalence but the converse is not
generally true.
The composition of two isometries is again an isometry.
Proof Let U, V and W be vector spaces and assume that the mappings P : U −→ V
and Q : V −→ W both are isometries. Then ∥P u∥ = ∥u∥ for all u ∈ U and ∥Qv∥ = ∥v∥
for all v ∈ V . Then
∥QP u∥ = ∥Q(P u)∥ = ∥P u∥ = ∥u∥
for all u ∈ U .
The orthogonal operators in a vector space V with determinant 1 form a group called
the special orthogonal group . This group is usually denoted by SO(V). An element
of this group is called a proper rotation. If V = Rn , this group is denoted by SO(n, R).
This group plays an important role in Euclidean geometry. The operators T in SO(V)
in the 2-dimensional case are the simplest ones. If B = {e1 , e2 } is an orthonormal basis
in V , then these operators are of the form
T = ( cos φ  − sin φ; sin φ  cos φ )   (6.1)
The operator T associated with the matrix
T = ( cos φ  − sin φ  0; sin φ  cos φ  0; 0  0  1 )   (6.2)
is called the operator of rotation about the vector e3 by the angle φ. More generally, the rotations about the x-, y- and z-axes by the angle φ are represented by the orthogonal matrices
Px = ( 1  0  0; 0  cos φ  − sin φ; 0  sin φ  cos φ ),
Py = ( cos φ  0  sin φ; 0  1  0; − sin φ  0  cos φ ),
Pz = ( cos φ  − sin φ  0; sin φ  cos φ  0; 0  0  1 ).
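A short sketch (Python with NumPy, assumed only for illustration) builds two of these matrices and checks that each lies in SO(3), i.e. Pᵗ P = I and det P = 1:

    import numpy as np

    def Px(phi):
        c, s = np.cos(phi), np.sin(phi)
        return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

    def Pz(phi):
        c, s = np.cos(phi), np.sin(phi)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

    for P in (Px(0.7), Pz(2.1)):
        assert np.allclose(P.T @ P, np.eye(3))
        assert np.isclose(np.linalg.det(P), 1.0)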
6.4 Solved Problems
Solution No! Although the column vectors are orthogonal, they are not normalized.
2. Let T and L be the linear operators on R³ defined by T (x, y, z) = (x − z, x + y, z) and L(x, y, z) = (z, x, y). Determine whether T and L are orthogonal.
Solution With respect to the standard basis of R³, T and L have matrix representations
T = ( 1 0 −1; 1 1 0; 0 0 1 )  and  L = ( 0 0 1; 1 0 0; 0 1 0 ).
Now T Tᵗ = ( 2 1 −1; 1 2 0; −1 0 1 ) ≠ I, so T is not orthogonal. LLᵗ = I, and hence L is orthogonal. You can check that the rows and columns of L are orthonormal.
6.5 Exercises
1. Let A be an orthogonal matrix. Show that det(A) = ±1. Show that if B is also
orthogonal and det(A) = −det(B), then A + B is singular.
2. Prove that the eigenvalues of a unitary operator are contained in the unit circle ∂D = {z : |z| = 1}.
is orthogonal. (Hint: It must be a unit vector that is orthogonal to the other columns.
How much freedom does it leave?). Verify that the rows automatically become orthonor-
mal at the same time.
4. Determine which of the following matrices are orthogonal:
(a). ( 1/√2  1/√2; −1/√2  1/√2 )
(b). ( 0 0 1; 1 0 0; 0 1 0 )
(c). ( 1/2  −√3/2; √3/2  1/2 )
(d). ( 3/7  6/7  −2/7; −2/7  3/7  6/7; 6/7  −2/7  3/7 )