2200 Module NOTES

The document is a course module for MAT2200, titled 'An Introduction to Linear Algebra,' prepared by Zilore Mumba for students at the University of Zambia. It covers various topics in linear algebra, including linear equations, matrices, determinants, vector spaces, linear mappings, and inner product spaces, structured to align with the course syllabus. The module serves as a study aid and is based on existing literature rather than original content.

An Introduction to Linear Algebra

MAT2200 Course Module

Zilore Mumba

Part-Time Lecturer
Department of Mathematics and Statistics
University of Zambia

April, 2016
Contents
Preface v

Introduction 1

1 Linear Equations 2

2 Matrices 4

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2.2 Square Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.3 Matrix Addition and Scalar Multiplication . . . . . . . . . . . . . . . . . . . . . 6

2.4 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.5 Special Types of Square Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.6 Algebra of Square Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.7 Echelon Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.8 Elementary Row Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.9 Applications of Matrices to Systems of Linear Equations . . . . . . . . . . . . . 15

2.10 Elementary Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.11 Application to Finding the Inverse of an nxn Matrix . . . . . . . . . . . . . . . . 18

2.12 Elementary Column Operations and Equivalent Matrices . . . . . . . . . . . . . 19

3 Determinants 21

3.1 Definition of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

3.2 Minors and Cofactors of a Square Matrix . . . . . . . . . . . . . . . . . . . . . 21

3.3 Properties of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.4 Major Properties of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.5 Evaluation of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

i
CONTENTS

3.5.1 Determinants by Elementary Row/Column Operations . . . . . . . . . . 23

3.5.2 Determinants by Factorization . . . . . . . . . . . . . . . . . . . . . . . 24

3.5.3 Determinants by Pivotal Condensation . . . . . . . . . . . . . . . . . . 25

3.5.4 Calculating 3x3 Determinants . . . . . . . . . . . . . . . . . . . . . . . 26

3.6 Classical Adjoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.7 Applications to Solutions of Linear Equations . . . . . . . . . . . . . . . . . . . 28

4 Vector Spaces 29

4.1 Review of Spatial Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.2 Dot (or Inner) Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.3 Norm (or Length) of a Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.4 Distance, Angles, Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.5 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.6 Definition of a Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.7 Examples of Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4.8 Basis and Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4.9 Examples of Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

4.10 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4.11 Examples of Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.12 Linear Independence, Linear Span, Row Space of a Matrix . . . . . . . . . . . 43

4.12.1 Linear Combinations and Independence . . . . . . . . . . . . . . . . . 43

4.13 Linear Span . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.14 Dimension and Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4.15 Row Space of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


4.16 Basis-Finding Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4.16.1 Finding a Basis and Dimension of a Subspace of a Vector Space . . . . 48

4.16.2 Casting-out Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

4.17 Intersection of Subspaces, Sums and Direct Sums . . . . . . . . . . . . . . . . 51

4.17.1 Intersection of Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.17.2 Sums of Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.17.3 Direct Sums of Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.18 Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

4.19 Isomorphism of V and K n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

5 Linear Mappings on Vector Spaces 55

5.1 Definition of a Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . 55

5.1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5.2 Examples of Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . 56

5.3 Kernel and Image of a Linear Transformation . . . . . . . . . . . . . . . . . . . 59

5.4 Rank and Nullity of a Linear Mapping . . . . . . . . . . . . . . . . . . . . . . . 60

5.5 Matrix of a Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 62

5.5.1 Algorithm for Finding Matrix Representations of a Linear Transformation Relative to a Given Basis . . . . . . . . . . . . 63

5.6 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

5.7 Singular and Nonsingular Linear Transformations, Isomorphisms . . . . . . . . 69

5.8 Operations with Linear Transformations . . . . . . . . . . . . . . . . . . . . . . 71

5.8.1 Composition of Linear Mappings . . . . . . . . . . . . . . . . . . . . . . 72

6 Inner Product Spaces 73

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73


6.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.3 Euclidean and Unitary Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

6.4 Examples of Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . 75

6.5 Orthogonal Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.6 Gram-Schmidt Orthogonalisation Procedure . . . . . . . . . . . . . . . . . . . 79

7 Characteristic Roots and Vectors 81

7.1 Eigenvectors and Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

7.2 Similarity of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

7.3 Diagonalisation of Square Matrices and Linear Transformations . . . . . . . . . 86

7.4 Minimum Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

7.5 Diagonalisation of Symmetric Matrices . . . . . . . . . . . . . . . . . . . . . . 89

7.5.1 Real-Symmetric and Hermitian Matrices . . . . . . . . . . . . . . 89

7.5.2 Properties of the Conjugate Transpose . . . . . . . . . . . . . . . . . . 89

7.6 Orthogonal and Unitary Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 89

7.7 Hermitian and Symmetric Matrices . . . . . . . . . . . . . . . . . . . . . 90

7.8 Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

7.8.1 General Quadratic Form in Three Variables . . . . . . . . . . . . 94

7.8.2 Converting a Quadratic Form to Diagonal Form . . . . . . . . . 94

7.8.3 Principal Axes . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

7.8.4 Classification of Quadratic Forms . . . . . . . . . . . . . . . . . 99

7.8.5 Classification of symmetric matrices . . . . . . . . . . . . . . . . . . . . 101

Solutions to Numerical Exercises 110

Preface
This MAT2200 course module, An Introduction to Linear Algebra, was prepared to be used as a
study aid for students taking the course in the Department of Mathematics and Statistics of
the University of Zambia during the 2015 academic year. This module does not contain any
original work by the author, nor is it intended to replace the recommended textbook for
the course.

The material content was extracted from two main sources: Linear Algebra, an Introduction,
by A. Morris (Van Nostrand, 2004) and Linear Algebra, 4th edition, by S. Lipschutz and M.
Lipson (McGraw-Hill, 2009), together with a little material from other sources. I tried as
much as possible to confine the material to (the best of my understanding of) the MAT2200
syllabus as stipulated by the Department of Mathematics and Statistics.

Each chapter of this module starts with a list of expected outputs for that chapter. This is
intended to help students identify the level of knowledge expected of them by the end of the
chapter, and thereby assist them in their study.

I have spent a good amount of time proofreading the document to ensure error-free content
in the text as well as in the solutions to problems. However, I cannot guarantee that it is
error-free, especially when a document is proofread only by its author. I would therefore
appreciate being informed of any errors so that I can correct them.

Zilore Mumba
Part-Time Lecturer
Department of Mathematics and Statistics
University of Zambia

v
Introduction
Linear algebra is the study of linear transformations (any operation that transforms an input
to an output) and their algebraic properties. A transformation is linear if (a) every
amplification of the input causes a corresponding amplification of the output (e.g. doubling of
the input causes a doubling of the output), and (b) adding inputs together leads to adding of
their respective outputs.

A good understanding of linear algebra is essential for research in almost all areas, not only
in physics and mathematics (e.g. in engineering, image compression, and linear programming,
to mention only a few) but also in finance, economics and sociology. It forms the core of
research in several modern applications, such as quantum information theory. Efficient
methods for handling large matrices lie at the heart of fast algorithms for the retrieval of
data and information.

Many abstract objects which will be encountered in various topics of linear algebra, such as
"change of basis", "linear transformations", "bilinear forms", etc., can conveniently be
represented by matrices.
We can understand the behavior of linear transformations in terms of matrix multiplication.
This is not quite saying that linear transformations are the same as matrices, for two
reasons: i) the correspondence only works for finite dimensional spaces; and ii) the matrix
we get depends on the basis we choose, e.g. a single linear transformation can correspond
to many different matrices depending on what bases we pick.
But due to the convenience of representing linear transformations by matrices, we will look at
manipulations on matrices in a little more detail, and use matrix representation throughout
the course.
In linear algebra we shall manipulate not just scalars, but also vectors, vector spaces,
matrices, and linear transformations. These manipulations will include familiar operations
such as addition, multiplication, and reciprocal (multiplicative inverse), but also new
operations such as span, dimension, transpose, determinant, trace, eigenvalue, eigenvector,
and characteristic polynomial.

Broadly speaking, these notes will cover the following topics:

• Matrices - allow us to carry out operations on elements of any arbitrary field of numbers;

• Vector spaces - allow us to add vectors and to multiply them by scalars;

• Linear transformations - allow us to transform inputs to outputs;

• Inner product spaces - allow us to compute lengths, angles, and inner products;

• Diagonalisation of matrices - allows us an insight into what a linear transformation is doing: scaling, stretching or rotating.

1
Chapter 1: Linear Equations
Outputs
1. Understanding of two methods of finding solutions to systems of linear equations:

i) Cramer’s rule

ii) Gaussian Elimination

Let us find solutions to the following systems of equations:

i)  x1 − 2x2 + x3 = 1      ii)  x1 + x2 = 2       iii)  x1 − 2x2 + x3 = 1
    2x1 − x2 + x3 = 2           2x1 + 2x2 = 3            2x1 − x2 + x3 = 2
    4x1 + x2 − x3 = 1

Solutions
We can solve these systems of equations using:
a) Determinants (Cramer’s rule), or
b) the Gauss Elimination Method
e.g., for system i), using Cramer’s rule, we define:

∆ =   | 1 −2  1 |
      | 2 −1  1 | = 1(1 − 1) − (−2)(−2 − 4) + 1(2 + 4) = −6
      | 4  1 −1 |

∆x1 = | 1 −2  1 |
      | 2 −1  1 | = 1(1 − 1) − (−2)(−2 − 1) + 1(2 + 1) = −3
      | 1  1 −1 |

∆x2 = | 1  1  1 |
      | 2  2  1 | = 1(−2 − 1) − 1(−2 − 4) + 1(2 − 8) = −3
      | 4  1 −1 |

∆x3 = | 1 −2  1 |
      | 2 −1  2 | = 1(−1 − 2) − (−2)(2 − 8) + 1(2 + 4) = −9
      | 4  1  1 |

Hence:
x1 = ∆x1/∆ = −3/−6 = 1/2
x2 = ∆x2/∆ = −3/−6 = 1/2
x3 = ∆x3/∆ = −9/−6 = 3/2
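The arithmetic above is easy to check numerically. A small sketch in pure Python (the helper names det3 and replace_col are my own, not part of the module):

```python
def det3(m):
    # cofactor expansion along the first row of a 3x3 matrix
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[1, -2, 1],
     [2, -1, 1],
     [4, 1, -1]]
b = [1, 2, 1]

delta = det3(A)  # -6, as computed above

def replace_col(m, j, col):
    # copy of m with column j replaced by col (Cramer's rule)
    return [row[:j] + [col[i]] + row[j + 1:] for i, row in enumerate(m)]

x = [det3(replace_col(A, j, b)) / delta for j in range(3)]
print(x)  # [0.5, 0.5, 1.5]
```

The three determinants ∆x1, ∆x2, ∆x3 are exactly det3 applied to A with one column replaced by the right-hand side.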
Using the Gauss Elimination Method, we reduce the systems i) - iii) to simpler systems of
equations by a process of elimination.

solution to i)
x1 − 2x2 + x3 = 1
2x1 − x2 + x3 = 2
4x1 + x2 − x3 = 1

Equation 2 → Equation 2 − 2×Equation 1
Equation 3 → Equation 3 − 4×Equation 1

x1 − 2x2 + x3 = 1
     3x2 − x3 = 0
    9x2 − 5x3 = −3

Equation 3 → Equation 3 − 3×Equation 2

x1 − 2x2 + x3 = 1
     3x2 − x3 = 0
        −2x3 = −3

This is a simpler system to solve, and gives the unique solution
x3 = 3/2, x2 = 1/2, x1 = 1/2.
Note that the last form of the system of equations is said to be in echelon form.
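The elimination and back-substitution steps above can be sketched in code; this is a minimal sketch (gauss_solve is my own helper name) assuming nonzero pivots and a unique solution, as in system i):

```python
def gauss_solve(A, b):
    # forward elimination to echelon form, then back-substitution
    # (assumes nonzero pivots and a unique solution)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for i in range(n):
        for r in range(i + 1, n):
            factor = M[r][i] / M[i][i]
            M[r] = [x - factor * y for x, y in zip(M[r], M[i])]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

print(gauss_solve([[1, -2, 1], [2, -1, 1], [4, 1, -1]], [1, 2, 1]))
# [0.5, 0.5, 1.5]
```

The intermediate augmented rows match the hand computation: after the first two operations row 2 is (0, 3, −1 | 0) and row 3 is (0, 9, −5 | −3).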

solution to ii)
x1 + x2 = 2
2x1 + 2x2 = 3
Equation 2 → Equation 2 − 2×Equation 1
x1 + x2 = 2
      0 = −1
This system is inconsistent, giving the false statement 0 = −1.
So the system has no solution.
So the system has no solution.

solution to iii)
x1 − 2x2 + x3 = 1
2x1 − x2 + x3 = 2
Equation 2 → Equation 2 − 2×Equation 1
x1 − 2x2 + x3 = 1
     3x2 − x3 = 0
From this simpler system we see that x2 = (1/3)x3.
Whatever value we assign to x3 gives a value of x2, and by substitution into equation 1, a
value of x1.
Let x3 = t:
if t = 0: x1 = 1, x2 = 0 = x3;
if t = 3: x1 = 0, x2 = 1, x3 = 3.
From the above three examples we note that a system of linear equations can have:
i) no solution
ii) a unique solution
iii) more than one solution

We shall develop the solution process for systems of linear equations using the Gauss
Elimination Method because it is more efficient.
We shall also want to understand:
i) when does the solution exist?
ii) when is the solution the unique solution?
We shall concentrate our study through the use of matrices.

Chapter 2: Matrices
Outputs
Ability to:

1. Carry out matrix operations of addition and scalar multiplication

2. Carry out elementary row/column operations to reduce matrices to echelon and to row canonical form

3. Apply the above to:

i) solutions of linear equations

ii) finding the inverse of a matrix

Preamble
We can list some of the subsets of the number system on which we can perform the usual
operations of addition and multiplication.
the set of real (rational and irrational) numbers → the real field R
the set of rational numbers → the rational field Q
the set of complex numbers → the complex field C
These subsets will form the basis of this course. We will frequently refer to an arbitrary field
K, which may be any of the fields above.

2.1 Introduction
Let K be an arbitrary field. A
 rectangular array of elements of K
a11 a12 . . . a1n
 
 a21 a22 . . . a2n 
 
 . . . . . .
 
am1 am2 . . . amn
where aij are scalars in K is called a matrix. The above matrix is also denoted
(aij ), i = 1, 2, ..., m, j = 1, 2, ..., n, or simply (aij ).
The element aij , called the ij entry or ij component appears in the ith row and jth column. A
matrix with m rows and n columns is called an m by n or mxn matrix.

e.g.
 
  1 2 1  
1 0 −3 1   1 3 1
 0 2 0
a) 2 1 3 1 is a 3x4 matrix, b) 
−3 3 1 is a 4x3 matrix and c) 2 1 4
    

1 0 1 1 4 7 6
 
1 1 1
is a 3x3 or 3-square matrix.


A matrix, e.g. the matrix in c) above, could be considered as the coefficient matrix of the
system of homogeneous linear equations

x + 3y + z = 0
2x + y + 4z = 0
4x + 7y + 6z = 0

or as the augmented matrix of the system of non-homogeneous linear equations

x + 3y = 1
2x + y = 4
4x + 7y = 6

We shall see how matrices may be used to find solutions to these systems.
Matrices are usually denoted by capital letters, A, B, etc., and elements of the field K by
lower case letters, a, b, etc.
Two matrices A and B are equal, denoted A=B, if they have the same shape and the
corresponding elements are equal.
A matrix with one row is also called a row vector and a matrix with one column is called a
column vector.  
e.g. ( 1 2 3 ) is a 1x3 matrix or row vector, and

[ 1 ]
[ 2 ]  is a 3x1 matrix or column vector.
[ 0 ]
A matrix whose entries are all zero is called a zero matrix.

e.g. [ 0 0 . . . 0 ]
     [ 0 0 . . . 0 ]  is a zero matrix and is denoted by 0.
     [ 0 0 . . . 0 ]

2.2 Square Matrices


A square matrix is a matrix with the same number of rows as columns. An nxn square
matrix is said to be of order n and is sometimes called an n-square matrix.
Let A = (aij ) be an n-square matrix. We can define:

a) The Diagonal (or main diagonal) of the n-square matrix A = (aij ) which consists of the
elements with the same subscripts,that is, a11 , a22 , ..., ann .
 
1 2 3
e.g. A = 4 5 6 is a 3-square matrix with diagonal elements 1, 5, 9.
 

7 8 9

b) A Diagonal Matrix is a square matrix whose non-diagonal entries are all zero,

i.e. A = [ a11   0   . . .   0  ]
         [  0   a22  . . .   0  ]
         [  .    .   . . .   .  ]
         [  0    0   . . .  ann ]  is a diagonal matrix.


 
1 0 0
e.g. A = 0 1 0 is a diagonal matrix.
 

0 0 2

c) An Upper Triangular Matrix (or simply triangular matrix), a square matrix whose entries
below the main diagonal are all zero.

i.e. A = [ a11  a12  . . .  a1n ]
         [  0   a22  . . .  a2n ]
         [  .    .   . . .   .  ]
         [  0    0   . . .  ann ]  is an upper triangular matrix.
 
1 2 3
e.g. A = 0 1 3 is upper triangular.
 

0 0 2

d) A Lower Triangular Matrix, a square matrix whose entries above the main diagonal are
all zero.
i.e. A = [ a11   0   . . .   0  ]
         [ a21  a22  . . .   0  ]
         [  .    .   . . .   .  ]
         [ an1  an2  . . .  ann ]  is a lower triangular matrix.

 
1 0 0
e.g. A = 2 1 0 is lower triangular.
 

3 3 2

e) The trace of A, written tr(A), is the sum of the diagonal elements. Namely,
tr(A) = a11 + a22 + a33 + . . . + ann
The following theorem applies.

Theorem 2.2.1. Suppose A = (aij ) and B = (bij ) are n-square matrices and k is a scalar.
Then
i) tr(A + B) = tr(A) + tr(B)
ii) tr(kA) = ktr(A)
iii) tr(AT ) = tr(A)
iv) tr(AB) = tr(BA).
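Property iv), tr(AB) = tr(BA), is perhaps the least obvious; the identities can be spot-checked numerically (trace and matmul are my own helper names, and the matrices are my own examples):

```python
def matmul(A, B):
    # entry (i, j): dot product of row i of A with column j of B
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def trace(M):
    # sum of the diagonal elements of a square matrix
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
A_plus_B = [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]
assert trace(A_plus_B) == trace(A) + trace(B)      # i)
assert trace(matmul(A, B)) == trace(matmul(B, A))  # iv)
```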

2.3 Matrix Addition and Scalar Multiplication


Let A and B be two 2x3 matrices:

A = [ a11  a12  a13 ],   B = [ b11  b12  b13 ]
    [ a21  a22  a23 ]        [ b21  b22  b23 ]

The sum of A and B, written A+B, is the 2x3 matrix obtained by adding the corresponding
elements of A and B, i.e.

A + B = [ a11 + b11  a12 + b12  a13 + b13 ]
        [ a21 + b21  a22 + b22  a23 + b23 ]

Note that A2x3 + B2x3 = C2x3.
The product of a scalar k and the matrix A, written kA, is the matrix obtained by multiplying
each element of A by k:

kA = k [ a11  a12  a13 ] = [ ka11  ka12  ka13 ]
       [ a21  a22  a23 ]   [ ka21  ka22  ka23 ]

The sum of matrices of different sizes is not defined.
The basic properties of matrix Addition and multiplication by a scalar can be summarised in:

Theorem 2.3.1. i) (A + B) + C = A + (B + C)
ii) A + 0 = A
iii) A + (−A) = 0
iv) A + B = B + A
v) k1 (A + B) = k1 A + k1 B
vi) (k1 + k2 )A = k1 A + k2 A
vii) (k1 k2 )A = k1 (k2 A)
viii) 1A = A and 0A = 0

Proof. (of v))

Let A and B be mxn matrices and k a scalar. We show that k(A + B) = kA + kB.
Suppose A = (aij ) and B = (bij ).
Then aij + bij is the ij entry of A + B, and so k(aij + bij ) is the ij entry of k(A + B).
On the other hand, kaij and kbij are the ij entries of kA and kB respectively.
But k, aij and bij are scalars in a field.
Hence k(aij + bij ) = kaij + kbij for every i, j.
Thus k(A + B) = kA + kB, as corresponding entries are equal. 

Using vi) and viii) above, we can also show that A + A = 2A, A + A + A = 3A, etc.
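A few of the identities in Theorem 2.3.1 can be spot-checked in a few lines (add and scale are my own helper names, and the matrices are my own examples):

```python
def add(A, B):
    # entrywise sum; defined only when A and B have the same shape
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(k, A):
    # multiply every entry of A by the scalar k
    return [[k * a for a in row] for row in A]

A = [[1, 2, 3], [0, -5, 1]]
B = [[4, 3, -5], [-1, 2, 0]]
assert add(A, B) == add(B, A)                                # iv)
assert scale(3, add(A, B)) == add(scale(3, A), scale(3, B))  # v)
assert add(A, A) == scale(2, A)                              # A + A = 2A
```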

2.4 Matrix Multiplication


If A is an mxn matrix and B is a pxq matrix, then the product AB is defined if and only if
n = p, and AB is the mxq matrix whose ij entry is obtained by multiplying the elements in the
ith row of A with the corresponding elements in the jth column of B and summing. That is,

Amxn Bnxq = Cmxq , where cij = ai1 b1j + ai2 b2j + ... + ain bnj = Σ(k=1 to n) aik bkj

The product AB is not defined for mxn and pxq matrices if n ≠ p.

e.g. If A = [ 1 2 ] and B = [ 1 1 ]
            [ 3 4 ]         [ 0 2 ]

AB = [ 1·1 + 2·0  1·1 + 2·2 ] = [ 1  5 ]
     [ 3·1 + 4·0  3·1 + 4·2 ]   [ 3 11 ]

BA = [ 1·1 + 1·3  1·2 + 1·4 ] = [ 4 6 ]
     [ 0·1 + 2·3  0·2 + 2·4 ]   [ 6 8 ]

Matrix multiplication is not commutative: in general AB ≠ BA.
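The row-by-column rule and the failure of commutativity can be reproduced in a few lines (matmul is my own helper name):

```python
def matmul(A, B):
    # entry (i, j) is the dot product of row i of A with column j of B;
    # defined only when the column count of A equals the row count of B
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]
B = [[1, 1], [0, 2]]
print(matmul(A, B))  # [[1, 5], [3, 11]]
print(matmul(B, A))  # [[4, 6], [6, 8]]
```

zip(*B) iterates over the columns of B, which is exactly what the row-by-column rule needs.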


The following rules of matrix multiplication are valid

Theorem 2.4.1. i) (AB)C = A(BC)
ii) A(B + C) = AB + AC
iii) (B + C)A = BA + CA
iv) k(AB) = (kA)B = A(kB)

Proof. (of i))

Let A = (aij ) be an mxn matrix, B = (bjk ) an nxp matrix and C = (ckl ) a pxq matrix.
Write AB = (sik ) and BC = (tjl ), where

sik = ai1 b1k + ai2 b2k + ... + ain bnk = Σ(j=1 to n) aij bjk

tjl = bj1 c1l + bj2 c2l + ... + bjp cpl = Σ(k=1 to p) bjk ckl

Multiplying S by C (i.e. AB by C), the element in the ith row and lth column of (AB)C is

si1 c1l + si2 c2l + ... + sip cpl = Σ(k=1 to p) sik ckl = Σ(k=1 to p) Σ(j=1 to n) (aij bjk )ckl

On the other hand, multiplying A by T (i.e. A by BC), the element in the ith row and lth
column of A(BC) is

ai1 t1l + ai2 t2l + ... + ain tnl = Σ(j=1 to n) aij tjl = Σ(j=1 to n) Σ(k=1 to p) aij (bjk ckl )

The above sums are equal. 

Transpose
The transpose of a matrix A, written AT , is the matrix obtained by writing the rows of A, in
order, as columns.

e.g. if A = [ 1 2 1 ], then AT = [ 1 3 ]
            [ 3 4 7 ]            [ 2 4 ]
                                 [ 1 7 ]

If A is an mxn matrix, AT is an nxm matrix. The transpose operation on matrices satisfies the
following properties:

Theorem 2.4.2. i) (A + B)T = AT + B T


ii) (AT )T = A
iii) (kA)T = kAT , for k a scalar
iv) (AB)T = B T AT .

Proof. (of iv))

Let A = (aij ) be mxn and B = (bjk ) be nxp.
The ith row, kth column entry of AB is ai1 b1k + ... + ain bnk = Σ(j=1 to n) aij bjk ,
which is the kth row, ith column entry of the transpose matrix (AB)T .
On the other hand, the kth row of B T consists of the elements of the kth column of B:
(b1k , b2k , ..., bnk ).
Furthermore, the ith column of AT consists of the elements from the ith row of A:
(ai1 , ai2 , ..., ain ).
Consequently, the element appearing in the kth row, ith column of the matrix B T AT is the
product b1k ai1 + b2k ai2 + ... + bnk ain = Σ(j=1 to n) aij bjk .
Hence (AB)T = B T AT . 
By induction, (A1 A2 ...An )T = ATn ATn−1 ...AT1 .
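The reversal of order in iv) can be spot-checked numerically (helper names and the matrix B are my own; A is the example matrix used above):

```python
def transpose(M):
    # rows of M become columns
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

A = [[1, 2, 1], [3, 4, 7]]    # 2x3
B = [[1, 0], [2, 1], [0, 3]]  # 3x2, my own example
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```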

2.5 Special Types of Square Matrices


1. Identity Matrix, denoted In or I: In = (aij ), where aii = 1 and aij = 0 (i ≠ j).

e.g. I = [ 1 0 . . . 0 ]
         [ 0 1 . . . 0 ]
         [ . . . . . . ]
         [ 0 0 . . . 1 ]  is the unit or identity matrix.

It has the property that AIn = In A = A.

2. Inverse Matrix of a square matrix A is a matrix B, denoted A−1 , with the property that


AB = BA = I , where I is the identity matrix.


The inverse B is unique.

Proof
Let C be another inverse of A, then
AC = CA = I
and
C = CI = C(AB) = (CA)B = IB = B .
!
2 5
Example 2.5.1. Find the inverse of A =
1 3
Solution
We seek a matrix B = [ a b ] such that AB = [ 2 5 ][ a b ] = [ 1 0 ]
                     [ c d ]                [ 1 3 ][ c d ]   [ 0 1 ]

AB = [ 2a + 5c  2b + 5d ] = [ 1 0 ],  i.e.  2a + 5c = 1,  2b + 5d = 0,
     [ a + 3c   b + 3d  ]   [ 0 1 ]         a + 3c = 0,   b + 3d = 1

Solving for a, b, c, d gives B = A−1 = [  3 −5 ]
                                       [ −1  2 ]
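Any candidate inverse can be verified by direct multiplication; a quick sketch for the matrix A above (matmul is my own helper name; note that b + 3d = 1 with b = −5 forces d = 2):

```python
def matmul(A, B):
    # row-by-column product of two square matrices
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

A = [[2, 5], [1, 3]]
A_inv = [[3, -5], [-1, 2]]  # d = 2, since b + 3d = 1 and b = -5
I = [[1, 0], [0, 1]]
assert matmul(A, A_inv) == I and matmul(A_inv, A) == I
```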

 
Example 2.5.2. Find the inverse of A = [ 1 3 3 ]
                                       [ 1 4 3 ]
                                       [ 1 3 4 ]
Solution
We seek a matrix B = [ r s t ]                [ 1 3 3 ][ r s t ]   [ 1 0 0 ]
                     [ u v w ] such that AB = [ 1 4 3 ][ u v w ] = [ 0 1 0 ]
                     [ x y z ]                [ 1 3 4 ][ x y z ]   [ 0 0 1 ]

Equating corresponding entries and solving the three resulting systems of equations gives

A−1 = [  7 −3 −3 ]
      [ −1  1  0 ]
      [ −1  0  1 ]

If a matrix has an inverse it is said to be invertible (or nonsingular; then det A ≠ 0).
Otherwise the matrix is called noninvertible (or singular).
If A and B are invertible nxn matrices, then AB is invertible and (AB)−1 = B −1 A−1 .
Proof
If A and B are invertible, then
AA−1 = I = A−1 A, and
BB −1 = I = B −1 B .
Hence (AB)(B −1 A−1 ) = A(BB −1 )A−1 = AA−1 = I and (B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 B = I ,
so AB is invertible, and since the inverse is unique,
(AB)−1 = B −1 A−1 .
By induction, (A1 A2 ...An )−1 = A−1n A−1n−1 ...A−11 .


3. Symmetric Matrix A = (aij ) is one for which (aij ) = (aji ), or AT = A. In a symmetric


matrix the elements are symmetric about the diagonal of the matrix.
 
1 2 5
e.g. A = 2 5 −7 is a 3-square symmetric matrix.
 

5 −7 3

4. Skew-Symmetric Matrix A = (aij ) is one for which (aij ) = −(aji ), or AT = −A.


 
0 −5 4
e.g. A= 5 0 −1 is a 3-square skew-symmetric matrix.
 

−4 1 0
In a skew-symmetric matrix the diagonal elements are all zero (since aii = −aii forces
a11 = a22 = a33 = 0).

5. Orthogonal Matrix A = (aij ) is one for which A−1 = AT , or AAT = AT A = I . An


orthogonal matrix A has AT as its inverse.

6. Normal Matrix A = (aij ) is one which commutes with its transpose AT , i.e.
AAT = AT A. If A is symmetric, orthogonal, or skew-symmetric, then A is normal.
There are also other normal matrices.

e.g. Let A = [ 6 −3 ]
             [ 3  6 ]

Then AAT = [ 6 −3 ][  6 3 ] = [ 45  0 ] and
           [ 3  6 ][ −3 6 ]   [  0 45 ]

AT A = [  6 3 ][ 6 −3 ] = [ 45  0 ]
       [ −3 6 ][ 3  6 ]   [  0 45 ]

Because AAT = AT A, the matrix A is normal.
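The defining equation AAT = AT A can be checked directly (helper names are my own; the matrix is the example above, which is normal without being symmetric, orthogonal, or skew-symmetric):

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

A = [[6, -3], [3, 6]]
assert matmul(A, transpose(A)) == [[45, 0], [0, 45]]
assert matmul(A, transpose(A)) == matmul(transpose(A), A)  # A is normal
```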

2.6 Algebra of Square Matrices


If A is an n-square matrix, we can form powers of A:
A2 = AA, A3 = AAA, ..., A0 = I .
We can also form polynomials in the matrix A. For any polynomial
f (x) = a0 + a1 x + ... + an xn , where the ai are scalars, we define f (A) to be the matrix
f (A) = a0 I + a1 A + ... + an An .
In the case where f (A) is the zero matrix, A is called a zero or root of the polynomial
f (x).

Example 2.6.1.

Let A = [ 1  2 ]. If f (x) = 2x2 − 3x + 5, then
        [ 3 −4 ]

f (A) = 2 [ 1  2 ]² − 3 [ 1  2 ] + 5 [ 1 0 ]
          [ 3 −4 ]      [ 3 −4 ]     [ 0 1 ]

      = 2 [  7 −6 ] − [ 3   6 ] + [ 5 0 ] = [  16 −18 ]
          [ −9 22 ]   [ 9 −12 ]   [ 0 5 ]   [ −27  61 ]

If g(x) = x2 + 3x − 10, then

g(A) = [  7 −6 ] + [ 3   6 ] − [ 10  0 ] = [ 0 0 ]
       [ −9 22 ]   [ 9 −12 ]   [  0 10 ]   [ 0 0 ]

Thus A is a zero of the polynomial g(x)
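Evaluating a polynomial at a matrix follows exactly the substitution above; a sketch (poly_at and matmul are my own helper names; coefficients are listed from a0 upwards):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def poly_at(coeffs, A):
    # evaluate a0*I + a1*A + a2*A^2 + ... for a square matrix A
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, A)
    return result

A = [[1, 2], [3, -4]]
print(poly_at([5, -3, 2], A))   # f(x) = 2x^2 - 3x + 5 -> [[16, -18], [-27, 61]]
print(poly_at([-10, 3, 1], A))  # g(x) = x^2 + 3x - 10 -> [[0, 0], [0, 0]]
```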

Exercises 2.6

1. Perform the indicated operations where possible


       
1 2 −3 4 3 −5 6 −1 1 2 −3 3 5
i) + , ii) +
0 −5 1 −1 2 0 −2 −3 0 −4 1 1 −2
iii) 3A + 4B − 2C , where
     
A = [ 2 −5  1 ],  B = [ 1 −2 −3 ],  and C = [ 0  1 −2 ]
    [ 3  0 −4 ]       [ 0 −1  5 ]           [ 1 −1 −1 ]
     
2. Find x, y, z, w if 3 [ x y ] = [  x   6 ] + [  4   x+y ]
                        [ z w ]   [ −1  2w ]   [ z+w   3  ]
 
3. Let A = [ 2 2 ], find: a) A2 and b) A3
           [ 3 1 ]

4. Find: i) f (A) and ii) g(A), if a) f (x) = x3 − 3x2 − 2x + 4, and b) g(x) = x2 − x − 8.
Find a nonzero column vector u = [ x ] such that Au = 6u, if A = [ 1 3 ]
                                 [ y ]                           [ 5 3 ]

5. Let (rxs) denote a matrix with shape rxs. Determine the shape of the following products,
(if the product is defined):
i) A2x3 B3x4 ii) A4x1 B1x2 iii) A3x4 B3x4 iv) A5x2 B2x3

6. Find i) AB and ii) BA, given

a) A = [  2 −1 ]
       [  1  0 ]  and B = [ 1 −2 −5 ]
       [ −3  4 ]          [ 3  4  0 ]

b) A = ( 2 1 ) and B = [ 1 −2  0 ]
                       [ 4  5 −3 ]

c) A = [ 1 1 ] and B = [  1  2 ]
       [ 2 2 ]         [ −1 −2 ]

7. Find the transpose AT of the matrices A and B, where


   
1 0 1 0 1 2 3
i) A = 2 3 4 5 and ii) B = 2 4 −5
4 4 4 4 3 −5 0


8. Find a) AAT and b) AT A, if

i) A = [ 1  2 0 ],  ii) A = ( 2 −1 3 ),  and iii) A = [  2 6 −3 ]
       [ 3 −1 4 ]                                     [  3 2  6 ]
                                                      [ −6 3  2 ]
 
9. Matrices A and B are said to commute if AB = BA. Find all matrices [ x y ]
                                                                      [ z w ]
which commute with [ 1 1 ]
                   [ 0 1 ]
10. If A(α) = [ 1  α  α2/2 ]
              [ 0  1    α  ]
              [ 0  0    1  ]
show that A(α)A(β) = A(α + β). Hence find the inverse of A(α). Show that
A(3α) − 3A(2α) + 3A(α) = I , and hence find a cubic equation satisfied by A(α).

11. Prove that for positive integers n


   
i) An = [ 1 + 6n    4n    ]  for A = [  7  4 ]
        [  −9n    1 − 6n  ]          [ −9 −5 ]

ii) An = [ 1 − 3n   −9n   ]  for A = [ −2 −9 ]
         [    n    1 + 3n ]          [  1  4 ]

2.7 Echelon Matrices


A matrix is an echelon matrix, or is in echelon form if the number of zeros preceding the first
nonzero entry of a row increases row by row until only zero rows remain.

e.g.

A = [ 1 2 −3 0  1 ]    B = [ 0 1 7 −5 0 ]    C = [ 1 0 5 0 2 ]
    [ 0 0  5 2 −4 ]        [ 0 0 0  0 1 ]        [ 0 1 2 0 4 ]
    [ 0 0  0 7  3 ]        [ 0 0 0  0 0 ]        [ 0 0 0 1 7 ]

Matrices A, B, C are in echelon form. The leading nonzero entries of the rows (1, 5, 7 in A;
1, 1 in B; 1, 1, 1 in C) are called the distinguished elements (or pivots) of the echelon
matrix.

Definition 2.7.1. A matrix A is called an echelon matrix, or is said to be in echelon form, if


the following two conditions hold (where a leading nonzero element of a row of A is the first
nonzero element in the row):
1) All zero rows, if any, are at the bottom of the matrix.
2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the
preceding row.

Definition 2.7.2. A matrix A is said to be in row canonical form (or row-reduced echelon
form) if it is an echelon matrix, that is, if it satisfies properties (1) and (2) above , and if it
satisfies the following additional two properties:
1) Each pivot (leading nonzero entry) is equal to 1.
2) Each pivot is the only nonzero entry in its column.

e.g:

a) A = [ 1 0 −1 2 ]
       [ 0 1  1 3 ]
       [ 0 0  0 1 ]
is in echelon form, but not row reduced (in the 4th column, the distinguished element is not
the only nonzero entry).

b) B = [ 1 0 0 3 ]
       [ 0 1 0 2 ]
       [ 0 0 1 1 ]
is in row reduced echelon form (all distinguished elements equal 1, and each leading 1 is the
only nonzero entry in its column).

c) C = [ 0 1 0 2 ]
       [ 1 0 2 0 ]
       [ 0 0 0 0 ]
is not an echelon matrix.

The zero matrix of any size and the identity matrix I of any size are important special
examples of matrices in row canonical form.

2.8 Elementary Row Operations


Definition 2.8.1. The following are called elementary row operations on a matrix
A = (aij )
1) interchange ith and jth rows: Ri ↔ Rj
2) multiply ith row by a nonzero scalar k: Ri → kRi
3) add k times jth row to ith row: Ri → kRj + Ri .

Definition 2.8.2 (Row Equivalence, Rank of a Matrix). If A and B are two m×n matrices, then
A is said to be row equivalent to B, written A ∼ B, if B can be obtained from A by a
sequence of elementary row operations. If B is also an echelon matrix, B is called an
echelon form of A.

   
1 0 −1 1 1 0 −1 1
e.g. A= 2 1 0 1 is is row equivalent to B = −1 1 0 2
−1 1 0 2 0 3 0 5
     
1 0 −1 1 1 0 −1 1 1 0 −1 1
since A =  2 1 0 1 R2 →R2 +2R3  0 3 0 5 R2 ↔R3 −1 1 0 2
−−−−−−−−→ −−−−→
−1 1 0 2 −1 1 0 2 0 3 0 5

Definition 2.8.3. The rank of a matrix A, written rank(A), is equal to the number of pivots in
an echelon form of A.

Theorem 2.8.1. Every m×n matrix is row equivalent to an m×n row reduced echelon matrix.

Proof. (Gaussian Elimination - example)


        [ 1  2  −3  1   2 ]
Let A = [ 2  4  −4  6  10 ].
        [ 3  6  −6  9  13 ]

Applying R2 → R2 − 2R1 and R3 → R3 − 3R1 gives

  [ 1  2  −3  1  2 ]
  [ 0  0   2  4  6 ]
  [ 0  0   3  6  7 ]

and then R2 → (1/2)R2 followed by R3 → R3 − 3R2 gives

  [ 1  2  −3  1   2 ]
  [ 0  0   1  2   3 ],  which is in echelon form.
  [ 0  0   0  0  −2 ]

Applying R3 → −(1/2)R3, then R1 → R1 + 3R2, R2 → R2 − 3R3 and R1 → R1 − 11R3, gives

  [ 1  2  0  7  0 ]
  [ 0  0  1  2  0 ],  which is in row reduced echelon form. ∎
  [ 0  0  0  0  1 ]
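The reduction above is just Gaussian elimination carried out systematically, and it is easy to script. The sketch below (plain Python with exact fractions; the function name `rref` is ours, not from the text) reduces the same matrix to its row canonical form:

```python
from fractions import Fraction

def rref(M):
    """Row-reduce M to row canonical form using the three elementary row operations."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    pivot_cols = []
    for c in range(cols):
        # Find a row at or below pivot_row with a nonzero entry in column c.
        r = next((r for r in range(pivot_row, rows) if A[r][c] != 0), None)
        if r is None:
            continue
        A[pivot_row], A[r] = A[r], A[pivot_row]           # Ri <-> Rj
        p = A[pivot_row][c]
        A[pivot_row] = [x / p for x in A[pivot_row]]      # Ri -> (1/p)Ri, pivot becomes 1
        for r2 in range(rows):
            if r2 != pivot_row and A[r2][c] != 0:
                f = A[r2][c]
                A[r2] = [x - f * y for x, y in zip(A[r2], A[pivot_row])]  # Ri -> Ri - f Rj
        pivot_cols.append(c)
        pivot_row += 1
    return A, pivot_cols

R, pivots = rref([[1, 2, -3, 1, 2],
                  [2, 4, -4, 6, 10],
                  [3, 6, -6, 9, 13]])
# Same result as the hand computation above.
assert R == [[1, 2, 0, 7, 0], [0, 0, 1, 2, 0], [0, 0, 0, 0, 1]]
assert pivots == [0, 2, 4]
```

The list `pivots` records the pivot columns, so `len(pivots)` is the rank of the matrix in the sense of Definition 2.8.3.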

Exercises 2.7

1. Find the reduced echelon matrices of the following matrices

       [ 1  −1   2  1 ]           [  1  −2   3  2 ]            [  2  5  3 ]
i) A = [ 2   1  −1  1 ],  ii) B = [ −1   2  −3  6 ],  iii) C = [  1  5  4 ]
       [ 1  −2   1  1 ]           [  2  −3   1  4 ]            [ −1  0  1 ]
                                  [ −1   1   2  2 ]            [  0  1  1 ]

        [ 1  −2  3  −1 ]          [ 1  −1  2  1 ]           [  i    1−i   i  0 ]
iv) D = [ 2  −1  2   2 ],  v) E = [ 2  −1  1  1 ],  vi) F = [  1    −2    0  i ]
        [ 3   1  2   3 ]          [ 1  −2  1  1 ]           [ 1−i  −1+i   1  1 ]

2. Are the following pairs of matrices row equivalent?

       [ 1   0  −1 ]           [ 3  −1  1 ]
i) A = [ 2   1   0 ],  and B = [ 0   2  1 ]
       [ 1  −1   1 ]           [ 1  −1  1 ]

        [  1  −1   1  2 ]           [  0  −1   2  3 ]
ii) C = [ −2   3   0  1 ],  and D = [  1   2  −1  0 ]
        [  1   0  −1  3 ]           [ −2  −5   4  3 ]

3. True or false: similar matrices have the same rank

2.9 Applications of Matrices to Systems of Linear Equations


Consider the system of m linear equations in n unknowns x1, x2, ..., xn:

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
..............................................
am1 x1 + am2 x2 + ... + amn xn = bm

We can write the above system in matrix form as AX = B, where

      [ a11  a12  ...  a1n ]        [ x1 ]            [ b1 ]
  A = [ a21  a22  ...  a2n ],   X = [ x2 ],   and B = [ b2 ]
      [ ...  ...  ...  ... ]        [ .. ]            [ .. ]
      [ am1  am2  ...  amn ]        [ xn ]            [ bm ]

In solving the system we row reduce the augmented matrix


 
          [ a11  a12  ...  a1n | b1 ]
  (A|B) = [ a21  a22  ...  a2n | b2 ]
          [ ...  ...  ...  ... | .. ]
          [ am1  am2  ...  amn | bm ]

to echelon form by elementary row operations, to get the equivalent system

            [ a′11  a′12  ...  a′1n | b′1 ]
  (A′|B′) = [  0    a′22  ...  a′2n | b′2 ]
            [ ...   ...   ...  ...  | ..  ]
            [  0     0    ...  a′mn | b′m ]

The matrices A = (aij)m×n and (A|B) = (aij | bi)m×(n+1) are called the coefficient matrix
and augmented matrix, respectively, of the system of linear equations.

Definition 2.9.1. An n-tuple (x1 , x2 , ..., xn ) which satisfies each of the m equations in the
system is called a solution of the system. Two systems of equations are equivalent if every
solution of one system is a solution of the other system and vice versa. A system of linear
equations is called a homogeneous system if bi = 0(i = 1, 2, ..., m). A system with at least
one solution is called a consistent system. The solution (0, 0, ..., 0) of a homogeneous
system is called the trivial solution.

Theorem 2.9.1. If (A′|B′) = (a′ij | b′i) is an m×(n+1) matrix obtained from the m×(n+1)
matrix (A|B) by an elementary row operation, then the systems

  ai1 x1 + ai2 x2 + ... + ain xn = bi   (i = 1, 2, ..., m)

and

  a′i1 x1 + a′i2 x2 + ... + a′in xn = b′i   (i = 1, 2, ..., m)

are equivalent.

We can summarise the important consequences of theorem 2.9.1 as follows:

i) if rank(A) ≠ rank(A|B), then the system is inconsistent;
ii) if rank(A) = rank(A|B), three situations arise:
a) if the number of pivots equals the number of unknowns (rank = n) and bi = 0, then the
zero solution is the only solution;
b) if the number of pivots equals the number of unknowns (rank = n) and bi ≠ 0, then the
system has a unique solution;
c) if the number of pivots is less than the number of unknowns (rank < n), then the system
has more than one solution.

The above four situations are illustrated in the matrices below.

  [ a′11  a′12  ...  a′1n | b′1 ]
  [  0    a′22  ...  a′2n | b′2 ]
  [  .     .    ...   .   |  .  ]
  [  0     0    ...   0   | b′m ]

rank(A) ≠ rank(A|B): inconsistent system, no solution.
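The rank comparison is easy to carry out numerically. A minimal sketch using NumPy's `matrix_rank` (the helper name `classify` and the tiny example systems are ours, for illustration only):

```python
import numpy as np

def classify(A, b):
    """Classify the system AX = b by comparing rank(A) with rank(A|b)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    r = np.linalg.matrix_rank(A)
    r_aug = np.linalg.matrix_rank(np.hstack([A, b]))
    n = A.shape[1]                      # number of unknowns
    if r != r_aug:
        return "inconsistent"
    return "unique solution" if r == n else "many solutions"

# x + y = 2,  x - y = 0   ->  unique solution (x = y = 1)
print(classify([[1, 1], [1, -1]], [2, 0]))
# x + y = 2,  2x + 2y = 5 ->  inconsistent
print(classify([[1, 1], [2, 2]], [2, 5]))
# x + y = 2,  2x + 2y = 4 ->  many solutions
print(classify([[1, 1], [2, 2]], [2, 4]))
```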


 0 0 0 0
  0 0 0 0
  0 0 0 0

a11 a12 · · · a1n b1 a11 a12 . . . a1n b1 a11 a12 · · · a1n b1
 0 a022 · · · a02n b02   0 a022 . . . a02n b02   0 a022 · · ·
     
     0 0.
 .
 . ··· . .
 .
 . ··· . .
 .
 . ··· . .
0 0
0 0 · · · amn 0 0 0 . . . amn bm 0 0 ··· 0 0

i) Trivial Solution ii) Unique Solution iii) More than one Solution

The solution options are summarised in the flowchart of Figure 2.1: reduce the augmented
matrix to echelon form; if rank(A) ≠ rank(A|B) the system is inconsistent (no solution);
otherwise the system is consistent, and if there are fewer equations than unknowns it has
many solutions, while if the number of equations equals the number of unknowns it has the
trivial solution when b = 0 and a unique solution when b ≠ 0.

Figure 2.1: Solution Options for a System of Linear Equations

Exercises 2.9


1. Using the methods of this section

a) Solve the following systems of homogeneous equations

   i)  x +  y −  z = 0       ii)  x +  y −  z = 0      iii)  4w +  x + 2y − 3z = 0
      2x − 3y +  z = 0           2x + 4y −  z = 0            −7w + 2x − 3y + 5z = 0
       x − 4y + 2z = 0           3x + 2y + 2z = 0             8w + 5x + 6y + 9z = 0

b) Solve each of the following systems of nonhomogeneous equations

   i)  x + 2y − 4z = −4       ii)   x + 2y − 3z = −1     iii)  x + 2y −  3z = 1
      2x + 5y − 9z = −10          −3x +  y + 2z = −7          2x + 5y −  8z = 4
      3x − 2y + 3z = 11            5x + 3y − 4z =  2          3x + 8y − 13z = 7

2. Determine the values of λ for which the following systems of equations are consistent, and
for those values of λ find the complete solutions.

   i)  5x + 2y −  z = 1          ii)  x + 5y + 3 = 0
       2x + 3y + 4z = 7              5x +  y − λ = 0
       4x − 5y + λz = λ − 5           x + 2y + λ = 0

2.10 Elementary Matrices


Definition 2.10.1. An n×n matrix is called an elementary matrix if it can be obtained by
applying an elementary row operation to the identity matrix In.

Thus there are three types of elementary matrices, corresponding to the three elementary
row operations:

i) Ri ↔ Rj: e.g. applying R1 ↔ R2 to I3 gives

       [ 0  1  0 ]
  E1 = [ 1  0  0 ]
       [ 0  0  1 ]

ii) Ri → kRi: e.g. applying R2 → kR2 to I3 gives

       [ 1  0  0 ]
  E2 = [ 0  k  0 ]
       [ 0  0  1 ]

iii) Ri → kRj + Ri: e.g. applying R2 → R2 + kR3 to I3 gives

       [ 1  0  0 ]
  E3 = [ 0  1  k ]
       [ 0  0  1 ]
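Left-multiplying an arbitrary matrix by an elementary matrix performs the corresponding row operation on it. A small NumPy sketch (the names E1, E2, E3 follow the examples above; the test matrix A is ours):

```python
import numpy as np

I = np.eye(3)

# Type (i): E1 is the identity with rows 1 and 2 interchanged.
E1 = I[[1, 0, 2]]
# Type (ii): E2 is the identity with row 2 multiplied by k.
k = 5.0
E2 = np.diag([1.0, k, 1.0])
# Type (iii): E3 is the identity with k times row 3 added to row 2.
E3 = np.eye(3)
E3[1, 2] = k

A = np.arange(9.0).reshape(3, 3)

# E1 @ A swaps rows 1 and 2 of A.
assert np.allclose(E1 @ A, A[[1, 0, 2]])
# E2 @ A multiplies row 2 of A by k.
scaled = A.copy(); scaled[1] *= k
assert np.allclose(E2 @ A, scaled)
# E3 @ A adds k times row 3 of A to row 2.
added = A.copy(); added[1] += k * added[2]
assert np.allclose(E3 @ A, added)
```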

2.11 Application to Finding the Inverse of an n×n Matrix


 
1 3 3
Find the inverse of the matrix A = 1 4 3
 

1 3 4


First form the (block) matrix M = [A | I] and row reduce M to echelon form:

      [ 1  3  3 | 1  0  0 ]
  M = [ 1  4  3 | 0  1  0 ]
      [ 1  3  4 | 0  0  1 ]

Applying R2 → R2 − R1 and R3 → R3 − R1 gives

  [ 1  3  3 |  1  0  0 ]
  [ 0  1  0 | −1  1  0 ]
  [ 0  0  1 | −1  0  1 ]

In echelon form, the left half of M is in triangular form with no zero entry on the diagonal;
hence, A has an inverse. Next we further row reduce M to its row canonical form: applying
R1 → R1 − 3R2 and then R1 → R1 − 3R3 gives

  [ 1  0  0 |  7  −3  −3 ]
  [ 0  1  0 | −1   1   0 ]
  [ 0  0  1 | −1   0   1 ]

The identity matrix is now in the left half of the final matrix; hence, the right half is A⁻¹.
In other words,

        [  7  −3  −3 ]
  A⁻¹ = [ −1   1   0 ]
        [ −1   0   1 ]
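The hand computation can be confirmed numerically; a minimal sketch using NumPy's built-in inverse:

```python
import numpy as np

A = np.array([[1., 3., 3.],
              [1., 4., 3.],
              [1., 3., 4.]])

A_inv = np.linalg.inv(A)

# Matches the row-reduction result above.
expected = np.array([[ 7., -3., -3.],
                     [-1.,  1.,  0.],
                     [-1.,  0.,  1.]])
assert np.allclose(A_inv, expected)
# Check the defining property of the inverse.
assert np.allclose(A @ A_inv, np.eye(3))
```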

Exercises 2.10

Find inverses (if they exist) of the following matrices

    [ 1  2   4 ]        [ 1  2   3 ]         [ 1   0  1 ]
i)  [ 0  1  −1 ],  ii)  [ 1  3   5 ],  iii)  [ 2  −1  1 ]
    [ 1  0   2 ]        [ 1  5  12 ]         [ 1   2  1 ]

2.12 Elementary Column Operations and Equivalent Matrices


Definition 2.12.1. Definition 2.8.1 (elementary row operations and row equivalence of
matrices) can be adapted to define elementary column operations on a matrix and column
equivalence of matrices.
If an m×n matrix A is column equivalent to an m×n matrix B, then there exists an invertible
n×n matrix Q such that B = AQ.
If A is an m×n matrix, then there exist an invertible m×m matrix P and an invertible n×n
matrix Q such that PAQ = N, where N is of the form

      [ I_r×r          0_r×(n−r)     ]
  N = [                              ]
      [ 0_(m−r)×r      0_(m−r)×(n−r) ]

where I_r×r is the r×r identity matrix and 0_p×q is the zero p×q matrix.

 
                        [ 2  −1  0   1 ]
Example 2.12.1. Let A = [ 1   2  1  −1 ]
                        [ 2   9  4  −5 ]

How do we find P?

Answer:

Reduce (A | I3) to row canonical form by elementary row operations. Applying
R1 → R1 − 2R2 and R3 → R3 − 2R2 gives

  [ 0  −5  −2   3 | 1  −2  0 ]
  [ 1   2   1  −1 | 0   1  0 ]
  [ 0   5   2  −3 | 0  −2  1 ]

then R3 → R3 + R1 and R1 → −(1/5)R1 give

  [ 0  1  2/5  −3/5 | −1/5  2/5  0 ]
  [ 1  2   1    −1  |   0    1   0 ]
  [ 0  0   0     0  |   1   −4   1 ]

and finally R1 ↔ R2 and R1 → R1 − 2R2 give

  [ 1  0  1/5   1/5 |  2/5  1/5  0 ]
  [ 0  1  2/5  −3/5 | −1/5  2/5  0 ]
  [ 0  0   0     0  |   1   −4   1 ]

            [  2/5  1/5  0 ]             [ 1  0  1/5   1/5 ]
Hence P =   [ −1/5  2/5  0 ]  and  PA =  [ 0  1  2/5  −3/5 ]
            [   1   −4   1 ]             [ 0  0   0     0  ]

To find Q we apply elementary column operations to the block matrix formed by PA on top
of I4, reducing the PA part to normal form. Applying C3 → C3 − (1/5)C1,
C4 → C4 − (1/5)C1, C3 → C3 − (2/5)C2 and C4 → C4 + (3/5)C2 gives

  [ 1  0    0     0  ]
  [ 0  1    0     0  ]
  [ 0  0    0     0  ]
  [ 1  0  −1/5  −1/5 ]
  [ 0  1  −2/5   3/5 ]
  [ 0  0    1     0  ]
  [ 0  0    0     1  ]

            [ 1  0  −1/5  −1/5 ]                [ 1  0  0  0 ]
and if Q =  [ 0  1  −2/5   3/5 ],  then  PAQ =  [ 0  1  0  0 ].
            [ 0  0    1     0  ]                [ 0  0  0  0 ]
            [ 0  0    0     1  ]

Definition 2.12.2 (Equivalent Matrices). Two m×n matrices A and B are equivalent if there
exist invertible matrices P and Q such that PAQ = B.
If PAQ = B, then A = P⁻¹BQ⁻¹,
and if PAQ = B and P′BQ′ = C, then (P′P)A(QQ′) = C.
Equivalently, every m×n matrix A is equivalent to a matrix of the simple form N above, where
r denotes the number of nonzero rows in the reduced echelon form of A. N may be regarded
as the canonical form of A under this equivalence relation; N is called the normal form of A.



Chapter 3: Determinants
Outputs

Given an n-square matrix A, ability to:

1) find the characteristic equation, and the determinant of A, det(A)

2) find the minors, cofactors and adjoint of A, Adj(A)

3) use the adj(A) and det(A) to find the inverse of A

4) Apply determinants to find solutions of linear equations (Cramer’s rule)

3.1 Definition of Determinants


Each n-square matrix A = (aij) is assigned a special scalar called the determinant of A,
denoted by det(A) or |A| or

  | a11  a12  ...  a1n |
  | a21  a22  ...  a2n |
  | ...  ...  ...  ... |
  | an1  an2  ...  ann |

Determinants of orders 1 and 2 are defined as follows:

                        | a11  a12 |
  | a11 | = a11   and   |          | = a11 a22 − a12 a21.
                        | a21  a22 |

Thus, the determinant of a 1×1 matrix A = (a11) is the scalar a11; that is, det(A) = a11.
The determinant of order two may easily be remembered as the product along the main
diagonal (taken with a plus sign) minus the product along the other diagonal (taken with a
minus sign).

Determinants of n×n matrices with n > 2 are calculated through a process of reduction and
expansion using minors and cofactors.

3.2 Minors and Cofactors of a Square Matrix


Let A be an n-square matrix A = (aij).
Let Mij be the (n−1)-square submatrix of A obtained by deleting the ith row and jth column.

The minor of aij is the determinant |Mij|.


 
                    [ −3   4  0 ]
Example 3.2.1.  A = [ −2   7  6 ].
                    [  5  −8  0 ]

Minors of elements a11, a12, a13 of A are:

          |  7  6 |
  |M11| = |       | = (7)(0) − (−8)(6) = 48
          | −8  0 |

          | −2  6 |
  |M12| = |       | = (−2)(0) − (5)(6) = −30
          |  5  0 |

          | −2   7 |
  |M13| = |        | = (−2)(−8) − (5)(7) = −19.
          |  5  −8 |

The cofactor of the element aij of A, denoted Aij, is defined in terms of its associated minor
as Aij = (−1)^(i+j) |Mij|.

The cofactors of the elements a11, a12 and a13 above are:
A11 = (−1)^(1+1)(48) = 48
A12 = (−1)^(1+2)(−30) = 30
A13 = (−1)^(1+3)(−19) = −19

The determinant is calculated as the sum:

det A = a11 A11 + a12 A12 + a13 A13 = (−3)(48) + (4)(30) + (0)(−19) = −24

This represents expansion along a row (row 1).

Note that the signs (−1)^(i+j) accompanying the minors form a checkerboard pattern with
+'s along the main diagonal:

  [ +  −  +  ... ]
  [ −  +  −  ... ]
  [ +  −  +  ... ]
  [ ...          ]

We can as well choose to expand along a column of A; e.g. along column 1 we get
det A = (−3)(48) + (−2)(0) + (5)(24) = −24

Choosing to expand along a row or column having many zeros, if one exists (e.g. column 3
above), greatly reduces the number of computations required to compute det A. Such zeros
can be produced by reducing the matrix with elementary row/column operations; see
section 3.5.1 below.
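Cofactor (Laplace) expansion along the first row translates directly into a short recursive routine. A plain-Python sketch (the function name `det` is ours), checked against Example 3.2.1:

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 1 and column j+1.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        # (-1)**j carries the checkerboard sign (-1)^(1+(j+1)).
        total += (-1) ** j * M[0][j] * det(minor)
    return total

A = [[-3,  4, 0],
     [-2,  7, 6],
     [ 5, -8, 0]]
print(det(A))   # -24, as computed above
```

This runs in O(n!) time, so for large matrices the row-reduction methods of section 3.5 are far more efficient; the sketch is only meant to mirror the definition.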

3.3 Properties of Determinants


The following properties of determinants will be useful in their evaluation:
i) if A and B are square matrices of the same order, then |AB| = |A||B|
ii) a matrix A and its transpose have the same determinant: |A| = |Aᵀ|
iii) if we swap two rows (columns) of A, the determinant changes sign
iv) if we multiply a row (column) of A by a real number α, the determinant is multiplied by α
v) if any row (column) of a determinant consists only of zeros, the determinant is 0
vi) if a determinant has two equal rows or two equal columns, the determinant is 0
vii) if a determinant has two proportional rows or two proportional columns, the determinant
is 0
viii) if A is lower or upper triangular (i.e. has zeros above or below the main diagonal), then
|A| is the product of the diagonal elements; in particular |I| = 1, where I is the identity matrix
ix) if A has an inverse, then |A⁻¹| = 1/|A|.

3.4 Major Properties of Determinants


The following are two of the most important and useful theorems on determinants, which we
state as theorems:
a) Property i) above

Theorem 3.4.1. The determinant of a product of two matrices A and B is the product of
their determinants; that is,
det(AB) = det(A)det(B)

The above theorem says that the determinant is a multiplicative function.

and
b) property ix)

Theorem 3.4.2. Let A be a square matrix. Then the following are equivalent:
i) A is invertible; that is, A has an inverse A⁻¹.
ii) AX = 0 has only the zero solution.
iii) The determinant of A is not zero; that is, det(A) ≠ 0.

Thus using the above properties, we can evaluate determinants by the methods given in the
following sections.

3.5 Evaluation of Determinants


3.5.1 Determinants by Elementary Row/Column Operations
Recall the notation used for elementary row operations:
Ri means row i; Ci means column i
Ri ↔ Rj means: swap rows Ri and Rj (the value of the determinant changes sign)
Ri → Ri − Rj means: replace row Ri by Ri − Rj (the value of the determinant does not
change)
Ci → Ci − Cj means: replace column Ci by Ci − Cj (the value of the determinant does not
change)

Example 3.5.1. i) Find the determinant of

      | x  2m  1 |
  A = | 3   1  1 |
      | x   m  1 |

Solution

Applying R1 → R1 − R3 gives

  | 0  m  0 |
  | 3  1  1 | = −m(3 − x) = m(x − 3).
  | x  m  1 |

ii) Find the determinant of

      | 1  2  3 |
  B = | 4  5  6 |
      | 7  8  9 |

Solution

Applying R1 → R1 − R2 and R2 → R2 − R3 gives

  | −3  −3  −3 |
  | −3  −3  −3 | = 0,   since two rows are equal.
  |  7   8   9 |

iii) Find the determinant of

      | 1  1+m  −1 |
  C = | 3  3+m  −3 |
      | 5   m   −1 |

Solution

Applying C1 → C1 + C3 gives

  | 0  1+m  −1 |
  | 0  3+m  −3 | = 4{(1 + m)(−3) − (−1)(3 + m)} = 4{−3 − 3m + 3 + m} = 4(−2m) = −8m.
  | 4   m   −1 |

3.5.2 Determinants by Factorization

Recall that if we multiply a row of a matrix A by a real number α, the determinant |A|
changes to α|A|. This means that if the elements of a row of a determinant have a common
factor, we can take this factor out in front of the determinant. The same property holds for
the elements of a column.

Example 3.5.2. i) Find the determinant of

      |  1    3  −2 |
  A = |  2    4   5 |
      | −4  −12   8 |

Solution

Factoring −4 from row 3 gives

           | 1  3  −2 |
  |A| = −4 | 2  4   5 | = 0,   since two rows are equal.
           | 1  3  −2 |

ii) Find the determinant of

      | a  a²  a³ |
  B = | b  b²  b³ |
      | c  c²  c³ |

Solution

Factoring a, b, c from the three rows gives

            | 1  a  a² |
  |B| = abc | 1  b  b² |
            | 1  c  c² |

Applying R1 → R1 − R2 and R2 → R2 − R3, then factoring (a − b) from row 1 and (b − c)
from row 2, gives

                      | 0  1  a+b |
  abc (a − b)(b − c)  | 0  1  b+c |
                      | 1  c   c² |

Expanding along the first column, |B| = abc (a − b)(b − c)(c − a).

3.5.3 Determinants by Pivotal Condensation

(easy to program for computer implementation)
Steps to calculate |A|:
1. Initialise D = 1 (a scalar to record changes in |A| resulting from elementary row
operations).
2. Use elementary row operations to reduce A to row echelon form:
• put D = −D each time two rows are interchanged
• put D = (1/k)D each time a row is multiplied by k
• no change to D for elementary row operations of the third kind.
3. |A| = product of D and all diagonal elements of the row echelon form of A.

                                           [ 0  2  2 ]
Example 3.5.3. Find the determinant of A = [ 1  0  3 ]
                                           [ 2  1  1 ]

Solution

D = 1

R1 ↔ R2:        [ 1  0  3 ]
                [ 0  2  2 ]    D → (−1)D = −1
                [ 2  1  1 ]

R3 → R3 − 2R1:  [ 1  0   3 ]
                [ 0  2   2 ]   no change to D
                [ 0  1  −5 ]

R2 → (1/2)R2:   [ 1  0   3 ]
                [ 0  1   1 ]   D → 2D = 2(−1) = −2
                [ 0  1  −5 ]

R3 → R3 − R2:   [ 1  0   3 ]
                [ 0  1   1 ]   no change to D
                [ 0  0  −6 ]

R3 → −(1/6)R3:  [ 1  0  3 ]
                [ 0  1  1 ]    D → (−6)D = (−6)(−2) = 12
                [ 0  0  1 ]

The diagonal elements of the row echelon matrix are all 1, so |A| = D(1)(1)(1) = 12.
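The steps above are straightforward to program. A sketch using exact `Fraction` arithmetic (the function name is ours); it scales each pivot row to 1 and records the factors in D, exactly as in the example:

```python
from fractions import Fraction

def det_by_condensation(M):
    """Reduce M to triangular form with unit pivots, tracking the factor D."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    D = Fraction(1)
    for col in range(n):
        # Find a pivot row; a swap changes the sign of D.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return 0                     # a zero column below the diagonal: |A| = 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            D = -D
        # Scale the pivot row so the pivot becomes 1; D absorbs the factor.
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        D *= p
        # Third-kind operations leave the determinant unchanged.
        for r in range(col + 1, n):
            f = A[r][col]
            A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return D                             # all diagonal entries are 1, so |A| = D

print(det_by_condensation([[0, 2, 2], [1, 0, 3], [2, 1, 1]]))   # 12
```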

3.5.4 Calculating 3×3 Determinants

(Not to be used for determinants of order 4 or higher.) Repeat the first two columns of the
determinant to the right of it. The determinant is equal to the sum of the products along the
diagonals parallel to the main diagonal minus the sum of the products along the diagonals
that run from bottom left to top right.

                                           [ 2   3  −4 ]
Example 3.5.4. Find the determinant of A = [ 0  −4   2 ]
                                           [ 1  −1   5 ]

Solution

  | 2   3  −4 |  2   3
  | 0  −4   2 |  0  −4
  | 1  −1   5 |  1  −1

|A| = (2)(−4)(5) + (3)(2)(1) + (−4)(0)(−1) − (1)(−4)(−4) − (−1)(2)(2) − (5)(0)(3)

    = (−40 + 6 + 0) − (16 − 4 + 0)

    = −34 − 12 = −46

Exercises 3.5

1. Using the methods in the preceding sections, evaluate the following determinants

    |  2  1  2 |        | 2  3  1 |         | a  b  c |        | 1/2   −1  −1/3 |
i)  | −1  1  5 |,  ii)  | 1  0  1 |,  iii)  | c  a  b |,  iv)  | 3/4  1/2   −1  |
    | −1  2  3 |        | 3  3  2 |         | b  c  a |        |  1    −4    1  |

    | t+3   −1    1  |        | a  b+c  a² |
v)  |  5   t−3    1  |,  vi)  | b  c+a  b² |
    |  6    −6   t+4 |        | c  a+b  c² |

2. Solve the equation

  | a−x  b−x   c  |
  | a−x   c   b−x | = 0
  |  a   b−x  c−x |

3.6 Classical Adjoint

(and application to matrix inversion)
Let A = (aij) be an n×n matrix over a field K and let Aij denote the cofactor of aij. The
classical adjoint of A, denoted by adj A, is the transpose of the matrix of cofactors of A.
Namely, adj A = (Aij)ᵀ.

                                  [ 1  2  3 ]
Example 3.6.1. For the matrix A = [ 2  3  2 ]
                                  [ 3  3  4 ]

A11 = 6, A12 = −2, A13 = −3, A21 = 1, A22 = −5, A23 = 3, A31 = −5,
A32 = 4, A33 = −1

                                   [  6  −2  −3 ]                          [  6   1  −5 ]
The matrix of cofactors is (Aij) = [  1  −5   3 ], and the adjoint is adj A = [ −2  −5   4 ].
                                   [ −5   4  −1 ]                          [ −3   3  −1 ]

Theorem 3.6.1. Let A be any square matrix. Then A(adj A) = (adj A)A = |A|I, where I is
the identity matrix. Thus, if |A| ≠ 0, A⁻¹ = (1/|A|)(adj A).

Example 3.6.2. For the matrix A above,

det(A) = −7. Thus A has an inverse by theorem 3.4.2, and by theorem 3.6.1 the inverse is

               [  6   1  −5 ]
A⁻¹ = −(1/7)   [ −2  −5   4 ]
               [ −3   3  −1 ]
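The cofactor, adjoint and inverse computations can be scripted directly from the definitions. A plain-Python sketch with exact fractions (the helper names `minor`, `det`, `adjoint` are ours), checked against Examples 3.6.1 and 3.6.2:

```python
from fractions import Fraction

def minor(M, i, j):
    """Submatrix of M with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def adjoint(M):
    """Transpose of the matrix of cofactors."""
    n = len(M)
    cof = [[(-1) ** (i + j) * det(minor(M, i, j)) for j in range(n)] for i in range(n)]
    return [list(row) for row in zip(*cof)]

A = [[1, 2, 3], [2, 3, 2], [3, 3, 4]]
print(adjoint(A))   # [[6, 1, -5], [-2, -5, 4], [-3, 3, -1]]
print(det(A))       # -7

# A^-1 = (1/|A|) adj A, by theorem 3.6.1
d = det(A)
A_inv = [[Fraction(x, d) for x in row] for row in adjoint(A)]
```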

Exercises 3.6

           [ 1  1  1 ]
1. Let A = [ 2  3  4 ].  Find a) adj A, b) A⁻¹ using adj A.
           [ 5  8  9 ]

                                            [ 1  t   0 ]
2. For what values of t will the matrix A = [ 0  1  −1 ] be noninvertible? For all other
                                            [ t  0   1 ]
values of t find the inverse.

                                    [ x+1    0    −1  ]
3. Find the adjoint of A, where A = [  0   x+1   −2  ]
                                    [  1    1    x−2 ]

                            [  1  −a  b ]
4. Show that the matrix A = [  a   1  2 ] is invertible for any real values of a and b.
                            [ −b   0  1 ]


3.7 Applications to Solutions of Linear Equations

(Cramer's Rule) Consider the system of m linear equations in n unknowns x1, x2, ..., xn:

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
..............................................
am1 x1 + am2 x2 + ... + amn xn = bm

Let ∆ denote the determinant of the matrix A = (aij) of coefficients: ∆ = |A|.

Let ∆i denote the determinant of the matrix obtained by replacing the ith column of A by the
column of constant terms (the b's).

Theorem 3.7.1 (Cramer's Rule). The system of linear equations AX = b has a unique
solution if and only if |A| ≠ 0, and the unique solution is

  x1 = ∆1/∆,  x2 = ∆2/∆,  ...,  xn = ∆n/∆.

The above theorem applies only when there are the same number of equations as
unknowns, and gives a solution when ∆ ≠ 0.
If ∆ = 0 the theorem does not say whether a solution exists, except in the case of a
homogeneous system of equations.

Theorem 3.7.2. The system of homogeneous linear equations AX = 0 has a nonzero
solution if and only if ∆ = |A| = 0.
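Cramer's rule is easy to implement with a determinant routine. A NumPy sketch (the helper `cramer` is ours, and the small example system is illustrative, not taken from the exercises):

```python
import numpy as np

def cramer(A, b):
    """Solve AX = b by Cramer's rule; requires det(A) != 0 (theorem 3.7.1)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                    # replace column i by the constants
        x[i] = np.linalg.det(Ai) / d    # x_i = Delta_i / Delta
    return x

# 2x + y = 5,  x - y = 1  has the unique solution x = 2, y = 1.
print(cramer([[2, 1], [1, -1]], [5, 1]))
```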

Exercises 3.7

1. Solve the following systems of equations using determinants

   a)  x +  y +  z =  5        b)  2x + 3y = z + 1
       x − 2y − 3z = −1            3x + 2z = 8 − 5y
      2x +  y −  z =  3            3z − 1 = x − 2y

2. Use determinants to find those values of k for which the system

   kx +  y +  z = 1
    x + ky +  z = 1
    x +  y + kz = 1

has a) a unique solution, b) more than one solution, c) no solution.

Note: Use Gaussian elimination to determine values of k for which the system has more
than one solution or has no solution.



Chapter 4: Vector Spaces
Outputs
1. Understanding of the concept of a vector space, and ability to determine whether a given
set is a vector space

2. Ability to:

i) determine linear dependence/independence of vectors

ii) find a basis and dimension of a vector space

4.1 Review of Spatial Vectors


(Vectors in Rn and Cn)

1. We consider vectors with tail at the origin. In this case the vectors are completely
determined by their components, i.e. we write v = [a, b, c] meaning that the 3 components
of v are a, b, c, or v = ai + bj + ck.
In other words, if v is a vector with tail at the origin, the components of v are the same as
the co-ordinates of the head of v.

Similarly, if a point P has co-ordinates (a, b, c) we say P = (a, b, c).


So we may think of a triple as a point or a vector in 3D-space.

Thus every vector in 1D, 2D, or 3D-space has two aspects:


i) it is a geometric object
ii) it is a 1, 2 or 3-tuple of numbers

2. Two vectors, u and v, are equal, written u = v , if they have the same number of
components and if the corresponding components are equal.

Figure 4.1: a) Position Vector b) Vector Equality c) Negatives and Combinations

Example 4.1.1. If (x − y, x + y, z − 1) = (4, 2, 3)

we must have x − y = 4, x + y = 2, z − 1 = 3.

Hence we can solve for x, y, z to get x = 3, y = −1, z = 4.

3. There are two important operations we can carry out on vectors. These are i) addition, and
ii) scalar multiplication.

i) addition
The vector u+v can be obtained by placing the initial point of v on the terminal point of u
and joining the initial point of u to the terminal point of v. This is called the parallelogram
law of vector addition, i.e. it is the diagonal of the parallelogram formed by u and v.
If (a, b, c) and (a′, b′, c′) are the endpoints of the vectors u and v, then (a + a′, b + b′, c + c′)
is the endpoint of the vector u + v.

ii) scalar multiplication


The product ku of a vector u by a scalar k is obtained by multiplying the magnitude of u by
k and retaining the same direction if k > 0 or the opposite direction if k < 0. Also, if (a, b, c)
is the endpoint of the vector u, then (ka, kb, kc) is the endpoint of the vector ku
The scalar product of u by a scalar k, written ku, is the vector obtained by multiplying each
component of u by k. That is, ku = (ka1 , ka2 , . . . , kan ).

Note that u+v and ku are also vectors in Rn .


The sum of vectors with different numbers of components is not defined.

Example 4.1.2. Find

a) 2u − 3v , if u = (2, −3, 6) and v = (8, 2, −3)

b) (1 − 2i)u + (3 + i)v , if u = (3 − 2i, 4i, 1 + 6i) and v = (5 + i, 2 − 3i, 5)

Solution

a) for 2u − 3v we first carry out the multiplication

2u − 3v = 2(2, −3, 6) − 3(8, 2, −3) = (4, −6, 12) − (24, 6, −9)

then the addition

= (4 − 24, −6 − 6, 12 + 9) = (−20, −12, 21)

b) for (1 − 2i)u + (3 + i)v

(1 − 2i)u + (3 + i)v = (1 − 2i)(3 − 2i, 4i, 1 + 6i) + (3 + i)(5 + i, 2 − 3i, 5)

= ((1 − 2i)(3 − 2i), (1 − 2i)4i, (1 − 2i)(1 + 6i)) + ((3 + i)(5 + i), (3 + i)(2 − 3i), (3 + i)5)

= (−1 − 8i, 8 + 4i, 13 + 4i) + (14 + 8i, 9 − 7i, 15 + 5i)


= (13, 17 − 3i, 28 + 9i)

Negatives and subtraction are defined in Rn as follows: −u = (−1)u and u − v = u + (−v)


The vector −u is called the negative of u, and u − v is called the difference of u and v .

Now suppose we are given vectors u1, u2, ..., um in Rn and scalars k1, k2, ..., km in R.
We can multiply the vectors by the corresponding scalars and then add the resulting scalar
products to form the vector v = k1 u1 + k2 u2 + ... + km um.
Such a vector v is called a linear combination of the vectors u1, u2, ..., um.

Example 4.1.3. i) Find x and y , if (4, y) = x(2, 3)

We must have (4, y) = (2x, 3x)

4=2x, or x=2; and y=3x, or y=6

ii) Find x, y, z if (2, −3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)

first multiply by the scalars x, y and z and then add

(2, −3, 4) = (x, x, x) + (y, y, 0) + (z, 0, 0)

(2, −3, 4) = (x + y+ z, x + y, x) set the corresponding components equal to each other
x + y + z = 2

to get the system x + y = −3

x =4

thus x=4, y=-7, z=5

(2, −3, 4) = 4(1, 1, 1) − 7(1, 1, 0) + 5(1, 0, 0)

Basic properties of vectors under the operations of vector addition and scalar multiplication
are given as

Theorem 4.1.1. For all vectors u, v, w and scalars s and t


i) u + v = v + u
ii) u + (v + w) = (u + v) + w
iii) (s + t)v = sv + tv
iv) (st)v = s(tv)
v) s(u + v) = su + sv
vi) v + (−1)v = 0
vii) v + 0 = v
viii) 1v = v

These can be proved in two ways:


i) translate everything to components and compare components on the left and right
ii) geometrically


4.2 Dot (or Inner) Product


Consider arbitrary vectors u and v in Rn ; say, u = (a1 , a2 , . . . , an ) and
v = (b1 , b2 , . . . , bn )
The dot product (or inner product or scalar product) of u and v is denoted < u, v >, and
defined by < u, v >= (a1 b1 + a2 b2 + . . . + an bn ), i.e.
< u, v > is obtained by multiplying corresponding components and adding the resulting
products.
The vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero,
that is, if < u, v >= 0, then u and v are orthogonal.

Basic properties of the dot product in Rn are given as

Theorem 4.2.1. For any vectors u, v, w in Rn and any scalar k

i) < (u + v), w > = < u, w > + < v, w >
ii) < (ku), v > = k(< u, v >)
iii) < u, v > = < v, u >
iv) < u, u > ≥ 0; and < u, u > = 0 iff u = 0.

Exercises 4.2

1. Find
a) < u, v >, and b) < v, u >, if
i) u = (1, 2, 4), v = (3, 5, 1)
ii) u = (5, 4, 1), v = (3, −4, 1)
iii) u = (3 − 2i, 4i, 1 + 6i), v = (5 + i, 2 − 3i, 7 + 2i)

2. Determine which of the following vectors are orthogonal:


u = (5, 4, 1), v = (3, −4, 1), w = (1, −2, 3)

3. Find ‖u‖ if u = (3, −12, −4)

4. Find k if u = (1, k, −2, 5) and ‖u‖ = √39

5. Find k so that the vectors u and v are orthogonal where


i) u = (1, k, −3) and v = (2, −5, k)
ii) u = (2, −3k, −4, 1, 5) and v = (6, −1, 3, 7, 2k)

4.3 Norm (or Length) of a Vector


The length or norm of a vector u in Rn, denoted ‖u‖, is defined to be the nonnegative
square root of < u, u >. In particular, if u = (a1, a2, ..., an), then

  ‖u‖ = √(< u, u >) = √(a1² + a2² + ... + an²)

Evidently ‖u‖ ≥ 0, and ‖u‖ = 0 iff u = 0.

A vector u is called a unit vector if ‖u‖ = 1 or, equivalently, if < u, u > = 1. For any nonzero


vector v in Rn, the vector v̂ = (1/‖v‖)v = v/‖v‖ is the unique unit vector in the same
direction as v. The process of finding v̂ from v is called normalizing v.

Example 4.3.1. Suppose u = (1, 2, −2, 4). To find û, we first find ‖u‖² = < u, u > by
squaring each component of u and adding, as follows:

‖u‖² = 1² + 2² + (−2)² + 4² = 1 + 4 + 4 + 16 = 25

Then ‖u‖ = √25 = 5 and û = (1/5, 2/5, −2/5, 4/5).

This is the unique unit vector in the same direction as u.
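Normalization is a one-line computation; a NumPy sketch (the helper `normalize` is ours) reproducing Example 4.3.1:

```python
import numpy as np

def normalize(v):
    """Return the unit vector v / ||v||; v must be nonzero."""
    n = np.linalg.norm(v)
    if n == 0:
        raise ValueError("cannot normalize the zero vector")
    return np.asarray(v, dtype=float) / n

u = np.array([1., 2., -2., 4.])
u_hat = normalize(u)
print(np.linalg.norm(u))    # 5.0
print(u_hat)                # (1/5, 2/5, -2/5, 4/5)
assert np.isclose(np.linalg.norm(u_hat), 1.0)
```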

The following result is known as the Schwarz inequality (or Cauchy–Schwarz inequality),
used in many branches of mathematics.

Theorem 4.3.1 (Schwarz). For any vectors u, v in Rn, |< u, v >| ≤ ‖u‖‖v‖

Proof. If u = 0 the relation holds, since both sides are zero.

If u ≠ 0, let w = v − (< v, u >/‖u‖²)u. Then

  < w, u > = < v, u > − (< v, u >/‖u‖²) < u, u > = 0

so that

  0 ≤ ‖w‖² = < w, w > = < w, v >                        (since < w, u > = 0)
           = < v, v > − (< v, u >/‖u‖²) < u, v >
           = ‖v‖² − |< u, v >|²/‖u‖²

i.e. |< u, v >|² ≤ ‖u‖²‖v‖², and taking square roots, |< u, v >| ≤ ‖u‖‖v‖. ∎

Using the above inequality, we can prove the triangle (or Minkowski) inequality.

Theorem 4.3.2 (triangle). For any vectors u, v ∈ Rn, ‖u + v‖ ≤ ‖u‖ + ‖v‖

Proof. Using |< u, v >| ≤ ‖u‖‖v‖,

  ‖u + v‖² = < u + v, u + v >
           = ‖u‖² + < u, v > + < v, u > + ‖v‖²
           ≤ ‖u‖² + |< u, v >| + |< v, u >| + ‖v‖²
           ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖²
           = (‖u‖ + ‖v‖)²

Taking square roots gives the result. ∎

The triangle inequality states that the length of one side of a triangle is at most the sum of
the lengths of the other two sides.


Figure 4.2: a) Cauchy–Schwarz inequality b) Triangle inequality

Figure 4.3: Distance Between Vectors u, v: u − v

4.4 Distance, Angles, Projections


The distance between vectors
u = (a1, a2, ..., an) and v = (b1, b2, ..., bn) in Rn is denoted and defined by

  d(u, v) = ‖u − v‖ = √((a1 − b1)² + (a2 − b2)² + ... + (an − bn)²)

The angle θ between nonzero vectors u, v in Rn is defined by

  cos θ = < u, v > / (‖u‖‖v‖)

Note that if < u, v > = 0, then θ = 90° (or θ = π/2).
This agrees with our previous definition of orthogonality.

The projection of a vector u onto a nonzero vector v is the vector denoted and defined by

  proj(u, v) = (< u, v >/‖v‖²) v = (< u, v >/< v, v >) v

Example 4.4.1. Let u = (1, −2, 3) and v = (2, 4, 5). Then

i) d(u, v) = √((1 − 2)² + (−2 − 4)² + (3 − 5)²) = √(1 + 36 + 4) = √41

ii) cos θ = (1·2 + (−2)·4 + 3·5) / (√(1 + 4 + 9) √(4 + 16 + 25)) = (2 − 8 + 15)/(√14 √45)
          = 9/(√14 √45)

iii) proj(u, v) = (9/45)(2, 4, 5) = (1/5)(2, 4, 5)


Exercises 4.4

1. Find x, y, z, w if

    [ x  y ]   [  x   6  ]   [   4    x + y ]
  3 [ z  w ] = [ −1  2w  ] + [ z + w    3   ]

2. a) Find i) 3u − 2v  ii) 5u + 3v − 4w, if

        [  1 ]       [ 2 ]       [  3 ]
   u =  [  3 ],  v = [ 1 ],  w = [ −2 ]
        [ −4 ]       [ 5 ]       [  6 ]

3. Suppose u and v are two non-zero vectors in R3 .


What does each of the following conditions imply about the linear independence or depen-
dence of the set {u, v}?
a) u = 3v
b) au + bv = 0 ⇒ a = b = 0
c) < u, v >= 0

4. Find a) < u, v >, b) < v, u >, c) ‖u‖, and d) ‖v‖ if


i) u = (1 − 2i, 3 + i), v = (4 + 2i, 5 − 6i)
ii) u = (1, −3, 4), v = (3, 4, 7)

5. Find k so that
a) u and v are orthogonal, if u = (3, k, −2), v = (6, −4, −3)

b) ‖u‖ = √39, where u = (1, k, −2, 5)

6. Normalize each vector: a) u = (5, −7), b) v = (1, 2, −2, 4), c) w = (1/2, 1/3, 3/4)

4.5 Vector Spaces


4.5.1 Introduction
Before we give the formal definition of a vector space, let us first recall some familiar
examples of vector spaces we have dealt with in section 4.1.

1. We dealt with R2 , a vector space of all vectors of the form (x, y), where x and y are real
numbers.
We will write the set of all vectors in R2 as V2 (R) = {(x, y)|x, y ∈ R}.
For instance, (−4, 3.5) is a vector in V2 (R).

We can add two vectors in V2 (R) by adding their components separately, thus for
instance (1, 2) + (3, 4) = (4, 6).
We can multiply a vector in V2 (R) by a scalar by multiplying each component separately,
thus for instance 3(1, 2) = (3, 6).
Among all the vectors in V2 (R) is the zero vector (0, 0).
Vectors in V2 (R) are used for many physical quantities in two dimensions; they can be
represented graphically by arrows in a plane, with addition represented by the
parallelogram law and scalar multiplication by scaling.


2. The vector space V3 (R) is the space of all vectors of the form (x, y, z), where x, y, z are
real numbers: V3 (R) = {(x, y, z)|x, y, z ∈ R}.
Addition and scalar multiplication proceed as in V2 (R):
e.g. (1, 2, 3) + (4, 5, 6) = (5, 7, 9), and 4(1, 2, 3) = (4, 8, 12).
Among all the vectors in V3 (R) is the zero vector (0, 0, 0).
Vectors in V3 (R) are used for many physical quantities in three dimensions, such as
velocity, momentum, current, electric and magnetic fields, force, acceleration, and
displacement; they can be represented by arrows in space.
However, addition of a vector in V2 (R) to a vector in V3 (R) is not defined; e.g.
(1, 2) + (3, 4, 5) doesn’t make sense.

3. One can similarly define the vector spaces V4 (R), V5 (R), etc.
Vectors in these spaces are not often used to represent physical quantities, and are more
difficult to represent graphically, but are useful for describing populations in biology,
portfolios in finance, or many other types of quantities which need several numbers to
describe them completely.
Instead of restricting ourselves to the field of real numbers R only, we can work over any
arbitrary field K (e.g. the field of complex numbers C or the rational numbers Q).

4.6 Definition of a Vector Space


Definition 4.6.1. A set V is called a vector space over the field K, or a K-space, if:
a) V is closed under a binary operation called addition (+), which takes two vectors v and w
∈ V and returns another vector v + w ∈ V

b) V is closed under scalar multiplication, which takes a scalar k ∈ K and a vector v ∈ V ,

and returns another vector kv ∈ V
Furthermore, for V to be a vector space, the following properties must be satisfied:
A1. (u + v) + w = u + (v + w). (Addition is associative)
A2. u + v = v + u. (Addition is commutative)
A3. For any u ∈ V, u + 0 = 0 + u = u. (Additive identity)
A4. For each u ∈ V , u + (−u) = (−u) + u = 0. (Additive inverse)

M1. k(u + v) = ku + kv, for any scalar k ∈ K. (Scalar multiplication distributes over vector addition)

M2. (a + b)u = au + bu, for any scalars a, b ∈ K. (Scalar multiplication distributes over scalar addition)
M3. (ab)u = a(bu), for any scalars a, b ∈ K. (Scalar multiplication is associative)
M4. 1u = u, for the unit scalar 1 ∈ K. (Multiplicative identity)
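The eight axioms can be illustrated numerically for R³ with componentwise operations. The sketch below (our own helper names `add` and `smul`) checks each axiom on sample vectors and scalars; this illustrates the definitions but is of course not a proof:

```python
# Numeric sanity check of the vector-space axioms for R^3
# on sample vectors and scalars (an illustration, not a proof).
def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def smul(k, u):
    return tuple(k * a for a in u)

u, v, w = (1, 2, 3), (4, -5, 6), (-7, 8, 0)
a, b = 2, -3
zero = (0, 0, 0)

assert add(add(u, v), w) == add(u, add(v, w))             # A1: associativity
assert add(u, v) == add(v, u)                             # A2: commutativity
assert add(u, zero) == u                                  # A3: additive identity
assert add(u, smul(-1, u)) == zero                        # A4: additive inverse
assert smul(a, add(u, v)) == add(smul(a, u), smul(a, v))  # M1
assert smul(a + b, u) == add(smul(a, u), smul(b, u))      # M2
assert smul(a * b, u) == smul(a, smul(b, u))              # M3
assert smul(1, u) == u                                    # M4
print("all eight axioms hold for these samples")
```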

We can also prove the following simple properties of a vector space.

Theorem 4.6.1. Let V be a vector space over a field K.


i) For any scalar k ∈ K and 0 ∈ V , k0 = 0.


ii) For 0 ∈ K and any vector u ∈ V ; 0u = 0.


iii) If ku = 0, where k ∈ K and u ∈ V , then k = 0 or u = 0.
iv) For any k ∈ K and any u ∈ V ; (−k)u = k(−u) = −ku.

Proof. i) By Axiom [A3] (u + 0 = 0 + u = u) with u = 0, we have 0 + 0 = 0. Hence, by

Axiom [M1] (k(u + v) = ku + kv), we have
k0 = k(0 + 0) = k0 + k0
Adding -k0 to both sides gives the desired result.

ii) For scalars, 0 + 0 = 0. Hence, by Axiom [M2] ((a + b)u = au + bu), we have

0u = (0 + 0)u = 0u + 0u
Adding -0u to both sides gives the desired result.

iii) Suppose ku = 0 and k ≠ 0. Then, since K is a field, there exists a scalar k⁻¹ such that k⁻¹k = 1. Thus
u = 1u = (k⁻¹k)u = k⁻¹(ku) = k⁻¹0 = 0

iv) Using u + (−u) = 0 and k + (−k) = 0 yields

0 = k0 = k[u + (−u)] = ku + k(−u) and 0 = 0u = [k + (−k)]u = ku + (−k)u
Adding -ku to both sides of the first equation gives −ku = k(−u), and adding -ku to both
sides of the second equation gives −ku = (−k)u. Thus, (−k)u = k(−u) = −ku. 

A philosophical perspective:
we never say exactly what vectors are, only what vectors do. This is an example of
abstraction, which appears everywhere in mathematics (but especially in algebra): the exact
substance of an object is not important, only its properties and functions. (For instance,
when using the number “3” in mathematics, it is not important whether we refer to 3 rocks, 3
sheep, or whatever; what is important is how to add, multiply, and otherwise manipulate
these numbers, and what properties these operations have). This is tremendously powerful:
it means that we can use a single theory (linear algebra) to deal with many very different
subjects (physical vectors, population vectors in biology, portfolio vectors in finance,
probability distributions in probability, functions in analysis, etc.). [A similar philosophy
underlies “object-oriented programming” in computer science.] Of course, even though
vector spaces can be abstract, it is often very helpful to keep concrete examples of vector
spaces such as R2 and R3 handy, as they are of course much easier to visualize.

4.7 Examples of Vector Spaces


1. Space K n
Vn (K) is the set of all n-tuples of elements of an arbitrary field K, i.e.
Vn (K) = {(α1 , α2 , . . . , αn )|αi ∈ K (i = 1, 2, . . . , n)}, with the following operations:
• Vector Addition:
(α1 , α2 , . . . , αn ) + (β1 , β2 , . . . , βn ) = (α1 + β1 , α2 + β2 , . . . , αn + βn )
• Scalar Multiplication: k(α1 , α2 , . . . , αn ) = (kα1 , kα2 , . . . , kαn )
• The zero vector in Vn (K), the n-tuple of zeros, 0 = (0, 0, . . . , 0)


• the negative of a vector, defined by −(α1 , α2 , . . . , αn ) = (−α1 , −α2 , . . . , −αn )


Vn (K) has been verified to be a vector space over K in section 4.1, theorem 4.1.1 on
page 31

2. Matrix Space Mmxn (K)


Let Mmxn (K) be the set of all mxn matrices over K. Addition of matrices and scalar
multiplication of matrices by elements of K, have been defined in section 2.3, page 6,
where most of the axioms of a vector space were verified. The rest of the axioms are
easily verified and Mmxn (K) is a K-space. Mn (K) denotes the vector space of all nxn
matrices over K.

3. Polynomial Space Pn (K)

Let Pn (K) denote the set of all polynomials of degree m ≤ n with coefficients in K, i.e.
Pn (K) = {α0 + α1 x + α2 x² + . . . + αm x^m | αi ∈ K (i = 0, 1, . . . , m), m ≤ n}.
+ is the usual addition of polynomials, and if α ∈ K, then scalar multiplication by α is
defined by α(α0 + α1 x + α2 x² + . . . + αm x^m) = αα0 + αα1 x + αα2 x² + . . . + ααm x^m
Then Pn (K) is a K-space.
To verify this, all the axioms in definition 4.6.1 above must be shown to be satisfied.
These are simple to verify; for example, for axiom A2 of definition 4.6.1:
if f (x) = α0 + α1 x + α2 x² + . . . + αm x^m and g(x) = β0 + β1 x + β2 x² + . . . + βm x^m, then
f (x) + g(x) = (α0 + β0 ) + (α1 + β1 )x + (α2 + β2 )x² + . . . + (αm + βm )x^m
= (β0 + α0 ) + (β1 + α1 )x + (β2 + α2 )x² + . . . + (βm + αm )x^m = g(x) + f (x)
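Representing a polynomial by its list of coefficients [α0, α1, α2, . . .] makes this verification concrete. The sketch below (the representation and helper names are ours) checks commutativity of polynomial addition on an example:

```python
from itertools import zip_longest

def poly_add(f, g):
    """Add polynomials given as coefficient lists [a0, a1, a2, ...]."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]

def poly_smul(k, f):
    """Scalar multiplication: multiply every coefficient by k."""
    return [k * a for a in f]

f = [5, -2, 1]    # 5 - 2x + x^2
g = [3, 0, 0, 4]  # 3 + 4x^3
assert poly_add(f, g) == poly_add(g, f) == [8, -2, 1, 4]  # A2: f + g = g + f
assert poly_smul(2, f) == [10, -4, 2]
print("f + g = g + f for these polynomials")
```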

4.8 Basis and Dimension


If a subspace S of V is generated by r vectors, S = (u1 , u2 , . . . , ur ), but cannot be
spanned by fewer than r vectors, we say that S has dimension r, or S is r-dimensional.

A set of r linearly independent vectors that spans S is called a basis of the subspace S. Any other
vector in this subspace is then a linear combination of the r vectors of this basis.

Definition 4.8.1 ((Basis)). A set S = (u1 , u2 , . . . , un ) of vectors is a basis of V if it has the


following two properties:
i) S is linearly independent.
ii) S spans V.

Alternative definition of basis.

Definition 4.8.2. A set of vectors S = (u1 , u2 , . . . , un ) is a basis of V if every v ∈ V can


be written uniquely as a linear combination of the basis vectors, ui .

The following is a fundamental result in linear algebra.


Theorem 4.8.1. Let V be a vector space such that one basis has m elements and another
basis has n elements. Then m = n.

A vector space V is said to be of finite dimension n, or n-dimensional, written dim V = n,
if V has a basis with n elements. Theorem 4.8.1 tells us that all bases of V have the same
number of elements, so this definition is well defined.

The vector space {0} is defined to have dimension 0.

4.9 Examples of Bases


Below are some examples of bases of some of the example vector spaces in section 4.7,
page 37
1) Vector space Kn :
Consider the following n vectors in Kn :

e1 = (1, 0, 0, . . . , 0, 0); e2 = (0, 1, 0, . . . , 0, 0); . . .; en = (0, 0, 0, . . . , 0, 1);


• These vectors are linearly independent.
• They form a matrix in echelon form.
• Any vector v = (a1 , a2 , . . . , an ) in Kn can be written as a linear combination of the above
vectors.
• Specifically, we can write any vector v as v = a1 e1 + a2 e2 + . . . + an en

Accordingly, the vectors form a basis of Kn called the usual or standard basis of Kn .
Thus, as expected, Kn has dimension n. In particular, any other basis of Kn has n elements.

2) Vector space M = Mrxs of all rxs matrices:


The following six matrices form a basis of the vector space M2x3 of all 2x3 matrices over K:
[1 0 0]   [0 1 0]   [0 0 1]   [0 0 0]   [0 0 0]   [0 0 0]
[0 0 0],  [0 0 0],  [0 0 0],  [1 0 0],  [0 1 0],  [0 0 1]

In general, the analogous rs matrices (each with a single entry 1 and all other entries 0) form a basis of Mrxs called the usual or standard basis of Mrxs .
Accordingly, dimMrxs = rs.

3) Vector space Pn (x) of all polynomials of degree ≤ n.


The set S = (1, x, x2 , x3 , . . . , xn ) of n + 1 polynomials is a basis of Pn (x). Specifically,
any polynomial f(x) of degree ≤ n can be expressed as a linear combination of these powers
of x, and one can show that these polynomials are linearly independent. Therefore,
dimPn (x) = n + 1.

The following three theorems on bases will be used frequently.

Theorem 4.9.1. Let V be a vector space of finite dimension n. Then:


i) Any n + 1 or more vectors in V are linearly dependent.


ii) Any linearly independent set S = (u1 , u2 , . . . , un ) with n elements is a basis of V.
iii) Any spanning set T = (v1 , v2 , . . . , vn ) of V with n elements is a basis of V.

Theorem 4.9.2. Suppose S spans a vector space V. Then:


i) Any maximal set of linearly independent vectors in S forms a basis of V.
ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors
in S. Then the remaining vectors form a basis of V.

Theorem 4.9.3. Let V be a vector space of finite dimension and let S = (u1 , u2 , . . . , ur ) be a
set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be
extended to a basis of V.

4.10 Subspaces
Definition 4.10.1. A nonempty subset W of V is called a subspace of V if W is itself a vector
space over K with the same definitions of vector addition and scalar multiplication as in V.

By inspecting the defining axioms of a vector space we note that some of the axioms will
automatically be true in W, if W is a subspace of V, except the following, which must be
verified:
i) That W is closed under addition, i.e. if u, v ∈ W then u + v ∈ W .
ii) 0 ∈ W
iii) For each v ∈ W , −v ∈ W
iv) if α ∈ K, v ∈ W , then αv ∈ W .

Lemma 4.10.1. A nonempty subset W of V is a subspace of V if and only if αu + v ∈ W , for

all α ∈ K, u, v ∈ W .

Proof. Suppose that W is a subspace of V. Then by definition, if α ∈ K and u, v ∈ W , then

αu ∈ W and hence αu + v ∈ W .
Conversely, suppose αu + v ∈ W for all α ∈ K, u, v ∈ W . Since W is nonempty there exists an element u ∈ W , and
thus (−1)u + u = −u + u = 0 ∈ W .
Hence, if α ∈ K and v ∈ W , then αv + 0 = αv ∈ W , and in particular (−1)v = −v ∈ W .

Finally, if u, v ∈ W , then 1u + v = u + v ∈ W . Thus i)-iv) above have been verified, and W is a

subspace of V. 

Simple criteria for identifying subspaces are stated in the following
theorem:

Theorem 4.10.1. W is a subspace of V if and only if:


i) W is nonempty
ii) W is closed under addition, i.e. if u, v ∈ W , then u + v ∈ W .
iii) W is closed under scalar multiplication, i.e. if v ∈ W , then αv ∈ W , for every α ∈ K .

Properties ii) and iii) may be combined into the following equivalent single statement:
ii') For every u, v ∈ W and a, b ∈ K, the linear combination au + bv ∈ W .
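The combined criterion ii') suggests a simple randomized check: sample vectors from a candidate set and test whether their linear combinations stay inside. The sketch below (our own illustration; the function and predicate names are not from the module) can refute closure but never prove it:

```python
import random

def looks_like_subspace(contains, dim, trials=200):
    """Randomized check of criterion ii') of Theorem 4.10.1: sample u, v
    satisfying the membership predicate and test whether a*u + b*v also
    satisfies it. This can only refute closure, never prove it."""
    def sample():
        while True:
            v = tuple(random.randint(-5, 5) for _ in range(dim))
            if contains(v):
                return v
    for _ in range(trials):
        u, v = sample(), sample()
        a, b = random.randint(-5, 5), random.randint(-5, 5)
        w = tuple(a * x + b * y for x, y in zip(u, v))
        if not contains(w):
            return False  # found a linear combination that leaves the set
    return True

plane = lambda v: v[0] + v[1] + v[2] == 0      # a subspace of R^3
halfspace = lambda v: v[0] + v[1] + v[2] >= 0  # not a subspace

print(looks_like_subspace(plane, 3))       # True
print(looks_like_subspace(halfspace, 3))   # almost surely False
```

The half-space example anticipates Example 4.11.2 below: it contains 0 and is closed under addition, but fails closure under multiplication by negative scalars.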


4.11 Examples of Subspaces


1. Every vector space V has two (trivial) subspaces, namely V and {0}. Every other
subspace is called a proper subspace.

2. Let W = {(0, α2 , . . . , αn )|αi ∈ K (i = 2, 3, . . . , n)} ⊂ Vn (K). Then W is a subspace of Vn (K),

since
i) W is nonempty, for example (0, 0, . . . , 0) ∈ W
ii) if α ∈ K and u = (0, α2 , . . . , αn ), v = (0, β2 , . . . , βn ) ∈ W , then
αu + v = (0, αα2 + β2 , . . . , ααn + βn ) ∈ W .

3. Solution Space of a Homogeneous System of Linear Equations

Let V be the set of solutions of a homogeneous system of m linear equations in n
variables, AX = 0, where A = (aij ) and X = (x1 , x2 , . . . , xn )ᵀ.
Then V is a subspace of Vn (K), called the solution space of the system. Indeed:

0 ∈ V, since A0 = 0; and if α ∈ K and X, Y ∈ V, then
A(αX + Y) = αAX + AY = α0 + 0 = 0,
so αX + Y ∈ V.

Example 4.11.1. Find an R-basis for the solution space of the homogeneous system of lin-
ear equations
-w + x - 2y + 3z = 0
 w + 2x + y - z = 0

Solving with matrices as in section 2.9, page 15 (unknowns ordered x, y, z, w):

( 1 -2  3 -1 )  R2→R2-2R1  ( 1 -2  3 -1 )  R1→5R1+2R2  ( 5  0  1  1 )
( 2  1 -1  1 )  ---------> ( 0  5 -7  3 )  ----------> ( 0  5 -7  3 )

R1→(1/5)R1, R2→(1/5)R2:

( 1  0  1/5  1/5 )
( 0  1 -7/5  3/5 )

Which gives

x + (1/5)z + (1/5)w = 0
y - (7/5)z + (3/5)w = 0

z and w are free variables, so we set z = 5λ, w = 5µ; then

(x, y, z, w) = (-λ - µ, 7λ - 3µ, 5λ, 5µ) = λ(-1, 7, 5, 0) + µ(-1, -3, 0, 5)

The solution space is <(-1, 7, 5, 0), (-1, -3, 0, 5)>, and these two vectors form an R-basis for it.
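Any claimed basis of a solution space can be verified by substituting each vector back into the original equations. A short check for the basis just found (variable order x, y, z, w):

```python
# Verify that (-1, 7, 5, 0) and (-1, -3, 0, 5) solve both equations
# -w + x - 2y + 3z = 0 and w + 2x + y - z = 0 (unknowns ordered x, y, z, w).
A = [
    [1, -2, 3, -1],  # x - 2y + 3z - w = 0
    [2,  1, -1, 1],  # 2x + y - z + w = 0
]
basis = [(-1, 7, 5, 0), (-1, -3, 0, 5)]
for v in basis:
    for row in A:
        assert sum(a * x for a, x in zip(row, v)) == 0
print("both basis vectors lie in the solution space")
```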

We can state the following theorem on the solution space of a homogeneous System


Theorem 4.11.1. The solution set W of a homogeneous system AX = 0 in n unknowns is a

subspace of Kⁿ.

Note that the solution set of a nonhomogeneous system AX = B, with B ≠ 0, is not a subspace of Kⁿ.

In fact, the zero vector 0 does not belong to its solution set.

Example 4.11.2. The following two worked examples illustrate the method needed for Exercises 4.6 below.

1. Determine whether the subset U = {(a, b, c)|a = 2b = 3c} of V3 (R) is a subspace

Solution

Clearly u = (0, 0, 0) ∈ U , since 0 = 2·0 = 3·0.

If u = (a, b, c) ∈ U then a = 2b = 3c, i.e. b = a/2 and c = a/3, so u = (a, a/2, a/3).
Similarly, any v ∈ U has the form v = (a', a'/2, a'/3). For α, β ∈ R,
αu + βv = α(a, a/2, a/3) + β(a', a'/2, a'/3)
= (αa + βa', αa/2 + βa'/2, αa/3 + βa'/3)
= (αa + βa', (αa + βa')/2, (αa + βa')/3)
Thus αu + βv ∈ U , and hence U is a subspace of V3 (R).

2. Show that the subset U = {(a, b, c)|a + b + c ≥ 0} of V3 (R) is not a subspace

Solution

We have to show that one of the properties of theorem 4.10.1 does not hold.

Clearly u = (0, 0, 0) ∈ U , since 0 + 0 + 0 = 0 ≥ 0

if v = (1, 2, 3), then v ∈ U , since 1 + 2 + 3 = 6 ≥ 0

but for k = -5 ∈ R, -5v = (-5, -10, -15) ∉ U , since -5 - 10 - 15 = -30 < 0
Hence U is not a subspace of V3 (R).

Exercises 4.6

1. Which of the following subsets of Vn (R) are vector spaces?


i) {(α1 , α2 , . . . , αn )|α1 + α2 + α3 = 0}

ii) {(α1 , α2 , . . . , αn )|α1 + α2 + α3 = 1}

iii) {(α1 , α2 , . . . , αn )|α3 = α1 α2 }

iv) {(α1 , α2 , . . . , αn )|α1 − α2 = α2 − α3 }

2. Which of the following subsets of {C[0, 1]} are R-spaces?


i) {f ∈ C[0, 1]|f (1) = 0}

ii) {f ∈ C[0, 1]|f (1) = 1}


iii) {f ∈ C[0, 1] | ∫₀¹ f(x) dx = 0}

3. Which of the following subsets are subspaces of M2 (R)?


  
i) { [a b; c d] | a = b }

ii) { [a b; c d] | a + b = 1 }

iii) { [a b; c d] | a = b = d }
4. Let W = M3'(K) = {A ∈ M3 (K)|Aᵀ = -A} ⊂ M3 (K) (i.e. the subset of 3x3 matrices, A,
satisfying the condition Aᵀ = -A).
a) Show that W is a subspace of M3 (K).

b) Find a basis for W and its dimension.

5. If W is a subspace of V, prove that the following two conditions hold: i) 0 ∈ W ;


ii) if u, v ∈ W , then u + v, ku ∈ W

4.12 Linear Independence, Linear Span, Row Space of a Matrix


4.12.1 Linear Combinations and Independence
Suppose u1 , u2 , . . . , uk are any vectors in a vector space V. Any vector of the form
a1 u1 + a2 u2 + . . . + ak uk , where the ai are scalars, is called a linear combination of the
vectors u1 , u2 , . . . , uk .

To show that v = (3, 9, -4, -2) is a linear combination of the vectors

u1 = (1, -2, 0, 3), u2 = (2, 3, 0, -1) and u3 = (2, -1, 2, 1),
we set αu1 + βu2 + γu3 = v, giving the system of equations (one per component):

 α + 2β + 2γ = 3
-2α + 3β - γ = 9
         2γ = -4
 3α - β + γ = -2

From the third equation γ = -2; substituting into the others and solving gives α = 1, β = 3.

The system is consistent and has the solution α = 1, β = 3, γ = -2,
i.e. (3, 9, -4, -2) = (1, -2, 0, 3) + 3(2, 3, 0, -1) - 2(2, -1, 2, 1)
or v = u1 + 3u2 - 2u3

If the system were inconsistent (no solution), v would not be a linear combination of the ui.
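Once candidate coefficients are found, the claimed linear combination is easy to verify componentwise, as in this short check of the example above:

```python
# Verify v = 1*u1 + 3*u2 - 2*u3 componentwise.
coeffs = (1, 3, -2)  # alpha, beta, gamma found above
u1, u2, u3 = (1, -2, 0, 3), (2, 3, 0, -1), (2, -1, 2, 1)
v = (3, 9, -4, -2)
combo = tuple(sum(c * u[i] for c, u in zip(coeffs, (u1, u2, u3)))
              for i in range(4))
assert combo == v
print("v = u1 + 3*u2 - 2*u3")
```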


A subset {v1 , v2 , . . . , vk } of V is linearly dependent over K if there exist

α1 , α2 , . . . , αk ∈ K (not all zero) such that α1 v1 + α2 v2 + . . . + αk vk = 0.

If not, the subset {v1 , v2 , . . . , vk } is linearly independent over K. In other words,

if α1 v1 + α2 v2 + . . . + αk vk = 0 implies that αi = 0 (i = 1, 2, . . . , k), then the subset
{v1 , v2 , . . . , vk } is linearly independent over K.

Example 4.12.1. 1) The set {(1, 3, -1), (2, 0, 1), (1, -1, 1)} is linearly dependent, since if

α(1, 3, -1) + β(2, 0, 1) + γ(1, -1, 1) = (0, 0, 0)

(α + 2β + γ, 3α - γ, -α + β + γ) = (0, 0, 0)

Solving the system

 α + 2β + γ = 0
3α      - γ = 0
-α + β + γ = 0

by row reducing the coefficient matrix:

( 1  2  1 )  R2→R2-3R1    ( 1  2  1 )              ( 1  2  1 )
( 3  0 -1 )  R3→R3+R1     ( 0  3  2 )  R3→R3-R2    ( 0  3  2 )
(-1  1  1 )  R2→-(1/2)R2  ( 0  3  2 )  --------->  ( 0  0  0 )

The system has a nonzero solution (a free variable remains), so the vectors are dependent,

but {(1, 1, -1), (2, 1, 0), (-1, 1, 2)} is linearly independent, since

α(1, 1, -1) + β(2, 1, 0) + γ(-1, 1, 2) = (0, 0, 0)

(α + 2β - γ, α + β + γ, -α + 2γ) = (0, 0, 0)

Solving the system

 α + 2β - γ = 0
 α + β + γ = 0
-α     + 2γ = 0

by row reducing the coefficient matrix:

( 1  2 -1 )  R2→R2-R1  ( 1  2 -1 )               ( 1  2 -1 )
( 1  1  1 )  R3→R3+R1  ( 0 -1  2 )  R3→R3+2R2    ( 0 -1  2 )
(-1  0  2 )  --------> ( 0  2  1 )  ---------->  ( 0  0  5 )

The system has only the trivial solution α = β = γ = 0, so the vectors are linearly independent

2) {1 + i, i} are linearly dependent over C, since if


α(1 + i) + β(i) = 0, then

taking α = 1, β = −1 + i

we have 1(1 + i) + (−1 + i)i = 1 + i − i − 1 = 0

Hence α 6= 0 and β 6= 0 and so {1 + i, i} is dependent over C,

but linearly independent over R, for if α(1 + i) + βi = 0, α, β ∈ R,

then α + (α + β)i = 0

Evidently α = 0 and α + β = 0 or α = β = 0.

4.13 Linear Span


Consider the vectors e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1)
We say the three vectors e1 , e2 , e3 generate (or span) the vector space R3 because any
vector (a, b, c) in R3 is a linear combination of ei
(a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = ae1 + be2 + ce3

Example 4.13.1. Show that u = (1, 2) and v = (0, 1) generate R².

Solution

We must show that any vector (a, b) in R² can be written as a linear combination of u and v:

(a, b) = xu + yv = (x, 2x) + (0, y) = (x, 2x + y)

This gives x = a and y = b - 2a, so the system is consistent for every (a, b); hence u and v generate R².
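The explicit formulas x = a, y = b - 2a make the spanning claim checkable on any sample of vectors, as in this small sketch (the helper name `coords` is ours):

```python
# u = (1, 2), v = (0, 1): for any (a, b), (a, b) = a*u + (b - 2a)*v.
def coords(a, b):
    return a, b - 2 * a  # x, y with (a, b) = x*u + y*v

for (a, b) in [(3, 5), (-1, 0), (0, 7)]:
    x, y = coords(a, b)
    assert (x * 1 + y * 0, x * 2 + y * 1) == (a, b)
print("u and v span R^2")
```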

Exercises 4.12

1. Express v = (2, −5, 3) in R3 as a linear combination of the vectors


u1 = (1, −3, 2), u2 = (2, −4, −1) and u3 = (1, −5, 7)

2. Express the polynomial v = x2 + 4x − 3 in P (x) as a linear combination of the polyno-


mials
p1 = x2 − 2x + 5, p2 = 2x2 − 3x and p3 = x + 3

3. Express M as a linear combination of the matrices A, B, C, where


       
M = [4 7; 7 9], and A = [1 1; 1 1], B = [1 2; 3 4], C = [1 1; 4 5]

4. Suppose the vectors u, v and w are linearly independent. Show that the vectors u + v ,


u − v , u − 2v + w are also linearly independent.

5. Show that the vectors u = (1 + i, 2i) and w = (1, 1 + i) in C 2 are linearly dependent
over the complex field C, but linearly independent over the real field R.

6. Show that the subset {(1, 0, 1, 1), (1, 0, 2, 4)} of V4 (R) is linearly independent over R and
extend it to an R-basis of V4 (R).

7. Show that
i) {e^t, sin t, t²}
ii) {e^t, sin t, cos t}
are independent over R.

8. True or False:
a) {(1, 2, 0), (1, 1, 3), (2, 3, 3)} is a spanning set for R3 .
b) {(1, 0, 1), (2, 1, 1), (0, 1, −1)} is a linearly independent set of vectors in R3 .
c) Any five 2x2 matrices must be linearly dependent
d) (3, 2, −4) is the coordinate vector of −3 + x − 4x2 , relative to the ordered basis {x +
1, x − 1, 1 + x + x2 } of P2 .

4.14 Dimension and Subspaces


The following theorem gives the basic relationship between the dimension of a vector space
and the dimension of a subspace.

Theorem 4.14.1. Let W be a subspace of an n-dimensional vector space V. Then


dimW ≤ n. In particular, if dimW = n, then W = V .

4.15 Row Space of a Matrix


Let A = (aij ) be an arbitrary mxn matrix over a field K, i.e.

    ( a11 a12 . . . a1n )
A = ( a21 a22 . . . a2n )
    (  .   .  . . .  .  )
    ( am1 am2 . . . amn )

The rows of A, R1 = (a11 , a12 , . . . , a1n ); R2 = (a21 , a22 , . . . , a2n ); . . . ;
Rm = (am1 , am2 , . . . , amn ), may be viewed as vectors in Kn ; hence, they span a
subspace of Kn called the row space of A, denoted rowsp(A). That is,
rowsp(A) = span(R1 , R2 , . . . , Rm )

Analogously, the columns of A may be viewed as vectors in Km called the column space of
A and denoted colsp(A). Observe that colsp(A) = rowsp(AT ).

Theorem 4.15.1. Let A = (aij ) ∈ Mn (K) and let cj = (c1j , c2j , . . . , cnj ) (j = 1, 2, . . . , n) be
the columns of A. Then the set of columns {c1 , c2 , . . . , cn } is linearly dependent over K if and only if
det A = 0.


Corollary 4.15.1. The rows of a matrix A ∈ Mn (K) are linearly dependent over K if and
only if det A = 0.

Application to Finding a Basis for a subspace W of K n


Frequently, we are given a set S = {u1 , u2 , . . . , ur } of vectors in K n and we want to find a
basis for the subspace W of K n spanned by the given vectors, that is, a basis of
W = span(S) = span(u1 , u2 , . . . , ur )
Two algorithms are given in the following section, which find such a basis (and hence the
dimension) of W.

4.16 Basis-Finding Problems


Example 4.16.1. 1) Suppose we know a set of vectors in a subspace, how can we tell if they
form a basis?

Solution

The number of linearly independent vectors in the subspace must be equal to the dimen-
sion of the subspace.

Question: Determine whether the following form a basis for R3

i) e1 = (1, 1, 1), e2 = (1, −1, 5)

ii) e1 = (1, 1, 1), e2 = (1, 2, 3), e3 = (2, −1, 1)

Solution

i) No, a basis for R3 must have exactly 3 elements

ii) Yes, these are 3 linearly independent vectors

2) How do we find the basis of a subspace?

Solution

The number of linearly independent vectors (an independent spanning set) in this set is the
dimension of the subspace.

Example

Let W be the subspace of R5 generated by the set

S = {(1, 2, −1, 3, 4), (2, 4, −2, 6, 8), (1, 3, 2, 2, 6), (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)}.


Question 1: Can we find a basis and dimension of W?

Answer: Yes (refer two algorithms below).

4.16.1 Finding a Basis and Dimension of a Subspace of a Vector Space


Row Space Algorithm
Step 1. Form the matrix M whose rows are the given vectors.
Step 2. Row reduce M to echelon form.
Step 3. Output the nonzero rows of the echelon matrix.

Example 4.16.2. Let W be the subspace of R5 spanned by the following vectors:


u1 = (1, 2, -1, 3, 4), u2 = (2, 4, -2, 6, 8), u3 = (1, 3, 2, 2, 6), u4 = (1, 4, 5, 1, 8), u5 =
(2, 7, 3, 3, 9). Find a basis for the row space W, and find dim W.
Solution
   
    ( 1 2 -1  3 4 )   ( 1 2 -1  3  4 )
    ( 2 4 -2  6 8 )   ( 0 1  3 -1  2 )
M = ( 1 3  2  2 6 ) ~ ( 0 0 -4  0 -5 )
    ( 1 4  5  1 8 )   ( 0 0  0  0  0 )
    ( 2 7  3  3 9 )   ( 0 0  0  0  0 )

(First subtract multiples of R1 from the other rows: R2 - 2R1 = 0, R3 - R1 = (0, 1, 3, -1, 2),
R4 - R1 = (0, 2, 6, -2, 4), R5 - 2R1 = (0, 3, 5, -3, 1); then subtracting multiples of
(0, 1, 3, -1, 2) kills the R4 row and leaves (0, 0, -4, 0, -5) from the R5 row.)

The nonzero rows of the echelon matrix are (1, 2, -1, 3, 4), (0, 1, 3, -1, 2), (0, 0, -4, 0, -5).
They form a basis of the row space of M and hence of W. Thus, in particular, dim W = 3.

Question 2: Can we extend the above basis to a basis of the whole space R⁵, i.e. find 5 lin-
early independent vectors?

Answer: Yes, we can add vectors, e.g. from the standard basis, (0, 0, 0, 1, 0) and (0, 0, 0, 0, 1):

{(1, 2, -1, 3, 4), (0, 1, 3, -1, 2), (0, 0, -4, 0, -5), (0, 0, 0, 1, 0), (0, 0, 0, 0, 1)}

• The five vectors above are linearly independent.

• They form a matrix in echelon form.

They form a basis of R⁵ (which is an extension of the basis of W to a basis of R⁵).
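The Row Space Algorithm can be sketched in code using exact rational arithmetic. The helper below (our own implementation, not from the module) performs full Gauss-Jordan reduction rather than stopping at echelon form, so the nonzero rows it returns are the reduced-form basis of the same row space:

```python
from fractions import Fraction

def rref(rows):
    """Gauss-Jordan elimination over the rationals; returns the reduced
    row echelon form as lists of Fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # find a row at or below pivot_row with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        piv = m[pivot_row][col]
        m[pivot_row] = [x / piv for x in m[pivot_row]]  # scale pivot to 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:       # clear the column
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

S = [(1, 2, -1, 3, 4), (2, 4, -2, 6, 8), (1, 3, 2, 2, 6),
     (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)]
R = rref(S)
basis = [r for r in R if any(r)]  # nonzero rows span the row space
print(len(basis))                 # 3, so dim W = 3
```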

4.16.2 Casting-out Algorithm


Step 1. Form the matrix M whose columns are the given vectors.
Step 2. Row reduce M to echelon form.
Step 3. For each column Ck in the echelon matrix without a pivot, delete (cast out) the vector
uk from the list S of given vectors.
Step 4. Output the remaining vectors in S (which correspond to columns with pivots).

Example 4.16.3. Consider the subspace W in example 4.16.2 above.

Find a basis of W consisting of some of the given spanning vectors.
Solution


     
    (  1  2  1  1  2 )   ( 1 2  1  2  3 )

is garbled; the reduction of the matrix M whose columns are u1, . . . , u5 is

    (  1  2  1  1  2 )   ( 1 2  1  2  3 )

Example 4.16.4. Given that the reduced row-echelon form of the matrix

    (  1 1  0 1  0 )          ( 1 0 1 0 0 )
A = ( -1 2  3 4 -1 )  is  R = ( 0 1 2 2 0 )
    (  2 2  6 4  2 )          ( 0 0 0 0 1 )
    (  3 4 11 8  4 )          ( 0 0 0 0 0 )

Find

a) The dimension of the row space of A, dim A

b) Also find a basis for each of the following:

i) the row space of A

ii) the column space of A

iii) the null space of A.

Solution

a) dim A = 3 (there are three nonzero rows in R)

b) i) A basis for the row space of A is (1, 0, 1, 0, 0), (0, 1, 2, 2, 0), (0, 0, 0, 0, 1)

ii) In the echelon form of A, the pivots are in the first, second and fifth columns. Thus, columns
C1 , C2 and C5 of A, i.e.

(  1 )  ( 1 )  (  0 )
( -1 ), ( 2 ), ( -1 )
(  2 )  ( 2 )  (  2 )
(  3 )  ( 4 )  (  4 )

form a basis for the column space of A.

iii) A basis for the null space of A will be given by the solutions to the system RX = 0:

 
( 1 0 1 0 0 ) ( x )
( 0 1 2 2 0 ) ( y )
( 0 0 0 0 1 ) ( z ) = 0
( 0 0 0 0 0 ) ( s )
              ( t )

We note that t = 0, while z and s are free variables (they do not appear as the leading
entry in any row).

If we set z = 1, s = 0 we get x = -1, y = -2, z = 1, s = 0, t = 0

and z = 0, s = 1 we get x = 0, y = -2, z = 0, s = 1, t = 0

Whence

( -1 )  (  0 )
( -2 )  ( -2 )
(  1 ), (  0 )
(  0 )  (  1 )
(  0 )  (  0 )

is a basis for the null space of A.
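Since the null space is read off from the reduced form R, the two basis vectors can be checked directly against R, as in this short sketch:

```python
# The null space is computed from the reduced form R: the system R x = 0
# reads x + z = 0, y + 2z + 2s = 0, t = 0, with z and s free.
R = [
    [1, 0, 1, 0, 0],
    [0, 1, 2, 2, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
null_basis = [(-1, -2, 1, 0, 0), (0, -2, 0, 1, 0)]
for v in null_basis:
    assert all(sum(a * x for a, x in zip(row, v)) == 0 for row in R)
print("R v = 0 for both basis vectors")
```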

We emphasize that in the first algorithm we form a matrix whose rows are the given vectors,
whereas in the second algorithm we form a matrix whose columns are the given vectors.

Exercises 4.8

1. Prove that {(1, 2, 0), (0, 5, 7), (−1, 1, 3)} is an R-basis for V3 (R) and find the coordinates
of (0, 13, 17) and (2, 3, 1) relative to this basis.
     
a b x 0
2. Let M = |a, b ∈ R and N = |x, y ∈ R be subspaces of M2 (R).
−b c y 0
Find R-basis for M, N, M ∩ N and M + N .

3. Let U be the subspace of V3 (R) generated by the two vectors u1 = (1, 2, 3) and u2 =
(3, -5, 1).

4. For the subspace U of Question 3, show that v1 = (1, 0, 0) is not in U but that v2 = (5, -23, -9) is in U .

Express v2 as a linear combination of u1 and u2 .

5. Are the vectors u = (3, −1, 0, −1) and v = (1, 0, 4, −1) in the subspace of V4 (R) gen-
erated by {(2, −1, 3, 2), (−1, 1, 1, −3), (1, 1, 9, 5)}?
Hence determine two R-bases for V4 (R), one containing u and one containing v

6. Let W be the subspace of R4 spanned by the vectors u1 = (1, -2, 5, -3), u2 = (2, 3, 1, -4), u3 =
(3, 8, -3, -5). Find
a) a basis and dimension of W
b) Extend the basis of W to a basis of R4


7. Consider the following subspaces of R5 :


U = span(u1 , u2 , u3 ) = span{(1, 3, −2, 2, 3), (1, 4, −3, 4, 2), (2, 3, −1, −2, 9)}
W = span(w1 , w2 , w3 ) = span{(1, 3, 0, 2, 1), (1, 5, −6, 6, 3), (2, 5, 3, 2, 1)}
Find a basis and the dimension of i) U + W , and ii) U ∩ W .

8. Let U and W be the following subspaces of R3 :


U = {(a, b, c)|a = b = c} and W = {(0, b, c)|b, c ∈ R} (Note that W is the yz-plane.)
Show that R3 = U ⊕ W .

9. Find the dimension and a basis of the solution space W of each homogeneous system:
a)  x + 2y + z - 2t = 0      b)  x + y + 2z = 0      c)  x - 2y + z - 3t = 0
   2x + 4y + 4z - 3t = 0        2x + 3y + 3z = 0
   3x + 6y + 7z - 4t = 0         x + 3y + 5z = 0
 

4.17 Intersection of Subspaces, Sums and Direct Sums


4.17.1 Intersection of Subspaces
If U and W are subspaces of a vector space V, then the intersection of U and W, denoted
U ∩ W , is a subspace of V.
Proof
It is evident that 0 ∈ U and 0 ∈ W , because U and W are subspaces of V.
Hence 0 ∈ U ∩ W , so U ∩ W is nonempty.
Suppose u, v ∈ U ∩ W . Then
u, v ∈ U and u, v ∈ W .
Furthermore, for α ∈ K, αu + v ∈ U and αu + v ∈ W ,
which implies αu + v ∈ U ∩ W . Hence U ∩ W is a subspace of V.

We can generalise this for the intersection of any number of subspaces:

Theorem 4.17.1. The intersection of any number of subspaces of a vector space V is a


subspace of V.

But in general the union of U and W, U ∪ W is not a subspace of V.


Example 4.17.1. U = {(α, 0)|α ∈ K} and W = {(0, β)|β ∈ K} are both subspaces of V2 (K),

but, for example, (1, 0) + (0, 1) = (1, 1) ∉ U ∪ W , so U ∪ W is not closed under addition.

4.17.2 Sums of Subspaces


Let U and W be subsets of a vector space V. The sum of U and W, written U + W , consists
of all sums u + w where u ∈ U and w ∈ W . That is,
U + W = {v : v = u + w, where u ∈ U and w ∈ W }
Now suppose U and W are subspaces of V. Then U + W is a subspace of V.
Proof

Because U and W are subspaces, 0 ∈ U and 0 ∈ W . Hence, 0 = 0 + 0 belongs to U + W .

Now suppose v, v′ ∈ U + W . Then v = u + w and v′ = u′ + w′, where u, u′ ∈ U and
w, w′ ∈ W .
Then av + bv′ = (au + bu′) + (aw + bw′) ∈ U + W
Thus, U + W is a subspace of V.
Recall that U ∩ W is also a subspace of V. The following theorem relates the dimensions of
these subspaces.

Theorem 4.17.2. Suppose U and W are finite-dimensional subspaces of a vector space V.


Then U + W has finite dimension and dim(U + W ) = dimU + dimW − dim(U ∩ W )

Example 4.17.2. Let V = M2x2 , the vector space of 2x2 matrices. Let U consist of those
matrices whose second row is zero, and let W consist of those matrices whose second col-
umn is zero. Then
U = { [a b; 0 0] },  W = { [a 0; c 0] },  and

U + W = { [a b; c 0] },  U ∩ W = { [a 0; 0 0] }

That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W con-
sists of those matrices whose second row and second column are zero.

Note that dim U = 2, dim W = 2, dim(U ∩ W ) = 1.

Also, dim(U + W ) = 3, which is expected from theorem 4.17.2 above.

That is, dim(U + W ) = dim U + dim W - dim(U ∩ W ) = 2 + 2 - 1 = 3

4.17.3 Direct Sums of Subspaces


The vector space V is said to be the direct sum of its subspaces U and W, denoted
V = U ⊕ W , if every v ∈ V can be written in one and only one way as v = u + w, where
u ∈ U and w ∈ W .
The following theorem characterizes such a decomposition.

Theorem 4.17.3. The vector space V is the direct sum of its subspaces U and W if and only
if: i) V = U + W , ii) U ∩ W = {0}.

Example 4.17.3. a) Let U be the xy-plane and let W be the yz-plane; i.e., U = {(a, b, 0)|a, b ∈
R} and W = {(0, b, c)|b, c ∈ R}

Then R3 = U + W , because every vector in R3 is the sum of a vector in U and a vector
in W.

However, R3 is not the direct sum of U and W, because such sums are not unique.


For example, (3, 5, 7) = (3, 1, 0) + (0, 4, 7) and also (3, 5, 7) = (3, −4, 0) + (0, 9, 7)

b) Let U be the xy-plane and let W be the z-axis; that is,

U = {(a, b, 0)|a, b ∈ R} and W = {(0, 0, c)|c ∈ R}

Now any vector (a, b, c) ∈ R3 can be written as the sum of a vector in U and a vector in W
in one and only one way: (a, b, c) = (a, b, 0) + (0, 0, c). Accordingly, R3 is the direct sum
of U and W; that is, R3 = U ⊕ W.

4.18 Coordinates
Let V be an n-dimensional vector space over K with basis S = {v1 , v2 , . . . , vn }. Then any
vector v ∈ V can be expressed uniquely as a linear combination of the basis vectors in S ,
say as

v = α1 v1 + α2 v2 + . . . + αn vn , where αi ∈ K .

The n-tuple (α1 , α2 , . . . , αn ) ∈ Vn (K) is called the coordinate vector of v relative to the
K-basis S , and is denoted [v]S

Example 4.18.1. Given the K-basis S for V3 (K) = {(1, 0, 1), (2, −1, 1), (4, 1, 1)}

Find the coordinates of (1, −1, 1) relative to this basis.

We form the sum α1 (1, 0, 1) + α2 (2, −1, 1) + α3 (4, 1, 1) = (1, −1, 1)

and solve for α1 , α2 and α3 . Equating components gives the augmented matrix, which we
row reduce:

[ 1  2 4 |  1 ]               [ 1  2  4 |  1 ]               [ 1  2 4 |  1 ]
[ 0 −1 1 | −1 ] R3 →R3 −R1    [ 0 −1  1 | −1 ] R3 →R3 −R2    [ 0 −1 1 | −1 ]
[ 1  1 1 |  1 ]               [ 0 −1 −3 |  0 ]               [ 0  0 −4 | 1 ]

Back-substituting: −4α3 = 1 gives α3 = −1/4; then α2 = α3 + 1 = 3/4; and
α1 = 1 − 2α2 − 4α3 = 1/2.

Giving α1 = 1/2, α2 = 3/4, α3 = −1/4, and so [v]S = (1/2, 3/4, −1/4)T
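A quick check of this computation, assuming NumPy is available: placing the basis vectors as the columns of a matrix A, the coordinates are the solution of A [α1 , α2 , α3 ]T = v.

```python
import numpy as np

# Basis vectors of S as the COLUMNS of A, so that A @ [a1, a2, a3] = v.
A = np.array([[1, 2, 4],
              [0, -1, 1],
              [1, 1, 1]], dtype=float)
v = np.array([1, -1, 1], dtype=float)

coords = np.linalg.solve(A, v)  # the coordinate vector [v]_S
print(coords)  # [ 0.5   0.75 -0.25]
```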

Example 4.18.2. Consider the vector space P2 (t) of polynomials of degree ≤ 2. The poly-
nomials

p1 = t + 1, p2 = t − 1, p3 = (t − 1)2 = t2 − 2t + 1 form a basis S = {p1 , p2 , p3 } of

P2 (t).

To find the coordinate vector [v]S of the vector v = 2t2 − 5t + 9 relative to S ,


We set v = xp1 + yp2 + zp3 , using unknown scalars x, y, z, and simplify:

2t2 − 5t + 9 = x(t + 1) + y(t − 1) + z(t2 − 2t + 1)


= xt + x + yt − y + zt2 − 2zt + z
= zt2 + (x + y − 2z)t + (x − y + z).

Setting the coefficients of the same powers of t on LHS and RHS equal to each other, we
obtain the system

z=2
x + y − 2z = −5
x−y+z =9

The solution of the system is x = 3, y = −4, z = 2.

Thus, v = 3p1 − 4p2 + 2p3 ,

and hence [v]S = (3, −4, 2)
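The same computation can be done in coordinates, assuming NumPy: the columns of the matrix below hold the coefficients of p1 , p2 , p3 in the monomial basis (t2 , t, 1), exactly mirroring the system of equations above.

```python
import numpy as np

# Coefficient columns of p1 = t+1, p2 = t-1, p3 = t^2-2t+1 in the
# monomial basis (t^2, t, 1); solve for x, y, z with v = 2t^2 - 5t + 9.
P = np.array([[0, 0, 1],     # t^2 coefficients of p1, p2, p3
              [1, 1, -2],    # t coefficients
              [1, -1, 1]], dtype=float)  # constant coefficients
v = np.array([2, -5, 9], dtype=float)

coords = np.linalg.solve(P, v)
print(coords)  # [ 3. -4.  2.]
```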

4.19 Isomorphism of V and K n


Let V be a vector space of dimension n over K, and suppose S = (u1 , u2 , . . . , un ) is a
basis of V. Then each vector v ∈ V corresponds to a unique n-tuple [v]S ∈ K n . On the other
hand, each n-tuple (c1 , c2 , . . . , cn ) ∈ K n corresponds to a unique vector
c1 u1 + c2 u2 + . . . + cn un in V. Thus, the basis S induces a one-to-one correspondence
between V and K n . This one-to-one correspondence preserves the
vector space operations of vector addition and scalar multiplication.
We say that V and K n are isomorphic, written V ∼
= Kn .



Chapter 5: Linear Mappings on Vector Spaces
Outputs
1. understanding of linear transformations, and ability to determine whether a given trans-
formation is linear

2. Ability to:

i) find the matrix of a linear transformation relative to given bases

ii) find kernel and image of a linear transformation

iii) determine singular and nonsingular linear transformations

5.1 Definition of a Linear Transformation


Definition 5.1.1. Let V and U be vector spaces over the same field K. A mapping
T : V → U is called a linear mapping or linear transformation if it satisfies the following two
conditions:

i) For any vectors v, w ∈ V, T (v + w) = T (v) + T (w).


ii) For any scalar k and vector v ∈ V, T (kv) = kT (v).
Namely, T : V → U is linear if it “preserves” the two basic operations of a vector space, that
of vector addition and that of scalar multiplication.

Substituting k = 0 into condition (ii), we obtain T (0) = 0. Thus, every linear mapping takes
the zero vector into the zero vector. Now for any scalars a, b ∈ K and any vectors v, w ∈ V ,
we obtain T (av + bw) = T (av) + T (bw) = aT (v) + bT (w)

More generally, for any scalars ai ∈ K and any vectors vi ∈ V , we obtain the following basic
property of linear mappings:
T (a1 v1 + a2 v2 + . . . + am vm ) = a1 T (v1 ) + a2 T (v2 ) + . . . + am T (vm ).

A linear mapping T : V → U is completely characterized by the condition


T (av + bw) = aT (v) + bT (w),
and so we shall often use this condition as the definition.

5.1.1 Preliminaries
A mapping f : A → B is said to be one-to-one (or 1-1 or injective) if distinct elements of A
have distinct images; that is, if f (a) = f (a′ ) then a = a′ .

A mapping f : A → B is said to be onto (or f maps A onto B or surjective) if every b ∈ B is


the image of at least one a ∈ A.

A mapping f : A → B is said to be a one-to-one correspondence between A and B (or

bijective) if f is both one-to-one and onto.

Figure 5.1: a) One-to-one (Injective) b) Onto (Surjective) c) Neither 1-1 nor Onto

In fig. 5.1 above


i) The function f(x) is one-to-one: each horizontal line does not contain more than one point
of f(x).
ii) The function g(x) is onto: each horizontal line contains at least one point of g(x).
iii) The function h(x) is neither one-to-one nor onto: e.g., both 2 and -2 have the same image
4, and -16 has no pre-image.

The mapping T : V → V defined by T(v)=v, that is, the function that assigns to each element
in V, itself, is called the identity mapping. It is usually denoted by I . Thus, for any v ∈ V , we
have I(v) = v .

5.2 Examples of Linear Transformations


1. Let T : V3 (R) → V3 (R) be the "projection" mapping onto the xy-plane, defined by
T (x, y, z) = (x, y, 0).
If we let u = (a, b, c), v = (a′ , b′ , c′ )
Then T (u + v) = T (a + a′ , b + b′ , c + c′ )
= (a + a′ , b + b′ , 0)
= (a, b, 0) + (a′ , b′ , 0)
= T (u) + T (v)
and for any scalar k ∈ K ,
T (ku) = T (ka, kb, kc)
= (ka, kb, 0)
= k(a, b, 0)
= kT (u)
Thus, T is linear.

2. Let T : R2 → R2 be the "translation" mapping defined by


T (x, y) = (x + 1, y + 1).
Note that for u = (0, 0)
T (0) = T (0, 0) = (1, 1) ≠ 0. The zero vector is not mapped into the zero vector.


T is not linear

3. Let T : V → R, where V is a vector space of continuous functions on [0, 1], be the
integral mapping defined by

T (f (t)) = ∫₀¹ f (t) dt

We know from calculus that

∫₀¹ [u(t) + v(t)] dt = ∫₀¹ u(t) dt + ∫₀¹ v(t) dt

and

∫₀¹ ku(t) dt = k ∫₀¹ u(t) dt

That is, T (u + v) = T (u) + T (v) and T (ku) = kT (u).
Thus, the integral mapping is linear.

4. The mapping I : V → V defined by

I(v) = v, v ∈ V
maps each v ∈ V into itself (i.e. leaves every vector unchanged). This is called the identity
mapping.
For every u, v ∈ V, k ∈ K
I(ku + v) = ku + v = kI(u) + I(v)
Hence I is linear.

5. The mapping T : V → W that assigns the zero vector 0 ∈ W to every vector v ∈ V ,

defined by
T (v) = 0, ∀v ∈ V,
is called the zero mapping. For every u, v ∈ V, k ∈ K we have
T (ku + v) = 0 = k0 + 0
= kT (u) + T (v)
Thus, T is linear.

Example 5.2.1. 1. Let T : V3 (R) → V2 (R) be defined by


T (α1 , α2 , α3 ) = (α1 + α2 , α2 − α3 ). Is T linear?
Solution
Let u = (a, b, c), v = (a′ , b′ , c′ )
Then T (u + v) = T (a + a′ , b + b′ , c + c′ )
= ((a + a′ ) + (b + b′ ), (b + b′ ) − (c + c′ ))
= (a + b, b − c) + (a′ + b′ , b′ − c′ )
= T (u) + T (v)

and for any scalar k ∈ K ,

T (ku) = T (ka, kb, kc)
= (ka + kb, kb − kc)
= k(a + b, b − c)
= kT (u)
Thus, T is linear.

2. Let M, N be m×m and n×n matrices respectively.

Let T : Mm×n (K) → Mm×n (K) be defined by
T (A) = M AN , ∀A ∈ Mm×n (K). Is T linear?
Solution
If A, B ∈ Mm×n (K) and α ∈ K
T (αA + B) = M (αA + B)N
= αM AN + M BN
= αT (A) + T (B)
T is linear
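A numerical spot-check of this argument, assuming NumPy is available (M, N and the test matrices below are illustrative choices; any sizes consistent with the products work):

```python
import numpy as np

# Fixed M (2x2) and N (3x3); T acts on 2x3 matrices A by T(A) = M A N.
M = np.array([[1.0, 2.0], [0.0, -1.0]])
N = np.array([[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 3.0, -1.0]])
T = lambda A: M @ A @ N

A = np.array([[1.0, 0.0, 2.0], [3.0, -1.0, 1.0]])
B = np.array([[0.0, 1.0, 1.0], [2.0, 2.0, -3.0]])
alpha = 2.5

# Linearity: T(alpha*A + B) equals alpha*T(A) + T(B)
assert np.allclose(T(alpha * A + B), alpha * T(A) + T(B))
print("T(A) = MAN is linear on this sample")
```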

3. Let T : Mn (K) → Mn (K) be defined by


T (A) = AS , S ∈ Mn (K).
Is T linear?
Solution
If A, B ∈ Mn (K) and α ∈ K
T (αA + B) = (αA + B)S
= αAS + BS
= αT (A) + T (B)
T is linear

Example 5.2.2. The following two worked examples are models for Exercises 5.2, which follow.

1. Suppose T : R3 → R2 is defined by T (x, y, z) = (x + y + z, 2x − 3y + 4z). Show that


T is linear.
Solution
Let u = (a, b, c) and v = (a′ , b′ , c′ ). Then
T (u + v) = T (a + a′ , b + b′ , c + c′ )
= ( (a + a′ ) + (b + b′ ) + (c + c′ ), 2(a + a′ ) − 3(b + b′ ) + 4(c + c′ ) )
= (a + b + c, 2a − 3b + 4c) + (a′ + b′ + c′ , 2a′ − 3b′ + 4c′ )
= T (u) + T (v)
and, for any scalar k,
T (ku) = T (ka, kb, kc) = (ka + kb + kc, 2ka − 3kb + 4kc) = kT (u)
Thus, T is linear.

58 5.2. EXAMPLES OF LINEAR TRANSFORMATIONS Chapter 5


CHAPTER 5. LINEAR MAPPINGS ON VECTOR SPACES

2. Suppose T : R2 → R3 is defined by T (x, y) = (x + 3, 2y, x + y). Show that T is not

linear.
Solution
Let u = (0, 0), the zero vector
T (u) = (3, 0, 0)
Thus, the zero vector is not mapped into the zero vector.
Hence, T is not linear.

Exercises 5.2

1. Which of the following mappings T : V3 (R) → V2 (R) are linear transformations?


i) T (x, y, z) = (x + y − z, 2x + y)
ii) T (x, y, z) = (x + 1, 2x + 2y − z)
iii) T (x, y, z) = (xy − z, yx)

2. Which of the following mappings T : Mn (K) → Mn (K) are linear transformations?


i) T A = AS , where S is a fixed matrix in Mn (K)
ii) T A = AT
3. Which of the following mappings T : C ′ (R) → C(R) are linear transformations?
i) T (f (x)) = ∫₀ˣ f (t)g(t) dt
ii) T (f (x)) = f ′ (x)
iii) T (f (x)) = f ′ (x)f (x)

4. Show that the following mappings are not linear:


i) T : R3 → R2 defined by T (x, y, z) = (x + 1, y + z).
ii) T : R2 → R2 defined by T (x, y) = (xy, y).
iii) T : Mn (K) → Mn (K), the vector space of real n-square matrices, defined by
T (A) = M + A, where M is a fixed nonzero matrix in Mn (K).

5.3 Kernel and Image of a Linear Transformation


Consider the linear transformation T : V → W .
We define
i) The Kernel of T, denoted KerT; as the set of all v ∈ V for which T (v) = 0, i.e. elements in
V which map into 0 ∈ W
KerT = {v ∈ V |T (v) = 0}

ii) The Image of T, denoted ImT; as the set of all T (v) ∈ W .


ImT = {T (v)|v ∈ V }

Note
Both kernel and image are sets of vectors. The kernel is the set of (input) vectors from the
domain of T, and the image is the set of all functional values (output vectors) in the range of
T.
Example 5.3.1. Consider T : R3 → R3 , the projection of a vector v into the xy-plane,
that is,

T (x, y, z) = (x, y, 0)

Clearly the image of T is the entire xy-plane, that is, points (or vectors) of the form (x, y, 0).
Moreover, the kernel of T is the z-axis, that is, points (or vectors) of the form (0, 0, c). That
is,

ImT = {(a, b, c)|c = 0} (= xy-plane), and

KerT = {(a, b, c)|a = 0, b = 0} (= z-axis).

Theorem 5.3.1. Let T : V → U be a linear mapping. Then


a) the kernel of T is a subspace of V, and
b) the image of T is a subspace of U.

Proof. a) Because T (0) = 0, we have 0 ∈ KerT . Now suppose v, w ∈ KerT and a, b ∈ K .


Because v and w belong to the kernel of T, T (v) = 0 and T (w) = 0.
Thus, T (av + bw) = aT (v) + bT (w) = a0 + b0 = 0 + 0 = 0
and so av + bw ∈ KerT .
Thus, the kernel of T is a subspace of V.

b) Because T (0) = 0, we have 0 ∈ ImT .

Now suppose u, u′ ∈ ImT and a, b ∈ K .
Because u and u′ belong to the image of T, there exist vectors v, v′ ∈ V such that T (v) = u
and T (v′ ) = u′ .
Then T (av + bv′ ) = aT (v) + bT (v′ ) = au + bu′ ∈ ImT . Thus, the image of T is a subspace
of U. 

5.4 Rank and Nullity of a Linear Mapping


Let T : V → U be a linear mapping. The rank of T is defined to be the dimension of its
image, and the nullity of T is defined to be the dimension of its kernel; namely,
rank(T ) = dim(Im T ) and
nullity(T ) = dim(Ker T )

Theorem 5.4.1. Let V be of finite dimension, and let T : V → U be a linear mapping. Then
dimV = dim(Ker T ) + dim(Im T ) = nullity(T ) + rank(T )


Example 5.4.1. Let T : R4 → R3 be the linear mapping defined by

T (α1 , α2 , α3 , α4 ) = (α1 − α2 + α3 + α4 , α1 + 2α2 − α3 + α4 , 3α2 − 2α3 ).

Find R-bases and the dimensions of i) ImT and ii) KerT

Solution

We consider the following generators (normal basis) of R4 and their images.

(1, 0, 0, 0) → (1, 1, 0)

(0, 1, 0, 0) → (−1, 2, 3)

(0, 0, 1, 0) → (1, −1, −2)

(0, 0, 0, 1) → (1, 1, 0)

then we form the matrix with these images (the generators of ImT ) as rows and row reduce
(the repeated image (1, 1, 0) contributes nothing new):

[  1  1  0 ]      [ 1 1 0 ]
[ −1  2  3 ]  ∼   [ 0 1 1 ]
[  1 −1 −2 ]      [ 0 0 0 ]

Thus, (1, 1, 0) and (0, 1, 1) form a basis of ImT . Hence, dim(ImT ) = 2 and rank(T ) =
2.

To find a basis and the dimension of the kernel of the map T.

We set T (v) = 0, where v = (α1 , α2 , α3 , α4 ),

i. e. (α1 − α2 + α3 + α4 , α1 + 2α2 − α3 + α4 , 3α2 − 2α3 ) = (0, 0, 0)

Set corresponding components equal to each other to form the following homogeneous sys-
tem

α1 − α2 + α3 + α4 = 0

α1 + 2α2 − α3 + α4 = 0

3α2 − 2α3 = 0

whose solution space is Ker T

[ 1 −1  1 1 ]      [ 1 −1  1 1 ]
[ 1  2 −1 1 ]  ∼   [ 0  3 −2 0 ]
[ 0  3 −2 0 ]      [ 0  0  0 0 ]

The free variables are α3 and α4 , with α2 = (2/3)α3 and α1 = α2 − α3 − α4 . Taking
(α3 , α4 ) = (3, 0) and then (0, 1) gives the solutions (−1, 2, 3, 0) and (−1, 0, 0, 1).

Thus, (−1, 2, 3, 0) and (−1, 0, 0, 1) form a basis of Ker T

and dim(Ker T ) = 2, i.e. nullity(T ) = 2.


Note that the transpose of the matrix whose rows are the images of the standard basis of
R4 is called the matrix of the linear transformation with respect to the standard basis ei , i.e.
T (α1 , α2 , α3 , α4 ) = (α1 − α2 + α3 + α4 , α1 + 2α2 − α3 + α4 , 3α2 − 2α3 )
can be written as T (α1 , α2 , α3 , α4 ) = Te (α1 , α2 , α3 , α4 )T , where

     [ 1 −1  1 1 ]
Te = [ 1  2 −1 1 ]   and (α1 , α2 , α3 , α4 )T is the corresponding column vector.
     [ 0  3 −2 0 ]
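Theorem 5.4.1 can be checked numerically for this example, assuming NumPy is available; the two vectors tested are solutions of the homogeneous system.

```python
import numpy as np

Te = np.array([[1, -1,  1,  1],
               [1,  2, -1,  1],
               [0,  3, -2,  0]], dtype=float)  # matrix of T in the standard bases

rank = np.linalg.matrix_rank(Te)   # dim(Im T)
nullity = Te.shape[1] - rank       # dim(Ker T) = dim V - rank, by Theorem 5.4.1
print(rank, nullity)               # 2 2

# Two independent solutions of the homogeneous system T(v) = 0:
for v in ([-1, 2, 3, 0], [-1, 0, 0, 1]):
    assert np.allclose(Te @ np.array(v, dtype=float), 0)
```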

5.5 Matrix of a Linear Transformation


Example 5.5.1. Consider the linear transformation

T : R2 → R2 , defined by

T (x, y) = (2y, 3x − y)

We consider the following generators (normal basis) of R2 and their images:

T (1, 0) = (0, 3)

T (0, 1) = (2, −1)

Notice that

[ 0  2 ] [ x ]
[ 3 −1 ] [ y ]  = (2y, 3x − y) = T (x, y)

          [ 0  2 ]
[T ]std = [ 3 −1 ]

is the matrix we would usually use to represent T by matrix multiplication.
We call [T ]std the matrix representation of T relative to the standard basis of R2 .
When we use the standard basis, the rows of [T ]std are the coefficients of x and y in the
components of T (x, y)

Example 5.5.2. Let V be the vector space of functions with basis S = (sint, cost, e3t ),
df (t)
and let D : V → V be the differential operator defined by D(f (t)) = .
dt
The matrix representing D in the basis S is:

D(sint) = cost = 0(sint) + 1(cost) + 0(e3t )

D(cost) = −sint = −1(sint) + 0(cost) + 0(e3t )

D(e3t ) = 3e3t = 0(sint) + 0(cost) + 3(e3t )

 
              [ 0 −1 0 ]
and so [D]S = [ 1  0 0 ]
              [ 0  0 3 ]

Note that the coordinates of D(sin t), D(cos t), D(e3t ) form the columns, not the rows, of [D]S .

62 5.5. MATRIX OF A LINEAR TRANSFORMATION Chapter 5


CHAPTER 5. LINEAR MAPPINGS ON VECTOR SPACES

Example 5.5.3. Now consider a new basis S = {(1, 3), (2, 5)} for our earlier linear trans-
formation T : R2 → R2 , defined by T (x, y) = (2y, 3x − y)

We consider the images of these basis vectors, and write them as linear combinations of
the given basis vectors.

T (1, 3) = (6, 0) = α(1, 3) + β(2, 5) = (α + 2β, 3α + 5β)

T (2, 5) = (10, 1) = α(1, 3) + β(2, 5) = (α + 2β, 3α + 5β)

Solving  α + 2β = 6, 3α + 5β = 0  and  α + 2β = 10, 3α + 5β = 1  for α and β

gives

T (1, 3) = −30(1, 3) + 18(2, 5)

T (2, 5) = −48(1, 3) + 29(2, 5)

              [ −30 −48 ]
Hence [T ]S = [  18  29 ]

is the matrix representation of T relative to the basis S (the coordinate vectors of T (1, 3)
and T (2, 5) form its columns).

5.5.1 Algorithm for Finding Matrix Representations of a Linear Transformation Relative to a Given Basis
Example 5.5.4. Given a linear operator T on a vector space V and a basis
S = (u1 , u2 , . . . , un ) of V,
how do we find the matrix representation of T, i.e. [T ]S , relative to the basis S ?

Consider the linear transformation T : R3 → R3 defined by


T (x, y, z) = (x − y, x + 2y − z, 2x + y + z), and basis
S = {u1 , u2 , u3 } = {(1, 0, 1), (−2, 1, 1), (1, −1, 1)}

To find the matrix of T relative to the basis S:

1) Find the image of u1 = (1, 0, 1) under the transformation


T (1, 0, 1) = (1 − 0, 1 + 2.0 − 1, 2.1 + 0 + 1) = (1, 0, 3)

2) Write the image of u1 , i.e. v = (1, 0, 3) as a linear combination of the three basis vectors
(1, 0, 3) = α(1, 0, 1) + β(−2, 1, 1) + γ(1, −1, 1), or
(1, 0, 3) = (α − 2β + γ, 0α + β − γ, α + β + γ)
and solve the system

α − 2β + γ = 1
     β − γ = 0        for α, β and γ
α +  β + γ = 3

[ 1 −2  1 | 1 ]               [ 1 −2  1 | 1 ]                [ 1 −2  1 | 1 ]
[ 0  1 −1 | 0 ] R3 →R3 −R1    [ 0  1 −1 | 0 ] R3 →R3 −3R2    [ 0  1 −1 | 0 ]
[ 1  1  1 | 3 ]               [ 0  3  0 | 2 ]                [ 0  0  3 | 2 ]

Back-substituting: γ = 2/3, then β = γ = 2/3, and α = 1 + 2β − γ = 5/3, giving

(1, 0, 3) = 5/3 (1, 0, 1) + 2/3 (−2, 1, 1) + 2/3 (1, −1, 1)

Working similarly, we find the images of u2 = (−2, 1, 1) and u3 = (1, −1, 1) under the
transformation, and write each image as a linear combination of the three basis vectors:

T (−2, 1, 1) = (−3, −1, −2) = −11/3 (1, 0, 1) + 1/3 (−2, 1, 1) + 4/3 (1, −1, 1)

T (1, −1, 1) = (2, −2, 2) = 0 (1, 0, 1) + 0 (−2, 1, 1) + 2 (1, −1, 1)

These coordinate vectors form the columns of

              [ 5 −11 0 ]
[T ]S = (1/3) [ 2   1 0 ]
              [ 2   4 6 ]

Summary of the steps to find [T ]S


1) Find the image of uk under the transformation, i.e. find T (uk )

2) Write T (uk ) as a linear combination of the basis vectors (u1 , u2 , . . . , un ),


T (u1 ) = α11 u1 + α12 u2 + . . . + α1n un
T (u2 ) = α21 u1 + α22 u2 + . . . + α2n un
... = ... + ... + ... + ...
T (un ) = αn1 u1 + αn2 u2 + . . . + αnn un

3) The transpose of the matrix of coefficients gives [T ]S :

[ α11 α12 . . . α1n ]T     [ α11 α21 . . . αn1 ]
[ α21 α22 . . . α2n ]   =  [ α12 α22 . . . αn2 ]  = [T ]S
[ ...  ...  ...  ... ]     [ ...  ...  ...  ... ]
[ αn1 αn2 . . . αnn ]      [ α1n α2n . . . αnn ]
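The summary steps above can be collected into one small routine, assuming NumPy is available; `matrix_in_basis` is a hypothetical helper name, and the check reproduces [T ]S from Example 5.5.4.

```python
import numpy as np

def matrix_in_basis(T, basis):
    """Matrix of a linear operator T relative to `basis`.

    Columns are the coordinate vectors of T(u_k): solve P x = T(u_k),
    where P has the basis vectors as columns.
    """
    P = np.column_stack([np.asarray(u, float) for u in basis])
    images = np.column_stack([T(np.asarray(u, float)) for u in basis])
    return np.linalg.solve(P, images)  # solves column-by-column

T = lambda v: np.array([v[0] - v[1],
                        v[0] + 2*v[1] - v[2],
                        2*v[0] + v[1] + v[2]])
S = [(1, 0, 1), (-2, 1, 1), (1, -1, 1)]

# 3 * [T]_S for Example 5.5.4:
assert np.allclose(3 * matrix_in_basis(T, S),
                   [[5, -11, 0], [2, 1, 0], [2, 4, 6]])
```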

5.6 Change of Basis


Example 5.6.1. Consider the following two bases of R2 :
S = {u1 , u2 } = {(1, 2), (3, 5)} and S′ = {v1 , v2 } = {(1, −1), (1, −2)}

We want to find the change-of-basis matrix P from S to the "new" basis S′ .

To do this, we write each of the new basis vectors v1 and v2 of S′ as a linear combination of
the basis vectors u1 and u2 of S.
We have
v1 = αu1 + βu2 or (1, −1) = α(1, 2) + β(3, 5) = (α + 3β, 2α + 5β)
v2 = αu1 + βu2 or (1, −2) = α(1, 2) + β(3, 5) = (α + 3β, 2α + 5β)
giving
v1 = −8(1, 2) + 3(3, 5)
v2 = −11(1, 2) + 4(3, 5)


        [ −8 −11 ]
and P = [  3   4 ]

To find the change-of-basis matrix Q from the "new" basis S′ to the "old" basis S ,

we write each of the old basis vectors u1 and u2 of S as a linear combination of the basis
vectors v1 and v2 of S′ .
We have
u1 = αv1 + βv2 or (1, 2) = α(1, −1) + β(1, −2) = (α + β, −α − 2β)
u2 = αv1 + βv2 or (3, 5) = α(1, −1) + β(1, −2) = (α + β, −α − 2β)
giving
u1 = 4(1, −1) − 3(1, −2)
u2 = 11(1, −1) − 8(1, −2)

        [  4  11 ]
and Q = [ −3  −8 ]

Q is the inverse of P, so we can find Q by forming the matrix M = (P : I) and row reducing
M to row canonical form:

   
[ −8 −11 | 1 0 ]  R2 →8R2 +3R1   [ −8 −11 |  1  0 ]
[  3   4 | 0 1 ]  ───────────→   [  0  −1 |  3  8 ]

[ −8 −11 |  1  0 ]  R1 →R1 −11R2   [ −8  0 | −32 −88 ]
[  0  −1 |  3  8 ]  ───────────→   [  0 −1 |   3   8 ]

[ −8  0 | −32 −88 ]  R1 →−(1/8)R1 , R2 →−R2   [ 1 0 |  4  11 ]
[  0 −1 |   3   8 ]  ─────────────────────→   [ 0 1 | −3  −8 ]

                [  4  11 ]
Thus Q = P −1 = [ −3  −8 ]

A given linear transformation can be represented by matrices with respect to many choices
of bases for the domain and range. Finding the matrix of a linear transformation relative to
a given basis, e.g. [T ]std turns out to be easy, whereas finding the matrix of T relative to other
bases is more difficult. Here’s how you can use change-of-basis matrices to make things
simpler.

Suppose you have a linear transformation T : U → V with bases S and S′ for U and V
respectively, and you want the matrix representation of T relative to these bases, i.e. [T]_S^S′

Here are the steps:

Find

1) [T]_std^std , the matrix representation of T relative to the standard bases (find this from the
definition of the linear transformation).

2) the change-of-basis matrices [T]_S^std and [T]_S′^std
(basis elements written in terms of the standard basis, used as the columns of the matrix)

3) [T]_std^S′ = ([T]_S′^std )−1

Then [T]_S^S′ = [T]_std^S′ [T]_std^std [T]_S^std

Example 5.6.2. Consider the following linear transformation from U into V, T : R2 → R3


defined by

T (x, y) = (x + y, 2x − y, x − y) and the bases


0
S = {(2, 1), (1, 1)} and S = {(1, 1, 1), (1, 2, 1), (−1, 0, −2)}

for R2 and R3 , respectively


0
We want to find the matrix representation of T relative to the ordered bases S and S
0
We shall denote the matrix [T ]SS

Note that the linear transformation can be written as


    
" # 1 1 x 1 1
x
T = 2 −1  y , we denote 2 −1 = [T ]std
std , the matrix representation of T
    
y
1 −1 z 1 −1
relative to the standard basis

Note that [T ]std


std was obtained from the images of the standard basis {(1, 0), (0, 1)} un-
der the tansformation: T(1,0)=(1,2,1); T(0,1)=(1,-1,-1).

Now we find the matrix representation of T relative to the basis S of U, [T]_S^std :

(2, 1) = α(1, 0) + β(0, 1) = 2(1, 0) + 1(0, 1)

(1, 1) = α(1, 0) + β(0, 1) = 1(1, 0) + 1(0, 1)

             [ 2 1 ]
[T]_S^std =  [ 1 1 ]
and the matrix representation of T relative to the basis S′ of V, [T]_S′^std :

(1, 1, 1) = α(1, 0, 0) + β(0, 1, 0) + γ(0, 0, 1) = 1(1, 0, 0) + 1(0, 1, 0) + 1(0, 0, 1)

(1, 2, 1) = α(1, 0, 0) + β(0, 1, 0) + γ(0, 0, 1) = 1(1, 0, 0) + 2(0, 1, 0) + 1(0, 0, 1)

(−1, 0, −2) = α(1, 0, 0) + β(0, 1, 0) + γ(0, 0, 1) = −1(1, 0, 0) + 0(0, 1, 0) − 2(0, 0, 1)

              [ 1 1 −1 ]
[T]_S′^std =  [ 1 2  0 ]
              [ 1 1 −2 ]

              [ 1 1 −1 ]−1    [  4 −1 −2 ]
[T]_std^S′ =  [ 1 2  0 ]   =  [ −2  1  1 ]
              [ 1 1 −2 ]      [  1  0 −1 ]

Therefore

            [  4 −1 −2 ] [ 1  1 ]            [  7  7 ]
[T]_S^S′ =  [ −2  1  1 ] [ 2 −1 ] [ 2 1 ]  = [ −2 −3 ]
            [  1  0 −1 ] [ 1 −1 ] [ 1 1 ]    [  2  2 ]

How does this work? Consider the chain below:

U ──A──> R2 ──[T]──> R3 ──B⁻¹──> V

[A] transforms the ordered basis of U into the standard basis of R2 ;

[T ] is the definition of the transformation;

[B] transforms the ordered basis of V into the standard basis of R3 .

To calculate [T]_S^S′ , we are looking for [B]−1 [T ][A]

Note that you can also compute, according to the formulas below:

[T]_std^std = [T]_S′^std [T]_S^S′ ([T]_S^std )−1 ;   [T]_S^S′ = [T]_std^S′ [T]_std^std [T]_S^std ;   etc.

Applying the method of this section to example 5.6.1:

Example 5.6.3. We take the transformation to be the identity mapping, hence

              [ 1 0 ]
[T]_std^std = [ 0 1 ]

We calculate
i) [T]_S^std from
(1, 2) = α(1, 0) + β(0, 1) = 1(1, 0) + 2(0, 1)
(3, 5) = α(1, 0) + β(0, 1) = 3(1, 0) + 5(0, 1)

                    [ 1 3 ]
Whence [T]_S^std =  [ 2 5 ]

ii) [T]_S′^std from
(1, −1) = α(1, 0) + β(0, 1) = 1(1, 0) − 1(0, 1)
(1, −2) = α(1, 0) + β(0, 1) = 1(1, 0) − 2(0, 1)

                     [  1  1 ]
Whence [T]_S′^std =  [ −1 −2 ]

Then

            [  2  1 ] [ 1 0 ] [ 1 3 ]   [  4  11 ]
[T]_S^S′ =  [ −1 −1 ] [ 0 1 ] [ 2 5 ] = [ −3  −8 ]

where [ 2 1 ; −1 −1 ] = ([T]_S′^std )−1 .

Exercises 5.5

1. Let T be the linear transformation defined by


T (x, y, z) = (x − y, x + 2y − z, 2x + y + z)
Find the matrix of T relative to
i) the standard basis of V3 (R)
ii) the R-basis {v1 , v2 , v3 } for V3 (R), where v1 = (1, 0, 1), v2 = (−2, 1, 1),v3 =
(1, −1, 1).

2. The matrix of a linear transformation T on V3 (R) relative to the standard basis is


 
0 1 1
1 0 1
1 1 0
Find the matrix of T relative to the R-basis {v1 , v2 , v3 }, where
v1 = (1, 0, 1), v2 = (−2, 1, 1),v3 = (1, −1, 1).

3. If {u1 , u2 } and {v1 , v2 , v3 } are R-bases for V2 (R) and V3 (R) respectively and if a linear
transformation T from V2 (R) into V3 (R) is defined by
T u1 = v1 + 2v2 − v3
T u2 = v1 − v2
find the matrix of T relative to these bases. Find also the matrix of T relative to the R-bases
{−u1 + u2 , 2u1 − u2 } and {v1 , v1 + v2 , v1 + v2 + v3 } for V2 (R) and V3 (R) respectively.
What is the relationship between these two matrices?

4. Let T : V4 (R) → V3 (R) be the linear transformation defined by


T (α, β, γ, δ) = (α − β + γ + δ, α + 2β − γ + δ, 3β − 2γ)
Find R-bases for kerT and ImT.

5. Let T : V3 (R) → V2 (R) be the linear transformation defined by


T (α, β, γ) = (α + β − γ, 2α + β)
Find R-bases for kerT and ImT.

6. Find the rank and nullity of a linear transformation from V4 (R) to V3 (R) defined by
T (α, β, γ, δ) = (α − γ + 2δ, 2α + β + 2γ, β + 4γ)


Show that (1, 3, k) is in ImT if and only if k=5.


Find the condition for (1, x, 1, y) to be in kerT.

7. Find the rank and nullity of thelinear transformation


 from V4 (R) to V3 (R) whose matrix rel-
1 2 −1 2
ative to the standard basis is 2 6 3 −3
0 2 5 −7
 
1 1
8. If S = , Find R-bases for KerT and ImT for the linear transformations TA=AS and
1 2
TA=SA.

9. Let V and U be vector spaces, and let T : V → U be a linear transformation from V

into U . Prove that
i) KerT is a subspace of V
ii) ImT is a subspace of U

5.7 Singular and Nonsingular Linear Transformations, Isomorphisms


Consider the following linear mappings:
1) T : R3 → R3 defined by T (x, y, z) = (x, y, 0), the projection of a vector v into the
xy-plane.
The image of T is the entire xy-plane, i.e., vectors of the form (x, y, 0):
ImT = {(a, b, c)|c = 0}, i.e. the xy-plane.
The kernel of T consists of vectors of the form (0, 0, c):
KerT = {(a, b, c)|a = 0 = b}, i.e. the z-axis.
T is an example of a linear transformation which is singular (noninvertible).

2) G : R3 → R3 defined by G(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z), rotation of a

vector v about the z-axis through an angle θ.
We observe that the distance of a vector v from the origin O does not change under the
rotation, and only the zero vector is mapped into the zero vector:
KerG = {0}
Every vector u ∈ R3 is the image of a vector v ∈ R3 (v can be obtained by rotating u back by
the angle θ):
ImG = R3 , the entire space.
G is an example of a linear transformation which is nonsingular (invertible).

A linear mapping T : V → U is said to be singular if there exists v ≠ 0 such that T (v) = 0.

Thus, T : V → U is nonsingular if the zero vector 0 is the only vector whose image under T
is 0 or, if KerT = {0}.

If a linear mapping T : V → U is one-to-one, only 0 ∈ V can map into 0 ∈ U , and so T is


nonsingular, so we can state:

A linear mapping T : V → U is one-to-one if and only if T is nonsingular.


Suppose V has finite dimension and dimV = dimU . Suppose T : V → U is linear. Then T
is called an isomorphism if and only if T is nonsingular.


Example 5.7.1. How to prove that a linear transformation is bijective

Recall that F : A → B is bijective if and only if F is:
1) injective: F (x) = F (y) ⇒ x = y , and
2) surjective: for all b ∈ B there is some a ∈ A such that F (a) = b

e.g. let g(x) = 2f (x) + 3 over R, where f : R → R is bijective.

Is g(x) injective?
Take x, y ∈ R and assume that g(x) = g(y).
Then 2f (x) + 3 = 2f (y) + 3. We can cancel out the 3 and divide by 2, and we get
f (x) = f (y).
Since f is bijective, it is injective, and so x = y.

Is g(x) surjective?
Take some y ∈ R; we want to find x such that y = g(x), that is, y = 2f (x) + 3.
Subtract 3 and divide by 2: (y − 3)/2 = f (x).
Denote w = (y − 3)/2 ; since f is surjective, there is some x such that f (x) = w.
Then g(x) = 2w + 3 = y as required.

Example 5.7.2. The following worked example is a model for Exercises 5.7, which follow.

Let G : R2 → R2 be defined by G(x, y) = (x − y, x − 2y).

a) Show that G is nonsingular

b) Find a formula for G−1 .

Solution

Find KerG by setting G(v) = 0, where v = (x, y):

(x − y, x − 2y) = (0, 0), i.e.  x − y = 0, x − 2y = 0,  or equivalently  x − y = 0, −y = 0

The only solution is x = 0, y = 0.

Hence, G is nonsingular.

To find G−1 :

Method 1

we set G(x, y) = (a, b), so that G−1 (a, b) = (x, y).

We have (x − y, x − 2y) = (a, b), i.e.  x − y = a, x − 2y = b,  whence  y = a − b

and solving for x and y, we get  x = 2a − b,  y = a − b

70 5.7. SINGULAR AND NONSINGULAR LINEAR TRANSFORMATIONS, ISOMORPHISMS Chapter 5


CHAPTER 5. LINEAR MAPPINGS ON VECTOR SPACES

and hence G−1 = (2a − b, a − b), and putting it in terms of (x,y),

G−1 = (2x − y, x − y)

Method 2

Find the matrix representation of G relative to the given basis (in this case, the standard basis):

      [ 1 −1 ]                          [ 2 −1 ]
[G] = [ 1 −2 ]  and the inverse [G]−1 = [ 1 −1 ]

                  [ 2 −1 ] [ x ]
Then G−1 (x, y) = [ 1 −1 ] [ y ] = (2x − y, x − y)
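The formula for G−1 can be checked directly; a small pure-Python sketch:

```python
# Check that the formula for G^{-1} really inverts G on sample points.
def G(x, y):
    return (x - y, x - 2 * y)

def G_inv(x, y):
    return (2 * x - y, x - y)

for v in [(1, 0), (0, 1), (3, -5), (7, 2)]:
    assert G_inv(*G(*v)) == v   # G_inv after G is the identity
    assert G(*G_inv(*v)) == v   # G after G_inv is the identity
print("G_inv is a two-sided inverse of G on these samples")
```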

Exercises 5.7

1. Show that
a) the linear transformations defined below are nonsingular, and
b) give a rule for T −1 like the one which defines T
i) T : R2 → R3 defined by T (x, y) = (x + y, x − 2y, 3x + y).
ii) T : R3 → R3 defined by T (α, β, γ) = (3α − β, α − β + γ, −α + 2β − γ)

2. If {v1 , v2 , v3 , v4 } is an R-basis for the vector space V, for what values of λ is the linear trans-
formation defined by
T v1 = v1 + λv2
T vi = 2vi−1 + λvi (i=2,3,4)
nonsingular?

5.8 Operations with Linear Transformations


We can combine linear mappings in various ways to obtain new linear mappings. For
example, consider two linear mappings F : V → U and G : V → U over a field K. We
can define their sum F + G and the scalar product kF, k ∈ K , as follows:
(F + G)(v) = F (v) + G(v) and (kF )(v) = kF (v)
If F and G are linear, then F + G and kF are also linear, since for any vectors v, w ∈ V and
any scalars a, b ∈ K
(F + G)(av + bw) = F (av + bw) + G(av + bw)
= aF (v) + bF (w) + aG(v) + bG(w)
= a[F (v) + G(v)] + b[F (w) + G(w)]
= a(F + G)(v) + b(F + G)(w)
and (kF )(av + bw) = kF (av + bw)
= k[aF (v) + bF (w)]
= akF (v) + bkF (w) = a(kF )(v) + b(kF )(w)
Thus F + G and kF are linear.


Example 5.8.1. Define F : R3 → R2 and G : R3 → R2 by

F (x, y, z) = (2x, y + z), and

G(x, y, z) = (x − z, y).

Find formulas defining the maps:

a) F + G, b) 3F and c) 2F - 5G.

Solution

a) (F + G)(x, y, z) = F (x, y, z) + G(x, y, z) = (2x, y + z) + (x − z, y) = (3x − z, 2y + z)

b) (3F )(x, y, z) = 3F (x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)

c) (2F − 5G)(x, y, z) = 2F (x, y, z) − 5G(x, y, z) = 2(2x, y + z) − 5(x − z, y) = (4x, 2y +


2z) + (−5x + 5z, −5y) = (−x + 5z, −3y + 2z)
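The pointwise operations above can be sketched in code (plain Python; `add` and `scale` are hypothetical helper names):

```python
# The maps of Example 5.8.1 and the pointwise operations on them.
def F(x, y, z):
    return (2 * x, y + z)

def G(x, y, z):
    return (x - z, y)

def add(F, G):          # (F + G)(v) = F(v) + G(v), componentwise
    return lambda x, y, z: tuple(a + b for a, b in zip(F(x, y, z), G(x, y, z)))

def scale(k, F):        # (kF)(v) = k * F(v)
    return lambda x, y, z: tuple(k * a for a in F(x, y, z))

# Spot-check the three formulas at (x, y, z) = (1, 2, 3):
assert add(F, G)(1, 2, 3) == (3 * 1 - 3, 2 * 2 + 3)        # (3x - z, 2y + z)
assert scale(3, F)(1, 2, 3) == (6, 15)                     # (6x, 3y + 3z)
assert add(scale(2, F), scale(-5, G))(1, 2, 3) == (14, 0)  # (-x + 5z, -3y + 2z)
```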

The collection of all linear mappings from V into U with the above operations of addition
and scalar multiplication forms a vector space over K, denoted by Hom(V,U) (note "Hom"
stands for "homomorphism").

5.8.1 Composition of Linear Mappings


Suppose V, U, and W are vector spaces over the same field K, and suppose F : V → U and
G : U → W are linear mappings. We can picture these mappings as follows:
F G
V −→ U −→ W

The composition function G ◦ F is the mapping from V into W defined by


(G ◦ F )(v) = G(F (v)).
G ◦ F is linear whenever F and G are linear.

Specifically, for any vectors v, w ∈ V and any scalars a, b ∈ K , we have


(G ◦ F )(av + bw) = G(F (av + bw)) = G(aF (v) + bF (w)) = aG(F (v)) + bG(F (w)) =
a(G ◦ F )(v) + b(G ◦ F )(w)
Thus, G ◦ F is linear.

Example 5.8.2. Let F : R3 → R2 and G : R2 → R2 be defined by F (x, y, z) = (2x, y + z) and G(x, y) = (y, x).

Derive formulas defining the mappings: a) G ◦ F and b) F ◦ G.

Solution

a) (G ◦ F )(x, y, z) = G(F (x, y, z)) = G(2x, y + z) = (y + z, 2x)

b) The mapping F ◦ G is not defined, because the image of G is not contained in the domain of F.
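Composition can be checked in the same way; a small sketch (ours, not the module's) using the maps of Example 5.8.2:

```python
# Composition of linear maps: (G o F)(v) = G(F(v)).

def F(v):                 # F : R^3 -> R^2
    x, y, z = v
    return (2 * x, y + z)

def G(w):                 # G : R^2 -> R^2
    x, y = w
    return (y, x)

def compose(G, F):
    # returns the map v |-> G(F(v))
    return lambda v: G(F(v))

GF = compose(G, F)
print(GF((1, 2, 3)))      # G(2, 5) = (5, 2); agrees with the formula (y + z, 2x)
```

Calling compose(F, G) fails at run time for the same reason F ◦ G is undefined: the values of G have only two components, so they cannot be fed to F.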



Chapter 6: Inner Product Spaces
Outputs
1. Understanding of the concept of a vector space which admits an inner product

2. Ability to:

i) determine orthogonality of vectors

ii) orthogonalise/orthonormalise vector sets

6.1 Introduction
There are further concepts in the structure of vector spaces that did not appear in our
investigation in chapter 4, such as "length", "angle" between two vectors, "orthogonality" of
vectors, etc. (although some of these concepts did appear in chapter 4, sections 4.1, 4.2
and 4.4). Here we place an additional structure on a vector space V to obtain an inner product
space.
Also, we will adopt the notation used for vector spaces in chapter 4, i.e.:
u, v, w are vectors in V
a, b, c, k are scalars in K
Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise
stated or implied.

6.2 Preliminaries
Recall that:
a) Referring to fig. 6.1a), the length or norm of a vector v = (α, β, γ) ∈ V3 (R), denoted ‖v‖,
is defined as ‖v‖ = √(OR2 + P R2 ) = √(OQ2 + QR2 + P R2 ) = √(α2 + β 2 + γ 2 )
and
i) if λ ∈ K , ‖λv‖ = |λ|‖v‖
ii) if ‖v‖ = 1, v is called a unit vector (normalised);
every nonzero vector v can be normalised by setting u = v/‖v‖
iii) ‖v‖ = 0 iff v = 0.

b)
If u = (α, β, γ) and v = (α′ , β ′ , γ ′ ) are two vectors in V3 , represented by the points P and Q
respectively (fig. 6.1b), the distance between u and v, denoted d(u, v), is the distance between
P and Q, i.e. length P Q = length OP ′ , where P ′ is the point (α − α′ , β − β ′ , γ − γ ′ ).
d(u, v) = ‖(α − α′ , β − β ′ , γ − γ ′ )‖ = √((α − α′ )2 + (β − β ′ )2 + (γ − γ ′ )2 ) = ‖u − v‖

c) The angle between two vectors u, v is ∠P OQ = θ, such that 0 ≤ θ ≤ π


by the cosine rule, P Q2 = OP 2 + OQ2 − 2·OP ·OQ· cos θ, i.e.


Figure 6.1: a) length of a vector b) distance between two vectors u-v

cos θ = (‖u − v‖2 − ‖u‖2 − ‖v‖2 ) / (−2‖u‖‖v‖)

= [(α − α′ )2 + (β − β ′ )2 + (γ − γ ′ )2 − (α2 + β 2 + γ 2 + α′2 + β ′2 + γ ′2 )] / (−2‖u‖‖v‖)

= (αα′ + ββ ′ + γγ ′ ) / (‖u‖‖v‖)

If u = (α, β, γ) and v = (α′ , β ′ , γ ′ ), then the inner (or dot) product of u and v, denoted (u, v)
or u·v, is defined by (u, v) = αα′ + ββ ′ + γγ ′
From the above, (u, v) = ‖u‖‖v‖ cos θ

Note
i) ‖v‖ = √(v, v)
ii) Two vectors u, v are perpendicular, or orthogonal, if cos θ = 0, i.e. if (u, v) = 0
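These formulas translate directly into code. The sketch below (ours; plain Python, standard inner product on V3 (R)) computes the dot product, norm and angle:

```python
import math

def dot(u, v):
    # (u, v) = sum of coordinate products
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    # ||v|| = sqrt((v, v))
    return math.sqrt(dot(v, v))

def angle(u, v):
    # cos(theta) = (u, v) / (||u|| ||v||), so theta = arccos of that ratio
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

print(dot((2, 1, -1), (1, 0, 2)))    # 2 + 0 - 2 = 0, so the vectors are orthogonal
print(dot((1, 2, 1), (1, 2, 1)))     # 6, so ||(1, 2, 1)|| = sqrt(6)
```

With these three helpers, the exercises below can all be checked numerically.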

Exercises 6.2

1. Which of the following pairs of vectors are orthogonal?

i) (2, -1, 1) and (1, 2, 1)

ii) (2, 1, -3) and (1, 1, 1)

iii) (7, 5, 3) and (1, -2, 1)

2. Find a vector perpendicular to (2, -1, 2) and (1, -1, 2)

3. Find the lengths of the following vectors

i) (1, 2, 1)

ii) (-3, 2, 5)

iii) (1, 0, -1)

4. Find the angle between each of the following pairs of vectors

i) (3, -2, 1) and (1, -1, 1) ii) (2, 1, -1) and (1, 0, 2)


6.3 Euclidean and Unitary Spaces


Definition 6.3.1. An inner product on V is a function which assigns to each ordered pair of
vectors u, v ∈ V , a scalar (u, v) ∈ K , with the following properties:
i) (u + v, w) = (u, w) + (v, w)
ii) (λv, w) = λ(v, w)
iii) (v, w) = (w, v)
iv) (v, v) > 0 if v ≠ 0

We can now state the properties of the inner product defined above, as a basis for the
definition of inner product spaces.

Definition 6.3.2. If u, v, w ∈ V3 (R) and λ ∈ R, then


i) [I]1 : (au1 + bu2 , v) = a(u1 , v) + b(u2 , v) (Linear Property)
ii) [I]2 : (u, v) = (v, u) (Symmetric Property)
iii) [I]3 : (u, u) > 0; and (u, u) = 0 if and only if u = 0 (Positive Definite Property):

The vector space V with an inner product is called an inner product space. A real inner
product space is called a Euclidean space. A complex inner product space is called a unitary
space.

6.4 Examples of Inner Product Spaces


1) If u = (α1 , α2 , . . . , αn ) and v = (β1 , β2 , . . . , βn ) ∈ Vn (R)
we define (u, v) = α1 β1 + α2 β2 + . . . + αn βn
This is an inner product on Vn (R) and hence Vn (R) is a Euclidean space.
This is the standard inner product on Vn (R).
2) If u = (α1 , α2 , . . . , αn ) and v = (β1 , β2 , . . . , βn ) ∈ Vn (C)
we define (u, v) = α1 β̄1 + α2 β̄2 + . . . + αn β̄n
This is an inner product on Vn (C) and hence Vn (C) is a unitary space.
This is the standard inner product on Vn (C).

3) If f, g ∈ C[a, b], define

(f, g) = ∫ₐᵇ f (x)g(x) dx

Properties [I1 ] − [I3 ] of an inner product space follow from the linearity and positivity of the integral.
This is the standard inner product on C[a, b].

4) If u = (α1 , α2 , . . . , αn ) and v = (β1 , β2 , . . . , βn ) ∈ Vn (R)


define (u, v) = α1 β1 + 2α2 β2 + . . . + αn βn
This can be verified to be an inner product on Vn (R).

Henceforth and unless specified otherwise, V will denote an inner product space.

Example 6.4.1. The following two worked examples will be useful for Exercises 6.4 below.


1) Verify that the following defines an inner product in R2

(u, v) = x1 y1 − x1 y2 − x2 y1 + 3x2 y2 , where u = (x1 , x2 ), v = (y1 , y2 )

Solution

i) Method 1

Let w = (z1 , z2 )

u + w = (x1 + z1 , x2 + z2 )

(u + w, v) = ((x1 + z1 , x2 + z2 ), (y1 , y2 ))

= (x1 + z1 )y1 − (x2 + z2 )y1 − (x1 + z1 )y2 + 3(x2 + z2 )y2

= x1 y1 + z1 y1 − x2 y1 − z2 y1 − x1 y2 − z1 y2 + 3x2 y2 + 3z2 y2

= x1 y1 − x2 y1 − x1 y2 + 3x2 y2 + z1 y1 − z2 y1 − z1 y2 + 3z2 y2

= (u, v) + (w, v)

(u, v) = x1 y1 − x2 y1 − x1 y2 + 3x2 y2 = y1 x1 − y2 x1 − y1 x2 + 3y2 x2 = (v, u)

(u, u) = x1 x1 − x2 x1 − x1 x2 + 3x2 x2

= x1 2 − 2x1 x2 + x2 2 + 2x2 2

= (x1 − x2 )2 + 2x2 2 > 0 if u ≠ 0.

Hence (u, v) is an inner product.

ii) Method 2

Using matrices. We can write (u, v) in matrix notation as follows:


   
(u, v) = uT Av = (x1 x2 ) [[1, −1], [−1, 3]] (y1 , y2 )T

Because A is real and symmetric, (u, v) = (v, u); we need only show that A is positive definite.
The diagonal elements 1 and 3 are positive, and the determinant |A| = 3 − 1 = 2 is positive,
so A is positive definite (refer chapter 7, section 7.8.5).

Hence (u, v) is an inner product.
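As a quick numerical spot-check of Method 1 (our sketch; it samples particular vectors rather than proving the identities in general):

```python
# The form verified above: (u, v) = x1*y1 - x1*y2 - x2*y1 + 3*x2*y2.

def ip(u, v):
    x1, x2 = u
    y1, y2 = v
    return x1 * y1 - x1 * y2 - x2 * y1 + 3 * x2 * y2

u, v = (1, 2), (3, 4)
print(ip(u, v) == ip(v, u))   # True: symmetry holds on this sample
print(ip(u, u))               # (1 - 2)**2 + 2 * 2**2 = 9 > 0: positive on this sample
```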

2) Determine whether the following defines an inner product in R2

(u, v) = α1 β1 − α2 β1 + α1 β2 + 2α2 β2 , where u = (α1 , α2 ), v = (β1 , β2 )

Solution

It is clear that (u, v) 6= (v, u)


since (u, v) = α1 β1 − α2 β1 + α1 β2 + 2α2 β2

while (v, u) = α1 β1 − β2 α1 + α2 β1 + 2α2 β2

The two functions differ in the terms: e.g. α2 β1 is negative in (u, v), but positive in (v, u).

Hence the function is not an inner product

Exercises 6.4

1. Which of the following are inner products on V2 (R), if u = (α1 , α2 ), v = (β1 , β2 ),


i) (u, v) = α1 β1 + 2α2 β1 + α1 β2 − α2 β2
ii) (u, v) = α1 β1 − α2 β1 − α1 β2 + 2α2 β2
iii) (u, v) = α1 β1 − α2 β1 + α1 β2 + 2α2 β2
iv) (u, v) = α12 β12 + α22 β22

2. Which of the following are inner products on V3 (R)?


if u = (α1 , α2 , α3 ), v = (β1 , β2 , β3 ),
i) (u, v) = α1 β1 + 2α2 β2 + 3α3 β3 + α1 β2 + α2 β1 + α1 β3 + α3 β1 + 2α2 β3 + 2α3 β2
ii) (u, v) = α1 β1 + 2α2 β2 + 3α3 β3 + 2α1 β2 + 2α1 β3 + 4α2 β3
iii) (u, v) = α1 β1 + α3 β3 − α1 β2 − α2 β1 − α1 β3 − α3 β1 + α2 β3 + α3 β2

3. Which of the following are inner products on C[−1, 1], the vector space of real valued con-
tinuous functions defined on [−1, 1], if f, g ∈ C[−1, 1]?
i) (f, g) = ∫₋₁¹ f (x)g(x) dx
ii) (f, g) = ∫₋₁¹ x2 f (x)g(x) dx

4. a) Verify that the following is an inner product on R2 , where u = (x1 , x2 ) and v = (y1 , y2 )
(u, v) = x1 y1 − 2x1 y2 − 2x2 y1 + 5x2 y2
b) Consider the vectors u = (1, −3) and v = (2, 5) in R2
Find
i) (u, v) with respect to the standard inner product in R2
ii) (u, v) with respect to the inner product in R2 in a) above

5. Show that each of the following is not an inner product on R3 , where u = (x1 , x2 , x3 ) and
v = (y1 , y2 , y3 ):
a) (u, v) = x1 y1 + x2 y2
b) (u, v) = x1 y2 x3 + y1 x2 y3 .

6. Find the values of k so that the following is an inner product on R2 , where u = (x1 , x2 )
and v = (y1 , y2 )

6.5 Orthogonal Vectors


Consider u = (2, 1, −1), v = (1, 0, 2). Find (u, v)


Solution
(u, v) = 2·1 + 1·0 + (−1)·2 = 0
Conclusion: u and v are orthogonal

Let V be an inner product space.

Definition 6.5.1. If u, v ∈ V and (u, v) = 0, then u and v are said to be orthogonal (or
perpendicular) to each other. A subset S of V is called an orthogonal set if the elements of S
are mutually orthogonal. An orthogonal set is called an orthonormal set if each vector has
unit length, i.e. if kvk = 1.

Example 6.5.1. 1) Refer to section 4.9 (page 39) for examples of bases

The standard bases for Vn (R) and Vn (C) are orthonormal relative to the standard inner
product.

2) Exercise

In V4 (R) with standard inner product, find the vectors orthogonal to u = (1, 1, 2, −1)

Solution

Let v = (α1 , α2 , α3 , α4 )

If v is orthogonal to u, then (u, v) = 0

(1, 1, 2, −1)·(α1 , α2 , α3 , α4 ) = 0

α1 + α2 + 2α3 − α4 = 0

The set of all vectors v orthogonal to u is given by all the solutions to this linear equation,
e.g.

(−1, 1, 0, 0), (0, −2, 1, 0), (0, 1, 0, 1) is an R-basis for the solution space to the linear equa-
tion and all R-linear combinations of these vectors are orthogonal to u.

Lemma. An orthogonal set of nonzero vectors in an inner product space V is linearly
independent.

Proof. Let {v1 , v2 , . . . , vn } be an orthogonal set of nonzero vectors in V.


Consider α1 v1 + α2 v2 + . . . + αn vn = 0, αi ∈ K , i = 1, 2, . . . , n.
For 1 ≤ k ≤ n,

0 = (Σᵢ αi vi , vk ) = Σᵢ αi (vi , vk ) = αk (vk , vk )

Since vk ≠ 0, (vk , vk ) ≠ 0, so αk = 0. Hence {v1 , v2 , . . . , vn } is linearly independent over
K. 

Given an arbitrary basis of an inner product space V, is it possible to find an orthonormal ba-
sis {u1 , u2 , . . . , un } of V?

Answer


Consider the set {v1 = (1, 1, 1), v2 = (0, 1, 1), v3 = (0, 0, 1)}

We can normalise v1 as follows:

u1 = v1 /‖v1 ‖ = (1, 1, 1)/√3 = (1/√3, 1/√3, 1/√3)

Next we set w2 = v2 − (v2 , u1 )u1

= (0, 1, 1) − (2/√3)(1/√3, 1/√3, 1/√3) = (−2/3, 1/3, 1/3)

We then normalise w2 ,
i.e. u2 = w2 /‖w2 ‖ = (−2/√6, 1/√6, 1/√6)

Finally we set w3 = v3 − (v3 , u1 )u1 − (v3 , u2 )u2

= (0, 0, 1) − (1/√3)(1/√3, 1/√3, 1/√3) − (1/√6)(−2/√6, 1/√6, 1/√6) = (0, −1/2, 1/2)

Then we normalise w3 :
u3 = w3 /‖w3 ‖ = (0, −1/√2, 1/√2)

Hence an orthonormal basis of R3 is

{u1 , u2 , u3 } = {(1/√3, 1/√3, 1/√3), (−2/√6, 1/√6, 1/√6), (0, −1/√2, 1/√2)}
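The computation above can be automated. Below is a classical Gram–Schmidt sketch in Python (ours, not the module's; it assumes the standard inner product and linearly independent inputs):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    # Orthonormalise a list of linearly independent vectors (classical Gram-Schmidt).
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(v, u)                              # component of v along u
            w = [wi - c * ui for wi, ui in zip(w, u)]  # subtract that component
        n = math.sqrt(dot(w, w))                       # normalise the residual
        basis.append(tuple(wi / n for wi in w))
    return basis

u1, u2, u3 = gram_schmidt([(1, 1, 1), (0, 1, 1), (0, 0, 1)])
print(u3)   # approximately (0, -1/sqrt(2), 1/sqrt(2)), as computed above
```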

6.6 Gram-Schmidt Orthogonalisation Procedure


Theorem 6.6.1. Every finite dimensional inner product space has a basis consisting of
orthonormal vectors

Alternative wording of theorem 6.6.1

Theorem 6.6.2. Let {v1 , v2 , . . . , vn } be an arbitrary basis of an inner product space V.
Then there exists an orthonormal basis {u1 , u2 , . . . , un } of V, such that the transition matrix
from the vi to the ui is triangular, i.e. for i = 1, 2, . . . , n, ui = αi1 v1 + αi2 v2 + . . . + αii vi .

Proof. Set u1 = v1 /‖v1 ‖; {u1 } is normalised.
Next set w2 = v2 − (v2 , u1 )u1 and u2 = w2 /‖w2 ‖.
Since (w2 , u1 ) = (v2 , u1 ) − (v2 , u1 )(u1 , u1 ) = 0, w2 (and hence u2 ) is orthogonal to u1 ; {u1 , u2 } is orthonormal.
Next set w3 = v3 − (v3 , u2 )u2 − (v3 , u1 )u1 and u3 = w3 /‖w3 ‖.
Similarly w3 (and so u3 ) is orthogonal to u1 and u2 ; {u1 , u2 , u3 } is orthonormal.
In general, after getting {u1 , u2 , . . . , ui },
set wi+1 = vi+1 − (vi+1 , ui )ui − (vi+1 , ui−1 )ui−1 − · · · − (vi+1 , u1 )u1 and ui+1 = wi+1 /‖wi+1 ‖.
Note that wi+1 ≠ 0 since vi+1 ∉ L(v1 , v2 , . . . , vi ).
As above, the set {u1 , u2 , . . . , ui+1 } is orthonormal.


By induction, we obtain an orthonormal set {u1 , u2 , . . . , un } which is independent, and


hence a basis ov V. 

Example 6.6.1. Extend the orthonormal set {v1 = (1/3)(2, 0, −1, 2), v2 = (1/3)(2, 1, 0, −2)} to
give an orthonormal basis for V4 (R)

Solution

Note that ‖v1 ‖ = 1 = ‖v2 ‖

{v1 = (1/3)(2, 0, −1, 2), v2 = (1/3)(2, 1, 0, −2), v3 = (1, 0, 0, 0), v4 = (0, 0, 0, 1)} is an R-basis
for V4 (R)

To find an orthonormal set,

we set u1 = v1 and u2 = v2 (already orthonormal);

set w3 = v3 − (v3 , u2 )u2 − (v3 , u1 )u1 = (1/9)(1, −2, 2, 0), so u3 = (1/3)(1, −2, 2, 0);

then w4 = v4 − (v4 , u3 )u3 − (v4 , u2 )u2 − (v4 , u1 )u1 = (1/9)(0, 2, 2, 1), so u4 = (1/3)(0, 2, 2, 1).

{(1/3)(2, 0, −1, 2), (1/3)(2, 1, 0, −2), (1/3)(1, −2, 2, 0), (1/3)(0, 2, 2, 1)} is an orthonormal basis for V4 (R).

Exercises 6.6

1. Show that each of the following pairs of vectors are orthogonal


i) (2, 3, −2, 1, 0, 1) and (2, −1, 1, 0, 2, 1) in V6 (R)
ii) (i, 1, −i) and (1 − i, 2, 1 + i) in V3 (C)
iii) 1 and cos πx in C[0, 1]

2. Apply the Gram–Schmidt orthogonalization process to orthogonalize and then orthonormalize
i) v1 = (1, −1, 1), v2 = (2, 1, 1), v3 = (1, 0, 1) in V3 (R)
ii) v1 = (1, 1, 1, 1), v2 = (1, 2, 4, 5), v3 = (1, −3, −4, −2) in V4 (R)

3. Consider the subspace U of R4 spanned by the vectors


v1 = (1, 1, 1, 1), v2 = (1, 1, 2, 4), v3 = (1, 2, −4, −3)
Find
i) an orthogonal basis of U, ii) an orthononormal basis of U

4. Let V be the subspace of C[0,1] containing real polynomials of degree at most 3. Apply
the Gram–Schmidt orthogonalization process to the R-basis {1, x, x2 , x3 } for V.



Chapter 7: Characteristic Roots and Vectors
Outputs

1) Eigenvectors and Eigenvalues understood

2) Ability to:

i) use Eigenvectors and Eigenvalues to diagonalise square matrices

ii) use Eigenvectors and Eigenvalues to diagonalise orthogonal and unitary matrices

Recall from section 2.6, page 11, that we can form polynomials in a matrix A: given
f (x) = a0 + a1 x + ... + an xn , where the ai are scalars,
we define f (A) to be the matrix f (A) = a0 I + a1 A + ... + an An
In the case where f (A) is the zero matrix, A is called a zero or root of the polynomial
f (x).

Now suppose that T : V → V is a linear operator on a vector space V. We can define f(T) in
the same way as we did for matrices:
f (T ) = an T n + . . . + a1 T + a0 I , where I is now the identity mapping.
We also say that T is a zero or root of f (t) if f (T ) = 0; the zero mapping.

Example 7.0.1. Let T : V → V be defined by T (x, y) = (x + y, x) and let f (t) = t2 − 2t + 3.

Find f (T )(x, y)

Solution
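The solution is left to the reader in the module; one way to evaluate f (T ) numerically (our sketch) is to apply T repeatedly and combine the results:

```python
def T(v):
    # T(x, y) = (x + y, x)
    x, y = v
    return (x + y, x)

def f_of_T(v):
    # f(T) = T^2 - 2T + 3I, applied to v component-wise
    Tv = T(v)
    TTv = T(Tv)
    return tuple(a - 2 * b + 3 * c for a, b, c in zip(TTv, Tv, v))

print(f_of_T((1, 0)))   # T(1,0) = (1,1), T^2(1,0) = (2,1), so (2-2+3, 1-2+0) = (3, -1)
print(f_of_T((0, 1)))   # (-1, 4)
```

Both values are consistent with the closed form f (T )(x, y) = (3x − y, −x + 4y), which can be derived by hand.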

7.1 Eigenvectors and Eigenvalues


Let T : V → V be a linear transformation from V to itself.
Some possible simple examples:
a) Identity transformation
T = Iv , so that T (v) = v , v ∈ V
b) Scaling or dilation
T = λIv , so that T (v) = λv , v ∈ V , λ ∈ K .

In general linear transformations are determined by what they do to a basis. The theory of
eigenvectors and eigenvalues helps us understand to what extent (and how) a linear
transformation can be understood as a scaling in various directions.

Definition 7.1.1. An eigenvector of a linear transformation T is a nonzero vector v ∈ V

such that T (v) = λv , for some scalar λ. The scalar λ is known as the eigenvalue
corresponding to v.


Example 7.1.1. Consider the linear transformation T : R2 → R2 defined by T (x, y) = (5x, 3y)

v = (1, 0) is an eigenvector of T with eigenvalue 5:

T (v) = T (1, 0) = (5, 0) = 5v

w = (0, 1) is an eigenvector of T with eigenvalue 3:

T (w) = T (0, 1) = (0, 3) = 3w

Generally v = (x, 0), x ≠ 0, is an eigenvector of T with eigenvalue 5. Similarly v = (0, y), y ≠ 0, is an
eigenvector of T with eigenvalue 3

Note
a) each eigenvector has a unique eigenvalue associated to it.
b) each eigenvalue has more than one eigenvector associated to it.
c) eigenvector (German: eigen = own): a vector which keeps its own direction when acted
upon by T. Other names: principal value, proper value, characteristic value.

Observe that T (v) = λv if and only if (T − λI)v = 0

Theorem 7.1.1. λ is an eigenvalue of an operator T if and only if the kernel of (T − λI) is


nontrivial (i.e. if the operator T − λI is singular).

Proof. For any v ≠ 0,

T (v) = λv iff T (v) − λI(v) = 0
iff (T − λI)(v) = 0
i.e. T − λI is singular. 

The set of all vectors v (including 0) such that T v = λv is called the eigenspace of T
corresponding to λ.
A scalar λ is an eigenvalue of an n×n matrix A when det(A − λI) = 0.

Recall: for a square matrix A − λI , dim(ker(A − λI)) ≠ 0 if and only if A − λI is not invertible, i.e.
if and only if its determinant is 0.

Thus to find the eigenvalues of A, we are looking for scalars λ such that
det(A − λI) = 0
det(A − λI) will always be a polynomial of degree n in λ.

Definition 7.1.2. The matrix A − λI is called the characteristic matrix of A.


 
A − λI = [[a11 − λ, a12 , . . . , a1n ], [a21 , a22 − λ, . . . , a2n ], . . . , [an1 , an2 , . . . , ann − λ]]

Its determinant, ∆(λ) = det(A − λI), which is a polynomial in λ, is called the characteristic
polynomial of the matrix A.
We also call ∆(λ) = det(A − λI) = 0 the characteristic equation of A.

The following is one of the most important theorems of linear algebra.

Theorem 7.1.2. (Cayley–Hamilton) Every matrix A is a root of its characteristic polynomial.

Example 7.1.2. 1) Let A = [[2, 3], [0, −1]]

det(A − λI) = |[[2 − λ, 3], [0, −1 − λ]]| = (2 − λ)(−1 − λ) = 0, or λ2 − λ − 2 = 0

As expected from the Cayley–Hamilton theorem, A is a zero of ∆(λ):

∆(A) = A2 − A − 2I = [[4, 3], [0, 1]] − [[2, 3], [0, −1]] − [[2, 0], [0, 2]] = [[0, 0], [0, 0]]

We see that the above characteristic polynomial has zeros λ = 2, −1.

Once we know the eigenvalues of a matrix A, we can find bases for the kernel, ker(A−λI).
ker(A − λI) is called the eigenspace corresponding to the eigenvalue λ.

Eigenvalues tell us that the linear transformation is scaling by the amount λ. Eigenvectors
tell us in which directions the scalings are done.

For the above example, we want bases for the eigenspaces ker(A − 2I) and ker(A + I).

E2 : (A − 2I)X = [[0, 3], [0, −3]] (x, y)T = (0, 0)T , i.e. 0x + 3y = 0 and 0x − 3y = 0

{(1, 0)T } is a basis for the eigenspace E2 corresponding to the eigenvalue λ = 2.

E−1 : (A + I)X = [[3, 3], [0, 0]] (x, y)T = (0, 0)T , i.e. 3x + 3y = 0


{(−1, 1)T } is a basis for the eigenspace E−1 corresponding to the eigenvalue λ = −1.
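A small plain-Python sketch (ours) confirming the eigenpairs and the Cayley–Hamilton identity for this A:

```python
def matvec(A, v):
    # matrix times column vector
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[2, 3], [0, -1]]

# check A v = lambda v for the eigenpairs found above
for lam, v in [(2, (1, 0)), (-1, (-1, 1))]:
    assert matvec(A, v) == tuple(lam * x for x in v)

# Cayley-Hamilton: A^2 - A - 2I = 0
A2 = matmul(A, A)
delta = [[A2[i][j] - A[i][j] - 2 * (i == j) for j in range(2)] for i in range(2)]
print(delta)   # [[0, 0], [0, 0]]
```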

   
2) Let B = [[3, 0, −5], [0, 3, 0], [0, 0, 0]], det(B − λI) = |[[3 − λ, 0, −5], [0, 3 − λ, 0], [0, 0, −λ]]|

The characteristic equation of B is (3 − λ)2 (−λ) = 0,

and this polynomial has zeros λ = 3, with algebraic multiplicity 2, and λ = 0 with algebraic
multiplicity 1.

E3 : (B − 3I)X = [[0, 0, −5], [0, 0, 0], [0, 0, −3]] (x, y, z)T ∼ [[0, 0, 1], [0, 0, 0], [0, 0, 0]] (x, y, z)T , or 0x + 0y + 1z = 0

This gives z = 0, with x and y as free variables, so we take x = 1, y = 0 and x = 0, y = 1:

E3 = span{(1, 0, 0)T , (0, 1, 0)T }

E0 : (B − 0I)X ∼ [[1, 0, −5/3], [0, 1, 0], [0, 0, 0]] (x, y, z)T = (0, 0, 0)T , and E0 = span{(5/3, 0, 1)T }

Note

dim[E3 ] = 2, algebraic multiplicity (algmult) of λ = 3 is 2

dim[E0 ] = 1, algebraic multiplicity of λ = 0 is 1

dim[Eλ ] is not always equal to algmult(λ), but

1 ≤ dim[Eλ ] ≤ algmult(λ) always

dim[Eλ ] is also called geometric multiplicity of λ.


3) Let C = [[1, 1], [0, 1]]

λ = 1, algmult(1) = 2

C − I = [[0, 1], [0, 0]], and E1 = span{(1, 0)T }


dim[E1 ] = 1, algmult(1)=2, dim[E1 ] < algmult(1)


4) Let D = [[0, −1], [1, 0]]; det(D − λI) = λ2 + 1 = 0, and λ = ±i

D − iI = [[−i, −1], [1, −i]] ∼ [[1, −i], [0, 0]], i.e. x − iy = 0, and {(i, 1)T } is a basis for Ei

D + iI = [[i, −1], [1, i]] ∼ [[1, i], [0, 0]], i.e. x + iy = 0, and {(−i, 1)T } is a basis for E−i

Theorem 7.1.3. Nonzero eigenvectors belonging to distinct eigenvalues are linearly independent.

Exercises 7.1

1. Find the characteristic polynomial, eigenvalues and eigenvectors of the matrices


   
i) [[1, 2], [2, 1]], ii) [[1, 0, −1], [1, 2, 1], [2, 2, 3]], iii) [[0, 1, 0], [0, 0, 1], [1, −3, 3]]

2. a) True or False:
If A and B are similar matrices, say B = P −1 AP , where P is invertible, then A and B have
the same characteristic polynomial.
b) True or False:
Matrices A and B can have different characteristic polynomials (and so be nonsimilar
matrices) but may have the same minimal polynomial.

7.2 Similarity of Matrices


A matrix A may be a matrix representation of a linear transformation T : V → V with respect to
a given basis, i.e. some basis S of V, such that A = [T ]S
Suppose S′ is another basis of V; then B = [T ]S′
What is the relationship between A and B?

They must be similar ’to some degree’ because they represent the same linear
transformation, only with respect to different bases.
It is intuitive that they have the same: rank, nullity, determinant, eigenvalues.

We shall say that nxn matrices A and B are similar if they represent the same linear


transformation under different choices of bases.


We say A and B are similar if there is some invertible matrix P, such that
A = P −1 BP or B = P −1 AP

7.3 Diagonalisation of Square Matrices and Linear Transformations


A linear transformation T : V → V is said to be diagonalisable if there is some basis S of V
such that [T ]S is a diagonal matrix.

A square matrix A is diagonalisable if the corresponding linear transformation for which A is


the standard matrix is diagonalisable.

We are saying that if A is diagonalisable, then A is similar to a diagonal matrix.


If a matrix/linear transformation is diagonalisable it tells us how the scalings are done in
different directions.

Question
When is a square matrix diagonalisable?
Answer
An n×n matrix A is diagonalisable if and only if there is a basis of Rn consisting of
eigenvectors of A.

Theorem 7.3.1. A linear operator T : V → V can be represented by a diagonal matrix B if


and only if V has a basis consisting of eigenvectors of T. In this case the diagonal elements
of B are the corresponding eigenvalues.

Alternative form of theorem 7.3.1

Theorem 7.3.2. An n-square matrix A is similar to a diagonal matrix B if and only if A has n
linearly independent eigenvectors. In this case the diagonal elements of B are the
corresponding eigenvalues.

In this theorem, we have B = P −1 AP where A is the matrix representation of T, P is the


matrix of the n linearly independent eigenvectors of A and B is a diagonal matrix.

Proof. (of theorem 7.3.1) Suppose A is diagonalisable, then A = P DP −1 , where D is a

diagonal matrix and P is invertible.
The columns of P form a basis for Rn .
Let P = (v1 , v2 , . . . , vn ) and let the diagonal entries of D be λ1 , λ2 , . . . , λn
Since AP = P D (note: AP = (P DP −1 )P = P D),
AP = A(v1 , v2 , . . . , vn ) = (Av1 , Av2 , . . . , Avn ) and P D = (λ1 v1 , λ2 v2 , . . . , λn vn )
We see that the columns of P are eigenvectors of A, with corresponding eigenvalues λ1 , λ2 , . . . , λn .

Now suppose S = {v1 , v2 , . . . , vn } is a basis of Rn consisting of eigenvectors of A.


Let λ1 , λ2 , . . . , λn be the corresponding eigenvalues.

Let P = (v1 , v2 , . . . , vn ), and let D be the diagonal matrix with diagonal entries λ1 , λ2 , . . . , λn .
P is invertible.
Further, AP = (Av1 , Av2 , . . . , Avn ) = (λ1 v1 , λ2 v2 , . . . , λn vn ) = P D, and so A = P DP −1 . 

Theorem 7.3.1 implies that if A is diagonalisable, then the characteristic polynomial of A

factors completely into linear factors.

1. Consider C = [[1, 1], [0, 1]]; the characteristic polynomial of C is (1 − λ)2 = 0

The only eigenvalue is λ = 1, and {(1, 0)T } is a basis for E1
We have only one linearly independent eigenvector and so C is not diagonalisable.

A is diagonalisable iff the sum of the dimensions of the eigenspaces is n.


  (1 0)
3 0 −5
2. Consider B = 0 3 0, λ(3 − λ) = 0 and λ = 0, 3. 0 , 1 is basis for E3
0 0 0 0 0
( 5 )
3
and  0 is basis for E0
1
Note
Algmut(λ3 ) = 2
dim[E3 ] could be 1 or 2
since dim[E3 ] = 2, B is diagonalisable, because dim[E3 ] + dim[E0 ] = 3 = n
If dim[E
 3 ] was 
1, B wouldnot be diagonalisable
5
1 0 − 53
  
1 0 3 3 0 0
P = 0 1 0; D = 0 3 0; P −1 = 0 1 0; B = P DP −1
0 0 1 0 0 0 0 0 1
3. D = [[0, −1], [1, 0]], λ2 + 1 = 0, λ = ±i; C-basis = {(i, 1)T , (−i, 1)T }
algmult(real eigenvalues) = 0, so D is not diagonalisable over R. There is no invertible matrix P
and diagonal matrix E, both with real entries, such that D = P EP −1 .
But there are such P and E if we allow their entries to be complex.
Sum of dimensions of eigenspaces = 2 = n.
P = [[i, −i], [1, 1]]; E = [[i, 0], [0, −i]]; P −1 = (1/2)[[−i, 1], [i, 1]]; D = P EP −1
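The complex diagonalisation can be verified directly with Python's built-in complex numbers (our sketch; 1j denotes i):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

D    = [[0, -1], [1, 0]]
P    = [[1j, -1j], [1, 1]]               # columns are the eigenvectors (i,1), (-i,1)
E    = [[1j, 0], [0, -1j]]               # eigenvalues i, -i on the diagonal
Pinv = [[-0.5j, 0.5], [0.5j, 0.5]]       # (1/2) [[-i, 1], [i, 1]]

R = matmul(matmul(P, E), Pinv)
print(R)   # equals D = [[0, -1], [1, 0]], up to the complex-float representation
```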

Exercises 7.3

Find an invertible matrix P such that P −1 AP is a diagonal matrix if A is:

 
1 0 −1
1. 1 2 1,
2 2 3


 
1 −1 −1
2. 1 −1 0,
1 0 −1
 
3. A = [[2, 1, 0, 0], [2, 1, 0, 0], [0, 0, 0, 3], [2, 1, 3, 0]].

7.4 Minimum Polynomial


Definition 7.4.1. Let A be an n-square matrix over a field K. There are nonzero polynomials
f (λ) for which f (A) = 0, e.g. the characteristic polynomial of A.
Among such polynomials, the monic polynomial (monic: leading coefficient 1) of lowest
degree m(λ) is called the minimum polynomial of A. Such a polynomial is unique.

If a matrix A has distinct eigenvalues, then the minimum and characteristic polynomial of A
coincide. A matrix A is diagonalisable if and only if its minimum polynomial factors into
distinct linear factors.

Example 7.4.1. Find the minimum polynomial and characteristic polynomial of


 
3 0 −5
A = 0 3 0
0 0 0

We saw in section 7.1, page 81, that the characteristic equation of A is (3 − λ)2 (−λ) = 0;

the possible minimum polynomial of A is one of

i) λ(3 − λ), ii) λ(3 − λ)2 , iii) (3 − λ)2 .

We determine the minimum polynomial by elimination:

(A − 0I)(A − 3I) = [[3, 0, −5], [0, 3, 0], [0, 0, 0]] [[0, 0, −5], [0, 0, 0], [0, 0, −3]] = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

so λ(3 − λ) is the minimum polynomial of A. Since the minimum polynomial factors

into distinct linear factors, A is diagonalisable. There is no need to test λ(3 − λ)2 and (3 − λ)2 ,
since we already found the minimum polynomial.
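The elimination step is ordinary matrix arithmetic; a sketch of the check (ours):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[3, 0, -5], [0, 3, 0], [0, 0, 0]]
A_minus_3I = [[A[i][j] - 3 * (i == j) for j in range(3)] for i in range(3)]

# the candidate of lowest degree, lambda(lambda - 3), annihilates A,
# so it is the minimum polynomial
print(matmul(A, A_minus_3I))   # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```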

Exercises 7.4.1

1. Find the minimum polynomial of the following matrices


 
  1 1 0
1 −1
i) , ii) 0 1 0
−1 1
0 0 2


2. If a matrix A has the characteristic polynomial (λ − 1)3 (λ + 1)2 (λ − 3), find all the
possible minimum polynomials of A.

3. By calculating the minimum polynomial, determine which of the following matrices are
diagonalisable

i) [[4, −1], [1, 2]], ii) [[1, −1, −1], [0, 3, 2], [0, −1, 0]], iii) [[2, 0, −1], [−1, 2, 2], [1, −1, −1]], iv) [[0, 0, 3], [0, 4, 0], [3, 0, 0]],
v) [[0, 0, 1], [0, 1, 0], [1, 0, 0]], vi) [[1/2, 0, 1/2], [0, 1, 0], [1/2, 0, 1/2]]

7.5 Diagonalisation of Symmetric Matrices


7.5.1 Real Symmetric and Hermitian Matrices
Conjugate Transpose
The conjugate transpose of a complex matrix A, denoted A∗ , is given by
A∗ = (Ā)T , where the entries of Ā are the complex conjugates of the corresponding entries of A.

 
Example 7.5.1. Determine A∗ for the matrix A = [[3 + 7i, 0], [2i, 4 − i]]

Solution

Ā = [[3 − 7i, 0], [−2i, 4 + i]]

A∗ = (Ā)T = [[3 − 7i, −2i], [0, 4 + i]]

7.5.2 Properties of the Conjugate Transpose


1) (A∗ )∗ = A
2) (A + B)∗ = A∗ + B ∗
3) (kA)∗ = k̄A∗
4) (AB)∗ = B ∗ A∗

7.6 Orthogonal and Unitary Matrices


Definition 7.6.1. a) A real matrix A for which AT A = AAT = I , or equivalently AT = A−1 , is
called an orthogonal matrix.

b) A complex matrix A for which A∗ = A−1 , or equivalently AA∗ = A∗ A = I , is called a

unitary matrix.

e.g. A = (1/2)[[1 + i, 1 − i], [1 − i, 1 + i]], A∗ = (1/2)[[1 − i, 1 + i], [1 + i, 1 − i]], AA∗ = (1/4)[[4, 0], [0, 4]] = I2
An n×n matrix A is unitary iff its row (column) vectors form an orthonormal set in C n .


c) A square matrix A, for which A = A∗ , is called a Hermitian matrix.

We can recognise symmetric and Hermitian matrices by inspection.


A square matrix A is Hermitian iff:
1) the entries on the main diagonal of A are real
2) the entry aij is the complex conjugate of aji
A = [[1, 3 − i], [3 + i, i]] is not Hermitian: there is an imaginary entry on its main diagonal.

B = [[0, 3 − 2i], [3 − 2i, 4]] is symmetric but not Hermitian: a12 is not the complex conjugate of a21 .

C = [[3, 2 − i, −3i], [2 + i, 0, 1 − i], [3i, 1 + i, 0]] is Hermitian.

D = [[−1, 2, 3], [2, 0, −1], [3, −1, 4]] is Hermitian. All real symmetric matrices are Hermitian.

The following conditions for a matrix A are equivalent:


i) A is unitary (orthogonal)
ii) the rows of A form an orthonormal set
iii) the columns of A form an orthonormal set
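These conditions are easy to test numerically. A sketch (ours) using Python's complex arithmetic, applied to the unitary example and the Hermitian matrix D above:

```python
def conj_transpose(A):
    # A* : transpose, then conjugate each entry
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# the unitary example from the text: A = (1/2)[[1+i, 1-i], [1-i, 1+i]]
A = [[0.5 * (1 + 1j), 0.5 * (1 - 1j)], [0.5 * (1 - 1j), 0.5 * (1 + 1j)]]
AAstar = matmul(A, conj_transpose(A))
print(AAstar)   # the 2x2 identity (as complex floats), so A is unitary

# the Hermitian matrix D from the text
D = [[-1, 2, 3], [2, 0, -1], [3, -1, 4]]
print(D == conj_transpose(D))   # True: D equals its conjugate transpose
```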

Exercises 7.6

1. Determine the conjugate transpose of the given matrix.


   
i) [[i, −i], [2, 3i]], ii) [[1 + 2i, 2 − i], [1, 1]]
 
2. Show that matrix A is unitary, where A = (1/2)[[1 + i, 1 − i], [1 − i, 1 + i]].

3. Explain why the given matrix is not unitary.


   
i) [[i, 0], [0, 0]], ii) [[1, i], [i, −1]]

4. Determine whether the given matrix is or is not Hermitian.


 
i) [[0, 2 + i, 1], [2 − i, 1, 0], [1, 0, 1]], ii) [[1, 0], [0, 1]], iii) [[2 + i, 3 − i], [2 − i, 3 + i]]

7.7 Hermitian and Symmetric Matrices


 
Example 7.7.1. 1) Find a real orthogonal matrix P for which P T AP is diagonal, if A = [[1, 2], [2, 1]]


No.  A is a Symmetric Matrix (real)              A is a Hermitian Matrix (complex)

1.   Eigenvalues of A are real                   Eigenvalues of A are real

2.   Eigenvectors corresponding to distinct      Eigenvectors corresponding to distinct
     eigenvalues are orthogonal                  eigenvalues are orthogonal

3.   A is orthogonally diagonalisable: there     A is unitarily diagonalisable: there
     exists an orthogonal matrix P such that     exists a unitary matrix U such that
     P T AP is diagonal                          U ∗ AU is diagonal

Table 7.1: Comparison of Hermitian and Symmetric Matrices

Solution

Step 1

Form the characteristic equation and find eigenvalues


 
|A − λI| = |[[1 − λ, 2], [2, 1 − λ]]| = (1 − λ)2 − 4 = 0, or λ2 − 2λ − 3 = 0;
λ = −1, 3.

Step 2

Find the eigenvectors


E−1 : [[2, 2], [2, 2]] (x, y)T = (0, 0)T , i.e. x + y = 0, and v1 = (1, −1)T

E3 : [[−2, 2], [2, −2]] (x, y)T = (0, 0)T , i.e. −x + y = 0, and v2 = (1, 1)T

Step 3
 
Normalise {v1 , v2 } to get P = (1/√2)[[1, 1], [−1, 1]]

2) Let A = [[2, i], [i, 2]]. Verify that A is normal. Find a unitary matrix P such that P ∗ AP is
diagonal. Find P ∗ AP .
agonal. Find P ∗ AP .

Solution

Step 1

Form the characteristic equation and find eigenvalues


 
|A − λI| = |[[2 − λ, i], [i, 2 − λ]]| = (2 − λ)2 − i2 = 0, or λ2 − 4λ + 5 = 0, and


λ = 2 + i, 2 − i


Step 2

Find the eigenvectors


E2+i : [[−i, i], [i, −i]] (x, y)T = (0, 0)T , i.e. −ix + iy = 0, and v2+i = (1, 1)T

E2−i : [[i, i], [i, i]] (x, y)T = (0, 0)T , i.e. ix + iy = 0, and v2−i = (1, −1)T

Step 3

Normalise {v2+i , v2−i } to get

P = (1/√2)[[1, 1], [1, −1]], and

P ∗ AP = (1/√2)[[1, 1], [1, −1]] [[2, i], [i, 2]] (1/√2)[[1, 1], [1, −1]] = [[2 + i, 0], [0, 2 − i]]
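As in part 1), the result can be confirmed numerically (our sketch, plain Python):

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def conj_transpose(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

s = 1 / math.sqrt(2)
A = [[2, 1j], [1j, 2]]
P = [[s, s], [s, -s]]       # columns: normalised v_{2+i}, v_{2-i}

D = matmul(matmul(conj_transpose(P), A), P)
print(D)    # approximately [[2+i, 0], [0, 2-i]]
```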

Exercises 7.7

1. Find an orthogonal matrix P and a diagonal matrix D such that D = P T AP , if A is


     
i) [[1, 0, 2], [0, 2, 0], [2, 0, 2]], ii) [[1, 0, −4], [0, 5, 4], [−4, 4, 3]], iii) [[11, 0, 6], [0, 5, 6], [6, 6, −2]]

2. True or False:
Every symmetric matrix is diagonalisable.


7.8 Quadratic Forms


Outcomes

i) understand what a quadratic form is

ii) find a matrix, given a quadratic form
iii) convert a quadratic form into diagonal form
iv) determine if a matrix is positive definite, negative definite or indefinite

Definition 7.8.1. A quadratic form in the real variables x1 , x2 , . . . , xn is a polynomial

p(x1 , x2 , . . . , xn ) with real coefficients, such that each term has degree 2.

Example 7.8.1. 1) For three variables, p(x, y, z) = 3 + 2x − 4xy + 23x²y²z² is not a quadratic
form because the degrees of its terms are 0, 1, 2, 6.

2) q(x, y, z) = x² + 3xy − 4xz − 15z² is a quadratic form because the degree of each
term is 2.

A quadratic form in n variables corresponds to an n×n symmetric matrix.

Example 7.8.2. Consider the quadratic form −2x² + 4xy + 3y²

Let A = [−2, 2; 2, 3], X = [x; y], Xᵀ = (x, y)

XᵀAX = (x, y)[−2, 2; 2, 3][x; y] = (−2x + 2y, 2x + 3y)·(x, y) = (−2x + 2y)x + (2x + 3y)y
= −2x² + 4xy + 3y²

Note

• XᵀAX = −2x² + 4xy + 3y² is a polynomial in quadratic form;

• A is called the matrix of the quadratic form.

Example 7.8.3. Find the quadratic form associated with the matrix
A = [−3, −5; −5, 4]

Solution
Let X = [x; y], Xᵀ = (x, y)
then XᵀAX = (x, y)[−3, −5; −5, 4][x; y] = −3x² − 10xy + 4y²

Note
• in both examples above, A is a symmetric matrix, i.e. A = AT ;


• the general quadratic form is XᵀAX = (x, y)[a, h; h, b][x; y] = ax² + 2hxy + by²;

• we can have a quadratic form in any number of variables.

Example 7.8.4. Find the quadratic form in 3 variables x, y, z associated with the matrix
A = [1, 4, 7; 4, 2, 5; 7, 5, 3]

Solution

Let Xᵀ = (x, y, z)

XᵀAX = (x, y, z)[1, 4, 7; 4, 2, 5; 7, 5, 3][x; y; z] = x² + 8xy + 14zx + 2y² + 10zy + 3z²
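The computation in Example 7.8.4 can be spot-checked numerically: XᵀAX is the double sum of aᵢⱼxᵢxⱼ over all i, j, which should agree with the expanded polynomial at every point. A minimal sketch (not from the text):

```python
# X^T A X as a double sum, compared with the expanded polynomial of Example 7.8.4.
A = [[1, 4, 7], [4, 2, 5], [7, 5, 3]]

def xtax(A, v):
    # sum over i, j of a_ij * v_i * v_j
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

def poly(x, y, z):
    return x*x + 8*x*y + 14*z*x + 2*y*y + 10*z*y + 3*z*z

samples = [(1, 0, 0), (1, 2, 3), (-2, 5, 1), (0, -1, 4)]
assert all(xtax(A, v) == poly(*v) for v in samples)
print("X^T A X agrees with the polynomial at all sample points")
```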

7.8.1 General Quadratic Form in Three Variables

Let A = [a, d, e; d, b, f; e, f, c], X = [x; y; z], where a, b, c, d, e, f are real numbers.

XᵀAX = (x, y, z)[a, d, e; d, b, f; e, f, c][x; y; z]
= (ax + dy + ez, dx + by + fz, ex + fy + cz)·(x, y, z)
= ax² + dyx + ezx + dxy + by² + fzy + exz + fyz + cz²
= ax² + by² + cz² + 2dxy + 2exz + 2fyz

Example 7.8.5. Write the quadratic form x² + y² + z² as XᵀAX

Solution

• the coefficients of x², y², z² are 1 (the leading diagonal entries = 1);

• there are no xy, yz and xz terms (the remaining entries in the matrix = 0).

x² + y² + z² = (x, y, z)[1, 0, 0; 0, 1, 0; 0, 0, 1][x; y; z] = (x, y, z) I [x; y; z]

7.8.2 Converting a Quadratic Form to Diagonal Form


Question
Can we change the matrix of the quadratic form into diagonal form, without cross-product
terms?


Answer: Recall that a symmetric matrix is orthogonally diagonalizable, i.e. there exists an
orthogonal matrix P such that PᵀAP is diagonal.
Question
How can we do this?
Answer
Let X = P Y , where X = (x1 , x2 ), Y = (y1 , y2 )

P is an orthogonal matrix which diagonalises A.

Let D be the diagonal matrix with leading diagonal entries being the eigenvalues of A.
Then XᵀAX = YᵀDY

Here’s why
XᵀAX = (PY)ᵀA(PY)
= YᵀPᵀAPY
= Yᵀ(PᵀAP)Y
= YᵀDY

What is the advantage of writing the quadratic form in diagonal form?


The advantage is that in YᵀDY there are no cross-product terms.

Example 7.8.6. 1) write the quadratic form 10x₁² − 8x₁x₂ + 4x₂² as XᵀAX

2) Find the orthogonal matrix P which diagonalises A

3) Find the diagonal matrix D, such that PᵀAP = D

4) determine the diagonal form f(Y) = YᵀDY, where Y = [y₁; y₂]

Solution

• leading diagonal entries: 10, 4; • other entry: −4 (½ of the x₁x₂ coefficient).

1) the quadratic form is 10x₁² − 8x₁x₂ + 4x₂² = (x₁, x₂)[10, −4; −4, 4][x₁; x₂]

to find the orthogonal matrix P, we know A = [10, −4; −4, 4] is symmetric, and therefore diagonalisable.

2) We diagonalise A by finding the eigenvalues of A and the corresponding normalised eigenvectors.

det(A − λI) = det[10−λ, −4; −4, 4−λ] = (10 − λ)(4 − λ) − 16 = λ² − 14λ + 24 = 0

λ = 2, 12


E₂: (A − 2I)X = [8, −4; −4, 2][x₁; x₂] = [0; 0], i.e. 8x₁ − 4x₂ = 0 and −4x₁ + 2x₂ = 0, and so 2x₁ = x₂

E₂ = (1, 2), normalised to (1/√5)(1, 2)

E₁₂: (A − 12I)X = [−2, −4; −4, −8][x₁; x₂] = [0; 0], i.e. −x₁ − 2x₂ = 0, and so x₁ = −2x₂

E₁₂ = (2, −1), normalised to (1/√5)(2, −1)

3) the normalised matrix P = (E₂, E₁₂) = (1/√5)[1, 2; 2, −1]

4) the diagonal matrix D = PᵀAP = (1/√5)[1, 2; 2, −1][10, −4; −4, 4](1/√5)[1, 2; 2, −1] = [2, 0; 0, 12]

  
5) f(Y) = YᵀDY = (y₁, y₂)[2, 0; 0, 12][y₁; y₂] = (2y₁, 12y₂)·(y₁, y₂) = 2y₁² + 12y₂²
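A numerical check of this example (an illustration, not part of the text): substituting X = PY into the original form should reproduce 2y₁² + 12y₂² exactly.

```python
# With X = PY, the form 10*x1^2 - 8*x1*x2 + 4*x2^2 becomes 2*y1^2 + 12*y2^2.
import math

s = 1 / math.sqrt(5)
P = [[s, 2 * s], [2 * s, -s]]   # columns: (1, 2)/sqrt(5) and (2, -1)/sqrt(5)

def q(x1, x2):
    # the original quadratic form
    return 10 * x1 * x1 - 8 * x1 * x2 + 4 * x2 * x2

for y1, y2 in [(1, 0), (0, 1), (2, -3), (0.5, 1.5)]:
    x1 = P[0][0] * y1 + P[0][1] * y2   # X = P Y
    x2 = P[1][0] * y1 + P[1][1] * y2
    assert abs(q(x1, x2) - (2 * y1 * y1 + 12 * y2 * y2)) < 1e-9
print("q(PY) = 2*y1^2 + 12*y2^2 at all sample points")
```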

7.8.3 Principal Axes


Theorem 7.8.1. Let A ∈ Mn×n be a symmetric matrix. Then there exists a change of
variable X = PY such that the quadratic form XᵀAX becomes YᵀDY, with D an n×n
diagonal matrix. The columns of P are the principal axes.

 
Proof. XᵀAX = YᵀDY = (y₁, y₂, . . . , yₙ)[λ₁, 0, . . . , 0; 0, λ₂, . . . , 0; . . . ; 0, 0, . . . , λₙ][y₁; y₂; . . . ; yₙ]

The principal axes theorem states that XᵀAX = YᵀDY = λ₁y₁² + λ₂y₂² + . . . + λₙyₙ²

where λ₁, λ₂, . . . , λₙ are the eigenvalues of A and Y = [y₁; y₂; . . . ; yₙ]
If A is not diagonal, the quadratic takes the form


Figure 7.1: Conics in Standard Form. a) Ellipse b) Hyperbola

Q(X) = ax² + 2bxy + cy² + dx + ey = C,

and the ellipse or hyperbola is rotated. Converting Q(X) to diagonal form transforms Q(X)
into standard form: Q(X) = a′₁₁x₁′² + a′₂₂x₂′² = C′

Geometric view of Principal Axes

Consider
q(x, y) = Q(X) = XᵀAX
The general form of a conic is q(x, y) = ax² + 2bxy + cy² + dx + ey = f. The graph of Q(X) = C is
either: • an ellipse, a circle or a hyperbola, or • two intersecting lines, a point or no points

If A is diagonal, the quadratic is of the form

Q(X) = a₁₁x₁² + a₂₂x₂² = C

Example 7.8.7. Identify and sketch the graph of the conic given by the equation 5x² − 6xy + 5y² = 8

Solution

Here A = [5, −3; −3, 5] and

det(A − λI) = det[5−λ, −3; −3, 5−λ] = (5 − λ)² − 9 = λ² − 10λ + 16 = 0; λ = 2, 8

E₂: (A − 2I)X = [3, −3; −3, 3][x₁; x₂] = [0; 0], i.e. 3x₁ − 3x₂ = 0, and so x₁ = x₂

Figure 7.2: Rotated Conics. a) Ellipse b) Hyperbola

E₂ = (1, 1), normalised to (1/√2)(1, 1)

E₈: (A − 8I)X = [−3, −3; −3, −3][x₁; x₂] = [0; 0], i.e. −3x₁ − 3x₂ = 0, and so x₂ = −x₁

E₈ = (1, −1), normalised to (1/√2)(1, −1)

the normalised matrix P = (E₂, E₈) = (1/√2)[1, 1; 1, −1]

Taking new axes 0x′, 0y′ in the directions u = (1/√2, 1/√2), v = (1/√2, −1/√2), (i.e. putting
x = (1/√2)(x′ + y′), y = (1/√2)(x′ − y′))

the equation becomes 2x′² + 8y′² = 8, or x′²/4 + y′² = 1, i.e. x′²/2² + y′²/1² = 1
This is an ellipse with semi-axis length 2 on the x′-axis and length 1 on the y′-axis.
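This can be confirmed numerically (an illustration, not part of the text): points on the standard-form ellipse, rotated back through X = PY, should satisfy the original equation 5x² − 6xy + 5y² = 8.

```python
# Points on x'^2/4 + y'^2 = 1, rotated by X = PY, satisfy 5x^2 - 6xy + 5y^2 = 8.
import math

s = 1 / math.sqrt(2)
P = [[s, s], [s, -s]]   # columns: (1, 1)/sqrt(2) and (1, -1)/sqrt(2)

for t in [0.0, 0.7, 1.9, 3.1, 5.0]:
    xp, yp = 2 * math.cos(t), math.sin(t)   # parametrise the standard-form ellipse
    x = P[0][0] * xp + P[0][1] * yp
    y = P[1][0] * xp + P[1][1] * yp
    assert abs(5 * x * x - 6 * x * y + 5 * y * y - 8) < 1e-9
print("all rotated points satisfy 5x^2 - 6xy + 5y^2 = 8")
```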

Example 7.8.8. Identify and sketch the graph of the conic given by the equation
2x² − 2xy + 2y² − 2√2x + 4√2y = 8

Solution

Convert the first three terms to XᵀAX, where X = [x; y]

The terms −2√2x, 4√2y are not quadratic, but we can write them in matrix form as


−2√2x + 4√2y = (−2√2, 4√2)[x; y]

Let B = (−2√2, 4√2); then 2x² − 2xy + 2y² − 2√2x + 4√2y = 8 can be written as
XᵀAX + BX = 8, with A = [2, −1; −1, 2]

det(A − λI) = det[2−λ, −1; −1, 2−λ] = λ² − 4λ + 3 = 0; λ = 1, 3

E₁: (A − I)X = [1, −1; −1, 1][x₁; x₂] = [0; 0] and x₁ = x₂; E₁ = (1, 1)

E₃: (A − 3I)X = [−1, −1; −1, −1][x₁; x₂] = [0; 0] and x₂ = −x₁; E₃ = (1, −1)

P = (E₁, E₃) = (1/√2)[1, 1; 1, −1]

and the equation of the conic is x′² + 3y′² + BX = 8

What is BX?
Recall that X = [x; y], but we need BX in terms of Y = [x′; y′].
The Principal Axes Theorem says that X = PY, so

BX = B(PY) = (−2√2, 4√2)(1/√2)[1, 1; 1, −1][x′; y′] = (2, −6)[x′; y′] = 2x′ − 6y′, and so

x′² + 3y′² + 2x′ − 6y′ = 8

completing the squares

(x′ + 1)² + 3(y′ − 1)² = 8 + 1 + 3
(x′ + 1)²/(√12)² + (y′ − 1)²/2² = 1

Let x″ = x′ + 1, y″ = y′ − 1

x″²/(√12)² + y″²/2² = 1, an ellipse with semi-axis length √12 on the x″ axis and 2 on the y″ axis
(x″, y″) = (0, 0) is at (x′, y′) = (−1, 1)
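A numerical sketch for this example (illustration only). Note that the signs of the rotated linear term BP depend on the directions chosen for the eigenvectors; with P built from E₁ = (1, 1) and E₃ = (1, −1), the coefficients of x′ and y′ come out as 2 and −6.

```python
# Diagonalise the quadratic part of Example 7.8.8 and compute the rotated
# linear term B P numerically.
import math

s = 1 / math.sqrt(2)
P = [[s, s], [s, -s]]                        # columns: (1, 1)/sqrt(2), (1, -1)/sqrt(2)
B = [-2 * math.sqrt(2), 4 * math.sqrt(2)]    # coefficients of the linear part

# quadratic part: with X = PY, 2x^2 - 2xy + 2y^2 becomes x'^2 + 3y'^2
for xp, yp in [(1, 0), (0, 1), (2, -1)]:
    x, y = P[0][0] * xp + P[0][1] * yp, P[1][0] * xp + P[1][1] * yp
    assert abs((2*x*x - 2*x*y + 2*y*y) - (xp*xp + 3*yp*yp)) < 1e-9

# linear part: B X = (B P) Y
BP = [B[0] * P[0][0] + B[1] * P[1][0], B[0] * P[0][1] + B[1] * P[1][1]]
print([round(c, 10) for c in BP])   # coefficients of x' and y'
```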

7.8.4 Classification of Quadratic Forms

Consider the surfaces defined by q(x, y) = z.
Curves in R² are cuts of these surfaces with a plane.
Some of the surfaces are always above z = 0 (a), others sometimes above and sometimes
below (b), and others always below (c).

We say q(x, y) is:

• positive definite if
q(X) > 0, ∀X ∈ Rⁿ, X ≠ 0

Figure 7.3: 2x² − 2xy + 2y² − 2√2x + 4√2y = 8

Figure 7.4: Positive, Negative and Indefinite Quadratic Surfaces: a) x² + y², b) x² − y², c) −x² − y²

• negative definite if
q(X) < 0, ∀X ∈ Rⁿ, X ≠ 0
• indefinite if q(X) assumes both positive and negative values
• positive semidefinite if
q(X) ≥ 0, ∀X ∈ Rⁿ
• negative semidefinite if
q(X) ≤ 0, ∀X ∈ Rⁿ

Let Q(X) = XᵀAX with A ∈ Mn×n and symmetric

Let λᵢ be the eigenvalues of A
Q(X) is:
• positive definite iff λᵢ > 0 for all i
• negative definite iff λᵢ < 0 for all i
• indefinite iff there are both positive and negative eigenvalues
• positive semidefinite iff λᵢ ≥ 0 for all i
• negative semidefinite iff λᵢ ≤ 0 for all i
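For 2×2 symmetric matrices the eigenvalue test above can be coded directly, since the eigenvalues are (tr ± √(tr² − 4 det))/2. A small sketch (not from the text; the function name classify is illustrative):

```python
# Eigenvalue test for a 2x2 symmetric matrix [a, h; h, b].
import math

def classify(a, h, b):
    """Classify the quadratic form a*x^2 + 2*h*x*y + b*y^2 via the eigenvalues of [a, h; h, b]."""
    tr, det = a + b, a * b - h * h
    root = math.sqrt(tr * tr - 4 * det)   # real: the discriminant of a symmetric matrix is >= 0
    lam1, lam2 = (tr + root) / 2, (tr - root) / 2   # lam1 >= lam2
    if lam2 > 0:
        return "positive definite"
    if lam1 < 0:
        return "negative definite"
    if lam2 < 0 < lam1:
        return "indefinite"
    return "positive semidefinite" if lam1 > 0 else "negative semidefinite"

print(classify(1, 0, 1))    # x^2 + y^2        -> positive definite
print(classify(1, 0, -1))   # x^2 - y^2        -> indefinite
print(classify(-1, 0, -1))  # -x^2 - y^2       -> negative definite
print(classify(1, 1, 1))    # (x + y)^2        -> positive semidefinite
```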


7.8.5 Classification of symmetric matrices


A symmetric matrix is positive definite if its corresponding quadratic form is positive definite.
Analogously for negative definite, etc.

Exercises 7.8

1. Classify the following matrices

i) [0, i; i, 1], ii) [1, 1; 1, 1], iii) [2, 1; 1, 2], iv) [1, 2; 2, 1]

2. Reduce the following real quadratic forms to diagonal form

i) q(x, y) = x² + y² + xy
ii) q(x, y) = x² + y² − xy
iii) q(x, y, z) = 2xy + 2xz + 2yz
iv) q(x, y, z) = 5x² + 11y² − 2z² + 12xz + 12yz

3. Find the principal axes, centre and sketch the graph of the following conics
i) xy = 2
ii) 3x2 − 2y 2 + 12xy = 42
iii) 3x2 − 2y 2 + 12xy = 42
iv) 7x2 + 4y 2 − 4xy = 24



Solutions to Numerical Exercises
chapter 2

Exercises 2.2, page 12

! !
4 −3 3 3 10 −25 −5
1. i) , ii) not defined (ND)., iii) 3A + 4B − 2C =
2 −5 −1 −4 7 −2 10

2. x = 2, y = 4, z = 1, w = 3
3. A² = [10, 2; 3, 7], a) A³ = [26, 18; 27, −1]

i) f(A) = A³ − 3A² − 2A + 4I = [−4, 8; 12, −16], ii) g(A) = [0, 0; 0, 0]

4. u = [3; 5]

5. i) A2×3 B3×4 = C2×4, ii) A4×1 B1×2 = C4×2, iii) A3×4 B3×4 = ND, iv) A5×2 B2×3 = C5×3
 
6. a) AB = [−1, −8, −10; 1, −2, −5; 9, 22, 15], BA = [15, −21; 10, −3]

b) AB = (6, 1, −3), BA = ND

c) AB = [0, 0; 0, 0], BA = [5, 5; −5, −5]
 
7. Aᵗ = [1, 2, 4; 0, 3, 4; 1, 4, 4], i) Bᵗ = [1, 2, 3; 2, 4, −5; 3, −5, 0]
 
0 5 4
 
8. a) AAᵗ = [5, 1; 1, 26], AᵗA = [10, −1, 12; −1, 5, −4; 12, −4, 16]
 
b) AAᵗ = [4, −2, 6; −2, 1, −3; 6, −3, 9], AᵗA = (14)
   
c) AAᵗ = [49, 0, 0; 0, 49, 0; 0, 0, 49] = AᵗA


9. A = [x, y; 0, x] or [a, b; 0, a]

10. A(α)⁻¹ = A(−α)

A(3α) − 3A(2α) + 3A(α) − I ⇒ x³ − 3x² + 3x − 1 = 0

Exercises 2.7, page 15

   
1. i) A = [1, 0, 0, 5/8; 0, 1, 0, −1/8; 0, 0, 1, 1/8], ii) B = [1, −2, 3, 1; 0, 1, −5, 6; 0, 0, 0, 0; 0, 0, 0, 0],
iii) C = [1, 0, 1/2, 1/2; 0, 1, 2, 1; 0, 0, 0, 0; 0, 0, 0, 0]
   
iv) D = [1, −2, 3, −1; 0, 3, −4, 4; 0, 0, 7, −10], v) E = [1, −1, 2, 1; 0, 1, −3, −1; 0, 0, −4, −1],
 
vi) F = [1, 0, 1+i, 1; 0, 1, (1+i)/2, (1−i)/2; 0, 0, 0, 0]

2. i) Yes, ii) No

3. True

Exercises 2.9, page 17

1. a) i) (2, 3, 5), ii) (0, 0, 0), iii) (1/2)(3, −12, 3, 2)

b) i) (2, −1, 1), ii) No Solution, iii) (−3 − a, 2 + 2a, a)

2. i) λ = 14, (−1 + µ, 3 − 2µ, µ); λ ≠ 14, (0, 1, 1)

ii) λ = 1, (1/3, −2/3)

Exercises 2.10, page 19

 
1. (1/4)[−2, 4, 6; 1, 2, −1; 1, −2, −1]

2. (1/3)[11, −9, 1; −7, 9, −2; 2, −3, 1]


 
3. [−3, 2, 1; 2, −1, 0; 5, −2, −1]

chapter 3

Exercises 3.5, page 26

1. i) −18, ii) 12, iii) a³ + b³ + c³ − 3abc, iv) 76, v) (t + 2)(t − 2)(t + 4),

vi) (a − b)(a − c)(b − c)(a + b + c)

2. 0, b − c, (a + b + c)/2

Exercises 3.6, page 27

1. Adj A = [−5, −1, 1; 2, 4, −2; 1, −3, 1], A⁻¹ = [5/2, 1/2, −1/2; −1, −2, 1; −1/2, 3/2, −1/2]
 
2. t = ±1; (1/(1 − t²))[1, −t, −t; −t, 1, 1; −t, t², 1]
 
3. Adj A = [x² − x, −1, x + 1; −2, x² − x − 1, 2(x + 1); −(x + 1), −(x + 1), (x + 1)²]

4. |A| = (a + b)2 + 1 ≥ 1, for all values of a and b.

Exercises 3.7, page 28

1. a) ∆ = 5, ∆x = 20, ∆y = −10, ∆z = 15
x = 4, y = −2, z = 3;
b) x = 3, y = −1, z = 2

2. Unique solution when the determinant of the matrix of coefficients D ≠ 0, i.e. when k ≠ 1 and k ≠ −2.

Note: When D = 0, Cramer’s rule does not say whether a solution exists. Use Gaussian
elimination to show that the system has more than one solution when k = 1 and no solution
when k = −2.

chapter 4

Exercises 4.2, page 32


1. a) i) (u, v) = 17, ii) (v, u) = 17
b) i) (u, v) = 0, ii) (v, u) = 0
c) i) (u, v) = 20 + 35i, ii) (v, u) = 20 − 35i

2. u and v , and v and w

3. ‖u‖ = 13

4. k = ±3

5. i) k = 1/4, ii) k = −7/13

Exercises 4.4, page 35

1. x = 2, y = 4, z = 1, w = 3
   
−1 13
2. a) i)  7 ii)  24
   

−22 −29

3. a) u, v are linearly dependent


b) u, v are linearly independent
c) u, v are linearly independent (orthogonal)
4. a) i) u·v = 9 + 13i, ii) v·u = 9 − 13i, iii) ‖u‖ = √15, iv) ‖v‖ = √81 = 9
b) i) u·v = 19, ii) v·u = 19, iii) ‖u‖ = √26, iv) ‖v‖ = √74

5. a) u and v are orthogonal if u·v = 0, i.e. ⟨(3, k, −2), (6, −4, −3)⟩ = 0:
18 − 4k + 6 = 0, or k = 6
b) ‖u‖ = √39, where ‖u‖² = ⟨(1, k, −2, 5), (1, k, −2, 5)⟩ = 30 + k² = 39, or k = ±3
6. a) û = (1/√74)(5, −7), b) v̂ = (1/5)(1, 2, −2, 4), c) û = (1/√133)(6, −4, 9)

Exercises 4.6, page 42

1. i) Yes, ii) No, iii) Yes, iv) Yes

2. i) Yes, ii) No, iii) Yes

3. i) Yes, ii) No, iii) Yes


     
4. a) Use the subspace test, b) {[0, 1, 0; −1, 0, 0; 0, 0, 0], [0, 0, 1; 0, 0, 0; −1, 0, 0], [0, 0, 0; 0, 0, 1; 0, −1, 0]}, dim W = 3

Exercises 4.12, page 45


1. v cannot be written as a linear combination of the vectors u1 , u2 , u3 .

2. v = −3p1 + 2p2 + 4p3 .

3. M = 2A + 3B − C .

4. Hint: Express the vectors u + v , u − v , u − 2v + w as a linear combination,


a(u + v) + b(u − v) + c(u − 2v + w) = 0, and show that a = b = c = 0.

5. Hint: Show that u, w can be written as a linear combination over the complex field C, but
not over the real field R.

6. e.g. {(1, 0, 1, 1), (1, 0, 2, 4), (1, 0, 0, 0), (0, 1, 0, 0)}.

7. i) Hint: Recall that the set {u₁, u₂, u₃} is linearly independent if au₁ + bu₂ + cu₃ = 0 ⇒
a = b = c = 0
if ae^t + b sin t + ct² = 0 for all t,
letting t = 0, a(1) + b(0) + c(0) = 0 ⇒ a = 0
letting t = π, 0·e^π + b(0) + c(π²) = 0 ⇒ c = 0
letting t = π/2, 0·e^(π/2) + b(1) + 0·(π²/4) = 0 ⇒ b = 0
so ae^t + b sin t + ct² = 0 ⇒ a = b = c = 0. Accordingly, u, v, w are linearly independent
ii) similarly for {e^t, sin t, cos t}

8. a) True, b) True, c) True, d) True

Exercises 4.8, page 50

1. {(1, 2, 1), (0, 1, −2)}


2. e.g. {[1, 0; 0, 0], [0, 1; −1, 0], [0, 0; 0, 1]}, {[1, 0; 0, 0], [0, 0; 1, 0]},
{[1, 0; 0, 0]}, {[1, 0; 0, 0], [0, 1; −1, 0], [0, 0; 0, 1], [0, 0; 1, 0]}

3. −4u1 + 3u2 .

4. u - No, e.g. {(2, −1, 3, 2), (−1, 1, 1, 3), (3, −1, 0, −1), (1, 0, 0, 0)}

v - Yes, e.g. {(1, 0, 4, −1), (2, −1, 3, 2), (1, 0, 0, 0), (0, 1, 0, 0)}

5. a) (1,-2,5,-3) and (0,7,-9,2) form a basis of the row space of A and dim W = 2.
b) (1,-2,5,-3), (0,7,-9,2), (0, 0, 1, 0), and (0, 0, 0, 1) are linearly independent (they form an
echelon matrix), and so they form a basis of R4 , which is an extension of the basis of W.

6. a) dim(U + W ) = 3, (1, 3, −2, 2, 3), (0, 1, −1, 2, −1), (0, 0, 1, 0, −1)

b) dim(U ∩ W ) = 1, (1, 4, −3, 4, 2)


7. Hint: Show that U ∩ W = {0} and R³ = U + W,

since if U ∩ W = {0} and R³ = U + W then R³ = U ⊕ W

8. Hint: show that i) U ∩ W = {0}, and ii) R3 = U + W .

9. a) dim(W ) = 2, (−2, 1, 0, 0), (1, 0, −1, 2)


b) dim(W ) = 0, {0}
c) dim(W ) = 3, (2, 1, 0, 0), (−1, 0, 1, 0), (3, 0, 0, 1)

chapter 5

Exercises 5.2, page 59

1. i) yes, ii) yes, iii) no

2. i) yes, ii) yes

3. i) yes, ii) yes, iii) no

Exercises 5.5, page 68


Hint: use example 5.5.2

   
1. i) [1, −1, 0; 1, 2, −1; 2, 1, 1], ii) (1/2)[5, −11, 12; 2, 1, 0; 2, 4, −6]
 
2. [3, 0, 2; 0, −1, 0; −2, 0, −2]

3. Hint: use example 5.5.4


     

1 1 3 −4 1 −1 0 1 1!
 −1 2
 2 −1, −4 7 = 0 1 −1  2 −1
     
1 −1
−1 0 1 −2 0 0 1 −1 0

4. e.g. {(−1, 2, 3, 0), (1, 0, 0, −1)}, {(1, 1, 0), (−1, 2, 3)}


i) 2, 2 x = y = 0.
ii) 2, 2
5. a) {0}, M₂(R), b) {0}, M₂(R), c) {[a, b; b, a + b] | a, b ∈ R},
d) {[a, −b; a + b, −a] | a, b ∈ R}

Exercises 5.7, page 71


1. Although G is nonsingular, it is not invertible, because R2 and R3 have different


dimensions, so T −1 does not exist.

2. T −1 (α, β, γ) = (α + β + γ, 3β + 3γ, −α + 5β + 2γ)

3. λ ≠ 1/8

chapter 6

Exercises 6.2, page 74

1. i) and ii) only

2. (0, 2, -1)

3. i) 6

ii) 38

iii) 2

4. i) arccos(6/√42)
ii) π/2

Exercises 6.4, page 74

1. i) ii)
ii) i)
iii) i) and ii)
i) ii) iii) -13 iv) -71

2. k > 9

Exercises 6.6, page 80

1. i) {(1, −1, 1), (4, 5, 1), (2, −1, −3)}

ii) v₁ = (1, 1, 1, 1), v₂ = (1, 2, 4, 5), v₃ = (1, −3, −4, −2)

2. {(1, 1, 1, 1), (−1, −1, 0, 2), (1, 3, −6, 2)}

i) {(1/2)(1, 1, 1, 1), (1/√6)(−1, −1, 0, 2), (1/(5√2))(1, 3, −6, 2)}

3. {1, 2x − 1, 6x² − 6x + 1, 20x³ − 30x² + 12x − 1}

chapter 7


Exercises 7.1, page 85


i) (λ + 1)(λ − 3); −1, ⟨(1, −1)⟩; 3, ⟨(1, 1)⟩
ii) (λ − 1)³; 1, ⟨(1, 1, 1)⟩
iii) (λ − 1)(λ − 2)(λ − 3); 1, ⟨(1, −1, 0)⟩; 2, ⟨(2, −1, −2)⟩; 3, ⟨(1, −1, −2)⟩

iv) a) True
Using λI = P⁻¹λIP, we have
∆B(λ) = det(λI − B) = det(λI − P⁻¹AP) = det(P⁻¹λIP − P⁻¹AP)
= det(P⁻¹(λI − A)P) = det(P⁻¹)det(λI − A)det(P)
determinants are scalars and commute, and so det(P⁻¹)det(P) = 1
Hence ∆B(λ) = det(λI − A) = ∆A(λ)

b) True
Exercises 7.3, page 87
 
i) P = [1, 2, 1; −1, −1, −1; 0, −2, −2], D = (1, 2, 3)

ii) P = [0, 1+i, 1−i; 1, 1, 1; −1, 1, 1], D = (−1, i, −i)

iii) P = [0, −1, 0, 1; 0, 2, 0, 1; −1, 0, 1, 0; 1, 0, 1, 0], D = (−3, 0, 3, 3), λ(3 + λ)(3 − λ)²

Exercises 7.4.1, page 88

1. i) λ(2 − λ) ii) (1 − λ)2 (2 − λ)

2. (1 − λ)2 (2 − λ), (1 − λ)(2 − λ), (1 − λ)2

3. i) ∆(λ) = (λ − 3)2 = m(λ) ⇒ Non diagonalizable


ii) ∆(λ) = (λ − 1)2 (λ − 2); m(λ) = (λ − 1)(λ − 2) ⇒ diagonalizable

iii) ∆(λ) = (λ − 1)3 (λ − 2) = m(λ) ⇒ Non diagonalizable,

iv) ∆(λ) = (λ)(λ + 2)(λ − 6) = m(λ) ⇒ diagonalizable

v) ∆(λ) = (λ − 1)2 (λ + 1); m(λ) = (λ − 1)(λ + 1) ⇒ diagonalizable

Exercises 7.6, page 90

1. i) [−i, 2; i, −3i], ii) [1 − 2i, 1; 2 − i, 1]

2. AA∗ = I .


3. i) AA∗ ≠ I, ii) AA∗ ≠ I

4. i) Hermitian, ii) Hermitian, iii) not Hermitian

Exercises 7.7, page 92

i) P = [−1/√5, 2/√5, 0; 0, 0, 1; 2/√5, 1/√5, 0], ii) P = (1/3)[2, 2, −1; 2, −1, 2; −1, 2, 2],
iii) P = (1/7)[3, 2, 6; −6, 3, 2; −2, −6, 3]

Exercises 7.8, page 101

1. i) [0, i; i, 1]: not Hermitian, ii) [1, 1; 1, 1]: positive semidefinite (eigenvalues 0, 2),
iii) [2, 1; 1, 2]: positive definite (eigenvalues 1, 3), iv) [1, 2; 2, 1]: indefinite (eigenvalues 3, −1)

2. i) (3/2)x² + (1/2)y², ii) (3/2)x² + (1/2)y²

iii) 2x² − y² − z², iv) 7x² − 7y² + 14z²

3. i) (1/√2)(1, 1), (1/√2)(1, −1); x′² − y′² = 4

ii) (1/√13)(3, 2), (1/√13)(2, −3); x′²/6 − y′²/7 = 1

iii) (1/√5)(1, 2), (1/√5)(2, −1); x′²/8 + y′²/3 = 1

