
Discrete Mathematics Course Outline

The document outlines a course on Discrete Mathematics taught by Professors Sajith Gopalan and Benny George at IIT Guwahati, detailing the weekly topics covered over a span of 12 weeks. Key subjects include Boolean functions, propositional and first-order logic, graph theory, and algebraic structures such as groups and rings. Each week consists of multiple lectures that progressively build on foundational concepts in discrete mathematics.


DISCRETE MATHEMATICS

Prof. Sajith Gopalan
Prof. Benny George K
Department of Computer Science and Engineering
IIT Guwahati
INDEX
S.No. TOPICS

Week 1
1 Lec 1: Boolean Functions
2 Lec 2: Propositional Calculus: Introduction
3 Lec 3: First Order Logic: Introduction

Week 2
4 Lec 4: First Order Logic: Introduction (Cont'd)
5 Lec 5: Proof System for Propcal
6 Lec 6: First Order Logic: wffs, interpretations, models

Week 3
7 Lec 7: Soundness and Completeness of the First Order Proof System
8 Lec 8: Sets, Relations, Functions
9 Lec 9: Functions, Embedding of the theories of natural numbers and integers in Set Theory
10 Lec 10: Embedding of the theories of integers and rational numbers in Set Theory; Countable Sets

Week 4
11 Lec 11: Introduction to graph theory
12 Lec 12: Trees, Cycles, Graph coloring
13 Lec 13: Bipartite Graphs

Week 5
14 Lec 14: Bipartite Graphs; Edge Coloring and Matching
15 Lec 15: Planar Graphs
16 Lec 16: Graph Searching; BFS and DFS
17 Lec 17: Network Flows
18 Lec 18: Counting Spanning Trees in Complete Graphs

Week 6
19 Lec 19: Embedding of the theory of real numbers in Set Theory; Paradoxes
20 Lec 20: ZF Axiomatization of Set Theory
21 Lec 21: Partially ordering relations
22 Lec 22: Natural numbers, divisors

Week 7
23 Lec 23: Lattices
24 Lec 24: GCD, Euclid's Algorithm
25 Lec 25: Prime numbers
26 Lec 26: Congruences

Week 8
27 Lec 27: Pigeon Hole Principle
28 Lec 28: Stirling Numbers, Bell Numbers
29 Lec 29: Generating Functions
30 Lec 30: Product of Generating Functions

Week 9
31 Lec 31: Composition of Generating Function
32 Lec 32: Principle of Inclusion Exclusion
33 Lec 33: Rook placement problem

Week 10
34 Lec 34: Solution of Congruences
35 Lec 35: Chinese Remainder Theorem
36 Lec 36: Totient; Congruences; Floor and Ceiling Functions

Week 11
37 Lec 37: Introduction to Groups
38 Lec 38: Modular Arithmetic and Groups
39 Lec 39: Dihedral Groups, Isomorphisms

Week 12
40 Lec 40: Cyclic groups, Direct Products, Subgroups
41 Lec 41: Cosets, Lagrange's theorem
42 Lec 42: Rings and Fields
43 Lec 43: Construction of Finite Fields

Discrete Mathematics
Professor Sajith Gopalan, Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Mathematical Logic
Lecture 1

Welcome to the NPTEL MOOC on discrete mathematics.

(Refer slide time: 00:43)

We begin our study of mathematical logic today, and we begin with propositional calculus.
Propositional calculus is also called propositional logic, sentential logic or 0th order logic, as
opposed to first-order logic, which is a richer logic that we shall study later. In
propositional calculus, we deal with propositions, and these propositions take on truth values. The
only semantic entities that we deal with in propositional calculus are truth values. The truth
values could be written 0 and 1, or false and true: 0 corresponds to false and 1 corresponds to
true. There are only two truth values; there is no third truth value.

(Refer slide time: 02:12)

Propositions take on truth values, as I mentioned. There are what are called atomic
propositions. For these atomic propositions the truth values are given; that is, we do not
analyse these atomic propositions to find out how they became true or false. We just know
that they are true or false, and then we can combine atomic propositions to form what are
called composite propositions.

Composite propositions are synthesized from atomic propositions using logical connectives.
We can compute the truth value of a composite proposition from the truth values of its
constituents and the properties of the logical connectives.

(Refer slide time: 03:36)

This essentially gives us an algebra of truth values, an algebra of 0 and 1. This algebra is
called Boolean algebra, named after the mathematician George Boole. So, let us begin with a study
of Boolean algebra, which is the underlying algebra of propositional calculus.

(Refer slide time: 04:05)

So, in Boolean algebra, we have what are called Boolean functions. A Boolean function is a
mapping from truth values to truth values, with a signature of the form f : {0, 1}^n → {0, 1}.
So, an n-variable Boolean function takes n truth values as inputs and produces one truth value
as the output. Let us consider Boolean functions for various values of n. When n is equal to 0,
that is, when there is no input, there are two Boolean functions: the constant Boolean function
true (T), which always produces 1, and the constant Boolean function false (F), which always
produces 0. So, these are the zero-variable Boolean functions.

(Refer slide time: 05:19)

When we consider 1-variable Boolean functions, we are considering functions that take one
Boolean value as input and produce one Boolean value as output. Such a function can be
specified using what is called a truth table. In a truth table, we list the possible inputs and
specify the output for each. In this case there is only one input, which could be either 0 or 1.
If the input is 0, what would the output be? If the input is 1, what is going to be the output?
Once you specify these, we have the truth table for the function f.

Now, how many such functions could there be? Depending on how you choose f of 0 and f of 1,
we will have different Boolean functions. There are two choices to be made, one for f of 0 and
one for f of 1, and each choice is a Boolean value. So, there are four such possible choices.

(Refer Slide Time: 06:38)

So, let us enumerate all such Boolean functions, when input x is 0 or 1, the output could be
both zeros, we are choosing 0 for both f of 0 and f of 1, or we could choose 0 for f of 0 and 1
for f of 1, or we could choose 1 for f of 0 and 0 for f of 1, or we could choose 1 for both. So,
these are the four possible choices for f of 0 and f of 1, each choice will define one Boolean
function.

So, we have four 1-variable Boolean functions. Now, let us name these functions, numbering the
columns 0 to 3 in the order in which we listed them. In column 0 both the function values are 0,
therefore let me call this the zero function, the function which produces 0 irrespective of
the input value. Similarly, in the last column, column 3, f of 0 is 1 and f of 1 is also 1, so
this is a function which produces 1 irrespective of the input value; let me call this function 1.

Those were the 0th and the third columns. Now consider column 1, which corresponds to the
output values 0 and 1. It reproduces the input, so this function is nothing but x: this is the
identity function, f of x equals x. So let me call it x. When you look at column 2, you find
that it complements x: when x is 0 it produces 1, and when x is 1 it produces 0.

So, this is the complement of x, which we call x bar, the negation of x. Negation of x inverts
the truth value: when x is true it produces false, and when x is false it produces true.
Negation of x is denoted in several ways, x̄ (x bar) and ¬x being common ones. We will use
these notations interchangeably, so you should remember them. So, we have now seen four
1-variable Boolean functions.
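As an aside, the enumeration above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the dictionary `names` and the variable `one_var_functions` are made-up names, and each function is represented simply by the pair (f(0), f(1)).

```python
from itertools import product

# Each 1-variable Boolean function is fixed by the pair (f(0), f(1)).
# The labels follow the names used in the lecture.
names = {(0, 0): "0", (0, 1): "x", (1, 0): "NOT x", (1, 1): "1"}

# product((0, 1), repeat=2) lists the pairs in the same order as the columns 0..3.
one_var_functions = list(product((0, 1), repeat=2))

for column, outputs in enumerate(one_var_functions):
    print(column, outputs, names[outputs])
```

The loop prints one line per column: the column number, the pair of outputs, and the name given to that function.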

(Refer Slide Time: 09:24)

Now, let us go on to look at 2-variable Boolean functions. How many 2-variable Boolean
functions could there be? Let us say x and y are the inputs. Since there are two Boolean inputs,
the possible inputs are (0, 0), (0, 1), (1, 0) and (1, 1), so these four form the set of all
possible inputs. Now, what could the output be? To specify f of x, y you have to specify a
Boolean value at each of these four positions.

So, there are 4 positions to be filled in using 0 and 1, so there are 2 power 4, which is 16,
possibilities. Let us look at all these 16 possibilities.

(Refer Slide Time: 10:52)

So, enumerating all these 16 possibilities, let us consider the 4 possible inputs (0, 0), (0, 1),
(1, 0) and (1, 1). There are 4 vacancies to be filled, and f of x, y could be any of these 16.
The first one is 0, 0, 0, 0, the next is 0, 0, 0, 1, the third is 0, 0, 1, 0, the fourth is
0, 0, 1, 1, then 0, 1, 0, 0; 0, 1, 0, 1; 0, 1, 1, 0; 0, 1, 1, 1. These form one half of the
possibilities; the remaining possibilities are the complements of these.

Then we have 1, 0, 0, 0; 1, 0, 0, 1; 1, 0, 1, 0; 1, 0, 1, 1; 1, 1, 0, 0; 1, 1, 0, 1; 1, 1, 1, 0;
and 1, 1, 1, 1. So those are the 16 possibilities. Let me number the columns in this fashion,
starting from 0 and going up to 15. Now, let us analyse these columns. Look at the 0th column:
it produces a 0 irrespective of the input. Whatever be the values of x and y, the output is 0,
therefore we call this the 0 function. Its mirror image is column 15: in column 15 the output
is 1 irrespective of the input value, therefore we call this function 1.

Now, consider the first column, in the first column the output is 1 precisely when both the
inputs are 1, if either x is 0 or y is 0 then the output is 0, the output is 1 precisely when x and
y are 1, so this is called the AND function, which we denote in this manner, the AND
function, this is one way of denoting an AND function. Its complement is column 14, in
column 14 the output is 1 if either x or y is 0, this is the negation of the AND function,
therefore we call it NAND, this is called the NAND function, column 1 is the AND function.

Now, let us look at column 7. In column 7 the output is 0 precisely when both the inputs are 0,
or in other words, the output is 1 if either x or y is 1; therefore this we call the OR function.
The complement of column 7 is column 8, where the output is 1 precisely when both the inputs
are 0. This is called the NOR function, the negation of the OR function, which is denoted like
this.

Now, let us come to column 6. Column 6 is 1 if either x is 1 or y is 1 but not both; that is,
exactly one of x and y should be 1 for the output to be 1. This is called the XOR function,
denoted in this fashion. Its complement, column 9, is 1 precisely when both the inputs are the
same: when both the inputs are 0 the output is 1, when both the inputs are 1 the output is 1,
but when one is 0 and the other is 1 the output is 0. This function is called the equivalence
function or the if-and-only-if function, denoted either this way or as an equivalence.

Now, consider column 13, column 13 is false precisely when x is true but y is false, this
function is called the implication function, we say x implies y, we will have occasion to talk
more about the implication function that is column 13. The complement of column 13 is
column 2 which is the negation of implication which, we can denote in this fashion. Now, let
us look at column 3, column 3 is 0, 0, 1, 1 column 3 reproduces the x column therefore this is
the identity function in x.

Its complement is 1, 1, 0, 0, which is column 12; this is the negation of x. Now, let us look at
column 5, which is 0, 1, 0, 1 and reproduces input y. Its complement is 1, 0, 1, 0, which is
column 10, and therefore it is the negation of y. What remains now? Column 11. Column 11 is
1, 0, 1, 1, and comparing it with column 13 you find that this is the reverse implication.

If column 13 is the forward implication, x implies y, then column 11 has to be the reverse
implication, y implies x, and then its complement, column 4, is the negation of the reverse
implication. So, now we have named all the 16 functions. We find that some of these functions do
not depend on both the inputs. For example, 0 and 1 are 0-variable functions; x, y, negation of
x and negation of y are 1-variable functions; the rest are 2-variable functions.
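The column numbering used above can be reproduced with a short Python sketch. It is an illustration only: the helper names `column`, `depends_on_x` and `depends_on_y` are assumptions of this example, and the labels in `names` simply restate the names given in the lecture. Column k is read as the 4-bit binary form of k, listing the outputs on the inputs (0, 0), (0, 1), (1, 0), (1, 1).

```python
from itertools import product

inputs = list(product((0, 1), repeat=2))  # (x, y) rows: 00, 01, 10, 11

def column(k):
    """Outputs of column k on rows 00, 01, 10, 11 (k written as 4 bits)."""
    return tuple((k >> (3 - row)) & 1 for row in range(4))

def depends_on_x(col):
    # Rows 0 and 2 differ only in x, and so do rows 1 and 3.
    return col[0] != col[2] or col[1] != col[3]

def depends_on_y(col):
    # Rows 0 and 1 differ only in y, and so do rows 2 and 3.
    return col[0] != col[1] or col[2] != col[3]

names = {
    0: "0", 1: "AND", 2: "NOT (x -> y)", 3: "x",
    4: "NOT (y -> x)", 5: "y", 6: "XOR", 7: "OR",
    8: "NOR", 9: "EQUIV", 10: "NOT y", 11: "y -> x",
    12: "NOT x", 13: "x -> y", 14: "NAND", 15: "1",
}

# Columns whose output really depends on both inputs.
genuinely_binary = [
    k for k in range(16)
    if depends_on_x(column(k)) and depends_on_y(column(k))
]

for k in range(16):
    print(k, column(k), names[k])
```

Under these assumptions, `genuinely_binary` holds the ten columns that are 2-variable functions in the strict sense; the other six are the two constants and x, y and their negations.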

(Refer Slide Time: 18:34)

In general, when you have a function of this form, to specify the function you can draw up
what is called the truth table. In the truth table, we list all the inputs to the function.
Here there are n inputs, and we have one column for each of them. Each input can take on either
of the truth values, so there are 2^n rows in this table, corresponding to the various
assignments of truth values. When you specify the function you have to fill these 2^n
positions, and at each position there are 2 possibilities, 0 or 1. So, altogether there are
2^(2^n) possibilities.

(Refer Slide Time: 19:50)

So, there are 2^(2^n) Boolean functions of this signature. Special cases of this we have
already seen. When n equals 0, there are 2^(2^0) = 2^1 = 2 Boolean functions; these are the
0 function and the 1 function. When n equals 1, we have 2^(2^1) = 2^2 = 4 functions; these are
x, the negation of x, 0 and 1. And when n equals 2, when there are 2 inputs, we have
2^(2^2) = 2^4 = 16 Boolean functions; we have seen the list of them.
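The counting argument can be checked by brute force for small n. The sketch below is not from the lecture; the function name `all_boolean_functions` is an assumption of this example. It represents an n-variable function by its output column, a tuple with one entry per truth-table row.

```python
from itertools import product

def all_boolean_functions(n):
    """Return every n-variable Boolean function as a tuple of 2**n outputs."""
    rows = 2 ** n  # one row per assignment of the n inputs
    return list(product((0, 1), repeat=rows))

for n in range(4):
    count = len(all_boolean_functions(n))
    assert count == 2 ** (2 ** n)
    print(n, count)
```

The enumeration grows doubly exponentially, so it is only practical for very small n; n = 5 would already mean more than four billion functions.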

(Refer Slide Time: 20:52)

Now, consider the truth table of a Boolean function. Let us say, it is a 3-variable Boolean
function, then there are eight rows in this truth table, these are the eight rows. Let us say, the
function is specified in this fashion, the complement of f also we will specify. So, a truth
table of a Boolean function is specified in this fashion, so this is a 3-variable Boolean
function.

Now, let us post this question, can this 3-variable Boolean function be synthesized using 0, 1,
2 variable Boolean functions? So, this is the problem we want to address, we are given the
truth table of a function. A Boolean function can be completely specified by specifying its
truth table because this truth table now maps all the possible inputs to the corresponding
outputs, so the function is completely specified by a truth table. So, the function is specified
using a truth table and we want to see, if the function can be synthesized using 0, 1 or 2
variable Boolean functions.

(Refer Slide Time: 23:00)

So, what will come in handy to do this are what are called De Morgan’s laws. First, a
definition: for two Boolean expressions e1 and e2, we say that e1 is equivalent to e2 if and
only if the truth tables of e1 and e2 are identical; that is, they should have exactly the same
behaviour in terms of the output.

(Refer Slide Time: 24:03)

Now, De Morgan’s laws specify certain equivalences for two Boolean variables x and y. By the
way, x OR y can be denoted in either of these two ways, and an AND can be denoted either as
this or as a concatenation; in particular, when I write xy, I mean x AND y. The first De
Morgan’s law says that the complement of x OR y is the AND of x complement and y complement.

The second De Morgan’s law says that the complement of the AND of x and y is the OR of x
complement and y complement. By the way, an AND is also called a conjunction and an OR is
also called a disjunction; you should be familiar with these words. So, these are the two De
Morgan’s laws. We say that the left-hand side is equivalent to the right-hand side, but how do
we show this? We can show this using a truth table.

(Refer Slide Time: 25:38)

So, the two-variable versions of De Morgan’s laws can be shown like this. Take the two inputs
x and y and consider all possible inputs (0, 0), (0, 1), (1, 0) and (1, 1); then the complements
of x and y would be these. Now, let us consider x OR y, which is 0, 1, 1, 1. What then would be
the complement of x OR y? That would be 1, 0, 0, 0. Now, De Morgan’s law says that this is the
same as the AND of x bar and y bar: 1 AND 1 is 1, 1 AND 0 is 0, 0 AND 1 is 0, and 0 AND 0 is 0.

So, you find that these two columns correspond exactly to each other; therefore x OR y, the
whole bar, is equivalent to x bar y bar. Similarly, let us consider xy: 0 AND 0 is 0, 0 AND 1
is 0, 1 AND 0 is 0 and 1 AND 1 is 1, so xy is 0, 0, 0, 1. Then what would be the complement of
xy? That would be 1, 1, 1, 0. And what would be x bar OR y bar? We have x bar here and y bar
here; if you take the OR of them you get 1 here, 1 here, 1 here and 0 here. So, you find that
these two columns again correspond. So, the complement of x AND y is equivalent to the OR of
x bar and y bar. This truth table establishes the two De Morgan’s laws for two variables.
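The truth-table check of the two laws can also be carried out mechanically. The following Python sketch is an illustration only; the helper `equivalent` is a made-up name, and it implements the definition of equivalence given earlier (identical truth tables) by comparing two expressions on every row.

```python
from itertools import product

def equivalent(e1, e2, n):
    """True if e1 and e2 have identical truth tables over n Boolean inputs."""
    return all(e1(*row) == e2(*row) for row in product((0, 1), repeat=n))

# Truth values are the integers 0 and 1; 1 - a is the complement of a.
first_law = equivalent(
    lambda x, y: 1 - (x | y),          # complement of (x OR y)
    lambda x, y: (1 - x) & (1 - y),    # (x bar) AND (y bar)
    2,
)
second_law = equivalent(
    lambda x, y: 1 - (x & y),          # complement of (x AND y)
    lambda x, y: (1 - x) | (1 - y),    # (x bar) OR (y bar)
    2,
)

print(first_law, second_law)
```

The same helper can compare any two expressions over the same inputs, so it also rejects non-equivalent pairs such as x OR y against x AND y.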

(Refer Slide Time: 28:09)

In fact, you can extend this to any number of variables. For example, with three variables, let
us say we want to compute the complement of x OR y OR z. Since OR is an associative operator,
you can write this as the complement of x plus (y plus z). Then, using De Morgan’s law, we can
write this as the conjunction of x bar and (y plus z) the whole bar. But we know that (y plus z)
the whole bar is y bar z bar, by applying De Morgan’s law for two variables, and since
conjunction is associative, we can write the result as x bar y bar z bar.

In other words, the complement of x OR y OR z is the conjunction of x bar, y bar and z bar.
Similarly, as a dual, the complement of x AND y AND z is, by associativity of conjunction, the
complement of x (yz). That is x bar OR (yz) bar, and (yz) bar is y bar OR z bar, which by
associativity of OR gives x bar OR y bar OR z bar. Therefore De Morgan’s laws can be extended
from two variables to three variables, and in the same way to four variables and so on.
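Written out step by step, the three-variable argument looks as follows. This is only a restatement of the spoken derivation in symbols, with + for OR and juxtaposition for AND as in the lecture.

```latex
\begin{align*}
\overline{x + y + z}
  &= \overline{x + (y + z)}            && \text{associativity of OR} \\
  &= \bar{x}\,\overline{(y + z)}       && \text{De Morgan for two variables} \\
  &= \bar{x}\,(\bar{y}\,\bar{z})       && \text{De Morgan for two variables} \\
  &= \bar{x}\,\bar{y}\,\bar{z}         && \text{associativity of AND} \\[6pt]
\overline{x\,y\,z}
  &= \overline{x\,(y\,z)}              && \text{associativity of AND} \\
  &= \bar{x} + \overline{(y\,z)}       && \text{De Morgan for two variables} \\
  &= \bar{x} + (\bar{y} + \bar{z})     && \text{De Morgan for two variables} \\
  &= \bar{x} + \bar{y} + \bar{z}       && \text{associativity of OR}
\end{align*}
```

Repeating the same two steps peels off one variable at a time, which is how the laws extend to four or more variables.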

(Refer Slide Time: 30:06)

Now, let us go back to our Boolean function f, and let us say we want to express f in terms of
the inputs x, y and z and 0-, 1- and 2-variable logical connectives; logical connectives are the
same thing as Boolean functions. So, let us see how f can be expressed in this form. For that,
let us take another look at the truth table corresponding to f. Here we find that f is 1
precisely when x, y, z are 0, 1, 0 or 1, 0, 1 or 1, 1, 1, that is, in rows 2, 5 and 7.

(Refer Slide Time: 31:16)

Which means f is logically equivalent to x bar y z bar, which corresponds to row 2 (0, 1, 0),
OR x y bar z, which corresponds to row 5 (1, 0, 1), OR x y z, which corresponds to row 7
(1, 1, 1). When one of these combinations happens, f becomes 1; that is what the truth table
tells us. So, f can be expressed as a composite expression of this form, which uses the input
variables x, y and z and the logical connectives AND, OR and NOT. This form is called the sum
of products form. It is expressed as an OR of several terms, and OR corresponds to the addition
symbol, which is why we call it a sum; and each of the terms here is a conjunction.

For example, x bar y z bar is a conjunction: it is an AND of three literals, where a literal is
either a variable or its complement. A conjunction of several literals is called a product
because in arithmetic expressions juxtaposition corresponds to multiplication; when several
entities are juxtaposed we call it a product. Similarly, here also we call this a product, so
this is a sum of products form.

So, looking at the truth table, we can write the sum of products form of the Boolean function
in this manner. We can extend this to any truth table. Given the truth table of an n-variable
Boolean function, we can identify the rows in which the function becomes 1, and corresponding
to these rows we can form conjunctions like this. In this case there are three rows,
corresponding to 0, 1, 0; 1, 0, 1 and 1, 1, 1, and they translate into the product terms
x bar y z bar; x y bar z; x y z. If any one of them is true then the function evaluates to true.

So, what we want to say is that the function is true if and only if at least one of these three
terms is true. For any Boolean assignment at most one of them can be true, since each term is
true in exactly one row, so "at least one" here is the same as "exactly one". This can be
expressed using an OR of all these terms, which is why we call this a sum of products form.
Therefore, given the truth table of a Boolean function, we can write the sum of products form
of the function.
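The procedure just described (one product term per row in which the function is 1) can be sketched in Python. This is an illustration under stated assumptions, not the lecture's own notation: the function name `sum_of_products` is made up, a complemented variable is written with a trailing apostrophe instead of a bar, and the example function f is the one from the lecture, which is 1 exactly in rows 2, 5 and 7.

```python
from itertools import product

def sum_of_products(f, names):
    """Build a sum-of-products string from the rows where f is 1."""
    terms = []
    for row in product((0, 1), repeat=len(names)):
        if f(*row):
            # A 1 contributes the variable, a 0 its complement (written v').
            literals = [v if bit else v + "'" for v, bit in zip(names, row)]
            terms.append(" ".join(literals))
    return " + ".join(terms)

# The lecture's example: f is 1 exactly in rows 2, 5 and 7.
def f(x, y, z):
    return int((x, y, z) in {(0, 1, 0), (1, 0, 1), (1, 1, 1)})

print(sum_of_products(f, "xyz"))  # x' y z' + x y' z + x y z
```

For a function that is 0 in every row the sketch returns an empty string, since there are no rows to turn into terms.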

(Refer Slide Time: 34:41)

Now, let us consider the complement of this function. The complement of the function is 1 in
the rows which are not 2, 5 or 7, which means rows 0, 1, 3, 4 and 6; these five rows will make
the complement of f 1. Therefore, I can write f bar as x bar y bar z bar, which corresponds to
0, OR x bar y bar z, which corresponds to 1, OR x bar y z, which corresponds to 3, OR
x y bar z bar, which corresponds to 4, OR x y z bar, which corresponds to 6.

So, this is the sum of products form for f bar. Once you have the sum of products form for
f bar, an expression for f can be obtained from it using De Morgan’s law. We want an expression
for f, which is the complement of f bar. So, we complement the left-hand side, and therefore we
should complement the right-hand side also.

On the right-hand side we have a disjunction. When you complement a disjunction, what you need
to do is complement each of its terms and then take the conjunction of these complements. So,
we take the complement of x bar y bar z bar, which by De Morgan’s law is x plus y plus z (plus
stands for OR), and that has to be ANDed with the other complements: the complement of
x bar y bar z is x plus y plus z bar, the complement of x bar y z is x plus y bar plus z bar,
the complement of x y bar z bar is x bar plus y plus z, and the complement of x y z bar is
x bar plus y bar plus z. This is the product of sums form for f.

So, what this shows is that we can find the product of sums form for f when we are given the
truth table for f. From the truth table for f we construct the truth table for f bar. You
identify all the rows in which f bar is true; these are precisely the rows in which f is false.
From these rows we construct the sum of products form for f bar, and if you apply De Morgan’s
law to this sum of products form for f bar, we get the product of sums form for f.
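The route from truth table to product of sums can be sketched in Python as well. The names `product_of_sums` and `pos_value` are assumptions of this example, and complements are written with a trailing apostrophe. Each row in which f is 0 yields one clause; applying De Morgan's law to that row's product term flips every literal, so a 0 in the row contributes the plain variable and a 1 contributes its complement.

```python
from itertools import product

def product_of_sums(f, names):
    """Build a product-of-sums string: one clause per row where f is 0."""
    clauses = []
    for row in product((0, 1), repeat=len(names)):
        if not f(*row):
            # Complementing the row's product term flips every literal.
            literals = [v + "'" if bit else v for v, bit in zip(names, row)]
            clauses.append("(" + " + ".join(literals) + ")")
    return "".join(clauses)

def pos_value(f, n, assignment):
    """Evaluate the product of sums of f on one assignment.

    The clause built from a zero row is false only on that row itself,
    i.e. it is true whenever the assignment differs from the row somewhere.
    """
    zero_rows = [r for r in product((0, 1), repeat=n) if not f(*r)]
    return int(all(any(a != b for a, b in zip(assignment, r)) for r in zero_rows))

# The lecture's example: f is 1 exactly in rows 2, 5 and 7.
def f(x, y, z):
    return int((x, y, z) in {(0, 1, 0), (1, 0, 1), (1, 1, 1)})

print(product_of_sums(f, "xyz"))
```

For the example function this gives five clauses, one for each of the rows 0, 1, 3, 4 and 6, and `pos_value` agrees with f on all eight rows.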

(Refer Slide Time: 38:13)

So, what that establishes is this: given the truth table of an n-variable Boolean function f,
we can write the sum of products or the product of sums form of f. So, we have an algorithm for
doing this, and the sum of products and product of sums forms use only these Boolean
connectives, negation, OR and AND, in addition to the variables of the expression. So, the sum
of products form or the product of sums form will use these n variables along with these
connectives. Here AND and OR are 2-variable Boolean connectives, whereas negation is a
1-variable Boolean connective.

Since every Boolean function, irrespective of the number of variables in it, can be expressed
using these three Boolean connectives, we say that this set is a complete set of connectives;
that is, these three connectives are enough to express any Boolean function. But is this the
only complete set of connectives? By no means.

(Refer Slide Time: 40:02)

That is because, again from De Morgan’s law, we know that the complement of x OR y is the AND
of the complement of x and the complement of y. Therefore x OR y is equivalent to the
complement of x AND the complement of y, the whole complemented. What this establishes is that
OR can be expressed in terms of AND and negation. Now, we know that AND, negation and OR
together form a complete set of connectives, and here we find that OR can be synthesized using
AND and negation, so OR is not indispensable here.

Therefore, we find that AND and negation together form a complete set of connectives. Then we
have the dual of this: x AND y, the whole complemented, is equivalent to x complement OR y
complement, which means x AND y is equivalent to x complement OR y complement, the whole
complemented. In other words, AND can be expressed in terms of OR and negation. Therefore OR
and negation also form a complete set of connectives. Is there a smaller set of connectives
that is complete? Indeed, there is.
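The two substitutions above are easy to check exhaustively. The sketch below is illustrative only; the function names are assumptions of this example, and truth values are the integers 0 and 1.

```python
from itertools import product

def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def or_from_and_not(x, y):
    # x OR y = NOT(NOT x AND NOT y), by the first De Morgan law
    return NOT(AND(NOT(x), NOT(y)))

def and_from_or_not(x, y):
    # x AND y = NOT(NOT x OR NOT y), by the second De Morgan law
    return NOT(OR(NOT(x), NOT(y)))

for x, y in product((0, 1), repeat=2):
    assert or_from_and_not(x, y) == OR(x, y)
    assert and_from_or_not(x, y) == AND(x, y)
print("both substitutions agree on all four rows")
```

Since each of OR and AND can be rebuilt from the other together with NOT, either one can be dropped from the set {AND, OR, NOT} without losing completeness.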

(Refer Slide Time: 42:08)

For example, consider the NAND function, denoted like this as an up arrow. What is A NAND A?
A NAND A is equivalent to the complement of A AND A. But then what is A AND A? When A is 0 it
is 0 AND 0, which is 0, and when A is 1 it is 1 AND 1, which is 1, which means A AND A is
nothing but A. So A NAND A is the complement of A: when a variable is NANDed with itself, we
get its complement. In other words, we can use NAND to generate negation.

Now, look at the set of AND and negation. In this, negation can be synthesized using NAND.
What about AND? AND is nothing but the negation of NAND, so if you take the NAND of A and B
and then negate it, what we get is the AND of A and B. How do we get this negation? This
negation also can be synthesized using NAND. So, if you take the NAND of A NAND B with itself,
what we get is A AND B. So, we find that both AND and NOT can be synthesized using NAND
connectives. In other words, NAND alone is a complete set of connectives. Then, of course, you
would expect NOR also to behave the same way, and it indeed does.
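A short Python sketch of these NAND constructions follows. The function names are made up for this example. `or_via_nand` is an extra that the lecture does not spell out at this point; it follows from De Morgan's law, since the NAND of the two complements is the OR of the originals.

```python
from itertools import product

def NAND(a, b):
    return 1 - (a & b)

def not_via_nand(a):
    # A NAND A is the complement of A
    return NAND(a, a)

def and_via_nand(a, b):
    # Negate (A NAND B) by NANDing it with itself
    t = NAND(a, b)
    return NAND(t, t)

def or_via_nand(a, b):
    # Extra, by De Morgan: (NOT A) NAND (NOT B) = A OR B
    return NAND(NAND(a, a), NAND(b, b))

for a, b in product((0, 1), repeat=2):
    assert not_via_nand(a) == 1 - a
    assert and_via_nand(a, b) == (a & b)
    assert or_via_nand(a, b) == (a | b)
print("NOT, AND and OR all built from NAND")
```

Each construction uses nothing but calls to `NAND`, which is what it means for NAND alone to be a complete set of connectives.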

(Refer Slide Time: 44:15)

If you take x NOR x, you find that this is nothing but x complement. When x is 0 we have 0 OR
0, which is 0, and the negation of that is 1; when x is 1 we have 1 OR 1, which is 1, the
negation of which is 0. Therefore, x NOR x is the negation of x, so negation can be synthesized
using NOR. Similarly, x OR y is nothing but the complement of x NOR y, and we have just seen
that complement can be synthesized using NOR. Therefore this also can be expressed in terms of
NOR: (x NOR y) NOR (x NOR y) is what x OR y is.

Therefore, NOR alone is also a complete set of connectives, since OR and negation are complete.
So, we find that there are these two singleton sets that are complete sets of connectives, the
NAND connective and the NOR connective. But are these the only singleton sets that are complete
sets of connectives? It transpires that they are the only ones, but how do we show this?
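The NOR constructions can be checked the same way. Again this is only a sketch with made-up function names; `and_via_nor` is an extra not stated in the lecture, obtained from De Morgan's law.

```python
from itertools import product

def NOR(a, b):
    return 1 - (a | b)

def not_via_nor(x):
    # x NOR x is the complement of x
    return NOR(x, x)

def or_via_nor(x, y):
    # (x NOR y) NOR (x NOR y) = x OR y
    t = NOR(x, y)
    return NOR(t, t)

def and_via_nor(x, y):
    # Extra, by De Morgan: (NOT x) NOR (NOT y) = x AND y
    return NOR(NOR(x, x), NOR(y, y))

for x, y in product((0, 1), repeat=2):
    assert not_via_nor(x) == 1 - x
    assert or_via_nor(x, y) == (x | y)
    assert and_via_nor(x, y) == (x & y)
print("NOT, OR and AND all built from NOR")
```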

(Refer Slide Time: 45:58)

Suppose some function h on 2 variables is a universal logic gate. A universal logic gate is
precisely a 2-variable Boolean function that forms a complete set of connectives by itself. We
know that NAND and NOR are universal logic gates, but are these the only universal logic gates?
We want to claim that they are. Now, suppose h is some 2-variable Boolean function that is a
universal logic gate. Then I claim that h of 0, 0 will have to be 1. Why is this? Suppose h of
0, 0 is 0, that is, when both the inputs to h are 0 the output is 0. Then consider a logical
expression, a Boolean expression, that is synthesized using h.

(Refer Slide Time: 47:38)

Consider a composite proposition made up of h connectives. So, in this the only connective
that has been used is h. Let us say both the inputs are 0. Then we apply an h connective on
these inputs and it produces a 0. On the same inputs we might have other h connectives applied,
which will also produce 0, and then we might combine these 0s to produce further signals using
more h connectives; they will also keep producing 0.

In other words, if you have a circuit made up of h gates in this manner, then every signal
inside it will be 0 if both the inputs are 0. So, if both the inputs are 0 we will not be able
to produce a 1 at the output. But not every Boolean function is of this form.

(Refer Slide Time: 49:00)

In particular, consider the NAND function. In the truth table of NAND, the first entry is 1;
that is, when both the inputs are 0 the output is 1. So, how would you synthesize NAND using h
gates alone? That is not possible, because if both the inputs are 0, h gates will keep
producing only 0s; they will never produce a 1. So, NAND cannot be synthesized using h gates.

Therefore, we know that h of 0, 0 has to be 1. Similarly, we can also argue that h of 1, 1 has
to be 0, that is, when both the inputs are 1 the output has to be 0. Otherwise every signal
would be 1 when both the inputs are 1, and we would not be able to produce, for example, the
NAND function, which produces a 0 when both the inputs are 1. Therefore, if you visualize the
truth table of h, you find that it should be of this form: the first entry is 1 and the last
entry is 0.

Now, there are two vacancies, and these vacancies can be filled in 4 different ways. So, let us
consider all those 4 possibilities.

(Refer Slide Time: 50:37)

So, here we have 1 and here we have 0. Let us fill the two vacancies using 0, 0; the other
alternatives are 0, 1 or 1, 0 or 1, 1. So, these are the 4 possible functions that could be
universal logic gates: 1, 0, 0, 0; 1, 0, 1, 0; 1, 1, 0, 0; and 1, 1, 1, 0. Now, what is
1, 0, 1, 0? It is the complement of y. And what is 1, 1, 0, 0? It is the complement of x. These
two cannot be universal because they depend on only one variable.

So, using these two gates, that is, the negation of y and the negation of x, we will not be
able to synthesize any Boolean function which depends on both the inputs. So, these two are
ruled out. What are the remaining? 1, 0, 0, 0 is nothing but the NOR function, and 1, 1, 1, 0
is nothing but the NAND function. So, we find that NOR and NAND are the only universal logic
gates. That is it from this lecture. Hope to see you in the next one. Thank you.
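The claim that NOR and NAND are the only universal logic gates can also be confirmed by exhaustive search, as a complement to the argument given in the lecture. The Python sketch below is an illustration under stated assumptions: the names `closure` and `universal` are made up, and a gate h is represented by its 4-tuple of outputs on (0, 0), (0, 1), (1, 0), (1, 1). Starting from the truth tables of the inputs x and y, it repeatedly applies h to every pair of tables reached so far. A gate is counted as universal when all 16 two-variable truth tables are reached, which includes AND and NOT and is therefore enough for completeness.

```python
from itertools import product

ROWS = list(product((0, 1), repeat=2))   # (x, y) = 00, 01, 10, 11
X = tuple(x for x, _ in ROWS)            # truth table of the input x
Y = tuple(y for _, y in ROWS)            # truth table of the input y

def closure(h):
    """All 2-variable truth tables obtainable from x and y using only gate h.

    h is a 4-tuple giving h(0,0), h(0,1), h(1,0), h(1,1).
    """
    reached = {X, Y}
    while True:
        # Feed every pair of known signals into one more h gate, row by row.
        new = {
            tuple(h[2 * a + b] for a, b in zip(s, t))
            for s in reached
            for t in reached
        }
        if new <= reached:
            return reached
        reached |= new

universal = [h for h in product((0, 1), repeat=4) if len(closure(h)) == 16]
print(universal)  # [(1, 0, 0, 0), (1, 1, 1, 0)]
```

The two survivors are the columns 1, 0, 0, 0 (NOR) and 1, 1, 1, 0 (NAND), matching the conclusion of the lecture. Note that the search gives circuits only x and y as inputs, with no constant signals, which is the same setting as the argument above.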

Discrete Mathematics
Professor Sajith Gopalan, Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 2
Propositional Calculus

Welcome to the MOOC on discrete mathematics, this is the second lecture on mathematical
logic. We will continue with our discussion of propositional calculus that we started in the
last class.

(Refer Slide Time: 00:42)

In the last class, we talked about Boolean functions on n variables, and we saw that such
functions can be synthesized using AND, OR and NOT gates. These are logical connectives; using
these logical connectives we can synthesize any Boolean function on n variables. Therefore,
this set of logical connectives is called a complete set of connectives. We found that this is
by no means the only complete set of connectives.

We find that AND and NOT also form a complete set of connectives. Similarly, OR and NOT
also form a complete set of connectives, NAND is also a complete set of connectives on its
own. NOR also forms a complete set of connectives on its own. So, a Boolean function on n
variables can be synthesized using any of these sets.

(Refer Slide Time: 02:02)

Some of these logical connectives can be expressed as logic gates. An AND gate is drawn like
this in a circuit diagram. So, an AND gate has 2 inputs x and y, and the output is x AND y; the
output is 1 if and only if both the inputs are 1. An OR gate is represented in a diagram in
this manner, where the inputs are x and y. A NOT gate has only 1 input, let us call it x, and
its output is the negation of x.

(Refer Slide Time: 03:01)

A NOR gate, which is the negation of OR, is drawn in this manner; x and y are the inputs and
the output is x NOR y. A NAND gate is an AND gate followed by a negation. An XOR gate, which
produces an output of 1 if and only if the inputs are not the same, is denoted like this.
Therefore, using these logic gates you can convert a Boolean expression into a circuit.

(Refer Slide Time: 04:03)

For example, if you have a Boolean expression of the form (x1 OR x2) AND x3, the signals x1
and x2 can be ORed using an OR gate, the output of which can be ANDed with x3. So, this is
the circuit diagram corresponding to this Boolean expression. So, what we know is that any
Boolean expression can be converted into a circuit diagram using AND, OR, NOT gates,
using either the sum of products form or the product of sums form. You can have an
equivalent circuit diagram created using only NAND gates too, or only NOR gates.

(Refer Slide Time: 05:03)

NAND and NOR are universal logic gates because every circuit can be synthesized using
only one of these, and these are the only universal logic gates. This we proved using the
following argument. Suppose h is a universal logic gate, a Boolean function h with 2
variables.

(Refer Slide Time: 06:09)

If this is a universal logic gate, then when both the inputs are 0, the output of h will have
to be 1; otherwise we will not be able to use h to synthesize a function like NAND. That is
because when you have a circuit made up only of h gates and both the inputs to the circuit are
0, then the circuit will have only 0s on the inside, if h of 0, 0 were 0. Therefore, to be able
to synthesize every function which produces 1 when both the inputs are 0, h of 0, 0 will have
to be 1. Similarly, h of 1, 1 will have to be 0. Therefore, we have only four possible
functions. Of these, 2 are the negations of the inputs. Therefore, the 2 remaining functions
are NAND and NOR; these are the only universal logic gates.
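
The argument above can also be checked by brute force. The following is a minimal sketch, not
part of the lecture: it assumes, as the argument does, that circuits are built from h gates
alone with no constant inputs, and the names closure and universal_gates are made up for
illustration. A truth table is written as four outputs in the input order 00, 01, 10, 11.

```python
from itertools import product

# The four input rows in the order used in the lecture: 00, 01, 10, 11.
INPUTS = list(product((0, 1), repeat=2))

def closure(h):
    """All 2-input truth tables computable by circuits built only from gate h."""
    x = tuple(a for a, b in INPUTS)  # the signal x itself: 0, 0, 1, 1
    y = tuple(b for a, b in INPUTS)  # the signal y itself: 0, 1, 0, 1
    known = {x, y}
    while True:
        # Feed every pair of already computable signals into one more h gate.
        new = {tuple(h[(f[i], g[i])] for i in range(4)) for f in known for g in known}
        if new <= known:
            return known
        known |= new

def universal_gates():
    """Truth tables of all 2-input gates from which all 16 functions can be built."""
    result = []
    for outputs in product((0, 1), repeat=4):
        h = dict(zip(INPUTS, outputs))
        if len(closure(h)) == 16:
            result.append(outputs)
    return result

print(universal_gates())
```

Under these assumptions the search reports only the tables 1, 0, 0, 0 (NOR) and 1, 1, 1, 0
(NAND), in agreement with the argument.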

(Refer Slide Time: 07:17)

Now, let us consider implication. The implication is a 2 variable Boolean function with this
truth table: when both the inputs are 0 the output is 1, when x is 0 and y is 1 the output is
1, when x is 1 and y is 0 the output is 0, and when both the inputs are 1 the output is 1. So,
what it means is that x implies y is false if and only if x is true and y is false. Or we can
say, x implies y is logically equivalent to the complement of (x AND y bar), which by De
Morgan's law is x bar plus y. So, we find that x implies y is true when x is false or y is true.
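
As a small illustration, the truth table of implication and its equivalence with x bar plus y
can be checked mechanically. This is only a sketch; the function name implies is chosen here
for illustration.

```python
from itertools import product

def implies(x, y):
    # x implies y is false only when x is true and y is false.
    return not (x and not y)

# Print the truth table and check the De Morgan form (NOT x) OR y on every row.
for x, y in product((False, True), repeat=2):
    assert implies(x, y) == ((not x) or y)
    print(int(x), int(y), int(implies(x, y)))
```

The rows are printed in the order 00, 01, 10, 11, the same order as in the truth table above.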

(Refer Slide Time: 08:33)

In the case of an implication, x is called the antecedent and y is called the consequent. So
what we have seen is that if the antecedent is false the implication is true; on the other
hand, if the antecedent is true then the implication is true only if the consequent is true.
This has interesting consequences: we can derive essentially any statement from an antecedent
which is false.

(Refer Slide Time: 09:36)

The legend has it that Bertrand Russell famously showed that if 2 plus 2 equals 5 then
Bertrand Russell is the pope. His argument was this: if 2 plus 2 equals 5, then we can
subtract 2 from both sides and we have 2 equals 3; if we subtract a further 1 from both
sides we have 1 equals 2. Therefore, we have 2 equals 1. Then consider the set containing
Bertrand Russell and the pope; this set has 2 people, namely Bertrand Russell and the pope.

Therefore, the cardinality of the set is 2, but then we know that 2 equals 1, therefore the
cardinality of this set is 1. But if the cardinality of a set is 1, that is a singleton set and
it has only 1 member; therefore, Bertrand Russell is the pope. In other words, if the
antecedent is false then practically you can prove anything as the consequent. Therefore, we
can judge an implication only when the antecedent happens to be true.

(Refer Slide Time: 11:02)

Consider the implication x implies y. The implication y bar implies x bar is called the
contrapositive of x implies y. Using a truth table you can readily verify that x implies y is
equivalent to y bar implies x bar. Every implication is logically equivalent to its
contrapositive. y implies x is the converse of x implies y, and x bar implies y bar is the
inverse of x implies y. The inverse is the contrapositive of the converse, so by what we have
seen just now, the converse and the inverse are logically equivalent; namely, y implies x is
logically equivalent to x bar implies y bar.
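
These equivalences can be verified with a truth table in a few lines. The sketch below uses
helper names (implies, equivalent) that are chosen here for illustration only.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def equivalent(f, g):
    """True if f and g have identical truth tables over two variables."""
    return all(f(x, y) == g(x, y) for x, y in product((False, True), repeat=2))

original = lambda x, y: implies(x, y)
contrapositive = lambda x, y: implies(not y, not x)
converse = lambda x, y: implies(y, x)
inverse = lambda x, y: implies(not x, not y)

print(equivalent(original, contrapositive))  # an implication vs its contrapositive
print(equivalent(converse, inverse))         # the converse vs the inverse
print(equivalent(original, converse))        # an implication vs its converse
```

The first two checks succeed and the third fails: an implication is in general not equivalent
to its converse.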

(Refer Slide Time: 12:21)

The implication p implies q will be written in English as: if p then q, p is sufficient for
q, q when p, a necessary condition for p is q, q unless not p, p implies q, p only if q, q
follows from p. So, these are all essentially the same thing; they all denote the implication
p to q.

(Refer Slide Time: 13:32)

The logical connective p is equivalent to q which is the negation of the exclusive OR, is
usually written in English as p is necessary and sufficient for q, p if and only if q or p is
equivalent to q, equivalence is true if and only if p and q have the same logical value either
both are true or both are false.

(Refer Slide Time: 14:28)

So, we have seen several logical connectives now. In an expression that uses many of these
logical connectives, how would you parenthesize the expression if the parentheses are not
already placed in it? For this, we have to use the precedence rules. The precedence rules
commonly used are these: negation has the highest precedence, which means you have to
associate the negation symbol with the nearest variable first. AND and NAND have the next
precedence; they associate from left to right. OR and NOR have the next precedence; again,
they associate from left to right. Single implication comes next, but this associates from
right to left, and double implication has the least precedence, which again has right to left
association.

(Refer Slide Time: 15:37)

So, with these precedences in mind, let us work out an example. Let us see how parentheses
can be inserted in this expression. In an expression given like this, first you should insert
parentheses here because AND has the highest precedence, after that the precedence is for
OR therefore you should parenthesize them in this manner. So, this is the first parentheses,
this is the second one, this is also at the second level, then at the third level we have
implication.

So, this is the third parenthesization, and then the double implication, the 2-way
implication, has the least precedence, so this is the last one. So, this is how you would
parenthesize an expression containing many of these connectives. Let us consider another
expression, p implies q implies r. When we discussed precedences, we said that implications
have right to left associativity; therefore you will have to associate q implies r first, and
then put an outer pair of parentheses, giving p implies (q implies r).

Similarly, when I have an expression of this form, p or q exclusive OR r, we find that the
precedence is for this one, therefore this is where you put the parentheses first, and then
the whole of the expression is put in parentheses of a lower precedence. So, this is how you
would parenthesize Boolean expressions.

(Refer Slide Time: 17:43)

An expression is called a tautology if its truth table has only 1s in the rightmost column.
For example, let us consider x OR NOT x; this evaluates to 1 for every assignment to x, which
means x OR x bar is true for every imaginable assignment. If you draw the truth table of it,
in the rightmost column we have only 1s; therefore this is a tautology. A tautology is a
statement which is always true: whatever be the truth values that you assign to the variables
of the formula, the formula will always evaluate to true.

(Refer Slide Time: 19:08)

The complement of a tautology is called a contradiction. For example, if you have x AND x
bar, you find that it evaluates to false for every possible assignment; it has only 0s in the
rightmost column. Such a Boolean expression is called a contradiction, a statement which is
always false.
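
Checking for a tautology or a contradiction amounts to inspecting the rightmost column of the
truth table, which is easy to sketch in code. The helper names below (truth_column,
is_tautology, is_contradiction) are chosen here for illustration.

```python
from itertools import product

def truth_column(formula, num_vars):
    """The rightmost column of the truth table of formula, row by row."""
    return [bool(formula(*values)) for values in product((False, True), repeat=num_vars)]

def is_tautology(formula, num_vars):
    # Only 1s in the rightmost column.
    return all(truth_column(formula, num_vars))

def is_contradiction(formula, num_vars):
    # Only 0s in the rightmost column.
    return not any(truth_column(formula, num_vars))

print(is_tautology(lambda x: x or not x, 1))       # x OR x bar
print(is_contradiction(lambda x: x and not x, 1))  # x AND x bar
print(is_tautology(lambda x, y: x or y, 2))        # neither always true nor always false
```

Most expressions, such as x OR y, are neither tautologies nor contradictions.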

(Refer Slide Time: 19:47)

In the last class, we saw that two expressions e1 and e2 are logically equivalent if and only if
their truth tables are identical, in particular we saw De Morgan’s laws, let us see several more
equivalences which will be useful for us to prove various statements.

(Refer Slide Time: 20:18)

First come the identity laws. The identity laws state that p AND 1 is equivalent to p, and p
OR 0 is equivalent to p; you can verify this using the truth table. For example, consider the
variable p and the logical function 1. p can take on the values 0 and 1, and for either
assignment to p, the function 1 takes on the value 1; therefore p AND 1 here would be 0 and 1.
This is seen to be identical to the first column, which is p, so p AND 1 is the same as p.
Similarly, p OR 0 you find is identical to p again, proving the identity laws. So, when you
take the AND of p with 1 you get p itself, and the OR of anything with 0 will give you the same
thing.

(Refer Slide Time: 21:38)

Next are the domination laws. The domination laws say that p plus 1 is equivalent to 1;
anything ORed with 1 gives us 1. That is, 1 is a certainty, so something ORed with a certainty
is always a certainty. Similarly, p AND 0 is 0; you can verify these using the truth tables.
Then, the idempotent laws are: p OR p is p, and p AND p is also p.

(Refer Slide Time: 22:31)

Double negation asserts that the negation of the negation of x is the same as x again; you can
verify this using a truth table. The negation of 1 is 0 and the negation of 0 is 1, so this is
identical to x, establishing the double negation law. Then the law of commutativity, which says
that AND and OR are commutative operations; namely, x OR y is the same as y OR x, and similarly
x AND y is the same as y AND x.

(Refer Slide Time: 23:24)

Then we have the associative law. The associative law says that the OR of a with the OR of b
and c is equivalent to the OR of (a OR b) with c. So, by extending this we can take the OR of
a chain in whichever order is convenient. We can parenthesize in whichever way when we have a
long sequence of disjunctions. You can verify this using the truth table readily. Similarly,
for the AND operation, a AND (b AND c) is the same as (a AND b) AND c; again you can verify
this using the truth table.

(Refer Slide Time: 24:38)

Now comes distributivity. The law of distributivity in this form, a AND (b OR c) is (a AND b)
OR (a AND c), is familiar to all of you from arithmetic, that is, when we compare OR with
addition and AND with multiplication. This is a familiar form of distributivity which you can
easily verify using the truth table. But there is another distributivity law which does not
have an equivalent in the algebra of numbers: a OR (b AND c) is (a OR b) AND (a OR c), which
means OR distributes over AND, whereas addition does not distribute over multiplication in
arithmetic.

(Refer Slide Time: 25:50)

Then of course, we have the familiar De Morgan's laws, which say that the complement of (a
OR b) is the complement of a AND the complement of b. Similarly, the complement of (a AND b)
is the complement of a OR the complement of b.

(Refer Slide Time: 26:12)

The absorption laws allow you to absorb q in this fashion: p OR (p AND q) is the same as p.
Similarly, p AND (p OR q) is also the same as p.

(Refer Slide Time: 26:39)

Then, we have the negation laws, the OR of a quantity with the negation of itself is 1, the
AND of a quantity with the negation of it is equivalent to 0, these are the negation laws. So,
all these laws can be proved using truth tables.
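
Since every one of these laws can be proved by a truth table, the proofs can be mechanised.
The following is a minimal sketch, with 0 and 1 as the truth values and Python's & and |
standing for AND and OR; the names LAWS, NOT and holds are chosen here for illustration.

```python
from itertools import product

def NOT(a):
    return 1 - a

# Each law is a pair (left side, right side) of functions of p, q, r in {0, 1}.
LAWS = {
    "identity AND": (lambda p, q, r: p & 1, lambda p, q, r: p),
    "identity OR": (lambda p, q, r: p | 0, lambda p, q, r: p),
    "domination OR": (lambda p, q, r: p | 1, lambda p, q, r: 1),
    "domination AND": (lambda p, q, r: p & 0, lambda p, q, r: 0),
    "idempotent OR": (lambda p, q, r: p | p, lambda p, q, r: p),
    "idempotent AND": (lambda p, q, r: p & p, lambda p, q, r: p),
    "double negation": (lambda p, q, r: NOT(NOT(p)), lambda p, q, r: p),
    "commutative OR": (lambda p, q, r: p | q, lambda p, q, r: q | p),
    "commutative AND": (lambda p, q, r: p & q, lambda p, q, r: q & p),
    "associative OR": (lambda p, q, r: p | (q | r), lambda p, q, r: (p | q) | r),
    "associative AND": (lambda p, q, r: p & (q & r), lambda p, q, r: (p & q) & r),
    "AND over OR": (lambda p, q, r: p & (q | r), lambda p, q, r: (p & q) | (p & r)),
    "OR over AND": (lambda p, q, r: p | (q & r), lambda p, q, r: (p | q) & (p | r)),
    "De Morgan OR": (lambda p, q, r: NOT(p | q), lambda p, q, r: NOT(p) & NOT(q)),
    "De Morgan AND": (lambda p, q, r: NOT(p & q), lambda p, q, r: NOT(p) | NOT(q)),
    "absorption OR": (lambda p, q, r: p | (p & q), lambda p, q, r: p),
    "absorption AND": (lambda p, q, r: p & (p | q), lambda p, q, r: p),
    "negation OR": (lambda p, q, r: p | NOT(p), lambda p, q, r: 1),
    "negation AND": (lambda p, q, r: p & NOT(p), lambda p, q, r: 0),
}

def holds(lhs, rhs):
    """Compare the two sides on all 8 rows of the truth table."""
    return all(lhs(*row) == rhs(*row) for row in product((0, 1), repeat=3))

failed = [name for name, (lhs, rhs) in LAWS.items() if not holds(lhs, rhs)]
print("all laws hold" if not failed else failed)
```

A law that uses fewer than three variables simply ignores the extra arguments, which keeps the
checker uniform.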

(Refer Slide Time: 27:12)

Now, let us consider a problem, a problem of a kind that is due to Smullyan; you can find such
problems in various textbooks. Let us say there is an island, and on this island there are only
2 kinds of inhabitants, knights and knaves. Knights speak the truth and only the truth; knaves
tell lies and only lies. But you cannot distinguish who is a knight and who is a knave; all of
them look alike. Let us say we encounter 2 persons A and B. Their appearances do not disclose
whether they are knights or knaves; you will have to figure that out from what they say.

(Refer Slide Time: 28:14)

Let us say A says, B is a knight, and let us say B says, the two of us are of opposite types.
Then the question is, what are A and B? So, let us try to solve this problem by forming
propositions.

(Refer Slide Time: 28:56)

Let p denote the proposition A is a knight, and let q denote the proposition B is a knight.
Now, what does A say? A says that B is a knight, so A asserts q. So if A is a knight, which
means if p is true, then A would speak only the truth and this would be true; and if A is a
knave, then A would tell only lies and this would be a lie. That is, if A is a knight, which
means if p is true, then q also would be true.

On the other hand, if A is a knave, which means if p is false, then q would be false.
Therefore we have p if and only if q; so this is one conclusion we have.

(Refer Slide Time: 29:58)

Now, from what B has said, we can conclude that B is a knight, which is q, if and only if what
he said is right. Now, what did he say? He said that the two of them are of different types,
which would mean that either p is 1 and q is 0, or that p is 0 and q is 1. In other words,
either A is a knight and B is a knave, or A is a knave and B is a knight. So, now we have this
logical equivalence: q is true if and only if (p q bar plus p bar q) is true.

Therefore, what we have is the AND of the 2 logical statements: (p if and only if q) AND (q
if and only if (p q bar plus p bar q)). So, let us try to simplify this expression; let us see
if we can figure out what p and q are from these.

(Refer Slide Time: 31:25)

Now, p if and only if q can be written using AND, OR and NOT as p q plus p bar q bar; this is
true if and only if both have the same logical value, which is possible if both are true or
both are false. On the other hand, if q is true then p q bar plus p bar q will have to be
true; that is, if B is a knight then what he said should be right, which means either A is a
knight and B is a knave, or A is a knave and B is a knight.

Otherwise, q is false, in which case both are of the same type. That is, either both are
knights or both are knaves. So, the conjunction of the two statements is what we have taken;
these we concluded from the first statement and from the second statement. Therefore, both of
these conclusions must be right. What does that entail?

Let us try to simplify the second equivalence, which is q AND (p q bar plus p bar q), ORed
with q bar AND (p q plus p bar q bar). Taking the AND of q with p q bar, by associativity and
commutativity I can flip q and p and write this as p q q bar, and then I have p bar q q; that
is from the first term. From the second term, we have p q bar q and p bar q bar q bar. Let us
simplify this further. q q bar is 0, so p q q bar is p AND 0, which is 0; q q is q, therefore
we have p bar q here; q bar q is 0, so we have p AND 0, which is 0 again; and then p bar q bar
q bar is p bar q bar.

So the whole expression is equivalent to (p q OR p bar q bar) AND (p bar q OR p bar q bar).
From the second factor, if you take p bar outside, we have q plus q bar on the inside; since q
plus q bar is 1, and p bar AND 1 is p bar, the second factor is just p bar. Now, if you take p
bar inside the first factor, from the first term we have p bar p q, but since p bar p is 0 we
have 0 here; and then from the second term we have p bar p bar q bar, but p bar p bar is p
bar, so we have p bar q bar.

(Refer Slide Time: 35:11)

So, we find that this logical expression reduces to p bar q bar. What does that say? It says
that A is a knave and B is a knave; both of them are knaves. That is what we conclude from the
logical expressions.
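
The same conclusion can be reached by simply trying all four possibilities for p and q. The
sketch below is one way to do this; the helper name consistent is chosen here for
illustration. A speaker's statement must be true exactly when the speaker is a knight.

```python
from itertools import product

def consistent(is_knight, statement):
    # A knight's statement is true and a knave's statement is false.
    return is_knight == statement

solutions = []
for p, q in product((False, True), repeat=2):  # p: A is a knight, q: B is a knight
    a_says = q         # A says: B is a knight
    b_says = (p != q)  # B says: the two of us are of opposite types
    if consistent(p, a_says) and consistent(q, b_says):
        solutions.append((p, q))

print(solutions)
```

The only assignment that survives is p false and q false, that is, both A and B are knaves.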

(Refer Slide Time: 35:38)

Let us consider one more such problem. Let us say in this case person A says, at least 1 of us
is a knave, and suppose B does not say anything. Then let us see what A said. As before, let p
stand for the statement that A is a knight and q for the statement that B is a knight. If A is
a knight, then what he said is right. Now, what did he say? He said that at least 1 of us is a
knave; then either both of them are knaves, or A is a knight and B is a knave, or A is a knave
and B is a knight, that is, p bar q bar plus p bar q plus p q bar. It is not possible that both
of them are knights; for at least one of them to be a knave, this is precisely how the
situation should be. Now, what does this mean? This is logically equivalent to saying that if p
is true then the quantity in the bracket is true, or if p is false then the quantity in the
bracket is false. But what is the negation of the quantity in the bracket?

(Refer Slide Time: 37:05)

We want to negate p bar q bar plus p bar q plus p q bar. To compute the negation of this, let
us first simplify this expression. We can write this as p bar times (q bar plus q), and that
ORed with p q bar; this is what we want to negate. But then q bar plus q is 1, therefore this
is the negation of (p bar plus p q bar), which by De Morgan's law is the negation of p bar AND
the negation of p q bar. But the negation of p bar is p itself, by double negation, and the
negation of p q bar by De Morgan's law is p bar plus q.

If you take p inside, we have p p bar plus p q, which is equivalent to 0 plus p q, which means
we have p q. So, the quantity within the bracket when negated will give us p q. Therefore the
whole expression, p AND (p bar q bar plus p q bar plus p bar q), ORed with p bar AND p q,
reduces to p p bar q bar plus p p q bar plus p p bar q plus p bar p q, which is 0 for the first
expression, p q bar for the second, 0 for the third and 0 for the fourth; the OR of these four
quantities would be p q bar.
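
Both steps of this simplification can be confirmed with a truth table. The sketch below
assumes, as in the lecture, that p means A is a knight and q means B is a knight; the function
name statement is chosen here for illustration.

```python
from itertools import product

def statement(p, q):
    # "At least one of us is a knave": p bar q bar plus p bar q plus p q bar.
    return (not p and not q) or (not p and q) or (p and not q)

rows = list(product((False, True), repeat=2))

# Step 1: the negation of the bracketed quantity is p q.
negation_is_pq = all((not statement(p, q)) == (p and q) for p, q in rows)

# Step 2: A is a knight exactly when the statement is true.
solutions = [(p, q) for p, q in rows if p == statement(p, q)]

print(negation_is_pq, solutions)
```

The only row on which p agrees with the statement is p true and q false, which is the term p q
bar obtained above.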

(Refer Slide Time: 39:12)

So, what we have concluded is that the logical statement that we had is equivalent to p q bar,
which means A is a knight and B is a knave; that is the conclusion we have drawn. So, that is
it from this lecture, hope to see you in the next, thank you.

Discrete Mathematics
Professor Sajith Gopalan, Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 3
Mathematical Logic

Welcome to the MOOC on discrete mathematics, this is the third lecture on mathematical
logic. In the previous lecture, we talked about propositional logic. In propositional logic, we
have propositions and truth values of propositions. We saw how propositions can be combined to
form larger composite propositions, and how the truth values of the component propositions
combine to form the truth values of the larger propositions. But not every logical statement
can be captured using the apparatus of propositional calculus; there are some arguments for
which propositional calculus is not adequate.

(Refer Slide Time: 01:12)

For example, consider an argument of this form: All men are mortal, Socrates is a man, so
Socrates is mortal. In this argument, we form the conclusion from the first two propositions.
But if we call these propositions p and q, whether propositions p and q are true or false will
not help us in forming the third one as a conclusion. So, even if we assume that the first two
propositions are true, there is no way we can conclude that the third proposition is true using
the apparatus of propositional calculus. That is because the statements include predicates and
quantifiers.

So, let us see what we mean by this.

(Refer Slide Time: 02:27)

In the statement, All men are mortal, 'men' forms the subject of the sentence, 'are mortal'
forms the predicate, and 'All' is a quantifier.

(Refer Slide Time: 03:03)

In general, when we have a sentence of the form 4 greater than 3, we can say that 4 is the
subject of the statement and 'greater than 3' is the predicate of the statement. From this
statement, we can abstract the subject away and write it in this form: I use a variable for the
subject and we say that x is greater than 3. Let us say we denote this symbolically in this
manner: suppose P of x denotes x greater than 3. We might want to abstract away the other
constant 3 as well.

(Refer Slide Time: 03:51)

So, if you abstract that away too, then we will have two variables; then we will have a
sentence of the form x greater than y, where both x and y are unknown. We could denote this
as R of x, y. So, now we have two predicates: P of x, which says that x is greater than 3, and
R of x, y, which says that x is greater than y. You can substitute constants for the variables
in these predicates. In these formulae, by substituting 4 for x, we have P of 4, which is 4
greater than 3; this is correct.

On the other hand, if you substitute 2 for x, we have 2 greater than 3, which is false. If you
substitute 5 for x we have 5 greater than 3, which is true. So, depending on the value that you
substitute for x here, P of x may be true or false. Similarly, in the case of R of x, y, you
can substitute various values for x and y. You can substitute 3 and 4, which then would say 3
greater than 4, which is false; if you substitute 4 and 3 you will have 4 greater than 3, which
is true; if you have 7 and 4 you will have 7 greater than 4, which is true, and so on. So, now
we have a way of abstracting individuals away and replacing them with variables.
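
One way to picture a predicate is as a function that returns a truth value once constants are
substituted for its variables. The following is only an illustrative sketch of the two
predicates discussed above.

```python
def P(x):
    """P of x: x is greater than 3."""
    return x > 3

def R(x, y):
    """R of x, y: x is greater than y."""
    return x > y

# Substituting constants for the variables gives statements with truth values.
print(P(4), P(2), P(5))
print(R(3, 4), R(4, 3), R(7, 4))
```

Until a value is supplied for x, P of x is neither true nor false; only the substituted
instances have truth values.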

(Refer Slide Time: 05:24)

Now, let me introduce what are called quantifiers. The first quantifier is the universal
quantifier. A universal quantifier stands for the expression "for all", for example, when we
say, for all x P x, what we mean is that, for every x, P x is true.

(Refer Slide Time: 06:13)

The other quantifier is the existential quantifier. When we write a formula of this sort
using an existential quantifier, the symbol here is supposed to stand for "there exists"; what
we essentially say is that there exists an x such that P of x is true. So, P x is a predicate
with an argument supplied. x is the argument here, so that will take on a truth value as we
have seen before. So, this statement is supposed to say that there exists an x such that P of x
is true.

(Refer Slide Time: 07:12)

Now, in these statements, we say for all x, or there exists an x, such that some predicate is
satisfied. But then what do we mean by for all x? What kind of x do we talk about here? And
here, there exists an x, but where? This is not clear when we just say for all x or there
exists an x. Now, these quantified statements happen in a context, in a discourse.

So, from the context of the discourse it should be clear what the domain of discourse is.
The domain of discourse is the set of elements about which we are conversing at the
moment; for example, we could be talking about natural numbers or we could be talking
about people. Depending on the domain of discourse, the statement that we make may or may not
make sense.

(Refer Slide Time: 08:44)

For example, the statement for all x, x is odd or x is even makes sense if the domain of
discourse D is the set of all natural numbers; every natural number is either odd or even, so
you can classify natural numbers as odd or even. D could also be a proper subset of N. In
these contexts the statement x is odd or x is even makes sense, because the predicates 'is
odd' and 'is even' do apply to natural numbers.

But if you are talking about people, these statements need not make sense, to make sense of
these statements, we will have to interpret the predicates 'is odd' and 'is even' in a manner
which is appropriate to the members of the domain of discourse. So, if domain of discourse is
the set of all people, then these will have to be interpreted appropriately in terms of the
people for this statement to make sense.

So, for a first order statement to make sense, we will have to first fix the domain of discourse.
So, we assume that in the context in which our conversation is happening the domain of
discourse is fixed and in that context we quantify.

(Refer Slide Time: 10:09)

Similarly, when we say there exists an x such that x is prime, if the domain of discourse is
the set of natural numbers, then what do we assert? We assert that there is a natural number
x which is prime. So once the domain of discourse is fixed and the predicate 'is prime' is
understood, then the sentence makes sense; then you can assert whether the statement is true
or false.

(Refer Slide Time: 10:55)

Sometimes, we want to make restrictions on the domain of discourse. Suppose D is the domain of
discourse and let us say we want to make restrictions. So, let us assume that D is the set of
all natural numbers, and let us say we have a statement of this form: for every x greater than
11, P of x. That is, we want to assert that the predicate P of x is true for every x which is
greater than 11.

How would we write this in our logic using the quantifiers? You have to write it this way: for
all x, if x is greater than 11 then P of x is true, that is, for all x (x greater than 11
implies P of x). This is the correct representation of this statement. This cannot be
paraphrased as for all x (x greater than 11 AND P of x); this is an often made mistake, people
often write it this way. The second statement says that for every x, x is greater than 11 and
P of x; this would be true if and only if every x is greater than 11 and for every x, P x is
true. That is not what we intend to say; what we intend to say is that for every x which is
greater than 11, P of x is true. These two are not the same at all, whereas the intended
statement and the implication form, you can see, are the same. So, the implication form is the
correct paraphrasing of the quantified sentence for every x greater than 11, P of x.

(Refer Slide Time: 12:50)

And then you can make an assertion of the sort: there is an x greater than 11 such that P of
x. We would write this as there exists an x such that (x greater than 11 AND P of x). You
should remember that here x greater than 11 implies P x will not do; that is because it says
that there is an x such that either x is less than or equal to 11 or P of x, whereas what we
want to assert is that there is an x with x greater than 11 and P of x.

So, from what we saw in the last lecture, we know that alpha implies beta is logically
equivalent to negation of alpha OR beta, therefore x greater than 11 implies P x would
paraphrase as x less than or equal to 11, or P x, but these two statements are not the same at
all. Therefore, this is the correct representation of the top statement.
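
On a finite domain the difference between the correct and the mistaken forms can be seen
directly. The sketch below is only an illustration under stated assumptions: the domain D is a
small finite stand-in for the natural numbers, and the predicates P and Q are hypothetical
examples, not taken from the lecture.

```python
D = range(20)  # a finite stand-in for the natural numbers

def P(x):
    return x * x > 100  # hypothetical predicate, true for every x > 11 in D

def Q(x):
    return x < 5  # hypothetical predicate, false for every x > 11 in D

def implies(a, b):
    return (not a) or b

# "For every x greater than 11, P of x"
forall_correct = all(implies(x > 11, P(x)) for x in D)  # restriction as an implication
forall_wrong = all((x > 11) and P(x) for x in D)        # also demands every x > 11

# "There is an x greater than 11 such that Q of x"
exists_correct = any((x > 11) and Q(x) for x in D)      # restriction as a conjunction
exists_wrong = any(implies(x > 11, Q(x)) for x in D)    # satisfied by any x <= 11

print(forall_correct, forall_wrong, exists_correct, exists_wrong)
```

With these choices the correct universal form is true while the conjunction form is false, and
the correct existential form is false while the implication form is true, so in both cases the
two readings disagree.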

(Refer Slide Time: 14:25)

So, we can write statements of this sort using quantifiers, and this is an extension of
propositional calculus. In propositional calculus, we have propositions, which are the
syntactic entities, composite propositions, which are made from atomic propositions, and the
truth values, which are the semantic entities. So, these are what we deal with in propositional
calculus. But when we come to this logic, which we call first order logic, we deal with
predicates and quantifiers. So, when we extend propositional calculus to first order logic, or
predicate logic, we have variables. Variables are akin to pronouns in English. Constants are
similar to proper nouns in English. Function symbols are used to create entities which can be
used as names of objects, for example, father of Neeraj. This is a naming mechanism: 'father
of' is a function which is applied to the constant Neeraj to create the phrase 'father of
Neeraj', so this phrase is a naming phrase.

So, we can have function symbols that serve this purpose and we have predicate symbols and
we have quantifiers 'for all' and the 'there exists' and we have all the apparatus of
propositional calculus for example, the logical connectives, AND, OR, NOT, etcetera
implication and what not. So, the apparatus of propositional calculus are still with us along
with these additions. So, this richer logic is called the first order logic or predicate logic.

(Refer Slide Time: 16:58)

Quantifiers take higher precedence than connectives. In the last lecture we discussed the
precedence of connectives; we saw that negation has the highest precedence and double
implication has the least precedence. But quantifiers take a precedence which is higher than
that of all the connectives, including negation. Therefore, a statement of the form for all x
P x or Q x should be interpreted as (for all x P x) or Q x, that is, for all x applies only on
P x, not on Q x. This is as opposed to for all x (P x or Q x). That is, the scope of a
quantifier is the immediately adjacent predicate unless otherwise specified using parentheses.

(Refer Slide Time: 18:27)

Consider the formula x greater than 3. Here we say that x is greater than 3, but what is x? x
is a variable. So, when you look at this formula, we do not know what x is; x is to be inferred
from the context, so in that sense x is like a pronoun in English. We say that x is free in the
statement x greater than 3; the occurrence of x is free in x greater than 3. As opposed to
this, consider for all x, x greater than 3. Of course the statement would be false if we are
talking about natural numbers, but never mind, we are not talking about the truth value of the
statement.

Look at the statement, the form of this statement, in the statement we say, for all x, x greater
than 3, so in this statement x has a bound occurrence, this occurrence of x is bound to this
quantifier. So, variables can have a free and bound occurrence, it is also possible to mix free
and bound occurrences within the same statement.

(Refer Slide Time: 20:06)

For example, when I have a statement of this form, x less than 100 and for all x, x greater
than 3, the second occurrence of x is bound to the x in the quantifier, so this is a bound
occurrence of x. Whereas the first is a free occurrence; this x is talking about some
individual which is known only from the context, so it is rather like a pronoun, whereas the
second x is bound to the x in the quantifier, so that does not depend on the context.

This is similar to a statement of the form, the tigress is free; that is one sentence, which
provides a context. And let us say in the second sentence we have, she is coming here, and now
it is everyone for herself. So, consider the second sentence. In this sentence the pronoun she
occurs in two places; that is similar to x in the statement x less than 100 and for all x, x
greater than 3.

The first she is a free occurrence of the pronoun; the meaning of this she has to be inferred
from the context. Now, what is the context? In the context, the previous sentence says that the
tigress is free, therefore this she refers to the tigress. But the she in herself is a bound
occurrence; it is bound to everyone. So we have a group of women facing the tigress, and the
quantification is over this group of women facing the tigress. Everyone refers to the
individuals within this group, so the she in herself is bound to everyone, just as the second
occurrence of x is bound to the quantifier.

(Refer Slide Time: 22:45)

Consider the statement for all x P of x. As we said before, this says that, for all x in the
domain of discourse D, P of x is true. Suppose we want to negate this, that is, we want to say
that it is not the case that for every x P of x is true. Then clearly somebody violates P of x;
that is, if you go to every individual belonging to the set D, you would find that P of x is not
satisfied by everybody.

So, there is somebody who does not satisfy P of x. In other words, some x belonging to D
does not satisfy the predicate P, or there exists an x within D so that P of x is
not satisfied. So we find that these two statements are equivalent: the negation of for
all x P x is the same as there exists x, NOT of P x, parenthesizing correctly. You will find in
the literature that there are different ways of parenthesizing quantified statements; we will
always use this notation.

A quantifier will be immediately followed by a parenthesized statement, and the scope of the
quantifier will be defined by the parentheses. If such a parenthesization is not done, then ‘for
all x’ will associate to the nearest predicate; it has the highest precedence, as we said before.

(Refer Slide Time: 25:07)

Similarly, let us try to negate this statement: there exists an x so that P of x. The negation
asserts that there does not exist an x in D so that P of x is true. In other words, if you go to
the individual members of D, you will find that P of x is violated by every x in D; that is, for
all x, P of x is violated.
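
Written symbolically, the two negation rules just described can be sketched as follows. The parenthesization follows the convention stated in the lecture; the typesetting itself is an illustration, not a copy of the slide.

```latex
\[
\neg\,\forall x\,\bigl(P(x)\bigr) \;\equiv\; \exists x\,\bigl(\neg P(x)\bigr),
\qquad
\neg\,\exists x\,\bigl(P(x)\bigr) \;\equiv\; \forall x\,\bigl(\neg P(x)\bigr).
\]
```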

(Refer Slide Time: 26:14)

So, we have these two equivalences; you can avoid the extra parentheses and simplify the
expressions. The first says that if it is not the case that P of x is true for
everybody, then there must be some x for which P of x is violated. Analogously, if there does
not exist an x so that P of x is true, then for every x, P of x must be false. These two are called
De Morgan’s laws for first order logic. Let us try a few examples, paraphrasing sentences
in English into sentences in first order logic.

(Refer Slide Time: 27:23)

These examples are from the textbook of Mendelson. The first one: anyone who is persistent can
learn logic. We want to translate this sentence in English into a first order formula. So, let us
consider the predicates here. Is persistent is one predicate, can learn logic is another
predicate, so we can have P of x stand for x is persistent and C of x stand for x can
learn logic. Then what we essentially assert is that any person who is persistent is capable of
learning logic.

In other words, our domain of discourse D is the set of people, and x belonging to D is
understood. For every x, if x is persistent, then x can learn logic: for all x, P of x implies
C of x. This would be the first order representation of the sentence.

(Refer Slide Time: 29:09)

Consider another statement: no politician is honest. A debatable statement, but there we have
it. Let us consider the complement of this statement. The complement of this statement would
say that some politician is honest, or in other words, there exists a
politician who is honest. So, let us say there exists an x in D; the domain of discourse is the
set of people here again.

So, there exists an x so that x is a politician and x is honest. In this case P stands for the
predicate is a politician, so P x means x is a politician, and H x means x is honest. So we have
the statement that there is some x who is both a politician and honest, which is the negation of
the statement we want. What we want here is to negate this: no politician is honest. So here we
have the negation of a quantified statement, and we can apply De Morgan’s laws to take the
negation inside.

(Refer Slide Time: 30:38)

So, from De Morgan’s laws, we know that when a negation is taken inside a quantified
formula, it changes the quantifier. For example, when a negation travels over a universal
quantifier into the parentheses, the universal quantifier changes into an existential
quantifier. Similarly, when the negation travels over an existential quantifier
into the parentheses, it converts the existential quantifier into a universal quantifier. So, let
us use that here and take the negation inside. Then this becomes: for all x, the
negation of P x and H x. Here we have the negation of a conjunction, which can be found using
De Morgan’s laws of propositional logic: the negation of a conjunction is the disjunction of
the negations.

So, we have negation of P of x or negation of H of x, which is logically equivalent to saying
P of x implies negation of H of x. So what does it say? For every x, if x is a politician then x
is not honest; x is dishonest. That is tantamount to asserting that every politician is
dishonest, which is logically equivalent to saying that no politician is honest.
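
The chain of equivalences used for this example can be laid out step by step. This is a sketch with P(x) for x is a politician and H(x) for x is honest, as in the lecture; the step from the disjunction to the implication uses the propositional equivalence between "alpha implies beta" and "NOT alpha OR beta".

```latex
\begin{align*}
\neg\,\exists x\,\bigl(P(x) \land H(x)\bigr)
  &\equiv \forall x\,\neg\bigl(P(x) \land H(x)\bigr)
     && \text{(negation moves inside, quantifier flips)} \\
  &\equiv \forall x\,\bigl(\neg P(x) \lor \neg H(x)\bigr)
     && \text{(negation of a conjunction)} \\
  &\equiv \forall x\,\bigl(P(x) \rightarrow \neg H(x)\bigr)
     && \text{(disjunction rewritten as implication)}
\end{align*}
```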

(Refer Slide Time: 32:25)

Similarly, consider the statement: not all birds can fly. Suppose we want to say that every
bird can fly. Then we would say, for every x, if x is a bird then x can fly. Here B of x stands for
x is a bird and F of x stands for x can fly. So, this statement asserts that every bird can fly.
If we negate it, we have the required assertion, which says that
not all birds can fly.

Once again, if you take the negation inside the brackets the quantifier flips. We have there
exists, and then the negation of the implication B of x implies F of x. But the negation of
an implication is the conjunction of the antecedent and the negation of the consequent, which
means we have B of x and NOT of F of x. What does this say? There exists an x such that
B of x and NOT of F of x. In other words, there is a bird that cannot fly. You see that this is
logically equivalent to our original statement, not all birds can fly.
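
On a small finite domain the equivalence can be checked mechanically. The sketch below is only an illustration: the domain and the sets `birds` and `fliers` are made-up examples, and a check on one interpretation does not prove the equivalence in general.

```python
# Hypothetical toy interpretation; the individuals are illustrative only.
domain = {"sparrow", "penguin", "bat", "dog"}
birds = {"sparrow", "penguin"}
fliers = {"sparrow", "bat"}


def B(x):
    """x is a bird."""
    return x in birds


def F(x):
    """x can fly."""
    return x in fliers


def implies(a, b):
    """Truth table of the implication a -> b."""
    return (not a) or b


# NOT for all x (B(x) implies F(x))
not_all_birds_fly = not all(implies(B(x), F(x)) for x in domain)

# there exists x (B(x) and NOT F(x))
some_bird_cannot_fly = any(B(x) and not F(x) for x in domain)

print(not_all_birds_fly, some_bird_cannot_fly)
```

Both expressions agree here because the penguin is a bird that cannot fly; changing `fliers` so that every bird flies makes both of them false together.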

(Refer Slide Time: 34:14)

Another interesting example: if anyone can solve the problem, Lakshmi can. Let us say S of
x denotes the predicate x can solve the problem. Then if anyone can solve the problem translates
into the quantified statement there exists an x so that x can solve the problem; this asserts
that someone can solve the problem. Now we have an implication: if anyone can solve the
problem, in other words, if there is someone who can solve the problem, then Lakshmi can
solve the problem.

Let small l denote the individual Lakshmi. The statement now asserts that if there is some x
that can solve the problem then Lakshmi can solve the problem: there exists x S of x, implies
S of l. This is a translation of the given statement. Let us take the logical equivalent of
this. The logical equivalent of an implication is the disjunction of the negation of the
antecedent and the consequent. The negation of the antecedent here is for all x, NOT of S of x,
so we get: for all x NOT of S of x, OR S of l. By commutativity of OR this can be written as
S of l OR for all x NOT of S of x, which is logically equivalent to saying NOT of S of l implies
for all x NOT of S of x. That is because alpha implies beta is logically equivalent to alpha bar
or beta; we are invoking that in the reverse here. So, what does this say? If Lakshmi cannot
solve the problem then no one can, which is exactly the first assertion. The first assertion and
the last are logically equivalent.

(Refer Slide Time: 36:54)

Consider one more example: nobody in the algebra class is smarter than everyone in the
logic class. To paraphrase this, first let us assume that there is
somebody in the algebra class who is smarter than everyone in the logic class. We would
say there exists an x so that x is in the algebra class and for all y, if y is in the logic class
then x is smarter than y. What it asserts is that there is some x who is in the algebra class and
is smarter than every y in the logic class. This is what we want to negate. So, if you put a
negation symbol in front, we are asserting that nobody in the algebra class is smarter than
everybody in the logic class. This is the first order translation of the above sentence given
in English. So, that gives you an idea as to how English sentences can be translated into
first order sentences.
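
As a sketch, the translation of the last example can be written out symbolically. The predicate names are assumptions introduced here for illustration: A(x) for x is in the algebra class, L(y) for y is in the logic class, and S(x, y) for x is smarter than y.

```latex
\[
\neg\,\exists x\,\Bigl(A(x) \land \forall y\,\bigl(L(y) \rightarrow S(x,y)\bigr)\Bigr)
\]
```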

(Refer Slide Time: 38:23)

Consider two first order formulae. I have not formally defined a formula yet, which we
will do later, but by now you have an idea about what a first order formula is.
Two first order formulae are logically equivalent if they
evaluate to the same truth value irrespective of the interpretations, that is, the interpretations
of constants, function symbols, predicate symbols, etc.

(Refer Slide Time: 39:35)

For example, by De Morgan’s laws as we saw just now, negation of for all x P x is logically
equivalent to there exists x negation of P x. Similarly, negation of there exists x P x is
logically equivalent to for all x negation of P x. So, these are logical equivalences.

(Refer Slide Time: 40:12)

We can have quantifiers nested within one another, but when universal quantifiers and
existential quantifiers are nested within one another, the order in which we nest them is
significant. Consider the statement for all x, there exists a y, so that x plus y equal to 0.
What does this statement say? It says that, once you fix the x, there is a y such that y is
x's additive inverse: when x and y are added together we get 0, or y is the negative of x. In
other words, every number in the domain has an additive inverse. The statement is of course
correct only if the domain of discourse is the set of integers; it would be false over the set
of natural numbers. Compare this with the statement there exists a y so that for all x, x plus y
equal to 0. What does this say? It says that there is an integer which upon addition with x
gives 0 for all x, but this is patently false. So, we see that the two statements mean entirely
different things. In a sequence of universal quantifiers and existential quantifiers, if you
change the order, the meaning of the statement would change.
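
The effect of swapping the quantifiers can be seen on a small finite domain. The sketch below uses `range(-3, 4)` as a stand-in for the integers and `range(0, 7)` as a stand-in for the natural numbers; both are assumptions made so that the check terminates, since the real domains are infinite.

```python
# Finite stand-ins for the infinite domains (assumptions for illustration).
integers = range(-3, 4)   # -3, ..., 3: closed under negation
naturals = range(0, 7)    # 0, ..., 6

# for all x there exists y (x + y = 0)
forall_exists = all(any(x + y == 0 for y in integers) for x in integers)

# there exists y for all x (x + y = 0)
exists_forall = any(all(x + y == 0 for x in integers) for y in integers)

# The first statement again, but over the natural-number stand-in.
forall_exists_nat = all(any(x + y == 0 for y in naturals) for x in naturals)

print(forall_exists, exists_forall, forall_exists_nat)
```

The nesting of `all` and `any` mirrors the nesting of the quantifiers, so exchanging the two comprehensions is exactly the exchange of quantifier order discussed above.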

(Refer Slide Time: 42:37)

But that is not the case with a sequence of universal quantifiers. When we have an assertion
of this form, for all x y P x y, what we want to assert is that for every ordered pair drawn
from the domain of discourse, for every ordered pair x y, P is true for x and y, this would be
exactly the same even if we change the order of x and y, as you can verify. Therefore, in a
sequence of universal quantifiers, we can change the order of the quantifications.
Analogously there exists x, there exists y P x y is logically equivalent to there exists y, there
exists an x, P x y. In a sequence of existential quantifiers too, we can permute the order of the
quantifications.

(Refer Slide Time: 43:46)

We say that a formula is logically valid if it is true irrespective of the interpretations of the
function symbols, predicate symbols, constants, variables, etc. So, a logically valid formula is
akin to a tautology. A tautology, in the context of propositional calculus, is a formula
which always evaluates to true. A logically valid formula in first order logic is similar: it
always evaluates to true irrespective of the interpretation that you place on the various
symbols of the language.

(Refer Slide Time: 44:48)

For example, consider this statement: (for all x, P x implies Q x) implies ((for all x P x)
implies (for all x Q x)). I want to claim that this is logically valid, that is, irrespective of
the interpretation that is placed on P and Q, this statement will always be true. How do we argue
this? Let us look at the structure of the sentence. This is an implication at the topmost
level. This implication has an antecedent and a consequent, and we
want to assert that this implication is always true. In an implication, if the antecedent is false,
the statement anyway evaluates to true, so we do not have to worry about the situation where
the antecedent is false. So, let us consider only the case where the antecedent is true, that
is, let us assume that for all x, P x implies Q x is true. Then, for the implication to be true,
the consequent will have to be true.

Now, we want to show that the consequent is true. Let us look at the consequent. The
consequent itself is an implication and we want to claim that it is true. For an implication
to be true, the antecedent has to be false or the antecedent and the consequent both have to be
true. So, here again let us assume that the antecedent of this implication is true.

So, we make these two assumptions: for all x, P x implies Q x is true, and for all x P x is true.
Then consider any x belonging to D. For this x we have that P x implies Q x is true and P x is
true. You can readily verify that P x implies Q x and P x together ensure that P x and Q x are
both true, or in particular Q x is true. As we have taken an arbitrary x in D, this is true for
every single x, and therefore we can assert that for all x Q x.

So, we have shown for all x Q x assuming these two. Therefore, the formula has to be
logically valid: in that implication, when the antecedent and also the antecedent of the
consequent are both true, we have shown that the consequent within that global consequent is
also true. Therefore the formula is true always, that is, irrespective of the interpretation that
you place on P and Q, the formula will be true. So, this is an example of a logically valid
formula, but if you take the converse of the formula, that will not be logically valid.

(Refer Slide Time: 48:32)

For example, ((for all x P x) implies (for all x Q x)) implies (for all x, P x implies Q x) need
not be true; that will depend on the interpretation for P and Q. Let us consider an interpretation
which will make this formula false. Let us say the domain of discourse is the set of people,
P of x stands for x is peaceful and Q of x stands for x is happy. So, what does this
statement assert?

It asserts that if all being peaceful implies that all are happy, then for every individual x, if
x is peaceful then x is happy. That need not be the case. Suppose the antecedent is true, that
is, if all are peaceful, then peace will prevail within humanity and that is sufficient for all to
be happy. Still the consequent does not follow. What does the consequent say? It says that for
every single individual, if that individual is peaceful then he is happy. That may not be the
case, because this individual might be surrounded by quarrelsome people, so even if he holds
the peace, his neighbours may not. Therefore, he may not be happy.

Therefore, this is a counterexample which establishes that this statement is not logically valid.
To prove that a first order formula is logically valid, you have to argue in terms of all
interpretations; you have to show that the formula is necessarily true in every single
interpretation. On the other hand, to prove that a formula is not logically valid, all that you
have to do is to come up with a counterexample, that is, one particular interpretation
in which the formula is not true.

So, in this case you have to come up with an interpretation of P and Q in which, if P holds
universally then Q also holds universally, and yet some individual who satisfies P need not
satisfy Q. If you can find
such an interpretation, then you have a counterexample, and that is what we have just done.
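
The contrast between the two formulae can also be explored by brute force. The sketch below enumerates every interpretation of P and Q on domains of size 1 to 3 (the size bound and all function names are assumptions made for illustration). Finding no falsifying interpretation on small domains is only a sanity check, not a proof of logical validity, whereas a single falsifying interpretation does settle that the converse is not valid.

```python
from itertools import product


def implies(a, b):
    """Truth table of the implication a -> b."""
    return (not a) or b


def formula(P, Q, D):
    """(forall x (P x -> Q x)) -> ((forall x P x) -> (forall x Q x))"""
    antecedent = all(implies(P[x], Q[x]) for x in D)
    consequent = implies(all(P[x] for x in D), all(Q[x] for x in D))
    return implies(antecedent, consequent)


def converse(P, Q, D):
    """((forall x P x) -> (forall x Q x)) -> (forall x (P x -> Q x))"""
    antecedent = implies(all(P[x] for x in D), all(Q[x] for x in D))
    consequent = all(implies(P[x], Q[x]) for x in D)
    return implies(antecedent, consequent)


def interpretations(n):
    """Yield every pair of unary predicates on the domain {0, ..., n-1}."""
    D = range(n)
    for P in product([False, True], repeat=n):
        for Q in product([False, True], repeat=n):
            yield P, Q, D


def counterexamples(f, max_size=3):
    """Collect the interpretations (P, Q) on small domains that falsify f."""
    return [(P, Q)
            for n in range(1, max_size + 1)
            for P, Q, D in interpretations(n)
            if not f(P, Q, D)]


print(counterexamples(formula))        # no falsifying interpretation found
print(len(counterexamples(converse)))  # the converse is falsified several times
```

One falsifying interpretation of the converse has two individuals, where the first is neither peaceful nor happy and the second is peaceful but not happy: not everyone is peaceful, so the antecedent holds vacuously, yet the second individual violates P x implies Q x.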

So, that is it from this lecture, hope to see you in the next, thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 4
Mathematical Logic

Welcome to the NPTEL MOOC on discrete mathematics. This is the fourth lecture on
Mathematical Logic. In the last lecture, we started a discussion on first order logic. We
continue with this discussion. We saw, what is a free variable and a bound variable, in the last
class.

(Refer Slide Time: 00:53)

Let us consider a formula alpha of x with one free variable, so x is the free variable here. In
that case the notation there exists a unique x, alpha of x, says that a unique x exists such that
alpha of x. In other words, this formula is equivalent to saying that there exists an x such that
alpha of x is true and, for all x, for all y, alpha of x and alpha of y implies that x is equal
to y. What does it say?

It says that there is an x so that alpha of x is true, and in addition to that, for every x and
y, if alpha is true for x as well as for y, then x must be equal to y; that is, there cannot
exist two distinct x and y so that alpha holds for both of them. So, this is exactly analogous to
saying that there is a unique x such that alpha of x.

(Refer Slide Time: 02:28)

Let us consider the negation of this. We want to say that there does not exist a unique x so
that alpha of x is true. If you apply De Morgan’s law repeatedly
to the above formula, this would be equivalent to a disjunction in which the first term is: for
every x, alpha of x is false. So, this is one way of negating the assertion that there is a
unique x which satisfies alpha of x.

Here what do we say? We say that alpha is not satisfied by any x. So, this is one possibility.
The other possibility is that there exist x and y such that alpha is true for both x and y
and x is not equal to y. So, this is the other way of contradicting the statement that there is a
unique x so that alpha of x is satisfied: alpha is satisfied by some individuals in the
domain, but there are multiple individuals that satisfy alpha. So, the statement that
there is a unique x so that alpha of x is true is not right.
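
As a sketch, the definition of unique existence and its negation can be written side by side. The symbol for unique existence is the conventional one and is assumed here rather than copied from the slide.

```latex
\begin{align*}
\exists!\,x\,\alpha(x)
  &\;\equiv\; \exists x\,\alpha(x) \;\land\;
     \forall x\,\forall y\,\bigl((\alpha(x) \land \alpha(y)) \rightarrow x = y\bigr) \\
\neg\,\exists!\,x\,\alpha(x)
  &\;\equiv\; \forall x\,\neg\alpha(x) \;\lor\;
     \exists x\,\exists y\,\bigl(\alpha(x) \land \alpha(y) \land x \neq y\bigr)
\end{align*}
```

The second line follows from the first by negating the conjunction, flipping each quantifier, and using the fact that the negation of an implication is the conjunction of its antecedent with the negation of its consequent.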

(Refer Slide Time: 03:58)

In many first order systems, we have a special role for the equality predicate. Equality is
a predicate which satisfies some conditions. The conditions are these: for all x, we can assert
that x is equal to x; for every individual, that individual is equal to itself. In addition to
that, if x is equal to y, then alpha of x, x implies alpha of x, y. Here, alpha of x, x is a
first order formula with some occurrences of the free variable x, and alpha of x, y is obtained
by substituting y for some occurrences of x.

(Refer Slide Time: 05:27)

The only stipulation is that, when you substitute y for x, this
substitution should not be caught by any existing quantifier for y, because if such a catching
happens, then the intended meaning would change. For example, let us say we have a formula
of this form: for all y, P of x implies Q of y, and let us say we want to substitute y for x in
this. x is a free variable here, but if you substitute the variable y for x here, we would get
the formula for all y, P of y implies Q of y. This substitution is now caught by the
quantifier, which can change the intended meaning of the formula.

So, what does the first formula say? The first formula says that if P is true for x, then for
every y, Q is true; that is what it effectively says. In other words, if I take P of x as x is
angry and Q of y as y is scared, then what does the first formula say? It says that,
for every y, if x is angry then y is scared. If x happens to be a despotic dictator, then it could
be true: if the dictator is angry, then everybody is scared. But then, what does the second
statement say?

With the quantifier for y catching the substituted occurrence of y, we have a statement like
for every y, if y is angry then y is scared, which means any angry person is scared. That need
not be true at all, and it is not the intended substitution. Therefore, while substituting for a
variable you have to be careful: the substitution cannot be caught by any existing quantifier.

(Refer Slide Time: 08:04)

Let us take an example which would explain the notion of logical consequence. We
consider first order logic with equality, so the equality predicate is available within the logic
with the properties we mentioned. Let us say we have three predicate symbols P, L and O. P
is a unary predicate, L is also a unary predicate and O is a binary predicate. We intend P of x
to stand for x is a point, L of x to stand for x is a line, and O of x, y to stand for x lies on y.

(Refer Slide Time: 09:10)

So, with these predicate symbols, let us write a few formulae and see what the meaning is, by
gamma we define a set of formulae, a set of three formulae: P1, P2 and P3. So, let us now see
what these formulae are, P1 is the formula which says that for all x for all y; x is a point, y is a
point and x not equal to y implies that, there exists a unique z so that z is a line and x lies on z
and y lies on z.

So, this is the formula that we denote as P1. What does this formula say? For every pair of
points x and y, that are not the same as each other, in other words, for any pair of distinct
points, there exists a unique z that is a line so that x lies on z and y lies on z. In other words,
for every pair of distinct points, there is a unique line passing through them, which we all
know is true in Euclidean geometry, for any pair of distinct points, there is a unique line
passing through them, so this is the formula that we call P1.

(Refer Slide Time: 11:02)

Now, let’s take a look at P2. P2 asserts that for every z, if z is a line, then there exist x and
y so that x is a point, y is a point, x is not equal to y, and x lies on z and y lies on z. So,
what does it say? For every line z there is a pair of points x and y so that x is not the same
as y, which means there exists a pair of distinct points, so that x lies on z and y lies on z. In
other words, on every line there are at least two distinct points.
This is again true in Euclidean geometry; we are familiar with the fact that every line
has an infinite number of points. Here the assertion is only that every line should have at
least two distinct points. That is what assertion P2 is.

(Refer Slide Time: 12:30)

Now, coming to assertion P3, what P3 says is this: there exist w, x and y
so that w is a point, x is a point, y is a point, so there are three points w, x and y, and w is
not equal to x, x is not equal to y, and y is not equal to w, which means the three points
should be distinct. So, there exist three distinct points w, x and y so that for every z, if z is a
line then it is not the case that w lies on z and x lies on z and y lies on z. In other words,
there are three distinct non collinear points; there exist three points w, x and y so
that all three do not lie on the same line. Check once again whether this is what the
formula says: there exist three points w, x and y, all three are points and all three are
distinct, so that for every z, if z is a line then all three cannot be lying on z; that is, w lies
on z, x lies on z and y lies on z cannot all be true simultaneously. So, there is at least one
triplet of points that do not lie on any line together.

(Refer Slide Time: 14:41)

So, we have these three statements. The statements P1, P2 and P3 together form the set
gamma. We say that gamma and all its logical consequences form Incidence Geometry.
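
For reference, the three formulae of gamma can be sketched symbolically, with P, L and O as introduced in the lecture (point, line, lies on). The typesetting is a reconstruction from the spoken description, not a copy of the slides.

```latex
\begin{align*}
P_1:\;& \forall x\,\forall y\,\Bigl(\bigl(P(x) \land P(y) \land x \neq y\bigr)
        \rightarrow \exists!\,z\,\bigl(L(z) \land O(x,z) \land O(y,z)\bigr)\Bigr) \\
P_2:\;& \forall z\,\Bigl(L(z) \rightarrow \exists x\,\exists y\,
        \bigl(P(x) \land P(y) \land x \neq y \land O(x,z) \land O(y,z)\bigr)\Bigr) \\
P_3:\;& \exists w\,\exists x\,\exists y\,\Bigl(P(w) \land P(x) \land P(y)
        \land w \neq x \land x \neq y \land y \neq w \\
      & \qquad \land\; \forall z\,\bigl(L(z) \rightarrow
        \neg\,(O(w,z) \land O(x,z) \land O(y,z))\bigr)\Bigr)
\end{align*}
```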

(Refer Slide Time: 15:19)

Now, the question we want to ask is this. We consider a fourth formula, which we call the
Euclidean parallel property. What the Euclidean parallel property says is this: given a line x
and a point y not on x, there exists a unique line passing through y and parallel to x. We know
that this statement is true in Euclidean Geometry. Given a line and a point which is not on the
line, through that point we can draw a parallel to the original line.

82
So, this is our line x and this is the point y, through y we can draw a line z which is parallel to
x, z and x do not intersect. So, you could paraphrase the statement like this, given a line x and
the point y not on x, there exists a unique line z passing through y so that x and z do not
intersect.

(Refer Slide Time: 17:03)

So, let us write the first order representation of the statement. We say, for every x and every
y so that x is a line and y is a point and y is not on x, there exists a unique z so that z is a
line, y lies on z, and it is not the case that there exists w so that w is a point which lies on
both x and z. When two lines x and z intersect, there is a point of intersection. If two lines
intersect, then there exists some point w that lies on both the lines. That is what we negate
here: for every x and y so that x is a line and y is a point and y does not lie on x, there
exists a unique z which is a line through y so that there does not exist a point w that lies on
both x and z. So, this is the Euclidean parallel property. Then there is this question: is the
Euclidean parallel property necessarily true in incidence geometry? Is the Euclidean parallel
property a logical consequence of Incidence Geometry? As it happens, it is not.
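
A symbolic sketch of the Euclidean parallel property follows, using P, L and O as before. The conjunct O(y, z) is taken from the English paraphrase (the parallel line passes through y); it is an assumption that the formula on the slide includes it.

```latex
\[
\forall x\,\forall y\,\Bigl(\bigl(L(x) \land P(y) \land \neg O(y,x)\bigr)
  \rightarrow \exists!\,z\,\bigl(L(z) \land O(y,z) \land
  \neg\,\exists w\,(P(w) \land O(w,x) \land O(w,z))\bigr)\Bigr)
\]
```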

83
(Refer Slide Time: 18:52)

How do we show this? Let us consider the negation of the Euclidean parallel property. From
the formula that we have just written, we find that the negation of the Euclidean parallel
property can be written in this fashion by using De Morgan’s law. This will be the general
structure of the formula, so you can fill in the formula. You find that this is actually the OR
of two statements: one is the elliptic parallel property and the other is the hyperbolic parallel
property. So, what are these statements, the elliptic parallel property and the hyperbolic
parallel property?

(Refer Slide Time: 19:46)

The elliptic parallel property says that, given a line x and a point y not on x,
there is no line through y that does not intersect x. In other words, given a line x and a point
y not on x, there is no parallel line through y for x. This is what is called the elliptic
parallel property.

(Refer Slide Time: 20:41)

The elliptic parallel property could be written in this manner: for all x and for all y, if x is
a line, y is a point and y is not on x, then it is not the case that there exists a z, where z
is a line, y lies on z, and there is no point w that lies on both x and z. Check once again if
the statement is right. What does it say? For every x and y, where x is a line and y is a point
and y does not lie on x, it is the case that there does not exist a line z through y so that
there is no w which lies on both x and z, that is, so that x and z do not intersect.

A point w which is on both x and z exists exactly when x and z intersect. So, the
meaning would be exactly what we have seen: given a line x and a point y not on x, there is no
line through y that does not intersect x.

(Refer Slide Time: 21:55)

The hyperbolic parallel property contradicts the Euclidean parallel property in another
way. It says that given a line x and a point y not on x, there is more than one parallel to x
through y; that is, given a line x and a point y not on x, there are multiple parallels to x
through y. Such a statement is what is called the hyperbolic parallel property. Along the
lines of the Euclidean parallel property and the elliptic parallel property, you can write the
formula which corresponds to the hyperbolic parallel property, which I leave to you as an
exercise.

(Refer Slide Time: 23:01)

So, you can see that the Euclidean parallel property, when contradicted, gives us the OR of
the elliptic parallel property and the hyperbolic parallel property; that is, the Euclidean
parallel property can be contradicted either by holding the elliptic parallel property or by
holding the hyperbolic parallel property. Now, coming back to our question: is the Euclidean
parallel property a logical consequence of incidence geometry?

We will show that it is not. The proof goes like this. Let us consider an
interpretation for incidence geometry. To interpret incidence geometry, we have to define
what lines are and what points are. For doing this, we need a domain of discourse D. Let the
domain of discourse D consist of three entities A, B and C and the two member sets of these
entities: the entities A, B, C and the sets {A, B}, {B, C} and {C, A} all belong to the
domain of discourse. Then the predicate P is associated with the set of points: A, B and C are
the points, so we say the set of points is {A, B, C}. The predicate L is associated with the set
of lines, which is {A, B}, {B, C} and {C, A}.
This is a departure from Euclidean geometry. The definitions of points and lines are
different here; a line in particular is just a two member set.

(Refer Slide Time: 25:06)

So, we have three points A, B and C and we have lines of this form: the set which contains A
and B is a line, and this line has exactly two points A and B. The set which contains B and C
forms another line, which has exactly two points B and C, and the set which contains A
and C also forms a line, which also contains exactly two points. So, these three are lines.
If we define A, B, C as the points and the sets A, B; B, C and C, A as the lines, then let us see
if the three formulae P1, P2 and P3 are satisfied.

So, what does P1 say? P1 says that there is a unique line passing through any pair of distinct
points. Let us take A and B. A and B are two distinct points, A
is not equal to B, and there is a unique line A, B that passes through them; that is, there
is a unique line which contains both A and B. So, the statement P1 is true for A and B. The
statement is true for B, C as well: you take the two points B and C, and there is a unique set
B, C which contains them among the lines, so there is a unique line that contains both B and C.

Similarly, there is a unique line which contains both A and C. So, the first proposition is true,
P1 is correct. What about P2? P2 asserts that on any line there are at least two points, which is
again the case here. There are only three lines here: AB, BC and CA. On each
of these lines there are exactly two points, so P2 is also satisfied. What about P3? P3 asserts
that there are three non collinear points, so there must exist three points, all three not lying
on the same line, which is indeed the case here.

There are three points A, B and C and there is no line which contains all three. That is
trivially so, because every line contains exactly two points here. So P3 is also true. Therefore,
this interpretation makes the whole of gamma true: every formula in gamma, namely P1, P2
and P3, is true in this interpretation. Therefore we say that this interpretation is a model of
Incidence Geometry.
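
The verification above can be repeated mechanically. The sketch below encodes the three point interpretation with Python sets and evaluates P1, P2 and P3 by iterating over the whole domain of discourse. Reading "x lies on y" as set membership is an assumption suggested by the discussion of lines containing points, and counting witnesses is used for unique existence, which is adequate on a finite domain.

```python
from itertools import product

points = {"A", "B", "C"}
lines = {frozenset("AB"), frozenset("BC"), frozenset("CA")}
domain = points | lines  # domain of discourse: three points and three lines


def P(x):
    """x is a point."""
    return x in points


def L(x):
    """x is a line."""
    return x in lines


def O(x, y):
    """x lies on y (assumed to mean: point x is a member of line y)."""
    return P(x) and L(y) and x in y


def implies(a, b):
    return (not a) or b


# P1: any two distinct points lie on exactly one common line.
p1 = all(
    implies(P(x) and P(y) and x != y,
            sum(1 for z in domain if L(z) and O(x, z) and O(y, z)) == 1)
    for x, y in product(domain, repeat=2))

# P2: every line has at least two distinct points on it.
p2 = all(
    implies(L(z),
            any(P(x) and P(y) and x != y and O(x, z) and O(y, z)
                for x, y in product(domain, repeat=2)))
    for z in domain)

# P3: there are three distinct points that do not all lie on one line.
p3 = any(
    P(w) and P(x) and P(y) and w != x and x != y and y != w
    and all(implies(L(z), not (O(w, z) and O(x, z) and O(y, z)))
            for z in domain)
    for w, x, y in product(domain, repeat=3))

print(p1, p2, p3)
```

All three values come out true, which matches the conclusion that this interpretation is a model of gamma.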

(Refer Slide Time: 28:05)

But now, what about the Euclidean parallel property? We find that the Euclidean parallel
property is not true in this interpretation, that is because, when we consider line B, C and a
point A not on B, C, for Euclidean parallel property to be true, there should be a unique line
through A which does not intersect B, C. Here, there are two lines through A. One is AB,
which intersects BC. AB intersection BC is B.

Similarly, the other line through A, which is AC, also intersects BC. So, Euclidean parallel
property is violated here. In particular, here we find that the elliptic parallel property is true:
for any line and a point which is not on the line, there is no line through the point parallel
to the first line. In this case you take line BC and point A which is not on BC, there is no
line through A which is parallel to BC. A line is parallel to BC if it does not intersect BC. A
line in this case is a set of two points.

So, if you interpret a line in this sense, then elliptic parallel property holds and therefore in
this model of incidence geometry, elliptic parallel property is true, Euclidean parallel
property is false. But then, is elliptic parallel property a logical consequence of incidence
geometry? That also is not true, that can be shown with another interpretation.

(Refer Slide Time: 29:49)

A second interpretation uses a different domain of discourse. Here we have four points A, B,
C, D and we have lines of this form AB; AC; AD; BC; BD; CD, that is we consider every
two member subset of the set A, B, C, D. So, the domain of discourse is made up of all these.
As in the previous interpretation, we define P: the set of points as A, B, C, D, and L the set of
lines as all two member subsets: A, B; A, C; A, D; B, C; B, D and C, D. So, we have four
choose two, namely six lines, there are four points and six lines.

(Refer Slide Time: 31:18)

Once again, we can verify that propositions P1, P2, P3 are true here. So here we have four
points, A, B, C and D, and the lines are two member subsets: A, B is a line, B, D is a line, C,
D is a line, A, C is a line, so are A, D and B, C, or to simplify the diagram, we could just

draw lines, this corresponds to the set A, B and this corresponds to the set C, D, this is B, D
this is A, C; B, C and A, D.

(Refer Slide Time: 31:44)

So, in this interpretation, there are four points and six lines and we find that the statements P1,
P2 and P3 are all satisfied. Given two distinct points B and C, there is a unique line which
passes through the two of them namely B, C, and P2 asserts that in every line, there are at
least two points, which is indeed the case.

A line is made up of two distinct points and there are three non collinear points. Any triplet
that can be chosen from here, for example A, B, C do not lie on the same line. That is because
the line has only two points. So, P1, P2, P3 are all satisfied therefore this interpretation is a
model of Incidence Geometry. What about the Euclidean parallel property? We find that the
Euclidean parallel property is true here. Take any line BC and a point A, which is not on BC,
then there is a line AD which passes through A and this line does not intersect BC.

Does not intersect, in the sense that these two lines, AD and BC, do not have a common
element, mind you even though I have drawn the sets using lines, they are in fact two
member sets. The line AD contains exactly two points A and D and the line BC contains
exactly two points B and C, these two sets do not intersect. So, we can say that AD is parallel
to BC because they do not have an intersecting point.

So, the line BC and A which is not on BC, shows us that there is a line AD which is through
A and does not intersect BC and you can verify that this is the case for every line and a point
which is not on the line. For example, if you take CD as a line and A as the point, A is not on

CD and there is AB which passes through A and is parallel to CD. Therefore Euclidean
parallel property is satisfied here.

Since Euclidean parallel property is satisfied, elliptic parallel property and hyperbolic parallel
property will be false in this interpretation. Therefore, we find that elliptic parallel property is
not a logical consequence of incidence geometry either, because there is an interpretation in
which all the axioms of incidence geometry, namely P1, P2, P3 are true, but in this
interpretation elliptic parallel property is not true. The first interpretation that we saw was a
model of incidence geometry, but in that interpretation the Euclidean parallel property was
not true.

(Refer Slide Time: 34:46)

Now, coming to the hyperbolic parallel property, let us consider a third interpretation in
which we have five points A, B, C, D, E and all two member subsets, namely A, B; A, C; A, D;
A, E; B, C; B, D; B, E; C, D; C, E and D, E. There are 10 of them, 5 choose 2, so we have
5 points and 10 lines. So, P is the set A, B, C, D, E, and L is the set of all unordered pairs
of points.

Again you can verify that P1, P2, P3 are satisfied exactly as before, but we find that Euclidean
parallel property is not satisfied, that is because, if you take a line BC and a point A which is
not on BC, there are two lines AD and A, E passing through A, both of which are parallel to
BC, parallel to BC in the sense that those two lines do not intersect with BC. Mind you, a line
is just a set of two points.

So, set A, D does not intersect with B, C and set A, E also does not intersect with B, C
therefore we can say that there are two distinct sets containing A that do not intersect with set
B, C. Therefore Euclidean parallel property is not satisfied, elliptic parallel property is also
not satisfied, but we find that hyperbolic parallel property is satisfied. So, this is again a
model of incidence geometry, since P1, P2, P3 are all satisfied, but in this model of incidence
geometry, Euclidean Parallel property and elliptic parallel property are not true but
hyperbolic parallel property is true.
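
The three interpretations differ only in the number of points, so the parallel property that holds in each can be worked out by counting. The sketch below is only an illustration: the function names are hypothetical, the points are numbered 0, 1, 2 and so on instead of being called A, B, C, and "hyperbolic" is taken here to mean that through every outside point there are at least two lines parallel to the given line.

```python
from itertools import combinations

def parallel_counts(n_points):
    """For the model whose lines are all two-element subsets of n_points points,
    collect the number of lines through an outside point that are disjoint
    from a given line, over every (line, outside point) pair."""
    points = range(n_points)
    lines = [frozenset(pair) for pair in combinations(points, 2)]
    counts = set()
    for line in lines:
        for p in points:
            if p in line:
                continue
            parallels = [m for m in lines if p in m and not (m & line)]
            counts.add(len(parallels))
    return counts

def classify(n_points):
    counts = parallel_counts(n_points)
    if counts == {0}:
        return "elliptic"
    if counts == {1}:
        return "Euclidean"
    if min(counts) >= 2:
        return "hyperbolic"
    return "none"

for n in (3, 4, 5):
    print(n, classify(n))
```

With three points no parallel exists, with four points there is exactly one, and with five points there are two, which matches the three interpretations discussed above.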

(Refer Slide Time: 37:09)

And finally we will look at one more interpretation, the fourth interpretation, in which P1, P2,
P3 are not all true. So, here D is the set of all points on the surface of a sphere, along
with the great circles of that sphere. Here, the first set forms the points. The set of points in
this interpretation is the set of all points on the surface of a sphere and the set of all great
circles on that sphere will form the lines.

(Refer Slide Time: 38:12)

Now, what is a great circle of a sphere? When you have a sphere, and we consider a circle
drawn on the surface of the sphere, so that, the diameter of the circle is the same as the
diameter of the sphere, then this circle is a great circle. A great circle of a sphere is a circle on
the surface of the sphere, so that the diameters of the circle and the sphere are the same.

(Refer Slide Time: 39:13)

So, if you consider the Earth as a sphere, any meridian along with its opposite meridian will
form a great circle, and the equator is also a great circle. Of course, these are not the only
great circles, by any means, you can draw any number of great circles. So, in this

interpretation the points on the surface will form the set of all points and the great circles will
form the set of all lines.

(Refer Slide Time: 39:51)

Now, let us see if this is a model of incidence geometry. We find that P1 is not satisfied
here. So what does P1 say? There is a unique line passing through any pair of distinct points,
which we find is not true in this case. If the points that we take happen to be polar
opposites, then we find that there are any number of great circles passing through them. In
particular, if you take the two points as the two poles, the north pole and the south pole, on
the sphere which is the earth, then any meridian along with the diametrically opposite meridian
will form a great circle.

So, there is an infinite number of great circles passing through this pair of distinct points, that
is through the south pole and the north pole, we have an infinite number of meridians passing.
The pair of points that we take need not be the poles, you can take any pair of diametrically
opposite points and there would be an infinite number of great circles passing through them.
Therefore, P1 is violated here. Therefore, this interpretation is not a model of incidence
geometry.

(Refer Slide Time: 41:39)

Then let us formally ask this question. What is a logical consequence? We have a set of
formulae gamma, and we want to define the logical consequences of gamma, we say that
alpha is a logical consequence of gamma, denoted like this, alpha is the logical consequence
of gamma if every interpretation, that makes every formula in gamma true, makes alpha also
true. So, for incidence geometry we have seen three interpretations in which P1, P2, P3, are all
true. For Euclidean parallel property to be a logical consequence of incidence geometry, it
would have had to be true in all these three interpretations, but we find that it is true in only
one of the interpretations. Therefore Euclidean parallel property is not a logical consequence
of incidence geometry.
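
In symbols, the definition can be sketched as follows. The turnstile notation used here is the conventional one and is an assumption on our part, since the notation shown on the slide is not part of the transcript.

```latex
\Gamma \models \alpha
\quad\text{iff}\quad
\text{for every interpretation } I:\;
\bigl(I \models \gamma \ \text{for all } \gamma \in \Gamma\bigr)
\;\Longrightarrow\; I \models \alpha .
```

So a single model of gamma in which alpha is false is enough to show that alpha is not a logical consequence of gamma, which is how the three-point interpretation rules out the Euclidean parallel property.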

Similarly, elliptic parallel property is also not a logical consequence of P1, P2 and P3. The last
interpretation that we saw did not satisfy P1. Therefore that is not a model of incidence
geometry. So, when we investigate Euclidean parallel property, we need not consider this last
interpretation, because this is not a model of incidence geometry. What we have to verify is
that every interpretation which makes every formula in the set gamma true, namely P1, P2
and P3, also makes alpha true. If that is the case we say that alpha is a logical
consequence of gamma. So, it is in this sense, that we said that gamma along with all its
logical consequences forms incidence geometry. Ok, that is it from this lecture, hope to see
you in the next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 5
Mathematical Logic

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the fifth lecture on
mathematical logic.

(Refer Slide Time: 00:49)

In the previous four lectures, we had an informal discussion on propositional calculus and
first order logic. These are examples of logic systems. Today, let us have a formal discussion
on these systems of logic.

(Refer Slide Time: 01:18)

So, what is a system of logic? A system of logic essentially consists of these components.
The first one is the syntax of the language. This specifies what the grammatical formulae,
or the grammatical sentences, of the language are. Then, we have the semantic component of the
system. The semantics of the language assigns meanings to the syntactic entities. And finally
we have a proof system, which is a rewriting system: it starts with a set of axioms, has one
or possibly many rules of inference, and using these rules of inference it writes new
sentences. The process of writing these new sentences is called proving, and what we get is
a proof.

So, let us now see in detail what these components are for the two systems of logic that we
have studied, namely propositional calculus and first order logic.

(Refer Slide Time: 03:14)

So, let us begin with propositional calculus, and with its syntax. The syntax is specified
using what is called a grammar. As in English, for example, we can say that a sentence in
English is made up of a subject and a predicate. This is what is called a grammatical rule or
a grammatical production.

(Refer Slide Time: 04:13)

So, likewise we can write a set of productions for the sentences of propositional calculus. The
first production is this. Here, we say that the grammatical symbol A stands for an atomic
proposition. So what this grammar rule says is that an atomic proposition could be any one
of a1 through an. Of course, this need not even be finite; you might even have an infinite
number of options. So, the rule could very well be this, that is, our system might have an
infinite number of atomic propositions, which is also possible.

(Refer Slide Time: 05:22)

So, this specifies what the atomic propositions are. For example, a1 might stand for the
atomic proposition "it is raining", a2 might stand for the atomic proposition "I do not have
an umbrella", and a3 might stand for "I will not get wet", and so on. So, your system might
have several such atomic propositions. So this is the first grammar rule that we have.

(Refer Slide Time: 06:01)

Then, the second grammar rule says that, the syntactic entity S which stands for a
propositional formula, could be made up in this fashion, what this says is that, a formula
could be the negation of another formula or it could be an implication, one formula implying
another could form a formula or it could be an atomic formula. So, this is a recursive
specification, what it says is that a formula could be a negation of another formula or it could
be an implication made up of two formulae, which two have to be synthesized afresh or the
formula could be an atomic formula, so that is where the recursion breaks off.

So, this is the other grammar rule that we have for forming propositional formulae. So, you
can see that, in this we are using certain symbols the open and closed brackets, the negation
symbol, the implication symbol in addition to the propositional symbols, a potentially infinite
number of propositional symbols.

So, this we call our alphabet, so the language of propositional calculus is made up of this
alphabet and from this alphabet using these two above grammar rules, we synthesize
formulae. So, a formula could be a negation of another formula, it could be an implication of
two formulae or it could be an atomic formula. An atomic formula could be any one of a1, a2,
a3 etc.
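
The two grammar rules described above can be sketched as follows. The productions themselves were on the slide and are not in the transcript, so the exact placement of brackets here is an assumption; "::=" separates a grammatical symbol from its alternatives and the arrow is the implication symbol of the alphabet.

```latex
\begin{align*}
A &::= a_1 \mid a_2 \mid a_3 \mid \cdots \\
S &::= \lnot S \;\mid\; (S \rightarrow S) \;\mid\; A
\end{align*}
```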

(Refer Slide Time: 07:49)

So, let us see an example of synthesizing a formula. You could derive the negation of S from S;
as the grammar rule shows here, a formula could be a negation of a formula. So, a formula
could be derived as a negation of another formula, which could in turn be derived as an
implication. Now, there are two formulae here: in the implication there is an antecedent as
well as a consequent, and we could decide that the antecedent is a negation of another formula.
So, now the synthesized formula has two placeholders, two instances of S, each of which will
stand for a formula. Now, we could decide that these two instances are in fact atomic
formulae, and then we could say that the first atomic formula stands for the propositional
symbol a1 and the second atomic formula stands for the propositional symbol a2.

Now, this is a concrete propositional formula, this is what we call a well-formed formula.
This is a formula made up of the alphabet of our language. It has negation symbols, opening
and closing brackets, the two propositional symbols a1 and a2, and an
implication symbol. So, this is a well-formed formula of the alphabet of our language. So,
this is how we derive the formulae of the language, using these two grammar rules we can
derive the formulae of our language.
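
Written out step by step, the derivation described above would look roughly like this. It is a reconstruction from the spoken description, assuming the productions S ::= ¬S | (S → S) | A and A ::= a1 | a2 | ..., with the double arrow standing for one application of a grammar rule.

```latex
S \;\Rightarrow\; \lnot S
  \;\Rightarrow\; \lnot (S \rightarrow S)
  \;\Rightarrow\; \lnot (\lnot S \rightarrow S)
  \;\Rightarrow\; \lnot (\lnot A \rightarrow A)
  \;\Rightarrow\; \lnot (\lnot a_1 \rightarrow a_2)
```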

(Refer Slide Time: 09:56)

So, this is the syntax. The syntax specifies what the well-formed formulae are. The semantics
of the system specifies what the formulae mean. So, now let us go on to
the semantics of our system. How do we specify truth values for these formulae?

(Refer Slide Time: 10:47)

To specify the semantics, we consider functions of this form, functions that map the
propositional symbols, or the atomic formulae, to the set {0, 1}. Such a function is called an
assignment. So, this is essentially an assignment of truth values to the propositional symbols.
A propositional symbol is the same as an atomic formula; these two are synonyms. So, an
assignment sigma assigns truth values to the propositional symbols or the atomic formulae.
So, now once we have an assignment, we can say which atomic formulae are true and which
atomic formulae are false, but how do we know the truth value of a synthesized formula,
a well-formed formula?

(Refer Slide Time: 12:07)

So, the semantics of a well-formed formula is denoted like this, when I write like this, what I
mean is the truth value of alpha under the assignment sigma. So this is a notation we shall
use.

(Refer Slide Time: 12:37)

So, using this notation we say that for an atomic formula an, the truth value of the formula an
under the assignment sigma is nothing but the truth value explicitly assigned by sigma to it.
So, sigma is a mapping from the propositional symbols to the truth values, so the truth value
which is mapped to an will be the meaning of an, so that specifies the truth values for all
atomic propositions. Now, let us consider formulae of this form, formulae that are negations.
The truth value assigned to the negation of a formula will be defined in this manner, you
compute the truth value of the formula alpha under sigma and then subtract that from 1. This
is the truth value that is assigned to the negation of alpha.

So, if alpha evaluates to 1 under sigma, then not of alpha will be assigned 0; if alpha
evaluates to 0 under sigma, then not of alpha will be assigned a value of 1 under sigma. So
this is how you assign a truth value to a negation. For an implication, likewise, we can say
this is 1 minus the product of the meaning of alpha under sigma and (1 minus the meaning of
beta under sigma). That is, the truth value assigned to alpha implies beta under the assignment
sigma would be this quantity. So, you can see that this assigns a truth value of 0 if and only if
alpha evaluates to true and beta evaluates to false under sigma.

Or in other words, if alpha is 0 or beta is 1, then alpha implies beta will be assigned a truth
value of 1. So, this is how we assign truth values to composite formulae. So, this now

specifies a truth value for every single well-formed formula. So, given a well-formed
formula we can parse the formula, see how the formula was derived using the grammar and
then running along the derivation we can assign truth values to the constituents of the
formula. So, it is possible to calculate the truth value of a formula, given an assignment.
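
The recursive definition above translates almost directly into a small program. This is only a sketch: the encoding of formulae as nested tuples and the function name value are choices made for this illustration, and an assignment sigma is taken to be a dictionary from propositional symbols to 0 or 1.

```python
# Formulae as nested tuples (an encoding chosen for this sketch):
#   "a1"                 atomic formula
#   ("not", f)           negation of f
#   ("implies", f, g)    f implies g

def value(f, sigma):
    """Truth value (0 or 1) of the formula f under the assignment sigma."""
    if isinstance(f, str):
        # Atomic formula: the value is whatever sigma assigns to it.
        return sigma[f]
    if f[0] == "not":
        return 1 - value(f[1], sigma)
    if f[0] == "implies":
        # 0 exactly when the antecedent is 1 and the consequent is 0.
        return 1 - value(f[1], sigma) * (1 - value(f[2], sigma))
    raise ValueError(f"unknown connective: {f[0]!r}")

# A formula of the shape not (not a1 implies a2).
example = ("not", ("implies", ("not", "a1"), "a2"))
print(value(example, {"a1": 0, "a2": 0}))
print(value(example, {"a1": 1, "a2": 0}))
```

The function follows the derivation of the formula: it evaluates the constituents first and then combines their values, exactly as described above.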

(Refer Slide Time: 15:39)

So, consider a truth table for a formula alpha. In the truth table, we find that we have one row
corresponding to each assignment. So if the formula evaluates to 1 under this assignment,
then the corresponding entry in the truth table in the final column would be 1. Instead if the
formula evaluates to 0, then the corresponding entry will be 0. So, looking at the truth table
we can understand what the meaning of the formula is under any assignment. So, a truth table
is a complete specification of the semantics of a formula. So, given a formula alpha, and an
assignment sigma, we can refer to the truth table to see whether sigma satisfies it or not, once
the truth table is available.
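
A truth table can be generated by running through all the assignments. The sketch below assumes formulae are written as nested tuples (a string for an atomic formula, ("not", f) for a negation, ("implies", f, g) for an implication); the helper names are made up for this illustration.

```python
from itertools import product

def value(f, sigma):
    """Truth value of f under sigma, following the rules for not and implies."""
    if isinstance(f, str):
        return sigma[f]
    if f[0] == "not":
        return 1 - value(f[1], sigma)
    if f[0] == "implies":
        return 1 - value(f[1], sigma) * (1 - value(f[2], sigma))
    raise ValueError(f"unknown connective: {f[0]!r}")

def atoms(f):
    """Set of propositional symbols occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(sub) for sub in f[1:]))

def truth_table(f):
    """One row per assignment: the bits given to the symbols, then the value of f."""
    names = sorted(atoms(f))
    rows = []
    for bits in product((0, 1), repeat=len(names)):
        sigma = dict(zip(names, bits))
        rows.append((bits, value(f, sigma)))
    return names, rows

names, rows = truth_table(("implies", "a1", "a2"))
print(names)
for bits, result in rows:
    print(bits, result)
```

A formula with k propositional symbols has 2 to the power k rows, one for each assignment.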

(Refer Slide Time: 16:43)

We say that alpha logically implies beta, which we denote in this fashion, with the double
implication sign, if every assignment that satisfies alpha also satisfies beta. In
other words, if you look at the truth table and the columns corresponding to alpha and beta,
we find that whenever alpha is true, beta is also true. But of course beta could be true even
when alpha is false in some assignments, where alpha is false beta could be true. But what we
know is that whenever alpha is true, beta is true. Therefore, we say that beta is a logical
consequence of alpha, if alpha holds then beta has to necessarily hold.

(Refer Slide Time: 18:02)

So, another way of saying this is that beta is a logical consequence of alpha. We could write
it this way too, to say that beta is a logical consequence of alpha. And we say that a
well-formed formula alpha is a tautology if the truth value of alpha under sigma is 1 for all
possible assignments sigma. So, a tautology is a formula which is always true: in every single
assignment, the formula happens to be true. When we write like this, what we mean is that
alpha is a tautology, alpha is a logical consequence of nothing. In other words, alpha is true
everywhere, in which case we say that alpha is a tautology.

(Refer Slide Time: 19:20)

So, let us look at a theorem now, which we call theorem 1. Here gamma is a set of formulae,
mind you. The theorem says that if alpha is a logical consequence of gamma, in other words,
whenever the whole of gamma is true, that is, whenever each of the formulae in gamma is true,
alpha is also true, and alpha implies beta is also a logical consequence of gamma, then beta
is a logical consequence of gamma. So, this is what we want to show.

Let us assume that alpha is a logical consequence of gamma and alpha implies beta is a
logical consequence of gamma. So what it means is that, in an assignment which satisfies the
whole of gamma, alpha is true as well as alpha implies beta is true. So, suppose sigma
satisfies gamma. Then, sigma satisfies alpha and sigma satisfies alpha implies beta. So, if
under this assignment, alpha is true and alpha implies beta is true, then, beta must also be
true. Because, if beta is not true, then alpha is true and beta is false.

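Therefore, alpha implies beta must be false, which is a contradiction. Therefore, these two
together ensure that beta is satisfied by sigma. Since sigma was an arbitrary assignment that
satisfies gamma, beta is a logical consequence of gamma. That proves the theorem.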

(Refer Slide Time: 21:38)

So, as a corollary we see that, if alpha is a tautology and alpha implies beta is also a
tautology, then, beta is also a tautology.

(Refer Slide Time: 22:03)

Now, we have seen the syntax as well as the semantics of the system of logic. How do we
correlate the two? How do you distinguish a tautology when you see one? That is one
question: when you are given a formula, you have to say whether it is a tautology or not.

The second question is this: given gamma, distinguish its logical consequences. So, we have
these two questions. First, we want to distinguish tautologies: given a formula, we have to say
whether it is a tautology or not. Secondly, given gamma, which could be finite or infinite,
we have to distinguish its logical consequences; that is, in the context of gamma, when we
are given a formula alpha, we have to say whether alpha is a logical consequence of gamma
or not. If gamma is finite, then these two questions can be answered using the truth table
method. That is, you have a formula alpha, you draw the truth table, and if the truth table
says that the formula is true in every single assignment, then the formula is a tautology.

Similarly, you draw the truth table for every single formula in gamma, if gamma is finite, and
then you also draw the truth table of alpha, and if you find that alpha is true in every
assignment which satisfies the whole of gamma, then alpha is a logical consequence
of gamma. So, the truth table method will help us in answering these questions in situations
where gamma is finite. But the truth table method is a luxury that we have, in the case of
propositional calculus. But when we come to first order logic, as we shall soon, we find that
we do not have a method which is analogous to the truth table method, because the semantic
space there could be infinite. Therefore, we need a different way of distinguishing logical
consequences as well as tautologies.
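
For a finite gamma, the truth table method can be sketched in a few lines. As before, this is an illustration under assumptions: formulae are nested tuples (a string for an atomic formula, ("not", f), ("implies", f, g)), and the names is_tautology and is_consequence are hypothetical.

```python
from itertools import product

def value(f, sigma):
    """Truth value of f under the assignment sigma."""
    if isinstance(f, str):
        return sigma[f]
    if f[0] == "not":
        return 1 - value(f[1], sigma)
    if f[0] == "implies":
        return 1 - value(f[1], sigma) * (1 - value(f[2], sigma))
    raise ValueError(f"unknown connective: {f[0]!r}")

def atoms(f):
    """Set of propositional symbols occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(sub) for sub in f[1:]))

def assignments(formulae):
    """All assignments to the symbols occurring in the given formulae."""
    names = sorted(set().union(*(atoms(f) for f in formulae)))
    for bits in product((0, 1), repeat=len(names)):
        yield dict(zip(names, bits))

def is_tautology(alpha):
    """True if alpha evaluates to 1 under every assignment."""
    return all(value(alpha, s) == 1 for s in assignments([alpha]))

def is_consequence(gamma, alpha):
    """True if every assignment satisfying all of gamma also satisfies alpha."""
    return all(
        value(alpha, s) == 1
        for s in assignments(list(gamma) + [alpha])
        if all(value(g, s) == 1 for g in gamma)
    )

print(is_tautology(("implies", "a1", "a1")))
print(is_consequence(["a1", ("implies", "a1", "a2")], "a2"))
```

The loop over assignments is finite only because gamma and alpha mention finitely many propositional symbols, which is exactly the luxury that is lost in first order logic.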

(Refer Slide Time: 24:35)

For this, we have what is called a proof system. We say that P is a proof system if P has a
set of axioms and a set of rules of inference. An axiom is nothing but a formula, so a set of
axioms is a set of formulae. A rule of inference is a relation on formulae.

So, a proof system consists of these two components, a set of axioms and a set of rules of
inference and then, a proof in this proof system is a sequence of formulae, a sequence beta 1
through beta n is called a proof, where beta 1 is an axiom.

(Refer Slide Time: 25:56)

So a proof always begins with an axiom, and beta i, for i greater than 1, may also be an
axiom; you could use an axiom anywhere in the proof. So, beta i is either an axiom or follows
from beta 1 through beta i minus 1, the previous formulae in the proof, by some rule of
inference. So, this is our notion of a proof. We have a set of axioms and we have a set of
rules of inference.

A proof is a sequence of formulae beta 1 through beta n, so that the first formula in the
sequence is necessarily an axiom and any subsequent formula in the proof is either an axiom
or follows from the previous formulae in the proof, by some rule of inference. So, what it
means is that, within a proof you can use an axiom anywhere you want, and at any point in
time you can use some of the previous formulae, combining them using a rule of inference to
create a new formula which could be the next formula within the proof. So, such a sequence
of formulae is called a proof.

(Refer Slide Time: 27:28)

So, now let us see a proof system for propositional calculus. A proof system P0 for
propositional calculus, consists of some axioms, that are called logical axioms. So axioms
could be classified into logical axioms and proper axioms. So, the logical axioms would
belong to these templates. So, in this proof system that we are going to talk about, there are
three templates, the templates are these.

The first template is of the form alpha implies (beta implies alpha). This is the first template.
So, from this template you can form concrete axioms by substituting formulae for alpha and
beta. So, you could write a1 implies (a2 implies a1); a1 implies (a3 implies a1) and so on. So,
these are all concrete formulae that are derived using this template. Therefore, we will say
that A1 is an axiom schema.

(Refer Slide Time: 29:14)

The second axiom, or the second family of axioms, is of this form: (alpha implies (beta implies
gamma)) implies ((alpha implies beta) implies (alpha implies gamma)). And the third family of
axioms says that (not alpha implies not beta) implies ((not alpha implies beta) implies alpha).
So, these are the three families of axioms that we have; these are the families of logical
axioms. By substituting any well-formed formulae for alpha, beta and gamma in
these templates we can derive an infinite number of logical axioms. So, P0 consists of an
infinite number of logical axioms.
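
In symbols, the three schemas can be sketched as below. The bracketing is a reconstruction from the spoken description: the grouping of the second schema is the one used later in the sample proof, and the third is assumed to follow the standard form of this axiom system.

```latex
\begin{align*}
\text{(A1)}\quad & \alpha \rightarrow (\beta \rightarrow \alpha) \\
\text{(A2)}\quad & (\alpha \rightarrow (\beta \rightarrow \gamma)) \rightarrow
                   ((\alpha \rightarrow \beta) \rightarrow (\alpha \rightarrow \gamma)) \\
\text{(A3)}\quad & (\lnot\alpha \rightarrow \lnot\beta) \rightarrow
                   ((\lnot\alpha \rightarrow \beta) \rightarrow \alpha)
\end{align*}
```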

(Refer Slide Time: 30:06)

In addition to that, we could also have what are called proper axioms. Let us say we have a
set of proper axioms called gamma. So, this could be any set of arbitrary formulae that suits
our convenience. And we have only one rule of inference. This rule of inference is called
Modus Ponens, or MP for short. It consists of triplets of the form: alpha, alpha implies beta,
and beta, where alpha and beta are well-formed formulae.

(Refer Slide Time: 31:10)

Or we could write a rule of inference like this: when you have alpha and alpha implies beta,
then beta is derivable. So, what it means is that, within a proof, if you have already proved
alpha and you have proved alpha implies beta, then, as the next step in the proof, you could
write beta. That is, if alpha and alpha implies beta are among the previous formulae within
the proof, then the next formula within the proof could be beta. So, this is a rewriting rule
called modus ponens, and it is the only rule of inference that we have. So, our axioms are
either logical axioms or formulae from the set gamma.

So, axioms are from these and MP is our only rule of inference, therefore a proof will consist
of a sequence beta 1 through beta n where beta 1 is necessarily an axiom, it is either a logical
axiom or is a formula from gamma and any subsequent formula is either a logical axiom or a
formula from gamma or follows from two previous formulae by Modus Ponens. So, such a
sequence of formulae is what we call a proof.
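
Since a proof is just a sequence of formulae satisfying these conditions, checking a proof is a mechanical task. The following is a minimal sketch, not the lecture's own construction: formulae are nested tuples, the metavariables of the schemas are written as strings starting with "?", the bracketing of the three schemas is the standard one and is assumed here, and the names match, is_logical_axiom and is_proof are hypothetical.

```python
def imp(a, b):
    return ("implies", a, b)

def neg(a):
    return ("not", a)

# The three axiom schemas, with metavariables ?alpha, ?beta, ?gamma.
A1 = imp("?alpha", imp("?beta", "?alpha"))
A2 = imp(imp("?alpha", imp("?beta", "?gamma")),
         imp(imp("?alpha", "?beta"), imp("?alpha", "?gamma")))
A3 = imp(imp(neg("?alpha"), neg("?beta")),
         imp(imp(neg("?alpha"), "?beta"), "?alpha"))
SCHEMAS = (A1, A2, A3)

def match(schema, formula, binding):
    """Try to extend binding so that the schema instantiates to the formula."""
    if isinstance(schema, str) and schema.startswith("?"):
        if schema in binding:
            return binding[schema] == formula
        binding[schema] = formula
        return True
    if isinstance(schema, str) or isinstance(formula, str):
        return schema == formula
    return (schema[0] == formula[0] and len(schema) == len(formula)
            and all(match(s, f, binding) for s, f in zip(schema[1:], formula[1:])))

def is_logical_axiom(formula):
    return any(match(schema, formula, {}) for schema in SCHEMAS)

def is_proof(steps, gamma=()):
    """Each step must be a logical axiom, a formula from gamma,
    or follow from two earlier steps by modus ponens."""
    for i, beta in enumerate(steps):
        earlier = steps[:i]
        by_mp = any(imp(a, beta) in earlier for a in earlier)
        if not (is_logical_axiom(beta) or beta in gamma or by_mp):
            return False
    return True

gamma = ["a1", imp("a1", "a2")]
print(is_proof(["a1", imp("a1", "a2"), "a2"], gamma))
print(is_proof(["a2"], gamma))
```

Here the first sequence is accepted because a2 follows from the two proper axioms by modus ponens, while the second is rejected because a2 on its own is neither an axiom nor derived from anything.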

(Refer Slide Time: 32:43)

So, you may have noticed that here in our system we have only two logical connectives,
namely negation and implication, and we have not used any other connective. That
is because this set is a complete set of connectives. Why is that? That is because we know
that x implies y is logically equivalent to x bar or y, that is, negation of x or y. Therefore,
(negation of x) implies y will be logically equivalent to negation of negation of x or y, but
negation of negation of x is the same as x. Therefore, (negation of x) implies y is nothing but
the OR of x and y; that is, OR can be synthesized using negation and implication. Now, we
know that OR and NOT together form a complete set of connectives. Therefore, negation and
implication also form a complete set of connectives. Therefore, in our system, even if we have
only these two connectives, we can still synthesize any boolean function, as we have seen
before. So, that is why we have used only these two connectives in our system.
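
The claim that OR can be synthesized from negation and implication is easy to check over all four assignments. This is a small sketch using the arithmetic form of the connectives; the function names are chosen for this illustration.

```python
from itertools import product

def implies(x, y):
    # Arithmetic form of implication: 0 only when x is 1 and y is 0.
    return 1 - x * (1 - y)

def negate(x):
    return 1 - x

def or_from_implication(x, y):
    # (negation of x) implies y
    return implies(negate(x), y)

table = [(x, y, or_from_implication(x, y)) for x, y in product((0, 1), repeat=2)]
print(table)
```

The third entry of each row agrees with the OR of the first two, so the two connectives suffice wherever OR and NOT do.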

(Refer Slide Time: 34:15)

Now, let us see a sample proof. When we write like this, what we mean is that, alpha is
provable from the logical axioms alone, without any proper axioms. When we write like this,
what we say is that alpha is provable from gamma. So, in this case what we mean is that from
logical axioms and the set of proper axioms gamma, alpha is provable.

(Refer Slide Time: 35:10)

So, in this sense, we will show that alpha implies alpha is provable. So, this is
what we want to establish: alpha implies alpha is provable from, let us say, nothing, that is,
from logical axioms alone we can prove alpha implies alpha.

So, how we prove is this, the first statement has to be necessarily an axiom, so the axiom that
we choose is this. So, this is the first statement in the proof. The first statement has to be
necessarily an axiom, now, is this an axiom? We find that it is indeed an axiom, because this
is from the second template.

In the second template, we have (alpha implies (beta implies gamma)) implies ((alpha implies
beta) implies (alpha implies gamma)). So, if you substitute alpha implies alpha for beta and
alpha for gamma, what we get is exactly this. So, this is an instance of axiom schema 2. By
substituting alpha implies alpha for beta and alpha for gamma, this is what we get. So, this is
the first statement in our proof.

The second statement in our proof is alpha implies ((alpha implies alpha) implies alpha). This,
you can see, is an instance of axiom schema 1: if you substitute alpha implies alpha for beta
in axiom schema 1, this is precisely what we get. So this is the second
statement in the proof.

(Refer Slide Time: 37:23)

Then the third statement is an implication: (alpha implies (alpha implies alpha)) implies
(alpha implies alpha). How do we get this? We get this from the first two statements. So, if you
observe the first two statements, we find that the second statement is in fact the antecedent of
the implication, which is the first statement. The second statement and the antecedent here are
identical. Therefore, once we have these two statements in our proof, we can apply the rule of
inference Modus Ponens on these, and we can write the consequent as the next statement that
is precisely what we have done, this happens to be the consequent of the first statement in the
proof.

So, this is by Modus Ponens on statements 1 and 2, and then we have alpha implies (alpha
implies alpha). How? By A1. If in axiom schema 1 you substitute alpha for beta, this is
precisely what you get, alpha implies (alpha implies alpha), so this is the fourth statement in the
proof. Now, you compare the third statement with the fourth statement we find that the fourth
statement is the same as the antecedent of the third statement. Therefore, we can now derive
the consequent of the third statement as the next statement in the proof and this is precisely
what we wanted to prove. We wanted to show that alpha implies alpha is provable from
logical axioms alone, that is precisely what we have done. So this is an example of a proof in
our system.
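
As a small sketch, and not something taken from the lecture slides, the five statements of this proof can be checked mechanically in Python. The tuple encoding of formulae and the helper names imp, a1, a2 and modus_ponens are assumptions made only for this illustration; A1 and A2 are taken in the forms alpha implies (beta implies alpha) and (alpha implies (beta implies gamma)) implies ((alpha implies beta) implies (alpha implies gamma)).

```python
# A formula is either a variable name (a string) or an implication ("->", p, q).
def imp(p, q):
    return ("->", p, q)

def a1(a, b):
    # Axiom schema 1: a -> (b -> a)
    return imp(a, imp(b, a))

def a2(a, b, c):
    # Axiom schema 2: (a -> (b -> c)) -> ((a -> b) -> (a -> c))
    return imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c)))

def modus_ponens(implication, antecedent):
    # From p -> q and p, derive q; anything else is rejected.
    if not isinstance(implication, tuple) or implication[0] != "->":
        raise ValueError("first argument is not an implication")
    _, p, q = implication
    if p != antecedent:
        raise ValueError("antecedent does not match")
    return q

alpha = "alpha"
s1 = a2(alpha, imp(alpha, alpha), alpha)  # statement 1: instance of A2
s2 = a1(alpha, imp(alpha, alpha))         # statement 2: instance of A1
s3 = modus_ponens(s1, s2)                 # statement 3: Modus Ponens on 1 and 2
s4 = a1(alpha, alpha)                     # statement 4: instance of A1
s5 = modus_ponens(s3, s4)                 # statement 5: Modus Ponens on 3 and 4
print(s5)
```

If any step did not match, modus_ponens would raise an error, so the fact that s5 is computed at all is the check that the sequence is a proof.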

(Refer Slide Time: 39:25)

Now, we prove what is called the soundness of the system of logic. We say that a system of
logic is sound if whatever you prove in that system happens to be a logical consequence. In
other words, if you take gamma as the set of proper axioms and manage to prove alpha, then
alpha is indeed a logical consequence of gamma. We claim that the system of proof that we have
is sound in this sense. So, how do you prove this?

(Refer Slide Time: 40:23)

What we have assumed is that alpha is provable from gamma, and we want to show that alpha is
indeed a logical consequence of gamma. If alpha is provable from gamma, then there exists a
sequence beta 1 through beta n, culminating in alpha, so that beta 1 is an axiom and any other
formula in the sequence is either an axiom or is derived from two previous formulae by Modus
Ponens. In other words, there exists a proof which culminates in alpha. That is what we mean
when we say that alpha is provable from gamma. Now, let us look at beta 1. Beta 1 is surely an
axiom. So, any assignment that satisfies the whole of gamma must satisfy beta 1 too. Why is
that? If beta 1 is an axiom, either it is a logical axiom or it is a proper axiom, and if it
is a proper axiom, it is a member of gamma.

Now, we are looking at an assignment which satisfies the whole of gamma. Therefore, in
particular, beta 1 is also satisfied if beta 1 happens to be a proper axiom. On the other
hand, if beta 1 is a logical axiom, then beta 1 must subscribe to one of the three templates
A1, A2 and A3. If you look at A1, A2 and A3, you find that these three templates are
essentially tautologies; that is, whatever you substitute for the variables here, what you get
is a tautology. You can draw the truth table and see that whatever truth values you assign to
alpha, beta and gamma, these formulae will always evaluate to true. Therefore, our logical
axioms are always tautologies, and they are satisfied in every single assignment. So, if beta 1
is a logical axiom, then it is true in every single assignment, not just in the assignments
that satisfy the whole of gamma.
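
The truth table check mentioned here can be sketched in a few lines of Python. This is only an illustration; the function names are assumptions, and A3 is taken in the form (not alpha implies not beta) implies ((not alpha implies beta) implies alpha).

```python
from itertools import product

def implies(p, q):
    # Truth value of p -> q for Boolean p and q.
    return (not p) or q

def a1(a, b, c):
    # a -> (b -> a); c is unused but kept so all schemas share one signature.
    return implies(a, implies(b, a))

def a2(a, b, c):
    # (a -> (b -> c)) -> ((a -> b) -> (a -> c))
    return implies(implies(a, implies(b, c)), implies(implies(a, b), implies(a, c)))

def a3(a, b, c):
    # (not a -> not b) -> ((not a -> b) -> a)
    return implies(implies(not a, not b), implies(implies(not a, b), a))

def is_tautology(schema):
    # Evaluate the schema on all eight rows of the truth table.
    return all(schema(a, b, c) for a, b, c in product([False, True], repeat=3))

print(is_tautology(a1), is_tautology(a2), is_tautology(a3))
```

Since every instance of a schema is obtained by substituting formulae for alpha, beta and gamma, and each substituted formula evaluates to some truth value, checking the eight rows covers every instance.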

So, in particular in any assignment which satisfies the whole of gamma, beta 1 is true also.
Therefore, if beta 1 is an axiom, then any assignment that satisfies gamma must necessarily
satisfy beta 1. Or in other words, beta 1 is a logical consequence of gamma. So, we are now
proving by induction. This is the basis of the induction. The first statement in the proof is a
logical consequence of gamma.

(Refer Slide Time: 43:18)

Now, assume that, beta j is a logical consequence of gamma for every j less than i, and we are
going to look at beta i. Consider the sequence from beta 1 through beta i, this is also a proof
culminating in beta i. Every formula here is, either an axiom or follows from two of the
previous formulae by Modus Ponens. Therefore, this is also a proof and here we know that
beta 1 to beta i minus 1 are all logical consequences of gamma. What we want to show is
that, beta i is also a logical consequence of gamma.

(Refer Slide Time: 44:11)

Now, there are several cases for beta i. Beta i could be an axiom. If beta i is an axiom, the
argument that we made in the case of beta 1 is valid here too. If beta i is a logical axiom,
it is true everywhere, so it is true in particular in any assignment which makes gamma true.
If beta i is a proper axiom, then it is a member of gamma, so in any assignment which makes
the whole of gamma true, this is also true trivially.

So, if beta i is an axiom, then we know that beta i is a logical consequence of gamma.
Suppose instead that beta i follows from some beta j and beta k by Modus Ponens, where j and k
are both less than i. If this is the case, then either beta k is the same as beta j implies
beta i, or beta j is the same as beta k implies beta i. It is exactly in these two situations
that we will be able to derive beta i from beta j and beta k using Modus Ponens. The two are
symmetric, so we will discuss only one. Let us consider the first case and assume that beta k
is beta j implies beta i. Now, what we know is that all the previous formulae are logical
consequences of gamma. Therefore, we know that beta j is a logical consequence of gamma, and
beta k, which is beta j implies beta i, is also a logical consequence of gamma.

(Refer Slide Time: 46:15)

By the theorem we have just shown, if beta j is a logical consequence of gamma and beta j
implies beta i is a logical consequence of gamma, then beta i is also a logical consequence of
gamma. Therefore, what we have found is that beta i is a logical consequence of gamma.

Therefore, by induction, we can say that this is the case for every i less than or equal to n. In
the proof, we are going to consider beta 1 to beta n. But then what is beta n? beta n happens
to be alpha, that is the culmination of the proof, the original proof that we started with. We
started with the proof of this form which ends in alpha, that is why we claimed that alpha is
provable from gamma. So, now what we have shown is that the culminating statement, which
is beta n is also a logical consequence of gamma, by induction.

In other words, alpha is a logical consequence of gamma. In other words, if alpha is provable
from gamma then alpha is a logical consequence of gamma. In other words, whatever we
prove is sound in this system of proof. Okay, that is it, from this lecture. In the next lecture,
we will see a proof system for the first order logic. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 6
Mathematical Logic

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the sixth lecture on
mathematical logic. In the previous lecture, we saw a formal discussion of propositional
calculus. Today we shall have a formal discussion of first order logic, a system of logic
which is more involved than propositional calculus.

(Refer Slide Time: 00:56)

So, in first order logic, exactly as in the case of propositional calculus, we have a
specification of syntax, semantics and a proof system.

(Refer Slide Time: 01:21)

Let us begin with syntax. So, as in propositional calculus, to specify syntax we require an
alphabet first of all, and using this alphabet we have to devise formulae. So the alphabet of
first order predicate calculus will consist of these symbols: open and close brackets, the
logical connectives, negation and implication exactly as in propositional calculus and we will
have a quantifier, the "for all" quantifier.

And then we will have a number of variables; there could even be infinitely many variables.
Then we have function symbols of the form F n i, where n is greater than or equal to 0 and i
is greater than or equal to 0. F n i denotes the ith n-ary function symbol; that is, it is the
ith function symbol which takes n arguments. And we have predicate symbols P n i, where n is
greater than 0 and i is greater than or equal to 0; P n i represents the ith n-ary predicate
symbol. So, the alphabet of the language will consist of these symbols: the open and close
brackets, the negation symbol, the implication symbol, the universal quantifier, a number of
variables, and a number of function symbols and predicate symbols.

(Refer Slide Time: 03:06)

You notice that we have only one quantifier here, the universal quantifier; we have not
included the existential quantifier. That is because existential quantification can be
expressed using universal quantification and negation, by De Morgan's law. For example, a
formula of the form there exists x alpha of x can be written as the negation of for all x the
negation of alpha of x. Therefore, the existential quantifier can be expressed using the
universal quantifier and negation, and in our alphabet we have included both the universal
quantifier and negation. So, that is a sufficient set, as we shall see.
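
On a finite domain this equivalence can be tried out directly. The following Python sketch is only an illustration under that assumption (first order domains need not be finite); the names forall, exists and the sample predicates are made up for the example.

```python
def forall(domain, pred):
    # "for all x, pred(x)" over a finite domain.
    return all(pred(x) for x in domain)

def exists(domain, pred):
    # "there exists x, pred(x)", written only with forall and negation.
    return not forall(domain, lambda x: not pred(x))

domain = range(10)
is_even = lambda x: x % 2 == 0
is_negative = lambda x: x < 0

print(exists(domain, is_even))      # some element of the domain is even
print(exists(domain, is_negative))  # no element of the domain is negative
```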

(Refer Slide Time: 04:12)

Now, let us define the grammar which governs the language. First, we define what is called a
term. A term is an entity that is supposed to name individuals. A term could be a 0-ary
function symbol, that is, a function symbol F 0 i which does not take an argument; we will use
a i as a short form for F 0 i. Or a term could be a variable, or it could be an n-ary function
symbol applied to a number of terms.

So, a term can be constructed from other terms in this fashion. A term could be made up of an
n-ary function symbol applied on n terms, so there would be n arguments here, if we use an
n-ary function symbol. In particular, if we use a 1 argument function symbol, we will have a
term of this form. If you use a 2 argument function symbol, we will use two terms as
arguments. So, a term can be generated inductively in this form.

(Refer Slide Time: 05:59)

So, that is what a term is. An atomic formula would be an n-ary predicate symbol applied on
'n' terms. So, using the previous grammar rule, we generate terms and using n such terms we
have to use an n-ary predicate symbol to generate an atomic formula and then a well-formed
formula would be either an atomic formula or a negation of a well-formed formula or a well-
formed formula implying another. These two are similar to propositional calculus and then
finally for any variable x i for all x i quantification applied on a well-formed formula, will
also be a well-formed formula.

So, a well-formed formula can be synthesized in this fashion. So, these are the rules
governing the syntactic entities of first-order logic. Every syntactic entity can be generated
using one of these rules from the alphabet.
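
One way to picture these grammar rules is as a small set of Python data classes, one per rule. This is only a sketch; the class names, and the use of strings as symbol names instead of F n i and P n i, are assumptions made for readability.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Var:
    index: int                        # the variable x_i

@dataclass(frozen=True)
class Func:
    name: str                         # function symbol; no arguments means a constant
    args: Tuple["Term", ...] = ()

Term = Union[Var, Func]

@dataclass(frozen=True)
class Atom:
    pred: str                         # predicate symbol applied on terms
    args: Tuple[Term, ...]

@dataclass(frozen=True)
class Not:
    body: "Formula"

@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class ForAll:
    var: Var
    body: "Formula"

Formula = Union[Atom, Not, Implies, ForAll]

# for all x1: is_actor(x1) implies not taller_than(x1, father(x1))
x1 = Var(1)
example = ForAll(
    x1,
    Implies(
        Atom("is_actor", (x1,)),
        Not(Atom("taller_than", (x1, Func("father", (x1,))))),
    ),
)
print(example)
```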

(Refer Slide Time: 07:40)

Now, we need to specify the semantics. To specify the semantics of first-order logic, we use
what is called an interpretation. An interpretation I is a triplet. It specifies D, a domain
of discourse, which is the set of individuals about whom we talk using the system of logic.
Then we have a function F, which is a mapping from the set of function symbols to functions
on D.

In particular, an n-ary function symbol should be mapped to a function from D power n to D;
it would take n individuals and map them to an individual in D. That is the semantics that we
assign to F n i, an n-ary function symbol: the meaning of this function symbol would be a
function which maps a tuple of n entities from D to an entity from D. And finally, we have a
third component R, which maps the predicate symbols to relations on D.

(Refer Slide Time: 09:56)

In particular, an n-ary predicate symbol will map to an n-ary relation on D, that is, the
meaning of an n-ary predicate symbol is going to be a subset of D power n.
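
To make the triplet concrete, here is a hedged Python sketch of an interpretation over a small finite domain. The domain (the numbers 0 to 4 with arithmetic modulo 5), the symbol names and the helper is_well_formed are all assumptions chosen for the illustration, not part of the lecture.

```python
from itertools import product
from typing import Callable, Dict, NamedTuple, Set, Tuple

class Interpretation(NamedTuple):
    domain: Set[int]                                        # D
    functions: Dict[str, Tuple[int, Callable[..., int]]]    # symbol -> (arity, function on D)
    relations: Dict[str, Tuple[int, Set[Tuple[int, ...]]]]  # symbol -> (arity, subset of D^n)

D = set(range(5))
I = Interpretation(
    domain=D,
    functions={
        "zero": (0, lambda: 0),                   # a constant symbol
        "succ": (1, lambda x: (x + 1) % 5),
        "plus": (2, lambda x, y: (x + y) % 5),
    },
    relations={
        "is_even": (1, {(x,) for x in D if x % 2 == 0}),
        "less_than": (2, {(x, y) for x in D for y in D if x < y}),
    },
)

def is_well_formed(interp):
    # Every n-ary function must map D^n into D,
    # and every n-ary relation must be a subset of D^n.
    for arity, fn in interp.functions.values():
        for args in product(interp.domain, repeat=arity):
            if fn(*args) not in interp.domain:
                return False
    for arity, rel in interp.relations.values():
        for tup in rel:
            if len(tup) != arity or not set(tup) <= interp.domain:
                return False
    return True

print(is_well_formed(I))
```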

(Refer Slide Time: 10:40)

So, let us take an example. First of all, we consider 1 variable functions, or let us begin with 0
variable functions. These are also called constant symbols. These are akin to proper nouns in
English. These are supposed to refer to particular individuals in the domain of discourse.

Names like Singh, Modi, Jaitley, Chidambaram etc. are proper nouns. They refer to
individuals, when the domain of discourse is the set of all people.

So, when we have a first order system that is talking about the set of all people, we would be
referring to particular individuals using their proper nouns. So, these are all proper nouns.
Using 0 variable functions which are constant symbols we refer to particular individuals
belonging to the set D.

(Refer Slide Time: 12:19)

Whereas 1-variable function symbols, of this form, take an argument x and then map this
argument to an individual belonging to D. For example, when we say, father of x, we take
argument x and then use the father of function to map x to the father of individual x in the
domain of discourse. So, x belongs to the domain of discourse and the father of x also
belongs to the domain of discourse. Another 1-variable function could be the mother of x. So,
these 1 variable functions map individuals to individuals. So, 1-variable function symbols are
mapped to such functions.

(Refer Slide Time: 13:24)

Then, what could be a 2-variable function on D? 2-variable function symbols would be mapped
to such functions. A 2-variable function on D, where D is the set of all people, could be the
eldest child of x and y. You substitute appropriate individuals for x and y, and you get the
eldest child of x and y as the meaning of this expression. So this is a term. If you look at
the definition of a term that we had before, we find that a term is obtained in this manner.

A term could be a constant symbol, that is, it could be an individual specified using his or
her proper noun. Or a term could be a variable; a variable is rather like a pronoun in
English: he or she or it. So, an individual can also be referred to using a pronoun. So, a
term could be a proper noun or a pronoun, or it could be a function symbol applied on a number
of terms. For example, when you say, father of the eldest child of a and b, where a and b are
proper nouns, then you understand what is the meaning of this expression. So, you can
construct terms in this fashion using function symbols and smaller terms.

(Refer Slide Time: 15:13)

An example of a 3-variable function, again, when D is the set of all people could be the tallest
among x, y and z. You could substitute appropriate terms for x, y and z to get terms out of
this function. So, every term you can see is supposed to refer to individuals.

(Refer Slide Time: 16:02)

To take another example, if the domain of discourse were the set of natural numbers, then x
plus y is the function symbol plus applied on x and y. Only that the function here is applied
in the infix format; you could use it in the usual prefix format, plus applied on x and y,
which represents the sum of x and y. So, this is now a 2-variable function symbol and that is
mapped to the addition function. Similarly, x * y written in prefix form would be
multiplication applied on x and y; this is again a 2-variable function symbol. So that is how
we construct terms out of smaller terms.

(Refer Slide Time: 17:01)

And then, predicate symbols. Predicate symbols are mapped to relations, so what would be an
example of a 1-ary relation? For example, "is an actor" is a 1-ary relation. This is the set
of all x belonging to D such that x is an actor. That is, the interpretation I maps the
predicate symbol "is an actor" to this relation.

So, when we write, is an actor being applied to a term, to evaluate the truth value of this you
have to first evaluate this term and then you have to check whether that individual the
individual referred to by this term is indeed an actor, that is it indeed belongs to this set. If
that is the case, then this predicate is true, otherwise this predicate is false.

(Refer Slide Time: 18:20)

What would be an example of a 2-ary relation? On the set of individuals, you could say, "is
taller than"; you would want to say that x is taller than y, and this is a 2-ary relation. A
3-ary relation on the set of natural numbers could be a set of triplets of natural numbers.
So, predicate symbols with n variables are mapped to n-ary relations.

(Refer Slide Time: 19:16)

The meaning of a predicate symbol with n variables, would be an n-ary relation on D. So, that
is how we define the semantics of the symbols.

(Refer Slide Time: 19:36)

Now, we consider what is called a context. We will consider the context along with an
interpretation. We call this an interpretation context pair. What the context does is to map
the set of variables to the domain of discourse. So, for each variable x i, the context
specifies a member of the set D. To take an analogy with English, when we say in a sentence,
"he is on his way", what does "he" refer to?

To understand who the person being referred to here is, we will have to look at the previous
sentence or we will have to understand the context. If you look at the context you will
understand who this pronoun is referring to. To understand the meaning of a pronoun you
have to look at the context. In the context, this pronoun will be mapped to a particular
individual. So, he is mapped to some member of the domain of discourse, which in this case
is the set of all people.

(Refer Slide Time: 21:04)

So, once we have given an interpretation and a context, we can talk about the meaning of
various formulae and terms. First, let us consider the meaning of terms, the meaning of
variable x i in an interpretation and context is defined solely by the context. That is, as I
mentioned the meaning of a pronoun will be given by the context in which the use of the
pronoun occurs. Whereas, the semantics of a 0 variable function symbol would be specified
by the interpretation itself.

The interpretation would say which individual each proper noun refers to. Next, we consider
an n-ary function symbol being applied to n terms. The semantics of such a term is defined
inductively. First, we apply the interpretation on the n-ary function symbol, which will give
us an n-ary function. This is applied on the n tuple of individuals that we obtain by applying
the same semantic function s I c on these terms.

So, s I c term 1 will give us an individual, the individual who is referred to by term 1. s I c
term 2 will give us another individual. The individual referred to by term 2 and so on. We
collect all these individuals. Form an n tuple of them and on this n tuple, we apply the n
variable function which is the mapping of F n i, that would give us an individual, that
individual is the person who is referred to by F n i on term 1 through term n. So, this is how
we would synthesize the meaning of a term.
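
The inductive definition can be sketched as a short recursive function in Python. This is an illustration only: terms are encoded as tuples, the context is a dictionary from variable indices to individuals, and the function symbols zero, succ and plus over the natural numbers are assumptions chosen for the example.

```python
def eval_term(term, functions, context):
    """Return the individual named by `term`, playing the role of s I c."""
    if term[0] == "var":
        # The meaning of a variable comes solely from the context.
        return context[term[1]]
    # Otherwise the term is ("fn", name, arg1, ..., argn).
    _, name, *args = term
    # First find the individuals named by the argument terms ...
    values = [eval_term(arg, functions, context) for arg in args]
    # ... then apply the function that the interpretation assigns to the symbol.
    return functions[name](*values)

functions = {
    "zero": lambda: 0,           # a constant symbol
    "succ": lambda x: x + 1,
    "plus": lambda x, y: x + y,
}
context = {1: 4, 2: 7}           # x1 is mapped to 4, x2 to 7

# plus(x1, succ(zero))
term = ("fn", "plus", ("var", 1), ("fn", "succ", ("fn", "zero")))
print(eval_term(term, functions, context))
```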

(Refer Slide Time: 23:37)

Now, coming to the semantics of formulae, for n greater than 0 and i greater than or equal to
0, when we consider an atomic formula of the form P n i applied on term 1 through term n, we
would say that this is 1 precisely when the n tuple obtained by applying the meaning function
on these terms belongs to the n-ary relation to which the predicate symbol P n i is mapped. If
this is the case, we would say yes for P n i on term 1 through term n. Otherwise we will say
no. So, this is how we define the meaning of all atomic formulae.

(Refer Slide Time: 24:55)

Now, let us consider a general formula. A general formula could be a negation of another
formula. The semantics of not alpha, exactly as in the case of propositional calculus, will be
1 minus the semantics of alpha in the context of I and c. The semantics of alpha implies beta
with reference to the interpretation context pair again would be, as in propositional
calculus, 1 minus the product of the meaning of alpha and (1 minus the meaning of beta). So,
you can see that alpha implies beta is true in the interpretation context pair if and only if
alpha is false or beta is true, exactly as it should be.
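
These two arithmetic rules are easy to tabulate. The following Python sketch, with made-up function names, prints the four rows for implication computed as 1 minus a times (1 minus b).

```python
def neg(a):
    # Meaning of "not alpha" from the meaning a of alpha (0 or 1).
    return 1 - a

def implies(a, b):
    # Meaning of "alpha implies beta" from the meanings a and b.
    return 1 - a * (1 - b)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, implies(a, b))
```

The only row that gives 0 is the one where alpha is 1 and beta is 0.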

(Refer Slide Time: 26:14)

Now, coming to the last rule of formula synthesis, what would be the meaning of for all x i
alpha? We say that this is 1 if and only if, for every context c prime such that c and c prime
disagree on x j only if j equals i, the meaning of alpha in I and c prime is 1. So what does
it say? We are now standing in context c. In context c, the variables x 1, x 2, x 3 etc., x i,
x i plus 1 etc. are mapped to individuals d 1, d 2, d 3 etc. from D. That is what the context
does: it maps each variable to an individual. Now, we are looking at a context c prime which
is identical to the context c except possibly for variable x i. What c prime should do is to
map every variable except x i to the same individual as c does, but c prime could map x i to a
different individual d i prime.

So, c prime with respect to c, we say, is a one change world. The world view of c prime is
almost identical to that of c, except that there is one change: the pronoun x i possibly
refers to a different individual, that is, it might refer to d i prime instead of d i, which
is what it refers to in c. So, c prime is a one change world. So what do we say here? What we
say is that, for every one change world of c, the statement alpha must be true. This means
that when you stand in context c with respect to the interpretation I, you should be able to
say that, irrespective of the meaning of variable x i, alpha must be true. That is precisely
what we try to say here.

What we say is that alpha is true for every individual x i. In this context x i has a certain
meaning, but what we want to say is that even if the meaning of x i changes, so long as
everything else remains the same, alpha will still be true. In whichever way x i changes,
alpha will still be true. So, in all one change worlds imaginable from context c, where only
the mapping of x i changes, alpha will still be true. If that is the case, then the meaning of
for all x i alpha will be 1, otherwise it will be 0. This is precisely our understanding of
universal quantification.
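
Over a finite domain, the one change worlds can simply be enumerated, which gives a small evaluator for formulae. The sketch below is an illustration under that finiteness assumption; the tuple encoding, the dictionary layout of the interpretation and the symbols zero and leq are all made up for the example.

```python
def eval_term(t, I, c):
    if t[0] == "var":
        return c[t[1]]
    _, name, *args = t
    return I["functions"][name](*[eval_term(a, I, c) for a in args])

def eval_formula(f, I, c):
    tag = f[0]
    if tag == "atom":
        _, name, *terms = f
        tup = tuple(eval_term(t, I, c) for t in terms)
        return 1 if tup in I["relations"][name] else 0
    if tag == "not":
        return 1 - eval_formula(f[1], I, c)
    if tag == "implies":
        return 1 - eval_formula(f[1], I, c) * (1 - eval_formula(f[2], I, c))
    if tag == "forall":
        _, i, body = f
        # {**c, i: d} is a one change world: it agrees with c everywhere
        # except possibly on variable x_i, which is now mapped to d.
        worlds = ({**c, i: d} for d in I["domain"])
        return 1 if all(eval_formula(body, I, w) == 1 for w in worlds) else 0
    raise ValueError(f"unknown formula tag: {tag}")

D = {0, 1, 2}
I = {
    "domain": D,
    "functions": {"zero": lambda: 0},
    "relations": {"leq": {(a, b) for a in D for b in D if a <= b}},
}

zero_is_least = ("forall", 1, ("atom", "leq", ("fn", "zero"), ("var", 1)))
all_below_x2 = ("forall", 1, ("atom", "leq", ("var", 1), ("var", 2)))

print(eval_formula(zero_is_least, I, {1: 2, 2: 2}))
print(eval_formula(all_below_x2, I, {1: 0, 2: 2}))
print(eval_formula(all_below_x2, I, {1: 0, 2: 0}))
```

Note that the value of all_below_x2 depends on what the context assigns to x 2 but not on what it assigns to x 1, because x 1 is the quantified variable.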

(Refer Slide Time: 29:48)

We say that alpha is true in an interpretation I if and only if for every context c, alpha of
I c is 1. That is, irrespective of the context c, alpha is true. In that case we say that
alpha is true in I. Analogously, we say that alpha is false in I if for every context c, alpha
of I c is 0. But of course you can see that for a formula alpha and an interpretation I, alpha
might neither be true in I nor be false in I. How is that possible?

It could be that alpha of I c is 1 for some context c, but for the same interpretation, if you take
another context alpha might be 0. In that case alpha is neither true in this interpretation nor
false in this interpretation.
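
A tiny Python sketch can show all three situations. As assumptions for the illustration, the interpretation is fixed, the domain is the finite set 0, 1, 2, each formula is represented directly by a Python function from a context to 0 or 1, and contexts are restricted to the variables that actually occur.

```python
from itertools import product

def contexts(domain, variables):
    # Enumerate every way of mapping the listed variables into the domain.
    for values in product(sorted(domain), repeat=len(variables)):
        yield dict(zip(variables, values))

def status_in(formula, domain, variables):
    values = {formula(c) for c in contexts(domain, variables)}
    if values == {1}:
        return "true in I"
    if values == {0}:
        return "false in I"
    return "neither"

D = {0, 1, 2}
is_zero = lambda c: 1 if c["x1"] == 0 else 0         # "x1 is zero"
self_leq = lambda c: 1 if c["x1"] <= c["x1"] else 0  # "x1 is at most x1"
not_self_leq = lambda c: 1 - self_leq(c)             # its negation

print(status_in(self_leq, D, ["x1"]))
print(status_in(not_self_leq, D, ["x1"]))
print(status_in(is_zero, D, ["x1"]))
```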

(Refer Slide Time: 31:07)

We say that alpha is satisfiable if there exists an interpretation context pair such that
alpha is true in that interpretation context pair. Generalizing on this notion, we say that a
set of formulae gamma is satisfiable if there is an interpretation context pair such that
every formula alpha in gamma is satisfied by this interpretation context pair. In that case,
we say that gamma is satisfiable.

(Refer Slide Time: 32:11)

Analogous to what we did in propositional calculus, we can now define the notion of a
logical consequence. We say that alpha logically implies beta, if every interpretation context
pair which makes alpha true will also make beta true. In this case, we will also say that beta
is a logical consequence of alpha.

(Refer Slide Time: 32:50)

In particular, we say that alpha is logically valid, if every interpretation context pair makes
alpha true. Alpha has to be true on every interpretation context pair, that is when alpha is
logically valid. This is analogous to the notion of a tautology in propositional calculus. We
write like this to indicate that alpha is logically valid.

(Refer Slide Time: 33:29)

Again, analogous to what we did in propositional calculus, we can prove this theorem. If
alpha is a logical consequence of gamma and alpha implies beta is a logical consequence of
gamma, then beta is a logical consequence of gamma. The proof is quite similar. If alpha is a
logical consequence of gamma, then in any interpretation context pair which makes every
formula of gamma true, alpha is also true.

So, consider some interpretation context pair in which the whole of gamma is true. Every
formula in gamma is true in the interpretation context pair I c. So, in this interpretation
context pair, alpha is true because alpha is a logical consequence of gamma. Alpha implies
beta is true because, that is also a logical consequence of gamma. Then, necessarily beta has
to be true. Because if beta were false, then we would have that alpha is true and beta is false,
in which case alpha implies beta would have to be false. Therefore, beta is necessarily true.

(Refer Slide Time: 34:59)

As a corollary to this, we can argue that if alpha is logically valid and alpha implies beta
is also logically valid, then beta is logically valid. So, now, analogous to the case of
propositional calculus, we can pose these questions. Given a formula alpha, is alpha logically
valid? In the case of propositional calculus, we could draw the truth table for alpha and
check whether alpha evaluated to 1 in every single assignment. But we cannot do that here. To
show that alpha is logically valid we have to look at every possible interpretation context
pair for the system, but there could be an infinite number of such interpretation context
pairs. So, we do not have an analogous semantic procedure here. The truth table method is a
semantic method, because it deals entirely with semantic entities, that is, the truth values.

In a truth table, what we do is to consider every possible assignment, the assignment is a set
of truth values and then the function is evaluated for this particular assignment. So, we deal
entirely with semantic entities. An analogous semantic method is not available for first order
logic, that is, because, the space that we are looking at is infinite.

(Refer Slide Time: 36:56)

Therefore, we require a proof system. A proof system is a syntactic rewriting system and we
have proofs in the system. The proofs are exactly analogous to what we saw in propositional
calculus. We have a sequence of formulae, beta 1 through beta n, where beta 1 is an axiom.
Every subsequent beta is either an axiom or follows from previous formulae by some rule of
inference. Such a sequence of formulae is called a proof. What we want is this: for every
statement which is logically valid, we should be able to start from a set of logical axioms
and culminate in this formula, through a proof.

So, the proof is witness to the fact that this statement is logically valid. We would be very
happy if every logically valid statement is provable in this fashion and everything that is
provable is logically valid; that is, if our system is both sound and complete. The system is
sound if everything that the system proves is logically valid, and the system is complete if
the system is capable of proving everything that is logically valid.

(Refer Slide Time: 38:53)

We would also ask questions of this form. Given a set of formulae gamma and a formula
alpha, is it the case that alpha is a logical consequence of gamma? Here also, we would
require a proof system. The proof system would help us in answering this question. What we
do is this. We take a set of logical axioms and the set gamma as a set of proper axioms and
then from this, we write a proof exactly as before, only that, now formulae in gamma could
also be used as axioms.

So, beta 1 is necessarily an axiom, and every subsequent formula is either an axiom or
follows from some previous formulae by some rule of inference. So, you obtain these formulae
either by a rule of inference or by invoking an axiom.

(Refer Slide Time: 40:00)

So, the proof system that we have in mind, in fact has these components. It has logical
axioms and it has the rules of inference and it has a set of proper axioms. The set of proper
axioms is rather like a plug-in. You change the set of proper axioms, you will have a different
proof system and you will have different consequences. What we want is that this proof
system is both sound and complete.

We would be able to say that this proof system is sound if, whenever alpha is proved in this
proof system, alpha is a logical consequence of gamma. When I write like this, what I mean is
that alpha is provable from gamma; that is, when I use gamma as the plug-in here, as the set
of proper axioms, then I will be able to derive alpha in this proof system. That is what the
assertion alpha is provable from gamma means: there exists a proof of alpha starting from the
proper axiom set gamma.

So, this asserts that if alpha is provable from gamma then alpha is a logical consequence. So,
this is an assertion of soundness. It says that, whatever we prove is sound. The converse of
this says that, if alpha is a logical consequence of gamma then, alpha is provable. This is the
statement of completeness of the proof system. This is what we would desire of the proof
system. We would want the proof system to be both sound and complete.

In that case, we would be able to reach conclusions that are logical consequences without
dealing with semantic entities. Since the semantic entities form an infinite space, we cannot
in any case have an algorithm which is analogous to the truth table method of propositional
calculus. Therefore, we do need a syntactic method, and the proof system will function as a
syntactic method if it is both sound and complete.
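
Written symbolically, the two properties read as follows. The turnstile symbols are the usual textbook notation and are an assumption here, since the transcript does not show what is written on the slide: the single turnstile stands for "is provable from" and the double turnstile for "is a logical consequence of".

```latex
% Soundness: whatever is provable from \Gamma is a logical consequence of \Gamma.
\Gamma \vdash \alpha \;\Longrightarrow\; \Gamma \models \alpha

% Completeness: whatever is a logical consequence of \Gamma is provable from \Gamma.
\Gamma \models \alpha \;\Longrightarrow\; \Gamma \vdash \alpha
```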

(Refer Slide Time: 42:46)

Now, let us see one such proof system, a proof system for first order logic. First, let us
see the set of logical axioms. The first three logical axiom schemas are identical to those of
propositional calculus. That is why I said that the first order logic system is an extension
of the propositional calculus system. So, the first axiom schema says that alpha implies (beta
implies alpha). The second axiom schema says that (alpha implies (beta implies gamma)) implies
((alpha implies beta) implies (alpha implies gamma)). And the third axiom schema says that
(not alpha implies not beta) implies ((not alpha implies beta) implies alpha). In other words,
if the negation of alpha implies both beta and not beta, which would be an inconsistency, then
alpha must be true. So, these three axiom schemas are exactly the same as in the propositional
calculus proof system.

(Refer Slide Time: 44:24)

Now, we have some axioms which are different. The fourth axiom says that for all xi alpha of
xi implies alpha of t. For all xi alpha of xi asserts that alpha is true for every xi. If this
is the case, then alpha is true for the individual referred to by term t, where t is a term
that is free for xi in alpha of xi. What this means is that, when term t is substituted for
every free occurrence of xi within alpha of xi, none of those substitutions should introduce a
variable that is caught by a quantifier in alpha. So, we say that t is free for xi in alpha if
no variable in t will be captured by a quantifier in alpha when t is substituted for the free
occurrences of xi in alpha of xi. Let us take an example for this.

(Refer Slide Time: 46:35)

As an example, consider this. Suppose alpha of xi is this formula: for all xj, xi is not older
than xj. So this formula asserts that xi is not older than anybody; in that sense, xi is at
least as young as anybody in the group. Now, let us consider a term t, which is father of xj.
This term is to be substituted for every free occurrence of xi in this formula. In this
formula there is a bound occurrence of xj, which is bound to the quantifier, but the
occurrence of xi is free. Substituting t for every free occurrence of xi would give us this
formula: for all xj, father of xj is not older than xj, which is a funny statement in the
normal interpretation of this world, and this was certainly not what was intended. So, if you
were to substitute this term in a formula, we should make sure that no quantifier existing in
that formula will capture the free variable in the term.

So, this substitution is happening in some context in which xj has some meaning: xj is
mapped to some individual, as defined by the context, and therefore after the
substitution also we should let the context define the meaning of this particular xj. But when
the substitution happens here, we find that this free occurrence of xj is caught by the
already existing quantifier. To avoid this pitfall, what we should do is change the
variable name. The bound variable xj can be changed to xj prime. Once you do this,
the association is not made. So what does it say now? It says that father of xj is not older
than anybody. We might be considering a pool of fathers, and what we are asserting is
that father of xj is possibly the youngest in this group, that is, at least as young as anybody
in this group. That meaning makes sense, but it requires a change of variable.
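
The condition "t is free for xi in alpha" can be checked mechanically. What follows is only a minimal sketch, assuming a toy representation of terms and formulae as nested tuples; the function names `term_vars` and `free_for` and the tuple layout are illustrative assumptions, not something given in the lecture.

```python
def term_vars(t):
    """Set of variable names occurring in a term.
    Terms are ("var", name) or ("fn", name, (arg, ...))."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(term_vars(a) for a in t[2]))

def free_for(t, x, f, bound=frozenset()):
    """True if term t can replace the free occurrences of x in f without capture.
    `bound` holds the variables quantified on the way down to the current subformula."""
    kind = f[0]
    if kind == "pred":                       # ("pred", name, (term, ...))
        occurs = any(x in term_vars(a) for a in f[2])
        captured = bool(term_vars(t) & bound)
        return not (occurs and captured)
    if kind == "not":                        # ("not", f)
        return free_for(t, x, f[1], bound)
    if kind == "imp":                        # ("imp", f, g)
        return free_for(t, x, f[1], bound) and free_for(t, x, f[2], bound)
    if kind == "forall":                     # ("forall", var, f)
        if f[1] == x:                        # x is bound below: nothing is substituted
            return True
        return free_for(t, x, f[2], bound | {f[1]})
    raise ValueError(kind)

# alpha(xi): for all xj, xi is not older than xj
older = ("pred", "older", (("var", "xi"), ("var", "xj")))
alpha = ("forall", "xj", ("not", older))
father_xj = ("fn", "father", (("var", "xj"),))

# The same formula with the bound variable renamed to xj'
older_renamed = ("pred", "older", (("var", "xi"), ("var", "xj'")))
alpha_renamed = ("forall", "xj'", ("not", older_renamed))
```

With these definitions, `free_for(father_xj, "xi", alpha)` reports a capture, while `free_for(father_xj, "xi", alpha_renamed)` does not, which matches the change of variable described above.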

So, what the axiom that we have seen asserts is that if alpha is true for everybody, then alpha
must be true for any particular individual too. So, this allows for
particularizations. When you have a statement which is universally quantified, you will
be able to particularise the statement for a certain individual drawn from the domain of
discourse.

(Refer Slide Time: 50:22)

Now, we have two more axioms. Axiom A5 is a formula that we have seen before: (for all xi,
(alpha implies beta)) implies ((for all xi, alpha) implies (for all xi, beta)). In an earlier
class, we showed that this is a logically valid formula, so this is our axiom A5. Then axiom A6
says alpha implies for all xi alpha, if xi is not free in alpha, which allows for generalizations
of formulae. These axioms, A1 through A6, are axiom schemas. They use variables alpha, beta
and gamma that stand for any well-formed formulae. If you substitute appropriate well-formed
formulae for alpha, beta and gamma, then you have instances of these axioms. So,
we have a countably infinite number of axioms obtainable in this manner.
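
For reference, the six schemas can be written symbolically as below. A2 through A6 transcribe what is said in this lecture. A1 is not spelled out here, so the form shown for it is an assumption: it is the usual first schema of propositional proof systems of this style.

```latex
\begin{align*}
\text{A1 (assumed):}\quad & \alpha \to (\beta \to \alpha) \\
\text{A2:}\quad & (\alpha \to (\beta \to \gamma)) \to ((\alpha \to \beta) \to (\alpha \to \gamma)) \\
\text{A3:}\quad & (\neg\alpha \to \neg\beta) \to ((\neg\alpha \to \beta) \to \alpha) \\
\text{A4:}\quad & \forall x_i\, \alpha(x_i) \to \alpha(t), \quad t \text{ free for } x_i \text{ in } \alpha(x_i) \\
\text{A5:}\quad & \forall x_i\, (\alpha \to \beta) \to (\forall x_i\, \alpha \to \forall x_i\, \beta) \\
\text{A6:}\quad & \alpha \to \forall x_i\, \alpha, \quad x_i \text{ not free in } \alpha
\end{align*}
```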

(Refer Slide Time: 51:53)

And we have one rule of inference, exactly as in the case of the proof system for
propositional calculus. Here also we have only one rule of inference, namely alpha and alpha
implies beta, together allowing us to write beta in the proof. So, if you have alpha in the proof
and alpha implies beta in the proof, then you will be justified in writing beta as the next step
in the proof. So this rewriting rule is called Modus Ponens.

So, we are considering a proof system, in which the logical axioms are the axioms derivable
using the templates A1 through A6, Modus Ponens is the only rewriting rule and then we have
a set of proper axioms gamma. From one proof system to another, gamma could be different
but the other components will all remain the same.

Now, we would like to assert that this is a sound and complete system. That is, for any gamma
plugged in here, whatever is provable in this system is a logical consequence of gamma, and
every logical consequence of gamma is indeed provable in the system, which would
indeed be a nice property. So, this is what we would like to establish. Okay, that is it from
this lecture. Hope to see you in the next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 7
Mathematical Logic

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the seventh lecture on
Mathematical Logic.

(Refer Slide Time: 00:41)

In the last class, we were talking about a proof system for first order logic. The proof system
consists of a set of logical axioms. We saw six templates for forming logical axioms. Three of
them were identical to the logical axioms of the propositional calculus, and then we had three
additional logical axioms for first order logic. So these form the logical axioms of first order
logic. And then, we have one rule of inference. Exactly as in the case of propositional
calculus, the rule of inference that we have is modus ponens, which says that, if we have
alpha and alpha implies beta, then we can prove beta.

(Refer Slide Time: 02:00)

What this means is that, given a set of formulae, gamma, which will form the set of proper
axioms, we have a system like this. We have the logical axioms and we have the rule of
inference modus ponens and we have a set of proper axioms gamma. So here gamma is rather
a plug in. You can change the set of proper axioms that you have. When you change gamma,
the conclusions would change. Then, with this system, with gamma, logical axioms and
modus ponens, you can write what are called proofs.

A proof is a sequence of statements, or well-formed formulae, such that the first statement of
the proof is an axiom. This could be either a logical axiom or a proper axiom, and the
subsequent statements are either axioms or obtainable from the previous statements by
modus ponens. But, of course, since modus ponens needs two formulae to be
applicable, we know that the second statement, beta 2, also is an axiom. So, in any proof, the
first two statements are axioms. The remaining statements are all axioms or obtainable by
modus ponens.

(Refer Slide Time: 03:45)

That is, we are visualizing sequences of this sort: beta 1 and beta 2 are axioms. Any beta i is
either an axiom or is obtained from some earlier beta j and beta k, where beta k is beta j
implies beta i. These two together will provide beta i. So, every statement is obtainable in
this manner. Such a sequence of well-formed formulae is called a proof.
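
The definition of a proof can be turned into a small checker. This is a minimal sketch under simplifying assumptions: formulae are plain strings or tuples, an implication is the tuple `("imp", a, b)`, and the axioms (logical axiom instances together with gamma) are handed over as an explicit finite set. The names `imp` and `is_proof` are made up for this illustration.

```python
def imp(a, b):
    """Build the formula 'a implies b' as a tuple."""
    return ("imp", a, b)

def is_proof(steps, axioms):
    """Check that every step is an axiom or follows from two earlier steps by modus ponens."""
    for i, beta in enumerate(steps):
        earlier = steps[:i]
        # modus ponens: some earlier a together with an earlier 'a implies beta'
        by_mp = any(imp(a, beta) in earlier for a in earlier)
        if beta not in axioms and not by_mp:
            return False
    return True

# A toy set of axioms and a proof culminating in r
axioms = {"p", imp("p", "q"), imp("q", "r")}
proof = ["p", imp("p", "q"), "q", imp("q", "r"), "r"]
```

Here `is_proof(proof, axioms)` holds, and so does the check on every prefix of `proof`, which is the sense in which every statement in a proof is a theorem.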

(Refer Slide Time: 04:29)

Any statement in a proof, that is, any well-formed formula such that a proof ends in it, is a
theorem. So, if beta 1 through beta n is a proof, then all of these statements beta 1 through
beta n are theorems. That is because you can stop the proof at any point: beta 1 is a proof,
beta 1 beta 2 is a proof, beta 1 beta 2 beta 3 is a proof.

In other words any prefix of the sequence is a proof. Therefore the statements at which these
proofs culminate are all theorems. That is, beta 1 beta 2 beta 3 etc. beta n, are all theorems.
Every statement in a proof is a theorem. But usually, we attach the word theorem to a
significant conclusion. But in the case of logic, that is not the case. Any statement in a proof
is a theorem.

(Refer Slide Time: 05:54)

Then, we spoke about models. An interpretation context pair is what breathes life
into the syntactic system we have laid out. An interpretation context pair is a model for the
set of proper axioms gamma, if gamma is true in that pair. In other words, every well-formed
formula in gamma is true under that interpretation context pair. In that case, we say that this
interpretation context pair is a model for gamma.

(Refer Slide Time: 06:54)

Then the question is this. Given gamma, and given the models for gamma, which statements,
that is, which well-formed formulae, are true in all of these models? These are
exactly the logical consequences of gamma. What is a logical consequence of a formula? Beta
is a logical consequence of alpha if beta is true whenever alpha is true.

That is, any interpretation context pair which makes alpha true will also make beta true. That
is when we say that beta is a logical consequence of alpha. Now, when we are given a set of
formulae gamma, we say that beta is a logical consequence of gamma if any interpretation
context pair which makes every single statement of gamma true will also make beta true. So
that is precisely what we are saying here. A model for gamma is an interpretation context
pair which makes every statement of gamma true. If every such model also makes a
statement alpha true, then we say alpha is a logical consequence of gamma.
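
For first order logic the models of gamma cannot simply be listed, but the definition can be tried out in the propositional setting, where truth assignments play the role of interpretation context pairs. The sketch below is only that propositional analogue; the names `models` and `is_consequence` are assumptions made for the example.

```python
from itertools import product

def models(gamma, atoms):
    """Yield every truth assignment over `atoms` that makes all formulae in gamma true."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in gamma):
            yield v

def is_consequence(gamma, alpha, atoms):
    """alpha is a logical consequence of gamma if it is true in every model of gamma."""
    return all(alpha(v) for v in models(gamma, atoms))

atoms = ["p", "q"]
p = lambda v: v["p"]
q = lambda v: v["q"]
p_implies_q = lambda v: (not v["p"]) or v["q"]
gamma = [p, p_implies_q]
```

With this gamma, `is_consequence(gamma, q, atoms)` holds, whereas q is not a logical consequence of p alone.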

(Refer Slide Time: 08:31)

In notation, we write it in this manner: alpha is a logical consequence of gamma. Now the
question is, given alpha, how do we check this? This is where the proof system comes in handy.
When the proof system furnishes us with a proof that culminates in alpha, we
would say that alpha is provable from gamma. That is, using gamma and the logical axioms,
we are capable of proving alpha.

Now, we would like provability to be identical to logical consequence. That is, we compare the
semantic notion of logical consequence and the syntactic notion of provability. Proving is a
strictly syntactic process. We are merely looking at the forms of the statements and rewriting
them. Using this syntactic process, we arrive at alpha. Logical consequence, on the other
hand, is a semantic notion. We want the syntactic notion of provability and the
semantic notion of logical consequence to coincide. That is when we say that the proof
system is sound and complete.

(Refer Slide Time: 09:58)

Now, let us consider one concrete example of proper axioms. Let us consider a system of
logic in which there is one predicate symbol, which is the equality symbol, and then there are
function symbols. There is one constant symbol, or zero-argument function symbol, which is Z,
and there is a one-variable function symbol, which is S. There are two two-variable function
symbols, which are A and M.

Let us say these are the only symbols that the system has: the function symbols and the
predicate symbols. So these along with the logical connectives, quantifiers, brackets will
form the alphabet of the language.

(Refer Slide Time: 11:19)

And the set gamma consists of the following axioms. Axiom S1 is this. It is essentially a
statement about equality. It is about the transitivity of equality. What it says is that, for every
x1, x2 and x3, if x1 is equal to x2 and x1 is equal to x3, then x2 is equal to x3. We will use this
as a short form for this. So, whenever I write like this, you should understand that this is what
I mean. The second axiom is also about equality, and it is about the S function. The S
function can be thought of as the successor function; we will call it the successor function.

So what it says is that, for every x1 and x2, if x1 is equal to x2, then the successor of x1 is the
same as the successor of x2, which means every number has a unique successor. Then we
say that Z is not the successor of anyone; that is what the third axiom is. The fourth axiom
asserts that, for every x1 and x2, if the successor of x1 and the successor of x2 are the same,
then x1 and x2 are the same. This is the converse of the second axiom.

(Refer Slide Time: 13:29)

Then, the fifth axiom asserts that, for every x1, A applied on x1 and Z gives us x1. The sixth
axiom says that, for every x1 and x2, A of x1 and the successor of x2 is the same as the
successor of A of x1 and x2. Now, probably you can guess where we are headed. What is the
meaning of A? In the seventh axiom, we assert that, for every x1, the M function applied on x1
and Z gives us Z. The eighth axiom says that, for every x1 and x2, M on x1 and the successor of
x2 is the same as A applied on M of x1 and x2, and x1.

(Refer Slide Time: 15:05)

And the final axiom S9 is in fact an axiom schema. It says that, for any well-formed formula
alpha with one free variable x, alpha of Z implies that ((for all x, alpha of x implies alpha of
successor of x) implies for all x, alpha of x). So, let us say gamma is this set of proper axioms.
Now, we have written gamma with the intention of interpreting it in a particular way. For
example, we would like to interpret Z as 0, we would like to interpret A as the addition
function and M as the multiplication function, and the successor function as the plus one
function, in the sense that the successor of seven is eight.
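
Collected in symbols, the nine proper axioms described in this lecture read as follows. The conjunction in S1 is used as a shorthand and is an assumption about how the slide writes it; the same axiom can be stated with nested implications.

```latex
\begin{align*}
\text{S1:}\quad & \forall x_1 \forall x_2 \forall x_3\, \big((x_1 = x_2 \land x_1 = x_3) \to x_2 = x_3\big) \\
\text{S2:}\quad & \forall x_1 \forall x_2\, \big(x_1 = x_2 \to S(x_1) = S(x_2)\big) \\
\text{S3:}\quad & \forall x_1\, \neg\big(Z = S(x_1)\big) \\
\text{S4:}\quad & \forall x_1 \forall x_2\, \big(S(x_1) = S(x_2) \to x_1 = x_2\big) \\
\text{S5:}\quad & \forall x_1\, A(x_1, Z) = x_1 \\
\text{S6:}\quad & \forall x_1 \forall x_2\, A(x_1, S(x_2)) = S(A(x_1, x_2)) \\
\text{S7:}\quad & \forall x_1\, M(x_1, Z) = Z \\
\text{S8:}\quad & \forall x_1 \forall x_2\, M(x_1, S(x_2)) = A(M(x_1, x_2), x_1) \\
\text{S9:}\quad & \alpha(Z) \to \Big(\forall x\, \big(\alpha(x) \to \alpha(S(x))\big) \to \forall x\, \alpha(x)\Big)
\end{align*}
```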

(Refer Slide Time: 16:27)

Let us take another look at the axioms, with this interpretation in mind. So, the first axiom
says that equality is transitive. The second axiom says that, for every x1 and x2, if x1 is equal
to x2, then S of x1 is equal to S of x2, or in other words, if x1 is equal to x2 then x1 plus one is
the same as x2 plus one. The third axiom says that 0 is not the successor of anyone, and the
fourth axiom says that, if x1 and x2 have the same successor, then x1 is equal to x2. In other
words, if x1 plus 1 is equal to x2 plus 1, then you can cancel one from both sides and get x1
equal to x2.

(Refer Slide Time: 17:09)

The fifth axiom says that when 0 is added to x1, we get x1. That is, 0 is the identity of
addition. In the sixth axiom, we say that x1 added to the successor of x2, that is, x1 plus (x2
plus 1), is the same as (x1 plus x2) plus 1. From this you can essentially derive the
associativity of addition. In the seventh axiom, we have multiplication with 0. We know that 0
is the zero of multiplication, that is, x1 into Z is Z. In the eighth axiom, we have
distributivity: x1 multiplied by (x2 plus 1) is the same as x1 x2 plus x1. From this, we can
derive the general form of distributivity.
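
Axioms S5 through S8 can be read as recursive definitions of addition and multiplication. The following minimal sketch does exactly that on numerals built from Z and S. Representing a numeral as a nested tuple, and the helper names `numeral` and `value`, are choices made only for this illustration.

```python
Z = ()                              # the numeral Z, represented as an empty tuple

def S(x):
    """Successor: wrap the numeral in one more layer."""
    return (x,)

def A(x1, x2):
    """Addition by recursion on x2, following S5 and S6."""
    if x2 == Z:
        return x1                   # S5: A(x1, Z) = x1
    return S(A(x1, x2[0]))          # S6: A(x1, S(x2)) = S(A(x1, x2))

def M(x1, x2):
    """Multiplication by recursion on x2, following S7 and S8."""
    if x2 == Z:
        return Z                    # S7: M(x1, Z) = Z
    return A(M(x1, x2[0]), x1)      # S8: M(x1, S(x2)) = A(M(x1, x2), x1)

def numeral(n):
    """Build S(S(...S(Z)...)) with n applications of S."""
    x = Z
    for _ in range(n):
        x = S(x)
    return x

def value(x):
    """Read a numeral back as a Python int by counting the layers."""
    n = 0
    while x != Z:
        x, n = x[0], n + 1
    return n
```

For example, `value(A(numeral(2), numeral(3)))` evaluates to 5 and `value(M(numeral(2), numeral(3)))` evaluates to 6, in line with the intended reading of A as addition and M as multiplication.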

(Refer Slide Time: 18:00)

And the ninth axiom is in fact the principle of induction. For any well-formed formula alpha
of x, if you can show that alpha is true for 0, and also that, for any x, if alpha is true for x
then alpha is true for x plus 1, then we can argue that alpha is true
for every x. So this is the principle of induction. All these axioms put together form our
set gamma.

(Refer Slide Time: 18:33)

So gamma is called Peano’s axioms, and forms the basis for the theory of natural numbers. So,
we can think of an interpretation I0 consisting of D0, F0 and R0, where D0 is the set of natural
numbers. F0 maps the function symbols. It maps Z to 0, A to the addition function, M to the
multiplication function and the one-variable function symbol S to the increment function,
plus 1. So that is what F0 does.

(Refer Slide Time: 19:35)

R0 maps the equality symbol to the identity relation, which consists of
pairs of the form (x, x). So this way, we interpret all the symbols, the function symbols and
the predicate symbols, of the language. This interpretation is called the standard model for our
first order system S. S is the first order system with Peano’s axioms as the proper axioms.
That is, when within the proof system we plug in gamma, the system of Peano’s axioms,
what we get is the first order system S.

This interpretation, where D0 is defined as the set of all natural numbers and F0 and R0 are
defined in this fashion, is called a standard model for S. In fact, any interpretation which is
isomorphic to this interpretation is also called a standard model. But let us not
bother about isomorphism at this point.

So this is what we will call a standard model for S. We call this a model because, we can
guess, we can check that every one of S1 through S8 and instances of S9 are true in this
interpretation. That is why we say that, I0 is a model for S. So that is an example of a first
order proof system with a concrete gamma.
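
The claim that S1 through S8 hold in the standard interpretation can be spot-checked on a small sample of natural numbers. The sketch below uses Python integers for D0 and ordinary arithmetic for F0. The sample `N` is an assumption made to keep the check finite, so this is a sanity check rather than a proof, and the schema S9 is left out.

```python
from itertools import product

N = range(6)                       # a small sample of D0; the real domain is infinite
zero = 0                           # F0(Z)
succ = lambda x: x + 1             # F0(S)
add = lambda x, y: x + y           # F0(A)
mul = lambda x, y: x * y           # F0(M)

pairs = list(product(N, repeat=2))
triples = list(product(N, repeat=3))

checks = {
    "S1": all(not (a == b and a == c) or b == c for a, b, c in triples),
    "S2": all(a != b or succ(a) == succ(b) for a, b in pairs),
    "S3": all(zero != succ(a) for a in N),
    "S4": all(succ(a) != succ(b) or a == b for a, b in pairs),
    "S5": all(add(a, zero) == a for a in N),
    "S6": all(add(a, succ(b)) == succ(add(a, b)) for a, b in pairs),
    "S7": all(mul(a, zero) == zero for a in N),
    "S8": all(mul(a, succ(b)) == add(mul(a, b), a) for a, b in pairs),
}
```

Every entry of `checks` comes out true on this sample.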

(Refer Slide Time: 21:41)

Now, let us go back to the original question. We have a question of this form. We are given
gamma and we are given an alpha. The question we are really interested in is this:
is alpha a logical consequence of gamma? We have designed a proof system
essentially to answer this question, and what we want to know is this. Is there any
relationship between the provability of alpha from gamma and alpha being a
logical consequence of gamma?

(Refer Slide Time: 22:27)

We would like these two notions to be identical. First, we would like the system to
be sound. If the system is sound, then anything that we prove is a logical consequence. So, if
there exists a proof culminating in alpha within this system, where each of the statements is
either an axiom or follows from two of the previous statements by modus ponens, then alpha
is provable, and then we would like to argue that alpha is a logical consequence of gamma.

So this is when we would say this system is sound. It can indeed be shown that the proof
system consisting of logical axioms and modus ponens is sound. In the sense that, you use
any plug in gamma, anything that is provable within the system would be a logical
consequence of gamma. So the system that we have been discussing is sound.

(Refer Slide Time: 23:50)

The other question is completeness. We would like to know that the system is complete.
In other words, we would like to know that, if alpha is a logical consequence of gamma, then
alpha is provable. Godel’s famous completeness theorem proves just this. It shows that this
proof system is indeed complete. That is, when you use a proof system of this sort, with
axiom schemas A1 through A6 as logical axioms and modus ponens as the rewriting rule, then
every logical consequence happens to be provable.
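
The two properties can be stated side by side. The turnstile symbols are the customary notation and are assumed here, since the slides are not reproduced in the transcript: the single turnstile stands for provability from gamma and the double turnstile for logical consequence.

```latex
\begin{align*}
\text{Soundness:}\quad & \Gamma \vdash \alpha \;\Longrightarrow\; \Gamma \models \alpha \\
\text{Completeness:}\quad & \Gamma \models \alpha \;\Longrightarrow\; \Gamma \vdash \alpha
\end{align*}
```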

(Refer Slide Time: 24:56)

So the proof system is sound and complete. But there are some remaining questions.
Consider the standard model, I0, for Peano’s system. This is just one interpretation. Suppose
this rectangle represents the set of all well-formed formulae, all
syntactically correct formulae. Let us say I0 makes some of these formulae true; in
particular, it will make all of gamma true. Consider another interpretation, I1, which will
make some other set of formulae true. Of course these two sets of formulae may overlap.
Suppose I1 is also a model for gamma, that is, gamma is true under this interpretation I1 as
well.

Now imagine a third interpretation, which will make yet another set of formulae true.
Suppose this is also a model for gamma. In this way, let me imagine all models for
gamma. There could be infinitely many models for gamma. Then, the
intersection of all of these will form the logical consequences of gamma. These are
the statements in the shaded portion of the Venn diagram.

When we have plugged in all such interpretations, we find that the intersection of all of
them is exactly the set of logical consequences of gamma. In all these interpretations, gamma
is true. Therefore, all of them are models for gamma, and in all these interpretations, this
shaded portion is also true.

Therefore, we can say that any interpretation which makes the whole of gamma true, will also
make all these formulae true. So these are the logical consequences of gamma and these are
indeed the statements which are provable within our system. So the shaded portion is what is
provable within the system.

(Refer Slide Time: 27:42)

Now, let us consider I0. We are going back to I0. We find that the shaded portion is a subset
of I0. Now, what is the circle I0? It is the set of all statements that are true in the
standard interpretation, and the shaded portion is the set of all statements that are true in
every interpretation that makes gamma true. Of course, I0 also makes gamma true, but there
could be more statements which are true in I0. In other words, the set of statements that are
true in I0 and the set of logical consequences of gamma need not be the same; there could be a
statement that is true in I0 which is not a logical consequence of gamma. Clearly the second
set is a subset of the first, but what I am saying is that the two sets need not be equal.
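
The picture of overlapping circles can be imitated with a toy propositional example, in which each truth assignment stands in for an interpretation. This is only an analogy under assumed names (`formulae`, `true_in`, `I0`); it shows how one model can make more statements true than the logical consequences of gamma, not the situation for Peano's axioms themselves.

```python
formulae = {
    "p": lambda v: v["p"],
    "q": lambda v: v["q"],
    "not q": lambda v: not v["q"],
    "p or q": lambda v: v["p"] or v["q"],
}
gamma = {"p"}                       # the proper axioms, by name

def true_in(v):
    """Names of the formulae that the valuation v makes true (one circle in the diagram)."""
    return {name for name, f in formulae.items() if f(v)}

valuations = [{"p": p, "q": q} for p in (False, True) for q in (False, True)]
model_sets = [true_in(v) for v in valuations if gamma <= true_in(v)]   # models of gamma
consequences = set.intersection(*model_sets)                          # the shaded region

I0 = true_in({"p": True, "q": True})    # one particular model, standing in for I0
```

Here `consequences` is a proper subset of `I0`: the formula q is true in `I0` but is not a logical consequence of gamma.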

(Refer Slide Time: 28:17)

The second set could be a proper subset of the first set, in which case there would be a
statement true in I0 which is not a logical consequence of gamma.

(Refer Slide Time: 29:10)

The famous Godel’s incompleteness theorem establishes just this. This theorem shows that
there is a well-formed formula that is true in I0 but is not a logical consequence of gamma,
where gamma is the set of Peano’s axioms, and so is not provable from gamma. That is
because, by the soundness and completeness of the first order system, where we obtain
completeness from Godel’s completeness theorem, what we know is that the logical
consequences of gamma are exactly the statements that are provable from gamma.

Therefore, if there is a statement that is true in I0 which is not a logical consequence of
gamma, then this statement will not be provable from gamma. But then, what is the
significance of I0? I0 is the standard interpretation. It interprets the domain of discourse as
the set of natural numbers and the function symbols and the predicate symbols in a familiar
way: we have the 0 symbol, the addition symbol, the multiplication symbol and the successor
symbol, and when you read the axioms in this sense, you realize that gamma is the set of
axioms for the theory of natural numbers.

Therefore, there is a statement which is true according to our intuitive understanding of
numbers, which is not provable from gamma. The demonstration of such a statement forms
Godel’s incompleteness theorem.

(Refer Slide Time: 31:18)

Godel's construction was roughly like this. For every syntactic entity, we can define a unique
number. This is a way of encoding the syntactic entity. For example, we can assign
a number to the opening bracket, let us say 13. To the closing bracket we can assign the
number 17, and so on. Then, combining these, we can find a way of
encoding every syntactic entity into a number.

For example, when I have a statement of this form, the corresponding number for this could
be built from the number corresponding to the opening bracket, the number corresponding
to a1, the number corresponding to implication and so on, multiplied together in a suitable
way. This is one way of encoding, and this encoding has the property that we can uniquely
decode any such given number. So, Godel demonstrated that such an encoding is possible for
the syntactic entities. Such an encoding is called a Godel numbering.
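
One way to make such an encoding uniquely decodable is the classical prime-power device: the k-th symbol's number becomes the exponent of the k-th prime, and the results are multiplied. The sketch below assumes this scheme, since the transcript does not give the details. The codes 13 and 17 come from the lecture, while the other codes in `CODES` and the function names are made up for the illustration.

```python
CODES = {"(": 13, ")": 17, "a1": 19, "a2": 23, "->": 29}   # only 13 and 17 are from the lecture

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulae)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def encode(symbols):
    """Godel number: the k-th prime raised to the code of the k-th symbol, all multiplied."""
    g = 1
    for p, s in zip(primes(), symbols):
        g *= p ** CODES[s]
    return g

def decode(g):
    """Recover the symbol sequence by reading off the exponents of successive primes."""
    names = {code: s for s, code in CODES.items()}
    out = []
    for p in primes():
        if g == 1:
            return out
        e = 0
        while g % p == 0:
            g, e = g // p, e + 1
        out.append(names[e])

formula = ["(", "a1", "->", "a2", ")"]
```

Decoding `encode(formula)` gives back `formula`, and because the position of each symbol is tied to a distinct prime, reordering the symbols changes the number.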

(Refer Slide Time: 33:00)

Once the syntactic entities are encoded into numbers, you can treat numbers as syntactic
entities. Then, when you look at the statements within the system, we find that we have
well-formed formulae with free variables. In particular, if I look at a well-formed formula
alpha with one free variable x, it seems to be saying something about x. The statement could
be something like this: x plus 1 equals 5. So this is a formula with one free variable, x. By
substituting for x, I can get various formulae.

Some of them are true while some of them are false, and so on. So, those are formulae with
one free variable. But then formulae with one free variable can now also be thought of as
speaking about syntactic entities, because now we have mapped syntactic entities into
numbers in a one to one manner. Therefore at least some of the numbers represent valid
syntactic entities. So a formula with one free variable can be thought of as talking about a
particular number. But a number can be thought of as a syntactic entity as well. So now we
have formulae talking about syntactic entities.

(Refer Slide Time: 34:35)

But what are syntactic entities? The symbols are syntactic entities: the function symbols, the
predicate symbols, open and close brackets, implication, negation, all these are syntactic
entities. Then well-formed formulae are syntactic entities, terms, which form
constituents of well-formed formulae, are syntactic entities, and we can also consider proofs
as syntactic entities. A proof is nothing but a sequence of formulae. So if you combine the
Godel numbers of the well-formed formulae belonging to a proof in an appropriate manner, we
can also devise a Godel number for a proof. So proofs are also syntactic entities. Now Godel
numbering maps all of these into the set of natural numbers in a one to one manner.
Therefore, we can now have statements that are talking about syntactic entities, which
include symbols, well-formed formulae, terms, proofs etc.

Therefore, we can have statements that are talking about provability. Roughly, a
statement that says that, for every x, y is not proved by x, or in other words, for every x,
x is not a proof of y, is essentially asserting that y is not provable. Now, this statement itself
has a Godel number, and y is a free variable in it.

(Refer Slide Time: 36:19)

So, by appropriate substitutions, we could devise an expression which essentially says, I
am not provable. This is what Godel did in his incompleteness theorem. He devised a
formula which essentially says I am not provable; that is the meaning of the formula
when you consider the standard interpretation of natural numbers. That is, from the
perspective of our understanding of natural numbers, the formula can be thought of as
meaning this. The formula is a self-referential formula; it refers to itself.

It says that there is no y which is a proof of this statement, where the statement in question
is the one whose Godel number decodes to this formula itself. But then what would be the
truth value of such a formula? Will this be provable? If this formula is provable, then by
soundness we know that it is a logical consequence and therefore it is true.

But then what does the formula say? It asserts that there is no proof of this formula itself. So
if it is true, then there is no proof, or in other words, the formula is not provable, which is a
contradiction. Put differently, if this formula is provable then what it asserts is false, but it
cannot be false because anything that is provable has to be true, by the soundness of the
system. Therefore, we have a contradiction. Therefore it is not possible for this formula to be
provable.

(Refer Slide Time: 38:32)

Therefore, in the standard interpretation, this formula is true, and, as it asserts, it is not
provable. In other words, this formula is not a logical consequence of Peano’s axioms. So
there is a statement which is true in the standard interpretation, that is, true according to
our intuitive understanding of numbers, but which is not provable from the set of Peano’s
axioms.

In other words, the proof system that we have laid out is not complete in that sense. That is, it
is not capable of proving every statement, which is true in the standard interpretation. It is
capable of proving exactly the logical consequences of Peano’s axioms.

So, Godel demonstrated that, there exists a formula which is true according to our intuitive
understanding of numbers and is not provable from Peano’s axioms. Okay. So, this is the end
of discussion on mathematical logic and the end of this lecture. Hope to see you in the other
modules. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lec 08
Set Theory

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the first lecture on set
theory. Set theory deals with sets.

(Refer Slide Time: 0:37)

A set is a collection of objects. In that sense, a set is also an object. Therefore, a set can be a
member of another set.

(Refer Slide Time: 1:15)

When we write a belongs to A, what we mean is that a is a member of set A. The negation of
this statement, which says that a is not a member of A, is written like this.

(Refer Slide Time: 1:45)

For two sets A and B, we say that A is equal to B if and only if, for every x, x belongs to A if
and only if x belongs to B. In other words, two sets are equal precisely when they have
exactly the same members. This is called the principle of extensionality. Two sets are equal
precisely when they have the same extensions.

(Refer Slide Time: 2:36)

It asserts that sets are defined by their members. There is a set with no members; this is the
empty set. The empty set is denoted by phi. Note that phi is not the same as the set containing
phi. A set with exactly one member is called a singleton set. So here phi is not the same as the
singleton containing phi, because phi has no member; it is the empty set. The singleton
containing phi has one member, namely phi itself. So, these two are not the same.
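
The difference between phi and the singleton containing phi, and the principle of extensionality, can be tried out with Python's built-in sets. This is a small sketch: `frozenset` is used because members of a Python set must be hashable, and the helper `is_equal` with its explicit finite `universe` argument is an assumption made for the illustration.

```python
phi = frozenset()                  # the empty set
singleton = frozenset({phi})       # the set whose only member is phi

def is_equal(a, b, universe):
    """Extensionality: a and b are equal iff every x belongs to a exactly when it belongs to b."""
    return all((x in a) == (x in b) for x in universe)
```

Here `len(phi)` is 0 while `len(singleton)` is 1, so the two sets differ, and `is_equal({1, 2}, {2, 1}, range(5))` holds because the order in which members are listed does not matter.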
(Refer Slide Time: 3:22)

If every member of A is a member of B as well, for sets A and B, we say that A is a subset of
B. This symbol says A is a subset of or equal to B.

(Refer Slide Time: 3:52)

If A is a subset of B and A is not equal to B, then we say that A is a proper subset of B, and it
is denoted like this, or sometimes we write it like this, to indicate that A is a subset
of B but A is not equal to B. These two notations will be used interchangeably. Both
mean that A is a proper subset of B. Now let us define some operations on sets.

(Refer Slide Time: 3:52)

The union operation denoted using this symbol is defined like this. x belongs to A union B if
and only if x belongs to A or x belongs to B. Another operation is the intersection operation.
This is expressed using this symbol. We say that x belongs to A intersection B if and only if x
belongs to A and x belongs to B.

(Refer Slide Time: 5:34)

Notice the correlation between the union symbol and the OR symbol, the intersection symbol
and the AND symbol. x belongs to A union B if and only if x belongs to A OR x belongs to
B, x belongs to A intersection B if and only if x belongs to A and x belongs to B.

176
(Refer Slide Time: 6:03)

The third operation is the relative complement. The complement of A with respect to a
universal set U is defined like this: x belongs to the complement of A with respect to U if and
only if x belongs to U and x does not belong to A.
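
The three definitions translate directly into set comprehensions, with the logical connectives appearing as `or`, `and` and `not`. The particular sets `U`, `A` and `B` below are arbitrary choices for illustration.

```python
U = set(range(10))     # a small universal set, chosen for illustration
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

union = {x for x in U if x in A or x in B}            # x belongs to A or x belongs to B
intersection = {x for x in U if x in A and x in B}    # x belongs to A and x belongs to B
complement_A = {x for x in U if x not in A}           # x belongs to U and not to A
```

These agree with Python's built-in operators `A | B`, `A & B` and `U - A`.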

(Refer Slide Time: 6:38)

A set A can be deemed a predicate on objects. For example, consider a first order predicate, or
any logical predicate, with one free variable. Such a statement is essentially a statement about
one unknown entity; x is the one unknown entity. For example, it could be a statement of this
form: x greater than 5. There is one free variable in this. So, you can assume that this is
actually a definition of a set; it is talking about all individuals that are greater than 5. We
write it like this. So using every one-variable predicate, we can construct a set. In other
words, a set can be equated to a predicate on objects.
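
The passage between predicates and sets can be shown in both directions. In the sketch below the universe is taken to be a small finite range, which is an assumption made so that the set can actually be listed; the function names are illustrative.

```python
U = range(10)                       # a finite universe, assumed for illustration

def greater_than_5(x):
    """The one-variable predicate 'x greater than 5'."""
    return x > 5

A = {x for x in U if greater_than_5(x)}    # the set defined by the predicate

def member_of_A(x):
    """Going the other way: the set A viewed as a predicate on objects."""
    return x in A
```

On this universe the predicate `greater_than_5` and the membership test `member_of_A` agree on every element.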

(Refer Slide Time: 7:40)

So let us say that we have a universal set U and we deal with subsets of U, and we have these
three operations: union, intersection and the relative complement. The algebra defined by
these three operations, as you can see, is in fact a Boolean algebra, which we have already
seen in the module on mathematical logic.

(Refer Slide Time: 8:23)

Since this algebra is a Boolean algebra, the following laws hold. We have seen the analogues
of these laws in the context of propositional calculus. The first is the identity law, which
says that A intersection U, the universal set, is A, and A union phi is A. This means U is the
identity of intersection and the empty set is the identity of union. Then, we have the
domination laws.

(Refer Slide Time: 9:11)

The domination laws say that the universal set U dominates the operation of union, because
A union U is U for any A, and the empty set dominates the operation of intersection, because
A intersection phi is phi for any A.

(Refer Slide Time: 9:39)

Then we have the idempotent laws, one each for union and intersection. A union A is A, and A
intersection A is A.

(Refer Slide Time: 9:54)

Then, we have the double negation law, which asserts that the complement of the
complement is the original set. U minus (U minus A), the relative complement of the relative
complement of A, is A itself. You can draw a Venn diagram to verify this. If the rectangle
represents the universal set U and the circle represents the set A, then U minus A consists of
the elements which are outside the circle. The complement of the set formed by the elements
outside the circle consists of the elements which are inside the circle.

(Refer Slide Time: 10:32)

Then we have the commutative laws, which say that both the operations, union and intersection,
are commutative. A union B is the same as B union A, and A intersection B is the same as B
intersection A.

(Refer Slide Time: 10:55)

Then we have the associative laws. They say that both the operations, union and intersection,
are associative: you can apply them in any order. When you have a sequence of unions, (A union
B) union C is the same as A union (B union C). Therefore we can omit the parentheses and
write a sequence of unions as A union B union C, and this can be extended to any number of sets.

(Refer Slide Time: 11:27)

Similarly for intersections: (A intersection B) intersection C is the same as A
intersection (B intersection C).

(Refer Slide Time: 11:40)

Then, we have the distributive laws. They say that intersection distributes over union and
union distributes over intersection. A intersection (B union C) is the same as (A intersection
B) union (A intersection C).

(Refer Slide Time: 12:12)

To repeat, intersection distributes over union: A intersection (B union C) is (A
intersection B) union (A intersection C).

(Refer Slide Time: 12:25)

Similarly, union distributes over intersection as well. A union (B intersection C) is (A union
B) intersection (A union C).

(Refer Slide Time: 12:41)

You can verify these using Venn diagrams. You would have studied Venn diagrams in school.
In a Venn diagram you represent the universal set using a rectangle, and the sets inside it
are represented using convex figures, typically circles. So you have sets A, B, C with possible
intersections between them, and you can talk about A intersection B, B intersection C etc. A
intersection B would be the part common to circles A and B, B intersection C the part common
to circles B and C, and A intersection C the part common to circles A and C.

(Refer Slide Time: 13:31)

So using Venn diagrams, you can verify the correctness of all these laws. Then we have De
Morgan's laws in this Boolean algebra as well. Here the complement is always the relative
complement with respect to U. The complement of A union B is the complement of A intersection
the complement of B. Similarly, the complement of A intersection B is the complement of A
union the complement of B. So these are the De Morgan's laws of set theory.

(Refer Slide Time: 14:26)

Then, we have the absorption laws. They say that A union (A intersection B) is A and
A intersection (A union B) is also A.

(Refer Slide Time: 14:46)

And finally we have the negation laws. A union the relative complement of A is the universal
set. A intersection the relative complement of A is the empty set. Verifying all these laws,
using Venn diagrams or logical arguments, is left as an exercise.
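
Besides Venn diagrams, the laws can be checked mechanically on a small universe. The sketch below tries every choice of subsets A, B, C of a three-element universe (chosen arbitrarily for illustration). Passing the check is evidence for the laws, not a proof of them.

```python
from itertools import combinations

U = frozenset({0, 1, 2})   # small hypothetical universe
EMPTY = frozenset()

def subsets(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(a):
    """Complement of a relative to U."""
    return U - a

def laws_hold():
    subs = subsets(U)
    for a in subs:
        if a & U != a or a | EMPTY != a:            # identity
            return False
        if a | U != U or a & EMPTY != EMPTY:        # domination
            return False
        if a | a != a or a & a != a:                # idempotent
            return False
        if comp(comp(a)) != a:                      # double negation
            return False
        if a | comp(a) != U or a & comp(a) != EMPTY:  # negation
            return False
        for b in subs:
            if a | b != b | a or a & b != b & a:    # commutative
                return False
            if comp(a | b) != comp(a) & comp(b):    # De Morgan
                return False
            if comp(a & b) != comp(a) | comp(b):    # De Morgan
                return False
            if a | (a & b) != a or a & (a | b) != a:  # absorption
                return False
            for c in subs:
                if (a | b) | c != a | (b | c):      # associative
                    return False
                if (a & b) & c != a & (b & c):      # associative
                    return False
                if a & (b | c) != (a & b) | (a & c):  # distributive
                    return False
                if a | (b & c) != (a | b) & (a | c):  # distributive
                    return False
    return True

print(laws_hold())
```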

(Refer Slide Time: 15:17)

The set of all subsets of A, the power set of A, is denoted by two power A or sometimes P of A.
If A is a finite set, the size of two power A is two power the size of A. In other words, if A
has n members, then the power set of A has two power n members.

(Refer Slide Time: 15:53)

The argument goes like this. The cardinality of a set is the number of elements in the set,
when the set is finite. If the cardinality of A is n, then to form a subset of A, for each
element of A we have two choices: pick the element or do not pick the element.

(Refer Slide Time: 16:39)

Therefore there are two power n ways to form a subset. That is, if x1 through xn are the
members of A, then for each element I can either select it or leave it, for each
of the n elements. So there are two choices for each element. Therefore the total number of
ways in which you can form a subset is a product of n 2's, which is two power n. Therefore
the number of subsets of A is two power n.
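
The counting argument can be turned into a short Python sketch: one pick-or-leave decision per element, and one subset per sequence of decisions. The function name power_set is an illustrative choice.

```python
from itertools import product

def power_set(elements):
    """Build every subset by making a pick/leave choice for each element."""
    elements = list(elements)
    result = []
    # Each tuple in the product is one sequence of n independent choices.
    for choices in product([False, True], repeat=len(elements)):
        subset = frozenset(x for x, keep in zip(elements, choices) if keep)
        result.append(subset)
    return result

A = {"a", "b", "c"}
P = power_set(A)
print(len(P), 2 ** len(A))
```

Since there are two choices for each of the n elements, the loop runs exactly two power n times.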

(Refer Slide Time: 17:19)

A set may contain sets, as we have seen, the power set is a set of sets.

(Refer Slide Time: 17:45)

We define the union of A in this manner. It is the set of all x such that there exists b, where
b belongs to A and x belongs to b. In other words, the union of A is the set of members of
members of A. If b is a member of A and x happens to be a member of b, then x belongs to the
union of A.

(Refer Slide Time: 18:31)

Analogously, we can define the intersection of A as the set of all x such that, for every b, if
b belongs to A, then x belongs to b. So the intersection of A is the set of those elements
which belong to every member of A.

(Refer Slide Time: 18:59)

For example, suppose A is made up of these elements: the set {2, 3}, the number 4 and the
singleton containing just 5. Then the union of A would be the set of all members of members
of A. The number 4 is not a set, therefore it doesn't have members; the set {2, 3} has two
members, namely 2 and 3; the set {5} has one member, namely 5. So the union of A is {2, 3, 5}.
The intersection of A is the set of those elements that belong to every member of A. But since
4 is not a set, it does not have any member. Therefore there is no element which is a
member of every member of A, and the intersection of A is empty.
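
Here is a minimal Python sketch of the two definitions, run on the example above. Treating a non-set object such as 4 as having no members is a modelling assumption, and big_intersection assumes the family is non-empty.

```python
def members(obj):
    """Members of obj; a non-set object (such as the number 4) has none."""
    return obj if isinstance(obj, frozenset) else frozenset()

def big_union(family):
    """All x such that x belongs to b for some b belonging to the family."""
    return frozenset(x for b in family for x in members(b))

def big_intersection(family):
    """All x that belong to every b in the family (family assumed non-empty)."""
    candidates = big_union(family)
    return frozenset(x for x in candidates
                     if all(x in members(b) for b in family))

# The example from the lecture: {2, 3}, the number 4, and the singleton {5}.
A = frozenset({frozenset({2, 3}), 4, frozenset({5})})
print(sorted(big_union(A)), sorted(big_intersection(A)))

# A second, made-up family in which the members overlap.
B = frozenset({frozenset({1, 2, 3}), frozenset({2, 3, 4})})
print(sorted(big_union(B)), sorted(big_intersection(B)))
```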

(Refer Slide Time: 19:58)

An unordered pair is a set of size 2. A set of size 2, {x, y}, is an unordered pair: it has
exactly two elements and there is no particular ordering of the elements. That is, you cannot
say this is the first element and this is the second element; the two elements are on an equal
footing as far as membership is concerned. So an unordered pair is a set of cardinality 2.

(Refer Slide Time: 20:38)

An ordered pair, on the other hand, has two elements x and y, and these elements are ordered:
you should be able to say that x is the first element and y is the second element. So x and y
are not included symmetrically here. But an ordered pair can be defined as a set. Let us
define the ordered pair (x, y) as a two-element set in which the first element is x itself
and the second element is the set containing x and y, that is, the unordered pair {x, y}.

So the set that defines the ordered pair consists of two elements: the first is an element
of the universe, whereas the second is an unordered pair of elements of the universe, and the
first element is a member of the second. So here x and y are not symmetric,
in the sense that x is an element of the set and is also a member of the other element, whereas
y is only a member of the second element; it does not have an independent membership of the
set. Therefore x and y hold asymmetric positions within this ordered pair.
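
The encoding described above can be tried out with Python's frozenset. The function names are illustrative. The sketch only shows that swapping the components changes the set; it does not prove that the encoding works in every case.

```python
def unordered_pair(x, y):
    """The two-element set {x, y}; the order of the arguments is irrelevant."""
    return frozenset({x, y})

def ordered_pair(x, y):
    """The lecture's encoding of (x, y): the set {x, {x, y}}."""
    return frozenset({x, unordered_pair(x, y)})

# Swapping the arguments does not change an unordered pair...
print(unordered_pair(1, 2) == unordered_pair(2, 1))
# ...but it does change the ordered pair.
print(ordered_pair(1, 2) == ordered_pair(2, 1))
```

In the set for (1, 2), the element 1 appears both on its own and inside {1, 2}, while 2 appears only inside {1, 2}. That asymmetry is what records the order.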

(Refer Slide Time: 22:13)

We use ordered pairs to define what are called relations. A relation is a set of ordered pairs.
For a relation R, the domain of R is defined in this fashion: x belongs to the domain of R if
and only if there exists y so that (x, y) belongs to R, that is, there is an ordered
pair in R where x is the first component. If that is the case, we say that x is in the domain
of R.

(Refer Slide Time: 23:18)

Similarly, the range of R denoted like this is defined in this fashion. We say that, x belongs to
the range of R if and only if there exists a y, so that the ordered pair (y, x) belongs to R.

(Refer Slide Time: 23:50)

The union of these two, the domain of R and the range of R is called the field of R.
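
As a quick sketch, domain, range and field can be computed for a finite relation stored as a Python set of tuples. The relation R below is a made-up example.

```python
def domain(R):
    """All x such that (x, y) belongs to R for some y."""
    return {x for (x, y) in R}

def range_(R):
    """All y such that (x, y) belongs to R for some x."""
    return {y for (x, y) in R}

def field(R):
    """The union of the domain and the range."""
    return domain(R) | range_(R)

R = {(1, 2), (1, 3), (4, 3)}   # hypothetical relation
print(domain(R), range_(R), field(R))
```

The trailing underscore in range_ only avoids shadowing Python's built-in range.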

(Refer Slide Time: 24:01)

For two sets A and B, A cross B is the set of all ordered pairs (a, b) such that a belongs to A
and b belongs to B. That is, in each pair the first component comes from the first set, A, and
the second component comes from the second set, B.

(Refer Slide Time: 24:35)

So, if A and B are finite and A has a size of n and B has a size of m, then cross product of A
and B has size of n * m. That is, the number of ways in which you can form ordered pairs
with the first component coming from A and the second component coming from B is n * m.

(Refer Slide Time: 25:03)

Observe that a relation R from A to B is a subset of A cross B. If R is made up of ordered
pairs with first components from A and second components from B, then R is a subset of A
cross B.
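
The cross product and its size can be illustrated with itertools.product. The sets A and B and the relation R are arbitrary examples chosen for the sketch.

```python
from itertools import product

A = {1, 2, 3}
B = {"x", "y"}

# A cross B: every ordered pair (a, b) with a from A and b from B.
A_cross_B = set(product(A, B))
print(len(A_cross_B), len(A) * len(B))

# A relation from A to B is any subset of A cross B.
R = {(1, "x"), (3, "y")}
print(R <= A_cross_B)

# A binary relation on A is a subset of A cross A (A squared).
A_squared = set(product(A, repeat=2))
print(len(A_squared))
```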

(Refer Slide Time: 25:29)

A binary relation on A is a subset of A cross A, which is also sometimes written as A
squared. An n-ary relation on A is a subset of the cross product of A with itself n times,
which is denoted A power n.

(Refer Slide Time: 26:17)

An ordered pair is also called a 2-tuple, and an n-tuple, in that sense, is defined as an ordered
pair consisting of an n minus one tuple, followed by a single element.

(Refer Slide Time: 26:49)

For example, suppose we have elements a1 through an forming an n-tuple. This can be defined
as an ordered pair in this fashion: first you form an (n - 1)-tuple recursively using
the first n - 1 elements, and then pair it with an. So every n-tuple can be defined as an
ordered pair.
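
The recursive construction can be sketched with nested Python 2-tuples standing in for ordered pairs. Treating a 1-tuple as the element itself is an assumption made here to give the recursion a base case.

```python
from functools import reduce

def n_tuple(*elems):
    """Encode (a1, ..., an) as the ordered pair ((a1, ..., a_{n-1}), an).

    Python 2-tuples stand in for ordered pairs. A single element is
    returned unchanged (an assumed base case for the recursion).
    """
    return reduce(lambda prefix, last: (prefix, last), elems)

print(n_tuple("a", "b"))
print(n_tuple("a", "b", "c"))
print(n_tuple("a", "b", "c", "d"))
```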

(Refer Slide Time: 27:24)

And, A power n is the set of all n tuples. If (x, y) belongs to R, we write x R y.

(Refer Slide Time: 27:59)

Now, let us define functions. A function F is a relation such that for every x belonging to the
domain of F, there exists a unique y in the range of F so that (x, y) belongs to F. We will see
more about functions in the next class. That is it from this lecture. Hope to see you in the
next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Set Theory
Lecture 2

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the second lecture on Set
Theory.

(Refer Time Slide: 0:37)

At the end of the previous lecture we were discussing functions and relations. A function, we
saw, is a relation F such that for every x belonging to the domain of F there is a
unique y in the range of F so that (x, y) belongs to F. So that is what a function is.
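
The defining condition can be checked directly for a finite relation. This is a sketch with illustrative names; apply_function simply looks up the unique image that the definition guarantees.

```python
def is_function(R):
    """True if no x is paired with two different y values in R."""
    image = {}
    for x, y in R:
        if x in image and image[x] != y:
            return False
        image[x] = y
    return True

def apply_function(F, x):
    """Return F(x), the unique y with (x, y) in F."""
    images = [y for (u, y) in F if u == x]
    if len(images) != 1:
        raise ValueError("x has no unique image under F")
    return images[0]

F = {(1, "a"), (2, "b"), (3, "a")}   # a function
R = {(1, "a"), (1, "b")}             # not a function: 1 has two images
print(is_function(F), is_function(R), apply_function(F, 2))
```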

(Refer Time Slide: 1:15)

Since each x has a unique image in the range, the image of x can be denoted as F of x; this is
the image of x under F. It means (x, F of x) is the only ordered pair belonging to F with x as
the first component. We say that a function F from A to B maps A into B if the range of F
happens to be a subset of B.

(Refer Time Slide: 2:05)

We say that a function F maps A onto B, if the range of F is equal to B. So, if F is a mapping
onto B, then F is a mapping into B as well.

(Refer Time Slide: 2:43)

We say that a set R is single rooted if and only if for each y belonging to the range of R, there
is a unique x so that x R y.

(Refer Time Slide: 3:14)

Even a set that is not a relation can be single rooted. For example, consider this set. It is
single rooted. Look at the ordered pairs belonging to this set, namely (2, 0) and (2, 1), and
consider their second components, 0 and 1. Each of them has a
unique pre-image: the only pre-image of 0 is 2 and the only pre-image of 1 is 2. Therefore,
this is a single rooted set. So, single rootedness can be a property of any set.

(Refer Time Slide: 4:10)

A function is called one to one if and only if it is single rooted. So if the function is from
set A to B, for each x belonging to the domain of the function there is a unique image y; that
is because it is a function. Now since it is single rooted, y has a unique pre-image x. So each
x in the domain has a unique image y in B, and each y in the range of the function
has a unique pre-image x; that is when the function is called one to one. So a one to one
function is a function that is also a single rooted set.

(Refer Time Slide: 5:12)

A bijection is a function that is one to one and onto. An injection is a function that is one
to one, that is, one to one and into, but need not be onto. So, a bijection is also an
injection.

(Refer Time Slide: 6:01)

A surjection is a function that is onto. So combining all these definitions, we say a function is
a bijection if and only if it is an injection and a surjection. You must be familiar with these
terms: bijection, injection and surjection.
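
For finite sets of pairs, these definitions translate into short checks. The sketch below assumes every element of F is an ordered pair, written as a Python tuple; the function names are illustrative.

```python
def is_function(F):
    """Each first component occurs in exactly one pair."""
    F = set(F)
    return len({x for (x, _) in F}) == len(F)

def is_single_rooted(F):
    """Each second component occurs in exactly one pair."""
    F = set(F)
    return len({y for (_, y) in F}) == len(F)

def is_injection(F):
    return is_function(F) and is_single_rooted(F)

def is_surjection(F, B):
    """F maps onto B: the range of F equals B."""
    return is_function(F) and {y for (_, y) in F} == set(B)

def is_bijection(F, B):
    return is_injection(F) and is_surjection(F, B)

F = {(1, "a"), (2, "b"), (3, "c")}
G = {(1, "a"), (2, "a"), (3, "b")}
S = {(2, 0), (2, 1)}   # single rooted, but not a function

print(is_bijection(F, {"a", "b", "c"}))
print(is_surjection(G, {"a", "b"}), is_injection(G))
print(is_single_rooted(S), is_function(S))
```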

(Refer Time Slide: 6:47)

Consider two sets F and G. We define the inverse of the set F, denoted F inverse, or F to the
power of minus 1, as the set of all ordered pairs (u, v) such that the ordered pair (v, u)
belongs to F. That is, you consider all the ordered pairs belonging to the set F and then
invert them: make the first component the second and the second component the first.

The ordered pairs that you obtain will form F inverse. For example, if F consists of (2, 0),
(1, 0) and c, then F inverse consists of (0, 2) and (0, 1). So F need not even be a relation
for it to have an inverse. Even for sets, we can define an inverse. So to find the inverse of a
set you take out all the ordered pairs belonging to the set and then invert every single
ordered pair.
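
A small sketch of the inverse of an arbitrary set follows. Python tuples of length 2 stand in for ordered pairs, and the string "c" plays the role of the non-pair element c in the example.

```python
def is_pair(item):
    """True if item represents an ordered pair (a 2-tuple in this sketch)."""
    return isinstance(item, tuple) and len(item) == 2

def inverse(F):
    """Swap the components of every ordered pair in F; ignore everything else."""
    return {(item[1], item[0]) for item in F if is_pair(item)}

F = {(2, 0), (1, 0), "c"}
print(inverse(F))
```

Here F is not a relation, because of the element "c", yet its inverse is still defined.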

(Refer Time Slide: 8:10)

The restriction of F to A, denoted as on the slide, is defined as the set of all ordered pairs
(u, v) so that u F v and u belongs to A. The image of A under F is defined as the range of the
restriction of F to A. This is the set of all v such that there exists u belonging to A so
that u F v. The range of the restriction is what we call the image of A under F.
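
Restriction and image are easy to compute for a finite set of pairs. The sketch below uses illustrative function names and a made-up F.

```python
def restrict(F, A):
    """All pairs (u, v) in F whose first component u belongs to A."""
    return {(u, v) for (u, v) in F if u in A}

def image(F, A):
    """The range of the restriction of F to A."""
    return {v for (u, v) in restrict(F, A)}

F = {(1, "a"), (2, "b"), (3, "a"), (4, "c")}   # hypothetical example
A = {1, 2, 3}
print(restrict(F, A))
print(image(F, A))
```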

(Refer Time Slide: 9:33)

Then we have a number of theorems. I will not prove these theorems here. These are simple
enough for you to prove as exercises. For a set F, the domain of F inverse is the same as the
range of F, and the range of F inverse is the same as the domain of F. Another theorem says
that for a relation F, the inverse of F inverse is the same as F. For a set F, F inverse is a
function if and only if F is single rooted. That is, F inverse is a function exactly when
every element in the range of F has a unique pre-image.

(Refer Time Slide: 11:03)

A relation F is a function if and only if F inverse is single rooted. Say F is a one to one
function. For x belonging to the domain of F, F inverse of F of x is the same as x, and for y
belonging to the range of F, F of F inverse of y is the same as y. This is true for a one to
one function.

(Refer Time Slide: 12:14)

For two sets F and G, we define F composition G thus. It is the set of all ordered pairs (u, v)
such that there exists a t so that, u G t and t F v. And then a theorem about function
composition. If F and G are functions, then F composition G is also a function. If G is a
function and G is applied on u, then it will produce a unique t and if F is a function, when F is
applied on t, it will produce a unique v. Therefore F composition G will also produce a
unique v, for every u belonging to the domain of it. Thus, the theorem. The domain of F
composition G is all x belonging to the domain of G, such that G of x belongs to the domain
of F.

(Refer Time Slide: 13:52)

For any two sets F and G, the inverse of F composition G happens to be the compositions of
G inverse and F inverse. So these are theorems that are easy enough to prove as exercises. Do
attempt to prove them. So we've seen some basic definitions of set theory.
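
Before attempting the proofs, you can test the statements about composition on small examples. This sketch uses made-up functions F and G stored as sets of pairs. A check on one example is not a proof.

```python
def compose(F, G):
    """F composition G: all (u, v) such that u G t and t F v for some t."""
    return {(u, v) for (u, t) in G for (t2, v) in F if t == t2}

def inverse(F):
    return {(v, u) for (u, v) in F}

def domain(F):
    return {u for (u, v) in F}

F = {(1, "a"), (2, "b")}
G = {("p", 1), ("q", 2), ("r", 3)}

FG = compose(F, G)
print(FG)

# The domain of F composition G: those x in dom G with G(x) in dom F.
print(domain(FG) == {x for (x, t) in G if t in domain(F)})

# The inverse of F composition G equals G inverse composition F inverse.
print(inverse(FG) == compose(inverse(G), inverse(F)))
```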

(Refer Time Slide: 14:33)

Let us show that the Peano system can be embedded within set theory. Russell and Whitehead
proved that the whole of classical mathematics can be embedded within set theory. I will give
you a peek at how to do this. So we will begin with the Peano system. The Peano system deals
with arithmetic; we have already seen Peano's axioms in the module on logic. So let us
see how to embed the Peano system in set theory.

(Refer Time Slide: 15:45)

First of all, for a set a, the successor of a, denoted a superscript plus, is defined as a union
the singleton set containing a. That is, when you take a set and add the set itself to it as a
new member, you get the successor of the set. So this is how the successor of a set is defined.

And then we say a set A is inductive if the empty set belongs to A and, for every a, a belongs
to A implies that the successor of small a also belongs to A. For every small a, small a
belongs to capital A implies that a plus, the successor of small a, also belongs to capital A.
In other words, A is closed under the successor operator. So for a
set A to be inductive it should contain the empty set, and for any set belonging to A it should
also contain the successor of that set. That is, the set should be closed under the successor
operator. That is when we say that the set is inductive.

(Refer Time Slide: 17:02)

Now let us define omega as the intersection of all A such that A is inductive. We define omega
as the intersection of all inductive sets. That is, x belongs to omega if and only if x belongs
to every inductive set, which is precisely when x belongs to the intersection of all inductive
sets. So, the definition says that x belongs to omega precisely when x belongs to every single
inductive set.

(Refer Time Slide: 18:07)

Now, let us consider some interesting properties of omega. First of all, omega is a subset of
every inductive set. That should be obvious, because in the previous slide we've just seen that
if x belongs to omega then x belongs to every single inductive set. Therefore, omega is a
subset of every single inductive set. That is, the members of omega are members of every
single inductive set. And then, omega is inductive as well. Why should this be so?

This is because phi belongs to A if A is inductive. For every A that is inductive, phi belongs
to A. Therefore phi belongs to omega as well, because omega is the intersection of every
single inductive set. So, if phi belongs to every single inductive set, it should also belong to
omega. So, that is the first requirement for omega to be inductive: it should contain the empty
set, and it indeed does contain the empty set.

(Refer Time Slide: 19:37)

Now consider some x belonging to omega. Then x belongs to every inductive set. But
all inductive sets are closed under the successor operator, which means x plus belongs to every
inductive set. In other words, x plus belongs to the intersection of all inductive sets, which
is precisely what omega is, which means x plus belongs to omega. So what we have is this: if x
belongs to omega then x plus also belongs to omega, which means omega is closed under the
successor operator. That is the second requirement for omega to be inductive. The first
requirement is that omega should contain the empty set. The second requirement is that it
should be closed under the successor operator. Since we've established both, we know that
omega is inductive.

(Refer Time Slide: 20:53)

So, omega which is defined as the intersection of all inductive sets is itself an inductive set.

(Refer Time Slide: 21:00)

And once again, an inductive set is a set which contains the empty set and is closed under the
successor operator.

(Refer Time Slide: 21:09)

Suppose x is a subset of omega, and suppose x is inductive. Omega is a subset of
every inductive set, therefore omega is a subset of x as well. Since x is also a subset of
omega, this implies that omega is equal to x. What it means is that if x is a subset of omega
and x is inductive, then x has to be the same as omega. In other words, no proper subset of
omega is inductive. So, let us combine this with the definition of omega.

Omega is the intersection of every inductive set. And omega is a subset of every inductive
set. And no proper subset of omega is inductive. So, in that sense omega is the smallest
inductive set. Every inductive set is a super set of omega.

(Refer Time Slide: 22:31)

Now using this omega, we can embed natural numbers in set theory. We construct the natural
numbers in this manner. We define 0 as the empty set. We define 1 as 0 plus, the successor of
0, which would be the singleton {phi}, because it is phi union the singleton containing phi.

So 1 is the singleton containing phi, and then 2 is the successor of 1, which would be {phi}
union the singleton containing {phi}. This is the set containing 2 elements: one is phi and
the other is the singleton containing phi. This is what is defined as 2.

(Refer Time Slide: 24:09)

And then the number 3 is defined as the successor of 2, the number 4 is defined as the
successor of 3, and so on. In this manner we define all the natural numbers. You can readily
verify that 0 is a member of 1, which is a member of 2, which is a member of 3, and so on. Not
just that: 0 is a subset of 1, which is a subset of 2, which is a subset of 3, which is a
subset of 4, and so on. So both these chains of relations hold.
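
The construction can be replayed with Python frozensets, which lets you check both chains for the first few numbers. This is a sketch: the helper natural builds the set standing for n by applying the successor n times.

```python
def successor(a):
    """a plus: a union the singleton containing a."""
    return a | frozenset({a})

def natural(n):
    """The set that stands for the natural number n."""
    result = frozenset()          # 0 is the empty set
    for _ in range(n):
        result = successor(result)
    return result

zero, one, two, three = (natural(k) for k in range(4))

# 1 = {phi} and 2 = {phi, {phi}}.
print(one == frozenset({zero}), two == frozenset({zero, one}))
# Membership chain: 0 in 1 in 2 in 3.
print(zero in one, one in two, two in three)
# Subset chain: 0 subset of 1 subset of 2 subset of 3.
print(zero <= one <= two <= three)
# The set standing for n has exactly n members.
print(len(natural(5)))
```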

(Refer Time Slide: 24:48)

So we have, in this sense, defined omega as the set of all natural numbers. That is because
omega contains 0, it contains the successor of 0, which is 1, it contains the successor of 1,
which is 2, and so on. For every set it contains, it also contains the successor of that set;
it is closed under the successor operator. So we define omega as the set of all natural
numbers. And then you would recall Peano's axioms that we saw in the module on logic.

We define addition in this manner: a plus 0 is a, and a plus the successor of b is the
successor of (a plus b). Similarly, multiplication is defined as: a * 0 is 0, and
a * the successor of b is (a * b) plus a. So multiplication is defined using addition, and
addition is defined using the successor function. In this manner, we can construct the natural
numbers and the two operations, addition and multiplication. You can verify that once we define
numbers in this manner, all of Peano's axioms will be satisfied.
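
The two recursive definitions can be transcribed almost word for word. For readability this sketch works on ordinary Python integers, with succ standing for the successor operator; that is a simplification, not the set-theoretic construction itself.

```python
def succ(n):
    """Successor, modelled on ordinary non-negative integers."""
    return n + 1

def add(a, b):
    """a + 0 = a ;  a + succ(b) = succ(a + b)."""
    if b == 0:
        return a
    return succ(add(a, b - 1))

def mul(a, b):
    """a * 0 = 0 ;  a * succ(b) = (a * b) + a."""
    if b == 0:
        return 0
    return add(mul(a, b - 1), a)

print(add(3, 4), mul(3, 4))
```

Multiplication calls only add, and add calls only succ, mirroring the layering described above.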

(Refer Time Slide: 26:33)

Suppose we have a one-variable predicate, alpha of n. Suppose we want to show that alpha of n
is true for every n belonging to the set of natural numbers. We can define T as the set of all
natural numbers n such that alpha of n holds. Suppose we prove that T is inductive; then we
would be done, because the smallest inductive set is omega, and omega is the same as the
set of all natural numbers. So once we show that T is inductive, we would be done.

(Refer Time Slide: 27:40)

Suppose, for example, we want to show that every natural number except 0 is the successor
of a natural number. Then we can define T as the set of all those
natural numbers that are either equal to 0 or are successors. Suppose this is what T is, and
we want to show that T is inductive. So this is the question we have: is T inductive? To
prove that T is inductive we have to show two things. First of all we have to show that the
empty set belongs to T. But the empty set is the same as 0, and 0 certainly belongs to T,
because T contains all those natural numbers that are either 0 or are successors; 0 is
explicitly included here. So 0, which is the empty set, belongs to T, and T satisfies the first
requirement. Then, what is the second requirement? It says that if p belongs to T,
then p plus also belongs to T. Since p plus is a natural number that is the successor of p,
it belongs to T by the definition of T. That is also satisfied, therefore T is indeed inductive.

So T is an inductive subset of the set of natural numbers, but we have just seen that there is
no inductive proper subset of the set of all natural numbers, therefore T must indeed be the
set of all natural numbers. So this is the property that is satisfied by the set of natural
numbers. Every natural number except 0 is a successor of a natural number. So this is how
we embed the set of natural numbers in set theory.

(Refer Time Slide: 29:54)

A relation R on a set A is called reflexive if x R x is true for every x belonging to A.

(Refer Time Slide: 30:33)

Similarly, we say that a relation R is symmetric on A if x R y implies y R x for all x, y
belonging to A, and a relation R is transitive on A if, for all x, y, z belonging to A, x R y
and y R z imply x R z.

(Refer Time Slide: 31:35)

A relation that is reflexive, symmetric and transitive on set A is an Equivalence relation on A.
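
For a finite relation given as a set of pairs, the three properties can be checked by brute force. The sketch below uses a made-up relation on {0, ..., 8} and a made-up ordering relation for contrast.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    # Whenever x R y and y R w, we need x R w.
    return all((x, w) in R
               for (x, y) in R
               for (z, w) in R
               if y == z)

def is_equivalence(R, A):
    return is_reflexive(R, A) and is_symmetric(R) and is_transitive(R)

A = range(9)
same_remainder = {(x, y) for x in A for y in A if (x - y) % 3 == 0}
less_or_equal = {(x, y) for x in A for y in A if x <= y}

print(is_equivalence(same_remainder, A))
print(is_reflexive(less_or_equal, A), is_symmetric(less_or_equal),
      is_transitive(less_or_equal))
```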

(Refer Time Slide: 32:15)

As an example, consider the set of natural numbers, and let us say two numbers x and y are in
relation R, written x R y, if and only if x minus y is 0 mod 3. This means
1 R 4, 4 R 7, 7 R 10 and so on; 1, 4, 7, 10 etc. are all 1 mod 3. Similarly, 0
R 3, 3 R 6, 6 R 9 and so on. Since 0 R 3 and 3 R 6, by transitivity 0 R 6 as well. We also
have 2 R 5, 5 R 8, 8 R 11 and so on.

So the equivalence classes in this case would be 1, 4, 7, 10, 13, 16 etc.; this is one
equivalence class. 0, 3, 6, 9, 12 etc. form another equivalence class, and the third
equivalence class would be 2, 5, 8, 11 and so on. These correspond, respectively, to the
numbers that are 1 mod 3, the multiples of 3, and the numbers that are 2 mod 3. So this
relation has 3 equivalence classes.

(Refer Time Slide: 34:14)

An equivalence class of R is a maximal subset of A such that any two members of it are in
relation R to each other.

(Refer Time Slide: 34:57)

So the equivalence class containing x will be denoted [x]R. Formally, this is the set of all t
such that x R t; you might as well say t R x, by symmetry.

(Refer Time Slide: 35:31)

The quotient of A with respect to R is the set of all equivalence classes under R. Formally,
the quotient of A with respect to R is the set of all [x]R such that x belongs to A.
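
Equivalence classes and the quotient can be computed directly from these definitions on a finite set. The sketch below uses the numbers 0 to 11 as a finite stand-in for the natural numbers and the mod 3 relation from the earlier example.

```python
def equivalence_class(x, A, related):
    """[x]R: all t in A such that x R t."""
    return frozenset(t for t in A if related(x, t))

def quotient(A, related):
    """The set of all equivalence classes [x]R for x in A."""
    return {equivalence_class(x, A, related) for x in A}

def same_mod_3(x, y):
    return (x - y) % 3 == 0

A = range(12)   # finite stand-in for the natural numbers
classes = quotient(A, same_mod_3)

for c in sorted(classes, key=min):
    print(sorted(c))
```

Because the classes are collected in a Python set, each class appears once even though it is generated once per member.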

(Refer Time Slide: 36:10)

Let us see how the theory of integers can be embedded in set theory. Let us consider ordered
pairs of natural numbers. If (m, n) is an ordered pair, we would like to associate this ordered
pair with the integer m minus n. For example, (2, 3) can be associated with the integer minus
1, whereas (3, 2) will be associated with the integer plus 1. So we would map ordered pairs of
natural numbers onto integers in this fashion.

(Refer Time Slide: 37:00)

For this purpose what we do is this. We define a relation on ordered pairs of natural numbers.
This relation is defined in this manner: we say that ordered pairs (m, n) and (p, q) are in
this relation if and only if m plus q is the same as n plus p. Of course, we would have wanted
to say that m minus n is equal to p minus q, but we can't say this because the set of natural
numbers is not closed under subtraction. So, m minus n may not be a natural number. But from
our arithmetic, we know that m minus n is equal to p minus q exactly when m plus q is the same
as n plus p. So using addition we are effectively saying the same thing.

(Refer Time Slide: 38:06)

So this relation is an equivalence relation on N cross N. First, reflexivity holds. Why would
this hold? For (m, n) to be related to itself we need m plus n equal to n plus m, which is
true by commutativity of addition of natural numbers, so reflexivity holds. For symmetry, we
require that if the relation holds between (m, n) and (p, q), then the relation should also
hold between (p, q) and (m, n). If the relation holds between (m, n) and (p, q), then by
definition m plus q is equal to n plus p. By commutativity of addition, n plus p is the same as
p plus n and m plus q is the same as q plus m, so p plus n is equal to q plus m, which is
exactly the condition for (p, q) to be equivalent to (m, n). That is, (p, q) is in this
relation tilde with (m, n).

(Refer Time Slide: 39:54)

Therefore this relation is symmetric as well. And then you can verify transitivity: if (m, n)
tilde (p, q) and (p, q) tilde (r, s), then m plus q equals n plus p and p plus s equals q plus
r; adding these and cancelling p plus q from both sides gives m plus s equals n plus r, so
(m, n) tilde (r, s). So transitivity also holds, therefore this is an equivalence relation.
Then, using this equivalence relation, we can define integers in this manner. Let us define
the set of integers Z as the quotient of N cross N under the tilde relation.
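
The relation tilde and its three properties can be tested on a finite window of N cross N. The window size of 6 is an arbitrary choice for the sketch, and a check on a window is evidence rather than a proof.

```python
def tilde(a, b):
    """(m, n) tilde (p, q) iff m + q == n + p (no subtraction needed)."""
    (m, n), (p, q) = a, b
    return m + q == n + p

# A finite window of N cross N (an assumption; the real set is infinite).
window = [(m, n) for m in range(6) for n in range(6)]

reflexive = all(tilde(a, a) for a in window)
symmetric = all(tilde(b, a)
                for a in window for b in window if tilde(a, b))
transitive = all(tilde(a, c)
                 for a in window for b in window for c in window
                 if tilde(a, b) and tilde(b, c))

print(reflexive, symmetric, transitive)
print(tilde((2, 3), (5, 6)), tilde((2, 3), (3, 2)))
```

The pairs (2, 3) and (5, 6) both stand for minus 1, so they are related, whereas (3, 2) stands for plus 1.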

(Refer Time Slide: 40:43)

So what exactly do we do here? We define the integer 2 as the equivalence class under tilde
which contains (2, 0). This would, of course, contain (2, 0), (3, 1), (4, 2) and so on; this
is a set. On the x-y plane, consider the integral grid and the line of slope 1 which passes
through (2, 0). Extended downwards, this line also passes through (1, -1) and (0, -2), but
those are not pairs of natural numbers, so they are not in the class. All the grid points with
natural number coordinates that fall on this line form this set. So this set is
what we define as the integer 2. Then what would the integer minus 3 be? That would be the
set of all pairs which are equivalent to (0, 3) under this equivalence relation.

(Refer Time Slide: 42:14)

That would be all the integral grid points that fall on the line with slope 1 passing
through (0, 3); these define this equivalence class. For example, (0, 3), (1, 4), (2, 5),
(3, 6) and so on. This is what is defined as the integer minus 3. So we can define integers in
this manner as equivalence classes under this equivalence relation tilde.
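
To see these classes concretely, here is a sketch that lists the members of a class inside a finite window of N cross N. The parameter bound is an assumption that cuts the infinite class down to a printable size.

```python
def tilde(a, b):
    """(m, n) tilde (p, q) iff m + q == n + p."""
    (m, n), (p, q) = a, b
    return m + q == n + p

def class_of(pair, bound):
    """Members of the class of `pair` with both coordinates below `bound`."""
    return sorted((m, n)
                  for m in range(bound)
                  for n in range(bound)
                  if tilde(pair, (m, n)))

# The integer 2: pairs equivalent to (2, 0).
print(class_of((2, 0), 5))
# The integer minus 3: pairs equivalent to (0, 3).
print(class_of((0, 3), 7))
```

Every pair listed for (2, 0) has first component minus second component equal to 2, and every pair listed for (0, 3) has that difference equal to minus 3.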

(Refer Time Slide: 43:05)

And then we can define the operators on integers, addition and multiplication,
appropriately. So using this device, what we do is define each integer as a set, and then
the operations of addition and multiplication that work on integers are defined as operations
on these special sets.

If the operators that we define behave in a manner which exactly mimics the addition
and multiplication operations on integers, then we would be able to say that we have
embedded the theory of integers in set theory. So, that is precisely what we will attempt to do.
The details we will work out in the next class. That is it from this lecture. Hope to see you in
the next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 10
Embedding of the theories of integers and rational numbers in set theory

Welcome to NPTEL MOOC on Discrete Mathematics. This is the third lecture on set theory.

(Refer Slide Time: 0:46)

At the end of the last lecture, we were seeing how the theory of integers could be embedded in
set theory. Our idea was this: an ordered pair (m, n) of natural numbers should stand for the
integer m minus n. Then, the ordered pair (2, 3) would stand for minus 1, and the ordered pair
(3, 2) would stand for plus 1.

(Refer Slide Time: 1:00)

So, what we do is this. We define a relation tilde on N cross N, where N is set of natural
numbers. The idea is that, the ordered pairs (m, n) and (p, q) would be in relation tilde with each
other, if m minus n is equal to p minus q. But since natural numbers are not closed under
subtraction, we essentially capture the same by using addition.

We write m plus q equals n plus p. We know that this would be the case precisely when m minus
n is equal to p minus q, and m plus q and n plus p are both well-defined natural numbers. So, we
achieve the same end. We define the relationship tilde in this fashion for ordered pairs (m, n) and
(p, q). (m, n) is in relation tilde with (p, q) precisely when m plus q is equal to n plus p.

(Refer Slide Time: 1:59)

Then we can see that tilde is an equivalence relation because it is reflexive, symmetric and
transitive.
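
The three properties can be spot-checked on a finite sample of N cross N. The following is a minimal sketch, not a proof; the function name `related` and the bound 6 are choices made here for illustration only.

```python
from itertools import product

def related(x, y):
    """(m, n) ~ (p, q) exactly when m + q == n + p; only addition of naturals is used."""
    (m, n), (p, q) = x, y
    return m + q == n + p

# Finite sample of N x N; the bound 6 is an arbitrary illustrative choice.
PAIRS = list(product(range(6), repeat=2))

reflexive = all(related(x, x) for x in PAIRS)
symmetric = all(related(y, x) for x in PAIRS for y in PAIRS if related(x, y))
transitive = all(
    related(x, z)
    for x in PAIRS for y in PAIRS for z in PAIRS
    if related(x, y) and related(y, z)
)
```

A check of this kind only covers the sampled pairs; the general statement follows from the arithmetic of natural numbers.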

(Refer Slide Time: 2:10)

Then, using this, we can express integers in this fashion. For example, the integer 2, which we
denote 2z to distinguish it from the natural number 2, would be the equivalence class to which
(2, 0) belongs. So, this equivalence class contains (2, 0), (3, 1), (4, 2), (5, 3) etc. All these
points lie on the straight line with slope 1 passing through (2, 0); only the points of this line
whose coordinates are both natural numbers belong to the class. Similarly, the integer minus 3,
which we denote -3z to distinguish it from the natural number 3, would be the equivalence class
containing the ordered pair (0, 3).

(Refer Slide Time: 3:01)

All the integral points belonging to the straight line with slope 1 and passing through (0, 3) will
also belong to the same equivalence class. So, all these points are equivalent under the relation
tilde. So, that is how we define the integers. Now comes the question, how do we define the
integral operators?

(Refer Slide Time: 3:25)

Take addition, for example. Addition of integers, which we denote plus with subscript z (+z) to
distinguish it from addition of natural numbers, is defined in this manner. The equivalence class
containing the ordered pair (m, n) under the relation tilde is one integer. The equivalence class
to which the ordered pair (p, q) belongs is another integer. These are the two integers we want
to add.

The first integer is supposed to correspond to m minus n and the second to p minus q, where
m, n, p, q are all natural numbers. Then, their sum ought to correspond to the integer (m minus n)
plus (p minus q), which could be written as (m plus p) minus (n plus q). The equivalence class
under tilde that stands for this integer is clearly the one containing the ordered pair
(m + p, n + q). Therefore, this ought to be the result of the addition: when the integers
corresponding to the equivalence classes of (m, n) and (p, q) under tilde are added, we should
get the equivalence class of the ordered pair (m + p, n + q). If this is how we define addition,
then it is indeed consistent with the notion that we have, namely that the equivalence class of
the ordered pair (m, n) stands for the integer m minus n.
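
As a small illustration of this definition, the sketch below adds integers represented as pairs of natural numbers. The helper `canon`, which picks one representative per equivalence class, is an assumption introduced here so that results can be compared; it is not part of the lecture.

```python
def canon(pair):
    """Pick the representative of the class of (m, n) that has a zero component."""
    m, n = pair
    k = min(m, n)
    return (m - k, n - k)  # safe on naturals because k <= m and k <= n

def add_z(x, y):
    """[(m, n)] +z [(p, q)] = [(m + p, n + q)]."""
    (m, n), (p, q) = x, y
    return canon((m + p, n + q))

# 2z + (-3z) should be -1z, whichever representatives are chosen.
result_a = add_z((2, 0), (0, 3))
result_b = add_z((5, 3), (1, 4))
```

Both calls land in the class of (0, 1), which suggests the result does not depend on the representatives chosen.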

(Refer Slide Time: 5:58)

Then, we can see that this addition is commutative: changing the order of the arguments will
not change the result. We will get exactly the same equivalence class. It is associative. It has
an identity: 0z is the identity of the addition operation we have defined. Remember, 0z is the
equivalence class to which, for example, (1, 1) belongs. Each integer, you can see, has an
additive inverse. Under addition defined in this manner, when the integer containing (m, n) is
added to the integer containing (n, m), what we get is the class containing (m + n, n + m), which
is 0z. Therefore, every integer has an additive inverse.

(Refer Slide Time: 7:27)

In short, the integers that we have defined along with operator plus z form an abelian group. You
would recall the definition from the module on algebra. It is clear that (Z, +z) is an abelian group.

(Refer Slide Time: 7:46)

The other operator that we want to define on integers is multiplication. Let us denote
multiplication by star z. Multiplication would be defined in this manner. When we multiply the
integers m minus n and p minus q, we would like the result to be (mp plus nq) minus (mq plus np).

So, with this intuitive understanding, we can define the result of multiplication as the
equivalence class containing the ordered pair (mp + nq, mq + np), where the additions and
multiplications are those over natural numbers. So, this is how we should define multiplication
of the integers that correspond to the equivalence classes containing the ordered pairs (m, n)
and (p, q) under tilde. If multiplication is defined in this manner, it will be consistent with
our intuitive notion of integer multiplication.
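
The multiplication rule can be tried out in the same style. This is a sketch under the same assumption as before: `canon` is a hypothetical helper that normalises a pair to the representative with a zero component.

```python
def canon(pair):
    """Representative of the class of (m, n) with a zero component."""
    m, n = pair
    k = min(m, n)
    return (m - k, n - k)

def mul_z(x, y):
    """[(m, n)] *z [(p, q)] = [(mp + nq, mq + np)], using only natural-number arithmetic."""
    (m, n), (p, q) = x, y
    return canon((m * p + n * q, m * q + n * p))

minus_six = mul_z((2, 0), (0, 3))   # 2 times -3
plus_six = mul_z((0, 2), (0, 3))    # -2 times -3
```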

(Refer Slide Time: 9:29)

So, you can verify that this multiplication operator is commutative and associative. It
distributes over +z, and 1z is the identity of this multiplication. Moreover, you can also show
that whenever the product of two integers a and b is 0z, the integer 0, either a is 0z or b is 0z.
All this is exactly consistent with our intuitive notion of integers. Therefore, what we have
understood is that this definition of integers, along with the definitions of addition and
multiplication of integers, behaves exactly the way the integers are supposed to behave.

(Refer Slide Time: 10:56)

In other words, this definition of integers, along with the addition operation, the
multiplication operation, and the special numbers 0z and 1z, which function as the identities of
addition and multiplication respectively, is an integral domain. Again, recall the definition
from the module on algebra. Therefore, the definition of integers works exactly the way we want.
So, this way of defining integers as set theoretic constructs achieves what we wanted: the theory
of integers is now embedded in set theory.

(Refer Slide Time: 11:49)

The less than relation on integers is defined as follows. We want to say that the integer
corresponding to the ordered pair (m, n) is less than the integer corresponding to the ordered
pair (p, q). When would we want to say this? Intuitively, we want to say this precisely when
m minus n is less than p minus q, which would be the case when m plus q is less than p plus n.
Since we do not want to use subtraction, because natural numbers are not closed under
subtraction, we rewrite the same condition using addition.

So, we would be able to say this precisely when the natural number m plus q is less than the
natural number p plus n, and this condition holds precisely when the natural number m plus q
belongs to the natural number p plus n. The way we have defined natural numbers using sets, we
know that natural number 0 belongs to natural number 1, which belongs to natural number 2 and
so on.

(Refer Slide Time: 13:31)

Recall that 0 corresponded to the empty set, and 1 corresponded to the singleton containing the
empty set. 2 corresponded to the 2 member set containing 0 and 1. 3 corresponded to the 3 member
set containing 0, 1 and 2, and so on. So you can see that 0 belongs to 1, which belongs to 2,
which belongs to 3 and so on. More generally, n belongs to p precisely when n is less than p.
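
The claim that membership coincides with less than can be illustrated with nested frozensets. This is a minimal sketch; the function name `von_neumann` and the sample size 6 are choices made here, not part of the lecture.

```python
def von_neumann(n):
    """Set standing for the natural number n: 0 is the empty set, k + 1 is k united with {k}."""
    current = frozenset()
    for _ in range(n):
        current = current | frozenset([current])
    return current

NATURALS = [von_neumann(k) for k in range(6)]

# For the sampled numbers, "a belongs to b" should agree with "a < b".
membership_is_less_than = all(
    (NATURALS[a] in NATURALS[b]) == (a < b)
    for a in range(6) for b in range(6)
)
```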

(Refer Slide Time: 14:25)

So, that is what we have done here. We say that the integer corresponding to the equivalence
class containing the ordered pair (m, n) is less than the integer corresponding to the equivalence
class containing the ordered pair (p, q), precisely when the natural number (m + q) belongs to
natural number (p + n).

(Refer Slide Time: 14:45)

A word about the relation 'less than': A binary relation R on a set A is a linear ordering if R
is transitive on A and satisfies trichotomy. What is trichotomy? R satisfies trichotomy on A if
and only if, for every x and y in A, exactly one of the following holds: x R y, x is equal to y,
or y R x. So, a binary relation R on A is a linear ordering when R is transitive and satisfies
trichotomy. You can see that the less than relation on natural numbers satisfies trichotomy.

That is, for natural numbers x and y, exactly one of x less than y, y less than x, or x equal to
y holds, and the less than relation is transitive. So, using the less than relation on natural
numbers, we have now defined the less than relation on integers. With these basic definitions we
see that the theory of integers that we have developed behaves exactly according to our
intuitive understanding of integers. So, that is why we say that the theory of integers could be
embedded in set theory.
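
Trichotomy for the less than relation on integers can be spot-checked on pairs of natural numbers. The sketch below uses the ordinary `<` on Python integers in place of set membership; the names `less_z` and `same_z` and the sample bound are illustrative assumptions.

```python
from itertools import product

def less_z(x, y):
    """[(m, n)] is less than [(p, q)] exactly when m + q < p + n."""
    (m, n), (p, q) = x, y
    return m + q < p + n

def same_z(x, y):
    """The two pairs name the same integer: m + q == n + p."""
    (m, n), (p, q) = x, y
    return m + q == n + p

SAMPLE = list(product(range(5), repeat=2))

# Exactly one of "x < y", "x = y", "y < x" should hold for every sampled pair of integers.
trichotomy = all(
    [less_z(x, y), same_z(x, y), less_z(y, x)].count(True) == 1
    for x in SAMPLE for y in SAMPLE
)
```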

(Refer Slide Time: 16:49)

Now, let us see how the theory of rational numbers can be embedded in set theory. We have now
defined the set of integers Z. Using Z, let us define Z prime as Z minus the integer 0, that is,
the set of all non-zero integers. Then, we define a new relation, which we denote like this.

(Refer Slide Time: 18:08)

This new relation is defined as follows. We say that the ordered pair (a, b) stands in this
relation to the ordered pair (c, d). Mind you, these are ordered pairs from Z cross Z prime.
Z prime does not contain 0z; therefore, b and d cannot be 0. Our idea is that these ordered pairs
should stand for a by b and c by d respectively. But since integers are not closed under the
division operation, we do not want to mention division here.

What we want to say is that a by b is equal to c by d; that is, we want the ordered pairs (a, b)
and (c, d) to be in this relation precisely when the fraction a by b is equal to the fraction
c by d. These two fractions are well defined, because b and d are non-zero. But this would be the
case precisely when a into d equals b into c. So, that is precisely what we are going to say. We
would say that the ordered pair (a, b) stands in this relation to the ordered pair (c, d)
precisely when a multiplied by d, using integer multiplication, is the same as b multiplied by c.
So, this is how we define the relationship.
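
A short sketch of this relation follows. For readability it uses Python's built-in integers in place of the set-theoretic integers constructed earlier, which is an assumption of the example; `related_q` is a name chosen here.

```python
def related_q(x, y):
    """(a, b) is related to (c, d) exactly when a * d == b * c; b and d must be non-zero."""
    (a, b), (c, d) = x, y
    if b == 0 or d == 0:
        raise ValueError("second components must come from Z prime (non-zero integers)")
    return a * d == b * c

two_equals_four_halves = related_q((2, 1), (4, 2))
sign_moves_freely = related_q((-1, 3), (1, -3))
different_numbers = related_q((-1, 3), (2, 6))
```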

(Refer Slide Time: 19:44)

And then, under this relation, we consider the set of equivalence classes. We define the set of
rational numbers, which we denote Q, as the set of equivalence classes of Z cross Z prime under
the relation we have just defined. What does this mean?

(Refer Slide Time: 20:30)

It means that the rational number 2, which is of course 2 by 1, corresponds to the equivalence
class containing the ordered pair (2, 1) under this relation. This would of course contain
(4, 2), (6, 3), (8, 4) and so on. So, this set is defined as the rational number 2. So, exactly
as we did before, we will now equate every rational number to a set.

(Refer Slide Time: 21:40)

So, in particular, on the x-y plane, where would these points fall? We have (2, 1) here, (4, 2)
here, and so on. So, these points would fall on a line with slope 1 by 2 passing through the
origin. So, we consider all integral grid points, other than the origin itself, that fall on the
line with slope 1 by 2 passing through the origin. The origin is excluded because its second
component is 0. All the other integral points falling on this line form the set which is defined
as rational number 2.

(Refer Slide Time: 22:46)

Then let us consider the rational number minus 1 by 3. This would correspond to the equivalence
class containing the ordered pair (-1, 3) under this relationship. If you enumerate some of the
ordered pairs belonging to this, you would have (-1, 3), (1, -3), (-2, 6), (2, -6) and so on.
These would correspond to the integral points, other than the origin, on the line passing
through the origin with slope minus 3.

(Refer Slide Time: 24:00)

That is, if you plot with (0, 0) here, we want (-1, 3) and (1, -3) to be on the line. So, it
would pass through these two points. So, this line has a slope of minus 3, and every integral
grid point falling on this line, apart from the origin, would be in the set which is defined as
the rational number minus 1 by 3.

(Refer Slide Time: 24:38)

So, in this sense, every rational number is defined as a set. This is analogous to what we did
earlier. We defined every natural number as a set first of all, then we defined every integer as
a set, and now we define every rational number as a set.

(Refer Slide Time: 25:07)

Then how would we define addition and multiplication of rational numbers? First, we want to
define addition of rational numbers. So, let us consider two rational numbers, that correspond to
equivalence classes containing ordered pair (a, b) and (c, d) respectively. These intuitively
correspond to a by b and c by d. So, our requirement is this, we want to add the fractions a by b
and c by d, add the rational numbers a by b and c by d.

We know that, when we add them together we will get bd in the denominator and ad plus bc in
the numerator. This would correspond to the ordered pair (ad + bc, bd). So, the equivalence class
containing this ordered pair under the relationship we have defined should be the result of the
addition. So, if the addition of rational numbers is defined in this manner, it would exactly
correspond to the intuition we have.
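
The addition rule for rational numbers can be sketched as follows. Python integers stand in for the set-theoretic integers, and `canon_q`, which reduces a pair to lowest terms with a positive second component, is a hypothetical helper added here only to make results comparable.

```python
from math import gcd

def canon_q(pair):
    """Representative of the class of (a, b) in lowest terms with b > 0."""
    a, b = pair
    if b == 0:
        raise ValueError("second component must be non-zero")
    g = gcd(a, b)          # positive, because b is non-zero
    if b < 0:
        g = -g             # dividing by a negative g flips both signs
    return (a // g, b // g)

def add_q(x, y):
    """[(a, b)] + [(c, d)] = [(ad + bc, bd)]."""
    (a, b), (c, d) = x, y
    return canon_q((a * d + b * c, b * d))

half_plus_third = add_q((1, 2), (1, 3))
same_sum_other_reps = add_q((2, 4), (-2, -6))
```

Both sums reduce to the same pair, which is what one expects if the operation is well defined on equivalence classes.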

(Refer Slide Time: 26:55)

So, that is how we will define the addition of rational numbers, and then we can see that,
according to this definition, addition is commutative and associative, and the rational number
0Q is the identity. The rational number 0Q corresponds to the equivalence class of 0 divided by
a non-zero integer, for example the class containing (0, 1).

(Refer Slide Time: 27:54)

This is the identity of the addition, as you can readily verify, and we can also see that this
addition is invertible; that is, for every a belonging to Q there exists b in Q such that a and
b, added using this addition operator, yield 0Q. So, every rational number has an additive
inverse. Therefore, the set of rational numbers along with this addition operation is an abelian
group, exactly as it was in the case of integers.

(Refer Slide Time: 28:53)

Now, coming to multiplication, you can readily work out how multiplication has to be defined.
Let us say we want to multiply the rational number corresponding to the equivalence class
containing the ordered pair (a, b) and the rational number corresponding to the equivalence
class containing the ordered pair (c, d). These equivalence classes respectively stand for
a by b and c by d, where b and d are non-zero.

From our understanding of rational numbers, we know that the answer would be ac by bd. So, we
should define the equivalence class containing the ordered pair (ac, bd) under this relationship
as the result of multiplying these two. Note that bd is non-zero, because the product of two
non-zero integers is non-zero. So, if multiplication of rational numbers is defined in this
manner, it is exactly analogous to our intuitive understanding of rational number multiplication.

(Refer Slide Time: 30:21)

Then we can see that this operator is commutative, is associative, and distributes over rational
number addition. Rational number 1 is the identity of this multiplication. Whenever the product
of two rational numbers is 0, at least one of the two is 0.

(Refer Slide Time: 31:24)

Moreover, every non-zero rational number has a reciprocal, which is its multiplicative inverse:
the inverse of the class containing (a, b), with a non-zero, is the class containing (b, a).
This is at variance with the case of integers: integers other than 1 and minus 1 do not have
multiplicative inverses, that is, integer multiplication is not invertible. Therefore, this
algebraic system, Q along with the addition operation, the multiplication operation, the
rational number 0 and the rational number 1, is a field. Recall the definition of a field from
the module on algebra.
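
Multiplication and reciprocals can be sketched in the same way. As before, Python integers stand in for the constructed integers, and `canon_q` is a hypothetical normalising helper, not something defined in the lecture.

```python
from math import gcd

def canon_q(pair):
    """Representative of the class of (a, b) in lowest terms with b > 0."""
    a, b = pair
    if b == 0:
        raise ValueError("second component must be non-zero")
    g = gcd(a, b)
    if b < 0:
        g = -g
    return (a // g, b // g)

def mul_q(x, y):
    """[(a, b)] * [(c, d)] = [(ac, bd)]."""
    (a, b), (c, d) = x, y
    return canon_q((a * c, b * d))

def reciprocal_q(x):
    """Multiplicative inverse of a non-zero rational: swap the two components."""
    a, b = x
    if a == 0:
        raise ValueError("0 has no multiplicative inverse")
    return canon_q((b, a))

product_with_inverse = mul_q((-2, 3), reciprocal_q((-2, 3)))
```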

(Refer Slide Time: 32:32)

Now, let us define the less than relation on rational numbers. We want to say that the rational
number corresponding to the equivalence class containing the ordered pair (a, b) is less than
the rational number corresponding to the equivalence class containing the ordered pair (c, d).
We want to define this relationship; when would we be able to say this?

We want to say that rational number a by b is less than rational number c by d, where a, b, c, d
are integers and, in particular, b and d are non-zero integers. Let us choose the representatives
so that b and d are positive; this is always possible, because (a, b) and (-a, -b) belong to the
same equivalence class. Then we will be able to say this precisely when ad is less than bc, where
this is the less than relation on integers. You can readily see that this relationship is a
linear order: it is transitive and satisfies trichotomy.
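
The comparison ad less than bc agrees with the usual order only when both second components are positive, so a sketch has to move the signs into the first components before comparing. The function name `less_q` and the use of Python integers are assumptions of this example.

```python
def less_q(x, y):
    """[(a, b)] < [(c, d)], comparing a*d with b*c after making b and d positive."""
    (a, b), (c, d) = x, y
    if b == 0 or d == 0:
        raise ValueError("second components must be non-zero")
    if b < 0:
        a, b = -a, -b   # (a, b) and (-a, -b) lie in the same class
    if d < 0:
        c, d = -c, -d
    return a * d < b * c

minus_one_below_one = less_q((1, -1), (1, 1))
third_below_half = less_q((1, 3), (1, 2))
equal_not_less = less_q((1, 2), (2, 4))
```

Without the sign normalisation, the first comparison would test 1 < -1 and wrongly report that minus 1 is not less than 1.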

(Refer Slide Time: 33:51)

So, we have now shown that the theory of rational numbers is embeddable in set theory. But with
the way we have defined the rational numbers, the set of integers, which we think of as a subset
of the rational numbers, is not a subset of Q. That is, the way we have defined Z and the way we
have defined Q, Z is not a subset of Q. We have used different definition techniques for Z and Q.
But then, is this a contradiction? Not exactly.

(Refer Slide Time: 34:46)

That is because we can define a set called ZQ as the set of all equivalence classes of this
sort: consider all ordered pairs (n, 1), where n is an integer, and the equivalence classes
defined by them. The set of all these equivalence classes will form ZQ. They correspond to
rational numbers in which the denominator is 1, and an integer is precisely a rational number in
which the denominator is 1. So, you can readily see that ZQ is the copy of the set of integers
that functions within the set of all rational numbers. We can see that ZQ and the set Z we
defined earlier are isomorphic. They have exactly the same mathematical behaviour.

(Refer Slide Time: 36:00)

Now, we have an interesting theorem. We know that Z is countable. What is a countable set? We
say that, a set A is countable if there is a 1 to 1 mapping from A into N. Now, how do we know
that Z is countable? Consider the set of all integers.

(Refer Slide Time: 36:40)

So, integers extend on both sides of 0. We have the positive side here and the negative side
here. We should devise a way of counting these integers. So, you should be able to say that this
is the 0th integer, this is the first integer, this is the second integer and so on. You count
using only natural numbers. Suppose you count in this manner: 0 followed by 1, followed by
minus 1, then 2, then minus 2, then 3, then minus 3. If you continue in this manner, then you
can readily see that we are defining a 1 to 1 mapping from the set of integers onto the set of
natural numbers.

So, we will be threading the integers in this manner: 0 to 1, 1 to minus 1, minus 1 to 2, 2 to
minus 2, minus 2 to 3 and so on. If you thread integers in this manner, you will find that every
integer gets threaded. We are giving one particular ordinal position to every single integer in
the enumeration that we make. So, we are enumerating the integers. Built into this enumeration
is the 1 to 1 mapping from the set of integers onto the natural numbers that we talked about.
Therefore, the set of all integers, namely Z, is countable. You can count them using natural
numbers.
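
The threading 0, 1, minus 1, 2, minus 2, and so on can be written down as an explicit pair of mutually inverse functions. This is a sketch with Python integers; the names `nth_integer` and `position` are chosen here for illustration.

```python
def nth_integer(k):
    """Integer at position k in the listing 0, 1, -1, 2, -2, 3, -3, ..."""
    if k % 2 == 1:
        return (k + 1) // 2   # odd positions hold the positive integers
    return -(k // 2)          # even positions hold 0 and the negative integers

def position(z):
    """Position of the integer z in the same listing (the inverse of nth_integer)."""
    return 2 * z - 1 if z > 0 else -2 * z

first_seven = [nth_integer(k) for k in range(7)]
round_trip_ok = all(position(nth_integer(k)) == k for k in range(1000))
```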

(Refer Slide Time: 38:41)

What about Q? First of all let us consider 1 quadrant, that is only the non-negative rational
numbers.

(Refer Slide Time: 39:16)

Let us first show that the set of all these is countable. We will start off with all ordered
pairs of natural numbers. So, ordered pairs of natural numbers will be (0,0), (0,1), (0,2), (0,3),
(0,4), (1,0), (1,1), (1,2), (1,3), (1,4) and so on. How would we enumerate all these ordered pairs?

We could thread them together in this manner. Let us start from (0, 0), then we come to (0, 1),
then to (1, 0), then (2, 0), (1, 1) and (0, 2), then (0, 3), (1, 2), (2, 1), (3, 0). Next, of
course, we would have (4, 0) as well.

So, if you thread the ordered pairs in this manner, you can see that every single ordered pair
would be in this list; that is, we are giving an ordinal position to every single ordered pair
when we enumerate them in this manner. So, (0, 0) is the first ordered pair, (0, 1) is the second
ordered pair, then (1, 0) is the third, (2, 0) is the fourth, (1, 1) is the fifth, (0, 2) is the
sixth and so on. So, if you enumerate them in this manner, what exactly is the technique we are
using?

We are enumerating ordered pairs in the order of the sum of the first component and the second
component. For example, (0,0) alone stands for the class of ordered pairs, in which the sum is 0.
(1, 0) and (0, 1) correspond to the class of ordered pairs in which the sum is 1. Then we have, the
set of ordered pairs in which the sum is 2. (2,0), (1,1) and (0,2), all have a sum of 2.
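
The enumeration by sum can be sketched as a generator. The alternating direction within each sum class follows the order in which the pairs were threaded above; the name `pairs_by_sum` is a choice made for this example.

```python
from itertools import islice

def pairs_by_sum():
    """Yield every ordered pair of natural numbers, grouped by the sum of its components."""
    s = 0
    while True:
        # Odd sums run upwards in the first component, even sums run downwards,
        # which reproduces the zigzag threading described in the text.
        firsts = range(s + 1) if s % 2 == 1 else range(s, -1, -1)
        for a in firsts:
            yield (a, s - a)
        s += 1

first_ten = list(islice(pairs_by_sum(), 10))
```

Each sum class is finite, so every pair (a, b) is reached after finitely many steps, namely once all classes with a smaller sum have been listed.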

So, we have sum of 0 coming first, sum of 1 coming next, sum of 2 coming third and so on. So,
this is how we will enumerate the ordered pairs. Now, you can figure out how to go from an
enumeration of ordered pairs of natural numbers to an enumeration of rational numbers. So, I am
leaving it to you as an exercise for now. We will discuss it again later. That is all, from this
lecture, today. Hope to see you in the next. Thank you!

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering
Indian Institute of Technology, Guwahati
Lecture 11
Introduction to graph theory

Welcome to the lectures on Graph theory. In this section, we will learn about various kinds of
graphs. You may think of functions and their graphs, but these are very different graphs.
The only commonality between the two is that both can be thought of as pictorial or
graphical representations of certain mathematical objects. So let us formally define
what a graph is.

(Refer Slide Time: 01:04)

A graph is usually denoted by the letter G, and it has two components. The first is called
the vertex set, which we will denote by the letter V. This is a finite set. The finiteness
assumption is not sacrosanct; we can have graphs where the vertex set is infinite, but for the
time being we will restrict ourselves to finite sets. The second component of a graph is the
edge set, which we will denote by E. This is again a set, and we will think of its elements as
two element subsets of V.

In other words, look at the power set, 2 to the power V. From this, consider all the subsets of
size 2; any collection of these can serve as the edge set. We will see an example. Suppose
our vertex set is, let us say, 1, 2, 3, 4 and 5. Our edge set is going to be a set consisting
of two element subsets of the vertex set. If you think of this particular graph, it has two
components, the first being the vertices and the second being the edges.

So here, the edge set consists of six elements, so we can say that this is a graph on 5 vertices
and 6 edges. The usual diagrammatic representation of this graph is as follows: we will have
1 dot or 1 point for each vertex. So, 1, 2, 3, 4 and 5 are the vertices, and for each edge, we
just connect the corresponding pair of dots. So, (1, 2) is an edge and (1, 4) is an edge. (5,
3) is an edge, (4, 2) is an edge, (1, 3) is another edge and (2, 5) is an edge. So this is the
graph that we have in mind.
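
The example graph can be written down directly as a vertex set and a set of two element subsets. This is a minimal sketch; the helper names `is_graph` and `neighbours` are introduced here for illustration.

```python
# Vertex set and edge set of the example; each edge is an unordered two element subset of V.
V = {1, 2, 3, 4, 5}
E = {frozenset(e) for e in [(1, 2), (1, 4), (5, 3), (4, 2), (1, 3), (2, 5)]}

def is_graph(vertices, edges):
    """Every edge must be a two element subset of the vertex set."""
    return all(len(e) == 2 and e <= vertices for e in edges)

def neighbours(v, edges):
    """Vertices joined to v by an edge."""
    return {u for e in edges if v in e for u in e if u != v}

valid = is_graph(V, E)
around_one = neighbours(1, E)
```

Using `frozenset` rather than tuples makes (1, 2) and (2, 1) the same edge, which matches the definition of an edge as a subset.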

So, what we have drawn here is the diagrammatic representation of this particular graph. You
can think of a graph as a network of nodes which are connected to each other, and which
node is connected to which node is captured by the edge set. So where do we use these
things? Let us first see a problem. Many different problems in mathematics can be modelled as
graph theoretic questions and can be solved.

(Refer Slide Time: 04:53)

One such problem is known as the party problem. This is the description of the problem. There
are, let us say, 5 couples; let us call the couples (m1, w1), (m2, w2), and so on up to
(m5, w5). One of these couples, let us say (m1, w1), decides to host a party. So, there is a
party which consists of 10 people. When these 10 people meet, some of them already know each
other and some do not, and people shake hands. Each person may or may not shake hands with
another person.

But the rule of this problem is that spouses do not shake each other’s hands. So if a is married
to b, or if they are partners, then they do not shake hands. This is the setting. After the
party is over, m1 asks all the other nine people how many hands they shook, that is, with how
many people each of them exchanged a handshake. They give their answers, and it turns out that
each one of these people had a distinct number of handshakes. So, we will write that as the
outcome.

When asked about the number of handshakes, each person gave a distinct number as reply.
Now, the party problem is the following. With this information, can we determine how many
people shook hands with w1? So, determine the number of handshakes involving w1. So, it
looks surprising, because there is hardly any information. But, we will see that the amount of
information that is already present is enough to determine this number. So, m1 was the person
who had asked people about the number of handshakes, and the nine other people gave
answers and all their answers were different.

From this information, we can figure out the exact number of handshakes involving w1. So
how do we determine this? We know that the number of handshakes was different for each person.
Each person can shake at most 8 hands, since nobody shakes hands with himself or with his
spouse, and the least number of handshakes is 0.

So, since each of the nine people gave distinct answers, their answers have to be 0, 1, 2, 3, up
to 8. For each person, the number of handshakes is some number between 0 and 8, and there are
only nine such numbers, so every one of them occurs exactly once. That is all we know. Let us
draw this diagrammatically. We will think of a graph. Each person is a vertex in this graph;
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, these are the 10 people. Now, m1 we will write as this particular
vertex, and all the other vertices we will just label by their number of handshakes.

So, surely there is one person who shook 0 hands, there is another person who shook
1 hand, another person 2, another 3, 4 and so on up to 8. Now, we could think of a graph in
which these are our vertices and there is an edge between two vertices if there is a handshake
involving the two of them. If we knew this graph completely, and we knew which vertex is the
spouse of m1, then of course we could read off how many handshakes involve the spouse of m1.

But we do not know much about this graph, other than the number of edges at each vertex. There
are no handshakes involving 0, so 0 is not going to be linked to any other vertex. 1 is going
to be linked to exactly one vertex; we do not know which amongst these is connected to 1.
From 2, there are 2 such edges; where exactly they land, we have no clue. Similarly for 3, and
so on up to 8. So there are 8 edges out of 8, 7 edges out of 7 and so on.

And for m1, we do not know how many edges there are from m1; how many people shook hands with
m1, we have no idea. But from this diagram itself we can infer certain things. 8 and 0 must be
married to each other. The reason is as follows. Any person can shake at most 8 hands: he does
not shake hands with himself, and he does not shake hands with his spouse. There are 10 people;
if you take these two out, there are 8 people. So if somebody has shaken 8 hands, he has shaken
hands with everybody other than himself and his spouse. Everybody he shook hands with has at
least 1 handshake, so the person with 0 handshakes can only be his spouse.

So 8 and 0, they are a couple. Now, look at the person who has shaken 7 hands and the person
who has shaken 1 hand. The person with 1 handshake has certainly shaken hands with the person
numbered 8, and he or she has not shaken hands with any of the other people. 1 clearly has had
a handshake involving 8, and since all his handshakes are accounted for, we know that he has
not shaken hands with anybody else.

Whereas here we have a person who has had 7 handshakes. The 7 handshake person has 3 people
with whom he has not had a handshake. One of those people is the 0 handshake person. Another
is himself or herself. Who is the third person? If you look at 1, his only handshake is with 8,
and so 7 has not had a handshake with 1; that is clear. Moreover, 7 has had a handshake with
every other person. In other words, the people with whom 7 has not shaken hands are exactly
0, 1 and 7.

So, 0, 1 and 7 are the only people not involved in a handshake with 7. Everybody else has shaken
hands with 7, so none of them can be 7’s spouse. 0 is certainly not the spouse of 7, because 0
is married to 8, and 7 is also not the spouse of 7. So, 7 and 1 are a couple. You can recurse on
this logic and conclude that 6 and 2 are a couple, and 5 and 3 are also a couple. Now, after all
these four couples have been accounted for, what remains is m1 and 4; they must be a couple. And
the question that we started off with was: how many hands did m1’s partner shake?

And the answer clearly is 4, because this particular person has shaken hands with 4 people. So,
in a nutshell, we constructed a graph where there is an edge between two vertices if the two
people had shaken hands with each other. Once we constructed this graph, we could pair off
couples, and once we paired off couples, what remained was m1 and the person who has shaken
4 hands. Therefore, that person is going to be m1’s partner, and that helps us determine the
number of handshakes that m1’s partner has exchanged. So we saw graphs and we saw an
interesting problem involving graphs.
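
To see that the situation in the party problem can actually occur, one can build a concrete handshake pattern and count. The pattern below is one hypothetical arrangement chosen here to be consistent with the rules; the labels `M1` and `W1` and the function `shakes` are assumptions of this sketch, not part of the lecture.

```python
# People are (couple, role); couple 4 plays the hosts, with (4, "a") as m1 and (4, "b") as w1.
PEOPLE = [(couple, role) for couple in range(5) for role in "ab"]
M1, W1 = (4, "a"), (4, "b")

def shakes(x, y):
    """Hypothetical pattern: the "a" member of a couple greets everyone in later couples."""
    (i, r), (j, _) = sorted([x, y])
    if i == j:
        return False          # spouses never shake hands
    return r == "a"

handshakes = {p: sum(shakes(p, q) for q in PEOPLE if q != p) for p in PEOPLE}

# The nine answers that m1 would hear, in increasing order.
answers = sorted(handshakes[p] for p in PEOPLE if p != M1)
```

In this arrangement the nine answers are all different and w1 has 4 handshakes, in line with the argument above. The code only exhibits one consistent instance; the uniqueness of the answer comes from the pairing argument.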

(Refer Slide Time: 16:58)

Now let us look at the problem of when two graphs are equal. Once we define a
mathematical structure, we want to be able to tell when two such structures are the same.
The notion is that of isomorphism. So, here is a graph whose vertices are 1, 2, 3 and 4, and
another graph where the vertices are named a, b, c and d; they are essentially the same object.
This is the notion that we want to capture in terms of isomorphism. So, these two are equal, and
they are not equal to, let us say, this graph on the vertices 1, 2 and 3.

So, this is the notion that we want to capture via isomorphism. Let us provide a definition.
Let G1 and G2 be graphs, and we will assume that G1 is (V1, E1), so its vertex set is V1 and its
edge set is E1, and for G2 we will assume that it is equal to (V2, E2). We will say that G1 is
isomorphic to G2 if certain conditions hold. The first requirement is that there should be a
bijection between the vertex sets. For example, in the example that we have taken, we could map
1 to b, 2 to c, 3 to a, and 4 to d.

Not only that, once we have mapped these, edges must be preserved. When you look at any pair of
vertices such that there is an edge between them, there must be an edge between their images as
well. If these two conditions are met, then we will say that the graphs are isomorphic. Let us
quickly do one example. If we had this particular graph, note that we could map vertices to each
other. The vertex 1 in G1 could be mapped to vertex 1 in G2 and so on.

But if you do that, there is an edge (2, 3) which is unaccounted for. I mean, if you are
trying to map G1 to G2, there is an edge between 2 and 3, but there is no edge between the
images of 2 and 3. So, this is not an isomorphism. Formally, G1 is isomorphic to G2 if there
exists a bijection f from V1 to V2 such that (a, b) belongs to E1 if and only if (f(a), f(b))
belongs to E2. This is the definition of isomorphism.
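
The two conditions of the definition can be checked mechanically for a candidate map. The following is a minimal sketch, not part of the lecture: the function name `is_isomorphism` and the two small example graphs (a 4-cycle on 1, 2, 3, 4 and a relabelled copy on a, b, c, d) are assumptions made for illustration, since the graphs drawn on the slide are not reproduced in the transcript. Only the map 1 to b, 2 to c, 3 to a, 4 to d is taken from the text.

```python
def is_isomorphism(f, V1, E1, V2, E2):
    """Check the two conditions of the definition for a candidate map f (a dict)."""
    # Condition 1: f is a bijection from V1 onto V2.
    if set(f) != set(V1) or set(f.values()) != set(V2) or len(set(f.values())) != len(f):
        return False
    # Condition 2: (a, b) is in E1 if and only if (f(a), f(b)) is in E2.
    # For a bijection this is the same as saying that the image of E1 equals E2.
    image = {frozenset((f[a], f[b])) for a, b in E1}
    return image == {frozenset(e) for e in E2}


# Hypothetical example graphs (assumed, not from the slide).
V1, E1 = [1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)]
V2, E2 = ["a", "b", "c", "d"], [("b", "c"), ("c", "a"), ("a", "d"), ("d", "b")]

good = {1: "b", 2: "c", 3: "a", 4: "d"}  # the map mentioned in the lecture
bad = {1: "a", 2: "b", 3: "c", 4: "d"}   # a bijection that does not preserve edges

print(is_isomorphism(good, V1, E1, V2, E2))
print(is_isomorphism(bad, V1, E1, V2, E2))
```

Edges are stored as unordered pairs (`frozenset`) because the graphs here are undirected, so (a, b) and (b, a) denote the same edge.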

So basically, we want to rename the vertices of one graph in such a way that if there is an
edge between two vertices, then the renaming preserves that edge. Further, if there is an edge
between two renamed vertices, then when you look at the preimage, the original graph should
have an edge there as well. If this condition is met, then we will say that the graphs are
isomorphic. Determining whether two graphs are isomorphic is a difficult question. Let us see
some examples to understand why this might be difficult.

(Refer Slide Time: 22:09)

So, let me construct a couple of graphs and you can ponder whether these graphs are
actually isomorphic or not. This graph has 6 vertices: 1, 2, 3, 4, 5, 6. This is also a graph
with 6 vertices. There are 9 edges here: 3 for the outer triangle, 3 for the inner triangle and 3
connecting them; and here also there are 9 edges. So there are two 6 vertex, 9 edge graphs
drawn here. Now, are these graphs isomorphic? You can argue that they are not isomorphic,
because in graph 1 there is a triangle. That means there is a collection of 3 vertices which are
all adjacent to each other.

Whereas in graph 2, if you take any two vertices which are connected, they never have a
common neighbour. Look at 1 and 2: they are connected and they do not have a common
neighbour, and by symmetry that takes care of the edges (1,2), (2,3), (3,4), (4,5), (5,6) and
(6,1). If you look at 1 and 4, that is the other kind of edge; they also do not have a common
neighbour. So that takes care of the entire graph, and we can argue that these graphs are not
isomorphic. It can be tricky to find the reason why a certain pair of graphs is
non-isomorphic. Let us take a couple of other examples. Here is a graph; you can think of it
as a cube with 8 vertices.

And here is another graph with 8 vertices. So, let us call these graph 1 and graph 2. Are
these graphs isomorphic? In fact they are; mathematically, you can think of them as the same
object. If you think of these vertices as a, b, c, d, e, f, g, h, you can map them to the
numbers in such a way that the edges are preserved across the map. You can think of a, b, c,
d as 1, 2, 3, 4. You have to be careful while you write the map. (a, g) is an edge, so g is 5.
So a is 1, b is 2, c is 3, d is 4, g has to be 5, h has to be 7, f has to be 8 and e has to be 6.
You can look at this map: it is an isomorphism between these two graphs.

So the cube, if it is laid out flat on a plane, gives you something like this, and that
placement preserves the edges, and therefore we can argue that they are isomorphic. Now,
these examples more or less look like different graphs. Let me give you yet another example,
where it is not straightforward to see whether the graphs are isomorphic or not.
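
The triangle argument used above for the two 6 vertex, 9 edge graphs can be checked by a small computation. This is a sketch under assumptions: the edge lists below are one reading of the two drawings (two triangles joined by three edges, and the hexagon 1 to 6 with the three diagonals (1,4), (2,5), (3,6)), and the names `count_triangles` and `are_isomorphic` are illustrative. The brute force search over all bijections is only sensible for very small graphs.

```python
from itertools import combinations, permutations


def count_triangles(vertices, edges):
    """Number of 3-element vertex sets that are pairwise adjacent."""
    es = {frozenset(e) for e in edges}
    return sum(
        1
        for a, b, c in combinations(vertices, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= es
    )


def are_isomorphic(V1, E1, V2, E2):
    """Brute force: try every bijection from V1 to V2."""
    V1, V2 = list(V1), list(V2)
    e1 = [tuple(e) for e in E1]
    e2 = {frozenset(e) for e in E2}
    if len(V1) != len(V2) or len(e1) != len(e2):
        return False
    for perm in permutations(V2):
        f = dict(zip(V1, perm))
        if {frozenset((f[a], f[b])) for a, b in e1} == e2:
            return True
    return False


V = [1, 2, 3, 4, 5, 6]
# Assumed reading of graph 1: outer triangle 1-2-3, inner triangle 4-5-6, three connectors.
GRAPH1 = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (1, 4), (2, 5), (3, 6)]
# Assumed reading of graph 2: hexagon 1..6 plus the three diagonals.
GRAPH2 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 4), (2, 5), (3, 6)]

print(count_triangles(V, GRAPH1), count_triangles(V, GRAPH2))
print(are_isomorphic(V, GRAPH1, V, GRAPH2))
```

The number of triangles is preserved by any isomorphism, so different counts already rule out an isomorphism; the exhaustive search merely confirms this for these two small graphs.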

(Refer Slide Time: 27:08)

So, this is a special graph called the Petersen graph. It is a graph on 10 vertices: the outer
pentagon has 5 edges, the inner star also has 5 edges, and then there are these connecting
edges, 5 of them. So this is a 10 vertex, 15 edge graph. Let us call this G1. I will draw
another graph; think about whether these graphs are isomorphic. So I have 1, 2, 3, 4, 5, 6, 7,
8, 9, that is 9 vertices, and then there is 1 vertex in the centre. This is also a 10 vertex
graph, and the connections are as follows.

So, this is also a nicely symmetric graph. The centre vertex is there, and in between
each of these spokes there are precisely 2 other vertices. And then, this vertex is connected to
this vertex. These vertices are connected and these vertices are also connected. So this is
another graph: the outer circle has 9 edges, plus there are 3 spokes, and then there are 3 of
these cross edges, so there is a total of 15 edges. Now, are these graphs isomorphic? Are G1
and G2 isomorphic? Can you somehow twist G1 and get G2? Is it possible?

So we can show that they are isomorphic. Let us fix an isomorphism: we will name the
vertices a, b, c, d, e and a1, b1, c1, d1, e1. So, let me call these a, b, c, d and e.
Clearly there is a cycle involving them, and if you think of that as the outer cycle, the
other vertices should form an inner cycle. So, if I call this one a1, then a1 is connected to c1,
c1 is connected to e1, e1 is connected to b1, b1 is connected to d1 and d1 is connected to a1.
So we can think of that as the other cycle.

We need to have the other cycle a1 to c1, c1 to e1, e1 to b1, b1 to d1 and d1 to a1, and the
other cycle was this. And you can see that the corresponding edges e to e1, b to b1, c to c1, a
to a1 and d to d1 are also present. It is not easy to transform one of the drawings into the
other, but you can see that if you map the vertices like this, all the edge relations are
taken care of. So, the drawings of two graphs may look different, but they could still be
isomorphic. These are two representations of the same graph, the Petersen graph. These
graphs are isomorphic.

(Refer Slide Time: 33:11)

So, the next thing that we will learn about is the representation of graphs. If we had to
write a graph down with pen and paper, these drawings are all fine, but if we were to represent
it on a computer, how do we do it? There are two common ways. One is called the adjacency
list and the second is called the adjacency matrix. So, let us take the Petersen graph and see
what the adjacency matrix representation and the adjacency list representation are. When you
represent it as an adjacency list, for each vertex there is an entry.

So if you look at vertex a, it is connected to a1, b and e. b is connected to a, b1 and c,
and c is connected to b, d and c1. d is connected to d1, e and c. e is connected to e1, a
and d, and a1 is connected to c1, d1 and a. If you look at vertex b1, it is connected to d1
and e1, and it is also connected to b. If you look at c1, it is connected to a1 and e1, and it
is connected to c. If you look at d1, it is connected to a1 and b1, and it is also connected
to d. And if you look at e1, it is connected to c1, b1 and e.

So, this representation is called the adjacency list representation. Note that this is a
very nice symmetric graph: each vertex has a corresponding list of length 3. Graphs of this
kind are examples of what is called a regular graph. If the graph were not of this kind, for
example if you had another vertex, let us call it z, then there will be an entry corresponding
to z, and if z is connected to only d and c, then the entries for d and c will also contain z.

So, this one is not as symmetric as the previous one. So this is the adjacency list
representation. When you talk about the adjacency list representation, for every vertex you
have a list telling you which other vertices share an edge with this particular vertex. The
adjacency matrix representation is slightly different: here you represent the entire
information in terms of a matrix. If you are looking at the Petersen graph, the rows and
columns are indexed by the vertices.

Furthermore, if you look at the (d, a) entry, this entry is 1 if and only if (d, a) is an
edge. In other words, the adjacency matrix is an n cross n matrix with 0 and 1 as entries. If
you call the matrix A and its (i, j)th entry Aij, then Aij is equal to 1 if and only if
(i, j) is an edge in G. So clearly, every graph will have a unique adjacency matrix, unique
up to the ordering of the vertices. If we number the vertices 1 to n and then index the rows
and columns accordingly, then there is a fixed matrix which contains all the information
about the graph.

So that is the adjacency matrix representation. When we want to run algorithms on graphs,
we essentially convert the graph into one of these forms. These are the most standard forms
used for graph algorithms, and our algorithm would then process these objects. The adjacency
matrix is going to be of size n squared, whereas the adjacency list is going to be of size
proportional to the number of edges (plus one entry per vertex).
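
Both representations of the Petersen graph can be written down directly from the adjacency lists read out above. The following is a minimal sketch; the names `PETERSEN` and `to_matrix`, and the choice of sorting the vertex names to fix the row and column order, are assumptions made for illustration.

```python
# Adjacency list of the Petersen graph, with the vertex names used in the lecture.
PETERSEN = {
    "a": ["a1", "b", "e"],
    "b": ["a", "b1", "c"],
    "c": ["b", "d", "c1"],
    "d": ["d1", "e", "c"],
    "e": ["e1", "a", "d"],
    "a1": ["c1", "d1", "a"],
    "b1": ["d1", "e1", "b"],
    "c1": ["a1", "e1", "c"],
    "d1": ["a1", "b1", "d"],
    "e1": ["c1", "b1", "e"],
}


def to_matrix(adj):
    """Return (order, A) where A[i][j] == 1 iff order[i] and order[j] are adjacent."""
    order = sorted(adj)                      # fix some ordering of the vertices
    index = {v: i for i, v in enumerate(order)}
    n = len(order)
    A = [[0] * n for _ in range(n)]
    for u, neighbours in adj.items():
        for v in neighbours:
            A[index[u]][index[v]] = 1
    return order, A


order, A = to_matrix(PETERSEN)
list_size = sum(len(nbrs) for nbrs in PETERSEN.values())   # each edge appears twice
matrix_size = len(A) * len(A)                               # n squared entries
print(list_size, matrix_size)
```

For an undirected graph the matrix is symmetric, and every edge contributes two entries to the adjacency lists, one at each end point.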

(Refer Slide Time: 39:00)

The next thing that we will learn about graphs is the degree of a vertex. This is simply
defined as the number of edges incident on the vertex. So degree is defined for a particular
vertex, and it is usually denoted by delta: delta(v) is equal to the number of edges
involving v. If you look at this graph, the degree of vertex 1 is going to be 3, whereas the
degree of vertex 4 is 2. The degree of 2 is going to be 3, the degree of 5 is going to be 2
and the degree of 3 is again going to be 2.

If you add them up, what you will get in this case is 12, and this turns out to be two times
the number of edges. That we will state as a theorem. It is not particular to this graph: in
any graph, if you look at the degree of each vertex and sum it over all the vertices, you will
get twice the number of edges. So, let G be any graph. The summation of delta(v) over v
belonging to the vertex set is equal to two times the number of edges. The proof is
straightforward. Let us look at each vertex and count how many edges go out of it, and add
these up; that is the quantity on the LHS.

So, the left hand side is the sum of delta(v) over each vertex v. At each vertex you are
looking at the number of edges at that particular vertex. Now, if you think about this
summation and look at any particular edge, each edge is being counted precisely twice. If you
look at an edge between vertices u and v, when you are computing delta(u) you can tick this
edge once, and when you are computing delta(v) you can tick it again. So, what we are
suggesting is the following. When we are doing this counting over every vertex, we look at all
the edges going out of one particular vertex.

And whenever an edge is counted, we put a tick on that edge. Once you have summed up over
every vertex, you will note that each edge is ticked exactly twice, once for each of its end
points. So the total number of ticks that you put is, by one count, the summation over v of
delta(v), and since every edge is ticked exactly twice, the total number of ticks is just two
times the number of edges. So, that is the proof.
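
The ticking argument translates directly into a computation: every edge adds one to the degree of each of its two end points. The sketch below uses a hypothetical 5 vertex graph chosen only so that its degrees match the ones quoted in the lecture (3, 3, 2, 2, 2); the actual edges of the graph on the slide are not given in the transcript, and the name `degrees` is illustrative.

```python
def degrees(vertices, edges):
    """Compute delta(v) for every vertex by ticking each edge at both end points."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1  # tick for end point u
        deg[v] += 1  # tick for end point v
    return deg


# Hypothetical graph (assumed) with degrees 3, 3, 2, 2, 2 for vertices 1..5.
VERTICES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 4), (1, 5), (2, 3), (2, 5), (3, 4)]

deg = degrees(VERTICES, EDGES)
print(deg)
print(sum(deg.values()), 2 * len(EDGES))
```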

Let us record some definitions about graphs which we will use later. We talked about
regularity when we talked about the Petersen graph. We will call a graph regular if every
vertex has the same degree. If you look at this particular graph, the cycle, it is a regular
graph, because every vertex has degree 2. We could also take the complete graph on 5
vertices: here every vertex is connected to every other vertex, and delta(v) is equal to 4 for
all v; take any vertex, its degree is going to be 4. So, that is the notion of regularity.

(Refer Slide Time: 44:20)

Let us introduce some more basic terms in graph theory. We will learn about walks, paths,
and cycles. A walk is the easiest to describe. A walk is just a sequence of vertices such that
between consecutive vertices there is an edge. So, if you are given a particular graph, a walk
in that graph is a sequence of vertices v1, v2, ..., vr. Some of these vertices could be
repeating; it does not matter. The only requirement is that for every valid i, (vi, vi+1) is
an edge in the graph.

So, when we say valid i, i can range from 1 to r minus 1. That is going to be called a walk.
A walk is just a sequence of vertices such that between vertices adjacent in the sequence
there is an edge. So, suppose this was our graph. If you take 1 2 3 1 2 3, between 1 and 2
there is an edge and between 3 and 2 there is an edge, but between 3 and 1 there is no edge,
so this is not a walk. Whereas if you take 1 2 6 3 2 6 5 4, you can verify that it is a walk;
that walk is nothing but the one indicated by this particular red line.

So, note that in this second walk that we have written, there are repeated vertices. A walk
without repeated vertices is going to be called a path. So, that gives us the next definition:
a path is a walk without repeated vertices. So, in a graph of n vertices, the longest path can
have at most n vertices in it. If you take 1 2 3 4 5 6, that is a path; if you take 1 2 6 3 5 4,
that is also a path, and this path will have 5 edges in it.

So, we have learned what a walk is and what a path is, and the next thing is: what is a
cycle? A cycle is a walk where the first and last vertices are the same and no other vertices
repeat. It is natural to call it a cycle. If we take 1 6 3 2 1 in this particular graph, that
is going to be a cycle: no vertex repeats other than the first and last. In some textbooks,
you will see that the last vertex is not explicitly mentioned, and it is just assumed that the
last vertex and the first vertex are connected. But for this course, when we write a cycle, we
will list out all the vertices in the walk, including the repeated one. The first and the last
are the same vertex, so that is a cycle.
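
The three definitions differ only in which repetitions they allow, which a short sketch makes explicit. The graph below is an assumption: a hexagon 1 to 6 with the two extra edges (2, 6) and (3, 6), chosen so that it agrees with the examples stated in the lecture (1 2 3 1 2 3 is not a walk, 1 2 6 3 2 6 5 4 is a walk, 1 6 3 2 1 is a cycle). The actual graph on the slide may contain further edges. The function names are illustrative.

```python
def is_walk(adj, seq):
    """Consecutive vertices of seq must be joined by an edge; repeats are allowed."""
    return len(seq) >= 1 and all(v in adj[u] for u, v in zip(seq, seq[1:]))


def is_path(adj, seq):
    """A path is a walk without repeated vertices."""
    return is_walk(adj, seq) and len(set(seq)) == len(seq)


def is_cycle(adj, seq):
    """A cycle is a walk whose first and last vertices agree and nothing else repeats."""
    return (
        len(seq) >= 4
        and seq[0] == seq[-1]
        and is_walk(adj, seq)
        and len(set(seq[:-1])) == len(seq) - 1
    )


# Assumed graph: hexagon 1-2-3-4-5-6-1 plus the chords (2, 6) and (3, 6).
GRAPH = {
    1: {2, 6},
    2: {1, 3, 6},
    3: {2, 4, 6},
    4: {3, 5},
    5: {4, 6},
    6: {1, 2, 3, 5},
}

print(is_walk(GRAPH, [1, 2, 3, 1, 2, 3]))
print(is_walk(GRAPH, [1, 2, 6, 3, 2, 6, 5, 4]), is_path(GRAPH, [1, 2, 6, 3, 2, 6, 5, 4]))
print(is_path(GRAPH, [1, 2, 3, 4, 5, 6]), is_cycle(GRAPH, [1, 6, 3, 2, 1]))
```

Following the convention of this course, a cycle is written with its first vertex repeated at the end, which is why `is_cycle` compares `seq[0]` with `seq[-1]`.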

(Refer Slide Time: 49:20)

Now, let us describe some more special kinds of cycles. There are two types that we will
introduce: one is called an Eulerian cycle and the other is called a Hamiltonian cycle. We can
also refer to the first as an Eulerian circuit. In the strict sense, you cannot think of it as
a cycle, because vertices may repeat. So, an Eulerian cycle is not a cycle in the sense of
vertices not repeating, but it will be a walk. An Eulerian circuit is a walk which starts at
some particular vertex and ends back at the same vertex. So, we will formally define an
Eulerian cycle as a walk where the first and last vertices are the same.

So this is the first requirement: the first and last vertices are the same. That is the notion
of a cycle. The second condition is that no edges repeat. What does it mean for edges to
repeat? After all, when we think of a walk, we have a sequence of vertices, and between any
two consecutive vertices vi and vi+1 we have an edge. So you think of an edge appearing there,
and if you look at the sequence of edges that are present in the walk, none of those edges
repeat.

And the third requirement, the crucial one, is that every edge in the graph appears at least
once. So, what it means for a graph to be Eulerian is this: if the graph has an Eulerian
cycle, then we say the graph is Eulerian. There is also a notion of an Eulerian path, where we
relax the condition that the first and the last vertices are the same but all the other
conditions should be met. So, if just condition 1 is dropped, then we will call what results
an Eulerian path.

So, now the question is: given a graph, does the graph have an Eulerian cycle? This
problem was investigated by the great mathematician Euler. Euler looked at what is now known
as the Königsberg bridge problem. The problem was initially posed in the following way. This
is a river and there are two river islands; let us call them A and B. From one island there is
a bridge to either bank, from the other there are two bridges to either bank, so from both A
and B there are bridges to either of the banks.

So this is the river, and there is also a bridge connecting the islands. Now, the question
is: can you start on one of the islands or on one of the banks, and then come back to the same
place after crossing every bridge exactly once? Let us say this is bank C and this is the
other bank D. So, can you start at C and return to C after traversing each bridge exactly
once? It is also the same as asking: can you trace this drawing without lifting your pencil
and without redrawing anything?

So, one attempt would be to start from here and go like this. But when you reach here, there
is no way you can complete that particular circuit. So, that attempt does not work, but that
does not mean that none of the attempts are going to work. Euler solved this problem and
showed that this cannot be traversed. I mean, this is not yet a graph, but it can be converted
into a graph, and certain properties of that graph would imply that there is no way of solving
this particular problem. So, we will see that. When does a graph have an Eulerian cycle and
when does a graph not have an Eulerian cycle?

(Refer Slide Time: 55:40)

So, the following theorem characterises it. We will write it as Euler's theorem, one of the
many theorems of Euler. A graph is Eulerian, that is, it contains an Eulerian cycle, if and
only if it is connected and every vertex has even degree. What does it mean to be connected?
Before we go into the proof, let us define that. Let us look at all the vertices of the graph;
we can define a relation on them.

So, we will say x is related to y if there is a path starting at x and ending at y. For
simplicity, we can replace the notion of a path with that of a walk, which makes the proof
simpler. A walk means there could be repeated vertices; a path means there are no repeated
vertices. If there is such a walk, then you can just throw away the portion of the walk
between two occurrences of a repeated vertex and get a path.

But we will just stick with the notion of a walk for defining connectedness. So, this is a
relation between vertices. Now, you can see that this is an equivalence relation. We will
say that a vertex is automatically connected to itself, so it is reflexive by definition. It
is transitive, because if x is connected to y and y is connected to z, that means there is a
walk from x to y and there is another walk from y to z. If you write these sequences one after
the other, you will get a walk from x to z.

So, this would imply that x and z are connected. So it is reflexive and transitive, and it is
also symmetric: if there is a path from x to y, then that would imply that there is also a
path from y to x, because you can take the same path in reverse. So this relation is an
equivalence relation. As we know, an equivalence relation on a set induces a partition of the
underlying set. Suppose the equivalence relation partitions V into a single class.

That is, there is only one equivalence class. Then we say that G is connected. So, we defined
when two vertices are connected; that relation is an equivalence relation, it induces a
partition, and if this partition has precisely one equivalence class, then we will say that
the graph is a connected graph. So, Euler's theorem states that if you take any graph which is
connected and in which every vertex has even degree, then the graph will have an Eulerian
cycle. We will do the proof of this in the next session. We will stop here for today.
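
Both conditions in Euler's theorem, connectedness and even degrees, are easy to test by computer. The following is a minimal sketch, not the lecturers' code: `components` computes the equivalence classes of the "connected to" relation by exploring from each unvisited vertex, and `is_eulerian` applies the criterion of the theorem. The encoding of the Königsberg bridges is an assumption, taken from the standard historical layout (island A with two bridges to each bank, island B with one bridge to each bank, and one bridge between the islands), since the slide itself is not reproduced here.

```python
from collections import defaultdict


def components(vertices, edges):
    """Equivalence classes of the relation 'there is a walk from x to y'."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, classes = set(), []
    for s in vertices:
        if s in seen:
            continue
        cls, stack = {s}, [s]
        while stack:                      # explore everything reachable from s
            u = stack.pop()
            for w in adj[u]:
                if w not in cls:
                    cls.add(w)
                    stack.append(w)
        seen |= cls
        classes.append(cls)
    return classes


def is_eulerian(vertices, edges):
    """Criterion of Euler's theorem: connected and every degree even."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:                    # parallel edges are counted separately
        deg[u] += 1
        deg[v] += 1
    return len(components(vertices, edges)) == 1 and all(d % 2 == 0 for d in deg.values())


# Assumed encoding of the seven bridges: islands A, B and banks C, D.
KONIGSBERG = [("A", "C"), ("A", "C"), ("A", "D"), ("A", "D"),
              ("A", "B"), ("B", "C"), ("B", "D")]
SQUARE = [(1, 2), (2, 3), (3, 4), (4, 1)]
TWO_TRIANGLES = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)]

print(is_eulerian("ABCD", KONIGSBERG))
print(is_eulerian([1, 2, 3, 4], SQUARE))
print(is_eulerian([1, 2, 3, 4, 5, 6], TWO_TRIANGLES))
```

The last example shows why connectedness is needed: every vertex of two disjoint triangles has even degree, yet no single closed walk can cover both triangles.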

Discrete Mathematics
Professor. Sajith Gopalan
Professor. Benny George
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 12
Trees, cycles and graph coloring

In the previous lecture, we introduced the notion of graphs, and then we defined when two
graphs are the same; we talked about isomorphism between graphs. And then we saw a theorem
called Euler's theorem.

(Refer Slide Time: 00:47)

We did not prove this. So, that is the first thing that we will do today. Euler's theorem
states that a connected graph has an Eulerian circuit if and only if the degree of every
vertex is even. Recall that an Eulerian circuit is a closed walk in the graph where no edge is
traversed twice and every edge is traversed, so every edge is traversed exactly once. For
example, consider the following graph. This graph will not have an Eulerian circuit, as per
the theorem, because if you look at these vertices, they all have degree 3. But if you modify
the graph to this particular graph, this will have an Eulerian circuit, because the degree of
every vertex is even.

I am just writing down the degrees of the vertices; these are not the vertex labels. So, this
graph will surely have an Eulerian circuit, and we will in fact construct one and convert that
construction into a proof. Let us say we start from this particular vertex. If we go about in
this particular fashion, there are lots of edges which are not traversed. We can separately
look at them. This is yet another one, so we traversed this graph by means of three cycles. We
could have done that more systematically. We could have started from here, and at this point
we can either just come back here, or we can take this detour, come back here, and then do
this. So that gives us an Eulerian circuit.

So, the method by which we drew looks arbitrary, since we arbitrarily traced the edges, but we
can essentially convert that method into a proof of our theorem. We will do that first. The
key idea is that since every vertex has even degree, if you enter any vertex, you can also
leave that vertex. You will never be stranded at any particular vertex. So, let us say we are
randomly moving from one vertex to another, without being particularly clever about which
outgoing edge we take.

For example, if we come to this particular vertex, one of its edges has been used, but the
fact that the vertex has even degree means that there is at least one edge left for leaving.
So, we start at an arbitrary vertex and keep on moving, that is, tracing edges; the only
restriction that we will impose on our walk is that we will not take the same edge twice. Now,
if you do this, what happens? We can write the following statement.

If we walk in an arbitrary fashion, we will always have an outgoing edge at every vertex. Of
course, there is a particular case that we have to be careful about: suppose we come back to a
vertex and all the edges have been exhausted at that point. So, you start at some place and
you move around, and finally come back to this place, and suppose this vertex had exactly two
edges; then what will we do? That is one situation where we have to be careful. It could be
the case that once we did this, there are other edges lying outside of this particular circuit.
What we have ruled out so far is being stuck anywhere except at the starting vertex. If we
are at an in between vertex and have no edges to move along, that means you have come there
some number of times and you have left that vertex one less time.

So formally, we can say this as follows. Suppose it is possible that we are stranded at some
vertex. Then we have entered the stranded vertex x times and left it only x minus 1 times. The
starting vertex is different, because we never enter the starting vertex at the beginning; we
just leave it. But every other vertex you enter from some other place and then you leave it.
So, you would have left it one less time than you have entered. So, the total number of edges
at that particular vertex which have been traversed is x plus x minus 1, that is 2x minus 1,
and that is an odd number. If no unused edge remained, the degree of that vertex would be odd;
since we assumed that every vertex has even degree, we will never be in such a situation.
We could of course start at a vertex and then come back to the same vertex. That is a
possibility, and at that point it could be that there are other parts of the graph which are
not traversed.

But here we can use induction. By virtue of the graph being connected, we can say that if
there are other edges which are not traversed, which are left behind, then there should be
some connection to the left out edges. What we can do is look at this path and look at the
first vertex on it which has an edge which is not used. We could start there and then
continue, and we must essentially come back to that same vertex.

We will never be stranded. The only way we can be stranded is if we have come back here. So
we can traverse along this particular path, complete that cycle and then continue along the
original cycle. This can be done inductively, and therefore that will guarantee that if the
degree of every vertex is even, then we will automatically have an Eulerian circuit. The other
part is simpler: if we had some vertex of odd degree, then there are no Eulerian circuits.
This can be seen because the Eulerian circuit goes through every edge.

So, if you look at any particular vertex and count the edges, the Eulerian circuit has
entered that vertex and left that vertex. The number of times it has entered is exactly equal
to the number of times it has left that particular vertex. Since these are equal and together
they account for the total degree of the vertex, we automatically get that every vertex must
be of even degree. So that concludes the proof of Euler's theorem. The next thing that we will
learn today is another kind of walk on graphs, called Hamiltonian cycles.
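
The constructive half of the proof (walk without reusing edges until you are stuck, which can only happen back at the start of the current cycle, then splice in further cycles from vertices that still have unused edges) is usually known as Hierholzer's algorithm. The following stack-based formulation is one common way to implement that splicing; it is a sketch and not the lecturers' code, and the function name and the example graph (two triangles sharing vertex 1) are assumptions. It assumes the input is connected with every degree even.

```python
def eulerian_circuit(edges, start):
    """Return an Eulerian circuit as a list of vertices.

    Assumes the graph is connected and every vertex has even degree.
    """
    adj = {start: []}
    for i, (u, v) in enumerate(edges):       # remember edge ids so each edge is used once
        adj.setdefault(u, []).append((v, i))
        adj.setdefault(v, []).append((u, i))
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        u = stack[-1]
        # Drop edges at u that were already traversed from their other end point.
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()
        if adj[u]:
            v, i = adj[u].pop()               # leave u along an unused edge
            used[i] = True
            stack.append(v)
        else:
            circuit.append(stack.pop())       # u is exhausted: it joins the circuit
    return circuit[::-1]


# Hypothetical example: two triangles sharing vertex 1, so every degree is even.
BOWTIE = [(1, 2), (2, 3), (3, 1), (1, 4), (4, 5), (5, 1)]
circuit = eulerian_circuit(BOWTIE, 1)
print(circuit)
```

A vertex is moved from the stack to the circuit only when all of its edges are used up, which is exactly the moment at which, in the proof, no further cycle can be spliced in at that vertex.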

(Refer Slide Time: 09:55)

So, we will first define what a Hamiltonian cycle is. A Hamiltonian cycle is a cycle in which
every vertex of the graph is visited exactly once. It is a cycle in G, so we are talking about
a Hamiltonian cycle in a graph G. Of course, the natural question to ask is: given a graph,
does it have a Hamiltonian cycle or not? We will see some examples. If you take the
complete graph on n vertices, this of course has many Hamiltonian cycles. You can start at
any vertex and go to any other vertex in this particular graph.

So, this is the complete graph on n vertices; here the value of n is 5. So, K5 has a
Hamiltonian cycle. If you look at, let us say, the cycle graph, this is also a graph which
contains a Hamiltonian cycle. Whereas if you take this particular graph, it does not have a
Hamiltonian cycle. Let us say we number this vertex as vertex 1. If you start at
vertex 1, you can never end up back at vertex 1, and if you start at any other vertex, the
moment you reach vertex 1 you are stranded there. Therefore this is a graph which
does not have a Hamiltonian cycle. In these examples it was easy to see whether the graph
contains a Hamiltonian cycle or not. Unlike the Eulerian cycle problem, the Hamiltonian cycle
problem is not very easy to solve. So, we will see a particular example. We will again
look at the Petersen graph. This is a graph with 10 vertices and 15 edges, and we want to
know whether this graph contains a Hamiltonian cycle.

You can look at the graph, stare at it for a few minutes and figure out whether it contains a
Hamiltonian cycle. If it does not contain a Hamiltonian cycle, what we have to show is that
no matter how you walk in this graph, you can never start at a vertex and reach back at the
same vertex after visiting every vertex exactly once. In fact, that is what we will show:
that this graph does not have a Hamiltonian cycle. The Petersen graph does not contain a
Hamiltonian cycle. Why is that the case? We will prove this claim via a very crucial
observation about the Petersen graph.
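
Before going through the proof, the claim can be sanity-checked by exhaustive search, since the Petersen graph has only 10 vertices. The following backtracking search is a sketch and not part of the lecture's argument; the function name, the vertex names (taken from the adjacency lists of the previous lecture) and the small comparison graphs are assumptions made for illustration. In general this search can take exponential time.

```python
def has_hamiltonian_cycle(adj):
    """Backtracking search for a cycle visiting every vertex exactly once."""
    vertices = list(adj)
    n = len(vertices)
    if n < 3:
        return False
    start = vertices[0]          # a Hamiltonian cycle passes through every vertex
    visited = {start}

    def extend(u, count):
        if count == n:           # all vertices used: can we close the cycle?
            return start in adj[u]
        for w in adj[u]:
            if w not in visited:
                visited.add(w)
                if extend(w, count + 1):
                    return True
                visited.remove(w)
        return False

    return extend(start, 1)


PETERSEN = {
    "a": ["a1", "b", "e"], "b": ["a", "b1", "c"], "c": ["b", "d", "c1"],
    "d": ["d1", "e", "c"], "e": ["e1", "a", "d"],
    "a1": ["c1", "d1", "a"], "b1": ["d1", "e1", "b"], "c1": ["a1", "e1", "c"],
    "d1": ["a1", "b1", "d"], "e1": ["c1", "b1", "e"],
}
K5 = {i: [j for j in range(5) if j != i] for i in range(5)}
C6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
# A triangle 2-3-4 with a pendant vertex 1 attached to 2 (vertex 1 has degree 1).
PENDANT = {1: [2], 2: [1, 3, 4], 3: [2, 4], 4: [2, 3]}

for name, g in [("K5", K5), ("C6", C6), ("pendant", PENDANT), ("Petersen", PETERSEN)]:
    print(name, has_hamiltonian_cycle(g))
```

Fixing the start vertex loses nothing, because a Hamiltonian cycle can be started at any of its vertices.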

So, let us call this Petersen graph G. The first claim is that G does not have any 4 cycle.
This is a crucial fact that we are going to use. How do we see that this is a graph without a
4 cycle? If you look at this graph carefully, you will see that there are two cycles in this
graph, each of length 5. There are many 5 cycles, but you can see this outer pentagon and the
inner star, which can also be drawn as a pentagon.

So, if I number the outer ones as 1, 2, 3, 4, 5 and the inner ones as 6, 7, 8, 9, 10, you
can see that 6, 7, 8, 9, 10 also form a cycle. So 1, 2, 3, 4, 5 is a pentagon or a 5
cycle, and 6, 7, 8, 9, 10 also form a pentagon. Now, if you consider each of these cycles,
there are no other edges within it: for example, between 1 and 3 there is no direct edge, and
between 5 and 3 also there is no edge. So these are two disjoint cycles, and if you restrict
the graph to the vertices of one of them, there are no smaller cycles in it.

So, if I look at the graph consisting of just the vertices 1 to 5, it does not have a 4 cycle,
and if I restrict the graph to the vertices 6 to 10, it also does not have a 4 cycle.
Therefore we can say that if at all there was a 4 cycle, it should have edges which go from
cycle 1 to cycle 2. Any 4 cycle should have an edge which goes from the outer cycle to the
inner cycle. I am going to draw the edges from the outer cycle to the inner cycle in red; at
least one of these red edges must be present if there is any 4 cycle.

If there is a 4 cycle, then that 4 cycle must contain one of these red edges. Without red
edges, it is impossible to have a 4 cycle. What we will show is that even with the red edges,
there is no chance of having a 4 cycle. So let us say that one of these red edges is there.
From the symmetry, we can assume that all the red edges are of the same kind. So if there is
a 4 cycle involving one of the red edges, then certainly there are 4 cycles involving any
other red edge as well.

That is from the symmetry of this diagram. So let us say (5, 6) is an edge which is there on
some 4 cycle. From 5, if there is a 4 cycle, there are only 2 possible edges to continue
along, (5, 1) and (5, 4), and from symmetry again we can just look at (5, 1). If there was a
4 cycle, 3 of its vertices are already in place: 6, 5 and 1. For a 4 cycle, 6 and 1 should
both be connected to a common vertex other than 5. And you can see that there is no such
vertex: there is no other vertex which is connected to both 1 and 6.

That is, you cannot have a vertex x other than 5 such that (6, x) is an edge and (1, x) is an
edge. So that basically means that the Petersen graph does not have any 4 cycle. So how is
this fact going to help us show that the Petersen graph does not have a Hamiltonian cycle?

(Refer Slide Time: 17:58)

So at the moment, we have shown that the graph G, the Petersen graph, does not have a 4
cycle. We need to show that G does not have any Hamiltonian cycle. So, suppose G did have a
Hamiltonian cycle; then all the vertices could be traversed in a systematic manner. Let us
say the ordering of the vertices is V1 to V2 to V3 and so on up to V10, and back to V1. So
there is some order in which we can place these 10 vertices. I do not know whether V1 is the
vertex 1 or V2 is the vertex 2 and so on, but there is some way that you can traverse all of
them.

So, let us assume that starting vertex is V1. Now, this accounts, i means if you look at these
10 cycles, that accounts for 10 of the edges in the graph. Now we can think about what are
the other edges present. Just look at the Hamiltonian cycle, if there was one, and let us just
draw it and we will get a cycle consisting of 10 edges. And there are 5 other remaining edges,
which we have to add to this, to get our original graph.

Where all can we add these edges? V1 is an arbitrary vertex: if you have one Hamiltonian
cycle, you can start at any particular vertex of the cycle and traverse from that point, and
you again get a Hamiltonian cycle. So we can assume that V1 is equal to vertex number 1. Now
from V1 there are 3 edges. Two of them have already been accounted for by the cycle; the
third edge must go somewhere. Where all can it go? We have shown that G does not have a 4
cycle, and it is apparent that G also does not have any triangles: if you take any 3 vertices,
they are not going to form a triangle. So G does not have any 4 cycle or 3 cycle. If V1 were
joined to V3 or V9, that would make a triangle in the graph, and the graph does not have a
triangle. If V1 were joined to V4 or V8, then you would have a 4 cycle.

So, now the only 3 options are that V1 could be connected to V5, V6 or V7. We can think of V6
as the vertex diametrically opposite to V1. Note that each of the vertices 1 to 10 has 1 edge
missing, and together there are 5 edges missing. Now, can it be the case that every vertex is
joined to its diametrically opposite vertex on this cycle?

By diametrically opposite, what I mean is that V1 and V6 are diametrically opposite, V2 and
V7 are diametrically opposite, V3 and V8 are diametrically opposite, and so on. So, can every
vertex here be joined to its diametrically opposite vertex? If that was the case, V1 is joined
to V6 and V2 is joined to V7, but that automatically creates a 4 cycle: V1, V6, V7, V2.
So, this is not possible.

So, there is at least 1 vertex which is not joined to its diametrically opposite vertex. We
may assume, without loss of generality, that this vertex is V1. Then V1 is joined to one of
V5 and V7, and these are symmetric cases, so we will just say that V1 is connected to V5. So,
our starting point was the assumption that there is a Hamiltonian cycle; we argued that 5
edges are missing from it, and that the missing edge at the starting vertex V1 goes to vertex
V5.

So, this is part of the graph. There are other edges also. If you look at vertex V6, where all
can V6 be connected to? V6 also needs a third edge, but based on our claim that G does not
have a 3 cycle or a 4 cycle, we cannot connect V6 to any other vertex. V6,V8 and V6,V4 are
ruled out because they create 3 cycles. V6,V9 and V6,V3 are ruled out because they create 4
cycles; for example V6, V9, V8, V7 forms a 4 cycle.

V7 and V5 are anyway connected to V6 along the cycle, so they cannot be considered. V1 is
ruled out because its third edge already goes to V5. The only remaining vertices are V10 and
V2. If you look at V6,V10, you again have a 4 cycle: V6, V10, V1, V5. In the same way, V6,V2
gives the 4 cycle V6, V2, V1, V5.

So we cannot have any third edge out of V6. That contradicts the fact that every vertex of
the Petersen graph has degree 3: V6 must have an edge to some vertex other than V7 and V5,
and we argued that it cannot. So by this argument we could show that the Petersen graph is a
graph without any Hamiltonian cycle.
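
The case analysis above can also be checked by brute force. The following is a minimal Python sketch, not part of the lecture: it builds the Petersen graph under one assumed labelling (outer cycle 1 to 5, inner pentagram 6 to 10, spokes from i to i + 5, which may differ from the labelling on the slide) and searches exhaustively for triangles, 4 cycles and a Hamiltonian cycle. All function names are illustrative.

```python
from itertools import combinations


def petersen_graph():
    """Adjacency sets of the Petersen graph on vertices 1..10 (assumed labelling)."""
    adj = {v: set() for v in range(1, 11)}
    for i in range(5):
        outer, inner = i + 1, i + 6
        pairs = [
            (outer, (i + 1) % 5 + 1),  # outer cycle 1-2-3-4-5-1
            (inner, (i + 2) % 5 + 6),  # inner pentagram 6-8-10-7-9-6
            (outer, inner),            # spoke between the two cycles
        ]
        for a, b in pairs:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def has_triangle(adj):
    # A triangle exists iff the two ends of some edge share a neighbour.
    return any(adj[a] & adj[b] for a in adj for b in adj[a])


def has_four_cycle(adj):
    # A 4 cycle a-x-b-y-a exists iff some pair a, b has two common neighbours.
    return any(len(adj[a] & adj[b]) >= 2 for a, b in combinations(adj, 2))


def has_hamiltonian_cycle(adj):
    # Backtracking search; as in the lecture, the start vertex can be fixed.
    start = next(iter(adj))
    n = len(adj)

    def extend(v, visited):
        if len(visited) == n:
            return start in adj[v]  # close the cycle back to the start
        return any(extend(w, visited | {w}) for w in adj[v] if w not in visited)

    return extend(start, {start})


g = petersen_graph()
print(has_triangle(g), has_four_cycle(g), has_hamiltonian_cycle(g))
# prints: False False False
```

The exhaustive search is only feasible because the graph has 10 vertices of degree 3; it confirms the conclusion but does not replace the argument.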

So, this required careful examination of many of the properties of the Petersen graph. We did
not have a general result like the one we had in the case of Eulerian cycles, where we just
have to look at the degrees of the vertices: if every vertex has even degree, then every such
connected graph is going to have an Eulerian cycle. So, now we will move on to some different
notions. We learned about two different types of cycles, the Eulerian cycle and the
Hamiltonian cycle. The next special kind of graphs that we will look at are called trees.

(Refer Slide Time: 26:32)

So, trees are graphs which have two crucial properties. Let us call the graph T. The first
requirement is that T should be connected. The second requirement is that T should not
contain any cycle. So, let us see some examples. If you look at this graph on 6 vertices, this
is called a star graph. This graph does not have any cycles, and it is a connected graph, so
it is a tree. This next one is a path, with the vertices arranged in a line. This is also an
example of a tree.

If you look at the wheel graph, this is not a tree, because there are many different cycles in
this graph. You can have trees of different kinds. This is an example of a rooted binary tree:
rooted because, when we draw it like this, the top vertex signifies the root, and binary
because every internal node has two children. We will study those things later. But we can
see that this is a graph in which there are no cycles and that it is connected. So, these
kinds of graphs are called trees. We will see some crucial, simple properties of trees.

(Refer Slide Time: 28:32)

Let T be a tree. The first fact is that there is a unique path from any vertex v to any other
vertex u. So choose two vertices; between those two vertices there is a unique path, whenever
the graph is a tree. We will prove all these facts. The second fact is that T has n minus 1
edges, where n is the number of vertices. The third fact is that removal of any edge results
in precisely two disjoint trees. So if you take a tree and remove one edge from it, you get
another graph which has two connected components, and both of those connected components will
themselves be trees.
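
Before the proofs, the three facts can be checked mechanically on a small example. This is an illustrative Python sketch under assumptions of my own: the 6 vertex tree and the helper names (`components`, `is_acyclic`, `count_paths`) are hypothetical and do not come from the lecture.

```python
from itertools import combinations


def adjacency(vertices, edges):
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def components(vertices, edges):
    """Connected components, each returned as a set of vertices."""
    adj = adjacency(vertices, edges)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v] - comp:
                comp.add(w)
                stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def is_acyclic(vertices, edges):
    # Union-find: an edge whose two ends are already joined closes a cycle.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
    return True


def is_tree(vertices, edges):
    return len(components(vertices, edges)) == 1 and is_acyclic(vertices, edges)


def count_paths(adj, u, v, visited=frozenset()):
    """Number of simple paths from u to v."""
    if u == v:
        return 1
    visited = visited | {u}
    return sum(count_paths(adj, w, v, visited) for w in adj[u] if w not in visited)


vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6)]  # a hypothetical small tree
adj = adjacency(vertices, edges)

fact1 = all(count_paths(adj, u, v) == 1 for u, v in combinations(vertices, 2))
fact2 = len(edges) == len(vertices) - 1
fact3 = True
for e in edges:
    rest = [f for f in edges if f != e]
    parts = components(vertices, rest)
    fact3 = fact3 and len(parts) == 2 and all(
        is_tree(part, [f for f in rest if f[0] in part]) for part in parts
    )
print(fact1, fact2, fact3)
# prints: True True True
```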

So, we will quickly see a proof of these facts. Look at any particular vertex u and any other
vertex v; we want to say that there is precisely one path between them. Let us assume the
contrary, that is, that there is more than one path between u and v. So, let us say this is
some path, and then there is some other path. These two paths start at the same vertex and
end at the same vertex, and they are different, so clearly there is a first place at which
they diverge.

So, let us say this is the point at which they diverge; call it i. Then of course they will
come together at some place, because surely they come together at v. Call the first point
after i where they come together j. So, in between i and j, the two paths do not have any
common vertices. All of these are edges in the graph.

Now, if you start walking from i to j along path 1, and then from j back to i along path 2,
that is basically a cycle in the original graph. So if there is more than one path, we can
engineer a cycle in T. But the original graph, by virtue of being a tree, cannot contain any
cycle. So it cannot be the case that there is more than one path; there is at most one path.
How do we say that there is at least 1 path? Because this is a connected graph, you can go
from any vertex to any other vertex. So that proves statement 1, that there is a unique path.
We will now prove statement 3, that removal of any edge results in two disjoint trees. How do
we show this? Let us look at one particular edge u,v. So there is a direct edge between u and
v.

Suppose we remove this edge. Now, let Su be the set of all the vertices which are still
connected to u, and Sv the set of all the vertices which are still connected to v. After
removal of this edge there cannot be any third kind of vertex, which is connected neither to
u nor to v. In the original tree, such a vertex x had a path to u. If that path did not use
the edge u,v, it survives the removal and x is in Su; if it did use u,v, then the path reaches
v before crossing that edge, so x is in Sv. Therefore every vertex of the original graph
belongs either to Su or to Sv.

Now, if you remove u,v, Su and Sv are not going to be connected to each other. Why? Because if
they were connected, then it would mean that in the original tree there is more than 1 path
from u to v. There is the direct path along the edge u,v. And if some edge joined a vertex u
prime in Su to a vertex v prime in Sv, you could go from u to u prime, from u prime to v
prime, and from v prime back to v.

So, that is an alternate path. But by our first fact, we know that there is exactly one path.
So Su and Sv are disconnected components. By definition Su consists of all those vertices
which are connected to u, so it is a connected part, and of course there is no cycle in Su: if
there were a cycle in Su, then there would be a cycle in the original graph.

So, this is a cycle free connected part and therefore that is a tree. And Sv is also another
tree. So, we know the removal of any edge results in exactly two disjoint trees. And now we
can prove the second fact, that any tree has exactly n minus 1 edges. We are going to prove
this statement by induction on the number of vertices. If you take a tree with 1 vertex,
clearly there are no edges in it, so the statement is true in the base case. The induction
hypothesis is that the statement is true for every tree with fewer than n vertices. Now take
any tree T with n vertices, n greater than 1. Since T is connected, surely there is at least
one edge. Remove any one such edge u,v, and we get two components, let us say Su and Sv. Each
of them is a tree with fewer than n vertices, and therefore, by induction,

Su has |Su| minus 1 edges, where |Su| means the number of vertices in that component, and Sv
has |Sv| minus 1 edges. So the total number of edges is (|Su| minus 1) plus (|Sv| minus 1),
plus 1 for the edge u,v that we removed. Since |Su| plus |Sv| is n, that is n minus 2 plus 1,
which is equal to n minus 1. So, any tree will have exactly n minus 1 edges. We will just
define something for later: the notion of a forest. A forest is just a collection of trees
which are disjoint. If the collection has k disjoint trees and the total number of vertices is
n, then it has n minus k edges, which can be proven by applying fact 2 to each tree. When we
have just 1 connected component, the number of edges is n minus 1. I will start off with the
next topic, which is known as colouring.

(Refer Slide Time: 38:35)

So, we will define what are called colourings. We have a graph and we want to colour the
vertices. What a valid colouring is, is what we will define first. Then we will look at a
simple algorithm to colour a graph. So let us formally define what a colouring is. A colouring
is basically a function c from the vertex set to the set of natural numbers which satisfies a
certain requirement. The requirement is that c(x) should not be equal to c(y) if (x, y) is an
edge. Although we are writing the edge as an ordered pair (x, y), since we are looking at
undirected graphs, we essentially mean the set {x, y}. So, each vertex is assigned a number,
and vertices which share an edge should be assigned different numbers. So, let us look at our
Petersen graph. If we give, let us say, red colour to this particular vertex, then vertices 2,
8 and 5 cannot be given red colour. They should be given a different colour.

These colours you could think of as numbers. So vertex 2 should get a different colour.
Vertex 8 can be given the same colour as vertex 2; there is no problem, because there is no
edge between 2 and 8. So, these vertices can be coloured in this particular manner, and then
10 can be given the red colour itself. 7 can again be given the red colour, but 6 cannot be
given the red colour, because 6,7 is an edge, and it cannot be given the yellow colour either,
so it should be given some other colour, let us say blue. Now if you look at vertex 9, that
also can be given the blue colour; vertex 9 cannot be given either red or yellow. And if you
now look at vertex 4, that can be given the red colour, and vertex 3 can be given the blue
colour. Vertex 3 cannot be given red or yellow, because it has neighbours which are coloured
with those colours. So, here we can see that the Petersen graph can be coloured using 3
colours. If you say red is 1, blue is 2 and yellow is 3, you get a function from the vertex
set to the natural numbers.
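
The colouring requirement translates directly into a one line check. Below is a small Python sketch; the labelling of the Petersen graph (outer cycle 1 to 5, inner pentagram 6 to 10) and the particular 3 colouring are assumptions chosen for illustration and need not match the labelling or the colouring drawn on the slide.

```python
# Petersen graph under an assumed labelling (may differ from the slide).
PETERSEN_EDGES = [
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 1),    # outer cycle
    (6, 8), (8, 10), (10, 7), (7, 9), (9, 6),  # inner pentagram
    (1, 6), (2, 7), (3, 8), (4, 9), (5, 10),   # spokes
]


def is_valid_colouring(edges, c):
    """c maps each vertex to a natural number; ends of every edge must differ."""
    return all(c[x] != c[y] for x, y in edges)


# One 3 colouring for this labelling, with colours written as numbers 1, 2, 3.
three_colouring = {1: 1, 2: 2, 3: 1, 4: 2, 5: 3, 6: 2, 7: 3, 8: 3, 9: 1, 10: 1}
print(is_valid_colouring(PETERSEN_EDGES, three_colouring))  # True

# Giving every vertex the same colour violates the requirement on every edge.
one_colour = {v: 1 for v in range(1, 11)}
print(is_valid_colouring(PETERSEN_EDGES, one_colour))  # False
```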

So that is the notion of colouring. Clearly, if you have n vertices you can surely colour the
graph using n colours, and if you take the complete graph on n vertices, you would require n
colours. But what we are interested in is, given a graph, how do we determine the minimum
number of colours required to colour it. The least number of colours required to colour a
graph G is called the chromatic number of the graph, and it is denoted by χ(G). So, what we
are interested in is, given a graph, how do we determine its chromatic number? This is a
difficult problem for arbitrary graphs; finding an algorithm which will determine this is not
an easy task. We will look at a greedy algorithm for colouring and we will show that the
chromatic number is certainly at most some quantity. So, we will basically prove the
following theorem.

Suppose the maximum degree in G is k, that is, the vertex which has the maximum degree amongst
all the vertices has degree k. Then χ(G), the chromatic number, is less than or equal to k
plus 1. That is the first statement. The second statement says: if G is connected and not
regular, then χ(G) is less than or equal to k. To say what not regular means, we will first
define regular graphs. A graph is called regular if every vertex has the same degree.

If you look at our Petersen graph, the degree of every vertex is exactly 3. So, this is an
example of a 3 regular graph. If a connected graph is not regular, then it requires at most k
colours. If you look at this cycle graph, every vertex has degree exactly 2, so χ(G) is less
than or equal to 3; that is what the first statement says. But this is an example of a graph
which is regular, so the second statement does not apply to it. If you had taken some other
connected graph, where the degree is at most 2 but which is not regular, then you can argue
that you do not require more than 2 colours. So, the proof of this theorem will be via an
algorithm.
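
For very small graphs, the chromatic number can be found by trying every assignment of colours, which makes it possible to compare χ(G) with the maximum degree k on the examples just mentioned. The sketch below is an illustration only (exponential time, helper names are my own) and is not the algorithm used in the proof.

```python
from itertools import product


def chromatic_number(vertices, edges):
    """Smallest number of colours admitting a valid colouring (brute force)."""
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        for assignment in product(range(1, k + 1), repeat=len(vertices)):
            colour = dict(zip(vertices, assignment))
            if all(colour[x] != colour[y] for x, y in edges):
                return k
    return 0  # only reached for a graph with no vertices


def max_degree(vertices, edges):
    degree = {v: 0 for v in vertices}
    for x, y in edges:
        degree[x] += 1
        degree[y] += 1
    return max(degree.values())


def cycle(n):
    return list(range(n)), [(i, (i + 1) % n) for i in range(n)]


def path(n):
    return list(range(n)), [(i, i + 1) for i in range(n - 1)]


for name, (vs, es) in [("pentagon", cycle(5)), ("square", cycle(4)), ("path", path(4))]:
    print(name, chromatic_number(vs, es), max_degree(vs, es))
# prints:
# pentagon 3 2
# square 2 2
# path 2 2
```

The pentagon is regular and needs k + 1 = 3 colours, while the path is connected and not regular and needs only k = 2, in line with the two statements of the theorem.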

(Refer Slide Time: 45:45)

So, we will show that the greedy colouring algorithm will correctly colour the graph without
using more than k plus 1 colours. So, what is the greedy colouring algorithm? The algorithm is
very simple. Let us describe it systematically. Choose an order on the vertices; let us say
you are going to examine the vertices in the order V1, V2, ..., Vn. The colour of V1 we will
take to be 1. At any stage, suppose you are colouring the i-th vertex. Look at the least
number that can be assigned to it without violating the colouring property.

So look at the vertex Vi and look at its neighbours. Some of them would already have got a
colour, and you choose for Vi the colour which is least amongst the acceptable colours.
Acceptable colours means any number which is not the colour of an already coloured neighbour.
Consider the set of colours that the neighbours have already used and remove those colours
from the natural numbers; whatever is the least remaining colour is the least amongst the
acceptable colours, and that is the colour that you choose for Vi.

This is the greedy colouring algorithm. Now, whenever you are examining a particular vertex,
it has at most k neighbours. So if you look at the set of numbers 1 to k plus 1, at least 1 of
these numbers is not used by any neighbour, and that number can surely be chosen as the colour
for Vi. So we are using only colours from 1 to k plus 1, and using these colours you can
colour the entire graph. Therefore, the greedy colouring algorithm finds an acceptable
colouring in which you do not use any colour greater than k plus 1.
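
The greedy colouring algorithm described above can be sketched in a few lines of Python. The function and variable names are illustrative, and the Petersen graph is again written under an assumed labelling (outer cycle 1 to 5, inner pentagram 6 to 10) that may differ from the slide.

```python
def greedy_colouring(adj, order):
    """Colour vertices in the given order with the least acceptable colour."""
    colour = {}
    for v in order:
        # Colours already used by the neighbours coloured so far.
        used = {colour[w] for w in adj[v] if w in colour}
        c = 1
        while c in used:  # least natural number not used by a neighbour
            c += 1
        colour[v] = c
    return colour


# Petersen graph under an assumed labelling (may differ from the slide).
edges = [
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 1),
    (6, 8), (8, 10), (10, 7), (7, 9), (9, 6),
    (1, 6), (2, 7), (3, 8), (4, 9), (5, 10),
]
adj = {v: set() for v in range(1, 11)}
for x, y in edges:
    adj[x].add(y)
    adj[y].add(x)

colour = greedy_colouring(adj, order=range(1, 11))
k = max(len(nb) for nb in adj.values())  # maximum degree, 3 here
print(max(colour.values()) <= k + 1)     # True: never more than k + 1 colours
print(all(colour[x] != colour[y] for x, y in edges))  # True: a valid colouring
```

The number of colours greedy ends up using depends on the order chosen; the bound of k + 1 holds for every order.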

The second part of the theorem states that if you have a connected graph which is not
regular, then you can colour it with at most k colours. So the max degree is k, and if the
graph is not regular, then k colours suffice. If the graph is regular, we cannot guarantee
that it is colourable using k colours. For example, if you take the pentagon, which is a 2
regular graph, we can see that we require at least 3 colours. But if you look at, let us say,
the square, this is a 2 regular graph and it can be coloured with 2 colours. The second
statement just says that if you are guaranteed that the connected graph is not regular, then
you do not need more than the max degree number of colours. So, let us prove that part. Again
we are going to use our greedy colouring algorithm, and what we will do is fix the order in
which the vertices are to be coloured.

(Refer Slide Time: 49:56)

So we want to fix the order of the vertices V1 to Vn. We will start describing this by first
stating what Vn is. What we know is that the max degree is equal to k, and that this is not a
regular graph. If it is not a regular graph, then there is at least 1 vertex whose degree is
strictly less than k. We will choose that vertex as Vn. Vn is going to have some neighbours
in the graph.

So, suppose this is the graph G and Vn is this particular vertex. It is connected to some
other vertices; there can be at most k minus 1 of them. We first write all of them, in any
order, as Vn-1, Vn-2, up to Vn-r. The only requirement is that r is at most k minus 1. Then we
write down the neighbours of Vn-1, then the neighbours of Vn-2, and so on.

So the first block is the neighbours of Vn, the second block is the neighbours of Vn-1, and
so on. Of course, when you are writing the neighbours of Vn-1, you do not include vertices
which have already been listed, such as Vn itself or common neighbours with Vn. Only the
fresh neighbours of Vn-1, those which have not been included so far, go into this block, and
you continue systematically. Since it is a connected graph, this will list out all the
vertices, and finally you will get V1.

So if you look at the vertices in the order V1 to Vn and apply greedy colouring in this
particular order, we can argue that we do not require more than k colours at any point. Look
at any vertex Vi that you pick. At least one of its neighbours is further ahead in the order,
because Vi was listed as a fresh neighbour of some vertex that comes after it. The exception
is Vn: Vn does not have any neighbour ahead of it, all its neighbours appear before it. So if
you take Vi and look at all its neighbours, since at least one of the neighbours comes later,
the number of colours used by the greedy colouring algorithm on the already coloured
neighbours of Vi is at most k minus 1.

That is because the degree of Vi is less than or equal to k, and at least one neighbour comes
after Vi and is not yet coloured, so the number of colours used so far by its neighbours is
strictly less than k. So there is at least one colour left amongst the colours 1 to k, and
that colour can be used for Vi. Of course, this argument does not work for Vn. But for Vn we
are anyway guaranteed that its degree is less than k, because that is how we chose Vn. So
even for Vn, we will have a colour left when we are colouring via the greedy colouring method.
So, that concludes the proof.
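
The ordering used in this proof is a breadth first listing from Vn, read backwards. Here is a minimal Python sketch of that construction followed by greedy colouring. The test graph (the Petersen graph, under an assumed labelling, with the edge 1,2 removed so that it is connected but no longer regular) and all names are assumptions made for illustration.

```python
from collections import deque


def low_degree_last_order(adj):
    """Order V1..Vn for a connected, non-regular graph, as in the proof."""
    k = max(len(nb) for nb in adj.values())
    # Vn: some vertex of degree strictly less than k (exists if not regular).
    vn = next(v for v in adj if len(adj[v]) < k)
    listed, seen, queue = [vn], {vn}, deque([vn])
    while queue:
        v = queue.popleft()
        for w in sorted(adj[v]):   # fresh neighbours of v form the next block
            if w not in seen:
                seen.add(w)
                listed.append(w)
                queue.append(w)
    return listed[::-1]            # listing was Vn, Vn-1, ...; reverse it


def greedy_colouring(adj, order):
    colour = {}
    for v in order:
        used = {colour[w] for w in adj[v] if w in colour}
        colour[v] = next(c for c in range(1, len(adj) + 2) if c not in used)
    return colour


# Petersen graph (assumed labelling) with the edge 1,2 removed.
edges = [
    (2, 3), (3, 4), (4, 5), (5, 1),
    (6, 8), (8, 10), (10, 7), (7, 9), (9, 6),
    (1, 6), (2, 7), (3, 8), (4, 9), (5, 10),
]
adj = {v: set() for v in range(1, 11)}
for x, y in edges:
    adj[x].add(y)
    adj[y].add(x)

k = max(len(nb) for nb in adj.values())
order = low_degree_last_order(adj)
colour = greedy_colouring(adj, order)
print(max(colour.values()) <= k)  # True: at most k = 3 colours are used
```

Every vertex except the last one in `order` has the vertex that first listed it later in the order, which is exactly the property the proof relies on.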

Discrete Mathematics
Professor. Sajith Gopalan
Professor. Benny George
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati.
Lecture 13
Bipartite Graphs

(Refer Slide Time: 00:32)

So, in this lecture, we will learn about two things. One is bipartite graphs and the second
is minimal spanning trees. Let us first define what bipartite graphs are. In the previous
lecture, we talked about coloring the vertices of a graph. Graphs which can be vertex colored
using two colors are referred to as bipartite graphs. So, let us write down the definition:
two colorable graphs are called bipartite graphs.

So, if you have any graph and it can be colored using two colors, then it is called a
bipartite graph. The reason why it is called bipartite is that it naturally splits into two
parts. Let us say these are all the vertices getting color 1; call this set A. And these are
all the vertices getting color 2; call this set B. Now, if you look at the edges of the graph,
every edge is of this kind: one end is in A and the other end is in B. This is how every
bipartite graph looks.

One end is in one part and the other end is in the other part. If you had an edge in which
both ends were on the same side, that would mean that the edge is not properly colored: its
end points received the same color, and that is not allowed. So, bipartite graphs are
essentially two colorable graphs. We will give an alternate characterization of bipartite
graphs. We write it as a theorem and we will prove it. A graph is bipartite if and only if it
contains no cycles of odd length. So look at all the cycles in the graph: if none of those
cycles is of odd length, then it will invariably be a bipartite graph. And if you take a
bipartite graph, it cannot contain any cycle of odd length. Let us try and prove this theorem.

(Refer Slide Time: 03:50)

One direction is easy. Suppose there is an odd cycle, that is, a cycle in the graph having an
odd number of edges. Now, suppose this graph was two colorable; then all the vertices of the
cycle would have got some color. Start at any vertex of the cycle. If you give color 1 to it,
its neighbor must get a different color, so that must be color 2; there is no other option.
The next vertex should be of color 1, the next should be 2, and the next should again be 1.
Since the cycle has odd length, that would cause the last edge to have both its end points of
the same color, so that is not a valid coloring. This happens in any odd length cycle: start
with any vertex, and by the time you come back to it, as the colors have to alternate, the
last edge cannot be colored properly.

So, an odd length cycle implies that the graph cannot be colored using two colors, and hence
the graph cannot be bipartite. The contrapositive of this statement is that bipartite implies
no odd cycles. Now we need to show the other direction, that is, a graph which does not
contain any odd cycle is bipartite. So let us look at any such graph; we are going to color
its vertices in a particular order.

So, let G be a graph without odd cycles; we will show that it is bipartite. We may take G to
be connected; otherwise we do the same thing for each connected component. Let us choose a
particular vertex v1. We are going to construct an ordering, and for this we first look at all
the vertices which are adjacent to v1. That is what we will call level 1: level 1 consists of
all the vertices which are neighbors of v1.

Then we look at the neighbors of the vertices in level 1. All their neighbors which have not
already been placed, so excluding v1 and the vertices of level 1, we put in the next level,
level 2. We continue this, and we get a leveled arrangement of the vertices. Level 0 is the
vertex v1, then there are the level 1 vertices, their new neighbors give you level 2, and so
on. So, this will have some number of levels.

What we will show is that every edge of the original graph passes from level i to level i
plus 1 or to level i minus 1. First, an edge cannot jump two levels: if some vertex at level 1
were connected to a vertex at level 3, then that vertex would automatically have been placed
in level 2, because it is a neighbor of a level 1 vertex. So, there is no possibility of
jumping levels.

So, all the neighbors are accounted for. For example, if you take a node here, some of its
neighbors are going to be in the layer below it and some are going to be in the layer above
it. What we have argued is that no edge can skip a layer; this is forbidden. Now, can there be
an edge between two vertices u and v in the same level? We will argue that this is also not
possible. Suppose there was one such edge within a level; call that level i. Then u and v
would have some common ancestor. Of course v1 is their common ancestor, but there could be a
common ancestor at a layer closer to level i than v1 is. Take the common ancestor closest to
level i. There is a path from that common ancestor to u and a path from it to v. Both these
paths have length exactly the height difference between the two layers, so they are of equal
length, and because we took the closest common ancestor they share no other vertex.

So there is a path from the common ancestor to u, a path from the common ancestor to v, and
an edge between u and v. Together, this forms an odd cycle, because one path has length x,
the other path has length x, and the edge is of length 1. 2x plus 1 being an odd number, you
have a cycle of odd length. But we assumed that G is a graph without odd cycles. So, in a
graph without odd cycles, this is not possible.

So, all the edges are between adjacent layers, and within a layer there are no edges. For any
graph without odd cycles, we can construct this leveled arrangement. Now if you give one
color to the odd layers and another color to the even layers, you get a valid coloring using
just two colors. That is because between two different odd layers the distance is at least
two, and therefore there is no edge between them, and within a layer there are no edges; the
same applies to the even layers. So, that completes the proof that a graph without odd cycles
can be colored with two colors. It is a two colorable graph, that is, a bipartite graph.
Now we will see another concept, called minimal spanning trees.
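
The leveled arrangement used in the bipartite proof is what a breadth first search computes, so the proof doubles as a test for bipartiteness. The Python sketch below is illustrative (the names are my own); it also restarts the search in every connected component, a detail the lecture does not spell out.

```python
from collections import deque


def two_coloring(adj):
    """Return a valid coloring with colors 1 and 2, or None if an odd cycle exists."""
    level = {}
    for s in adj:                      # one search per connected component
        if s in level:
            continue
        level[s] = 0                   # s plays the role of v1 at level 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in level:
                    level[w] = level[v] + 1
                    queue.append(w)
                elif level[w] % 2 == level[v] % 2:
                    # An edge inside a level closes an odd cycle.
                    return None
    # Even levels get color 1, odd levels get color 2.
    return {v: 1 + level[v] % 2 for v in adj}


def cycle_graph(n):
    return {i: {(i + 1) % n, (i - 1) % n} for i in range(n)}


print(two_coloring(cycle_graph(6)) is not None)  # True: even cycle is bipartite
print(two_coloring(cycle_graph(5)))              # None: odd cycle is not
```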

(Refer Slide Time: 11:34)

We learnt about trees: trees were graphs such that there were no cycles in them and they were
connected. Now, when we are talking about minimal spanning trees, we have a specialized kind
of graph under consideration, called a weighted graph. A weighted graph is a graph with
weights on the edges. That means we have a function, let us say wt, which goes from the edge
set to the natural numbers.

So, wt(e) will be called the weight of the edge e. For example, here is a weighted graph with
6 vertices, the weights being as indicated by the numbers above the edges. Here, if you look
at the edge a,b and the edge b,c, they have the same weight. What we will assume in our study
of minimal spanning trees is that the edge weights are distinct: if e1 is not equal to e2,
then the weight of e1 is not equal to the weight of e2. So, we can think about our weighted
graphs as graphs having unique edge weights.

Each edge weight is a distinct number and we will assume that it is a natural number. Now,
let us define what a minimal spanning tree is. A spanning tree T of a graph G is a tree which
is a sub graph of G and which contains every vertex of G. So, there are no isolated vertices
and no isolated components. In other words, you choose a subset of the edges of G such that
they form a connected graph without any cycle and every vertex is part of this tree.

For example, if I take the red colored edges that forms a spanning tree you can see that there
are many spanning trees for this particular graphs. Now, the weight of a spanning tree is
defined as the sum of the weights of the edges. So, weight of a spanning tree is the sum over
the edges in T, weight of edge. For example, there is red tree that has been described, it's
weight is, so if you call red tree by T, weight of T is equal to 1 plus 3 plus 2 plus 5 plus 6, so
that is 17. Minimal spanning tree is tree which has the smallest weight amongst all the
spanning trees. So, there could be multiple spanning trees and if we did not assume that the
edge weights are distinct, there is a possibility that there could be multiple trees with same
weight. And, so all of them will be valid minimal spanning trees. Look at the smallest
weighted spanning tree. That is called as a minimal spanning tree. For example, in this graph
if you take another collection of edges.

So, if you take these blue colored edges and call them T1, the weight of T1 is going to be
1 plus 2 plus 2 plus 4 plus 4, that is 13. So, that is less than the weight of T, but it is not
yet clear why this should be the minimal spanning tree of this particular graph. Now, we will
see certain properties which will help us compute the minimal spanning tree of any graph.
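
The two tree weights above can be checked with a few lines of Python. This is only a sketch: the slide is not reproduced in the transcript, so the edge list below is an assumption, reconstructed from the weights and cycles quoted in this lecture, and the exact edges of the red tree are likewise assumed.

```python
# Assumed edge weights, reconstructed from the numbers quoted in the lecture
# (the slide itself is not available in the transcript).
WEIGHTS = {
    ("a", "d"): 1, ("a", "b"): 2, ("b", "c"): 2, ("b", "d"): 3,
    ("d", "e"): 4, ("c", "f"): 4, ("c", "e"): 5, ("e", "f"): 6,
}

def tree_weight(edges):
    """Weight of a tree = sum of the weights of its edges."""
    return sum(WEIGHTS[e] for e in edges)

# Assumed red tree T (weights 1, 3, 2, 5, 6) and blue tree T1 (weights 1, 2, 2, 4, 4).
red_tree = [("a", "d"), ("b", "d"), ("b", "c"), ("c", "e"), ("e", "f")]
blue_tree = [("a", "d"), ("a", "b"), ("b", "c"), ("d", "e"), ("c", "f")]

print(tree_weight(red_tree))   # 17
print(tree_weight(blue_tree))  # 13
```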

(Refer Slide Time: 17:01)

So, there are two crucial properties which will help us identify the minimal spanning tree.
The first property is called the cut property. It states that the least weighted cut edge is
part of every minimal spanning tree. So, we need to understand what a cut edge is. A cut is
defined as a partition of the vertices into two non-empty parts.

So, let this be part A and this be part B. A union B is the vertex set. Now, if you look at
the edges of the graph, there are 3 kinds of edges: edges which lie completely inside A, edges
which lie completely inside B, and edges which go from A to B. These are the only three kinds
of edges. The edges which go from one part of the cut to the other part are called cut edges.
That tells us what the cut property says.

Look at all the cut edges of any possible cut: the least weighted among them is part of every
minimal spanning tree. Here, we are assuming that the edge weights are distinct, and therefore
there will be a unique least weighted cut edge. We can always relax this condition. For
example, if there are multiple edges carrying the same weight, even then we can apply the cut
property. For that, what we need to guarantee is that for the cut that we have chosen, there is
one cut edge whose weight is strictly less than that of every other cut edge, and that cut edge
will be present in every minimal spanning tree.

If you had multiple minimal cut edges, all having the same weight, then the best we can say is
that every minimal spanning tree contains at least one of them. So, let us look at some
particular uses of the above lemma. Look at the cut where vertex a is on one side and
everything else is on the other side. There are two edges which go from this side to the other
side, namely the edge having weight 1 and the edge having weight 2.

The cut property says that the edge of weight 1 must surely be present in all minimal spanning
trees. Why is this so? We will see shortly. Similarly, if we take a, b, d on one side of the
cut and c, e, f on the other side, there are two cut edges, namely the edge having weight 4 and
the edge having weight 2, and we can say that the lighter of the two, the edge b,c, will surely
be part of every minimal spanning tree. Now, let us see how we can prove the cut property.
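
The two cuts just discussed can be computed mechanically. The sketch below assumes the same reconstructed edge list as the lecture's example graph (an assumption, since the slide is not in the transcript); `cut_edges` and `lightest_cut_edges` are illustrative helper names.

```python
# Assumed edge weights, reconstructed from the numbers quoted in the lecture.
WEIGHTS = {
    ("a", "d"): 1, ("a", "b"): 2, ("b", "c"): 2, ("b", "d"): 3,
    ("d", "e"): 4, ("c", "f"): 4, ("c", "e"): 5, ("e", "f"): 6,
}

def cut_edges(side_a):
    """Edges with exactly one endpoint in side_a (the other endpoint is in B)."""
    return [e for e in WEIGHTS if (e[0] in side_a) != (e[1] in side_a)]

def lightest_cut_edges(side_a):
    """All cut edges of minimum weight; a single edge when the minimum is unique."""
    crossing = cut_edges(side_a)
    best = min(WEIGHTS[e] for e in crossing)
    return [e for e in crossing if WEIGHTS[e] == best]

print(cut_edges({"a"}))                     # [('a', 'd'), ('a', 'b')]
print(lightest_cut_edges({"a"}))            # [('a', 'd')]
print(cut_edges({"a", "b", "d"}))           # [('b', 'c'), ('d', 'e')]
print(lightest_cut_edges({"a", "b", "d"}))  # [('b', 'c')]
```

In both cuts the minimum is unique, so the cut property applies directly even though the graph as a whole has repeated weights.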

(Refer Slide Time: 22:04)

Now, suppose the cut property is false. What does it mean? It means there is some edge which
is the least weighted cut edge of some cut, but is still not present in the minimal spanning
tree. Let us say the edge e equal to (u, v) is the minimum weighted cut edge across the cut
(A, B), meaning one side is A and the other side is B. Let us say A contains u and B contains
v; that is how (u, v) is a cut edge. And suppose e does not belong to the MST. So, let us draw
this MST.

So some tree is there, and if you add this edge e into it, that is going to create a cycle.
Because this is a spanning tree, there is already a path between u and v: every vertex is
present and the tree is connected. So, if you add the edge u,v, that creates a cycle. Can we
somehow say that the spanning tree that you currently have is not the optimal spanning tree?
So, this diagram was our spanning tree T and we are assuming that u,v is the minimum weighted
cut edge.

Now, minimum weighted cut edge means it is across some particular cut (A, B), and we want to
identify those vertices. So, let us first draw the diagram: this is our A and this is our B,
and on top of this diagram we are going to lay out our tree. The tree will have some edges on
the A side, some edges on the B side, and surely, because it is a spanning tree, there are
going to be edges going from one side to the other. Now, what we are guaranteed by our
assumption is that u,v is not one of the tree edges, but if we add u,v to the tree, that is
going to create a cycle.

So, let us just add the edge u,v. Suppose u is this and v is this. We add u,v, and that creates
a cycle. By virtue of this being a cycle, if you trace the edges, u to v is the edge that we
added, and there is surely a path in the tree from v back to u. That path must cross the cut at
some point, which means it has an edge which starts in B and ends up in A. Let us call this
edge e prime. Since e prime is also a cut edge for the cut (A, B), and e is the least weighted
cut edge, the weight of e prime is going to be greater than the weight of e.

So, now let us imagine the graph which is obtained by removing the edge e prime; this edge we
are throwing out, and in its place we are adding the u,v edge. We claim that the new object is
a tree. It does not contain any cycle: the only cycle was the one that we got by adding u,v,
and that cycle has been destroyed by removing the edge e prime, so this is a cycle free graph.
And every vertex which was connected earlier still remains connected.

So, this is a new spanning tree. Let us say the new tree is T prime. The weight of T prime is
equal to the weight of T plus the weight of (u, v) minus the weight of e prime. But the weight
of e prime is greater than the weight of (u, v), that is, of the edge e, and therefore the
weight of (u, v) minus the weight of e prime is a negative quantity. So, the weight of T plus a
negative quantity gives you the weight of T prime. And this contradicts the assumption that T
was the tree having least weight.
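
Written out in symbols, the exchange step reads as follows. Here w denotes the edge weight function and T' the tree obtained by swapping e' for e; the notation is introduced only for this summary.

```latex
w(T') \;=\; w(T) + w(e) - w(e') \;<\; w(T),
\qquad \text{since } w(e) < w(e').
```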

So this argument tells us that T cannot be a minimal spanning tree, because we have found a
tree whose weight is strictly less than that of T. That is the proof of the cut property. So,
construct any possible cut, that is, split the vertices in any possible manner, and look at the
edges which go from one side of the cut to the other; amongst those, the least weighted edge is
surely going to be present in all minimal spanning trees. That is the cut property. Later on we
will see how we can repeatedly use this cut property to construct the minimal spanning tree of
any graph. The next property, which is again going to be useful for constructing minimal
spanning trees, is the cycle property.

(Refer Slide Time: 28:39)

So consider a cycle in any weighted graph and look at the maximum weighted edge in that cycle.
Make a guess as to whether that edge will be present or absent in minimal spanning trees. The
cycle property states that the maximum weighted edge of a cycle will not be part of any
minimal spanning tree. How do we prove this? Before we go into the proof, let us see a couple
of examples.

So, in this graph G, if you look at (d,b), that is the maximum weighted edge of the cycle
involving the edges with weights 1, 2 and 3. So, the edge marked in green cannot be present in
any minimal spanning tree, because it is the maximum weighted edge of a cycle. The same applies
to the edge (e,c), because that is the maximum weighted edge of the cycle b, c, e, d. And e,f
is also not part of any minimal spanning tree, because that is the maximum weighted edge of the
cycle c, f, e.

So, in this graph, if you look at the edges that have been excluded, they are fit to be
excluded: none of those edges can be part of any minimal spanning tree. And therefore, if you
look at the remaining edges, they must be part of every minimal spanning tree, the reason being
that if you remove any one of them, there is no other edge that you can add to make this graph
connected. So, the tree indicated by the blue lines is going to be the unique minimal spanning
tree.
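
For a graph this small, the uniqueness claim can be checked by brute force. The sketch below enumerates every set of 5 edges, keeps those that form a spanning tree, and lists the ones of least weight. The edge list is an assumption reconstructed from the weights and cycles quoted in this lecture, not taken from the slide.

```python
from itertools import combinations

# Assumed edge weights, reconstructed from the numbers quoted in the lecture.
WEIGHTS = {
    ("a", "d"): 1, ("a", "b"): 2, ("b", "c"): 2, ("b", "d"): 3,
    ("d", "e"): 4, ("c", "f"): 4, ("c", "e"): 5, ("e", "f"): 6,
}
VERTICES = {"a", "b", "c", "d", "e", "f"}

def is_spanning_tree(edges):
    """n - 1 edges with no cycle on n vertices form a spanning tree."""
    parent = {v: v for v in VERTICES}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True

def weight(tree):
    return sum(WEIGHTS[e] for e in tree)

trees = [t for t in combinations(WEIGHTS, len(VERTICES) - 1) if is_spanning_tree(t)]
best = min(weight(t) for t in trees)
minimal = [t for t in trees if weight(t) == best]

print(best)          # 13
print(len(minimal))  # 1
print(minimal[0])    # the edges of weight 1, 2, 2, 4, 4
```

The three excluded edges (d,b), (e,c) and (e,f) do not appear in the single tree of weight 13, in agreement with the cycle property.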

Now, let us see a proof of the cycle property. Suppose the cycle property is false. What does
it mean? It means there exists a cycle such that e is the maximum weighted edge of that cycle
and e belongs to a minimal spanning tree. Let us call that minimal spanning tree T, so e is an
edge of the tree T. This should give us a contradiction. So, let us look at our tree; we will
draw it out, and some particular edge e, which is the maximum weighted edge of some particular
cycle, is present.

Suppose this was that particular edge. If you remove that edge, the tree will automatically
break into two parts. That naturally gives us a cut, a partition of the vertices into A and B.
So let us say we draw the tree like this: this is our edge u,v, and there is no other edge
going from side A to side B in the tree. All these vertices are connected to u, and all these
other vertices are connected to v; that is our tree. If you remove u,v, that disconnects the
tree. But now think of the original cycle C of which u,v was the maximum weighted edge, and
trace out that cycle starting with the edge u,v, which goes from A to B. Then at some point
there should be an edge going back from B to A. So the cycle of which the edge e was a part
must have another edge which goes from B to A.

And clearly this is not going to be an edge of the tree T, because T did not have any edge
going from side A to side B other than the u,v edge. So, let us call this edge e prime. Since
it is part of the cycle and e is the maximum weighted edge of the cycle, the weight of e prime
is going to be strictly less than the weight of u,v. Since e prime reconnects the two parts A
and B, replacing the edge e with e prime again gives a spanning tree, and it has strictly less
weight. So, we can conclude that T is not the minimal spanning tree.

And therefore our assumption is contradicted, so the cycle property is also true. Now,
equipped with these two properties, the cycle property and the cut property, we will describe
a couple of algorithms to solve the minimal spanning tree problem. The minimal spanning tree
problem is this: you are given some weighted graph and you need to compute its minimal
spanning tree. These are classic algorithms. The first algorithm is called Kruskal’s algorithm.

(Refer Slide Time: 35:03)

We will not bother about the data structures used in these algorithms, but instead we will just
look at the steps of the algorithm. Kruskal’s algorithm does the following. You consider the
edges sorted by their weights in increasing order, e1, e2 and so on up to em if there are m
edges, and keep on adding them into the tree. Initially you have the n vertices, which we can
think of as isolated vertices.

The first edge that you add is e1, then maybe you will add e2, and so on. Each time you are
about to add an edge, you check whether it forms a cycle with the edges that are already
present. So, suppose you have added e1, e2 and e3, and suppose e4 joins two vertices that are
already connected by these; then you are not going to add e4. So, consider the edges in this
order and keep adding them without creating cycles; in the end, if you had started with a
connected graph, you will get the minimal spanning tree.

Why is this algorithm correct? The reason is the cycle property. Every edge that we have
skipped is the maximum weighted edge of the cycle it would have closed, so it deserves to be
left out, because it cannot be part of any minimal spanning tree. This also tells us that if
the edge weights are distinct, there is a unique ordering on the edges and therefore the
algorithm creates a unique minimal spanning tree. So, on graphs where the edge weights are
distinct, there is precisely one minimal spanning tree, and it can be found by this method.

The method's correctness rests on the fact that all the edges that we have left out in the
process of creating the tree are edges which cannot be present in any minimal spanning tree.
So, let us look at our original graph. If we sort the edges according to their weights, the
edge weights are 1, 2, 2, 3, 4, 4, 5, 6. We need some way of deciding which weight 2 edge comes
first, but the edge of weight 1 and both edges of weight 2 get added. When you consider the
edge of weight 3, it creates a cycle, so it is not added. The first edge of weight 4 does not
create a cycle, so it gets added. The next edge of weight 4 is also added. The edges of weight
5 and 6 both create cycles, and therefore they are not added. So this example confirms that the
tree obtained is indeed the minimal spanning tree.
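
The steps above can be sketched in Python. The lecture deliberately leaves the data structures open; the union-find structure used here for the cycle check is one common choice, not something the lecture prescribes. The edge list is again an assumption reconstructed from the weights quoted in the lecture.

```python
# Assumed edge weights, reconstructed from the numbers quoted in the lecture.
WEIGHTS = {
    ("a", "d"): 1, ("a", "b"): 2, ("b", "c"): 2, ("b", "d"): 3,
    ("d", "e"): 4, ("c", "f"): 4, ("c", "e"): 5, ("e", "f"): 6,
}

def kruskal(weights):
    """Return the edges kept by Kruskal's algorithm as (u, v, weight) triples."""
    parent = {}
    for u, v in weights:
        parent[u] = u
        parent[v] = v

    def find(v):
        # Follow parent pointers to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the chains short
            v = parent[v]
        return v

    tree = []
    # Consider the edges in increasing order of weight (ties keep input order).
    for (u, v), w in sorted(weights.items(), key=lambda item: item[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:         # no cycle: the endpoints are in different components
            parent[root_u] = root_v  # merge the two components
            tree.append((u, v, w))
    return tree

mst = kruskal(WEIGHTS)
print([w for _, _, w in mst])      # [1, 2, 2, 4, 4]
print(sum(w for _, _, w in mst))   # 13
```

The kept weights match the walk-through above: 1, 2 and 2 are added, 3 is skipped, both 4s are added, and 5 and 6 are skipped.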

(Refer Slide Time: 38:30)

Now, Prim’s algorithm is also a very simple way of constructing the minimal spanning tree.
The algorithm works in the following way. You start at one particular vertex, look at all the
edges to its neighbors, and add the least weighted one. Now consider the two vertices that you
have as a set, find the least weighted edge going out of this set, and add that. Then consider
those three vertices as a set, add the least weighted outgoing edge, and so on.

At any stage, you have a collection of vertices which forms a connected component. From this
component, you look at the least weighted outgoing edge, and that is added to expand the tree.
In the end you will get a spanning tree, and that tree is guaranteed to be the minimal spanning
tree. Why is it the minimal spanning tree? Because each edge that you add at any given stage is
the minimum weighted edge of a certain cut. When you added the first edge, look at the cut
consisting of just that one vertex on one side and everything else on the other side; that
gives you a valid reason for adding the edge that you added. At any particular stage, you have
some partial tree; its vertices define one part, and the complement is the other part. If you
consider this as a cut (A, B), the next edge that you are going to add is the minimum weighted
edge across the cut (A, B), and therefore, by the cut property, that is an edge which deserves
to be added, since it is in every minimal spanning tree.
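
A minimal Python sketch of Prim's algorithm follows. The priority queue from `heapq` is an implementation choice made here, since the lecture does not discuss data structures, and the edge list is the same assumed reconstruction of the lecture's example graph.

```python
import heapq

# Assumed edge weights, reconstructed from the numbers quoted in the lecture.
WEIGHTS = {
    ("a", "d"): 1, ("a", "b"): 2, ("b", "c"): 2, ("b", "d"): 3,
    ("d", "e"): 4, ("c", "f"): 4, ("c", "e"): 5, ("e", "f"): 6,
}

def prim(weights, start):
    """Grow a tree from `start`, always adding the lightest edge leaving the tree."""
    adjacent = {}
    for (u, v), w in weights.items():
        adjacent.setdefault(u, []).append((w, u, v))
        adjacent.setdefault(v, []).append((w, v, u))

    in_tree = {start}                # side A of the current cut
    frontier = list(adjacent[start])  # candidate edges leaving the tree
    heapq.heapify(frontier)
    tree = []
    while frontier and len(in_tree) < len(adjacent):
        w, u, v = heapq.heappop(frontier)
        if v in in_tree:
            continue                 # both endpoints inside: no longer a cut edge
        in_tree.add(v)
        tree.append((u, v, w))
        for edge in adjacent[v]:
            if edge[2] not in in_tree:
                heapq.heappush(frontier, edge)
    return tree

mst = prim(WEIGHTS, "a")
print(len(mst))                   # 5
print(sum(w for _, _, w in mst))  # 13
```

Each popped edge that is kept is the lightest edge across the cut between `in_tree` and the remaining vertices, which is exactly the situation the cut property covers.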

So, you have learnt two algorithms to compute the minimal spanning tree, and the correctness
of both these algorithms comes from simple graph properties, namely the cut property and the
cycle property. We will end today's lecture here. We will continue studying other properties
of graphs in the coming lectures.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering
Indian Institute of Technology, Guwahati
Lecture No. 14
Bipartite Graphs: Edge Coloring and Matching

(Refer Slide Time: 00:29)

In this lecture, we will learn more about bipartite graphs. First we will look at a notion
known as edge coloring, and then we will look at something called matching. We will look at
these properties with respect to bipartite graphs. So let us see what an edge coloring of a
graph is. Let us take an example: this is a graph on 5 vertices with some number of edges. The
objective is to color the edges, that is, to give each edge some color. The restriction is that
adjacent edges should have different colors. So, we require two things. First, every edge is
colored, and second, edges sharing a vertex have distinct colors. Any such coloring will be
called a valid edge coloring. We want to find a valid edge coloring which minimizes the number
of colors used. We could of course give a distinct color to each edge, and surely all the
requirements would be met, but that does not minimize the number of colors used.

One thing that we can see is that for this particular graph we would require at least 3
colors, because there are some vertices whose degree is 3. 2 is impossible because, at a vertex
of degree 3, all the 3 edges should get distinct colors. Now the question is, can we do it with
exactly 3? Let us see if we can do that. So, this edge is red. Let’s say this one is given blue
color, and the third edge is given green color. Now, the 1,4 edge could be given green color,
4,5 could be given, say, red color, and 4,3 must then be given blue color. Now if you look at
vertex 3, there are 2 colors already being used there, blue and green. So the 3,5 edge cannot
be blue or green. And because the 4,5 edge is using red color, it cannot be red either. So, we
will have to use a different color. If we use a pink color, then this is a valid coloring. So,
here we used 4 colors. But are 4 really required? Can we do it with 3 colors? We want to answer
this question for bipartite graphs. The specific question that we will look at is: is the
number of colors required to edge color a graph equal to its maximum degree? Every graph has a
maximum degree; will the number of colors required to edge color it be equal to the max degree?
Clearly, in general graphs this is not the case.

In the example that we have considered, it is not clear that we require 4 colors. Maybe there
is a coloring which uses just 3; we can think about that. But we can construct examples where
the max degree is not sufficient. For example, take the 5 cycle. The degree of any vertex is
only 2, but we can argue that 2 colors will not suffice to edge color this graph. Without loss
of generality, assume that one of the edges is colored blue. Then its neighboring edges have to
be colored using a different color, so let us say they are red. And their neighbors in turn
have to be colored blue, since if we use any other color, we exceed 2 colors. But those two
neighbors are adjacent to each other, and both of them cannot be given blue color because of
the construction of this graph.

So, in general graphs, you can construct examples where the max degree is not equal to the
number of colors required to edge color. We can summarize that in the following way: the max
degree can be less than the number of colors required to edge color. C5, or any odd cycle, is
an example where the max degree is less than the number of colors required. Now, let us focus
on just bipartite graphs. If we restrict our attention to bipartite graphs, can every bipartite
graph be edge colored with a number of colors equal to the max degree? We will show that this
is the case.
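
The claim about the 5 cycle can be confirmed by exhaustive search. The sketch below tries every assignment of k colors for k = 1, 2, ... and reports the first k that admits a valid edge coloring; the function names are illustrative, and the approach is only practical for very small graphs.

```python
from itertools import product

def is_valid_edge_coloring(edges, colors):
    """Valid means: no two edges sharing a vertex receive the same color."""
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if set(edges[i]) & set(edges[j]) and colors[i] == colors[j]:
                return False
    return True

def min_edge_colors(edges):
    """Smallest k such that some assignment of k colors is valid (brute force)."""
    for k in range(1, len(edges) + 1):
        for colors in product(range(k), repeat=len(edges)):
            if is_valid_edge_coloring(edges, colors):
                return k

five_cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
four_cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]

print(min_edge_colors(five_cycle))  # 3, although the max degree is 2
print(min_edge_colors(four_cycle))  # 2, the 4 cycle is bipartite
```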

(Refer Slide Time: 07:13)

So, this is a theorem that we will prove. Let G be a bipartite graph. The number of colors
required to edge color G is equal to the max degree of G. Why is this so? What is the proof?
First of all, what is a bipartite graph? You can split the vertices into 2 parts, say X and Y.
If we think of G as (V, E), where V is the set of vertices and E is the set of edges, V can be
written as X union Y. This is a disjoint union; X and Y do not share any vertices. So, the
vertex set can be split as X union Y such that all edges are between X and Y. You cannot have
an edge from X to itself; this is forbidden. And you cannot have an edge from Y to itself; this
is also forbidden. All the edges go from X to Y. That is the definition of a bipartite graph.
For a bipartite graph, we want to show that the max degree is equal to the number of colors
required to edge color the graph. We can clearly see one direction, namely that you require at
least that many colors. Let delta be the max degree. If you look at a vertex with degree delta,
each of its edges must get a distinct color. So, the number of colors required is definitely
greater than or equal to delta. Now, if we can show that there exists a coloring which uses no
more than delta colors, then the minimum number of colors required is going to be equal to
delta.

(Refer Slide Time: 09:57)

So how do we show that this is the case? The proof will be by induction on the number of edges
in the graph, which we denote by m. The base case is m equal to 1: a bipartite graph with a
single edge has max degree 1 and requires only one color. So you can think of that as a trivial
case. We will assume that the statement is true for all bipartite graphs with at most m edges.
Now, we will look at a graph with m plus 1 edges and show that it can be colored using delta
colors. So, let G be a bipartite graph with m plus 1 edges. Remove one arbitrary edge from G;
let us call it the edge (x, y). This gives a smaller graph, which we call G prime.

Clearly, by our induction hypothesis, the smaller graph G prime can be colored with delta
colors, where delta is the max degree in G. This is because the max degree of G prime is at
most delta, so G prime has a coloring which uses no more than delta colors. Now, let us look at
the vertices x and y corresponding to the edge that we have removed. This is an edge that has
not been colored. So, G is the main graph and G prime is the residual graph, and if we add the
edge (x, y) to G prime we get back the complete graph G. We need to color this edge, and if we
manage to color it without using any additional color, then we are done. We will see how that
can be done. Look at the vertex x. Its degree is at most delta in the original graph. Since one
edge has been removed, the degree of x in G prime is at most delta minus 1, and the same holds
for y. So at each of x and y, at most delta minus 1 colors are used in G prime, and at least
one color is left unused. Let alpha be an unused color at vertex x and beta be an unused color
at vertex y. The simple case is when alpha and beta are the same. Then we can simply use that
color to color the edge (x, y). But when alpha and beta are different, we are in a little bit
of trouble. We will see that this case can also be dealt with. What we will do is look at the
coloring of G prime, the smaller graph which, by our assumption, can be colored because it has
at most m edges.
(Refer Slide Time: 15:37)

So, x and y are 2 vertices. The color that was unused at x was alpha and the color that was
unused at y was beta. We will denote alpha by red and beta by blue. Now, the red color was not
used at x. If blue were also unused at x, we could color (x, y) blue straight away, so assume
blue is used at x. Let us follow the blue colored edge from x to the other side; it goes to
some vertex which we will call y1. Now, if you look at vertex y1, there are 2 situations.

Either there is a red edge at y1 going back to the other side, or red is not used at y1. If
red is not used, we stop. If red is used, we take the red edge. It goes to some other vertex;
let us call that x1. From there we go onward via the blue edge, and we continue this process.
So, all that we are doing is the following. Start at the vertex x and go to the other side via
the blue edge. If there is no blue edge, we stop. When we have gone to the other side, we come
back via the red edge. If we cannot find a red edge, we stop. So you get an alternating path of
red and blue edges. The claim is that this process has to stop at some point, and once it stops
we can recolor this path without requiring additional colors. So let us see how that can be
done.

When we stop at some vertex, it means there is no edge of the other color to be taken there.
What this means is that we can change the blues to red and the reds to blue along the path, and
things still work fine. Why is that so? Let us say the last edge goes from xn to yn, and the
process stopped at yn because there was no red edge to take at yn. So you can change the edge
from xn to yn from blue to red. That creates a violation at xn, which already had a red edge,
but we can change that edge to blue. We can alternate in this manner along the whole path and
get another valid coloring. Because this is a bipartite graph, all the edges go to the other
side, so the path enters every vertex on the side of y via a blue edge; since blue is unused at
y, the path never reaches y, and blue remains unused at y. Now what happens at x? Initially the
color used on the path at x was blue, and that blue has changed to red, which is fine since red
was not used at x. So blue is now a free color at x. That is the same color that is free at
vertex y, and therefore we can color the edge (x, y) using blue. So that is the proof.

So let us just quickly recap the proof. We looked at the graph and we removed one particular
edge. The smaller graph can be colored using delta colors. Now, the larger graph that you get
by introducing (x, y) into the residual graph can also be colored using delta colors, because
we can start at the vertex x and keep on alternating between the two colors that were left
unused at the vertices x and y, and once you get this alternating path of colors, you can swap
the colors on the path to recolor the edges. Once you have recolored the edges, the vertices x
and y both have the same unused color, and that color can be used for coloring the edge (x, y).
So we will now move on.
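
The inductive proof translates directly into a procedure: add the edges one at a time, and whenever the two endpoints have no common free color, swap the two colors along the alternating path starting at x. The following is a minimal sketch under stated assumptions: the input is a simple bipartite graph given as a list of (x, y) pairs, and the name `edge_color_bipartite` and the dictionary layout are choices made here, not part of the lecture.

```python
def edge_color_bipartite(edges):
    """Color the edges of a simple bipartite graph using max-degree many colors.

    Returns a dict mapping each edge (x, y) to a color in range(max degree).
    """
    degree = {}
    for x, y in edges:
        degree[x] = degree.get(x, 0) + 1
        degree[y] = degree.get(y, 0) + 1
    delta = max(degree.values())

    # used[v][c] = the neighbor joined to v by the edge of color c
    used = {v: {} for v in degree}

    def free_color(v):
        return next(c for c in range(delta) if c not in used[v])

    for x, y in edges:
        alpha, beta = free_color(x), free_color(y)
        if beta in used[x]:
            # beta (blue) is taken at x: walk the blue/red alternating path from x.
            path, v, c = [], x, beta
            while c in used[v]:
                u = used[v][c]
                path.append((v, u, c))
                v, c = u, (alpha if c == beta else beta)
            # Swap the two colors on the path: remove all old colors, then add new ones.
            for a, b, c in path:
                del used[a][c]
                del used[b][c]
            for a, b, c in path:
                swapped = alpha if c == beta else beta
                used[a][swapped] = b
                used[b][swapped] = a
        # beta is now free at both endpoints.
        used[x][beta] = y
        used[y][beta] = x

    coloring = {}
    for x, y in edges:
        for c, neighbor in used[x].items():
            if neighbor == y:
                coloring[(x, y)] = c
    return coloring

# Complete bipartite graph with 3 vertices on each side: max degree 3.
k33 = [(f"x{i}", f"y{j}") for i in range(3) for j in range(3)]
coloring = edge_color_bipartite(k33)
print(len(set(coloring.values())))  # 3
```

The swap is done in two passes (delete, then add) because consecutive path edges share a vertex, and recoloring them one at a time would briefly give that vertex two edges of the same color.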

(Refer Slide Time: 20:19)

We will look at another notion known as matching. You can find matchings in general graphs,
but here we are going to restrict our attention to bipartite graphs. Let us understand what a
matching is. We will start with a general graph. This is a graph on 7 vertices. A matching is
simply a collection of edges such that no two of them share a vertex. The red colored edges
form a matching; I have drawn 3 red edges in this graph, and this is an example of a matching.

If we number the vertices 1 to 7, the matching described here using the red edges consists of
the following edges: (1,2), (4,6) and (5,7). Look at any 2 edges inside this collection; they
do not share a vertex. This is also a maximum matching, in the sense that this graph G cannot
have a matching of size greater than 3. It has only 7 vertices and each matching edge takes 2
vertices with it. So, the maximum number of such pairs that you can obtain is at most 7 by 2,
and 3.5 being a non-integer, the maximum possible is only 3. So, this matching of size 3 is a
maximal matching as well as a maximum matching. Let us see a few other examples.

(Refer Slide Time: 23:12)

So, we will look at these concepts more carefully: maximal matching, maximum matching, and
complete matching or perfect matching. Maximal means there is no additional edge that one can
add to the given collection and still have a matching. Let us see an example. If you look at
this particular graph G and take just the red edge, that forms a maximal matching, because you
cannot add any more edges to this collection without violating the matching property: all the
other edges share a vertex with either vertex 1 or vertex 2. But this is clearly not a maximum
matching, in the sense that there are larger matchings that can be obtained. For example, the
blue colored edges form a maximum matching, in the sense that no matching in this graph has
more edges. But this is not the unique maximum matching. The pink edges also form a maximum
matching. In this case it is also a complete matching or a perfect matching, because all the
vertices have been matched.

If we had looked at a different graph, namely the triangle, every maximal matching, and hence
every maximum matching, consists of precisely one edge. If you take any particular edge, that
is a maximum matching; you cannot extend it further, since the other two edges both share a
vertex with it. So you cannot get a matching which covers all the vertices; there is no
complete matching in this graph. What we will see in the remainder of this lecture is a
characterization for when a bipartite graph has a complete matching.
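
The difference between maximal and maximum can be seen on a tiny example. The lecture's slide graph is not available in the transcript, so the sketch below uses a hypothetical path 3 - 1 - 2 - 4 together with the triangle; the brute-force search is only meant for graphs of this size.

```python
from itertools import combinations

def is_matching(edges):
    """A matching is a set of edges in which no two share a vertex."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True

def is_maximal(matching, graph_edges):
    """Maximal: no further edge of the graph can be added."""
    return is_matching(matching) and not any(
        is_matching(list(matching) + [e]) for e in graph_edges if e not in matching
    )

def maximum_matching_size(graph_edges):
    """Maximum: the largest size of any matching (exhaustive search)."""
    for k in range(len(graph_edges), 0, -1):
        if any(is_matching(c) for c in combinations(graph_edges, k)):
            return k
    return 0

path = [(3, 1), (1, 2), (2, 4)]      # hypothetical path 3 - 1 - 2 - 4
triangle = [(1, 2), (2, 3), (1, 3)]

print(is_maximal([(1, 2)], path))       # True: nothing can be added to the middle edge
print(maximum_matching_size(path))      # 2: the two end edges, a perfect matching
print(maximum_matching_size(triangle))  # 1: no perfect matching on 3 vertices
```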

(Refer Slide Time: 26:19)

So, this is the question we will answer: when does a bipartite graph have a perfect matching?
When we talk about perfect matchings, we restrict our attention to bipartite graphs which have
an equal number of vertices on the two sides, because if one side had, let us say, 7 vertices
and the other side had 10 vertices, we clearly cannot match all the vertices, but we can hope
to match 7 of them. If we match 7 of them, we could still call that a complete matching,
although it is not a perfect matching. So, we will look at one side of the graph and try to
figure out when all the vertices on that side can be matched, and we will call that a complete
matching. The perfect matching case can be handled by this, because all that you have to check
in addition is that the other side contains an equal number of vertices. So, if the other side
also contains 7 vertices, our criterion will help us figure out whether the graph has a perfect
matching or not.

This condition is given by Hall's theorem and is known as Hall's condition. To state it, we
need the notion of the neighbors of a vertex. Look at one particular vertex x. It is connected
to many other vertices. These vertices are called the neighbors of x, and we denote the set of
neighbors of x by N(x). If instead of a vertex we have a set of vertices S, which we will
assume is not empty, N(S) is the union of N(x) over all x belonging to S.

So this is your set S, and all the neighbors of its vertices together are called N(S). Clearly,
if the number of neighbors of some set S is strictly less than the number of elements in S,
then there is no possibility of a complete matching, because the vertices of S cannot all be
matched; they do not have enough counterparts on the other side. Hall's theorem says that the
converse also holds: if every subset has at least as many neighbors as elements, then there is
a complete matching. So let us write it down formally. Let G = (X ∪ Y, E) be a bipartite graph,
where I am writing the vertex set as X union Y, with X as one side and Y as the other side. If
|N(S)| ≥ |S| for all subsets S of X, then G has a matching that matches all the vertices in X.
So everything in X can be matched, and if the number of vertices on the other side is equal to
the size of X, then we know that it is a perfect matching.
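
To make the condition concrete, here is a minimal sketch that tests Hall's condition by brute force. The function names and the two small example graphs are illustrative assumptions, not part of the lecture; the graph is given as a dictionary from each vertex of X to the set of its neighbors in Y.

```python
from itertools import combinations

def neighbours(adj, subset):
    """Return N(S): the union of N(x) over all x in S."""
    result = set()
    for x in subset:
        result |= adj[x]
    return result

def find_hall_violation(adj):
    """Return a subset S of X with |N(S)| < |S|, or None if Hall's condition holds.

    adj maps every vertex x of X to the set of its neighbours in Y.
    All non-empty subsets are tried, so this takes time exponential in |X|.
    """
    xs = sorted(adj)
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            if len(neighbours(adj, subset)) < size:
                return set(subset)
    return None

# Hypothetical examples: in `bad`, x1 and x2 both have y1 as their only neighbour.
bad = {"x1": {"y1"}, "x2": {"y1"}, "x3": {"y2", "y3"}}
good = {"x1": {"y1", "y2"}, "x2": {"y1"}, "x3": {"y2", "y3"}}
print(find_hall_violation(bad))   # a set S that cannot be matched
print(find_hall_violation(good))  # None: the condition holds
```

Checking every subset is only practical for very small graphs; the sketch is meant to illustrate the statement, not to be an efficient test.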

So let us look at the condition carefully. N(S) denotes the set of neighbors of S, and its size
should be greater than or equal to the size of S, for every subset S of X. If this condition is
met, Hall's theorem guarantees that there is a matching in the bipartite graph G that covers
all of X. The other direction is very easy: if, for some set S, the size of that set is greater
than the number of its neighbors, then clearly that set cannot be matched. So what we will
prove is that when |N(S)| ≥ |S| for every subset S of X, G contains a matching that covers X.

(Refer Slide Time: 31:35)

So, this is what we have to prove. We assume that |N(S)| ≥ |S| for all S subset of X, and with
this assumption we show that G has a matching of size |X|. The proof is by induction on the
number of vertices in X, and we will split it into 2 cases. If X has just one vertex, this is a
trivial case: that one vertex must be connected to some vertex in Y, because otherwise the
number of neighbors would not be greater than or equal to the size of S. If it has a neighbor,
then of course we can choose that particular edge, and that is a matching whose size is equal
to the size of X. Now take a graph with more than one vertex on the X side, and assume that the
statement is true for all bipartite graphs with fewer vertices on the X side. There are 2
cases. In case 1, for all non-empty proper subsets S of X, the size of N(S) is strictly greater
than the size of S.

In this case we can just take any edge. Let us say we choose one particular edge (x, y), add it
to the matching, and look at the remaining graph, that is, the graph obtained by removing the
vertices x and y from the original graph. When you remove these, the number of neighbors of any
set S is going to reduce by at most 1, as only one vertex, y, is removed from Y. We already had
the condition that |N(S)| was strictly greater than |S| for all such S, so after the removal of
x and y, |N(S)| is guaranteed to be greater than or equal to |S| for every subset S of the
remaining part of X. So the remaining graph satisfies Hall's condition, and by the induction
hypothesis it has a matching covering the rest of X; together with the edge (x, y), this gives
a matching covering X.

(Refer Slide Time: 35:15)

In case 2, there exists a non-empty proper subset S of X such that the size of N(S) is equal to
the size of S. This is a slightly tricky case, so be careful about this. Let us say this is our
set X and this is our set Y, and we want to find a matching of X into Y. We have a set S here,
let us say the top portion of X, and the number of its neighbors is exactly equal to the size
of S; let us say N(S) is this particular region of Y. So we have |S| = |N(S)|. Our earlier case
was that for every non-empty proper subset, N(S) was strictly larger; now we are assuming that
there is a proper subset for which the sizes are exactly the same. Given this, let us first
restrict our attention to this set S.

We can apply the induction hypothesis to this set S and get a matching of all of S involving
just the neighbors of S. Since S is a proper subset, we know that the number of elements in S
is strictly less than the number of elements in X, so S can be matched into N(S) by our
induction hypothesis. Why is this so? Look at the induced graph that you obtain by considering
just the vertices in S and N(S). That induced graph is also a bipartite graph, and it contains
every edge that leaves S, because all the neighbors of S are in N(S). Of course, there could be
edges which go from N(S) to the remaining portion of X, but those edges we are ignoring by just
looking at S and N(S). If you look at any subset of S, all its neighbors lie in N(S), because
N(S) is the set of neighbors of the entire set.

So, any subset's neighbors are all in N(S), and because of our condition that every subset has
at least as many neighbors as its size, every subset of S also satisfies this property. If you
look at a subset S prime of S, the size of N(S prime) is greater than or equal to the size of S
prime, even while we are restricting to just the graph formed by the vertices of S and N(S). So
this portion, the top part of X, can be completely matched into N(S). Now what about the
remaining portion? Can the rest of X be matched to the rest of Y? Let us call the remaining
part X prime. The only problem is that subsets of X prime have certain neighbors, and some of
those neighbors could be in N(S), which is already used up. But we will show that even when the
neighbors in N(S) are discarded, every subset of X prime still has at least as many neighbors
as its size. So, what we have right now is that S and N(S) can be matched.

(Refer Slide Time: 39:25)

And what we need to show is that X prime can be matched into Y minus N(S). This is the diagram:
S was matched to the neighbors of S, and the remaining portion we call X prime. So X prime is
equal to X minus S, and let us call the remaining portion on the other side Y prime, so Y prime
is equal to Y minus N(S). Let G prime be the graph on X prime and Y prime. We want to show that
X prime can be matched into Y prime. We can apply the induction hypothesis provided we can show
that, for all subsets A of X prime, the size of N(A) is greater than or equal to the size of A.
This condition was true in G, but there, when we count the neighbors of a subset of X prime,
some of those neighbors could be in N(S). Now we are allowed to count only those neighbors
which belong to Y prime, and we will show that the condition is still true. So, why is that so?

So, let us assume the contrary. Suppose there exists a subset A of X prime such that the size
of N(A) is strictly less than the size of A. Here, when I say N(A), this stands for the
neighbors of A in G prime. So I am abusing the notation N a little bit: sometimes we use it to
denote the neighbors in G, and sometimes the neighbors in G prime, but the context makes it
clear which meaning we are giving to N(A). Then we will show that there is a subset in the
original graph where Hall's condition is not met. The particular subset is easy to construct:
look at A union S. How many neighbors does A union S have in G? By our assumption, the size of
N(A union S) is greater than or equal to the size of A union S. We will show that this
condition is violated.

So when you look at N(A), A is a subset of X prime, and we are looking at its neighbors in Y
prime. The set N(A union S), taken in G, is equal to N(A) union N(S), and this is a disjoint
union: every element of N(S) has to be among the neighbors of A union S, every neighbor of A
inside Y prime has to be in this collection as well, and any neighbor of A outside Y prime
already lies in N(S). So the size of N(A union S) is equal to the size of N(A) plus the size of
N(S). But the size of N(S) is equal to the size of S, so this is equal to the size of N(A) plus
the size of S. And the size of N(A) is less than the size of A, so this is strictly less than
the size of A plus the size of S. Since A and S do not have any common elements, the size of A
plus the size of S is the size of A union S. So we have the following inequality: N(A union S)
is of size strictly smaller than A union S.
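
Written out as a single chain, the computation reads as follows. The subscripts are added here only to make the abuse of notation explicit: N_G counts neighbors in the original graph G, and N_{G'} counts only the neighbors that lie in Y' = Y minus N(S).

```latex
\[
|N_G(A \cup S)| \;=\; |N_{G'}(A)| + |N_G(S)| \;=\; |N_{G'}(A)| + |S| \;<\; |A| + |S| \;=\; |A \cup S| .
\]
```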

That is the contradiction, and therefore our assumption that the size of N(A) is less than the
size of A is faulty. That means the size of N(A) is greater than or equal to the size of A for
every subset A of X prime. Now we can apply the induction hypothesis to the smaller set X
prime, and we get a matching of size |X prime| between X prime and Y prime. Combine these two
matchings, and you get a matching of all of X in the whole graph. So, what we have accomplished
is the following: in any bipartite graph, if the number of neighbors of S is greater than or
equal to the number of elements in S for every subset S of X, where X is one side of the graph,
then the graph definitely has a matching that covers all of X. Okay, we will stop here for
today.
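
The induction in this proof is constructive, so it can be turned into a small recursive procedure. The following is only a sketch under stated assumptions: the function name and the example graph are hypothetical, the input is assumed to satisfy Hall's condition, and tight sets are found by trying all subsets, so it is exponential and suitable only for tiny graphs.

```python
from itertools import combinations

def hall_matching(adj):
    """Match every vertex of X, following the two cases of the induction proof.

    adj maps each x in X to the set of its neighbours in Y and is assumed to
    satisfy Hall's condition. Returns a dict x -> y.
    """
    xs = sorted(adj)
    if not xs:
        return {}
    # Case 2: look for a non-empty proper subset S with |N(S)| = |S|.
    for size in range(1, len(xs)):
        for subset in combinations(xs, size):
            n_s = set().union(*(adj[x] for x in subset))
            if len(n_s) < size:
                raise ValueError("Hall's condition fails for %r" % (subset,))
            if len(n_s) == size:
                inside = hall_matching({x: adj[x] for x in subset})
                # The remaining vertices may only use Y' = Y - N(S).
                outside = hall_matching(
                    {x: adj[x] - n_s for x in xs if x not in subset}
                )
                return {**inside, **outside}
    # Case 1: every non-empty proper subset has |N(S)| > |S|; pick any edge (x, y).
    x = xs[0]
    if not adj[x]:
        raise ValueError("Hall's condition fails for %r" % (x,))
    y = min(adj[x])
    rest = hall_matching({u: adj[u] - {y} for u in xs[1:]})
    rest[x] = y
    return rest

# Hypothetical example: b has only one neighbour, so {b} is a tight set.
graph = {"a": {1, 2}, "b": {1}, "c": {2, 3}}
matching = hall_matching(graph)
print(matching)
```

In the example, the tight set {b} is matched first, and the rest of the graph is then matched using only the vertices outside N({b}), exactly as in case 2 of the proof.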

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture No. 15
Planar Graphs

So, in this lecture, we will learn about planar graphs.

(Refer Slide Time: 0:33)

We will start with a puzzle. Imagine that there are 3 houses, let us say A, B and C, and
there are 3 wells, W1, W2 and W3. What we want is to construct paths from each house to
each well, but these people A, B and C do not get along with each other, so they want
their own individual paths, and they do not want their paths to ever cross. So, maybe for
A we can try it like this: we can have paths A to W1, A to W2 and A to W3. Similarly, we
can draw paths from B to W1, B to W2 and B to W3, and from C to W3 and C to W2. But once
this has been drawn, at least in this drawing, W1 is inside a closed region, and there is
no way we can go from C to W1 without intersecting one of the already existing paths.
Now, is this an artifact of our drawing? Did it happen because we drew it in a certain
way, or is it the case that no matter how cleverly we draw the paths, we still cannot
manage to get non-intersecting paths?

So, we want to answer this question. We can reformulate it in a graph theoretic fashion
by asking whether the following graph is planar or non-planar. Think of A, B and C and
W1, W2, W3 as vertices, with an edge from each house to each well. We are interested in
knowing whether this graph can be embedded in the plane. So, we can talk about what is
called a graph drawing. When we say a graph drawing, we mean that we position the
vertices in the plane and, if (u, v) is an edge, we connect u and v by a simple curve. If
we can find a graph drawing in which the curves corresponding to the edges do not
intersect, that is, a drawing with no crossings, then we say that the graph is planar.
So, now the puzzle can be reformulated as: is K3,3, the graph that we are interested in,
planar? We will develop some tools by which we can answer this question. What we will
show is that K3,3 is not planar, and what will help us do that is something known as
Euler's theorem.

Let us see one more example of planarity. Let us look at this particular graph, which has
5 vertices and 9 edges. It is the complete graph from which exactly 1 edge has been
removed, so you can write it as K5 minus one particular edge; let us call that edge
(a, b). Is the graph that we are looking at a planar graph? If you look at the drawing,
there are a lot of crossings in it: here there is a crossing, this is another crossing,
and so on. But can we redraw it in such a way that all the crossings are avoided? What I
am trying to emphasize is that, if one drawing involves a lot of crossings, that does not
mean that we cannot redraw it to get another drawing where there are no crossings. In
this case we can do that, and it can be seen as follows.

So, there is an inner star, a pentagram, which we can think of as a pentagon: a is
connected to c, c is connected to e, e is connected to b, b is connected to d and d is
connected to a. That takes care of the inner pentagram, so 5 edges are taken care of. The
other edges that are missing are (b, c), (c, d), (d, e) and (e, a). Those also we can
draw: (b, c) is an edge, (c, d) is an edge, (d, e) is an edge, which we can draw in this
way, and (e, a) is an edge. So this is a redrawing of the same graph in such a way that
there are no crossings, and a drawing of a graph in this particular fashion is called a
planar embedding. So, what we see here is a planar embedding of the graph G.

Now, once we have a planar embedding, let us imagine that we have the entire plane with
this particular embedding drawn on it. If you remove the edges and the vertices from this
picture, the plane may get split into disjoint portions, and we could ask how many
connected regions remain. We will give a name to those regions: each of them is called a
face. Here in this diagram, this is one face, the green colored region is another face,
pink is another face, this is yet another face. In all you can see that there are going
to be 1, 2, 3, 4, 5, 6 faces; the outside is also a face. The entire outer region, which
is an infinite region, is the sixth face. Euler's theorem relates the number of faces,
the number of edges and the number of vertices.

(Refer Slide Time: 9:03)

So, let us state Euler's theorem. Before that, let us write down the definition of a
face. We will informally think of a face as one of the connected regions in a planar
graph drawing. Now we are in a position to state Euler's planarity theorem. It says that
if V is the number of vertices, e is the number of edges and f is the number of faces of
a planar embedding of a connected graph, then V minus e plus f is equal to 2. So, how do
we prove this statement? The proof is fairly simple: we can use induction on the number
of edges.
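
As a quick check against the embedding seen earlier in this lecture (K5 with one edge removed, which has 5 vertices and 9 edges, and for which 6 faces were counted, including the outer one), the formula gives:

```latex
\[
V - e + f \;=\; 5 - 9 + 6 \;=\; 2 .
\]
```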

So, take a connected graph; it can either be a cycle-free graph, or it could contain
cycles. We will split the proof into two parts: case one is when G contains a cycle, and
case two, the only other case, is when G does not contain a cycle. A connected graph
without cycles is called a tree, so in case two G is a tree. When you embed a tree in the
plane, it does not enclose any regions, and therefore the number of faces is 1. So the
proof of case two is easy: if you call the number of vertices n, then V is equal to n, e
is equal to n minus 1, and f is equal to 1. So V minus e plus f is n minus (n minus 1)
plus 1, and that is equal to 2.

That is the easy case. The other case is also fairly easy. If you have cycles and you
embed the graph, each cycle encloses some region. A face could have any number of edges
on its boundary; we do not bother about how many. What we can conclude is that there is
some particular edge which is part of a cycle, and there are 2 different faces on either
side of this edge; let us call them f1 and f2. If you remove this particular edge from
the graph, you get a smaller graph, which is still connected because the edge was on a
cycle. In that smaller graph, the number of vertices does not change, the number of edges
has reduced by 1, and the faces f1 and f2 coalesce into a single face.

Think of this as the reduced graph, with V prime vertices, e prime edges and f prime
faces. The number of vertices in the reduced graph is the same as in the original one, so
V prime is equal to V, e prime is equal to e minus 1, and f prime is equal to f minus 1.
By the induction hypothesis, V prime minus e prime plus f prime must be equal to 2. That
is, V minus (e minus 1) plus (f minus 1) is equal to 2, which can be rewritten as V minus
e plus f is equal to 2. So, that is a straightforward proof of Euler's planarity theorem.

Let us try and show that K3,3 is not planar. This is our main objective. In order to do
that, we will derive a corollary of Euler's planarity theorem. Look at a simple graph,
that is, a graph without self-loops or parallel edges, which is planar; any planar graph
has a drawing or an embedding in the plane. Now, any edge is shared by at most 2 faces,
and each face is bounded by some number of edges. Say that in this planar embedding of a
graph the faces are F1, F2, F3 and the outer face F4. If you look at F1, the edges e1,
e2, e3, e4, e5 and e6 are its boundary edges, and you can do this for each face. For F3,
the boundary edges are e5, e10, e11 and e2. Note that any edge can serve as a boundary
edge for at most 2 faces. So, associated with each face Fi, we have a number fi, which
denotes the number of its boundary edges. If you sum up these fi's over all the faces,
what you get is 2 times the number of edges. Now, if you take a simple graph, each face
has a boundary with at least 3 edges.

For a simple graph, every face will have at least 3 edges. While writing this particular
formula, that the summation of the fi's is two times the number of edges, we are assuming
that there are no vertices of degree 1, because if you have a vertex of degree 1, the
edge corresponding to it is part of only 1 face, and the factor of 2 does not come for
that edge.

(Refer Slide Time: 17:00)

Our objective is to show that K3,3 is not planar. In order to do this, we will obtain a
corollary of Euler's planarity theorem. Consider a connected graph G without any vertices
of degree 1; more precisely, let us assume that every edge is part of some cycle. Further
assume that G is a planar graph, that is, it has an embedding in the plane. Consider a
particular embedding of the graph and look at the faces; if we denote them by F1, F2, F3,
F4 and so on, we can associate with each face its boundary edges. For example, these red
edges are the boundary edges of the face F4.

Now, let small fi denote the number of boundary edges of the face Fi. Note that every
edge is part of precisely 2 faces. For example, if you look at this blue edge, it is a
boundary edge of F3 and F1. Therefore, we can conclude that the summation of fi over all
the faces is equal to twice the number of edges, because each boundary edge is counted
twice, once for each face that it belongs to. Now, if you take a simple connected graph,
we can argue that every face has at least 3 boundary edges, so each fi is greater than or
equal to 3. Therefore, if f is the total number of faces, the summation of the fi's is at
least 3f, and it is equal to 2e. So, we can conclude that f is less than or equal to 2e
by 3.

Now, if we substitute this in Euler's formula, V minus e plus f is equal to 2, what we
get is that e is equal to V plus f minus 2, and that is less than or equal to V plus 2e
by 3 minus 2. Therefore, e by 3 is at most V minus 2, that is, e is less than or equal to
3V minus 6. So, this is a corollary that we can use. And if we assume that the graph does
not have any cycles of length 3, that is, the graph is triangle free, then we can
conclude that every face has at least 4 boundary edges. So 4f is less than or equal to
the summation of fi over all faces, and this is equal to 2e, so f is less than or equal
to e by 2. In that case, if you substitute in the formula e is equal to V plus f minus 2,
we get e is less than or equal to V plus e by 2 minus 2, so we can conclude that e is
less than or equal to 2V minus 4. These are the two things that we can use in order to
show that various graphs are non-planar.
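
For reference, the two derivations can be written compactly as follows, with non-strict inequalities, which is what the counting argument actually yields. Here f_i is the number of boundary edges of the i-th face, and the second line assumes the graph is triangle free.

```latex
\begin{align*}
3f \le \sum_i f_i = 2e \;&\Rightarrow\; e = V + f - 2 \le V + \tfrac{2e}{3} - 2 \;\Rightarrow\; e \le 3V - 6,\\
4f \le \sum_i f_i = 2e \;&\Rightarrow\; e = V + f - 2 \le V + \tfrac{e}{2} - 2 \;\Rightarrow\; e \le 2V - 4 .
\end{align*}
```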

So let us do that. The first thing we will show is that K3,3 is non-planar. It has 6
vertices and 9 edges. Because K3,3 is a bipartite graph, it is triangle free, and
therefore the condition e is less than or equal to 2V minus 4 applies: if K3,3 were
planar, it would have to satisfy this relation. Substituting, e is equal to 9, and it
should be less than or equal to 2 times V minus 4, that is, 2 * 6 minus 4, which is equal
to 8. So 9 should be less than or equal to 8, and that is not the case, so we have our
contradiction. K5 is another graph that we can show to be non-planar. We showed that a
particular graph obtained by removing just one edge from K5 was planar, but if we include
that edge, we have K5, and K5, we will show, is a non-planar graph. Again, it is a simple
application of the corollary. The number of edges in K5 is 10, because there were 2
pentagons, and 10 should be less than or equal to 3 times the number of vertices minus 6.
The number of vertices is 5, so this is 15 minus 6, and 10 should be less than or equal
to 9 if K5 were planar. So, that is a proof of the non-planarity of K5.
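
The same test can be sketched in a few lines of code. The function name and the `min_face_length` parameter are illustrative assumptions; the function generalises the two corollaries by using the fact that if every face has at least k boundary edges, then k times f is at most 2e, which together with Euler's formula gives e(k - 2) ≤ k(V - 2). It assumes a connected simple graph with at least 3 vertices.

```python
def violates_edge_bound(vertices, edges, min_face_length=3):
    """True if a connected planar graph could not have this many edges.

    If every face has at least k boundary edges, then k*f <= 2e, and Euler's
    formula gives e*(k - 2) <= k*(V - 2). Integers are used to avoid rounding.
    """
    k = min_face_length
    return edges * (k - 2) > k * (vertices - 2)

# K3,3: 6 vertices, 9 edges, triangle free, so every face has at least 4 edges.
print(violates_edge_bound(6, 9, min_face_length=4))
# K5: 5 vertices, 10 edges.
print(violates_edge_bound(5, 10))
# K5 minus one edge: 5 vertices, 9 edges. The bound is met, which by itself
# proves nothing; planarity was shown by exhibiting an embedding.
print(violates_edge_bound(5, 9))
```

A result of `True` proves non-planarity, while `False` is inconclusive, since the bound is only a necessary condition.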

(Refer Slide Time: 23:58)

We will look at a slightly more complicated example, but with essentially the same
principle: the Petersen graph. This is a graph with 10 vertices, so V is 10, and the
number of edges is 15. This is certainly not a bipartite graph, because it contains
cycles of length 5. If we try the corollary e is less than or equal to 3V minus 6, it
unfortunately does not lead to a contradiction, because e is 15 and 3V minus 6 is 30
minus 6, that is 24, so the condition holds. The triangle-free condition e is less than
or equal to 2V minus 4 also holds, since 2V minus 4 is 16. But the fact that the
conditions hold does not necessarily mean that the graph is planar. What we can do here
is go back to the original condition, which says that the summation of fi over all faces
is equal to 2e. The Petersen graph does not have 3-cycles or 4-cycles, which is something
that you can easily check, and because of this, every face must contain at least 5 edges.
So 5 times f is less than or equal to the summation of the fi's, which is equal to 2e.

So, we can conclude from this that f is less than or equal to 2e by 5. Now, if the graph
were planar, then e must be equal to V plus f minus 2, and so e should be less than or
equal to V plus 2e by 5 minus 2. So 3 times e by 5 should be less than or equal to V
minus 2, or in other words, e should be less than or equal to (5V minus 10) by 3. Now
substitute the values: in the case of the Petersen graph, e is equal to 15 and V is 10,
so 5V minus 10 is 50 minus 10, that is 40, and you get that 15 is less than or equal to
40 by 3, which is about 13.3. That is clearly a contradiction, so the Petersen graph is
not planar. So, we have seen planarity, and we have seen multiple applications of Euler's
planarity theorem to prove the non-planarity of various graphs. We will stop here.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture No. 16
Graph Searching: BFS and DFS

In this lecture, we will learn about graph searching. So, graph for us is a collection of vertices
and edges and we need to explore the vertices in a certain order. Let us take a simple
example. If you think of any social networking site, you can look at your individual page as a
certain node and you want to know all your connections. So, you want to know who your
friends are, their friends and so on.

(Refer Slide Time: 1:11)

So, you can think of this as a huge graph in which each person is a node and there is an edge
between two people if they are friends. Here there is an example of 4 people: 1, 2, 3 and 4.
Here, 2 and 4 are friends with each other, and they are friends with 1 as well. So, given some
arbitrary graph, how do you look at all the vertices and list them in a certain order? If you
think about the case of social networking sites, you want to list all your connections in some
order. You can look at your friends and list them out, then friends of friends, then friends of
friends of friends, and so on, and that would essentially be what we refer to as breadth first
search.

You can also take an alternate approach. You find one of your friends, look at a friend of his,
and keep on following that sequence of friends; then you get what is called depth first search,
or DFS. These are the commonly used searching techniques; we will just write them down: breadth
first search and depth first search. Here, if you start your search at vertex 1, we may view
vertices 2 and 4 as the vertices at depth 1, and vertex 3 is at depth 2. So, if you list them
in the order 1, followed by 2 and 4, followed by 3, that would be the breadth first search: you
have listed them in the order of their depths. Whereas in a depth first search you go from 1 to
2, from 2 you could go to 4, and from 4 you could go to 3, so you list them as 1, 2, 4, 3. In
this example the two orderings happen to be the same, but there are examples where these two
searches return different orderings.

(Refer Slide Time: 4:01)

So, let us look at this more formally. First, look at a slightly more complicated example.
Suppose we have the following graph. What will the BFS ordering be, and what will the depth
first search ordering be? We will first write down the BFS ordering, starting at vertex 1. Let
us colour the start vertex red, just to indicate that it is the starting vertex; it is the
first vertex encountered in BFS or in DFS. Now, look at all the neighbours of 1. There are
precisely two neighbours, 2 and 3. List them out, and you get 2 and 3. The next stage is to
list out all the neighbours of 2, followed by all the neighbours of 3. In this case, 3 is among
the neighbours of 2, but we will not list out the already listed neighbours.

So, there are precisely two new neighbours, namely 4 and 5: 4 is a neighbour of 2 as well as 3,
while 5 is a neighbour only of 3 and not of 2. Then we list out the neighbours of 4 and 5, and
we get 6 and 7, and then we have the neighbour 8. So, this would be the BFS ordering. Note that
the BFS ordering is not fully determined: do we visit vertex 2 before 3, or do we visit 3
before 2? That is not really specified; you could visit them in any order, and that leads to a
chain of other choices that are also not fully determined. The same applies in DFS: there are
no restrictions on which neighbour you pick, in either BFS or DFS. What differentiates BFS and
DFS is whether you look at all the neighbours of a node and then go to the next level, layer by
layer, or whether you keep on traversing the graph from one vertex to its neighbour, and then
to its neighbour, and so on.

So, for the DFS ordering, you could go from vertex 1 to vertex, say, 3, and from 3 you could go
to 4. From 4, you could go either to 5 or to 6. The ones which we have visited we just draw in
red. So, we went from 1 to 3, from 3 to 4, then from 4 we could go to 6, from 6 to 8, from 8 to
7, and from 7 to 5. After we have reached 5, we see that all its neighbours are already
visited: the other neighbours of 5 are 3, 4 and 6, and all of them are visited. So you go back
to the node from where you came to 5, namely 7. Its neighbours have all been taken care of; the
same holds for 8 and for 6, so you backtrack all the way up to vertex 4. Vertex 4 has a
neighbour which is not yet visited, namely node number 2, so that is the last node that is
visited. So this is the DFS ordering: 1, 3, 4, 6, 8, 7, 5, 2. Now, we will see how we can
algorithmically implement this. We will see an algorithm for doing these searches.
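
The two orderings traced above can be reproduced with a short sketch. The slide itself is not available here, so the adjacency lists below are an assumption: they are reconstructed from the edges mentioned in the narration and may not match the slide exactly. The order of the neighbours in `DFS_CHOICES` is also chosen by hand so that the depth first walk makes the same choices as in the lecture.

```python
from collections import deque

# Assumed graph, reconstructed from the narration (neighbours in sorted order).
ADJ = {
    1: [2, 3],
    2: [1, 3, 4],
    3: [1, 2, 4, 5],
    4: [2, 3, 5, 6],
    5: [3, 4, 6, 7],
    6: [4, 5, 8],
    7: [5, 8],
    8: [6, 7],
}

# Same graph, with neighbours listed in the order the lecture's DFS tries them.
DFS_CHOICES = {**ADJ, 1: [3, 2], 3: [4, 5, 1, 2], 4: [6, 5, 2, 3],
               6: [8, 4, 5], 8: [7, 6]}

def bfs_order(adj, start):
    """List vertices level by level, using a first-in first-out queue."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return order

def dfs_order(adj, start):
    """List vertices by going as deep as possible, backtracking when stuck."""
    seen = set()
    order = []

    def visit(v):
        seen.add(v)
        order.append(v)
        for u in adj[v]:
            if u not in seen:
                visit(u)

    visit(start)
    return order

print(bfs_order(ADJ, 1))          # level by level from vertex 1
print(dfs_order(DFS_CHOICES, 1))  # the walk 1, 3, 4, 6, 8, 7, 5, 2
```

Changing the order of the neighbour lists changes both outputs, which is the point made above: neither ordering is fully determined by the graph alone.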

(Refer Slide Time: 8:32)

What we will do is the following. We will have colours associated with every vertex. These
colours indicate whether a certain vertex has been visited, whether all its neighbours have
been explored, and so on. We will have 3 colours. Colour will be one attribute; another
attribute will be the predecessor: each vertex will have a predecessor, which indicates the
vertex from which we explored the given vertex. So, if you have a vertex u, u dot colour will
initially be white; this is the start state, and every vertex is initially white. The other
colours are grey and black. Grey means that we have partially explored that vertex and it is
still under process: some of its neighbours have been explored and some of them have not been.

And black means that everything that we had to do with respect to that vertex is over; we have
finished the processing of that particular vertex. The other parameter associated with each
vertex is its predecessor, which we denote by u dot pi. So u dot pi means the predecessor of u,
and this is the vertex from which we reached vertex u. Now, the way our algorithm works is that
it has a queue, and we push vertices into the queue as the algorithm progresses; after all the
processing is done, you will have visited all the vertices reachable from the given starting
vertex. We will assume that the graph that is given to us is an undirected graph which is
connected. Of course, BFS would work even if the graph was a directed graph, and if it was a
disconnected graph as well.

When the graph is disconnected, we can start a BFS at some arbitrary node in each connected
component. We might need to discover what the connected components are, or we could just
start a BFS at every vertex of the graph. To do that, we need to check whether a vertex has
already been covered by a previous BFS. If we have already seen a vertex while searching
some other component, we can skip the BFS from that vertex. So, that modification can be
made. For directed graphs essentially the same algorithm works, but to keep things simple we
will just look at an undirected example. So in the BFS algorithm, we have to initialise the
colours. There is an initialisation phase in which, for every vertex u, u dot colour is set
to white; that means, at the start, all vertices are unexplored.
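
The idea of restarting BFS in each component can be sketched as follows. This is only a minimal illustration: the function name bfs_components and the sample graph are assumptions made for the example, and a dictionary of component labels stands in for the check "has this vertex already been covered by a previous BFS".

```python
from collections import deque

def bfs_components(adj):
    """Label every vertex with the index of its connected component."""
    component = {}              # vertex -> component index; doubles as the "seen" set
    count = 0
    for start in adj:
        if start in component:  # already covered by an earlier BFS, so skip it
            continue
        component[start] = count
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in component:
                    component[u] = count
                    queue.append(u)
        count += 1
    return component, count

# Hypothetical sample graph with two components: {1, 2, 3} and {4, 5}.
sample = {1: [2, 3], 2: [1], 3: [1], 4: [5], 5: [4]}
component, count = bfs_components(sample)
```

Each vertex is labelled exactly once, so the repeated searches together still take time proportional to the size of the adjacency lists.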

So we will assume that the BFS algorithm, which we may simply call BFS, has two parameters:
the given graph and the starting vertex S. For every vertex u, the parent u dot pi is set to
nil. Initially we do not know the predecessor or parent of any vertex, so we just set it to
nil. As our algorithm progresses, when a node is discovered for the very first time, the edge
which caused the discovery of that node tells us its predecessor. So, initially this is set
to nil. Then we also initialise the queue. We will name the queue Q, and it is set to empty.
The queue Q will be used to keep track of the vertices. What the algorithm then does is
repeatedly explore whatever is in Q. So, after Q has been initialised, we have to add one
element: we add the starting vertex S to Q. This is the first step of the algorithm, and
then, while Q is non-empty, we repeatedly do certain actions. We delete an element from Q and
call the deleted element v. Q is a first-in first-out data structure, which means the
elements are removed in the same order in which they were added.

So, at the start we have added just one element, and so the first element to be deleted is
going to be S. Then, for every neighbour of the deleted vertex v, we start exploring. The
deleted node might have many neighbours, and for every neighbour u of v we do the following.
If u is an unexplored neighbour, meaning this vertex has not been explored before, we want to
do something with it. If it has already been explored, we just leave it as it is.

So, if u dot colour is equal to white, we will have to push u into Q, and before adding it
to Q we change its colour. Its processing has now begun, so u dot colour is set to grey, and
its parent, u dot pi, is set to v, because v is the vertex from which we visited u, u being a
neighbour of v. So the parent is set, and then we can add u to Q. After this has been done
for every neighbour, we can change the colour of the vertex v to black, because v has been
fully processed. So, v dot colour is equal to black. So that is the algorithm.
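
The steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the names bfs, colour, pi and order are choices made here, the graph is assumed to be given as a dictionary of adjacency lists, and the sample adjacency lists are an assumption reconstructed to agree with the worked example that follows in the lecture. The starting vertex is coloured grey when it is added to the queue, as in that worked example.

```python
from collections import deque

WHITE, GREY, BLACK = "white", "grey", "black"

def bfs(adj, s):
    """Breadth first search from s; adj maps each vertex to its neighbour list."""
    colour = {u: WHITE for u in adj}   # initialisation: every vertex unexplored
    pi = {u: None for u in adj}        # predecessor of every vertex is nil
    order = []                         # vertices in the order they leave the queue
    colour[s] = GREY                   # the start vertex is now under process
    queue = deque([s])
    while queue:
        v = queue.popleft()            # first-in first-out removal
        order.append(v)
        for u in adj[v]:
            if colour[u] == WHITE:     # unexplored neighbour
                colour[u] = GREY
                pi[u] = v              # v is the vertex from which we reached u
                queue.append(u)
        colour[v] = BLACK              # every neighbour of v has been looked at
    return colour, pi, order

# Assumed adjacency lists for a 7-vertex undirected sample graph.
adj = {
    1: [2, 3], 2: [1, 5, 6], 3: [1, 5, 4], 4: [3, 7],
    5: [2, 3, 6], 6: [7, 5, 2], 7: [6, 4],
}
colour, pi, order = bfs(adj, 1)
```

Every vertex enters the queue at most once and every adjacency list is scanned once, which is where the linear running time with adjacency lists comes from.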

(Refer Slide Time: 17:35)

So, now let us see this in action. Let us say we have the following graph, and we start at
vertex 1. We will trace this using a queue. This is our initial queue, and we first add
vertex 1 to it; that is the only element in the queue. Initially the colours of all vertices
are white. Once a vertex has been added to the queue, its colour changes to grey. So, 1 is
now a grey vertex. Then we delete 1 from the queue, look at all the neighbours of 1, change
their colour and set their parent to 1. The vertex 1 has two neighbours, namely 2 and 3. So
the colour of 2 and 3 is changed to grey and their parent is set to vertex 1. I will indicate
the parent by this arrow. So, 2's parent is 1, and 3's parent is also 1.

So this is the step that you do, and 2 and 3 have been added. We may assume that 2 is added
first and then 3. So vertex 2 and vertex 3 have been added, and 1 has been taken off the
queue. That is the first step. We have deleted that element and we have changed the colour
of 1 to black. It is no longer grey; it has become black, indicating that its processing is
fully over. The queue is still not empty, so we pick out vertex 2. Once vertex 2 has been
deleted, we look at all its neighbours.

Vertex 2 has three neighbours, namely 1, 5 and 6. Of these, 5 and 6 are both currently white,
so we can add them. Let us say we first add 5 and then 6. They go into the queue and their
colour as well as their parent is set. Say the first neighbour is 5: its colour is changed to
grey, its parent is set to 2, and 5 is added. After 5 is added, the next neighbour of 2 is 6;
it also changes its colour to grey, its parent is set to 2, and 6 is added to the queue.
After 5 and 6 have been added, the next node to process is 3. 3 again has three neighbours:
1, 5 and 4. Of them, the only white neighbour is 4. So 4 is changed to grey and its parent is
set to 3.

Additionally, 4 comes into the queue, and 3 is by this stage done with; we have deleted 3.
The next element to be deleted is 5, but there is no further processing to be done with 5,
because all the neighbours of 5 already have a colour other than white. So 5 is also done.
Each time we delete a vertex and finish processing its neighbours, we change its colour to
black. So, we skipped a couple of steps: we would have changed 2 to black, then 3 to black,
and then 5 to black. The next node to be processed is 6. 6 has one white neighbour, namely 7.
So we add 7 to the queue, its colour is changed and its parent is set to 6. The other
neighbours of 6 are 5 and 2, which cannot be added again. After 7 has been added, the
processing of 6 is over, and its colour becomes black. The next node to be processed is 4.
4 has two neighbours, 3 and 7, but both of them have colours different from white, and
therefore there is nothing to be done with the neighbours of 4. 4's colour is changed to
black, and then the queue contains only 7. That is removed; the neighbours of 7, namely 6 and
4, have already been processed, and therefore nothing has to be done with 7 other than
changing its colour to black.

And at this stage, the queue is again empty. So, that processing is over and we would have
traversed all the nodes by that.

And note that if you look at just the predecessors of each node, you get an interesting
diagram. It is a tree: there are no cycles, and it is a directed tree in which every node has
a unique path to the vertex 1. That tree is called the BFS tree. This entire algorithm can be
implemented in linear time, that is, in time proportional to the number of vertices plus the
number of edges, if you maintain the graph as an adjacency list. The next search algorithm
that we will see is DFS.
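
To see the unique path to the root concretely, one can follow predecessors until nil is reached. The sketch below is an illustration only: the helper name path_to_root is an assumption, and the predecessor values are the ones set during the BFS walkthrough above (2 and 3 from 1, 5 and 6 from 2, 4 from 3, 7 from 6).

```python
def path_to_root(pi, v):
    """Follow predecessor pointers from v until the vertex whose parent is nil."""
    path = [v]
    while pi[path[-1]] is not None:
        path.append(pi[path[-1]])
    return path

# Predecessors from the walkthrough; None plays the role of nil.
pi = {1: None, 2: 1, 3: 1, 5: 2, 6: 2, 4: 3, 7: 6}

path_from_7 = path_to_root(pi, 7)
path_from_4 = path_to_root(pi, 4)
```

Because each vertex has exactly one predecessor, this walk never branches and always ends at the starting vertex.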

(Refer Slide Time: 25:31)

So, let us begin the DFS, or depth first search. Again, we will have two parameters
associated with each node, namely its colour and its parent. Instead of a queue, we could
implement the DFS algorithm using a stack. We will see a recursive version of the DFS
algorithm, wherein we do not have to explicitly maintain the stack; recursion essentially
keeps track of the stack. The DFS algorithm requires two inputs: one is the graph itself and
the other is the starting vertex. We will have an initialisation phase, wherein for each
vertex u, u dot colour is set to white and u dot parent is set to nil. So, before you start
the DFS algorithm on any particular graph, we need to do this initialisation.

Once this is initialised, we can call our DFS routine. Let us write down the DFS routine with
two parameters, G and S. We will assume that the initialisation has already been done, and we
do the following. We make the colour of S grey, so S dot colour is equal to grey, and then
for each neighbour v of S we do the following. We check whether v dot colour is equal to
white. While the DFS is being run, the colour of each vertex can change, and we want to
ensure that the colours change from white to grey to black. If you have a white vertex, it
has not been explored yet, and therefore we are ready to explore it now, because it is a
neighbour of vertex S. What we do is just call DFS with this new vertex on the same graph.

So, we call DFS(G, v), and after all the neighbours of S have been processed, we can set the
colour of S to black: S dot colour is equal to black. Also, just before calling DFS(G, v), we
need to set the parent: v dot pi is equal to S. So, let us see how this works. Suppose we
have this particular graph and we start DFS at 1. Initially every vertex is white and there
are no parents; everything is nil. When you call DFS on vertex 1, the colour of that vertex
immediately changes to grey, and then the recursive calls start. Vertex 1 has two neighbours,
namely 2 and 3, and for each of them we are going to call DFS.
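
The recursive routine can be sketched as follows. This is a minimal sketch under stated assumptions: the names dfs, dfs_init, discovered and finished are choices made here, and the adjacency lists are an assumption reconstructed from the neighbour lists mentioned in the walkthrough that follows, including the extra vertex 8, with the neighbour order chosen so that the calls happen in the same order as in the lecture.

```python
WHITE, GREY, BLACK = "white", "grey", "black"

def dfs_init(adj):
    """Initialisation phase: every vertex white, every parent nil."""
    colour = {u: WHITE for u in adj}
    pi = {u: None for u in adj}
    return colour, pi

def dfs(adj, s, colour, pi, discovered, finished):
    """Recursive DFS from s; the call stack plays the role of the explicit stack."""
    colour[s] = GREY
    discovered.append(s)
    for v in adj[s]:
        if colour[v] == WHITE:
            pi[v] = s                  # set the parent just before the recursive call
            dfs(adj, v, colour, pi, discovered, finished)
    colour[s] = BLACK                  # all neighbours of s have been processed
    finished.append(s)

# Assumed adjacency lists for the 8-vertex sample graph.
adj = {
    1: [2, 3, 8], 2: [1, 4, 6, 8], 3: [1, 4, 5, 6], 4: [7, 2, 3, 6],
    5: [7, 3], 6: [2, 3, 4], 7: [4, 5], 8: [1, 2],
}
colour, pi = dfs_init(adj)
discovered, finished = [], []
dfs(adj, 1, colour, pi, discovered, finished)
```

The list finished records the order in which vertices turn black, which is the reverse of the order in which the recursive calls unwind.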

Let us assume that the first DFS call will be for 2 and the next DFS call will be for 3. But
before that we need to check whether they are white; indeed they are white, and therefore we
can make those calls. The very first call is DFS on 2. Before it is made, the parent of 2 is
set to 1, and then we process DFS(G, 2). Inside the recursive routine, the colour of 2 is set
to grey, and then we have to run this code segment for every neighbour of 2.

2 has three neighbours, namely 1, 4 and 6. But of them only 4 and 6 are white, and therefore
those alone will be processed. Let us say vertex 4 is processed first, so its parent is set
to 2, and while DFS(G, 4) is being processed it turns into a grey node. Once it is a grey
node, we have to explore all its neighbours. It has four neighbours: 2, 3, 6 and 7. Three of
them are white: 3, 6 and 7. So presumably all of them will be processed. Let us say that when
DFS on 4 is called, the neighbour picked first is vertex 7; therefore its parent is set to 4,
7 turns grey, and DFS on 7 is processed.

DFS on 7 is processed. 7 has two neighbours, namely 4 and 5, and the only white neighbour is
5, so there is going to be a DFS on 5, and its parent is set to 7. For DFS on 5, its
neighbours are 3 and 7, and 3 is still white, so 3 is processed and turns grey. When you are
processing 3, the only neighbour of 3 that is still white is 6, so it goes directly to 6. Let
us look at the situation now: 6 has turned grey, and all the neighbours of 6 have a colour
other than white. So the call returns to 3. Let us just add one more vertex. If we had a
vertex 8 connected to both 1 and 2, the recursive calls would still have happened in the same
way.

If you look at 6, all its neighbours have been processed, so you go back to 3. For 3 also,
every neighbour has been processed; there is no neighbour of 3 which is white, and therefore
it goes back to 5, from 5 it goes back to 7, from 7 it goes back to 4, and from 4 back to 2,
because all the neighbours of these nodes were already processed. When you come back to 2, 2
has several neighbours, namely 6, 4, 1 and 8. But the only neighbour which is still white is
8, so that is processed: its parent points to 2 and it becomes grey. 8 does not have any
neighbour which is white, and at this point all the calls have been serviced, so the entire
processing is over. We should also have noted that, as these calls return, the colour is
changed from grey to black. The place where the search first got stuck was node 6. When we
were at node 6, all its neighbours had been processed, and at that point 6 would have turned
black. Then we would have come back to 3; all neighbours of 3 have been processed, so it too
would have turned black, and then, when the call returns to 5, 5 would have become black.
Then that call returns to 7, and 7's call is also serviced, so 7 becomes black; 4 would
likewise have become black, and the call returns to 2.

2 has one more neighbour to be processed, 8. After 8's processing is done, 8 becomes black,
then 2 becomes black and 1 becomes black. At this point we have a tree, obtained by looking
at the predecessor of each node, and that is what we call the DFS tree. This entire DFS
operation, which we did by means of recursive calls, can also be implemented by means of a
stack. Instead of each recursive call being serviced automatically by the call stack, we can
explicitly construct our own stack, push vertices onto it, and pop them off as and when they
are processed. In either case we can implement the algorithm in linear time, assuming that
the graph is given in adjacency list format. So, this is about BFS and DFS. We will stop here
for the time being. So, that is the end of this lecture.
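
The explicit-stack variant mentioned above could look like the following. This is one possible sketch, not the lecture's own code: the name dfs_iterative and the small sample graph are assumptions. Each stack entry pairs a vertex with an iterator over its adjacency list, so that the search resumes at the next unexamined neighbour after a "return", which mirrors what the recursion does.

```python
def dfs_iterative(adj, s):
    """Depth first search from s with an explicit stack instead of recursion."""
    colour = {u: "white" for u in adj}
    pi = {u: None for u in adj}
    discovered, finished = [], []
    colour[s] = "grey"
    discovered.append(s)
    stack = [(s, iter(adj[s]))]        # (vertex, iterator over its neighbours)
    while stack:
        v, neighbours = stack[-1]
        for u in neighbours:           # continues where this vertex left off
            if colour[u] == "white":
                colour[u] = "grey"
                pi[u] = v
                discovered.append(u)
                stack.append((u, iter(adj[u])))   # corresponds to the recursive call
                break
        else:                          # no white neighbour left: the call "returns"
            colour[v] = "black"
            finished.append(v)
            stack.pop()
    return pi, discovered, finished

# Hypothetical sample graph: a 4-cycle 1-2-4-3-1.
sample = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
pi, discovered, finished = dfs_iterative(sample, 1)
```

Each adjacency list is consumed once through its iterator, so the running time stays linear in the size of the adjacency lists.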

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture No. 17
Network Flows

In this lecture, we will learn about flow networks. So flow networks are nothing but
capacitated directed graphs. Let us formally define what flow networks are.

(Refer Slide Time: 0:45)

So, a flow network has two components. The first is a directed graph. We will denote the
graph by G = (V, E), where V is a set of vertices and E is a set of edges. Since this is a
directed graph, if we denote an edge by (u, v), there is a direction on the edge: it goes
from u to v and not from v to u. The second component of a flow network is a capacity on the
edges. We can think of that as a function c from the set of edges to the positive reals. That
means that for every edge there is a number associated with it, a positive number, which we
will refer to as the capacity of the edge.

So we will use the notation c_e, or c(e), to indicate the capacity of the edge e. Let us see
an example of such a network. This network has 6 vertices, numbered 1 to 6, and some number
of edges. The edges have been given a direction. So 3-5 is an edge, whereas 5-3 is not an
edge in this particular graph. But you could have edges going in both directions: since we
are looking at directed edges, we could have an edge going in the backward direction as well.
So this would be a flow network. And there is a third component.

The third component is the source and the destination: there are two vertices s and t
belonging to V, which we will call the source and the destination. Here, if s is the vertex
1, then this is what we call the source, and 6 is what we call the destination. So a flow
network contains three parts. The first is a directed graph, the second is the capacity on
the edges, and the third is two designated vertices, which we refer to as source and
destination. We will further assume that all the edges that involve the vertex s are outgoing
edges. So there is a small problem in this particular diagram: the 3-1 edge is an incoming
edge at the vertex 1. What we will assume is that there are no such edges, so this edge would
be an outgoing edge. And at the destination, all the edges are incoming edges. So these are
the restrictions. Capacity has to be positive. At the source all the edges are outgoing, at
the destination all the edges are incoming, and at the internal nodes, that is, the nodes
that are neither source nor destination, the edges could be in either direction. So this is
the definition of a flow network.

Here we have not yet put the edge weights, so let us put them. We put some numbers on the
edges, and those denote the capacities. So c of the edge 2-4 is going to be 5. The
interpretation that we are going to give to this is that the edges are pipes which can carry
a certain amount of fluid, and the maximum amount of fluid that a pipe can carry is bounded
by the capacity of the edge. The problem that we are interested in solving is: what is the
maximum amount of flow that we can send from source to destination? There are additional
conditions that should be met. Since we are thinking of this as fluid flow, there are some
natural conservation criteria that each node should satisfy.

In particular, at any of the internal nodes, the incoming flow must be equal to the outgoing
flow, and on any edge the amount of flow is bounded by the capacity of that edge. So we will
formally define what a flow is, and the objective of this lecture is to characterise the
maximum flow that is possible in a flow network. The flow, which we denote by f, is a
function from the edges to the real numbers. So to each edge we are assigning a certain
amount of flow, and this should satisfy some conditions. The first condition is the capacity
condition, which states that f(e) should be less than or equal to c(e) for all e belonging to
the edge set. The maximum amount of flow that can be routed through any particular edge is
bounded by the capacity of that particular edge.

The second condition is the conservation criterion, which says that the amount of flow that
comes into an internal node should be equal to the amount of flow that leaves that node. This
should be satisfied for all nodes except the source and the destination. We will just write
this as: inflow should be equal to outflow. We will formally define these terms later, but
this is what the conservation criterion says. Nodes other than s and t are what we refer to
as internal nodes, and at every internal node the incoming flow is equal to the outgoing
flow. So let us try an example, and construct a flow on this particular flow network.
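
The two conditions can be checked mechanically. The sketch below is only an illustration: the function name is_valid_flow, the dictionary representation keyed by directed edges (u, v), and the small four-vertex network with its numbers are all assumptions made for the example, not the network drawn in the lecture. Edges missing from the flow dictionary are treated as carrying zero flow.

```python
def is_valid_flow(capacity, flow, s, t):
    """Check the capacity condition and the conservation criterion."""
    # Capacity condition: 0 <= f(e) <= c(e) on every edge.
    for e, c in capacity.items():
        f = flow.get(e, 0)
        if f < 0 or f > c:
            return False
    # Conservation: inflow equals outflow at every internal node.
    vertices = {v for e in capacity for v in e}
    for v in vertices - {s, t}:
        inflow = sum(flow.get((a, b), 0) for (a, b) in capacity if b == v)
        outflow = sum(flow.get((a, b), 0) for (a, b) in capacity if a == v)
        if inflow != outflow:
            return False
    return True

# Hypothetical network: source "s", destination "t", internal nodes "a" and "b".
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
good = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
overloaded = dict(good)
overloaded[("a", "t")] = 3      # exceeds the capacity 2 of edge (a, t)
leaky = dict(good)
leaky[("a", "b")] = 0           # node a now receives 3 but sends only 2
```

Here good satisfies both conditions, overloaded violates the capacity condition and leaky violates conservation at node a.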

So let us say I have 1 unit flowing from 1 to 2, and this is also 1. All my flows are going
to be of unit size, and they are indicated by blue lines. If you look at this flow, you can
verify that at every node the incoming flow is equal to the outgoing flow, because 1 unit is
coming in and 1 unit is going out. So if you take these blue edges, we get a valid flow: on
any edge the amount of flow is only 1, and therefore it naturally meets the capacity
criterion. Now we could look at increasing the flow. Let me indicate it by numbers. Let us
say I send 3 units through this edge and 4 units through this one. So there should be 3
flowing through this, the edge 2-4, and the 4 that comes here could be routed either through
the 3-2 edge or through the 3-5 edge. Through the 3-5 edge I can send only 2, and the
remaining 2 goes here, and 5 comes here and 5 goes here and 2 comes here. So this is a
slightly more involved flow, in the sense that more edges are participating in it. The flow
still respects the capacity conditions and the conservation criterion. But the amount that is
flowing from the source to the destination is equal to 7 units, which is better than the
previous 2. But is this the best possible? Can we find anything significantly better than
this?

In fact, for this network we can argue that it is not possible to find any flow which does
better than this. That follows from a theorem that we will learn later in this lecture,
called the max-flow min-cut theorem. If we use the max-flow min-cut theorem, we can say that
in this particular network we cannot expect to get a flow of value greater than 7. We have
not yet defined what the value of a flow is. We will do that. This is just a glimpse of what
we will be saying in the remainder of this lecture.

(Refer Slide Time: 11:25)

So far we have defined what a capacitated network is and what a flow is. A flow is an
assignment of nonnegative real numbers to the edges in such a way that the capacity criterion
is met at every edge and the conservation criterion is met at every internal node. Now we
will define the value of a flow. The value of a flow f, written v(f), is defined as the
summation of f(e) over all the edges e out of s. The flow associates a nonnegative number to
every edge. Now look at the edges at s; all of them are outgoing edges. If you sum up the
flows on all of them, what you get is referred to as the value of the flow. Our objective is
to maximise the value: find a flow f such that v(f) is maximised. Let us understand this
quantity v(f) carefully.

So we will define some additional quantities. f is defined for each individual edge. We
define f out at a particular vertex, say u, as the outgoing flow at vertex u: that is, the
summation of f(e) over all edges e out of u. Similarly, we define f in of u as the summation
of f(e) over all edges e into u. So suppose you have a particular vertex u with several
incoming edges, each of them carrying a flow, say f(e1), f(e2) and f(e3). If you sum these
up, what you get is f in of u. Similarly, if you have several outgoing edges, say e4, e5 and
e6, with flows f(e4), f(e5) and f(e6), and you sum up the flows over all those outgoing
edges, what you get is f out of u.
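
Written out in symbols, the quantities defined so far are as follows. The superscript notation is a choice made here for readability; the lecture only names these quantities verbally as "f out" and "f in".

```latex
\[
  v(f) = \sum_{e \text{ out of } s} f(e), \qquad
  f^{\mathrm{out}}(u) = \sum_{e \text{ out of } u} f(e), \qquad
  f^{\mathrm{in}}(u) = \sum_{e \text{ into } u} f(e).
\]
% For the vertex u in the example, with incoming edges e_1, e_2, e_3
% and outgoing edges e_4, e_5, e_6:
\[
  f^{\mathrm{in}}(u) = f(e_1) + f(e_2) + f(e_3), \qquad
  f^{\mathrm{out}}(u) = f(e_4) + f(e_5) + f(e_6).
\]
```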

Therefore we can say that the value of the flow is simply f out of s, because s is our
starting vertex, the source vertex; the flow out of s was precisely the definition of the
value of the flow. So f out was a function from vertices to real numbers. Now we can extend
it in a natural fashion to a function from subsets of vertices, S subset of V, to R plus. So
take a set of vertices and call it S. There are lots of edges which leave S, and of course
edges that come into S. Let us mark the incoming edges in red and leave the outgoing ones in
black, and then there are other edges which lie within S itself. If you sum up the flows over
all the outgoing edges, their sum will be f out of S.

So f out of S is the summation of f(e) over all the edges e that are outgoing with respect to
S. Note that this S is the capital S, a set of vertices, and is not to be confused with the
source vertex s. Similarly, f in of S is the summation of f(e) over all edges e into S. We
can then verify that the value of the flow is also equal to f out of S minus f in of S for
any set S which contains the vertex s and does not contain the vertex t. So if you take any
collection of vertices which contains s but does not contain t, its outgoing flow minus its
incoming flow will also be the value of the flow. So let us see why that is the case.

(Refer Slide Time: 16:51)

So we will first define what a cut is, or what we will call an s-t cut. An s-t cut in a graph
G is a partition of the vertices into two parts, such that one part contains s and the other
contains t. So these are the vertices. If we split them into two parts, and this part
contains s and this part contains t, then this split is called an s-t cut. If you look at the
edges of the graph, they are of 4 kinds. There are edges which have both endpoints in the
part containing s; these start and end on the s side. There are edges which start and end in
the part containing t. There are edges which start from the s side and go to the t side, and
then there are edges which start from the t side and come to the s side.

And these edges which go across the cut, that is, from the s side to the t side or from the t
side to the s side, are called cut edges. We will now prove a theorem. Let f be any flow from
s to t and let (A, B) be an s-t cut. When we say (A, B) is an s-t cut, what we mean is that A
is a collection of vertices which contains s, and B is the complement of A, which contains
the vertex t. Then the value of the flow is equal to f out of A minus f in of A. We had
defined f out and f in for a collection of vertices: f out of A is the sum of the flows on
every edge that leaves A, and f in of A is the sum of the flows on every edge that comes into
A, and this difference is equal to the value of the flow. The proof is very simple; it
basically puts all the definitions together. So A is a set, and it contains s.

So now we have all these outgoing edges and what we want is the sum over all these
outgoing edges that will be our f out A. So outgoing edges we are marking in red, and
incoming edges let us mark in green, our RHS is going to be sum over the black edges minus
sum over the green edges. Whatever is the flow on these edges, they have to be added up
appropriately. The black edges you add up, the green edges you add up, subtract them, what
you get is the RHS. Whereas, the LHS is going to be the sum of the outgoing edges at s. s has
only outgoing edges, so you look at all red edges, their sum is going to be your left hand side.
Now we need to show that these two quantities are equal. So how do we do that?

So note that the value of the flow is equal to f out of s minus f in of s. This is so because
f out of s is the value of the flow and f in of s is 0, since s has no incoming edges. So if
you take f out minus f in at the node s, what you get is the value of the flow. Now for every
other node v in this graph, other than t, f out of v minus f in of v is going to be 0. The
conservation law says that, since f is a flow, this quantity, out minus in, is 0 for any
internal vertex. So the summation over all v belonging to A of f out of v minus f in of v is
also equal to the value of the flow. We can write this because the vertex t is not in the
collection A; the vertex t belongs to the part B. So let us look at this summation.

Every vertex belonging to A is either the vertex s or a vertex for which f out of v minus f
in of v is 0, because the only vertices for which it can be nonzero are s and t. s we have
already considered, and t is guaranteed not to be in A, because (A, B) is an s-t cut. So this
summation is the value of the flow. We can simply write it as the summation over v belonging
to A of f out of v, minus the summation over v belonging to A of f in of v. Now look at this
expression. For each vertex in A, the first term looks at all its outgoing edges; so for a
particular edge e leaving a vertex of A, we add f(e) while computing the first term of the
summation. Now the edges can be of 3 kinds: the internal edges, which lie within A, the edges
that go from A to B, and the edges that come from B to A. These are the only kinds of edges
possible; the edges which go from B to B are not even considered. What happens to the A to A
edges? For them f(e) is counted once as positive and once as negative. So in the entire
summation the internal edges cancel out, and the edges across the cut are the only ones that
remain. So there are going to be lots of terms of the form f(ei); some of them appear with a
plus sign and some with a minus sign. Let us understand which ones appear as positive and
which ones as negative.

All the edges which lie within A appear both as positive and as negative, and therefore they
cancel each other out. Those edges which originate in A and end in B appear only once, with a
positive sign, as part of the f out of their starting vertex. So the outgoing ones are the
ones which stay positive. The summation, which was over vertices, can therefore be written as
a sum over edges: the sum of f(e) over the edges e out of A is what remains of the first
term. The edges which come from B to A also remain, because they come in as part of f in,
with a negative sign; that gives the summation of f(e) over the edges e into A. Together this
is, by definition, f out of A minus f in of A. So the value of the flow is equal to the
quantity that we are interested in. From this, what we can say is that the value of any flow
is bounded by the positive quantity in this expression, namely f out of A.
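
The chain of equalities in this argument can be written compactly as below. The superscript notation is a choice made here; the steps are exactly the ones described above: f in of s is zero, conservation holds at every other vertex of A, and edges with both endpoints in A cancel.

```latex
\begin{align*}
  v(f) &= f^{\mathrm{out}}(s) - f^{\mathrm{in}}(s)
          && \text{since } f^{\mathrm{in}}(s) = 0 \\
       &= \sum_{v \in A} \bigl( f^{\mathrm{out}}(v) - f^{\mathrm{in}}(v) \bigr)
          && \text{each } v \in A,\ v \neq s, \text{ contributes } 0 \\
       &= \sum_{e \text{ out of } A} f(e) \;-\; \sum_{e \text{ into } A} f(e)
          && \text{edges inside } A \text{ cancel} \\
       &= f^{\mathrm{out}}(A) - f^{\mathrm{in}}(A).
\end{align*}
```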

(Refer Slide Time: 26:31)

So we can get an upper bound on the flow. We know that the value of any flow is equal to f
out of A minus f in of A. If you look at any particular cut (A, B), f out is at most the sum
of the capacities of the outgoing edges. Say the outgoing edges are e1, e2 and so on up to
ek, with capacities c(e1), c(e2), ..., c(ek). Then f out of A is at most the summation of
c(e) over the edges e out of A. From this we subtract f in of A, which is nonnegative, so we
can just ignore it, because we are interested only in an upper bound. The sum of the
capacities of the edges out of A is what we call the size of the cut, and we denote it by
c(A, B).

So if we call this value the cut size, what we have shown is that v(f) is less than or equal
to the size of the cut. What we will see later is that there is a flow which matches the size
of some cut: for any capacitated network, we can find a flow whose value is equal to the
capacity of some particular cut, and that is the max-flow min-cut theorem. What we have here
says that if you take any cut, the size of the cut is a natural upper bound on the flow that
is possible. In particular, the maximum flow is less than or equal to the min cut: find the
cut whose capacity is the least, and no flow's value can exceed the capacity of that min cut.
How do we show that the min cut is actually equal to the max flow? That is what we will do in
the remaining part of this lecture. So we need some additional concepts.
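
On a small network the bound can be checked by brute force over all s-t cuts. The sketch below is illustrative only: the network, the flow and all the names (cut_capacity, net_flow_across, all_st_cuts) are assumptions made for this example, and enumerating every cut is feasible only for tiny graphs.

```python
from itertools import combinations

# Hypothetical network with source "s" and destination "t", and a valid flow on it.
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
s, t = "s", "t"

def cut_capacity(capacity, A):
    """c(A, B): total capacity of the edges leaving A."""
    return sum(c for (u, v), c in capacity.items() if u in A and v not in A)

def net_flow_across(flow, A):
    """f out of A minus f in of A."""
    out_flow = sum(f for (u, v), f in flow.items() if u in A and v not in A)
    in_flow = sum(f for (u, v), f in flow.items() if u not in A and v in A)
    return out_flow - in_flow

def all_st_cuts(vertices, s, t):
    """Yield every set A that contains s and does not contain t."""
    internal = sorted(vertices - {s, t})
    for k in range(len(internal) + 1):
        for chosen in combinations(internal, k):
            yield {s, *chosen}

vertices = {v for e in capacity for v in e}
cuts = list(all_st_cuts(vertices, s, t))
value = sum(f for (u, v), f in flow.items() if u == s)      # v(f) = f out of s
min_cut = min(cut_capacity(capacity, A) for A in cuts)
```

In this example the flow value and the smallest cut capacity coincide, so this particular flow cannot be improved.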

(Refer Slide Time: 29:12)

So we will define the concept of what is called the residual graph. Before we do that, let us
motivate this concept through an example. Consider a capacitated network with 4 nodes. The
1-2 edge can carry, let us say, 30 units, the capacity of 2-3 is 20, and 3-4 is again 30; 1-3
is 10 and 2-4 is also 10. What is the maximum possible flow? Let us look at this particular
path, 1 to 2 to 3 to 4, indicated by the green curve. We can route 20 units through this path
without violating any of the conditions. But once we have routed 20 units of flow along it,
the 2-3 edge is saturated, and we cannot send any additional flow along this path without
changing the already existing flow.

But clearly we could have sent 30 units from 1 to 2 and split them at 2, 10 on 2 to 4 and 20
on 2 to 3. So we could have done the following: 30 starts from here, 10 goes here, 20 goes
here, and this 20 goes on; an additional 10 comes from here along 1 to 3, and all of that is
sent on 3 to 4. So you could send 40 units from 1 to 4, where 1 is the source and 4 is the
destination, and that is going to be the maximum possible. Now let us look at the case where
we are only sending 20 units of flow from 1 to 4 using the green edges. In that case we will
have a residual graph. We will define the concept of a residual graph later; it will look
something like this. Take the 1-2 edge: there were already 20 units flowing through it. We
can reverse that flow, and this is indicated by a back edge from 2 to 1 which can carry 20
units.

The forward capacity of 1-2 has been used to some extent: 20 units have been used, so there are
10 units remaining that can be routed. From 2 to 3 there was a forward edge which is now
saturated, so we can only send back 20 units along it. And if you look at the edge from 2 to 4,
this edge remains as it is because all its 10 units are available; you can route 10
additional units through it. From 3 to 4 there were 20 units going in the forward direction, so
there are 20 units in the backward direction and there is a residual capacity of 10 that can be
used. And on 1 to 3 there is an additional 10 units of capacity that we can still use. So this is
the residual graph. We call the original graph G, and this is the residual graph for the flow
f, where f was routing 20 through (1,2), 20 through (2,3) and 20 through (3,4). Corresponding
to that flow, we get this particular residual graph. Now look at the residual graph. In the
residual graph there are many paths from 1 to 4, and those residual paths can be
used to route additional flow. For example, you could take 1 to 3, 3 to 2 and 2 to 4 and
route an additional 10 units. Once you do that, the flow changes. The already existing flow of
20 units was there, and now we are routing an additional 10 units. So let us call the new flow
f prime. If we think of it as a function, 20 units from 1 to 2, 20 units from 2 to 3 and 20
units from 3 to 4 were already existing, and we add 10 from 1 to 3, 10 from 3 to 2 and 10 from
2 to 4. This we are routing in the residual graph, but we want a flow in the original graph.
Routing 10 from 3 to 2 amounts to decreasing the flow from 2 to 3, because here we are using
the back edge.

1 to 3 was a forward edge, 3 to 2 was a back edge, and sending 10 on that back edge
means we are decrementing the original flow. So I will just write it as minus 10 from
2 to 3, and 10 from 2 to 4. So the new flow would be 20 from 1 to 2, 10 from 2 to 3 (the 20
and the minus 10 combine and give us 10), 10 from 1 to 3, 10 from 2 to 4, and 20 from 3 to 4.
So this is the new flow; it carries an additional 10 units, 30 in total, which we get by
looking at the residual graph.

(Refer Slide Time: 37:27)

Now corresponding to this new flow f prime, we can again compute the residual graph. It
will be of the following form. From 1 to 2 there was a flow of 20, so a back flow of value 20
is still possible, and in the forward direction there is still a residual capacity of 10.
We had missed one particular edge when writing the flow, namely 2 to 4: the changed flow has
10 from 1 to 3, 10 from 2 to 3 and 10 from 2 to 4. So on 2 to 4 there will be a back edge of
size 10 and no forward capacity, and 1 to 3 is likewise going to be a backward edge of
capacity 10. From 2 to 3, 10 units of the capacity of 20 are used, so you can have 10 in the
forward direction and 10 in the backward direction. And on 3 to 4 there were 20 units used, so
there is a back flow of 20 units and 10 units in the forward direction. So this is the residual
graph corresponding to the new flow.

If you look at this particular residual graph, what you can see is that it is still possible to
route another 10 units of flow. There is a path 1 to 2, 2 to 3, 3 to 4, and you can route a
flow of 10 units along this path. That gives rise to a new flow, which is as follows. 10 units
from 1 to 2 augment the already existing 20 units of flow, so we have 30 from 1 to 2. From 2 to
3 there was already a flow of 10 in the original graph, and we are using this edge to route an
additional 10, so that amounts to 20 from 2 to 3. And 3 to 4 is the other edge that is
affected: there was a flow of 20 from 3 to 4 and we are increasing it by 10, so 30 from 3 to 4.
The other edges remain untouched.

10 from 1 to 3 remains as such, and 10 from 2 to 4 also remains as such. So now if you look at
the original graph, this is a valid flow in the original graph. Let us call this f double
prime. If you compute the corresponding residual graph, it will be as follows. Let us look at
edge 1-2: there is a flow of 30 from 1 to 2, that is the maximum possible, and therefore there
is no residual capacity; the only thing that remains is a back edge of capacity 30. From 2 to 3
there is a flow of 20 and that edge is also saturated, so the only thing that remains is a
single back edge of size 20. And 3 to 4 carries 30, so that is also saturated and you will have
a back edge of capacity 30.

And 1 to 3 is also saturated, so there will be a back edge of capacity 10, and from 2 to 4
there is a flow of 10, so that is also saturated and you will get a back edge of 10 units. Now
if you look at this residual graph G f double prime, you will see that there are no paths from
s to t, and therefore, at least in this situation, it does not look like we can increase the
flow any further. We will prove that as a mathematical theorem. Take any arbitrary flow;
corresponding to that flow we can compute its residual graph. If the residual graph does not
have any path from s to t, then we will show that there is a cut which is saturated. So let us
write this down.

(Refer Slide Time: 41:24)

If f is a flow such that the residual graph Gf does not contain any path from s to t, then f is
a max flow; that is what we will prove. In order to show that f is a max flow, what we will
prove is that there exists a cut (A, B) in G such that the value of f is equal to the capacity
of this cut (A, B). When the value of the flow is equal to the capacity of (A, B), it is
impossible to increase the value. If g was another flow such that the value of g is greater
than the value of f, then automatically the value of g would be greater than c(A, B), but we
have argued earlier that the value of g can be at most c(A, B). So if we show that there exists
a cut whose capacity is equal to the value of the flow, then we have enough reason to conclude
that the flow we have found is a max flow.

(Refer Slide Time: 43:01)

So let us formally define Gf, the residual graph. Here f is any particular flow. The vertex
set Vf is equal to V; the vertex sets are the same. We are going to have additional edges, and
the capacities of the edges are also going to change. Take a flow-carrying edge e = (a, b),
carrying a flow fe. For it we will have two edges. The first is the forward edge, from a to b,
whose capacity is the capacity of the original edge minus the current flow, that is, ce minus
fe. And the backward edge is (b, a), whose capacity is equal to fe: whatever was the flow
becomes the capacity. So this is how the residual graph is defined.
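
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `residual_graph` and the dictionary representation of edges are assumptions made here, and edges of residual capacity zero are simply left out, as in the drawn example. The data is the 4-node network from the example with the flow f of 20 units along 1, 2, 3, 4.

```python
def residual_graph(capacity, flow):
    """Return the residual capacities as a dict {(u, v): capacity}.

    capacity: dict mapping a directed edge (a, b) to its capacity c(e)
    flow:     dict mapping a directed edge (a, b) to its current flow f(e)
    Edges whose residual capacity is zero are omitted (a choice made here).
    """
    residual = {}
    for (a, b), c in capacity.items():
        f = flow.get((a, b), 0)
        if c - f > 0:
            # forward edge a -> b carrying the unused capacity c(e) - f(e)
            residual[(a, b)] = residual.get((a, b), 0) + (c - f)
        if f > 0:
            # back edge b -> a that can cancel up to f(e) units
            residual[(b, a)] = residual.get((b, a), 0) + f
    return residual


# The 4-node example network and the flow f routing 20 units along 1-2-3-4.
capacity = {(1, 2): 30, (2, 3): 20, (3, 4): 30, (1, 3): 10, (2, 4): 10}
flow = {(1, 2): 20, (2, 3): 20, (3, 4): 20}

for edge, cap in sorted(residual_graph(capacity, flow).items()):
    print(edge, cap)
```

The output lists a forward edge (1, 2) of capacity 10 with back edge (2, 1) of capacity 20, only the back edge (3, 2) of capacity 20 for the saturated edge 2-3, and the untouched edges (1, 3) and (2, 4) with their full capacity of 10.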

The vertices are the same as those of the original graph. If a particular edge is carrying a
flow of value fe, then you have a forward edge whose capacity is the difference between the
maximum capacity and the existing flow, and a back edge whose capacity is equal to the value of
the flow. So this is the formal definition of the residual graph. Now what we need to show is
that if the residual graph does not have any s to t connection, that is, if it does not contain
a path from s to t, then we can find a cut which is saturated by this particular flow.

(Refer Slide Time: 45:15)

So we will define the cut. Let f be a flow such that there exist no s-t paths in Gf. Now
let A be the set of vertices reachable from s in Gf. So in the residual graph, look at all the
vertices reachable from s. Clearly, since there is no s-t path, A does not contain t, and the
cut consists of this A and its complement. Let us call this A star, so the cut (A star, B star)
is defined in the following way: the vertices reachable from s in Gf form A star, and the
remaining vertices, those not reachable from s, form B star. Our claim is that the capacity of
this cut (A star, B star) is equal to the value of the flow. How do we prove this?

So let us look at these vertices. This is A star, and the vertex s is sitting somewhere inside
it. Then there is B star, which consists of the unreachable vertices, and t is sitting
somewhere inside it. And of course there are edges which go from A star to B star, drawn in
red, and there are edges from B star to A star as well, drawn in blue. The value of the flow,
as we have argued, is v(f) equal to the flow through the red edges minus the flow through the
blue edges.

What we will argue is that the flow through the red edges is equal to the capacity of the red
edges, and the flow through the blue edges is equal to 0. So this is a cut which is saturated:
the forward edges in the cut have all been saturated, so this is the maximum possible flow. If
we prove these two statements, our conclusion follows, because the value of the flow is the
flow through the red edges minus the flow through the blue edges. So to show that the value of
the flow is equal to c(A star, B star), we just need to show that the red edges are operating
at full capacity and the blue edges are carrying 0 flow. Let us first show that the red edges
are operating at full capacity. It is straightforward. Suppose there was a red edge e
operating at less than its capacity.

So let us say this edge has capacity ce and its flow fe is less than ce. That would mean
that in the residual graph there is a forward edge from A star to B star whose capacity is ce
minus fe, which is positive. That would make the endpoint of this edge in B star reachable from
s, so it would belong to A star. But B star consists of all the unreachable vertices, and
therefore there cannot be such forward edges. Therefore every edge from A star to B star is
operating at its full capacity. So this part is done. For the second part, suppose some blue
edge, going from B star to A star, had a nonzero flow through it, let us say a flow of alpha.
Then in the residual graph there is going to be a back edge from A star to B star with capacity
alpha, and that would make its endpoint in B star reachable. But our definition said B star
consists of all the unreachable vertices, and therefore there cannot be any such edge going
from A star to B star in the residual graph.

So in particular that means the flow through every blue edge is 0. So we have found a
particular cut: if there is no path from s to t in the residual graph, then using that
information we can compute a cut in the original graph and argue that the capacity of that cut
is reached by this particular flow. Since we have found a flow which matches the capacity of a
cut, we know that it is the maximum flow, and that basically is the proof of the Max flow Min
cut theorem. The theorem states that the value of the maximum flow is equal to the capacity of
the min cut. One direction we had already proven, and this part says that if the residual graph
does not have an s to t path, there is no way we can increase the flow any further.

So if we take an arbitrary flow, compute the residual graph, augment along an s to t path, and
keep on doing that as long as we can, the flow keeps on increasing. If you assume that the
capacities are all integers, then at some point the process must stop, and when the process
stops we know that there are no s to t paths. And when there are no s to t paths, there is a cut
which is being saturated. So that is basically the proof of the Max flow Min cut theorem.
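
The whole procedure can be sketched in Python as follows. This is a hedged illustration rather than the lecture's own algorithm: the lecture does not say how the s to t path in the residual graph is chosen, so breadth-first search is assumed here, and the names `max_flow_min_cut`, `a_star` and `cut_capacity` are hypothetical. When no path remains, the set of vertices reachable from s gives the saturated cut.

```python
from collections import deque


def max_flow_min_cut(capacity, s, t):
    """Augment along residual s-t paths until none is left.

    capacity: dict {(a, b): integer capacity}
    Returns (flow value, A*, capacity of the cut (A*, B*)), where A* is the
    set of vertices reachable from s in the final residual graph.
    """
    # Residual capacities; every edge gets a reverse entry starting at 0.
    residual = {}
    for (a, b), c in capacity.items():
        residual[(a, b)] = residual.get((a, b), 0) + c
        residual.setdefault((b, a), 0)
    nodes = {v for edge in capacity for v in edge}

    def search():
        """BFS from s over edges with positive residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    value = 0
    while True:
        parent = search()
        if t not in parent:
            break  # no s-t path in the residual graph
        # Recover the path by walking back from t to s.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck  # use up forward capacity
            residual[(v, u)] += bottleneck  # allow this flow to be cancelled
        value += bottleneck

    a_star = set(parent)  # vertices reachable from s when the loop stopped
    cut_capacity = sum(c for (a, b), c in capacity.items()
                       if a in a_star and b not in a_star)
    return value, a_star, cut_capacity


capacity = {(1, 2): 30, (2, 3): 20, (3, 4): 30, (1, 3): 10, (2, 4): 10}
print(max_flow_min_cut(capacity, 1, 4))  # (40, {1}, 40)
```

On the 4-node example the flow value and the capacity of the cut found both come out as 40, matching the maximum flow computed by hand in the lecture.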

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Lecture 18
Counting Spanning Trees in Complete Graphs

In today’s lecture, we will learn about counting the spanning trees of the complete graph.

(Refer Slide Time: 0:37)

So let us understand what this means. We have talked about spanning trees before.
Suppose G is a graph. A spanning tree T of G is a subgraph containing all the vertices of G
which satisfies two conditions. The first condition is that T should be connected, and the
second condition is that T should be cycle free. So it is basically a tree such that there is a
path from every vertex to every other vertex. You can say that it is a minimal such graph,
minimal in the sense of the number of edges required. So this is the definition of a spanning
tree, and we are looking at the complete graph on n vertices. If we take 3 vertices, this is
the complete graph on 3 vertices, which we denote by K3. If we number the vertices 1, 2, 3,
this graph has several spanning trees. Every spanning tree of this graph will have exactly 2
edges. So 1-2, 2-3 is one possibility; call it T1.

1-3, 2-3 is the second spanning tree, and there is yet another spanning tree which consists of
the edges 1-2 and 1-3. We can verify that these are the only spanning trees of K3. So when n
equals 3, the number is 3. If we had looked at the simpler example of K2, it has precisely one
spanning tree. So we can say for n equals 2 the value is 1 and for n equals 3 the value is 3.
Let us see what happens when n is equal to 4. So we are looking at the complete graph on 4
vertices. How many spanning trees does this graph have? We can enumerate them. All the
spanning trees of K4 will have precisely 3 edges.

Draw K4 as a square with its two diagonals. Consider first the trees which consist only of
boundary edges of the square; each uses 3 of them. So there is one missing boundary edge, and
there are 4 ways you could choose it. So there are 4 trees of this kind, and these 4 do not
include a diagonal. Now the choice becomes: do you include 2 diagonals or do you include only
one diagonal? Let us first look at including 2 diagonals. These are 2 edges, and the third edge
that you add could be any of the 4 boundary edges, and all those 4 choices give you valid
spanning trees. So there are 4 trees of that kind.

Now suppose you choose to include just one diagonal, let us say the 1-4 diagonal; the other
case is symmetric. We must add 2 boundary edges, attaching each of the vertices 2 and 3 to
either 1 or 4. Both could be attached to vertex 1, in which case there is only one possibility;
both could be attached to vertex 4, in which case again there is only one possibility; or
exactly one of them is attached to vertex 1, and there are 2 possibilities of that kind. So
there are 4 trees when you choose to include the 1-4 diagonal, and in the symmetric case, when
you include only the 2-3 diagonal, there are again 4 trees.

So in the case where n is equal to 4, the number of spanning trees is 4 plus 4 plus 4 plus 4,
that is 16. You can imagine that it is going to be extremely time-consuming to do the same
thing by hand for n equal to 5. So what will be the number when n is equal to 5, and so on?
This is what we need to understand. We need to count the number of spanning trees of the
complete graph on n vertices.
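
For small n the enumeration can be left to a program. The following is a brute-force sketch, assumed here purely as a check on the hand counts: it tries every set of n minus 1 edges of the complete graph and keeps those without a cycle, since n minus 1 cycle-free edges on n vertices necessarily form a spanning tree. The function names are hypothetical, and the approach is only feasible for very small n.

```python
from itertools import combinations


def find(parent, v):
    """Return the representative of v's component (union-find, path halving)."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def count_spanning_trees(n):
    """Count spanning trees of the labelled complete graph K_n by brute force."""
    vertices = range(1, n + 1)
    all_edges = list(combinations(vertices, 2))
    count = 0
    for chosen in combinations(all_edges, n - 1):
        parent = {v: v for v in vertices}
        acyclic = True
        for a, b in chosen:
            ra, rb = find(parent, a), find(parent, b)
            if ra == rb:
                # a and b are already connected, so this edge closes a cycle
                acyclic = False
                break
            parent[ra] = rb
        if acyclic:
            count += 1
    return count


print([count_spanning_trees(n) for n in range(2, 6)])  # [1, 3, 16, 125]
```

The first three values agree with the counts obtained by hand for n equal to 2, 3 and 4.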

(Refer Slide Time: 6:21)

And we will do it by a method known as Prüfer coding. You can look at these numbers
and see if you can detect a pattern there. In any case, Prüfer coding gives us the number of
spanning trees of Kn. Here we are looking at a labelled Kn: the vertices carry distinct labels,
and we want to count how many distinct spanning trees there are on these labelled vertices.
The number of spanning trees of Kn is equal to n raised to n minus 2; this is the
theorem that we will prove.

(Refer Slide Time: 7:19)

So the count is n raised to n minus 2. You can verify that when n is 2, 3 and 4. 2 raised to 2
minus 2 is 1. 3 raised to 3 minus 2 is 3. 4 raised to 4 minus 2 is 16. So all those cases we
have verified; we need to prove this in the general case.

(Refer Slide Time: 7:42)

While proving this we will use an interesting technique. We will basically set up a
bijection between two sets. So let us look at this problem: you are given a particular set S
and you want to count the number of elements in it. One way is to explicitly enumerate the
elements of S and see what the count is. Another way is to set up a correspondence with another
set which is easier to count. Suppose we can find a map which sends every element of S to a
unique element of the other collection in such a way that every element of the other set occurs
as the image of exactly one element of S. Such a map is called a bijection.

If we can set up a bijection between these 2 sets, then we know that the counts are going to
agree. There are many natural objects whose count is n raised to n minus 2. We will show
that the spanning trees can be put in bijection with one of those sets. One natural candidate
for a set having n raised to n minus 2 elements is the set of ordered lists of size n minus 2
over a set S consisting of n objects. So we have a set S with n objects, and we are allowed to
form lists of these objects, and the length of each list is n minus 2. The first element can be
chosen in n ways, and since we are allowing repetitions, the second element also has n
possibilities, and so on up to the last of the n minus 2 elements, which also has n
possibilities. Multiplying, the total number of such lists is clearly n raised to n minus 2.

So this is one of the two sets. The main set that we are interested in, let us call it T,
consists of all the spanning trees of Kn. If we are able to find a bijection such that each
such list corresponds to one unique element of the set of spanning trees, then we know that the
counts match. This bijection is known as Prüfer coding. So let us see what this bijection is.

(Refer Slide Time: 10:58)

The bijection will be given by means of an algorithm. We will assume that we are given a
tree, so the input is a tree whose vertices are from the set S. We are going to assume that the
set S is an ordered set, in the sense that given 2 elements of the set we can say which is the
larger element and which is the smaller element. So we will assume that there is a
total order on the set S. The output will be a sequence of length n minus 2, that is, a list of
n minus 2 elements. So Prüfer coding, when given an input tree, produces an output sequence.
What we will need to show is that every sequence is generated by precisely one tree. When we
look at the algorithm for encoding, we can see that a given tree produces only one sequence.

The sequence produced will be unique, but what we will also argue is that for every sequence
there is a tree which generates it, and there is precisely one such tree. So let us see what
the encoding is. The Prüfer code is generated in an algorithmic fashion by repeating an
iterative step n minus 2 times.

What we do is pick the least valued leaf node. You are given a tree, so the tree must surely
have some leaf nodes; amongst the leaf nodes find the smallest valued one. Since it is a leaf
node, it is connected to exactly one node. Delete the leaf along with that edge, and output the
neighbour of the deleted node. Since it is a leaf node, its neighbour is unique, and that
neighbour is what we output. We repeat this n minus 2 times, and we get a sequence of n minus 2
elements. Every tree produces a unique list of n minus 2 elements: if you give a tree, the
answer is unique. But it could happen that two trees give rise to the same list, and it could
also happen that a certain list cannot be produced by any tree. We will rule out both these
cases. The fact that every tree produces a unique list is obvious, because the algorithm has no
randomness: at each step the least valued leaf node is picked, that is unique, and its
neighbour is also unique. Therefore, if you give a particular tree, the output list is
automatically fixed. Let us work this out for an example.

So suppose we look at this particular tree on the vertices 1 to 8. The leaf nodes are 1, 5, 6,
7 and 8. The least valued leaf node is 1, so we delete it; its neighbour is 2, so that is what
we note first. Now if you look at the remaining tree, the least valued leaf node is vertex
number 2 and its neighbour is 4. So we remove 2 and write out the number 4. Amongst the
remaining nodes the leaves are 5, 6, 7 and 8, the least being 5, and its neighbour is 3. So we
remove 5 and write out 3, and after that we remove 6 and 7. The neighbour of both of them is 3,
so when 6 is removed we get a 3 and when 7 is removed we again get a 3. At this point the tree
consists of 3, 4 and 8. If we look at the leaves, there are precisely two, 3 and 8. The least
valued one is 3 and its neighbour is 4. So the sequence that we get is 2, 4, 3, 3, 3, 4.
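
The encoding step can be sketched in Python as below. This is an illustrative sketch, not code from the lecture: the name `prufer_encode` and the edge-list input format are assumptions, and the edge list of the example tree is taken from the connections described in the lecture (1-2, 2-4, 3-4, 4-8, 3-5, 3-6, 3-7).

```python
def prufer_encode(edges):
    """Return the Prüfer code of a labelled tree given as a list of edges."""
    # Adjacency sets: adj[v] is the set of current neighbours of v.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    code = []
    for _ in range(len(adj) - 2):            # n - 2 iterations
        # Pick the least valued leaf node (a node with exactly one neighbour).
        leaf = min(v for v in adj if len(adj[v]) == 1)
        neighbour = next(iter(adj[leaf]))    # its unique neighbour
        code.append(neighbour)               # output the neighbour
        adj[neighbour].remove(leaf)          # delete the leaf and its edge
        del adj[leaf]
    return code


example_tree = [(1, 2), (2, 4), (3, 4), (4, 8), (3, 5), (3, 6), (3, 7)]
print(prufer_encode(example_tree))  # [2, 4, 3, 3, 3, 4]
```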

Now what we will claim is that if we know the underlying vertex set, then from that information
and from these numbers 2, 4, 3, 3, 3, 4 we can recover the tree. I hope the process is
clear: the iterative step is, at each stage, to find the least valued leaf node, delete the
edge corresponding to it, and note the neighbour of the deleted node. So let us just look at
the sequence 2, 4, 3, 3, 3, 4 and try to generate the tree back.

(Refer Slide Time: 17:45)

So suppose we know just this information, and we do not have access to the original tree. From
this, can we recover the tree? If we look at the sequence, there are 2 facts that we can use.
First, every leaf node will be absent from the Prüfer code. Second, every internal node will be
present in the Prüfer code. Why is this so? The only numbers which appear in the code are
neighbours of removed nodes. When a leaf node is removed, the node itself is never output;
only its neighbour is output.

So anything which appears in the Prüfer code had, at the time it was output, a neighbouring
leaf that was being removed as well as some other neighbour, so it has to be an internal
node. So we can clearly see that the leaf nodes are surely absent from the Prüfer code. Now
look at the process of generating the Prüfer code: we start with a tree and remove one edge at
a time, until only a single edge remains. So the degree of every node keeps on decreasing, and
at the end it is either 0 or 1. An internal node has degree at least 2, so it must lose an edge
while it is still not a leaf, and that happens only when a neighbouring leaf is removed. When
this happens, that node appears in the Prüfer code. Therefore every internal node is present
in the Prüfer code.

Once we note these things, we know, by looking at the Prüfer code 2, 4, 3, 3, 3, 4, that the
leaf nodes have to be the missing numbers. The sequence has length 6, and since the vertices
are numbered 1 to 8, the leaf nodes must have been 1, 5, 6, 7, 8. We can look at the diagram
and see that this is indeed the case. Amongst these, the first one to be removed would have
been 1, and it must have been connected to 2, so 1-2 is an edge. Now look at the remaining
sequence 4, 3, 3, 3, 4. 2 is missing from it, and therefore 2 must have been a leaf node of the
remaining tree. So 2 must have been connected to 4. 4 is still present in what remains, 3, 3,
3, 4, and amongst the numbers that are not present the least one is 5, so 5 is connected to 3.

The next missing number is 6, so 6 is also connected to 3, and then you have 7, and 7 is also
connected to 3. After this point 3 is no longer present in the remaining sequence, which is
just 4, and therefore 3 must have been connected to 4. At this point only two nodes remain, 4
and 8, and so 8 must be connected to 4.
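
The reconstruction just carried out by hand can be sketched as a short Python function. It is an illustration under assumptions made here (the name `prufer_decode` and the output as a list of edges are hypothetical): at each step the smallest vertex that is still available and does not occur in the rest of the sequence is joined to the current entry, and the last two remaining vertices are joined at the end.

```python
def prufer_decode(code, vertices):
    """Rebuild the edges of the tree on `vertices` whose Prüfer code is `code`."""
    code = list(code)
    remaining = sorted(vertices)          # vertices not yet removed
    edges = []
    for i, a in enumerate(code):
        still_to_come = set(code[i:])     # entries not yet consumed
        # The smallest remaining vertex missing from the rest of the code
        # is the leaf that was removed at this step.
        x = min(v for v in remaining if v not in still_to_come)
        edges.append((x, a))
        remaining.remove(x)
    edges.append(tuple(remaining))        # the final two vertices form an edge
    return edges


print(prufer_decode([2, 4, 3, 3, 3, 4], range(1, 9)))
# [(1, 2), (2, 4), (5, 3), (6, 3), (7, 3), (3, 4), (4, 8)]
```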

(Refer Slide Time: 21:39)

So from this sequence we could, in some sense, generate the tree. We have not given a formal
proof; we will do that shortly. Vertex 3 is connected to 5, 6, 7, vertex 4 is connected to 3, 8
and 2, and vertex 1 is connected to 2. So at least in this case we have verified
that things work nicely. So let us formally prove the theorem.

(Refer Slide Time: 22:05)

The number of spanning trees of Kn is equal to n raised to n minus 2. Let us assume that
the vertices are from a set S, which we will assume to be an ordered set of size n. Let T
denote the set of spanning trees and L denote the set of lists of size n minus 2 over S. What
we will prove is that there exists a bijection between T and L. We will prove this by induction
on n. The base cases we have already checked. For n equal to, let us say, 3, we had verified
that T contains 3 elements, and L also contains 3 elements, because what you are looking at is
lists of size 3 minus 2, that is, lists of size 1 over S. There are precisely 3 elements in S,
so the number of lists of size 1 is precisely 3. So the base case can be easily checked.

Now let us assume that the induction hypothesis holds for all values up to n minus 1. We want
to show that the statement is true for n also, namely that every sequence on S of size n minus
2 corresponds to a unique spanning tree. So let us look at a particular sequence; call it a,
equal to a1, a2, ..., an-2. We will exhibit a tree corresponding to it and argue that this tree
is unique. Now look at a2 to an-2. That is a sequence of length n minus 3; it is a subsequence
of a. If we know which vertices are under consideration there, we can inductively argue that a
unique spanning tree can be constructed corresponding to it. To that spanning tree we add a
particular edge, and we get the tree that we are interested in. That is our line of attack, so
let us look at it more carefully.

(Refer Slide Time: 25:58)

So look at the smallest number of S that is not present in a, and call it x. We had argued
earlier that the numbers missing from a1, a2 up to an-2 have to be leaves, and the
smallest of them is the first node removed when we construct the code. So x is the leaf node
that was removed first, and x must have been connected to a1. Now look at the set S minus x.
This is a set of size n minus 1, and the subsequence a2 to an-2 is a sequence over it: it
cannot contain x, because the original sequence itself did not contain x.

(Refer Slide Time: 27:41)

By our induction hypothesis, a2 to an-2 corresponds to a unique tree T prime on S minus x.
That tree together with the edge (x, a1) corresponds to the sequence a. So call T prime union
(x, a1) as T; then T is a spanning tree whose Prüfer code is a. Why is this so? Clearly, if you
add (x, a1) to T prime, it does not create any cycle, because x is not a vertex of T prime, so
the new edge makes x a leaf, and attaching a leaf to some other node is not going to create a
cycle. So the graph that you get is indeed a tree, and if you look at its Prüfer code, it is
precisely a1, a2, ..., an-2. So this is basically the proof that corresponding to every
sequence you can get a unique spanning tree. The uniqueness comes about because a2 to an-2, by
the induction hypothesis, has a unique T prime, and the only tree that can be obtained is by
adding (x, a1) to T prime.

So the only graph that could possibly correspond to this particular Prüfer code is T, obtained
by adding the (x, a1) edge to the tree T prime of a2, ..., an-2. That concludes
the proof and the lecture.
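
The bijection can also be checked mechanically for small n. The sketch below is an assumption-laden illustration rather than part of the proof: the helper names `encode`, `decode` and `check_bijection` are hypothetical, and the vertex set is taken to be 1 to n. It decodes every sequence of length n minus 2, re-encodes the resulting tree, and counts the distinct trees obtained.

```python
from itertools import product


def encode(edges):
    """Prüfer code (as a tuple) of a labelled tree given by its edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    code = []
    for _ in range(len(adj) - 2):
        leaf = min(v for v in adj if len(adj[v]) == 1)  # least valued leaf
        (neighbour,) = adj.pop(leaf)                     # its unique neighbour
        adj[neighbour].remove(leaf)
        code.append(neighbour)
    return tuple(code)


def decode(code, n):
    """Edges of the tree on vertices 1..n with the given Prüfer code."""
    remaining = list(range(1, n + 1))
    edges = []
    for i, a in enumerate(code):
        # smallest remaining vertex absent from the rest of the code
        x = min(v for v in remaining if v not in code[i:])
        edges.append((x, a))
        remaining.remove(x)
    edges.append(tuple(remaining))
    return edges


def check_bijection(n):
    """Decode every sequence, check the round trip, return the number of trees."""
    trees = set()
    for code in product(range(1, n + 1), repeat=n - 2):
        edges = decode(code, n)
        assert encode(edges) == code      # the tree really has this code
        trees.add(frozenset(frozenset(e) for e in edges))
    return len(trees)


for n in range(2, 6):
    print(n, check_bijection(n), n ** (n - 2))
```

For each n from 2 to 5 the number of distinct trees obtained equals n raised to n minus 2, as the theorem states.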

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering,
Indian Institute of Technology, Guwahati
Lecture 19
Embedding of the theory of rational numbers in set theory Paradoxes 2

(Refer Slide Time: 00:36)

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the fourth lecture on Set
Theory. In the previous lecture, we have seen how to embed the theories of natural numbers,
integers and rational numbers in set theory.

(Refer Slide Time: 00:42)

In particular, we define the natural numbers, thus. 0 was defined as the empty set, then 1 was
defined as the successor of the empty set, which turns out to be the singleton containing the
empty set or in other words the singleton containing just 0.

The successor of 1 is 2, which turns out to be the two-member set containing 0 and 1, that is,
the empty set and the singleton containing the empty set. Then the successor of 2 is 3, which
is the set containing just three members, 0, 1 and 2, and the successor of 3 is 4, which
contains just four members, 0, 1, 2, 3. In general, the natural number N can be represented
using the set containing 0, 1, 2, 3, etcetera, up to N minus 1; that is, the natural numbers
0 to N minus 1 form the set which is named the natural number N.

So, every natural number in this sense is defined as a set. We defined the notion of inductive
sets. An inductive set is a set which contains the empty set and is closed under the successor
operator. The successor operator on a set A gives us the set which is the union of A
and the singleton containing A. Then omega is the smallest inductive set; omega
is an inductive set in itself, and this omega is defined as the set of all natural numbers. As
you can see, omega contains the empty set, which is 0, its successor 1, its successor 2, and so
on. By necessity, because it should be closed under the successor operator, it should
contain all of 0, 1, 2, 3, etcetera. So, every natural number is now a set, and we then defined
operations on natural numbers as set theoretic operations.
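
The construction of the first few natural numbers can be imitated on a computer. The following is only a small sketch, assuming Python frozensets as a stand-in for sets; the names successor and natural are illustrative and do not come from the lecture.

```python
def successor(a: frozenset) -> frozenset:
    """Successor of a set A: the union of A and the singleton containing A."""
    return a | frozenset({a})


def natural(k: int) -> frozenset:
    """Build the set that represents the natural number k, starting from the empty set."""
    n = frozenset()          # 0 is the empty set
    for _ in range(k):
        n = successor(n)     # apply the successor operator k times
    return n


three = natural(3)
# 3 is the set containing exactly 0, 1 and 2
print(three == frozenset({natural(0), natural(1), natural(2)}))
# the set representing N has N members
print(len(natural(4)))
```

In this representation, a smaller natural number is both a member and a proper subset of a larger one.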

For example, addition on natural numbers is defined in terms of the successor:
the sum of N and M plus 1 is the successor of the sum of N and M. So we are defining
the sum of N and M plus 1 recursively, using the sum of N and M, which involves a smaller
number, and the successor operator. So, effectively, addition is defined in terms of the
successor operator. Multiplication can be defined in the same way using addition: the
product of N and M plus 1 is the product of N and M, plus N. In this way the operations on
natural numbers could be defined as set theoretic operations.
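
The two recursions can be sketched directly on the set representation. This is a hedged illustration: the base cases N plus 0 equal to N and N into 0 equal to 0 are the usual ones and are assumed here, and the helper names (succ, pred, add, mul, nat) are hypothetical.

```python
ZERO = frozenset()  # the natural number 0 is the empty set


def succ(a):
    """Successor: the union of a and the singleton containing a."""
    return a | frozenset([a])


def pred(a):
    """Predecessor of a nonzero natural: its largest member (the one with most elements)."""
    return max(a, key=len)


def add(n, m):
    """n + 0 = n (assumed base case); n + (m + 1) = successor of (n + m)."""
    return n if m == ZERO else succ(add(n, pred(m)))


def mul(n, m):
    """n * 0 = 0 (assumed base case); n * (m + 1) = (n * m) + n."""
    return ZERO if m == ZERO else add(mul(n, pred(m)), n)


def nat(k):
    """The set representing the natural number k."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n


print(add(nat(2), nat(3)) == nat(5))
print(mul(nat(2), nat(3)) == nat(6))
```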

(Refer Slide Time: 03:20)

Then we went on to embed the theory of integers in set theory. We considered the
cross product N cross N, where N is the set of natural numbers, and on it a relation tilde,
which is defined in this manner. The ordered pair (m, n) is in relation tilde with (p, q) if and
only if m plus q equals n plus p. The idea is that m minus n should equal p minus q. The set
of integers Z is then defined as the set of equivalence classes under this relation tilde; that is,
the quotient of N cross N by tilde is the set of integers.

(Refer Slide Time: 04:07)

And then we defined operations on integers as operations on such equivalence classes. An
integer is such an equivalence class, so operations on integers have to be translated into
operations on such sets. We defined plus Z and into Z appropriately. Along with these
operations, Z with the integers 0 and 1 forms an integral domain.
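
A small sketch of this embedding follows, with a pair (m, n) of ordinary Python integers standing for m minus n. The lecture does not spell out plus Z and into Z here, so the formulas below are the usual ones and should be read as an assumption; all function names are illustrative.

```python
def equivalent(a, b):
    """(m, n) tilde (p, q) if and only if m + q == n + p."""
    (m, n), (p, q) = a, b
    return m + q == n + p


def canonical(pair):
    """Pick one representative of the class: subtract the smaller component from both."""
    m, n = pair
    k = min(m, n)
    return (m - k, n - k)


def add_z(a, b):
    """(m - n) + (p - q) = (m + p) - (n + q)."""
    (m, n), (p, q) = a, b
    return canonical((m + p, n + q))


def mul_z(a, b):
    """(m - n)(p - q) = (mp + nq) - (mq + np)."""
    (m, n), (p, q) = a, b
    return canonical((m * p + n * q, m * q + n * p))


minus_three = (2, 5)                    # 2 - 5
print(equivalent(minus_three, (0, 3)))  # same integer as 0 - 3
print(add_z(minus_three, (4, 1)))       # -3 + 3
print(mul_z((0, 3), (0, 2)))            # (-3) * (-2)
```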

(Refer Slide Time: 4:30)

Then we went on to the theory of rational numbers and saw how to embed the theory of
rational numbers in set theory. In particular, we considered the set Z prime, which is Z
without 0, and then the cross product Z cross Z prime, on which we defined a relation join:
the ordered pair (a, b) is in relation join with the ordered pair (c, d) if and only if ad is equal
to bc. The idea is that the fraction a by b should be the same as the fraction c by d.

(Refer Slide Time: 04:58)

Then, the set of all equivalence classes of this relation is defined as the set of all rational
numbers. In other words, the quotient of Z cross Z prime by the join relation is called the set
of all rational numbers. So, a rational number is an equivalence class under this relation, and
the addition operation and the multiplication operation were appropriately defined,
consistent with our notions of rational number addition and multiplication.
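
The same idea can be sketched for rational numbers, with a pair (a, b), b nonzero, standing for the fraction a by b. The formulas for addition and multiplication are the familiar ones for fractions and are assumed here, since the lecture only says they were appropriately defined; the function names are illustrative.

```python
from math import gcd


def equivalent(x, y):
    """(a, b) is related to (c, d) if and only if ad == bc."""
    (a, b), (c, d) = x, y
    return a * d == b * c


def canonical(x):
    """Representative of the class in reduced form with a positive denominator."""
    a, b = x
    g = gcd(a, b)
    if b < 0:
        g = -g
    return (a // g, b // g)


def add_q(x, y):
    """a/b + c/d = (ad + bc) / bd."""
    (a, b), (c, d) = x, y
    return canonical((a * d + b * c, b * d))


def mul_q(x, y):
    """(a/b)(c/d) = ac / bd."""
    (a, b), (c, d) = x, y
    return canonical((a * c, b * d))


print(equivalent((1, 2), (2, 4)))
print(add_q((1, 2), (1, 3)))
print(mul_q((2, 3), (3, 4)))
```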

(Refer Slide Time: 05:29)

And then we found that the set of rational numbers Q, along with these two operations, addition
and multiplication thus defined, and with 0 and 1, is a field, and we defined the
less than relation as a linear order on Q.

(Refer Slide Time: 05:44)

We say that the set S is countable, if there is an injective function from S to N, that is the
members of S could be counted using natural numbers. So, you could say this is the 0th
member, this is the first member, this is the second member, and so on.

(Refer Slide Time: 05:58)

We saw that the set of rational numbers is countable. In particular, when you consider all
ordered pairs of natural numbers, this set is countable. That is, if you consider the first
quadrant, then you can count the ordered pairs in this order: counting the ordered pairs
belonging to one diagonal at a time, you can count all of them. Extending this notion, you can
show that the set of all rational numbers is countable.
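
The diagonal-by-diagonal count can be sketched as follows. The direction of travel along each diagonal is an assumption made for this illustration (the slide is not reproduced in the transcript), and the names pairs_by_diagonal and index_of are hypothetical.

```python
from itertools import count, islice


def pairs_by_diagonal():
    """Yield every ordered pair of natural numbers, one diagonal m + n = s at a time."""
    for s in count(0):
        for m in range(s + 1):
            yield (m, s - m)


def index_of(m, n):
    """Position of (m, n) in the enumeration above: an injective map into N."""
    s = m + n
    return s * (s + 1) // 2 + m


print(list(islice(pairs_by_diagonal(), 6)))
print(index_of(1, 1))
```

Since index_of gives each pair its own natural number, it witnesses that the set of ordered pairs is countable in the sense defined above.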

(Refer Slide Time: 06:26)

Now, we come to real numbers. Of course we know that there are real numbers that are not
rational, that is, irrational. In particular, root 2 is irrational. It has been known for long that
root 2 is irrational. The proof goes thus. Assume the contrary, that is, suppose root 2 is
rational. If root 2 is rational then we would be able to represent
root 2 as a fraction a prime by b prime, where a prime and b prime are integers.

So, consider this fraction a prime by b prime. Out of a prime and b prime we can remove the
common factors and get this fraction in reduced form. Let us say a by b is the reduced form.
What I mean is that a and b do not have a common factor, or that the gcd of a and b is 1. So, if
root 2 is a rational number, then there exist a and b such that root 2 is a by b and the gcd of
(a, b) is equal to 1.

(Refer Slide Time: 07:30)

So, root 2 is a by b. If root 2 is a by b, then root 2 times b is a. Squaring both sides, we have
2 b squared equal to a squared. On the right hand side we have a squared, which is a square.
On the left hand side we have 2 b squared, which is an even number: it is 2 multiplied by
something. This means a squared is an even square. We know that an even square is the
square of an even number, since the square of an odd number is always odd. Therefore, if a
squared is an even number, then a is also an even number.

(Refer Slide Time: 08:05)

So, a is even. So, let a be 2k and substitute it in this equation. We have a squared equal to 2 b
squared, and a squared is equal to 4 k squared, which is therefore 2 b squared. So in
particular, let us consider this equation: 4 k squared is equal to 2 b squared. 2 is a common
factor on both sides, so let us cancel 2 from both sides; we have 2 k squared equal to b
squared. Now, 2 k squared is even. Therefore, b squared is even as well, which means b
squared is an even square.

(Refer Slide Time: 08:40)

Which means b is even. But we had earlier found that a is even. So, a is even and b is
even, which means 2 is a common factor of a and b or, in other words, the gcd of (a, b) is
greater than or equal to 2. That is a contradiction, because we
assumed that the gcd of (a, b) is equal to 1: root 2 had been written in the reduced form a by
b, so a and b do not have a common factor, but here we find that 2 is a common factor.
Therefore, root 2 cannot be rational. Root 2 is irrational. So, there are irrational
numbers.
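
The chain of steps in this argument can be written out compactly as follows; the symbols a, b and k are those used in the proof.

```latex
\begin{align*}
\sqrt{2} = \frac{a}{b},\ \gcd(a,b) = 1
  &\;\Longrightarrow\; 2b^2 = a^2
   \;\Longrightarrow\; a \text{ is even, say } a = 2k \\
  &\;\Longrightarrow\; 2b^2 = 4k^2
   \;\Longrightarrow\; b^2 = 2k^2
   \;\Longrightarrow\; b \text{ is even} \\
  &\;\Longrightarrow\; \gcd(a,b) \ge 2, \text{ contradicting } \gcd(a,b) = 1 .
\end{align*}
```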

(Refer Slide Time: 9:22)

So, there are real numbers which are not rational. Now, we are trying to embed the theory of
real numbers in set theory. So, what we need is that every real number should be constructed
as a set. The sets that we construct are called ‘Dedekind Cuts’.

(Refer Slide Time: 9:46)

We define a Dedekind cut thus. A Dedekind cut is a subset x of Q, so x is a set of rational
numbers, such that x is not empty, x is not Q, x is closed downwards and, one more
condition, x has no largest member. Such a set is called a ‘Dedekind cut’. So, once again, a
Dedekind cut is a set of rational numbers, but not every set of rational numbers is a Dedekind
cut. In particular, a Dedekind cut cannot be the empty set, it cannot be Q either, and a
Dedekind cut has to be closed downwards.

(Refer Slide Time: 10:29)

What it means is this. If x is a Dedekind cut, then x is a set of rational numbers. Let us say q
is a rational number which belongs to x, and let us say r is a rational number which is less
than q. In that case r belongs to x. So, when we say that x is closed
downwards, we mean that for every rational number q which belongs to x and for every
rational number r which is less than q, r belongs to x.

(Refer Slide Time: 11:16)

And we say that x has no largest member. Such a set is a Dedekind cut. So, we can say
that, when you consider the set of all rational numbers, the Dedekind cut x divides it into two
sets. x is one set and its complement x bar is the other set; that is, x bar is Q minus x, the
relative complement. It partitions the set of all rational numbers into two. On the smaller
side we have x, on the higher side we have x bar. Every rational number belonging to x bar
is larger than every rational number belonging to x. x has no largest member. x bar may or
may not have a smallest member. Such a partition of the set of rational numbers is effected by
a Dedekind cut.

(Refer Slide Time: 12:12)

In particular, let us consider the number root 2, which we have just shown to be
irrational. A decimal approximation for root 2 is 1.414213562 etcetera. In particular,
consider the set of rational numbers containing 1, 1.4, 1.41, 1.414, etcetera. These are all
rational numbers that are smaller than root 2. So, the Dedekind cut corresponding to root 2
will contain all these rational numbers. It is a superset of all these.

(Refer Slide Time: 12:48)

We consider all rational numbers that are less than root 2. The set of those forms a cut. This
cut is what we equate with root 2. That is, the real number root 2 is equated to this cut. So
now, every real number becomes a set. In particular, it becomes a Dedekind cut.

(Refer Slide Time: 13:14)

For every real number x, the cut that is associated with x is the set of all rational numbers that
are less than x. But of course the cut cannot be defined in this manner, because here x is a real
number. So, the cut is defined using only rational numbers, as we saw earlier.
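
For root 2, membership in the cut can be tested with rational arithmetic alone: a rational q is below root 2 exactly when q is negative or q squared is less than 2. The following is a sketch of that test, assuming Python's Fraction type as a model of rational numbers; the function names are illustrative.

```python
from fractions import Fraction


def in_sqrt2_cut(q: Fraction) -> bool:
    """True when the rational q belongs to the cut equated with root 2."""
    return q < 0 or q * q < 2


def rational_cut(p: Fraction):
    """Membership test for the cut associated with a rational number p."""
    return lambda q: q < p


# the decimal approximations 1, 1.4, 1.41, 1.414 all lie in the cut for root 2
print(all(in_sqrt2_cut(Fraction(s)) for s in ["1", "1.4", "1.41", "1.414"]))
# 1.415 is already too large
print(in_sqrt2_cut(Fraction("1.415")))
```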

(Refer Slide Time: 13:34)

So, the set of all real numbers is now the set of all Dedekind cuts. Every real number is a
Dedekind cut.

(Refer Slide Time: 13:40)

It can be shown that every real number has a decimal representation. For example, when you
are given a real number x, consider x on the real line and then consider the largest integer
not exceeding x. Let us suppose that is N. Then N is the integer approximation to x, that is,
the floor of x is N. Then consider the portion of the real line from N to x. This
segment has a length less than 1. Divide this into tenths. The number of whole tenths from N
to x will be the next digit. Suppose that is n1. So, n1 will be the digit in the tenths place, and
so on. You can construct the decimal representation of the real number in this manner.
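
The digit-by-digit procedure can be sketched as below. To keep the arithmetic exact, the sketch takes a rational number (a Python Fraction) as the input x; that restriction, and the name decimal_digits, are assumptions of this illustration only.

```python
from fractions import Fraction
from math import floor


def decimal_digits(x: Fraction, places: int):
    """Return the integer part of x and its first `places` decimal digits."""
    n = floor(x)              # the integer approximation N
    rest = x - n              # the segment from N to x, of length less than 1
    digits = []
    for _ in range(places):
        rest *= 10            # measure the remaining segment in tenths
        d = floor(rest)       # number of whole tenths: the next digit
        digits.append(d)
        rest -= d             # what is left over after removing those tenths
    return n, digits


print(decimal_digits(Fraction(1, 3), 4))
print(decimal_digits(Fraction(22, 7), 6))
```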

(Refer Slide Time: 14:47)

Consider the rational number 1 by 3. A rational number is also a real number. Therefore, we
have a Dedekind cut associated with 1 by 3 as well. The set of all rational numbers less than 1
by 3 forms the Dedekind cut that is associated with the real number 1 by 3. Its
complement x bar in this case has a smallest member, which happens to be 1 by 3 itself.
Therefore, if the complement of a Dedekind cut has a smallest member, then that Dedekind
cut corresponds to a rational number. If it does not have a smallest member, then it
corresponds to an irrational number.

(Refer Slide Time: 15:35)

In particular, for root 2, the cut corresponding to root 2 is the set of all rational numbers less
than root 2, and its complement is the set of all rational numbers greater than root 2. Since
root 2 is not rational, it will not belong to either set. So, the complement in this case, we see,
does not have a smallest element.

(Refer Slide Time: 15:56)

We can show that the set of all real numbers, R, is not countable. We saw earlier that
the set of all natural numbers, the set of all integers and the set of all rational numbers are all
countable, but the set of all real numbers is not countable. So, let us prove this now. In
particular, let us consider the part of the real line between 0 and 1, with 0 and 1 excluded. So,
we consider the interval (0, 1), open at both ends; that is, we are considering all real fractions.

(Refer Slide Time: 16:30)

We will show that (0, 1) is uncountable. We use a technique called ‘Diagonalization’. If (0,
1) is uncountable, then its superset R should also be uncountable. So, let us assume that (0,
1) is countable, that is, the set of all real fractions is countable, and we will derive a
contradiction.

(Refer Slide Time: 16:58)

If this set is countable, then there exists a one-to-one mapping from the set of real fractions
into the set of natural numbers. So, you could say this is the first fraction, this is the second
fraction, this is the third fraction, this is the fourth fraction and so on. So, there is an
enumeration of the fractions. Let us say we have this enumeration. Let us say 0. a1 a2 a3
a4 etcetera is the first fraction, where a1, a2, a3, etcetera are all digits; this is the decimal
representation. The second fraction is 0. b1 b2 b3 etcetera, and so on. Then, in the
diagonalization technique, we pick the diagonal digits. From the first fraction we pick a1,
from the second fraction we pick b2, from the third fraction we pick c3 and so on. In general,
from the i th fraction we pick the i th digit.

(Refer Slide Time: 17:51)

After picking digits in this fashion, we form a new fraction, which we write thus: 0. a1
prime b2 prime c3 prime d4 prime etcetera. Here x prime is defined as x plus 1 mod 10. For
example, if x is 0, x prime is 1, if x is 1 then x prime is 2 and so on, and when x is 9, x prime
is 0. So, we can see that x prime certainly differs from x. This new fraction differs from
every single fraction in the enumeration.

(Refer Slide Time: 18:29)

It differs from the first fraction: in the first fraction we have a1 in the first position, whereas
in the new fraction that we have constructed we have a1 prime in the first position. It differs
from the second one, because in the second one we have b2 in the second position but we
have b2 prime in the new fraction. It differs from the third fraction in the third position. In
general, it will differ from the i th fraction in the i th digit.

(Refer Slide Time: 18:56)

So, the new fraction is not in the enumeration. In other words, the hypothetical enumeration
that we had is not exhaustive. That is a contradiction, because we
assumed that the set of all real fractions was enumerable and therefore that this
enumeration was exhaustive.

(Refer Slide Time: 19:06)

So, here is an example of the construction. If the fractions that we had were like this, then
the first fraction here has 3 in the first position, so in the new fraction that we construct we
will write 3 plus 1, that is 4. We had 8 in the second position of the second fraction, so we
write 9 in the second position of the new fraction. We had 4 in the third position of the third
fraction, so we will write 5 in the third position of the new fraction, and so on. So, the new
fraction that we construct does not match any of the existing fractions. This technique is
called the ‘Diagonalization technique’.
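
The construction can be tried out on a finite listing of digit rows. In this sketch only the diagonal digits 3, 8 and 4 come from the lecture's example; every other digit in the listing, the fourth row, and the name diagonal_fraction are made up for illustration.

```python
def diagonal_fraction(fractions):
    """From the i-th row take the i-th digit x and replace it by (x + 1) mod 10."""
    return [(row[i] + 1) % 10 for i, row in enumerate(fractions)]


# each row holds the first few decimal digits of one fraction in the enumeration
listing = [
    [3, 1, 4, 1],   # first fraction: 3 in the first position
    [2, 8, 7, 9],   # second fraction: 8 in the second position
    [5, 0, 4, 6],   # third fraction: 4 in the third position
    [1, 9, 9, 9],   # a hypothetical fourth fraction, showing 9 wrapping to 0
]

new = diagonal_fraction(listing)
print(new)
# the new row differs from the i-th row in the i-th digit
print(all(new[i] != listing[i][i] for i in range(len(listing))))
```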

(Refer Slide Time: 19:53)

Since the set of all real numbers is a superset of the set of real fractions (0, 1), R is also
uncountable. So, what we find is that the set of natural numbers N, the set of integers Z and
the set of rational numbers Q are all countable, but the set of all real numbers is an
uncountable set. Discrete mathematics deals with countable sets.

(Refer Slide Time: 20:24)

Now, we define a linear order, the less than relation, between real numbers in this manner. We
say that a real number r is less than a real number s if and only if r is a proper subset of s.
Here r is a Dedekind cut, which is a set, and s is also a Dedekind cut; r is less than s precisely
when r is a proper subset of s. When the less than relation is defined in
this manner, we can show that it is a linear order.

(Refer Slide Time: 20:43)

Then, we define the addition operator. The real number addition operation is defined thus.
For real numbers x and y, x plus y is defined as the set of all rational numbers q plus r, so that
q belongs to x and r belongs to y. Here, x and y are treated as Dedekind cuts.

(Refer Slide Time: 21:04)

Similarly, the multiplication operation is defined like this. For non-negative real numbers x
and y, the product of x and y is defined as the set of all products of rational numbers q and r
such that q is greater than or equal to 0 and belongs to x and r is greater than or equal to 0
and belongs to y, together with all the negative rational numbers, so that the set is closed
downwards.

(Refer Slide Time: 21:27)

If x and y are both negative, then x into y is defined as the absolute value of x multiplied by
the absolute value of y, using the multiplication of non-negative real numbers defined above.
If exactly one of x and y is negative, then x into y is defined as the negative of the magnitude
of x multiplied by the magnitude of y. The multiplication here is again the multiplication of
non-negative real numbers.

(Refer Slide Time: 21:52)

So, we can see that the set of real numbers along with these operations, addition and
multiplication and the real 0 and real 1 is a field.

(Refer Slide Time: 22:02)

Now, let us see how we define sets. If what we have is a finite set, we can just enumerate the
members of the set. For example, the set 2, 5, 7 has three members. We can explicitly list the
three members and enclose them within braces, and this is one representation of the set.
This is what is called an enumeration of the members of the set. So, a set can be represented
using an enumeration, or we can represent the set using an abstraction.

(Refer Slide Time: 22:32)

So, let us say alpha is a first order formula with a free variable x. Then, we could write the set
of all individuals x, so that x satisfies alpha. So this is an abstract representation of the set.
But an abstract representation can lead to paradoxes. One such paradox is this. This is called
Berry’s Paradox. Let A be the set of all numbers x, such that x is a natural number that can be
defined in at most 100 characters.

(Refer Slide Time: 23:07)

On any finite alphabet, there is only a finite number of strings with at most 100 characters.
For example, if you have three members in the alphabet, say your alphabet is a, b, c, then how
many strings can have exactly 100 characters? Consider a string of 100 characters: there
are 100 positions to fill, and at each position we have three choices. So we have 3 to the
power 100 ways in which 100-character strings can be constructed. Here we are talking
about strings of at most 100 characters, so we add up such counts for every length from 0 to
100. That total is a finite number.
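
The count can be checked with exact integer arithmetic. This is a small sketch for the three-letter alphabet of the example; the variable names are illustrative.

```python
ALPHABET_SIZE = 3    # the alphabet a, b, c
MAX_LENGTH = 100

# strings with exactly 100 characters: three choices at each of 100 positions
exactly_100 = ALPHABET_SIZE ** MAX_LENGTH

# strings with at most 100 characters: add the counts for lengths 0, 1, ..., 100
at_most_100 = sum(ALPHABET_SIZE ** k for k in range(MAX_LENGTH + 1))

print(exactly_100)
print(at_most_100)
# the sum is a geometric series, so it equals (3^101 - 1) / 2
print(at_most_100 == (ALPHABET_SIZE ** (MAX_LENGTH + 1) - 1) // 2)
```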

(Refer Slide Time: 23:48)

Since there is only a finite number of strings with at most 100 characters, only finitely many
natural numbers can be defined using at most 100 characters, so we can talk about
the least natural number that cannot be defined using at most 100 characters. We can, can
we not? Let that number be n. But then this is itself a definition of n that uses at most 100
characters. So we have this question: does n belong to A or does it not? We have a paradox.
But we can get rid of such paradoxes if we are precise with the definition. Here the
problem was with the use of the word ‘definition’, which is ambiguous.
By using multilayered notions of definition we can avoid such paradoxes.

(Refer Slide Time: 24:40)

But does it allow us to get rid of all paradoxes? In fact, it does not. Even if we use a precise
first order formula for alpha here, we can still have paradoxes.

(Refer Slide Time: 24:57)

Consider the set B defined in this manner. B is defined as the set of all x such that x does not
belong to x. So, here the abstraction is very precise. It is defined using the set membership
notion. So, B is defined as the set of all sets that do not contain themselves.

(Refer Slide Time: 25:24)

But then does B belong to B? If B belongs to B, then B should not belong to B. On the other
hand, if B does not belong to B, then B should belong to B. This paradox is called ‘Russell’s
Paradox’.

(Refer Slide Time: 25:40)

So, this is the fundamental paradox. It arises because not every collection of objects can be
deemed a legal set. What it means is that the axioms of set theory have to be carefully
formulated. One such formulation is the Zermelo-Fraenkel axioms. We shall study this set
of axioms in the next class. That is it from this lecture. Hope to see you in the next. Thank
you.

Discrete Mathematics
Professor Sajith Gopalan,
Professor Benny George
Department of Computer Science & Engineering,
Indian Institute of Technology, Guwahati
Lecture 5
Set Theory

Welcome to the NPTEL MOOC on Discrete Mathematics.

(Refer Slide Time: 00:34)

This is the fifth lecture on Set Theory. In the previous lecture, we saw that naive set theory has
problems with paradoxes, for example Russell's Paradox. So, when we axiomatize set
theory we have to pay a great deal of attention. One such axiomatization is the one by Zermelo
and Fraenkel. So, we shall take a peek at the Zermelo-Fraenkel axiomatization today. A long
discussion of this is not within the scope of this course; we will only have occasion to look at
the axioms.

(Refer Slide Time: 01:08)

The first axiom is the Axiom of Extensionality. The axiom of extensionality says that two sets are
equal if and only if they have the same extension. In other words, two sets are equal if and only if
they have exactly the same members. Formally: for all x, for all y, if for all z, z belongs to x
if and only if z belongs to y, then x equals y. That is, for any two sets x and y, if x and y have
exactly the same extension, that is, every z belongs either to both of them or to neither, then
x is equal to y. This is the first axiom.

(Refer Slide Time: 02:25)

The second axiom is the Empty Set Axiom. The empty set axiom says that there is a set with no
members. In other words, there exists an x such that for all y, y is not a member of x. That is,
x does not have a member, so x is the empty set. So, this axiom asserts the existence of an
empty set.

(Refer Slide Time: 03:12)

The third axiom is the Pairing Axiom. Pairing axiom says that, for every pair of sets, there exists
a set with these two as its only members.

(Refer Slide Time: 04:02)

In other words, for every x and for every y, that is, for a pair x, y of sets, there exists z such that
for every u, u belongs to z if and only if u is equal to x or u is equal to y. In other words, for
every pair x, y of sets, there is a set z such that something is a member of z precisely when
that something happens to be either x or y. So z contains exactly x and y and nothing else.

(Refer Slide Time: 04:47)

The fourth axiom is the Power Set Axiom. The power set of x, we know, is the set of all subsets of x.
The power set axiom asserts the existence of a power set: for every set x, there is a set which
happens to be the power set of x. In other words, for all x there exists y which is the
power set of x. How do we state that? We have to say that for every z, z belongs to y
precisely when z is a subset of x. So y contains precisely the subsets of x, that is, y is the
power set of x. So, axiom four asserts the existence of the power set for every set.

(Refer Slide Time: 06:02)

The fifth axiom is the Subset Axiom. This is in fact an axiom schema. Let alpha be a formula with
its free variables among t1 through tk, y and u. Then we have the following axiom: for all t1
through tk and for every u, there exists an x such that for every y, y belongs to x if and only if
y belongs to u and alpha of t1 through tk, y, u is true. This is an axiom for every formula
alpha with its free variables among t1 through tk, y, u. When t1 through tk, y, u are supplied
as arguments to alpha, a substitution is performed only for those that actually are free
variables of alpha. For example, if t1 is not a free variable of alpha, then the argument
supplied for t1 will not be substituted. So, what does it say? It says that, given any k-tuple t1
through tk and a set u, we can pick out the members y of u which satisfy the formula alpha
along with u and t1 through tk.

In other words, given the tuple t1 through tk and u, there exists a set x, which we can
synthesize from u and t1 through tk, such that y is a member of x only when y is a
member of u; in other words, x is a subset of u. Moreover, the members y of x, along with u
and t1 through tk, have to satisfy the condition alpha.
This is the way of forming subsets of a given set u.

(Refer Slide Time: 09:20)

An example would be, for all t, for all u, there exists an x so that, for every y, y belongs to x
precisely when y belongs to t and y belongs to u. Now, what does this assert? There exists the
intersection of t and u. So, for every set u, when t is supplied, we can form the intersection of t
and u. In other words, from u we can form the subset of members of u which are also members
of t. So, this is a way of forming subsets of u.
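
On finite sets, the subset axiom behaves like a filter over the members of u. The following sketch assumes Python frozensets as stand-ins for sets and a Python predicate in place of the formula alpha; the name separation is illustrative.

```python
def separation(u, alpha):
    """The subset of u consisting of those members y for which alpha(y) holds."""
    return frozenset(y for y in u if alpha(y))


t = frozenset({1, 2, 3})
u = frozenset({2, 3, 4})

# alpha says "y belongs to t", so the subset formed is the intersection of t and u
intersection = separation(u, lambda y: y in t)
print(intersection == frozenset({2, 3}))
print(intersection <= u)   # the result is always a subset of u
```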

(Refer Slide Time: 10:16)

The sixth axiom is the Union Axiom. The union axiom says that, for every x, there exists a y such
that for every z, z belongs to y precisely when there exists a u such that z belongs to u and u
belongs to x. In other words, given any set x, we can construct a set y which contains
precisely the members of members of x; that is, for a z to belong to y, z will have to belong to
some u which in turn belongs to x. So, for any x, there is a set y that contains precisely the
members of members of x.
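
For finite sets, the sets whose existence is asserted by the pairing, power set and union axioms can be computed explicitly. This is a sketch under the assumption that frozensets model sets; the function names are illustrative.

```python
from itertools import combinations


def pair(x, y):
    """The set whose only members are x and y."""
    return frozenset({x, y})


def power_set(x):
    """The set of all subsets of x."""
    items = list(x)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def big_union(x):
    """The set of members of members of x."""
    return frozenset(z for u in x for z in u)


empty = frozenset()
one = frozenset({empty})
two = pair(empty, one)              # the natural number 2 as the set containing 0 and 1
print(len(power_set(two)))          # a two-member set has four subsets
print(big_union(frozenset({frozenset({1, 2}), frozenset({2, 3})})))
```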

(Refer Slide Time: 11:30)

The seventh axiom in our list is the Axiom of Choice. The axiom of choice says that, for any relation R,
there exists a function F, which is a subset of R, such that the domain of F is the domain of R.
Why is this called the axiom of choice? Take a relation R, let us say from A to B, so that it
is a subset of A cross B, and consider some member x of A. Under the relation R, x may have
two images, let us say y1 and y2. But what we construct here is a function, the domain of which
is identical to the domain of R.

Therefore, x will have to have an image under F as well. But x has two images under R. To
form F, you will have to pick one of them, that is, you have to exercise the choice between y1
and y2. One of them will be F of x for the function F that we are going to construct. Therefore,
we are exercising a choice when we construct the function F. So, what the axiom of choice
says is that, for any relation R, there exists a function contained in R whose domain is
identical to the domain of R.
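
For a finite relation the choice can be carried out explicitly, so no axiom is needed; the sketch below only illustrates what the function F looks like. The rule used here, picking the smallest image in sorted order, and the name choice_function are assumptions of this example.

```python
def choice_function(relation):
    """From a finite relation (a set of ordered pairs), keep one image per domain element."""
    f = {}
    for x, y in sorted(relation):
        f.setdefault(x, y)      # the first image met for x is the one chosen
    return f


# x has two images, y1 and y2; w has only one
R = {("x", "y1"), ("x", "y2"), ("w", "y1")}
F = choice_function(R)

print(F)
print(set(F.items()) <= R)                   # F is a subset of R
print(set(F) == {x for x, _ in R})           # F has the same domain as R
```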

(Refer Slide Time: 13:29)

The eighth axiom is the Infinity Axiom. What this says is that there is an inductive set, or
formally, there exists an x, so that the empty set belongs to x and x is closed under the successor
operator. For every y, if y belongs to x then, the successor of y also belongs to x. That is when x
is closed under the successor operator. So, the infinity axiom says that there is an inductive set.

(Refer Slide Time: 14:19)

And the ninth axiom is the Replacement Axiom. Consider a set u. Let us suppose that every
member x of u has a nominee n of x; for a member y, the nominee is n of y. So, the nominee
function defines a unique nominee for every member of u. What the axiom says is that, if
every member of u has a nominee, then there is a set that contains precisely the nominees of
the members of u.

(Refer Slide Time: 15:33)

Or in other words, for any formula nu of (x, y) in which z is not free, and which in fact asserts
that y is the nominee of x, the following is an axiom. For every u, if for all x belonging to u
and for all a and b, nu of (x, a) and nu of (x, b) imply that a is equal to b, then there exists z
such that for all y, y belongs to z if and only if there exists an x belonging to u such that nu of
(x, y). The first part is the antecedent of an implication. It says that for every x in u there is
at most one a such that nu of (x, a) is satisfied; in other words, the nominee of x is unique.
The consequent says that there exists a z which contains exactly the nominees of the
members of u, that is, for every y, y belongs to z precisely when y is
the nominee of some x which belongs to u. So, the assertion is exactly what we had in mind:
there is a set that contains precisely the nominees of the members of u.
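
Written symbolically, with the formula written as the Greek letter nu, one instance of the replacement schema reads as follows. This is a transcription of the spoken statement, with bounded quantifiers used as shorthand.

```latex
\forall u\,\Bigl[\,
  \forall x \in u\;\forall a\,\forall b\,
    \bigl(\nu(x,a) \wedge \nu(x,b) \rightarrow a = b\bigr)
  \;\rightarrow\;
  \exists z\,\forall y\,
    \bigl(y \in z \leftrightarrow \exists x \in u\;\nu(x,y)\bigr)
\Bigr]
```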

(Refer Slide Time: 17:47)

And the final axiom is the Regularity Axiom. It says that every non-empty set x has a member y
such that the intersection of x and y is the empty set. You can show that this implies that no
set is an element of itself. The paradoxes that are known within naive set theory will not
arise within Zermelo–Fraenkel set theory.

(Refer Slide Time: 18:53)

Now, let us consider the notion of Equinumerosity. We say that set A is equinumerous with set B,
denoted in this fashion, if and only if there exists a one-to-one
mapping from A onto B.

(Refer Slide Time: 19:48)

We have seen that N cross N is equinumerous with N. The set of all ordered pairs obtained from
N is equinumerous with N itself. Similarly, N is equinumerous with the set of all even natural
numbers. N is the set of all natural numbers. Set of all even natural numbers is Ne. So, there is a
mapping from N to Ne which is 1 to 1 and onto, in which we map x to 2x.
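
The mapping and its inverse can be written down directly. This is a minimal sketch on ordinary Python integers; the function names are illustrative.

```python
def to_even(x: int) -> int:
    """The mapping from N to Ne: x goes to 2x."""
    return 2 * x


def from_even(e: int) -> int:
    """The inverse mapping from Ne back to N."""
    if e % 2 != 0:
        raise ValueError("not an even natural number")
    return e // 2


# every natural number gets its own even number, and every even number is hit
print([to_even(x) for x in range(5)])
print(all(from_even(to_even(x)) == x for x in range(1000)))
```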

(Refer Slide Time: 21:01)

Coming to real numbers, we can show that the interval from 0 to 1 is equinumerous with the set
of all real numbers. How do we show this? Consider the real line; let us say this
is the origin and this is 1. So, we are considering the set of all points between 0 and 1 on the
real line. We want to show that this set is equinumerous with the set of points on the real line
itself. To prove this, consider the line segment from 0 to 1.
We take it and bend it so that it forms a semicircle, and arrange the semicircle so
that the real line is a tangent to the semicircle. The length of the semicircle is 1, because it
has been obtained by bending the interval from 0 to 1. So the semicircle has a radius of 1
by pi.

So, taking the point of tangency to be 0, the centre of the circle would be (0, minus 1 by pi) on
the real plane. Let us say we draw a line passing through the centre of the circle and some
point on the real line. This ray, with the centre as its vertex, will intersect the semicircle at
exactly one point and it will intersect the real line at exactly one point. Then, let us define a
function f which maps the semicircle point to the real line point. This function f
maps the interval 0 to 1 onto the real line, which is the set R.

Consider another line, for example. This will pass through these two points, so this point of
the semicircle is mapped to a negative real number. This shows that the interval 0 to 1 has a
one-to-one onto mapping to the set of all real numbers. Therefore, the interval 0 to 1 is
equinumerous with the set of all real numbers.
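
One concrete way to realise this picture is sketched below. It assumes that the semicircle of radius 1 by pi touches the real line at 0 and that a point t of the open interval sits at arc length t from the left end of the semicircle; under these assumptions the ray from the centre through that point meets the real line at tan(pi (t - 1/2)) / pi. The function names are illustrative.

```python
import math


def interval_to_real(t: float) -> float:
    """Map a point t of the open interval (0, 1) to the real line via the semicircle."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    return math.tan(math.pi * (t - 0.5)) / math.pi


def real_to_interval(y: float) -> float:
    """Inverse map: from a point y on the real line back to (0, 1)."""
    return 0.5 + math.atan(math.pi * y) / math.pi


print(interval_to_real(0.5))                    # the midpoint goes to the point of tangency
print(interval_to_real(0.25))                   # points left of the midpoint go to negative reals
print(real_to_interval(interval_to_real(0.3)))  # the round trip returns approximately 0.3
```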

(Refer Slide Time: 24:23)

Let this be the notation for the set of all functions from A to B. In particular, superscript A 2
will denote the set of all functions from A to the set 2, which according to our definition is
this: we are considering the natural number 2. In the embedding of the theory of natural
numbers in set theory we had defined the natural number 2 as the set containing 0 and 1,
where 0 is the empty set and 1 is the singleton containing the empty set. So, by superscript
A 2 we denote the set of all functions from A to 2.

(Refer Slide Time: 25:33)

Recall 2 power A is the power set of A. We claim that superscript A 2 is equinumerous with 2
superscript A. The first is the set of all functions from A to 2 and the second is the set of all
subsets of A. These two sets are equinumerous, but how do we show that they are equinumerous?
Let us consider any subset of A. Suppose B is a subset of A; then B has a characteristic function.
The characteristic function of B, fB, is defined in this manner: fB of x is 1 if and only if x
belongs to B, or in other words it is 1 if x belongs to B and it is 0 otherwise.

The characteristic function is a binary function. So, fB happens to be a mapping from A to 2. So,
fB is a member of the set of all functions from A to 2. So, what we find is this: corresponding to
any subset B of A there exists a unique function fB in superscript A 2. Conversely, every function
g from A to 2 is the characteristic function of exactly one subset of A, namely the set of those x
for which g of x is 1. So the mapping that takes B to fB is one to one and onto. Therefore
these two sets are equinumerous.
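
For a small finite set the correspondence between subsets and characteristic functions can be checked exhaustively. The following is a minimal sketch, assuming a three element set A chosen purely for illustration; a function from A to 2 is represented as a tuple of 0/1 values, one per element of A.

```python
from itertools import combinations, product

A = ["a", "b", "c"]   # illustrative set, not from the lecture

def characteristic_function(B):
    """Return f_B as a tuple of 0/1 values, one per element of A (in order)."""
    return tuple(1 if x in B else 0 for x in A)

# All subsets of A, and all functions from A to 2 = {0, 1}.
all_subsets = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]
all_functions = set(product((0, 1), repeat=len(A)))

images = [characteristic_function(B) for B in all_subsets]
one_to_one = len(set(images)) == len(all_subsets)   # distinct subsets, distinct functions
onto = set(images) == all_functions                 # every function is some f_B
print(len(all_subsets), len(all_functions), one_to_one, onto)
```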

(Refer Slide Time: 27:39)

Here is a theorem. It says that A is not equinumerous with its own power set. No set is
equinumerous with its own power set. How do we prove this? Consider an arbitrary mapping f
from A to 2 to the power A. It maps the members of A to subsets of A. So, when x is a member
of A, f of x is a subset of A. Let us say that x owns the members of f of x.

(Refer Slide Time: 28:47)

Then let us define a set B. B is the set of precisely those members x of A such that x does not
belong to f of x. In other words, B is the set of those members of A that do not own themselves.

(Refer Slide Time: 29:29)

Then, by definition, B is a subset of A. For each x belonging to A, x belongs to B if and only if
x does not belong to f of x, by definition.

(Refer Slide Time: 29:51)

Suppose B is equal to f of x for some x belonging to A. That is, suppose B happens to be the image
of some x under f. Then let us consider the possibilities. One possibility is that x belongs to B.
But B is the same as f of x, so x belongs to f of x, which means x owns itself. Then x should not
belong to B, because B happens to be the set of precisely those members that do not own
themselves. On the other hand, if x does not belong to B, then, since B is f of x, x does not own
itself. This implies that x belongs to B. So, either way we get a contradiction. Therefore, what
we have is that B is not equal to f of x, for any x.

(Refer Slide Time: 31:23)

But B is of course a subset of A, which means B is a member of 2 to the power A. Therefore,
there is a member of 2 to the power A that is not an image under the function f, or in other
words f is not onto. Mind you, the function f that we considered is an arbitrary one. We have
considered an arbitrary function f from A to 2 to the power A, and what we have shown is that
this function is not onto. Since f is arbitrary, we have that no function from A to 2 to the
power A is onto, which means A is not equinumerous with 2 to the power A. In other words, no set
can be equinumerous with its own power set.
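
For a small finite set, this argument can be checked by brute force. The sketch below assumes a three element set A chosen for illustration; it enumerates every function f from A to the power set of A and confirms that the set B of members that do not own themselves is never among the images of f.

```python
from itertools import combinations, product

A = [0, 1, 2]   # illustrative set, not from the lecture
power_set = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def diagonal_set(f):
    """B = {x in A : x not in f(x)}, the members that do not own themselves."""
    return frozenset(x for x in A if x not in f[x])

# Enumerate every function f from A to the power set of A (8**3 = 512 of them).
missed_every_time = True
for images in product(power_set, repeat=len(A)):
    f = dict(zip(A, images))
    if diagonal_set(f) in images:
        missed_every_time = False

print(len(power_set) ** len(A), missed_every_time)
```

A finite check is of course no substitute for the proof, which works for every set, finite or infinite.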

(Refer Slide Time: 32:49)

We say that a set is finite, if it is equinumerous with a natural number.

(Refer Slide Time: 33:11)

Now, let us consider a theorem, which is famous under the name Pigeonhole Principle. What the
Pigeonhole Principle says is that no natural number is equinumerous with a proper subset of
itself. Remember n minus 1 is a subset of n under our definition of natural numbers. So, in
particular, it says that no natural number is equinumerous with any smaller natural number.

(Refer Slide Time: 34:09)

Therefore, as a corollary, we can argue that, any set equinumerous with a proper subset of itself,
has to be infinite. In other words, it is not equinumerous with any natural number.

(Refer Slide Time: 34:49)

For a finite set A, the natural number that is equinumerous with A is called the Cardinal number
of A. This is denoted as Card A.

(Refer Slide Time: 35:33)

For example, consider this set. This is equinumerous with 4. There is a one-to-one mapping from
the given set A onto the natural number 4. Therefore, A is equinumerous with 4 or in other
words, the cardinal number of A is 4, or we say the cardinality of A is 4. So, for every finite set
there is a natural number that forms its cardinality. Two finite sets are equinumerous precisely
when they have the same cardinal number.

(Refer Slide Time: 36:27)

Or in general, for any two sets A and B, finite or infinite, we say that the cardinality of A is
equal to the cardinality of B if and only if A is equinumerous with B. So, the cardinalities of A
and B are identical precisely when there is a one-to-one, onto mapping from A to B.

(Refer Slide Time: 36:59)

The cardinality of the set of natural numbers is denoted as aleph naught, using the Hebrew letter
aleph.

(Refer Slide Time: 37:19)

Using cardinal numbers, we can form what is called Cardinal Arithmetic. Cardinal Arithmetic
has several interesting properties. If kappa and lambda are cardinal numbers, then there is a set
with cardinality kappa and there is a set with cardinality lambda. Using matching letters, let K
and L be disjoint sets with cardinalities kappa and lambda respectively. Then kappa plus lambda
is the cardinal number of K union L. kappa into lambda is the cardinal number of K cross L. kappa
power lambda is the cardinal number of the set of all functions from L to K. When kappa and
lambda are finite, these of course function the way we expect.

For example, if K and L are finite sets, K union L has kappa plus lambda elements at the most,
and exactly kappa plus lambda elements if K and L are disjoint. kappa into lambda is the
cardinality of K cross L, which is indeed the case: K has kappa elements and L has lambda
elements, so there are kappa into lambda ordered pairs. kappa power lambda is the cardinality of
superscript L K, that is, the set of all functions from L to K.
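
For finite sets the three operations can be checked directly. The sketch below assumes two small disjoint sets K and L chosen for illustration; a function from L to K is represented as a dictionary giving one element of K for each element of L.

```python
from itertools import product

K = {"p", "q", "r"}   # kappa = 3 (illustrative)
L = {"x", "y"}        # lambda = 2; K and L are disjoint

kappa, lam = len(K), len(L)

union = K | L
cross = set(product(K, L))
# A function from L to K is a choice of one element of K for each element of L.
L_sorted = sorted(L)
functions = [dict(zip(L_sorted, values)) for values in product(sorted(K), repeat=lam)]

print(len(union) == kappa + lam)        # addition (needs K and L disjoint)
print(len(cross) == kappa * lam)        # multiplication
print(len(functions) == kappa ** lam)   # exponentiation: functions from L to K
```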

(Refer Slide Time: 39:29)

We say that set B dominates A precisely when there is a one-to-one mapping from A into B. For
two sets A and B, we write A less than or equal to B, in this fashion, to indicate that B
dominates A. Mind you, this is an into mapping, not necessarily onto, which means the
cardinality of A is less than or equal to the cardinality of B.

(Refer Slide Time: 40:26)

So, using this definition we can say that A is countable if and only if A is dominated by the set
of natural numbers. In other words, the cardinality of A is less than or equal to aleph naught.

(Refer Slide Time: 40:52)

A famous theorem of set theory called Schroeder-Bernstein Theorem, says that, if B dominates
A, and A dominates B, then, A and B are equinumerous. In other words, if the cardinality of A
and B have this property that the cardinality of A is less than or equal to the cardinality of B, and
the cardinality of B is less than or equal to the cardinality of A, then the two have identical
cardinalities.

(Refer Slide Time: 41:48)

A related theorem is that a countable union of countable sets is countable. Another theorem
asserts that, for a cardinal number kappa, kappa is less than aleph naught if and only if kappa
is finite. In other words, in a sense, the set of natural numbers is the smallest infinite set.

(Refer Slide Time: 42:56)

So, we know that aleph naught and 2 power aleph naught are not identical. aleph naught is the
cardinality of the set of natural numbers and 2 power aleph naught is the cardinality of the set of
all real numbers. We know that the set of natural numbers is countable and the set of real numbers
is not countable. So, the two have different cardinalities. But can there be a set whose
cardinality lies strictly between these two? Cantor conjectured that there is no set of
cardinality strictly between aleph naught and 2 power aleph naught. This conjecture is called the
‘Continuum Hypothesis’.

(Refer Slide Time: 44:01)

In 1939, Gödel showed that the Continuum Hypothesis cannot be disproved from the
Zermelo–Fraenkel axioms of set theory. In other words, the negation of the Continuum
Hypothesis cannot be proved. Many years later, in 1963, Cohen showed that the Continuum
Hypothesis cannot be proved either. So, the statement of the Continuum Hypothesis, that there
is no set with cardinality between aleph naught and 2 power aleph naught, can neither be proved
nor be disproved from the axioms of set theory.

(Refer Slide Time: 45:29)

But, if you consider the two statements, the Continuum Hypothesis and its negation, one of
them must be true. Therefore, either CH or CH bar is a statement that is true but unprovable in
the Zermelo–Fraenkel axiomatization of set theory. Hope to see you in the other lectures. Thank
you.

Discrete Mathematics
Professor Sajith Gopalan,
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology, Guwahati
Lecture 6
Set Theory

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the sixth lecture on Set
theory.

(Refer Slide Time: 00:37)

Today we shall study Partially ordered sets and Partial Ordering relations.

(Refer Slide Time: 01:13)

In a previous lecture we saw equivalence relations. An equivalence relation is one which is
reflexive, symmetric and transitive. Here, too, we consider relations that are reflexive and
transitive. If R is the relation that we are considering, we say that R is reflexive if for
every x in the domain it is the case that x R x, and R is transitive if for every x, y and z,
x R y and y R z implies x R z. These are definitions that we have seen before.

(Refer Slide Time: 01:56)

Next we define anti-symmetric relations. We have seen symmetric relations before. We say
that relation R is anti-symmetric if for every x and y, it is the case that x R y and y R x
implies x equal to y. That is, if the relation holds both ways between x and y, then x must be
equal to y. Which means for distinct x and y the relation can hold only in one direction, either
from x to y or from y to x, not both.

(Refer Slide Time: 02:43)

So, here we consider relations that are reflexive, anti-symmetric and transitive. These
relations are called Partial Ordering Relations.

(Refer Slide Time: 03:19)

A generic symbol that we use for denoting partial ordering relations is this. We could write in
this fashion and read this as a precedes b. Of course this symbol is similar to the less than or
equal to relation that we use on natural numbers or real numbers or integers, which is not
accidental, because the less than or equal to relations on natural numbers, integers, reals,
etcetera are also partial ordering relations.

(Refer Slide Time: 03:55)

Because we know that for every number a, a less than or equal to a. If a less than or equal to b
and b less than or equal to a, then a is equal to b, which is the anti-symmetric property,
and if a less than or equal to b and b less than or equal to c, then a less than or equal to c.
Therefore, all three properties are satisfied by the less than or equal to relation.
Therefore, the less than or equal to relation is a partial ordering relation. That is why the
symbol that we use is similar to the less than or equal to symbol.

Since this symbol is rather difficult to write, I will interchange it with the less than or equal
to. So, depending on the context, you must realize that the less than or equal to relation might
refer to another partial ordering relation.

(Refer Slide Time: 04:56)

So, let us see some examples of partial ordering relations. Let us consider the divisibility
relation on natural numbers. For any natural number a, we know that a divides a, therefore
the divisibility relation is reflexive. If a divides b and b divides a, then a is equal to b. If a
is a multiple of b and b is a multiple of a, then a is equal to b. Therefore, the anti-symmetry
property also holds. By the way, the vertical bar translates as divides. So, when we write like
this, what we mean is that a divides b. And thirdly, if a divides b and b divides c, then a
divides c. The transitivity property also holds. Therefore, the divisibility relation on natural
numbers is a partial ordering relation.

(Refer Slide Time: 06:13)

We say that the set of natural numbers along with the divisibility relation forms a partial
order. Why it is called a partial order will become clear soon.

(Refer Slide Time: 06:33)

Our second example is the divisibility relation on the set of integers. Here we find that this is
not a partial ordering relation. Why is this? This is because we know that 7 divides minus 7
and minus 7 divides 7, yet 7 and minus 7 are not the same. Therefore, the anti-symmetry
property is violated by this relation. Therefore, when we consider the same divisibility
relation for integers instead of natural numbers, we find that we do not get a partially ordered
set or a Poset.
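
The three defining properties can be tested mechanically on a finite range of numbers. The sketch below is illustrative only: the helper names are made up for this example, the ranges 0 to 12 and minus 12 to 12 are arbitrary finite samples, and a finite check does not replace the general argument.

```python
def is_reflexive(S, R):
    return all(R(x, x) for x in S)

def is_antisymmetric(S, R):
    return all(x == y for x in S for y in S if R(x, y) and R(y, x))

def is_transitive(S, R):
    return all(R(x, z) for x in S for y in S for z in S if R(x, y) and R(y, z))

def is_partial_order(S, R):
    return is_reflexive(S, R) and is_antisymmetric(S, R) and is_transitive(S, R)

def divides(a, b):
    """a | b: b is an integer multiple of a (0 divides only 0)."""
    return b == 0 if a == 0 else b % a == 0

naturals = range(0, 13)
integers = range(-12, 13)

print(is_partial_order(naturals, divides))   # True
print(is_antisymmetric(integers, divides))   # False: 7 | -7 and -7 | 7
print(divides(7, -7), divides(-7, 7))        # True True
```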

(Refer Slide Time: 07:40)

A Poset for short stands for a Partially Ordered Set. A Poset is an ordered pair (S, R), where S
is a set and R is a partial ordering relation on S. This ordered pair is what is called a Poset.

(Refer Slide Time: 08:13)

Another example is set inclusion. Let us say we have a family F of sets. Then for any A in F,
we know that A is a subset of A. Therefore the reflexive property holds for the subset relation.
If A is a subset of B and B is a subset of A, then A is equal to B, so the anti-symmetry property
holds. Mind you, we do not use the proper subset relation here, we use the subset or equal
relation. Transitivity also holds: if A is a subset of B and B is a subset of C, then A is a
subset of C. Therefore, all three properties hold here. Therefore this is a POSET. The family of
sets F along with the subset or equal relation forms a POSET.

(Refer Slide Time: 09:32)

Another example is the relation R on natural numbers in which a R b holds if b equal to a power n
for some positive integer n. This is also a partial ordering relation. It is reflexive, because a
is a power 1. If b is a power n and a is b power m, where n and m are positive integers, then a is
a power nm, which is possible only when nm is 1 or a is 0 or 1; in each of these cases a is equal
to b. If b equal to a power n and c equal to b power m, then c is a power nm, so c can be
expressed as a positive integer power of a. Therefore this is also a POSET.

So, those are some examples of partial ordering relations, and the corresponding POSETs.

(Refer Slide Time: 10:55)

So, you would observe the duality here. The less than or equal to relation and the greater than
or equal to relation are duals of each other. They are inverses of each other. We can consider
subsets of Posets. Let us say we have a relation R on a set S. Suppose, this is a Poset. Then
consider a subset A of S, then the restriction of R to A, as you can verify is a partial ordering
relation. So A is an ordered subset of S.

(Refer Slide Time: 12:03)

So, we have so far been talking about the less than or equal to relation, or the greater than or
equal to relation which is a dual of it. These we know are partially ordering relations. But
what about the less than relation? We say that a less than b, or a strictly precedes b, if a
precedes b and a not equal to b. We find that this is not a partially ordering relation, because
the reflexive property does not hold: it is not the case that a less than a. In fact, for every
a, we can say that a is not less than a. Therefore, this relation is irreflexive. And for every
a, b, c, we know that a less than b and b less than c implies that a less than c. So, transitivity
holds here. So, the less than relation has these two properties, irreflexivity and transitivity.
When these two properties hold, we have what is called a ‘Quasi-Order’.

(Refer Slide Time: 13:35)

So, there is always a quasi-order which is associated with a partially ordering relation. When
the precedes relation is a partially ordering relation, the strictly precedes relation is a quasi-
ordering relation.

(Refer Slide Time: 13:49)

Now, let us consider the issue of comparability. It could be that, a less than or equal to b, for
a pair of elements a and b, or it could be that b less than or equal to a for the same pair, or it is
possible that neither may hold. If neither this nor this holds, we say that a and b are
incomparable. Remember the less than or equal to symbol here in fact stands for the
precedence relation. If neither a precedes b, nor b precedes a then we say that a and b are
incomparable. Of course you will not find such a pair, when you consider the less than or
equal to relation on natural numbers.

(Refer Slide Time: 15:04)

But then in some other cases you would be able to find incomparable pairs. For example,
consider the divisibility relation on natural numbers. We find that 7 does not divide 11 and 11
does not divide 7, which means 11 and 7 are incomparable. Symbolically, we write using two
bars. 11 is incomparable to 7. So, it is possible for incomparable pairs to be there in some
partially ordering relations, that is precisely why it is called a partially ordering relation.

(Refer Slide Time: 15:48)

A proper partial order has incomparable pairs. Therefore, the set of natural numbers with the
divisibility relation is a proper Poset. The opposite of a proper partial order is a total
ordering relation. In a total ordering relation, you will not be able to find a pair of elements
that are incomparable to each other. For example, if you consider the set of natural numbers
along with the less than or equal to relation, you find that every pair of natural numbers is
comparable. You take any pair of natural numbers a and b; either a less than or equal to b or b
less than or equal to a. It is not possible that neither relation holds between the pair.
Therefore, the less than or equal to relation is a total ordering relation.

(Refer Slide Time: 16:58)

So, let us consider some examples. Consider this set. It consists of 3, 15, 30, 90 and 180. We
find that 3 divides 15, which divides 30, which divides 90, which divides 180. Therefore, you
take any two members in this, and you find that there is the divisibility relation between them.
For example, you take 30 and 180; there is the divisibility relation between them. 30 divides
180, so the divisibility relation holds one way. Therefore, this is a total ordering relation.
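
Comparability is easy to test on finite sets. In the following sketch the helper names are illustrative; it checks that the set 3, 15, 30, 90, 180 is totally ordered by divisibility, while 7 and 11 are incomparable under the same relation.

```python
def divides(a, b):
    """a | b for positive integers."""
    return b % a == 0

def comparable(a, b, leq):
    """True if a precedes b or b precedes a."""
    return leq(a, b) or leq(b, a)

def is_total(S, leq):
    """True if every pair of elements of S is comparable under leq."""
    return all(comparable(a, b, leq) for a in S for b in S)

chain = [3, 15, 30, 90, 180]
print(is_total(chain, divides))                      # True
print(comparable(7, 11, divides))                    # False
print(is_total(range(2, 13), divides))               # False
print(is_total(range(2, 13), lambda a, b: a <= b))   # True
```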

(Refer Slide Time: 17:53)

Let’s take another example. Suppose A is a set. Consider the power set of A. If the size of A
is 1, then the power set of A with the subset or equal to relation is a total order, because
there are only two subsets here. If A happens to be the singleton containing just a1, then 2
power A consists of just the empty set and A itself, and the empty set is a subset of A.
Therefore, we have a total ordering relation.

(Refer Slide Time: 18:56)

On the other hand, suppose the size of A is 2. Let us say A is made up of two elements a1 and a2.
Then we find that the empty set is a subset of the singleton containing a1. It is also a subset of
the singleton containing a2, and these two are subsets of the set containing a1 and a2. But we
find that these two singletons are not comparable to each other. The singleton a1 is not a subset
of the singleton a2 and the singleton a2 is not a subset of the singleton a1. So, these two are
incomparable.

(Refer Slide Time: 19:48)

Considering various orderings, let us consider ordered tuples. First, let us consider ordered
pairs. Let us say A and B are Posets, each with the less than or equal to relation, or any generic
precedence relation. So we have two Posets A and B. Let us consider ordered pairs from A
cross B. We can define an ordering on A cross B as follows. We say that ordered pair (a, b) is
less than or equal to, or precedes, ordered pair (a prime, b prime), if a is less than or equal to
a prime and b is less than or equal to b prime. So, this is one possible ordering.

(Refer Slide Time: 21:01)

So, in this case we can say that (2, 3) and (3, 2) are incomparable according to this ordering.
Extending this, we can consider ordered n tuples. We can say that ordered n tuple a1 through
an is less than or equal to, or precedes, the ordered n tuple b1 through bn, if ai is less than
or equal to bi for every i with 1 less than or equal to i less than or equal to n. That is, for
every i, it is the case that ai precedes bi. So, this is a straightforward generalization of the
earlier ordering that we saw for ordered pairs.

(Refer Slide Time: 22:11)

Now, the ordered tuple a1 through an can be thought to be less than or equal to b1 through bn in
another way. Here, we are considering another ordering relation. Let me denote it as less than or
equal to 2. According to this ordering relation, we say that a1 through an is less than or equal
to b1 through bn if the two tuples are identical, or if for some k, ai equal to bi for 1 less than
or equal to i less than or equal to k minus 1, and ak strictly precedes bk. So, according to this,
if you consider ordered pair (2, 3) and ordered pair (3, 2), since 2 is less than 3 in the first
component, we can say that this relation holds between them. (2, 3) comes before (3, 2) in this
ordering. So, this is Lexicographic ordering.
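
The two orderings of tuples can be compared side by side. The sketch below is one reading of the definitions, with illustrative function names: the first ordering compares every component, and the lexicographic ordering is decided at the first position where the tuples differ. Python's built-in comparison of tuples happens to be lexicographic as well.

```python
def product_leq(a, b):
    """Componentwise order: every entry of a is <= the matching entry of b."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))

def lex_leq(a, b):
    """Lexicographic order: identical tuples, or a smaller entry at the first difference."""
    for x, y in zip(a, b):
        if x != y:
            return x < y
    return len(a) <= len(b)   # all compared positions agree

print(product_leq((2, 3), (3, 2)), product_leq((3, 2), (2, 3)))   # False False
print(lex_leq((2, 3), (3, 2)), lex_leq((3, 2), (2, 3)))           # True False
print((2, 3) <= (3, 2))                                           # True
```

So (2, 3) and (3, 2) are incomparable in the componentwise ordering but comparable in the lexicographic one.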

(Refer Slide Time: 23:32)

For example, suppose you were to consider all strings of length three, made up of three symbols
a, b, c. If a less than or equal to b less than or equal to c, then in dictionary order you would
enumerate them like this. a, a, a is the first string of length three, the smallest string of
length three. Then a, a, b would be the next string of length three, and a, a, c would be the
next. Now, you have a change in the second position, which gives a, b, a, and so on, ending with
c, c, c. So, this is a lexicographic ordering of all strings of length 3. So, that is precisely
what we do here. We say that an ordered n tuple a1 through an is less than or equal to an ordered
n tuple b1 through bn according to this ordering if ai equal to bi for 1 less than or equal to i
less than or equal to k minus 1, for some k, and it is the case that for that particular k, ak is
strictly less than bk. So, we do not look at the positions which are further to the right of k,
that is, positions k plus 1 through n could be anything. So, these two ordered tuples match in
the first k minus 1 positions, and when you look at the k th position, a1 through an has a smaller
value. Therefore, we say that a1 through an precedes b1 through bn. This is as opposed to the
previous ordering that we saw, so these two are different orderings of ordered tuples.
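
The dictionary enumeration of all strings of length three over a, b, c can be generated directly. This is a small sketch using the standard library; the variable names are illustrative, and the alphabet order a before b before c is the one assumed in the lecture.

```python
from itertools import product

alphabet = "abc"   # assumed order: a before b before c

# product() varies the rightmost position fastest, which is dictionary order.
strings = ["".join(letters) for letters in product(alphabet, repeat=3)]

print(strings[:4])    # ['aaa', 'aab', 'aac', 'aba']
print(strings[-1])    # ccc
print(len(strings))   # 27
```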

(Refer Slide Time: 25:30)

And then we can consider strings in general. We can consider an alphabet sigma and we can
talk about all strings from alphabet sigma, which is denoted by sigma star. This is the Kleene
Closure of sigma. This would of course contain the null string, the string of length zero, which
is made up of no character. Then it will have all strings of length one, all strings of length two
and so on. You can form an infinite set of strings from sigma, even if sigma is finite. So, this
set is sigma star.

(Refer Slide Time: 26:32)

Sigma star could be lexicographically ordered, which means in dictionary order. You would
order the strings in this way. We will say that the null string epsilon is less than w, for
any non-empty w. Any non-empty string w from sigma star will have this property; it will
come after the null string. Secondly, if u is symbol a followed by u prime and v is symbol b
followed by v prime, where a and b are symbols and u prime and v prime are strings from sigma
star, then we would say that u is less than v, or u precedes v in this ordering, if either a is
less than b, or a equal to b and u prime precedes v prime. Mind you, these are strict precedences.

(Refer Slide Time: 28:20)

Of course, often we find a variant of lexicographic ordering, which would be lexicographic
ordering within the same length. So, this ordering would be like this. If sigma happens to be a
and b, just two symbols, then sigma star would be ordered thus. First you enumerate the null
string, then we have strings of length one, a and b. Then we have strings of length two, aa, ab,
ba and bb. Then we enumerate all strings of length three and so on. That is, within the same
length, we enumerate the strings in lexicographic ordering. But strings will be enumerated in
the monotonic order of increasing length.
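
This length-first variant can be sketched as a generator. The function name shortlex and the cut-off max_length are assumptions made for the example, since sigma star itself is infinite.

```python
from itertools import product

def shortlex(alphabet, max_length):
    """Yield all strings over alphabet up to max_length, shorter strings first."""
    for n in range(max_length + 1):
        # Within one length, product() yields the strings in dictionary order.
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

listing = list(shortlex("ab", 2))
print(listing)   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The same order is obtained by sorting with the key (length, string), for instance `sorted(listing, key=lambda s: (len(s), s))`.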

(Refer Slide Time: 29:26)

Now, we study Hasse diagrams. We say that a immediately precedes b; this notation says a
immediately precedes b. What it means is that a strictly precedes b and there is no c such that a
strictly precedes c, which strictly precedes b. So, there is no intermediate element between a
and b. So, in this case, we say that a is an immediate predecessor of b, or conversely b is an
immediate successor of a.

(Refer Slide Time: 30:38)

In a Hasse diagram, we draw the partial order using lines going from below to above. We do
not usually place an arrow. When we place a and b in this manner, what we mean is that a is
an immediate predecessor of b. So, the less than or equal to relation now flows from below to
above.

(Refer Slide Time: 31:07)

So, let us consider an example. Let us consider the example of Set Inclusion. We consider set
A, that consists of two elements a and b, and let us look at the power set of A which is 2
power A. The members of 2 power A are A itself, the singletons containing a and b, and the empty
set, as we saw earlier. So, in the Hasse diagram we find that the empty set is included in the
singleton b, and the empty set is included in the singleton a as well. So, we draw lines in this
manner to show that phi is less than or equal to the singleton b. Here, less than or equal to
stands for the subset relation. Similarly, we have lines of the same sort from phi to the
singleton a and from each singleton to A. So, this is the set inclusion partial order for a two
member set.

(Refer Slide Time: 32:15)

If A had three members, we would have elements arranged in this manner. There is only one
3 member subset. There are three 2 member subsets, and we have the immediate predecessor
relation between each of them and a, b, c. For example, a, b is an immediate predecessor of
a, b, c. Similarly, b, c is also an immediate predecessor of a, b, c. Then we have the singletons
a, b and c. We have edges of the same sort from them, and then we have the empty set.

(Refer Slide Time: 33:31)

Let us consider the divisibility relation on the set 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12. The
diagram will be drawn like this. 2 and 3 are incomparable, so you cannot have an edge
between them, so they are placed at the same level. 2 does not divide 3 and 3 does not divide
2. Now, when we come to 4, we know that 2 divides 4. So, we have an edge from 2 to 4 in
the Hasse diagram. Now, 5 does not divide any of the previous numbers, it is a prime, and the
previous numbers do not divide 5. Then, we have 6. 6 is divided by both 2 and 3, so we have
edges from 2 and 3 to 6. 7 is a prime. 8 is a multiple of 4. 9 is a multiple of 3. 10 is a multiple
of 5 and 2. 11 is a prime. 12 is a multiple of 6 as well as 4, that is, there is an immediate
predecessor relation between 4 and 12, and between 6 and 12. We do not draw an edge from 2 to 12
even though 2 is a divisor of 12. That is because 2 is not an immediate predecessor of 12: 4 is in
between. 2 divides 4 and 4 divides 12, so there is a chain of divisibility from 2 to 12, a
chain of length more than 1.

Therefore, there is an edge from 4 to 12 but there is no edge from 2 to 12. So, this would be
the Hasse diagram for divisibility on this set. The resultant partial order will be drawn like
this.
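
The edges of this Hasse diagram can be computed from the definition of immediate predecessor. The following is a minimal sketch with illustrative function names; it simply tests, for every comparable pair, whether some third element lies in between.

```python
def divides(a, b):
    """a | b for positive integers."""
    return b % a == 0

def hasse_edges(S, leq):
    """Pairs (a, b) where a is an immediate predecessor of b under leq."""
    S = list(S)
    edges = []
    for a in S:
        for b in S:
            if a != b and leq(a, b):
                # An element strictly between a and b rules out the edge.
                between = any(c not in (a, b) and leq(a, c) and leq(c, b) for c in S)
                if not between:
                    edges.append((a, b))
    return edges

edges = hasse_edges(range(2, 13), divides)
print(edges)
print((4, 12) in edges, (6, 12) in edges, (2, 12) in edges)   # True True False
```

The elements 7 and 11 do not appear in any edge, since they are incomparable with everything else in this set.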

(Refer Slide Time: 35:55)

Then, let us consider the less than or equal to relation on the set 3, 4, 5, 6, 7, 8. Here, we find
that the Hasse diagram has a simple form. That is because this is a total order, which means
between any pair of elements the less than or equal to relation holds. But in the Hasse
diagram, we do not draw every possible line. We show only the immediate predecessor. So,
the immediate predecessor of 4 is 3, the immediate predecessor of 5 is 4, the preceding
number. In fact, to get the partial order you should apply transitivity on the predecessor
relation that is shown here. For example, there is a path from 4 to 6 here in this diagram.
Therefore we know that 4 is actually a predecessor of 6. 4 is less than or equal to 6.

(Refer Slide Time: 36:59)

A maximal element in a Poset is an element a such that no element is strictly larger than it. So,
if you look at these diagrams, here, the set a, b, c is a maximal element, because no member in
the diagram is above a, b, c. Here, we find that 7, 8, 9, 10, 11, 12 are all maximal elements,
and here we find that 8 is a maximal element. Analogously, a minimal element is an element such
that no element is strictly less than it. So, if you look at these diagrams, you find that phi is
a minimal element. Here 2, 3, 5, 7, 11 are all minimal elements. In this diagram, 3 is a minimal
element. Nothing is below 3. So, those are the maximal and minimal elements.

(Refer Slide Time: 38:54)

We say that an element a is the first element if for every x, it is the case that a is less than
or equal to x. We say that a is the last element of the Poset if it satisfies the dual of this,
which means for every x, x is less than or equal to a, x precedes a. So, let us look at the
diagrams here. Here we find that phi is a minimal element and phi is also the first element. Set
a, b, c is a maximal element and also the last element.

But when we come to this, we find that it does not have a first element or a last element. 7, 8,
9, 10, 11, 12 are all maximal elements, but they are incomparable to each other. So, there is
no element which is a successor of everybody. Similarly, there is no element which is a
predecessor of everybody. 2, 3, 5, 7, 11 are all minimal elements but there is no first element
here. Here, 8 is a maximal element as well as the last element, 3 is a minimal element as well as
the first element. So, from this we know that a first element is always a minimal element, not
necessarily vice versa. A last element is always a maximal element and not necessarily vice
versa.
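
These four notions can be computed for the two numeric examples. The helper names below are illustrative, and the sketch assumes finite sets so that every element can be examined in turn.

```python
def divides(a, b):
    """a | b for positive integers."""
    return b % a == 0

def maximal(S, leq):
    """Elements with nothing strictly above them."""
    return [a for a in S if not any(a != x and leq(a, x) for x in S)]

def minimal(S, leq):
    """Elements with nothing strictly below them."""
    return [a for a in S if not any(a != x and leq(x, a) for x in S)]

def first(S, leq):
    """The element preceding every element, or None if there is none."""
    return next((a for a in S if all(leq(a, x) for x in S)), None)

def last(S, leq):
    """The element that every element precedes, or None if there is none."""
    return next((a for a in S if all(leq(x, a) for x in S)), None)

S = list(range(2, 13))
print(maximal(S, divides))                   # [7, 8, 9, 10, 11, 12]
print(minimal(S, divides))                   # [2, 3, 5, 7, 11]
print(first(S, divides), last(S, divides))   # None None

chain = [3, 4, 5, 6, 7, 8]
le = lambda a, b: a <= b
print(first(chain, le), last(chain, le))     # 3 8
```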

(Refer Slide Time: 40:46)

Let us consider consistent enumerations. Say S is a finite Poset. A function f from S to the set
of natural numbers, such that a strictly precedes b implies that f of a is less than f of b, is a
consistent enumeration of S. So, this is the precedence relation, whereas this is the less than
relation over natural numbers. So, for a function to be a consistent enumeration of S, it should
be a mapping from S to the set of natural numbers and it should be such that a strictly precedes
b implies f of a is less than f of b. So, we are effectively numbering the members of the Poset.

(Refer Slide Time: 42:13)

So, let us consider the Hasse diagram of a Poset. Let us say, we have set of elements like this.
So, a numbering in which g gets number 1; a, c, d get numbers 2, 3, 4; b and e get 6 and 5; and
f gets 7, is a consistent enumeration. g is a predecessor of a, g gets 1 and a gets 2, which is
consistent. g is a predecessor of c, g gets 1 and c gets 3, which is consistent. g precedes f and
1 is less than 7, so once again it is consistent. So, you can check that every precedence is
satisfied. For any pair of distinct elements x and y so that x precedes y, the number given to x
is less than the number given to y.

(Refer Slide Time: 43:21)

So, here we have a theorem, which says that there exists a consistent enumeration for any
finite poset A. So, the proof of this theorem is by induction on the size of A. When A is a
singleton, we will define f thus. f of a equal to 1 and there is only one element here, so there
is no conflict, the enumeration is a consistent one.

(Refer Slide Time: 44:23)

Now, by Induction hypothesis, let us assume that the statement holds for all sets with n minus
1 elements. Now, for the induction step, consider a set A with n elements. Then imagine the
Hasse diagram for A, when you look at the Hasse diagram, you will be able to find a maximal
element of A. There could be many maximal elements. Let us pick one out. Suppose that is
small a. Then, if you consider A minus a, which I call set B. We find that the size of B is n
minus 1, therefore B should have a consistent enumeration. So, let us say g is a consistent
enumeration for B. Then, a consistent enumeration for A can be constructed very easily.

(Refer Slide Time: 45:48)

Let us define a function f. f of x will be defined like this. If x equal to a, we will put f of x
as n. If x is not a, then x is a member of B as well. Therefore, g is defined for x, so we will
define f of x as g of x. So, we are using the same enumeration that we got inductively for B,
and then we extend that enumeration by setting the function value for a to n. So, a will now be
the nth element. Since a is maximal, no other element comes after a, so giving a the largest
number does not violate any precedence. So, in the Hasse diagram, we pick out one maximal
element. In this case the element f is the maximal element. Then inductively we number the rest
of the diagram, after that we put f back in place and give f the largest number. So, in this case
that largest number is 7. 7 is the number of elements in the original set. So, the numbering that
you obtain in this fashion is going to be a consistent enumeration.
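
The construction in this proof can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name consistent_enumeration and the divisibility example are made up for this sketch, and the induction is unrolled into a loop that repeatedly removes a maximal element and gives it the largest number still available.

```python
def consistent_enumeration(elements, precedes):
    """Number a finite poset so that x preceding y (x != y) implies f[x] < f[y].

    precedes(x, y) is assumed to return True when x <= y in the poset.
    Mirrors the inductive proof: take out a maximal element, give it the
    largest number, and continue with the remaining elements.
    """
    remaining = list(elements)
    f = {}
    n = len(remaining)
    while remaining:
        # A maximal element has no other remaining element above it.
        a = next(x for x in remaining
                 if not any(x != y and precedes(x, y) for y in remaining))
        f[a] = n          # the removed element gets the largest number
        n -= 1
        remaining.remove(a)
    return f


def divides(x, y):
    """Divisibility order on positive integers (example poset, an assumption)."""
    return y % x == 0


numbering = consistent_enumeration([1, 2, 3, 4, 6, 12], divides)
print(numbering)
```

Any maximal element may be picked at each step, so a poset can have several different consistent enumerations.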

(Refer Slide Time: 47:02)

Now, let us study Chains and Anti-chains. So, let us consider a Poset A with the less than or
equal to relation. A subset B of A is a chain if every pair of elements from B are related,
which means B along with the less than or equal to relation forms a total order.

(Refer Slide Time: 48:19)

In a finite chain, there is the first element and the last element. So, a chain is a sequence of
elements of the sort. So, you can always find the first element and the last element. If you
look at the Hasse diagram of a chain, it would look like this. So, the bottom most element is
the first element and the topmost element is the last element.

(Refer Slide Time: 49:12)

A subset B of A is an anti-chain, if no two elements of B are related. So, that is the definition
of a chain and anti-chain.

(Refer Slide Time: 49:46)

So, let us take an example now. Let us consider the divisibility relation. This set {1, 2, 3, 4, 6,
7, 12, 14, 21, 28, 42, 84} is the set of divisors of 84. Then we have, starting from the bottom,
we have 2, 3, 7 are all primes and 1 is a divisor of 2, 3 and 7. So, 1 will be a predecessor of 2,
3 and 7. Now, 4 is a successor of 2, 6 is a successor of both 2 and 3. 12 is a successor of 6
and also a successor of 4. 14 is a successor of 7 and also of 2. 21 is a successor of 7 and also
of 3. 28 is a successor of 14. 28 is also a successor of 4, because 28 is 4 into 7. So, there is no
number, so that 4 divides that number and that number divides 28.

So, there is no intermediate element between 4 and 28. So, you have to draw a line from 4 to
28. Then, 42 is a successor of 21, and also a successor of 14, and 42 is a successor of 6 as
well, and then we have 84, which is a successor of 42, 28 and 12. So, this would be the Hasse
diagram for this Poset. So, in this, if you consider 1, 2, 4, 12 and 84, this is a chain.
Similarly, 1, 2, 6, 42 and 84 is another chain. What would be an anti-chain? 4, 6, 14, 21 will
form an anti-chain, that is because no two of these are divisors of each other. Similarly, 2, 3,
7 will also form an anti-chain. 12, 28, 42 also will form an anti-chain. In fact, 1 alone will
form an anti-chain, and so will 84 alone. Those are some of the anti-chains.
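
The chains and anti-chains in this example can be checked mechanically. The following is a small sketch; the helper names is_chain and is_antichain are assumptions made for this illustration.

```python
from itertools import combinations


def divides(x, y):
    return y % x == 0


def comparable(x, y):
    """Two elements are related if one divides the other."""
    return divides(x, y) or divides(y, x)


def is_chain(subset):
    """Every pair of elements is related."""
    return all(comparable(x, y) for x, y in combinations(subset, 2))


def is_antichain(subset):
    """No two distinct elements are related."""
    return not any(comparable(x, y) for x, y in combinations(subset, 2))


divisors_of_84 = [d for d in range(1, 85) if 84 % d == 0]
print(divisors_of_84)
print(is_chain([1, 2, 6, 42, 84]))      # a chain from 1 up to 84
print(is_antichain([4, 6, 14, 21]))     # no element divides another
print(is_chain([1, 2, 4, 12, 28]))      # not a chain: 12 does not divide 28
```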

(Refer Slide Time: 53:43)

So, I will state this theorem without proof. Consider Poset P with the less than or equal to
relationship, say the length of the longest chain, is n. Then the elements of P can be
partitioned into n disjoint anti-chains, that is the length of the longest chain is n, and the
elements of P can be partitioned into n disjoint anti-chains. As we have done in this case, the
longest chain has a length of 5, so we have managed to partition this into five disjoint anti-
chains. These are not the only anti-chains, by any means. But this is one set of five disjoint
anti-chains.

So, the proof is easy, I am leaving it as an exercise to you, but to give a hint, you can look at
the maximal elements here. In this Hasse diagram there is only one maximal element, which is
84. The maximal elements form an anti-chain, so you can take them out. Then the rest of the
partial order has chains of smaller length and we can apply the statement to that inductively.
So, once you partition the rest of the diagram into n minus 1 disjoint anti-chains, you can add
the anti-chain of maximal elements that you removed in the first step. And once again, the basis
will be formed by the singleton set, or the partial orders in which every chain has length at
most one.
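
The hint can be tried out on the divisors of 84. The sketch below, with the assumed helper name antichain_layers, repeatedly takes out all maximal elements of what remains; each layer taken out is an anti-chain, and the number of layers matches the length of the longest chain.

```python
def divides(x, y):
    return y % x == 0


def antichain_layers(elements, precedes):
    """Peel off the maximal elements again and again; each layer is an anti-chain."""
    remaining = set(elements)
    layers = []
    while remaining:
        maximal = {x for x in remaining
                   if not any(x != y and precedes(x, y) for y in remaining)}
        layers.append(sorted(maximal))
        remaining -= maximal
    return layers


divisors_of_84 = [d for d in range(1, 85) if 84 % d == 0]
layers = antichain_layers(divisors_of_84, divides)
for layer in layers:
    print(layer)
print(len(layers))  # number of anti-chains in the partition
```

For this poset the sketch produces five layers, in line with the longest chain having five elements.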

(Refer Slide Time: 55:59)

As a corollary, we find that for a poset P with the less than or equal to relation, consisting
of nm plus one elements, either there is an anti-chain consisting of m plus 1 elements or there
is a chain of length n plus 1. So, let us assume that there is no chain of length n plus 1.

(Refer Slide Time: 57:01)

So, let us say the longest chain has a length of n, at the most. Then, we can partition this into
at most n disjoint anti-chains. So, this is a partition of the given set. We are partitioning it
into at most n disjoint anti-chains. If no anti-chain has m plus 1 elements, then each anti-chain
has at most m elements. So the number of elements would be less than or equal to m into n,
which is a contradiction, because we have assumed that we have mn plus 1 elements.

(Refer Slide Time: 58:16)

So, as an example, consider 10 women, ordered by descent. Among them, either there are 3 women
who form a grandmother-mother-daughter triplet, that is, a chain of descendants of length three,
or there are 3 women, none of whom is a descendant of another among those 3. That is because 10
equal to 3 into 3 plus 1.

So, we take n equal to 3 and m equal to 3 here, and from the corollary we find that among 10
women, one of these two possibilities must hold. In fact, the corollary gives a little more here:
a chain of 4 descendants, or 4 women none of whom is a descendant of another. So, that is it from
this lecture. Hope to see you in the next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering,
Indian Institute of Technology, Guwahati
Lecture 22
Natural Number, Divisor

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the first lecture on number
theory. In number theory, we study the theory of integers. The integers, along with the two
operators multiplication and addition and the two constants 0 and 1, form what is called an
integral domain, as you would see in the module on algebraic structures. So, number theory is
the study of this integral domain.

(Refer Slide Time: 1:04)

So in number theory, we deal with a set of integers, and operators: multiplication and addition
along with 0 and 1. Together these form an integral domain.

(Refer Slide Time: 1:37)

For two integers a and b, where a is not equal to 0, we say that a divides b if there exists an
integer x so that b equals ax. That is, on multiplying a with some integer x, we would get b.
That is when we say that a divides b, and this is denoted using a vertical bar: a | b. This
notation asserts that a divides b. The negation of a divides b is often written with a cross
across the vertical bar, to indicate that a does not divide b.

(Refer Slide Time: 2:43)

So let us see some results related to division. If a divides b, then for every c, which is an integer,
it is the case that a divides bc. This is easy to show. If a divides b then b equal to ax for some
integer x, then for any c we have b equal to ax. Therefore, bc is equal to ax into c which by
associativity of multiplication can be written as xc times a. This implies that a divides bc. So that
was easy to show.

(Refer Slide Time: 3:42)

Another result is this. If a divides b and b divides c, then a divides c. In other words, the
divides relation is transitive. This is also easy to show. a divides b implies that, for some
integer x, b equal to ax. Similarly, b divides c implies that, for some integer y, c equal to by.
Therefore c can be written as the product of xy and a, which means there is an integer so that a
multiplied with that integer is c. So that implies that a divides c.

(Refer Slide Time: 4:56)

The third result is this. If a divides b and a divides c, then for any p, q that are integers, a
divides bp plus cq. That is, if a divides b and a divides c, then a will divide any linear
combination of b and c, where the linear combination has integer coefficients. p and q are the
coefficients of the linear combination, these are integers. So any such linear combination of b
and c will be divided by a.

(Refer Slide Time: 5:48)

How do we show this? What is given is this: a divides b and a divides c. If this is the case,
then by the definition we know that there are integers x and y, so that b equal to ax and c equal
to ay. b is a multiple of a and c is also a multiple of a. In that case, for every p, q which
are also integers, bp plus cq is a into px plus yq. For any pair of integers p and q, we can
write bp plus cq as apx plus ayq, because b is ax and c is ay. This implies that for all integers
p and q, there exist x and y such that bp plus cq equals a into xp plus yq. That is because this
statement is a weaker statement in comparison to the one above. In other words, for every pair
of integers p and q, bp plus cq is az, for some integer z. In other words, for all p, q which are
integers, bp plus cq is a multiple of a, or a divides bp plus cq, and that is precisely what we
wanted to show. For any pair of integers p and q, the linear combination of b and c is a
multiple of a.

(Refer Slide Time: 8:19)

Our fourth statement is this. If a divides b and b divides a, then a equals plus or minus b. If a
divides b, then a multiplied with x is b for some x, which is an integer, and if b divides a, then by
is a, for some integer y. Then axy is by which is a, where both x and y are integers. So there exist
integers x and y so that, axy equal to a, or in other words xy equals one.

If for integers x and y, x into y happens to be 1, then we have only two possibilities, either x
equal to 1 and y equal to 1 or x equal to minus 1 and y equal to minus 1. In the first case, we
have a equal to b. In the second case we have a equal to minus b. So combining these two we can
assert that a is plus or minus b.

(Refer Slide Time: 9:35)

If a divides b for positive a and b, then a less than or equal to b. Prove this yourself.

(Refer Slide Time: 9:59)

And another property of the divides relation is this. If an integer x that is non-zero is given and a
divides b then xa divides xb. This also you can try out. So those were some results about the
divisibility relation.
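
These results can be spot-checked on small integers. The following sketch is only a numerical sanity check over a small window, not a proof; the function names divides and check_divisibility_results are assumptions made for this illustration.

```python
from itertools import product


def divides(a, b):
    """Return True when a | b, that is, b == a * x for some integer x."""
    if a == 0:
        raise ValueError("the divisor a must be nonzero")
    return b % a == 0


def check_divisibility_results(limit=6):
    """Check the results above for all integers in [-limit, limit]."""
    everything = range(-limit, limit + 1)
    nonzero = [n for n in everything if n != 0]
    for a, b in product(nonzero, everything):
        if not divides(a, b):
            continue
        for c in everything:
            # a | b implies a | bc.
            assert divides(a, b * c)
            # Transitivity: a | b and b | c imply a | c.
            if b != 0 and divides(b, c):
                assert divides(a, c)
            # a | b and a | c imply a | bp + cq for integers p, q.
            if divides(a, c):
                for p, q in product(range(-2, 3), repeat=2):
                    assert divides(a, b * p + c * q)
        # a | b and b | a imply a = b or a = -b.
        if b != 0 and divides(b, a):
            assert a == b or a == -b
    return True


print(check_divisibility_results())
```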

(Refer Slide Time: 10:36)

Now let us see what is called the division algorithm. At the heart of this algorithm, we have this
theorem. For any two integers a and b, where a is greater than 0 (b need not be greater than 0),
there exist unique integers q and r such that b is qa plus r, where 0 is less than or equal to r,
which is less than a. So when you find such a unique ordered pair (q, r), q is called the quotient
and r is called the remainder.

(Refer Slide Time: 11:50)

So how do we prove that there exists such a unique pair (q, r)? That is what we want. We consider
the real line. On this real line, consider point b. b is an integer it may be positive or negative, and
we have a, which is greater than 0. From b, let us start marking points that are at distance a. So
we get an arithmetic progression. b plus a is the next point, b plus 2a is the one after that, b plus
3a is one after that and so on. That is going to the right side. If you go to the left side, we have b
minus a, b minus 2a, b minus 3a and so on. So starting from b, we are going to the right, jumping
at a distance of a every time. Similarly, we can also move to the left jumping at a distance of a
every time. Now on this real line, 0 is somewhere. Let us say this is where 0 is.

In that case, a will be here, at a distance of a from 0 to the right. So let us consider this
interval, the interval from 0 to a, including 0 but excluding a. You start from b, and start
jumping at a distance of a either to the left or to the right. In one of the directions you would
jump into this interval exactly once. That is, there will be exactly one point of the arithmetic
progression falling in this interval, because the interval has length a and consecutive points
of the progression are a apart. In particular, what we need to know is that there is one such
point.

(Refer Slide Time: 14:26)

Suppose that one point corresponds to (q, r). Ordered pair (q, r) corresponds to the one point that
we find. Then at this point we have, b minus qa equal to r or b equals qa plus r, and here 0 less
than or equal to r less than a. Now we have to argue that this ordered pair is unique.
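
The jumping argument translates directly into a small procedure. This is a sketch under the assumption a > 0; the name divide_by_jumping is made up for this illustration, and Python's built-in divmod is used only as a cross-check.

```python
def divide_by_jumping(b, a):
    """Find (q, r) with b == q * a + r and 0 <= r < a, for a > 0.

    Start at the point b and jump by a, to the left or to the right,
    until we land in the interval from 0 (included) to a (excluded).
    """
    if a <= 0:
        raise ValueError("a must be positive")
    q, r = 0, b
    while r >= a:    # to the right of the interval: jump left
        r -= a
        q += 1
    while r < 0:     # to the left of the interval: jump right
        r += a
        q -= 1
    return q, r


for b in (17, -17, 0, 5):
    q, r = divide_by_jumping(b, 5)
    print(b, "=", q, "* 5 +", r)
    assert (q, r) == divmod(b, 5)   # agrees with the built-in for a > 0
```

Each jump keeps b equal to qa plus r, so the pair returned at the end satisfies the theorem.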

(Refer Slide Time: 15:16)

Suppose otherwise. Suppose (q prime, r prime) is another such ordered pair. Clearly r prime is
not equal to r, because otherwise q prime is the same as q and therefore this ordered pair would
not be distinct from the earlier one. So r prime is not equal to r and r prime is b minus q prime
a, that is because (q prime, r prime) is an ordered pair which satisfies our requirement. So r
prime is a non-negative member of the arithmetic progression, that is because we assume that 0
less than or equal to r prime. (q prime, r prime) is another ordered pair where r prime is
greater than or equal to 0.

(Refer Slide Time: 16:29)

But then, r is the least non-negative member of the arithmetic progression. So, r prime is
greater than or equal to r plus a. But what is r plus a? This is b minus qa plus a, and since r
is greater than or equal to 0, this is greater than or equal to a. And therefore r prime will not
qualify, because we want a (q prime, r prime) such that b is q prime a plus r prime and 0 less
than or equal to r prime less than a. This is violated here. Therefore, there cannot be another
ordered pair (q prime, r prime).

(Refer Slide Time: 17:30)

So (q, r) is unique, and hence, our claim. If a does not divide b, then 0 less than r less than a.
In that case, none of the points in the arithmetic progression will be multiples of a, so 0 is
not one of them. Therefore, r will be strictly greater than 0.

(Refer Slide Time: 18:09)

Now it is easy to show this theorem. For any two integers a and b, where a is not equal to 0,
there exist unique integers q and r, such that b is qa plus r and 0 less than or equal to r less
than mod a, the absolute value of a. See, here we only say that a is not equal to 0. We do not
assume that a is greater than 0. So you can prove this theorem yourself.

(Refer Slide Time: 19:01)

We say that a is a common divisor of b and c, if a divides b and a divides c, where a, b, c are all
integers. So a pair of numbers b and c can have multiple common divisors. Every nonzero
integer has only a finite number of divisors.

(Refer Slide Time: 19:56)

Therefore, the common divisors of two numbers, b and c, which are integers form a finite set.
Therefore, we can talk about the greatest of them. The greatest common divisor happens to be
the largest of this finite set.

(Refer Slide Time: 20:41)

So we will denote this by GCD of b and c, defined if b not equal to 0 or c not equal to 0. We can
extend this notion to multiple integers. We can talk about GCD of b, c, d, which is the GCD of
(GCD of b and c) and d, as we shall see.

(Refer Slide Time: 21:16)

Now we study an interesting theorem which says that, if g is the GCD of two numbers b and c,
then there exist integers x0 and y0, such that g is bx0 plus cy0. In other words, if g is the GCD of b
and c, then g can be expressed as a linear combination of b and c with integer coefficients.

(Refer Slide Time: 22:03)

So let us see how to do this. In particular consider two numbers. Let us say b equal to 3 and c
equal to 7. Then we want to express the GCD of these two, which we know is 1. GCD of 3 and 7
is 1. We want to express 1 as a linear combination of 3 and 7. So we could write this as 3x plus
7y. We have to find x and y so that 1 is equal to 3x plus 7y, that is precisely what the theorem
says, The GCD of two numbers can be expressed as a linear combination of those two numbers
with integer coefficients x and y.

So let us consider the various possible values of x and various possible values of y. When x
equal to 0, y equal to 0, we have the linear combination evaluating to 0. When x is 1 and y is 0
we have 3. When x is 2, y is 0 we have 6. On the negative side we have minus 3, minus 6. When
x is 0 and y is 1, we have 7. When x is 0, y is 2 we have 14. In the other direction, we have
minus 7 and minus 14. Here we have 10 and 17. This is how the values would look. For various
integral values of x and y, the linear combination 3x plus 7y would have these values. So you
find that, indeed, there is one particular choice of x and y for which the linear combination
evaluates to 1. When x equal to minus 2, and y equal to 1, we have 3x plus 7y evaluating to minus
6 plus 7 which is 1. So, there is a choice of x and y for which the linear combination evaluates
to 1. So how do we generalize this? We want to prove the assertion for every pair of integers, b
and c.
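
The table of values of 3x plus 7y can be reproduced with a short script. This is a sketch over a small window of x and y (the window from minus 3 to 3 is an arbitrary choice for this illustration).

```python
b, c = 3, 7
window = range(-3, 4)

# Print the grid of values b*x + c*y, one row per y (largest y on top).
for y in reversed(window):
    print(" ".join(f"{b * x + c * y:4d}" for x in window))

# Choices of (x, y) inside the window for which the combination is 1.
solutions = [(x, y) for x in window for y in window if b * x + c * y == 1]
print(solutions)
```

Inside this window the only choice giving 1 is x equal to minus 2 and y equal to 1, the one found above.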

(Refer Slide Time: 25:01)

So we have this pair of integers, b and c. Let us define the set S as the set of all integers bx plus
cy, where x and y are integers. So it is precisely this set that we depicted here for integers 3 & 7.
So this is clearly an infinite set. Let d be the least positive member of S. Depending on the
various choices for x and y, we have different values in S. We are picking out the least positive
member of S. We call it d.

(Refer Slide Time: 25:57)

Say d is bx0 plus cy0. Every member of S is a linear combination of b and c, for some choice of x
and y. So, d is also the same. So there is a choice of x and y namely x0 and y0, for which d is bx0
plus cy0. If d does not divide b, then there exist unique r and q, such that b is qd plus r, where r
is strictly between 0 and d.

(Refer Slide Time: 26:58)

So we have r is b minus qd, which is b minus q into bx0 plus cy0. Rearranging we get that this is
b into 1 minus qx0 minus c into qy0, or equivalently b into 1 minus qx0 plus c into minus qy0. So
we have two integers 1 minus qx0 and minus qy0, so that r is a linear combination of b and c with
these as the coefficients. But we know that r is strictly between 0 and d. Therefore what we have
found is that r belongs to S and r is positive. But we had picked d as the least positive member
of S, and here we find r, which is a positive member of S but is less than d. Therefore, we have
a contradiction, and from where did we derive this contradiction? We assumed that d does not
divide b, and then got this contradiction. Therefore it must be that d divides b.

(Refer Slide Time: 28:28)

Similarly, we can also argue that d divides c. So if d divides b and d divides c, then d is a
common divisor of b and c. Now, consider the GCD of b and c. Let g be the GCD of b and c.
Since d is bx0 plus cy0, we have that g divides d. g divides b and g divides c, so g divides bx0
plus cy0 as per the theorem we saw earlier. So g divides d.

(Refer Slide Time: 29:18)

g and d are both positive. So since g divides d, g is less than or equal to d. But then g is the GCD
of b and c, and d is a CD, a common divisor. g is the greatest common divisor. Therefore, clearly
g is greater than or equal to d. In other words g is equal to d.

(Refer Slide Time: 30:01)

In other words, the GCD of b and c is the smallest positive integer that can be written as a
linear combination of b and c with integer coefficients.
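
This characterisation can be checked by brute force for small inputs. The sketch below searches a bounded window of coefficients, which is an assumption of the illustration (the set S itself is infinite); the window is taken large enough for the examples tried, and math.gcd from the Python standard library is used only for comparison.

```python
from math import gcd


def least_positive_combination(b, c):
    """Least positive value of b*x + c*y over a bounded window of integers x, y."""
    bound = max(abs(b), abs(c))          # search window, assumed large enough
    values = (b * x + c * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1))
    return min(v for v in values if v > 0)


for b, c in [(3, 7), (12, 18), (84, 30), (-4, 6)]:
    d = least_positive_combination(b, c)
    print(b, c, d, gcd(b, c))
    assert d == gcd(b, c)
```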

(Refer Slide Time: 30:37)

Now, let’s see another theorem. For any two integers, b and c not both 0, the GCD is the positive
common divisor that is a multiple of every common divisor. For any pair of integers, b and c, not
both 0, the greatest common divisor happens to be the positive common divisor that is a multiple
of every common divisor.

(Refer Slide Time: 31:33)

To prove this, suppose d is a common divisor of b and c, then, d divides every linear
combination of b and c. If d divides b and d divides c, then d divides bx plus cy for any pair of
integers x and y. So d divides every member of the set. In particular d divides the least positive
member of the set. This is what we call S. But then what is the least positive member of S. That
happens to be the GCD of b and c. So if d is a common divisor of b and c then d divides g.

(Refer Slide Time: 32:45)

So every common divisor of b and c divides g. If g and g prime are both positive common
divisors that are divided by every common divisor, then, since g and g prime are themselves
common divisors, we have g divides g prime and g prime divides g, which, as both are positive,
implies that g equal to g prime. In other words, g is the only positive common divisor with this
property: the only positive common divisor of b and c which is divided by every common divisor
of b and c is the GCD of b and c.

(Refer Slide Time: 33:58)

Another theorem regarding GCDs: for every positive integer d, we have that GCD of bd, cd
equals GCD of b, c multiplied by d. What is the GCD of bd and cd? This happens to be the least
positive member of the set bdx plus cdy, where x and y are integers, which is d times the least
positive member of bx plus cy where x and y are integers. This is the case when d is a positive
integer, which is indeed the case here. But the latter is the GCD of b and c, so that is precisely
what we wanted to show. So you can remove common factors from bd and cd: d is a common factor
of bd and cd, and then you find the GCD of the remnants.

(Refer Slide Time: 35:48)

Now a related theorem is this. For every positive common divisor d of b and c, GCD of (b/d, c/d)
is (GCD of b and c) divided by d. How do we prove this? In the previous theorem you keep d as
d, and put b by d in place of b and c by d in place of c. If you substitute this in that theorem,
we get the new theorem as a corollary.
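
Both statements are easy to spot-check numerically. The following is only a sanity check over small positive integers, using math.gcd from the Python standard library; the function name check_gcd_scaling is an assumption of this sketch.

```python
from math import gcd


def check_gcd_scaling(limit=30):
    """Check gcd(bd, cd) == d * gcd(b, c) and gcd(b/d, c/d) == gcd(b, c) / d."""
    for b in range(1, limit + 1):
        for c in range(1, limit + 1):
            g = gcd(b, c)
            # Scaling by any positive integer d.
            for d in range(1, 6):
                assert gcd(b * d, c * d) == d * g
            # Dividing by a positive common divisor d (these are the divisors of g).
            for d in range(1, g + 1):
                if g % d == 0:
                    assert gcd(b // d, c // d) == g // d
    return True


print(check_gcd_scaling())
print(gcd(12 * 5, 18 * 5), 5 * gcd(12, 18))
```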

(Refer Slide Time: 36:48)

Yet another theorem is this: if GCD of a and d is 1 and GCD of b and d is 1, then GCD of ab and
d is 1. In other words, look at the fraction a by d, you cannot reduce this fraction anymore. a
and d do not cancel. Similarly, b and d also do not cancel. b and d do not have common factors
other than 1. Therefore if you consider ab by d, then d cannot cancel against ab. d and a do not
have common factors. d and b do not have common factors other than 1. Therefore d and ab also
will not have common factors other than 1. Of course, this is intuitively clear to you. But how
do you prove it?

(Refer Slide Time: 37:57)

We know that GCD of a, d equal to 1. GCD of b and d is also equal to 1. Therefore we have
integers x0, y0, x1, y1, so that 1 is ax0 plus dy0 and 1 is bx1 plus dy1. So, there exist x0, y0,
x1, y1, all integers, so that this is satisfied.

(Refer Slide Time: 38:52)

Let us define z1 as dy0 y1 minus y0 minus y1, and z0 as x0 x1. If this is the case, then we
readily find that ab into z0 is (ax0) into (bx1), which is (1 minus dy0) into (1 minus dy1).
Replacing ax0 and bx1 with 1 minus dy0 and 1 minus dy1, we find that ab into z0 is this product,
which is 1 plus dz1. That is, ab into z0 plus d into minus z1 equal to 1.
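
The algebra can be verified on a concrete instance. In this sketch the numbers a = 4, b = 9, d = 7 and the coefficients x0, y0, x1, y1 are chosen by hand for illustration, and the function name combine_coefficients is an assumption.

```python
from math import gcd


def combine_coefficients(a, b, d, x0, y0, x1, y1):
    """Given a*x0 + d*y0 == 1 and b*x1 + d*y1 == 1, return (z0, z1)
    with a*b*z0 + d*(-z1) == 1, following the definitions above."""
    if a * x0 + d * y0 != 1 or b * x1 + d * y1 != 1:
        raise ValueError("the given coefficients do not produce 1")
    z0 = x0 * x1
    z1 = d * y0 * y1 - y0 - y1
    return z0, z1


a, b, d = 4, 9, 7
# 4*2 + 7*(-1) == 1 and 9*(-3) + 7*4 == 1
z0, z1 = combine_coefficients(a, b, d, x0=2, y0=-1, x1=-3, y1=4)
print(z0, z1)
print(a * b * z0 + d * (-z1))   # prints 1
print(gcd(a * b, d))            # prints 1
```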

Therefore, if you consider the linear combinations of ab and d with integer coefficients, the least
positive member of that set is going to be 1. If 1 is present in that set, certainly 1 has to be the
least among them. In other words, GCD of ab and d will have to be 1. That is it from this lecture.
We will see more properties of GCD's in the next class. Hope to see you in the next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering,
Indian Institute of Technology, Guwahati
Lecture 23
Lattices
Welcome to the NPTEL MOOC on Discrete Mathematics. This is the seventh lecture on set
theory. Here we continue the discussion on partial ordering relations that we started in the
sixth lecture.

(Refer Slide Time: 0:48)

Let us say S is a poset. Let us say A is a subset of S. We say that m, a member of S, is an upper
bound of A if, for every x belonging to A, it is the case that x is less than or equal to m, or
x precedes m. An upper bound of A that is less than or equal to every other upper bound of A is
a least upper bound, or LUB for short.

(Refer Slide Time: 1:59)

A least upper bound is also called a Supremum. Similarly, a lower bound of A is some m
belonging to S such that, for every x belonging to A, it is the case that m is less than or
equal to x. The greatest of all lower bounds is the greatest lower bound. It is also called the
infimum.

(Refer Slide Time: 3:06)

In particular for some A, for an arbitrary A, supremum may not exist. But if it does exist, it is
unique. We can say the same thing about an infimum. The infimum may not exist, but if it
does exist, it is unique.

(Refer Slide Time: 3:39)

For example, consider this partial order denoted by a Hasse diagram. Let the set A equals {d,
e, b}. Then the upper bounds of A are f and g. A has two upper bounds. But since, f and g are
incomparable there is no LUB. The set A has no supremum. On the other hand, the lower
bounds of A are b, a, and c. All these three are less than or equal to every member of A. So,
these are all lower bounds of the set A. The greatest lower bound is the greatest of them, which
is unique and happens to be b. So, the set has a greatest lower bound, but it is without a
least upper bound.

(Refer Slide Time: 5:03)

Consider the divisibility partial order. Here gcd of a and b is the greatest lower bound of the
two member set a, b. Lcm of a and b is the least upper bound of a and b.

(Refer Slide Time: 5:44)

We can, of course, extend those to more elements. For example, the gcd of a, b, c would be
the greatest lower bound of the triplet, the three member set a, b, c. Similarly, lcm of a, b, c,
the least common multiple of a, b, c would be the least upper bound of this set.
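
The connection between bounds and gcd and lcm can be tried out on a finite poset. The sketch below uses the divisors of 84 under divisibility as the universe; the helper names are assumptions of this illustration, and a bound is reported as None when it does not exist.

```python
from math import gcd


def divides(x, y):
    return y % x == 0


def greatest_lower_bound(subset, universe):
    """Greatest lower bound of subset in the divisibility order, or None."""
    lower = [m for m in universe if all(divides(m, x) for x in subset)]
    best = [g for g in lower if all(divides(m, g) for m in lower)]
    return best[0] if best else None


def least_upper_bound(subset, universe):
    """Least upper bound of subset in the divisibility order, or None."""
    upper = [m for m in universe if all(divides(x, m) for x in subset)]
    best = [l for l in upper if all(divides(l, m) for m in upper)]
    return best[0] if best else None


divisors_of_84 = [n for n in range(1, 85) if 84 % n == 0]
print(greatest_lower_bound([12, 42], divisors_of_84), gcd(12, 42))
print(least_upper_bound([12, 42], divisors_of_84), 12 * 42 // gcd(12, 42))

# In a smaller poset a least upper bound may fail to exist:
# 12 and 18 are both upper bounds of {2, 3} but they are incomparable.
print(least_upper_bound([2, 3], [2, 3, 12, 18]))
```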

(Refer Slide Time: 6:21)

Now, let us study what are called lattices.

(Refer Slide Time: 6:27)

Suppose L is a non-empty set, closed under two binary operators, or functions. These two
operators are named meet and join. It is customary to represent meet using the symbol ∧, and
join using the symbol ∨. So, they are similar to the AND and OR symbols in Boolean Algebra.
That is not without reason. Therefore, in our subsequent discussions, I will use the words meet
and AND as synonyms, and join and OR as synonyms. But meet has more general connotations than
AND, and similarly join. So, we consider a nonempty set L which is closed under two binary
operators.

(Refer Slide Time: 7:40)

We say that L is a lattice if the following axioms hold. The first axiom says that for every a
and b belonging to L, a meet b is equal to b meet a and a join b equals b join a, which means
meet and join are commutative operators. This is the first axiom. Let me call it L1, the first
axiom of lattices.

(Refer Slide Time: 8:36)

The second axiom says that, for every a, b, c belonging to L, it is the case that (a meet b)
meet c is equal to a meet (b meet c), and (a join b) join c is equal to a join (b join c), which
means meet and join are both associative. This is the second axiom.

(Refer Slide Time: 9:29)

The third axiom says that for every a and b belonging to L, a meet (a join b) is equal to a, and
a join (a meet b) is a again. This is the absorption law. So, let us assume that our nonempty
set L, which is closed under the two operators meet and join, satisfies these three axioms:
commutativity, associativity and absorption.
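
The three axioms can be checked exhaustively on a small example. In this sketch the set L is taken to be the divisors of 84, with gcd as meet and lcm as join; this choice of example and the function name is_lattice are assumptions made for illustration.

```python
from math import gcd
from itertools import product


def meet(a, b):
    return gcd(a, b)


def join(a, b):
    return a * b // gcd(a, b)   # least common multiple


def is_lattice(L, meet, join):
    """Check closure, commutativity, associativity and absorption on a finite set."""
    elems = set(L)
    for a, b in product(L, repeat=2):
        if meet(a, b) not in elems or join(a, b) not in elems:
            return False                                   # closure
        if meet(a, b) != meet(b, a) or join(a, b) != join(b, a):
            return False                                   # commutativity
        if meet(a, join(a, b)) != a or join(a, meet(a, b)) != a:
            return False                                   # absorption
    for a, b, c in product(L, repeat=3):
        if meet(meet(a, b), c) != meet(a, meet(b, c)):
            return False                                   # associativity of meet
        if join(join(a, b), c) != join(a, join(b, c)):
            return False                                   # associativity of join
    return True


L = [n for n in range(1, 85) if 84 % n == 0]
print(is_lattice(L, meet, join))
```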

(Refer Slide Time: 10:16)

Now, you can see that all axioms have duality. That is in the commutativity axiom, when you
substitute meet with join and join with meet, we again get another commutativity axiom, and
similarly, for associativity, and absorption as well. So, all axioms satisfy duality. Therefore,
any theorem that is proved from these axioms will also satisfy duality. That is, you take a
proof in this system, and in this proof if you substitute meet with join and join with meet
throughout, then you find that you conclude the dual of the original conclusion.

(Refer Slide Time: 11:19)

Let us derive the idempotent laws. The first idempotent law says that a or a equal to a. You
can prove it this way: a or a is a or (a and (a or b)), that is because the second a can be
replaced by a and (a or b), which is equal to it by the absorption law. But here we have a or
(a and something), which is a indeed, by the dual of the absorption law which we used in step
one. Therefore, we have proved the idempotent law for join. The dual of this proof will give us
the idempotent law for meet, a meet a is a.

(Refer Slide Time: 12:23)

Now, let us define a partial order on set L. Let this relationship be called less than or equal
to, or the precedes relation. In this, we say that a is less than or equal to b if a meet b equal
to a. This is how we define the partial order. Then we can see immediately that a is less than
or equal to b exactly when a join b is b.

(Refer Slide Time: 13:12)

Why is this so? We know that a less than or equal to b if and only if a meet b is a, by
definition. But if a meet b is a, then b join a is b join (a meet b), which is b join (b meet a)
by commutativity, which is then b by absorption. And b join a is a join b, again by
commutativity. Therefore, we conclude that a join b is b.

(Refer Slide Time: 14:17)

Then, we see that this less than or equal to relationship, along with L forms a poset. That is
because the less than or equal to relationship is reflexive, antisymmetric and transitive.

(Refer Slide Time: 14:46)

Why is this so? a meet a is a by the idempotent law we have just shown. So a is less than or
equal to a by the definition of the less than or equal to relationship, and this is the case for
every a in L. Therefore, the less than or equal to relationship is reflexive.

(Refer Slide Time: 15:16)

Suppose a less than or equal to b and b less than or equal to a. Then by the definition of the
less than or equal to relationship, we know that a meet b is a and b meet a is b. But then a,
which is a meet b, is b meet a by the commutativity of the meet operation, which is b. So we
have a equal to b, therefore we have the antisymmetric property. We have concluded that if a
is less than or equal to b and b is less than or equal to a, then a equal to b. So the less than
or equal to relation that we are talking about is antisymmetric.

(Refer Slide Time: 16:16)

Now, let us say a is less than or equal to b, and b is less than or equal to c. Then by the definition
of the less than or equal to relationship, we have a meet b equal to a and b meet c equal to b.
Therefore, since a is a meet b, a meet c would be (a meet b) meet c. By associativity, this is
a meet (b meet c). But from above we know that b meet c is b, so this is a meet b. But a meet b
is a, so a meet c is a, which implies that a is less than or equal to c, thereby establishing
transitivity. So, the less than or equal to relationship is reflexive, antisymmetric and
transitive.

(Refer Slide Time: 17:23)

Now we started out by saying that meet and join operations are defined for every pair of
elements of L. That is, L is closed under the operators meet and join. In other words, for
every a and b belonging to L, a meet b belongs to L and a join b belongs to L.

(Refer Slide Time: 18:02)

For an arbitrary a and b, let us say a meet b is equal to c. Then a meet c is a meet (a meet b),
which by associativity is (a meet a) meet b. But a meet a is a, by the
idempotent law. So this is a meet b, which is c. So, we find that a meet c is c, or in other
words, c is less than or equal to a. Similarly, we also have that c is less than or equal to b.

(Refer Slide Time: 18:54)

So, c is a lower bound of a and b. But, of course, we do not know that c is the greatest lower
bound. What we know is that c is a lower bound. Suppose c prime is another lower bound. c
prime is a lower bound of a and b, then clearly c prime meet a equals c prime meet b equals c
prime.

(Refer Slide Time: 19:36)

Then, what would be c prime meet c? Remember c is a meet b, so this would be c prime meet
(a meet b), which by associativity is (c prime meet a) meet b. But c prime meet a is c prime.
So this is c prime meet b, which is c prime again. In other words, c prime is less than or
equal to c. So, if there is another lower bound of a and b, then that lower bound is less than or
equal to c.

(Refer Slide Time: 20:15)

Or in other words, c is the greatest lower bound. By the dual of this argument, we know that
if d is a join b, then d is the least upper bound of a and b.

(Refer Slide Time: 20:44)

So, what we have concluded is that, in the partial ordering relation less than or equal to,
defined on L, we have this property. For every a and b, the greatest lower bound and the least
upper bound are defined. They happen to be the meet and join of a and b respectively.

(Refer Slide Time: 21:23)

On the other hand, suppose P is a poset with less than or equal to relationship, let us say LUB
of a and b and GLB of a and b are defined for all a and b. Let a meet b equal to GLB of a and
b, and a join b equal to LUB of a and b. So, if you do this, then for every a and b, GLB and
LUB are defined, or meet and join are defined. So meet and join are closed operators for P.

(Refer Slide Time: 22:22)

Then consider the Hasse diagram for this poset. From this we find that a meet b is b meet a
and a join b is b join a. Therefore, commutativity holds for both meet and join. Similarly,
from the diagram we can also conclude that associativity holds for meet, and so does its dual
for join.

(Refer Slide Time: 23:16)

And absorption also would hold. So arguing this from the Hasse diagram is straight forward.
I leave it as an exercise.

(Refer Slide Time: 23:35)

So what we find with this is that, if L is a lattice with meet and join, then the greatest lower
bound and least upper bound are defined for every pair of elements. This is, of course, an if and
only if relation. L is a lattice if and only if the greatest lower bound and least upper bound are
defined for every pair of elements.

(Refer Slide Time: 24:21)

A lattice is a poset in which every pair has a greatest lower bound and least upper bound.
Some examples, any linearly ordered set is a lattice. In particular, consider the set of integers.
Take any two integers i and j, either i less than or equal to j or j less than or equal to i or both.
So, the greatest lower bound of i and j would be the smaller of the two, and the least upper
bound would be the larger of the two. So GLB and LUB are defined for every pair of
elements. Therefore, this is indeed a lattice.

(Refer Slide Time: 25:29)

The divisibility partial order on positive integers: For any a and b, gcd of a and b happens to
be the greatest lower bound, and lcm of a and b, as we saw is the least upper bound.

Therefore, GLB and LUB are defined for every pair of integers, the positive integers.
Therefore, this partial order is a lattice.
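
The divisibility example can be checked mechanically on a small finite set. The following is a minimal sketch, assuming we restrict attention to the divisors of 60 (a choice made here only so that the set is finite and closed under gcd and lcm); the helper names `divides`, `is_glb` and `is_lub` are hypothetical and not from the lecture.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def divides(x, y):
    """The partial order: x precedes y when x divides y."""
    return y % x == 0


def is_glb(c, a, b, elements):
    """True if c is the greatest lower bound of a and b under divisibility."""
    lower_bounds = [x for x in elements if divides(x, a) and divides(x, b)]
    return c in lower_bounds and all(divides(x, c) for x in lower_bounds)


def is_lub(d, a, b, elements):
    """True if d is the least upper bound of a and b under divisibility."""
    upper_bounds = [x for x in elements if divides(a, x) and divides(b, x)]
    return d in upper_bounds and all(divides(d, x) for x in upper_bounds)


# Assumed finite universe: the divisors of 60.
divisors_of_60 = [n for n in range(1, 61) if 60 % n == 0]

lattice_ok = all(
    is_glb(gcd(a, b), a, b, divisors_of_60) and is_lub(lcm(a, b), a, b, divisors_of_60)
    for a in divisors_of_60
    for b in divisors_of_60
)
```

If `lattice_ok` is true, then on this set gcd and lcm behave as the meet and join of the divisibility order, as the lecture states.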

(Refer Slide Time: 26:20)

Consider ordered pairs of natural numbers. We say that ordered pair (i, j) is less than ordered
pair (i prime, j prime), less than or equal to, if i is less than or equal to i prime and j is less
than or equal to j prime. This is also a lattice, because given two ordered pair (i, j) and (i
prime, j prime), minimum of i, i prime, minimum of j, j prime is the greatest lower bound.
Similarly, maximum of i, i prime and maximum of j, j prime will form a least upper bound.
So, every pair of ordered pairs has a greatest lower bound and a least upper bound. So this is
a lattice.

(Refer Slide Time: 27:29)

Set inclusion: Let us say we have a family of sets. Given two sets A and B, A intersection B
is a subset of A, and A intersection B is a subset of B as well. Therefore, A intersection B is a
lower bound of both A and B and it also happens to be the greatest lower bound. A union B is
a super set of A, and A union B is a super set of B as well, and this happens to be the least
upper bound. So, every pair of sets has a greatest lower bound and least upper bound. So this
is a lattice again.

(Refer Slide Time: 28:17)

Now, let us define what is called a sub lattice. Let us consider a nonempty set M, which is a
subset of L, where L forms a lattice. Suppose M is a lattice by itself. Then we say M is a sub
lattice of L.

(Refer Slide Time: 28:54)

For an example, we saw the set of natural numbers, and the less than or equal to relation, is a
linear order and therefore a lattice. Let us consider Dm and the less than or equal to relation,
where Dm is the set of divisors of m, so clearly Dm is a subset of N. Therefore, Dm is a sub
lattice.

(Refer Slide Time: 29:44)

We now define a bounded lattice. We say that L has a lower bound 0 if for every x belonging
to L, it is the case that 0 is less than or equal to x, and we say that L has an upper bound I
if for every x belonging to L, it is the case that x is less than or
equal to I.

(Refer Slide Time: 30:25)

L is bounded if it has both a lower bound and an upper bound. Every finite lattice is bounded. I
will not prove it here. I am leaving the proof as an exercise to you.

(Refer Slide Time: 30:53)

Now, to our final topic in this module, that is, distributive lattices. Distributivity is the fourth
axiom. If the fourth axiom, which is called the distributive axiom, is satisfied, then the lattice
is called a distributive lattice. That is, this axiom should be satisfied in addition to the first
three axioms. It says that meet distributes over join, and join distributes over meet. So
the duality still holds.

(Refer Slide Time: 31:45)

Look at this lattice. This is non-distributive. That is because in this, a join (b meet c) is a.
Here b meet c happens to be 0, the greatest lower bound of b and c. That is, the
downward paths from b and c meet in 0, therefore b meet c is 0, and a join 0 is a. Let us
consider the right hand side of the distributivity axiom, which happens to be (a join b) meet (a
join c). What is a join b? That is I. And a join c happens to be c. So the right hand side is I
meet c, which is c. So, we find that the left hand side, a, and the right hand side, c, are not
identical. Therefore, the distributivity law does not hold in this case.

(Refer Slide Time: 33:09)

Similarly, this lattice is also not distributive. That is because a join (b meet c) is a in this case:
on the downside b and c meet at 0, so b meet c is 0, and a join 0 is a. The right hand side of
the distributive law is (a join b) meet (a join c). But a join b is I, and a join c is again I,
because the upward paths from a and c join at I. So the right hand side is I meet I, which is
I. So, once again the left hand side and the right hand side are not the same. So this is
non-distributive again.
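
Both counterexamples can be verified by brute force. The sketch below assumes the shapes implied by the spoken computations: a five-element "pentagon" with 0 below a below c below I and b incomparable to a and c, and a five-element "diamond" with a, b, c pairwise incomparable between 0 and I. The slides themselves are not reproduced here, so these labellings are an assumption, and all function names are hypothetical.

```python
from itertools import product


def make_leq(elements, covers):
    """Build the order relation as the reflexive-transitive closure of the
    given pairs (x, y), each meaning x is less than or equal to y."""
    leq = {(x, x) for x in elements} | set(covers)
    changed = True
    while changed:
        changed = False
        for (x, y), (u, v) in product(list(leq), repeat=2):
            if y == u and (x, v) not in leq:
                leq.add((x, v))
                changed = True
    return leq


def meet(a, b, elements, leq):
    """Greatest lower bound of a and b."""
    lower = [x for x in elements if (x, a) in leq and (x, b) in leq]
    return next(c for c in lower if all((x, c) in leq for x in lower))


def join(a, b, elements, leq):
    """Least upper bound of a and b."""
    upper = [x for x in elements if (a, x) in leq and (b, x) in leq]
    return next(d for d in upper if all((d, x) in leq for x in upper))


def is_distributive(elements, leq):
    """Check a join (b meet c) == (a join b) meet (a join c) for all triples."""
    for a, b, c in product(elements, repeat=3):
        lhs = join(a, meet(b, c, elements, leq), elements, leq)
        rhs = meet(join(a, b, elements, leq), join(a, c, elements, leq), elements, leq)
        if lhs != rhs:
            return False
    return True


elements = ["0", "a", "b", "c", "I"]
pentagon = make_leq(elements, [("0", "a"), ("a", "c"), ("c", "I"), ("0", "b"), ("b", "I")])
diamond = make_leq(elements, [("0", "a"), ("0", "b"), ("0", "c"),
                              ("a", "I"), ("b", "I"), ("c", "I")])
chain = make_leq(elements, [("0", "a"), ("a", "b"), ("b", "c"), ("c", "I")])
```

For contrast, `chain` is a linearly ordered set on the same five labels, which should pass the distributivity check.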

(Refer Slide Time: 34:21)

There is an interesting theorem, which says that a lattice is non-distributive if and only if it
contains a sub lattice that is isomorphic to one of these lattices. That is, one of these two
lattices will have to be isomorphic to some sub lattice of a non-distributive lattice, and that is
a two way implication. The proof of this theorem is outside the scope of this discussion here.
So, we come to the end of the discussion on set theory here. Hope to see you in the other
modules. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 24 – GCD, Euclid's Algorithm

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the second lecture on
number theory.

(Refer Slide Time: 00:38)

When the GCD of two numbers, two integers a and b is 1, we say that a and b are co-prime or
that they are relatively prime.

(Refer Slide Time: 01:03)

Let us see a theorem, which we call theorem 2.1. For integers a, b and x, GCD of a and b is
the same as the GCD of a and b plus ax, that is an integer multiple of a added to b, along with
a, will have the same GCD as a and b.

(Refer Slide Time: 01:35)

The proof goes like this. GCD of a and b plus ax, according to what we saw in the last class,
is the least positive member of set S, where S is the set of all linear combinations of a
and b plus ax using integer coefficients, that is, all numbers of the form ay plus (b plus ax) z,
where y and z are integers. This definition can be re-written as
a into (y plus xz) plus bz, where y and z are in the set of integers. But this is the same as the
set of all numbers au plus bz for integers u and z. That is because, given u and z, any number
au plus bz can be written as a into (y plus xz) plus bz for integers y and z: u will be y plus xz,
therefore y will be u minus xz. But the least positive member of this set in the final form, we
know, is nothing but GCD of a and b.

(Refer Slide Time: 03:38)

Therefore, GCD of a and b plus ax is nothing but GCD of a, b, and hence, the theorem.
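
As a quick sanity check of theorem 2.1, one can compare both sides over a small range of integers. This is only a numerical spot check under an arbitrarily chosen range, not a proof; the function name `theorem_2_1_holds` is made up for this sketch.

```python
from math import gcd


def theorem_2_1_holds(a, b, x):
    """Compare GCD(a, b) with GCD(a, b + a*x) for one choice of a, b, x."""
    return gcd(a, b) == gcd(a, b + a * x)


# Spot check over a small, arbitrarily chosen range (negatives and zero included).
checked = all(
    theorem_2_1_holds(a, b, x)
    for a in range(-15, 16)
    for b in range(-15, 16)
    for x in range(-5, 6)
)
```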

(Refer Slide Time: 03:50)

Another theorem, which we number 2.2, states this. If a and c are co-prime or relatively
prime, which means their GCD is 1, that is, they do not have a common factor other than 1,
and a divides bc, then a divides b.

(Refer Slide Time: 04:18)

To prove this, we consider GCD of ab and bc. This we know, from the theorem proved in the
last class, is b times GCD of a and c. But a and c are co-prime, therefore GCD of a and c is 1.
Therefore this is b. So GCD of ab and bc is b, which means all common divisors of ab and bc
divide b. Now, a divides bc and a divides ab. Therefore, a is a common divisor of bc and ab.
a divides bc is given. ab is a product of a and b, therefore a divides ab. Therefore we know
that a is a common divisor of bc and ab, which means a divides b, which is indeed the
conclusion that we have been seeking, and hence, the theorem.

(Refer Slide Time: 05:54)

The next theorem, which we number theorem 2.3, is the one which talks about Euclid's
Algorithm. What this theorem says is this: for integers r0 and r1, both greater than 0, suppose
we apply the division algorithm repeatedly as follows:

(Refer Slide Time: 06:48)

The division algorithm is applied like this. We set variable i to 1 and then we have a do loop,
in which what we do is this. We form the ordered pair (r_{i+1}, q_{i+1}) using the division
algorithm applied on r_{i-1} and r_i, and then we increment i. We do this while r_i is not equal
to 0. So we apply the division algorithm in this sense.

(Refer Slide Time: 07:33)

First let me complete the statement of the theorem. After that we will revisit the division
algorithm. What the theorem says is this. GCD of r0 and r1 is r_{i-1}, which is the last
nonzero remainder of the process. In particular, integers x0 and y0 such that the GCD that
we find, namely r_{i-1}, is a linear combination of r0 and r1 with x0 and y0 as the coefficients,
can be obtained by writing each r_j, where j varies from 2 to i minus 1, as a linear combination
of r0 and r1.
(Refer Slide Time: 09:06)

Now let us take a look at the division algorithm. The algorithm begins with r0 and r1 and then
we find r2, in the sense that we write r0 as q2 r1 plus r2. That is, we find the unique
ordered pair (r2, q2) that we get when the division algorithm is applied on r0 and r1. So r2 is
the remainder of the division of r0 by r1. So now we have r1 and r2. Then what we do is to
recurse with r1 and r2. The division algorithm is next invoked on r1 and r2. Then we would get
an ordered pair (r3, q3); r3 would be the new remainder. Then we would recurse with r2 and
r3. When we continue like this, the remainders keep decreasing in value as we go
along. Finally, when we come to a stage where r_i is equal to 0, we come out of
the loop. At this point, the last non-zero remainder that we had, namely r_{i-1}, would be the
GCD of the two given numbers r0 and r1.

(Refer Slide Time: 10:51)

Let us work out an example first and then we will prove the theorem. In the example, we
consider two numbers, 512 and 432. Let us say we want to find the GCD of 512 and 432.
What we do is this. We find the remainder of 512 on division by 432. The remainder is 80.
Then we discard the larger number, which is 512, and recurse with 432 and 80. That is, to
find the GCD of 512 and 432, it is enough to find the GCD of 432 and 80. 432 mod 80 is 32.

Therefore, it is enough to find the GCD of 80 and 32. 80 mod 32 is 16, since 64 plus 16 is 80.
So it is enough to find the GCD of 32 and 16, and 32 mod 16 is 0. At each step we recurse
with the smaller of the two numbers and the new remainder. So the ordered pair (512, 432)
reduces to the ordered pair (432, 80), which then reduces to the ordered pair (80, 32), which
then reduces to (32, 16), which reduces to (16, 0). The GCD of 16 and 0 is 16, and this
would be the GCD of the original pair of numbers.
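
The repeated division described above can be sketched in a few lines. This is a minimal illustration, assuming positive integer inputs as in the theorem; the function name `euclid_steps` and the shape of the recorded steps are choices made here, not something taken from the slides.

```python
def euclid_steps(r0, r1):
    """Apply the division algorithm repeatedly to (r0, r1).

    Returns the last nonzero remainder (the GCD) together with the list of
    steps, each recorded as (dividend, divisor, quotient, remainder).
    """
    steps = []
    prev, cur = r0, r1
    while cur != 0:
        q, r = divmod(prev, cur)      # prev = q * cur + r, with 0 <= r < cur
        steps.append((prev, cur, q, r))
        prev, cur = cur, r            # recurse on (divisor, remainder)
    return prev, steps


g, steps = euclid_steps(512, 432)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {q} * {divisor} + {r}")
print("GCD:", g)
```

Run on 512 and 432, the recorded pairs follow the same chain as the worked example: (512, 432), (432, 80), (80, 32), (32, 16).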

(Refer Slide Time: 12:57)

The theorem also says how the GCD 16 can be expressed as a linear combination of 512 and
432 with integer coefficients. If you go through the steps, you find that in the first step, we
take the remainder of the division of 512 by 432. Therefore we have 512 is 1 into 432 plus 80,
or 80 is 1 into 512 minus 1 into 432. Let us call this equation 1.

(Refer Slide Time: 13:59)

And then in the next step we have a division of 432 by 80. We find that 432 is 5 into 80
plus 32, which means 32 is 1 into 432 minus 5 into 80. Let us call this equation 2. The next
step is on the ordered pair 80 and 32. 80 is 2 into 32 plus 16, or in other words 16 is 1 into 80
minus 2 into 32. Let us say this is equation 3. So, 16 can be expressed as a linear combination
of 80 and 32 with coefficients 1 and minus 2 respectively. But, then, 32 can be expressed as a
linear combination of 432 and 80 with 1 and minus 5 as the coefficients.

(Refer Slide Time: 15:13)

So, 16 is 1 into 80 minus 2 into 32. But 32, in turn, is 1 into 432 minus 5 into 80 by equation 2.
Substituting, 16 is 1 into 80 minus 2 into 432 plus 10 into 80, since minus 2 into minus 5 is 10.
10 plus 1 is 11, so this simplifies to 11 into 80 minus 2 into 432. So 16 is expressible as a
linear combination of 80 and 432 with 11 and minus 2 as the respective coefficients. And then
80, we know, is 1 into 512 minus 1 into 432 from the first equation. Substituting that here, we
find that 16 is 11 into 512 minus 11 into 432 minus 2 into 432, which is 11 into 512 minus 13
into 432. So, now 16 has been expressed as a linear combination of 512 and
432 with 11 and minus 13 as the coefficients. So this is what the theorem talks about.
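
The back-substitution above can also be carried forward step by step, keeping for each remainder its coefficients with respect to the two original numbers. The following is a sketch of that idea (commonly called the extended Euclidean algorithm); the function name `extended_gcd` and the variable names are assumptions of this example.

```python
def extended_gcd(r0, r1):
    """Return (g, x0, y0) with g = GCD(r0, r1) and g == x0 * r0 + y0 * r1.

    Each remainder is tracked together with its coefficients as a linear
    combination of the original r0 and r1.
    """
    prev, cur = r0, r1
    x_prev, y_prev = 1, 0   # r0 = 1 * r0 + 0 * r1
    x_cur, y_cur = 0, 1     # r1 = 0 * r0 + 1 * r1
    while cur != 0:
        q = prev // cur
        prev, cur = cur, prev - q * cur
        x_prev, x_cur = x_cur, x_prev - q * x_cur
        y_prev, y_cur = y_cur, y_prev - q * y_cur
    return prev, x_prev, y_prev


g, x0, y0 = extended_gcd(512, 432)
print(g, x0, y0)   # 16 11 -13
```

For 512 and 432 this reproduces the coefficients 11 and minus 13 obtained by hand.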

(Refer Slide Time: 16:42)

To prove the theorem, what we do is this. We know that by theorem 2.1, GCD of (r0, r1) is the
same as GCD of r1 and r0 minus r1 q2. We use the integer minus q2 as the multiplier here. This
is GCD of (r1, r2), because the second term, r0 minus r1 q2, is nothing but r2. So GCD
of (r0, r1) is indeed GCD of (r1, r2). If you continue like this, you can show that this is the
same as GCD of (r2, r3), which is the same as GCD of (r3, r4) and so on, until we come to
GCD of (r_{i-1}, r_i), where r_i is 0. But GCD of r_{i-1} and 0, for nonzero r_{i-1}, is nothing
but r_{i-1}. So GCD of (r0, r1) is r_{i-1}, as the theorem claims.

(Refer Slide Time: 18:14)

Now do the rest of the proof as an exercise. That is, by induction show that r_j is a linear
combination of r0 and r1, for all j varying from 2 to i minus 1. So in particular r_{i-1}, which is
the result of the algorithm, that is, the GCD of r0 and r1, is also a linear combination of r0
and r1. So I leave this to you as an exercise. So, we have been talking about common divisors
and greatest common divisors.

(Refer Slide Time: 19:04)

Now let us talk about common multiples. For integers a and b, we know that a divides ab and
b divides ab. So, ab is a multiple of a, and ab is a multiple of b, which means ab is a common
multiple of a and b. So, a and b do have common multiples. The least positive of the common
multiples of a and b is called the least common multiple, or LCM of a and b. So, for any pair
of integers a and b, we can define the least common multiple of a and b.

(Refer Slide Time: 20:25)

So we have a theorem for this notion, which we number 2.4. Let small l denote LCM of (a, b)
for integers a and b. If c is a common multiple of a and b, then l divides c. Also, 0, plus or
minus l, plus or minus 2l, plus or minus 3l, etcetera, that is, the integer multiples of l, are all
common multiples of a and b, and these are the only common multiples of a and b.

So what we assert here is this. If l is the LCM of a and b, and c is a common multiple of a and
b, then l divides c. In other words the LCM divides every other common multiple. So
common multiples are all multiples of LCM.

(Refer Slide Time: 21:45)

So we prove as follows. Let c be an arbitrary common multiple of a and b. Then by the
division algorithm, we know that there is a unique pair (r, q),
such that c equals ql plus r, for 0 less than or equal to r less than l. Suppose this r is nonzero.
Now, a divides l and a divides c. a divides l because l is the LCM of a and b, so l in particular
is a common multiple of a and b. a divides c because by definition c is a common
multiple of a and b. Similarly, b divides l and b divides c. If a divides l and a divides c, then a
must divide r as well, because r is c minus ql. So a divides r and similarly b divides r.
In other words, r is a common multiple of both a and b. But r is nonzero and lies between 1
and l minus 1, so r is a positive common multiple that is less than l, or in other words, l is not
the LCM of a and b as we assumed. So this is a contradiction. Therefore, it must be that the
assumption we made, namely that r is not equal to 0, is false.

(Refer Slide Time: 24:04)

So we conclude that r is equal to 0. But c is ql plus r, which is now ql, since r is equal to 0. In
other words, l divides c. So, c is a multiple of the LCM. So every common multiple of a and
b, is among these as the theorem claims.
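
Theorem 2.4 is easy to observe on a small case. The sketch below finds the LCM directly from its definition (the least positive common multiple, located by linear search) and lists the positive common multiples up to a bound; the numbers 4 and 6 and the bound 60 are arbitrary choices for illustration, and the helper names are hypothetical.

```python
def least_common_multiple(a, b):
    """Least positive common multiple of positive integers a and b, by search."""
    c = max(a, b)
    while c % a != 0 or c % b != 0:
        c += 1
    return c


def common_multiples(a, b, limit):
    """All positive common multiples of a and b that do not exceed limit."""
    return [c for c in range(1, limit + 1) if c % a == 0 and c % b == 0]


l = least_common_multiple(4, 6)
multiples = common_multiples(4, 6, 60)
every_one_is_a_multiple_of_l = all(c % l == 0 for c in multiples)
```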

(Refer Slide Time: 24:47)

Another theorem about LCM is that, for every positive integer d, LCM of (bd, cd) is LCM of
b and c multiplied by d.

(Refer Slide Time: 25:25)

So to prove this, let us define small l as the LCM of b and c, and capital L as the LCM of bd
and cd. So we have these two numbers small l and capital L. Then bd divides ld and cd
divides ld. l is the LCM of b and c, so in particular l is a multiple of b. Therefore ld is a multiple
of bd. Similarly ld is a multiple of cd as well. So ld is a common multiple of bd and
cd. Capital L is the LCM of bd and cd, and we have just shown that the LCM divides every
common multiple. Therefore capital L divides small ld.

(Refer Slide Time: 26:38)

Also, b divides the number L by d, and c divides L by d. That is because capital L is a
multiple of bd and capital L is a multiple of cd. Therefore b divides the number L by d, and L by
d is an integer. So L by d is a common multiple of b and c. In other words the LCM of b and
c, namely small l, divides L by d, which implies that small ld divides capital L. So we have
just shown that capital L divides small ld and small ld divides capital L. Since both are
positive, it must be that capital L is the same as small ld, which is precisely what we want to
show. We wanted to claim that LCM of bd and cd, which is capital L, is the same as d times
LCM of b and c, which is d times small l, and hence, the theorem.

(Refer Slide Time: 27:58)

Let's see another theorem dealing with GCD and LCM, which we name theorem 2.6. This
asserts that, for any pair of integers a and b, the GCD of a and b multiplied by the LCM of a
and b is the absolute value of ab, the product of a and b. To prove this, first we assume that
both a and b are positive. We consider the case of a and b being co-prime first. Then GCD of
a and b must be 1: they are relatively prime, so they do not have a common factor other than
1. Let the LCM of a and b be d times a, for some positive integer d. da is the least common
multiple of a and b, so in particular da is a multiple of b, or in other words b divides da. But
b and a are co-prime, so GCD of b and a is 1.

By theorem 2.2, this implies that b divides d. So b is less than or equal to d, and hence ba is
less than or equal to da. But ba is greater than 0, because both a and b are positive, and ba is
a common multiple of a and b. da is the LCM of a and b, as we assumed, so it is the least
positive common multiple. Therefore ba must be greater than or equal to da. So we have
these two inequalities here: ba is less than or equal to da, and ba is greater than or equal to
da.

(Refer Slide Time: 30:41)

Combining these two, we have da equal to ba, which means the LCM of a and b is a into b,
which is the same as the magnitude of ab, since a and b are both positive. So this is the case
where a and b are co-prime.

(Refer Slide Time: 31:10)

So now let us assume that a and b are not co-prime, which means they have a common
divisor greater than 1. Let g be the GCD of a and b. So g is greater than 1.
From the theorem that we saw in the last class, the GCD of a by g and b by g is 1. So by the
co-prime case, we know that the GCD of (a/g, b/g) multiplied by the LCM of (a/g,
b/g) is a by g into b by g. Multiplying both sides by g squared, we can write the equation in
this manner: g into GCD of (a/g, b/g), multiplied by g into LCM of (a/g, b/g), is a into b. But
the first factor is nothing but GCD of (a, b), and the second is nothing but LCM of (a, b).
Therefore, the product of the two, GCD of (a, b) multiplied by LCM of (a, b), is a into b. Once
again, we assume that a and b are positive. Therefore this is the same as the magnitude of ab.

(Refer Slide Time: 32:54)

So the only remaining case is when a or b is negative. In that case, GCD of (a, b) is GCD of
(mod a, mod b), and LCM of (a, b) is LCM of (mod a, mod b), where mod a denotes the
absolute value of a. Now, mod a and mod b are both positive integers. Therefore, by what we
have already proved, the product of GCD of (mod a, mod b) and LCM of (mod a, mod b) would
be mod a into mod b, which is the same as the magnitude of ab. The left hand side here is
GCD of (a, b) multiplied by LCM of (a, b), and the right hand side is mod ab, and that's exactly
what we wanted to prove. So, that is it from this lecture. Hope to see you in the next. Thank
you.
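
Theorem 2.6 can be spot-checked numerically. In this sketch the LCM is deliberately computed from its definition (least positive common multiple, by search) rather than from the formula being tested; the range of test values is arbitrary and the function names are hypothetical.

```python
from math import gcd


def lcm_by_search(a, b):
    """Least positive common multiple of nonzero integers a and b, by search."""
    a, b = abs(a), abs(b)
    c = max(a, b)
    while c % a != 0 or c % b != 0:
        c += 1
    return c


def theorem_2_6_holds(a, b):
    """Check GCD(a, b) * LCM(a, b) == |a * b| for one nonzero pair."""
    return gcd(a, b) * lcm_by_search(a, b) == abs(a * b)


# Nonzero integers only; negative values exercise the last case of the proof.
nonzero = [n for n in range(-12, 13) if n != 0]
all_hold = all(theorem_2_6_holds(a, b) for a in nonzero for b in nonzero)
```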

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Lecture 25 - Prime numbers

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the third lecture on number
theory. Today, we study prime numbers.

(Refer Slide Time: 00:40)

An integer p greater than 1 is called a prime if 1 and p are the only
positive divisors it has.

(Refer Slide Time: 1:35)

An integer greater than 1 that is not a prime is called a composite number. So the prime
numbers are 2, 3, 5, 7, 11, 13, 17, and so on.

(Refer Slide Time: 1:55)

So, let us see a theorem that we call theorem 3.1. The theorem says that every integer n greater
than 1 is a product of one or more primes. The proof is easy. Consider the number n greater
than 1. If n is a prime, then n alone forms the product that we look for. The theorem states
that n can be written as a product of one or more primes. In this case, n is a prime, so n
is a product on its own.

(Refer Slide Time: 2:55)

So, if n is a composite, then by definition n is n1 into n2, where n1 and n2 are both less than n
and greater than 1. Now, let us inductively assume that n1 and n2 are products of primes.
Then, n is a product of those products. Therefore, n is also representable as a product of
primes. So, either way, every integer greater than 1 can be expressed as a product of one or
more primes.

(Refer Slide Time: 3:55)

Another theorem, which we call theorem 3.2. For prime p, and integers a and b, if p divides
the product ab, then either p divides a or p divides b. The proof is easy again. If p divides ab,
but p does not divide a, then p and a are co-prime, since the only positive divisors of p are 1
and p. So by theorem 2.2, which we saw in the last class, p divides b. Therefore, we
have that if p does not divide a, then p divides b, which is equivalent to saying that either p
divides a or p divides b. Hence, the theorem.

(Refer Slide Time: 5:07)

Extending this, we can say the following, which we call theorem 3.3. If p is a prime and p
divides the product a1, a2 to an, where a1, a2 to an are all integers, then p divides a1 or p
divides a2 or p divides a3 and so on. p should divide one of those integers. We can prove this
using induction. Theorem 3.2 will form the basis. For n
greater than 2, let a equal a1 and b equal the product a2 to an in theorem 3.2. Then we have
that either p divides a1 or p divides b, which is a2 to an. Now, by the induction hypothesis, if p
divides a2 to an, since it is a product of a smaller number of integers, we can say that p
divides a2 or p divides a3 and so on. Therefore, putting these together, we have either p
divides a1 or p divides a2 and so on, and the theorem follows.

(Refer Slide Time: 6:23)

The next theorem is a famous one. This is called the fundamental theorem of arithmetic.
What it says is that every integer n greater than 1 has a unique prime factorization, a unique
canonical prime factorization. But what is a canonical prime factorization?

(Refer Slide Time: 7:19)

If n is expressed as a product of this form, p1 power e1 into p2 power e2 etcetera up to some
prime pn power en, where p1, p2, etcetera are all primes, e1, e2, etcetera are non-negative
integers, and p1 is the smallest prime, p2 is the next prime, p3 is the next prime and so on, so
that p1 is 2, p2 is 3, p3 is 5 and so on, then we say that this is
a canonical prime factorization of n. The primes in this product appear in increasing order.

(Refer Slide Time: 8:24)

For example, 120 is 2 power 3 into 3 into 5. So, the primes here are p1 equal to 2, p2 equal to
3 and p3 equal to 5, e1 is 3, we have 2 power 3, e2 equal to 1 and e3 equal to 1.
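
A small routine can produce such a factorization by trial division. This is a sketch under one simplifying assumption: primes with exponent 0 are simply omitted from the output, whereas the canonical form described in the lecture lists every prime up to the largest one used. The function name `canonical_factorization` is made up for this example.

```python
def canonical_factorization(n):
    """Return the prime factorization of n > 1 as a list of (prime, exponent)
    pairs in increasing order of prime. Primes with exponent 0 are omitted."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever remains after trial division is itself a prime.
        factors.append((n, 1))
    return factors


print(canonical_factorization(120))   # [(2, 3), (3, 1), (5, 1)]
```

Trial division is used here only because it is short; it is not meant as an efficient factoring method.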

(Refer Slide Time: 8:46)

Consider 2 power 3 into 3 power 1 into 7 power 1, which is equal to 168. In this canonical
prime factorization, we have p1 equal to 2, p2 equal to 3, p3
equal to 5 and p4 equal to 7. e1 is 3, e2 is 1, e3 in this case is 0, because 5 has an exponent of
0, and e4 is equal to 1. We do not consider primes which are greater than 7, because
the exponents of all of them are 0. Such a representation of numbers is called a canonical
prime factorization. So what the fundamental theorem of arithmetic says is that every integer
greater than 1 has a unique canonical prime factorization. So, we will prove this in this
manner:

(Refer Slide Time: 9:44)

Suppose a positive integer n has two distinct canonical prime factorizations. Then n is p1
power e1 into p2 power e2 etcetera up to pn power en, which is also q1 power f1 into q2 power
f2 and so on up to qm power fm. Let us consider this
equation. On either side of the equality we have a product. So let us cancel
the common factors from both sides. Since these two prime factorizations are distinct,
everything will not cancel out.

(Refer Slide Time: 10:50)

So finally, we will be left with some k primes r1 through rk on the left side and some l primes
s1 through sl on the right side. There will now be no common prime on the left side and the
right side: every prime on the left side will be distinct from the primes on the right side. Now,
in particular, consider r1 and the right-hand side, the product s1 through sl. We know that r1
divides s1 through sl. That is because r1 multiplied by r2 through rk is s1 through sl, so there
is an integer such that r1 into that integer is the right-hand side. So r1 divides the right-hand
side. But then, by theorem 3.3, r1 divides s1 or r1 divides s2 and so on. It should divide at
least one of the primes on the right-hand side. Now, r1 is a prime, and if r1 divides s1, which
is also a prime, then r1 is equal to s1. So r1 must equal one of the primes on the right-hand
side, which is a contradiction, because we have already canceled all primes that appear on
both sides.

(Refer Slide Time: 12:26)

Therefore, the two prime factorizations that we began with, cannot be distinct. In other
words, every number n has a unique canonical prime factorization. Now, this is the case with
integers. But in every system this need not be the case.

(Refer Slide Time: 12:50)

In particular, let us consider the system of even non-negative integers. So we consider these
numbers. In this, we say that, a number is prime if it cannot be expressed as the product of
two numbers in the system. So let us call this system E. So, this is a system of even non-
negative integers. So, a number in this system will be considered a prime if it cannot be
expressed as the product of two numbers in E.

(Refer Slide Time: 13:49)

For example, 50. We will say that 50 is a prime in the system. If you factorize 50, you have
the various factorizations. 1 into 50: but 1 is an odd number, so this is not a
product of two even numbers. Then we have 2 into 25: 25 is an odd number, so this also does
not qualify. Then we have 5 into 10: 5 is an odd number, so this does not qualify. The rest are
all the same factorizations: 10 into 5, 25 into 2, and 50 into 1. So, 50 cannot be
expressed as the product of two even numbers. Therefore, 50 is a prime in this system.

(Refer Slide Time: 14:36)

But 12 is not a prime. That is because, 12 belongs to E and 12 can be expressed as 2 into 6,
where 2 is an even number and 6 is also an even number. Therefore 12 is not a prime, 12 is a
composite within this system.

(Refer Slide Time: 15:04)

But then, consider the number 100. 100 can be expressed as 10 into 10, and 10 belongs to E. So
we are now expressing 100 as a product of two numbers, both of which are even and smaller than
100. But 100 is also 2 into 50, where 2 belongs to E and 50 belongs to E. Here 2, 10 and 50
are all primes in E, since none of them is a product of two even numbers. So 100 has two prime
factorizations, two canonical prime factorizations within the system. Canonical, because here
the prime numbers are appearing in increasing order. Therefore, within this system a number
need not have a unique prime factorization.
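
The example of the system E can be checked by brute force. The following is a minimal sketch, assuming we only look at positive even numbers; the function names is_e_prime and e_factorizations are made up for this illustration and do not come from the lecture.

```python
def is_e_prime(n):
    """True if the positive even number n is not a product of two even numbers."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    # Try every even candidate divisor d with an even cofactor n // d.
    return not any(n % d == 0 and (n // d) % 2 == 0 for d in range(2, n, 2))


def e_factorizations(n, smallest=2):
    """All ways to write n as a non-decreasing product of primes of E."""
    results = []
    if n >= smallest and is_e_prime(n):
        results.append([n])
    for d in range(smallest, n, 2):
        if d * d > n:
            break
        if n % d == 0 and (n // d) % 2 == 0 and is_e_prime(d):
            for rest in e_factorizations(n // d, d):
                results.append([d] + rest)
    return results


print(is_e_prime(50))         # 50 is a prime of E
print(is_e_prime(12))         # 12 = 2 * 6 is composite in E
print(e_factorizations(100))  # two different canonical factorizations
```

Listing the factors in non-decreasing order is what makes each factorization canonical, so two different lists really are two different factorizations.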

(Refer Slide Time: 15:58)

As another example, consider a set of complex numbers, which we call C. We consider all
complex numbers of the form a plus i root 6 b, where a and b are integers. Now, C is closed
under addition and multiplication. That is because a plus i root 6 b, plus c plus i root 6 d,
is a plus c, plus i root 6 into b plus d. Therefore, C is closed under addition and…

(Refer Slide Time: 16:54)

In the case of multiplication, we have a plus i root 6 b into c plus i root 6 d, which is ac
minus 6 bd, the real part, plus i root 6 into ad plus bc. Since a, b, c and d are integers, ac
minus 6 bd is also an integer, and ad plus bc is also an integer. So, here we express the
product in the a plus i root 6 b form again. Therefore, this is also a member of C. So C is
closed under multiplication as well.

(Refer Slide Time: 17:45)

Let us define the norm of one such number, as a squared plus 6 b squared, the square of its
absolute value. 0, 1 and minus 1 are the only members of C, with a norm of value less than or
equal to 1.

(Refer Slide Time: 18:29)

We say that, in this system, a number, a plus i root 6 b is a composite, in other words, not a
prime, if it can be expressed as the product of two members of C of norm greater than 1.

(Refer Slide Time: 19:10)

So you can verify that the norm of a product of two members of C is equal to the product of
the norms. In other words, for two members n1 and n2 of C, the norm of n1 into n2 is the norm
of n1 into the norm of n2. So if you consider a composite number, it factorizes into factors of
smaller norm, since both factors have a norm greater than 1, and the norm of a non-zero member
is always an integer greater than 0.

(Refer Slide Time: 20:08)

And a proper complex number in C, which means, in this, b is not equal to 0 has a norm
greater than or equal to 6. So even if b equal to 1, the i root 6 b part will contribute 6 to the
norm. So norm will be greater than or equal to 6.

(Refer Slide Time: 20:47)

So, in this system, 5 is a prime. That is because 5 is a prime among the integers, so it is
not a product of two integers of norm greater than 1. A real factor multiplied by a proper
complex factor cannot give 5 either, because such a product has a non-zero imaginary part. So
if 5 has factors n1 and n2, then n1 and n2 are both proper complex numbers. Which means the
norm of n1 is greater than or equal to 6 and the norm of n2 is also greater than or equal to
6. But the norm of 5 is 25, and it must equal the product of the two norms. Therefore, we have
that 25 is greater than or equal to 6 into 6, which is 36, which is a contradiction. Therefore,
5 cannot be expressed as n1 into n2, where n1 and n2 are members of C of norm greater than 1.
Therefore, 5 is a prime. That means there are prime numbers in the system.

(Refer Slide Time: 22:01)

Now, consider 10. 10 can be expressed as 2 into 5. Now, 2 belongs to the system and 5 also
belongs to the system. So 10 has a norm of 100, 2 has a norm of 4 and 5 has a norm of 25.
But 10 can also be expressed as the product of 2 plus i root 6 and 2 minus i root 6. 2 plus i
root 6 has a norm of 10 and 2 minus i root 6 also has a norm of 10. You can check that 2, 5,
2 plus i root 6 and 2 minus i root 6 are all primes in this system. So now we find that 10
has two prime factorizations. So within this system again, there are numbers with multiple
prime factorizations. But then, what we find is that, within the system of integers,
there is unique prime factorization. That is what the fundamental theorem of arithmetic says.
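
The two factorizations of 10 can be verified with a few lines of code. This is only a sketch, assuming a member a plus i root 6 b of C is stored as the pair (a, b); the helper names mul, norm, elements_of_norm and is_prime_in_C are assumptions of this example.

```python
from math import isqrt


def mul(x, y):
    """Product of a + i*sqrt(6)*b and c + i*sqrt(6)*d, stored as pairs."""
    a, b = x
    c, d = y
    return (a * c - 6 * b * d, a * d + b * c)


def norm(x):
    """Norm a^2 + 6*b^2, the square of the absolute value."""
    a, b = x
    return a * a + 6 * b * b


def elements_of_norm(n):
    """All members of C whose norm is exactly n."""
    r = isqrt(n)
    return [(a, b) for a in range(-r, r + 1) for b in range(-r, r + 1)
            if a * a + 6 * b * b == n]


def is_prime_in_C(x):
    """True if x is not a product of two members of norm greater than 1."""
    n = norm(x)
    if n <= 1:
        return False
    # Norms multiply, so the norms of two factors must multiply to n.
    for d in range(2, n):
        if n % d == 0:
            for u in elements_of_norm(d):
                for v in elements_of_norm(n // d):
                    if mul(u, v) == x:
                        return False
    return True


print(mul((2, 0), (5, 0)))    # 2 * 5
print(mul((2, 1), (2, -1)))   # (2 + i root 6)(2 - i root 6)
print([is_prime_in_C(x) for x in [(2, 0), (5, 0), (2, 1), (2, -1)]])
```

The search in is_prime_in_C relies on the fact stated above, that the norm of a product is the product of the norms, so only factor pairs whose norms multiply to the norm of x need to be tried.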

(Refer Slide Time: 22:52)

Now, let exp of (a, p) denote the exponent of prime p in the prime factorization of a. Since
the prime factorization is unique, this is well defined. exponent of (a, p) is well defined.
Therefore, number a can be expressed as the product over all primes p, of p power exp of (a,
p). Every integer a can be expressed as a product in this fashion.

(Refer Slide Time: 23:43)

Now, let us say integer c is integer a multiplied by integer b. Then c is the prime factorization
of a, which is this product multiplied by the prime factorization of b. So, in the prime
factorization of c, the exponent of p is going to be the sum of the exponents of p in the prime
factorizations of a and b respectively.

(Refer Slide Time: 24:40)

So, it is easy to see that the GCD of a and b is the product over all primes p, of p power min
of exp (a, p) and exp (b, p). Similarly, the LCM of a and b is the product over all primes p,
of p power max of exp (a, p) and exp (b, p).

(Refer Slide Time: 25:32)

As an example, consider 1260, which can be prime factorized into 2 power 2 multiplied by 3
power 2 multiplied by 5 multiplied by 7, and consider the number 3000, which is 2 power 3
multiplied by 3 multiplied by 5 power 3. For uniformity, let us include 7 here with an
exponent of 0. Then to find the GCD, we have to take the respective minima of the exponents.
So in 1260, the exponent of 2 is 2. In 3000, the exponent of 2 is 3. The minimum exponent here
is 2. Therefore, in the case of GCD, we have to take 2 as the exponent of 2. For 3, the
exponent is the smaller of 2 and 1, which is 1. For 5 it is 1 and for 7 it is 0. That gives
4 into 3 into 5, that is 60. So the GCD of 1260 and 3000 is 60.

(Refer Slide Time: 26:44)

And then the LCM can be obtained by taking the larger of the exponents. So the larger exponent
of 2 among these two numbers is 3. The larger exponent of 3 among these two numbers is 2.
For 5 it is 3 and for 7 it is 1. So, it is 8 into 9 into 125 into 7, that is 1000 into 63, 63000. So,
this is the LCM of 1260 and 3000.
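
The min and max rule for the exponents can be tried out directly. Here is a small sketch, assuming trial division is good enough for numbers of this size; the names factorize and gcd_lcm are chosen for this illustration only.

```python
from collections import Counter


def factorize(n):
    """Map each prime p to exp(n, p), using trial division."""
    exps = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:
        exps[n] += 1
    return exps


def gcd_lcm(a, b):
    """GCD and LCM from the minima and maxima of the prime exponents."""
    ea, eb = factorize(a), factorize(b)
    g = l = 1
    for p in set(ea) | set(eb):
        # A Counter returns 0 for a prime that does not divide the number.
        g *= p ** min(ea[p], eb[p])
        l *= p ** max(ea[p], eb[p])
    return g, l


print(dict(factorize(1260)), dict(factorize(3000)))
print(gcd_lcm(1260, 3000))
```

A prime missing from one factorization is treated as having exponent 0, which is the same device as including 7 in 3000 for uniformity.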

(Refer Slide Time: 27:30)

We say that an integer is a square, if it can be written as n squared for some integer n, and we
say that an integer n is square-free, if 1 is the largest square dividing it.

(Refer Slide Time: 28:24)

Consider 210 for example. 210 is 42 into 5, which is 6 into 7 into 5 or, can be written as 2
into 3 into 5 into 7, the canonical prime factorization. The exponent is 1 everywhere. So, here
we find that, it does not have a square factor. So, then it is immediately clear that a number is
square-free if and only if every exponent in its prime factorization is either 0 or 1. So in this
case, 2, 3, 5, 7 are the prime numbers with an exponent of 1. Every larger prime number has
an exponent of 0.

(Refer Slide Time: 29:39)

Number 12 is not square-free. Its prime factorization is 2 power 2 into 3 power 1, that is, 4
into 3. So, 2, in this case, has an exponent greater than 1. So, 12 is not square-free. In
particular, the square 4 divides 12.
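
The definition of square-free can be tested directly by looking for a square divisor larger than 1. The function name is_square_free below is an assumption of this sketch.

```python
def is_square_free(n):
    """True if 1 is the largest square dividing the positive integer n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False  # the square d*d divides n
        d += 1
    return True


print(is_square_free(210))  # 2 * 3 * 5 * 7, every exponent is 1
print(is_square_free(12))   # divisible by the square 4
```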

(Refer Slide Time: 30:01)

The next theorem is called Euclid’s theorem. What it says is that the number of primes is
infinite. That is, we can keep on finding ever-larger primes. The proof goes like this. Suppose
not. Suppose, the number of primes is finite. In which case let us say p1 through pr are the
primes. So there are only r primes and when they are listed in increasing order they are p1
through pr. So pr is the largest prime, let us say.

(Refer Slide Time: 30:59)

So, let us consider the integer n, which is the product of p1 through pr, plus 1. Then we find
that p1 does not divide n, because n is 1 mod p1. That is, when n is divided by p1, we would
get a remainder of 1. Similarly, p2 also does not divide n. p2 divides the product p1 through
pr, so n is 1 mod p2 as well. Similarly, none of these primes would divide n. So primes p1
through pr do not divide n. Now n is larger than 1, so either n is itself a prime or n has a
prime factor. If n is a prime, that is a contradiction, because n is larger than pr and we
assumed that p1 through pr are the only primes. Otherwise, n has a prime factor other than p1
through pr, which again is a contradiction, because we have assumed that p1 through pr are the
only primes available. So either way, we are finding a prime which is not in our list, which
is a contradiction. Therefore, there is no largest prime. We can keep on finding larger
primes. But then, how dense are the primes?
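
The construction in the proof can be tried on an actual list of primes. The sketch below assumes, only for illustration, that 2, 3, 5, 7, 11 and 13 were all the primes; the helper smallest_prime_factor is a name made up here.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime dividing n (n itself when n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


primes = [2, 3, 5, 7, 11, 13]
n = prod(primes) + 1

print(n)
print([n % p for p in primes])   # every remainder is 1
print(smallest_prime_factor(n))  # a prime that is not in the list
```

Note that n itself need not be a prime. The proof only needs that every prime factor of n lies outside the assumed list.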

(Refer Slide Time: 32:41)

The next theorem says that there are arbitrarily long gaps in the series of primes. In other
words, for any positive integer k, there exist k consecutive integers, all of which are
composite. If you consider a few initial primes, you find that the gap between them is not
much. So, the primes are quite dense in the smaller integers. But then, for integer k,
consider the sequence (k plus 1) factorial plus 2, (k plus 1) factorial plus 3, etcetera, up
to (k plus 1) factorial plus (k plus 1). So that is a sequence of k consecutive integers. And
we can see that all of them are composite.

That is because, if you consider the first number, (k plus 1) factorial plus 2, then 2
divides (k plus 1) factorial and 2 divides 2 too. Therefore 2 divides this number. When you
come to the second in the sequence, 3 divides (k plus 1) factorial and 3 divides 3. Therefore,
3 divides the sum as well. So this number is divisible by 3. When you come to the last of the
sequence, k plus 1 divides (k plus 1) factorial and k plus 1 as well. Therefore k plus 1
divides this sum. So in general, if you consider (k plus 1) factorial plus j, we find that j
divides (k plus 1) factorial plus j, for j ranging from 2 to k plus 1, and j is smaller than
the sum, so it is a proper factor. So these are k consecutive numbers, all of which are
composite. So, using this technique we can find arbitrarily large gaps in the series of
primes.
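
The run of composites can be generated and checked for small k. This is a sketch under the assumption that trial division is fast enough for the sizes involved; composite_run and is_prime are names introduced for this example.

```python
from math import factorial


def is_prime(n):
    """Primality by trial division, fine for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def composite_run(k):
    """The k consecutive integers (k+1)! + 2, ..., (k+1)! + (k+1)."""
    base = factorial(k + 1)
    return [base + j for j in range(2, k + 2)]


run = composite_run(5)
print(run)
print([is_prime(x) for x in run])  # no member of the run is prime
```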

(Refer Slide Time: 35:44)

I will not prove the next theorem. This is called the prime number theorem. What this
theorem states is that, limit of n tending to infinity, pi of n divided by n by log of n, equal to
1. Where, pi of n is the number of primes not larger than n. That is, the number of primes not
larger than n divided by n by the natural log of n tends to 1 as n tends to infinity. In other
words, pi of n, the number of primes not larger than n is approximately n by log n. So out of
the n numbers that we consider, 1 to n, approximately n by log n are primes.
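
Although the theorem is not proved here, its statement can be explored numerically. The following sketch counts primes with a simple sieve and prints the ratio of pi of n to n by log n; the function name prime_count and the chosen values of n are assumptions of this example, and a few data points are of course not a proof.

```python
from math import log


def prime_count(n):
    """pi(n), the number of primes not larger than n, using a sieve."""
    if n < 2:
        return 0
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    p = 2
    while p * p <= n:
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = 0
        p += 1
    return sum(sieve)


for n in (10**2, 10**3, 10**4, 10**5):
    pi_n = prime_count(n)
    # The ratio below tends to 1 as n grows, though only slowly.
    print(n, pi_n, pi_n / (n / log(n)))
```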

(Refer Slide Time: 37:06)

Next, we study congruences. We say that for integer m not equal to 0, if m divides a minus b
for integers a and b, then we say, a is b mod m or we say a is congruent to b modulo m. In
short, we write a is congruent to b mod m.

(Refer Slide Time: 38:03)

Let us see a theorem related to congruences. This theorem shows several properties of
congruences. The first property says that a is congruent to b mod m, b is congruent to a
mod m, and a minus b is congruent to 0 mod m are all equivalent statements. You find that all
these follow from the definition itself. If a is congruent to b mod m, then a minus b is
divisible by m. But if m divides a minus b, m divides b minus a as well, which is the negative
of it. If m divides b minus a, then we have b is congruent to a mod m. But then this can be
written as m divides a minus b minus 0. a minus b is an integer, when a and b are integers and
0 is an integer. Therefore, when we say m divides a minus b minus 0, what it means is that a
minus b is congruent to 0 mod m.

(Refer Slide Time: 39:31)

The second part of the theorem says, if a equal to b mod m and b equal to c mod m, then a is
congruent to c mod m. Once again it follows from the definition. If a is congruent to b mod
m, then a minus b is divisible by m. From the second assumption, we have m divides b minus
c. If m divides a minus b and b minus c, then m should divide their sum too, which is a minus
b plus b minus c. Which means m divides a minus c. If m divides a minus c, where a and c
both are integers, we have that a is congruent to c modulo m, which is the conclusion. Okay,
that is it from this lecture. Hope to see you in the next. Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering
Indian Institute of Technology, Guwahati
Lecture 4 - Number Theory

(Refer Slide Time: 00:34)

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the fourth lecture on
Number Theory.

(Refer Slide Time: 00:40)

In the last class, we were discussing congruences. We say that, for integer m not equal to
zero, and integers a and b, if m divides a minus b, then we say that a is congruent to b mod m.
We were looking at some properties of congruences. So, we will see some more properties.

(Refer Slide Time: 01:18)

One property is that, if a is congruent to b mod m, and c is congruent to d mod m, then ax
plus cy is congruent to bx plus dy mod m, for integers x and y.

(Refer Slide Time: 02:10)

So, to prove this, we start with our assumptions that a is congruent to b mod m, and c is
congruent to d mod m. It means a minus b is mk, for some integer k, and c minus d is mj, for
some integer j. Therefore, ax plus cy minus bx minus dy would be m into kx plus jy. kx plus
jy is an integer. Therefore, m divides ax plus cy minus bx minus dy.

(Refer Slide Time: 03:09)

Or in other words, ax plus cy is bx plus dy mod m, which is precisely what we seek to prove.

(Refer Slide Time: 03:21)

Another result is that, if a is congruent to b mod m, and c is congruent to d mod m, then ac
is congruent to bd mod m. If a is congruent to b mod m, then, let us say a is q1 m plus r1,
where r1 is the remainder. In that case b will also produce the same remainder. b would be
some q2 m plus r1. Let us say c is q3 m plus r2 and d is q4 m plus again r2. That is because c
and d are congruent mod m. Both of them will produce the same remainder. Therefore, if you
take ac, you find that ac would be r1 r2 mod m. Similarly, bd is also r1 r2 mod m. Every other
term of the product would be a multiple of m.

(Refer Slide Time: 04:53)

Therefore, ac is congruent to bd mod m, as is required. The third statement is that, if a is
congruent to b mod m, and d divides m, and d is greater than 0, then a is congruent to b mod
d. If d divides m, and m divides a minus b, which would be the case if a is congruent to b mod
m, then, by transitivity of divisibility, d divides a minus b, which means a is congruent to b
mod d, as is required.

(Refer Slide Time: 05:55)

If a is congruent to b mod m, then ac is congruent to bc mod mc, for any c greater than 0. So,
say a is qm plus r and b is q prime m plus r. The two produce the same remainder, and that is
why they are congruent to each other mod m. So, here 0 less than or equal to r less than m,
and then, ac is qmc plus rc and bc is q prime mc plus rc, which means ac is congruent to bc
mod mc. Both of them produce the same remainder rc when divided by mc. We have that 0 less
than or equal to rc less than mc if c greater than 0. So, if c greater than 0, we do have the
result that we want.
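
The properties of congruences seen so far can be checked on concrete numbers. This is a minimal sketch; the helper congruent and the particular values of a, b, c, d, m, x, y, the divisor and the scale are all picked arbitrarily for the illustration.

```python
def congruent(a, b, m):
    """True if m divides a - b, that is, a is congruent to b mod m."""
    return (a - b) % m == 0


a, b, c, d, m = 17, 5, 23, 11, 12   # a = b mod 12 and c = d mod 12
x, y = 3, -4
divisor, scale = 4, 5               # 4 divides 12, and 5 is greater than 0

print(congruent(a, b, m), congruent(c, d, m))        # the assumptions
print(congruent(a * x + c * y, b * x + d * y, m))    # linear combinations
print(congruent(a * c, b * d, m))                    # products
print(congruent(a, b, divisor))                      # modulus replaced by a divisor
print(congruent(a * scale, b * scale, m * scale))    # both sides and modulus scaled
```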

(Refer Slide Time: 07:32)

The next theorem says this. If f is a polynomial with integer coefficients, then for any two
integers a and b, and the non-zero integer m, if a is congruent to b mod m, then f of a is
congruent to f of b mod m. This follows from the previous theorems.

(Refer Slide Time: 08:44)

Let us say, f of x is C0 x power n plus C1 x power n minus 1 so on up to Cn. C0, C1, C2
etcetera are all integers. If a congruent to b mod m, then from the previous theorem, we know
that a squared is congruent to b squared mod m, a cube is congruent to b cube mod m and so
on. a power n congruent to b power n mod m.

(Refer Slide Time: 09:33)

Then Cn-1 a is congruent to Cn-1 b mod m. Since Cn-1 is constant, Cn-1 a is congruent to Cn-1 b.
Cn-2 a squared is congruent to Cn-2 b squared, since Cn-2 is an integer, and so on. Therefore,
adding all of them together, we have f of a is congruent to f of b mod m, the desired result.
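
As a quick check of this theorem, one can evaluate a polynomial with integer coefficients at two congruent inputs. The polynomial, the inputs and the function name evaluate below are assumptions chosen for the example.

```python
def evaluate(coeffs, x):
    """Value of C0*x^n + C1*x^(n-1) + ... + Cn, computed by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


coeffs = [2, -3, 0, 7]   # f(x) = 2x^3 - 3x^2 + 7
a, b, m = 4, 15, 11      # 4 is congruent to 15 mod 11

fa, fb = evaluate(coeffs, a), evaluate(coeffs, b)
print(fa, fb, (fa - fb) % m == 0)
```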

(Refer Slide Time: 10:31)

For integers a, m, x, and y, ax is congruent to ay mod m if and only if x is congruent to y
modulo m divided by GCD of (a, m). That is, if you choose to cancel a from either side of a
congruence, then m will have to be divided by the GCD of a and m.

(Refer Slide Time: 11:26)

For example, 150 is congruent to 80 mod 14. So, if you divide both sides by 10, we have 15
congruent to 8. But then, 14 will have to be divided by the GCD of 10 and 14, where 10 is the
number that we are seeking to cancel. The GCD of 10 and 14 is 2. Therefore we will have to
replace 14 with 7, and 15 is congruent to 8 mod 7, which is indeed the case.

(Refer Slide Time: 12:18)

So, how do we prove the theorem? Let us say ax is congruent to ay mod m. This is if and
only if ay minus ax is m into z, for some integer z. Then, both sides of the equation can be
divided by GCD of (a, m). So this is if and only if m divided by GCD of (a, m) divides the
left-hand side, which is a divided by GCD of (a, m), multiplied by y minus x. Now, m by GCD of
(a, m) and a by GCD of (a, m) are relatively prime, their GCD being 1. Therefore, since m by
GCD of (a, m) is relatively prime to the first factor here, it should divide the second
factor.

(Refer Slide Time: 13:58)

Therefore, this is if and only if m divided by GCD of (a, m) divides y minus x. But this is
precisely the condition for x being congruent to, y mod m divided by GCD of (a, m), as is
required in the theorem, and hence, the theorem.
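
The cancellation rule can be checked both on the example with 150 and 80 and by brute force over small values. The helper names congruent and cancellation_rule_holds, and the search limit, are assumptions of this sketch.

```python
from math import gcd


def congruent(a, b, m):
    """True if m divides a - b."""
    return (a - b) % m == 0


def cancellation_rule_holds(limit):
    """Check ax = ay (mod m) iff x = y (mod m / gcd(a, m)) for small values."""
    for a in range(1, limit):
        for m in range(1, limit):
            reduced = m // gcd(a, m)
            for x in range(-limit, limit):
                for y in range(-limit, limit):
                    if congruent(a * x, a * y, m) != congruent(x, y, reduced):
                        return False
    return True


print(congruent(150, 80, 14))       # the starting congruence
print(congruent(15, 8, 14))         # cancelling 10 but keeping 14 fails
print(congruent(15, 8, 14 // gcd(10, 14)))
print(cancellation_rule_holds(12))
```

The second line of output shows why the modulus has to change: 15 and 8 are not congruent mod 14, only mod 7.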

(Refer Slide Time: 14:34)

As a corollary we find that, if a, m, x, y are integers such that GCD of (a, m) is 1, that is,
a and m are relatively prime, then ax is congruent to ay mod m if and only if x is congruent
to y mod m. So, this is when a can be cancelled from each side of the congruence without
affecting the modulus. The cancelled number should be relatively prime to the modulus.

(Refer Slide Time: 15:32)

The next theorem says that, for integers x, y, m1 through mr, x is congruent to y mod mi for
every i from 1 to r, if and only if x is congruent to y mod LCM of m1 through mr. Here m1
through mr are non-zero integers. If x is congruent to y modulo each of them, then x is
congruent to y modulo the LCM of these numbers, and the other way round.

(Refer Slide Time: 16:49)

To prove this, suppose x is congruent to y modulo mi, for each mi. Then mi divides y
minus x for each i. That means y minus x is a common multiple of m1 through mr. But the LCM
of m1 through mr must divide every common multiple of m1 through mr. So the LCM divides y
minus x. That is, x is congruent to y mod LCM of m1 through mr, as is required.

(Refer Slide Time: 18:05)

Conversely, if x is congruent to y mod the LCM, then x is congruent to y mod mi for each i.
That is because mi divides the LCM, and hence, the theorem.
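
This equivalence is easy to test for a small set of moduli. The moduli 4, 6 and 10 and the helper names lcm_all and congruent are assumptions of this sketch.

```python
from functools import reduce
from math import gcd


def congruent(a, b, m):
    """True if m divides a - b."""
    return (a - b) % m == 0


def lcm_all(numbers):
    """LCM of a list of positive integers, built up two at a time."""
    return reduce(lambda u, v: u * v // gcd(u, v), numbers, 1)


moduli = [4, 6, 10]
big = lcm_all(moduli)
print(big)

# x = y modulo every mi exactly when x = y modulo the LCM.
same = all(
    all(congruent(x, y, mi) for mi in moduli) == congruent(x, y, big)
    for x in range(-70, 70) for y in range(-70, 70)
)
print(same)
```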

(Refer Slide Time: 18:45)

If x is congruent to y modulo m, then we say x is a residue of y modulo m. For example, 1,
13, 31, 43 are all residues modulo 3 of 10. That is because 1 is 10 minus 9, a multiple of 3.
13 is 10 plus 3, a multiple of 3. 31 is 10 plus a multiple of 3, namely 21. 43 is 10 plus 33,
again a multiple of 3. So, these are all residues mod 3 of 10.

(Refer Slide Time: 19:53)

A set of integers is called a complete residue system modulo m if for every integer y, there
exists a unique xi in the set so that xi is congruent to y modulo m. So, here the set of
integers is considered as x1 through xn. So, a set of integers x1 through xn is called a
complete residue system modulo m if for every integer y, there is a unique xi in the set such
that xi is y mod m. So, for every single integer you will find the residue within the system.
In that case it is called a complete residue system.

(Refer Slide Time: 21:07)

0, 1, 2 is a complete residue system modulo 3. Take any integer; it will be congruent to
exactly one of these three modulo 3. Equivalently, 2, 15, 10 is also a complete residue system
mod 3. The mapping goes like this. 2 is 2 mod 3. 15 is 0 mod 3. 10 is 1 mod 3. So, we
essentially have the same integers modulo 3. Similarly, consider 100, 101 and 102. 102 is 0
mod 3. 101 is 2 mod 3, and 100 is 1 mod 3. Therefore, this is also a complete residue system
mod 3.
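
Whether a given set is a complete residue system can be decided by reducing every member modulo m. The function name is_crs is an assumption of this sketch.

```python
def is_crs(numbers, m):
    """True if the numbers hit every residue 0, 1, ..., m-1 exactly once mod m."""
    return sorted(x % m for x in numbers) == list(range(m))


print(is_crs([0, 1, 2], 3))
print(is_crs([2, 15, 10], 3))
print(is_crs([100, 101, 102], 3))
print(is_crs([1, 4, 2], 3))   # 1 and 4 are congruent mod 3, so this fails
```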

(Refer Slide Time: 22:22)

A reduced residue system, let's call it RRS, modulo m is a set of integers r1 through rn, each
relatively prime to m, where ri is not congruent to rj mod m when i is not equal to j. That
is, no two members are congruent mod m, and for any integer y relatively prime to m, there is
a unique rj so that rj is y mod m.

(Refer Slide Time: 23:53)

If you take the CRS, that is the complete residue system, delete from it, all members not
relatively prime to m. We get a reduced residue system. All reduced residue systems mod m
have the same size. This is denoted as phi of m.

(Refer Slide Time: 24:55)

This is called Euler’s phi function or totient of m. In other words, phi of m is the number of
positive integers not larger than m that are co-prime with m.

(Refer Slide Time: 25:43)

Let us consider reduced residue systems for various values. To find phi of 1, the singleton 1
is the reduced residue system for 1. Therefore, phi of 1 equals 1. The reduced residue system
modulo 2 is again the singleton 1, so phi of 2 is also 1. For phi of 3, the reduced residue
system would be obtained from 0, 1 and 2, and then the numbers which are not relatively
prime with 3 are deleted. So 0 is deleted, and what remains is 1 and 2. Therefore, phi of 3 is
2. The reduced residue system would consist of 1 and 2. To find phi of 4, we consider the
complete residue system, which would contain 0, 1, 2 and 3. Of these, 0 and 2 are not
relatively prime with 4. Therefore they are deleted, and what remains are 1 and 3. Therefore
phi of 4 is 2.

(Refer Slide Time: 27:21)

Coming to 5, we consider the complete residue system 0, 1, 2, 3 and 4. Of this, 0 is not
relatively prime with 5. So, that is removed, and 1, 2, 3 and 4 remain. Therefore phi of 5 is
4. For 6, we consider all non-negative integers less than 6. Delete all numbers which are not
relatively prime with 6, and what remains are 1 and 5. So phi of 6 is 2. When we come to 7,
only 0 is deleted and we have 6 elements remaining, 1 through 6. 7 is relatively prime with
all of these, therefore phi of 7 is 6. Coming to 8, we find that no even number is relatively
prime with 8. So, deleting them, we have 4 elements remaining, namely 1, 3, 5 and 7. So, phi
of 8 is 4.
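
The values worked out above can be reproduced by deleting the non-coprime members from a complete residue system. In this sketch the complete residue system is taken as 1 through m, which is an assumption made so that the case m equal to 1 also gives the singleton 1; the function names are made up here.

```python
from math import gcd


def reduced_residue_system(m):
    """Members of the complete residue system 1..m that are co-prime with m."""
    return [r for r in range(1, m + 1) if gcd(r, m) == 1]


def phi(m):
    """Euler's phi function, the size of a reduced residue system mod m."""
    return len(reduced_residue_system(m))


for m in range(1, 9):
    print(m, reduced_residue_system(m), phi(m))
```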

(Refer Slide Time: 28:53)

If GCD of (a, m) is 1, and r1 through rn is a complete residue system modulo m, then ar1
through arn is a complete residue system mod m as well. This is the case when GCD of (a, m)
is 1, that is a and m are relatively prime.

(Refer Slide Time: 29:52)

To prove this, suppose S is r1 through rn, and T is ar1 through arn. By the way, this theorem
will hold even if CRS is replaced with RRS. That is even if we are considering a reduced
residue system, r1 through rn. Then, ar1 through arn would be a reduced residue system mod
m, when a and m are relatively prime with each other.

So, let S be r1 through rn and T be ar1 through arn. If S is either a CRS or an RRS modulo m,
we have that ri is not congruent to rj mod m when i is not equal to j. Suppose there is a pair
i, j within T so that ari is congruent to arj mod m, even when i is not equal to j. Then,
since the GCD of a and m is 1, we can cancel a from both sides, and the modulus does not
change. So we would have ri congruent to rj mod m, which is a contradiction.

(Refer Slide Time: 31:49)

Therefore, ari is not congruent to arj mod m, when i not equal to j. Hence, T is also a set of
distinct residues, exactly the way S is.

(Refer Slide Time: 32:32)

If S is a CRS, then T is a CRS as well. S has a size of m, so T also has a size of m, and m
distinct residues mod m form a CRS. On the other hand, if S is an RRS modulo m, then each ri
is co-prime with m. We obtain an RRS by taking a CRS and deleting every ri which is not
co-prime with m. So, whatever remains would be co-prime with m. So, if S is an RRS, then each
ri is co-prime with m. Therefore, ari is co-prime with m. That is because a is co-prime with m
and ri is also co-prime with m. So T is a set of phi of m distinct residues, each co-prime
with m. Therefore, T is also an RRS, and hence, the theorem.
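
The theorem can be seen in action for a small modulus. The sketch below takes m as 10 and a as 7, and also shows what goes wrong for a multiplier that is not co-prime with m; the helper names is_crs and is_rrs and the chosen numbers are assumptions of this example.

```python
from math import gcd


def is_crs(numbers, m):
    """True if the numbers form a complete residue system mod m."""
    return sorted(x % m for x in numbers) == list(range(m))


def is_rrs(numbers, m):
    """True if the numbers form a reduced residue system mod m."""
    expected = [r for r in range(m) if gcd(r, m) == 1]
    return sorted(x % m for x in numbers) == expected


m, a = 10, 7                      # GCD of 7 and 10 is 1
S_crs = list(range(m))
S_rrs = [1, 3, 7, 9]

print(is_crs([a * r for r in S_crs], m))   # multiplying by 7 keeps it a CRS
print(is_rrs([a * r for r in S_rrs], m))   # and keeps the RRS an RRS
print(is_crs([4 * r for r in S_crs], m))   # 4 is not co-prime with 10
```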

(Refer Slide Time: 33:55)

The next theorem is famous as Fermat’s Theorem, which says that, for any prime p and
integer a, if p does not divide a, then a power p minus 1 is congruent to 1 modulo p.

(Refer Slide Time: 34:44)

We will prove a generalization of this, which is called Euler’s Generalization of Fermat’s
Theorem. Fermat lived in the seventeenth century. Euler lived almost a century later, so Euler
had a generalization of Fermat’s theorem. Euler’s generalization states this. For any two
integers a and m that are relatively prime with each other, so their GCD is 1, a power
phi of m is 1 mod m. Here, phi of m is the size of the reduced residue system modulo m.

(Refer Slide Time: 36:01)

So, we prove it this way. Suppose S, which is denoted as r1 through r phi m, is an RRS, that
is, a reduced residue system modulo m. Then, so is ar1 through ar phi m, as we have just seen.
For each i, where i is any integer between 1 and phi m, there exists a unique j in the range 1
to phi m, so that ri is congruent to arj mod m.

(Refer Slide Time: 37:14)

Hence, consider a power phi m multiplied by the product of rj, for j varying from 1 to phi m.
Let us compute this product. Taking a power phi m inside, we can write this as the product,
with j varying from 1 to phi of m, of arj. But this is congruent to the product, with i
varying from 1 to phi m, of ri. That is because for every j, there is a unique i so that arj
is congruent to ri mod m. So this congruence is mod m. But ri is co-prime with m, for every i.
Therefore the product of the ri's is also co-prime with m. Now, this product appears on both
sides. So, you can cancel it from both sides of the congruence. Since the cancelled quantity
is relatively prime with m, the modulus does not change when we cancel.

(Refer Slide Time: 39:14)

Therefore, what we have is this: a power phi of m is congruent to 1 mod m, as the theorem
claims. So, that proves Euler’s generalization of Fermat’s theorem. Now, coming to
Fermat’s theorem, let’s suppose p is a prime and a is an integer such that p does not divide
a. Then, GCD of (a, p) is equal to 1. Now, consider the complete residue system modulo p. This
will contain the numbers 0, 1, 2 and so on up to p minus 1. Of these, only 0 is not relatively
prime with p. Therefore what remains are 1 through p minus 1. Phi of p would be the
cardinality of this set, which is p minus 1.
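
Both Euler's generalization and the value of phi at a prime can be checked numerically with modular exponentiation. This is only a sketch over a small range; the function names phi and euler_holds and the range limit are assumptions of this example.

```python
from math import gcd


def phi(m):
    """Number of integers in 1..m that are co-prime with m."""
    return sum(1 for r in range(1, m + 1) if gcd(r, m) == 1)


def euler_holds(a, m):
    """Check a^phi(m) = 1 (mod m); 1 % m covers the trivial modulus 1."""
    return pow(a, phi(m), m) == 1 % m


# Euler's generalization for all co-prime pairs in a small range.
print(all(euler_holds(a, m)
          for m in range(1, 60) for a in range(1, 60) if gcd(a, m) == 1))

# Fermat's theorem as the special case of a prime modulus.
p, a = 17, 3
print(phi(p) == p - 1, pow(a, p - 1, p))
```

The three-argument form of pow reduces modulo m at every step, so the large power is never computed in full.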

(Refer Slide Time: 40:58)

Therefore, plugging this in Euler’s generalization, we find that, a power p minus 1 is 1 mod
p. This is precisely what Fermat’s theorem says. So, Fermat’s theorem can be obtained as a
corollary of Euler’s theorem. So, that is it from this lecture. Hope to see you in the next.
Thank you.

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Lecture 27 - Pigeon Hole Principle

In this lecture, we will see a very useful principle from Combinatorics known as the
pigeonhole principle. The principle is extremely simple, so simple that it is difficult to
imagine that this can be of use. The principle is also very simple to state.

(Refer Slide Time: 0:45)

Suppose you have n objects, say n balls, and they are distributed into k bins. If n is greater
than k, then there will exist a bin with more than 1 ball. So, that is the principle. So if
you have n balls and they are distributed into k bins, where n is greater than k, then there
will be at least 1 bin which has more than one ball. It is almost self-evident. If all of the
bins had less than or equal to one ball, then the total number of balls is going to be less
than or equal to k. But the number of balls is greater than k, and therefore the statement
must be true.

So, let us see some applications of this. The pigeonhole principle, of course, is very simple
to state. The power of this principle comes from the wide variety of ways in which you can set
up balls and bins. So, let us look at a problem which can be solved using the pigeonhole
principle. Consider the following sequence, consisting of the numbers 7, 77, 777 and so on. So
the ith element of the sequence would be 7 repeated i times. Now, the result states that there
will surely be an element in the sequence which is divisible by, let us say, 1903. Now, how do
we prove this? What is the method? First of all, is this statement even true? Why should there
be an element in the sequence consisting of 7s, the ith element being 7 repeated i times,
which is divisible by the number 1903? There is nothing sacrosanct about 1903. You can
substitute it with a lot of other numbers. As for the few numbers that you cannot use, you can
look at the proof carefully to figure out which numbers can come in place of 1903.

So, let us try and work out a proof for this. We will set up the balls and bins in a certain
way, such that we can argue that there will be a bin which contains more than one ball, and
that is going to be used to construct the proof of divisibility. Here we want one of these
numbers to be divisible by 1903. So, what we can do is try and divide these numbers by 1903.
The sequence has infinitely many numbers. So let us just take 1903 elements of the sequence,
7, 77 and so on. We will look at the first 1903 elements of this sequence and divide each one
of them by 1903. You will get some remainder. If any one of them is 0, then we know that we
have some number in the sequence which is divisible by 1903. What if none of them is 0? That
is the case that we are interested in. So let us write down the steps.

Consider the first 1903 elements of the sequence. Divide the elements by 1903 and note the
remainders. If any of the remainders were 0, then we are done. We have a proof, because we
have an element which is divisible by 1903. So, we may assume that all the remainders we get
are numbers between 1 and 1902, both numbers inclusive. So these are 1902 possible values for
the remainders, and we have 1903 numbers in total. So if you denote the remainders by r1, r2
and so on up to r1903, these are 1903 remainders and each of them is one of the numbers
between 1 and 1902. So, one of these remainders must repeat. So let us say ai leaves remainder
ri, and aj, where j is greater than i, leaves remainder rj, and we have the case that ri is
equal to rj. So, what we have is ai is equal to some number k1 into 1903 plus ri, and aj also
has a similar equation, it is equal to k2 into 1903 plus rj.

We have ri and rj as the same, and therefore, if you do aj minus ai, what we will get is k2
minus k1 into 1903. So 1903 must divide aj minus ai. We are almost done. What we wanted to
show is that there is an element in the sequence which is divisible by 1903, and not the
difference of two elements. But this is good enough for us, because what is aj minus ai? If
you look at aj, aj is equal to 777… j times and ai is equal to 777… i times. If you subtract
them, what you get is j minus i 7s followed by i 0s. So this is equal to a j minus i, which
consists of j minus i 7s, into 10 to the power i, because there are i 0s. So, we can write k2
minus k1 into 1903 is equal to a j minus i into 10 power i. Now, if you look at 1903, 1903
does not have any common factor with 10 power i, because the only prime factors of 10 power i
are 2 and 5, and they do not appear in 1903.

So 1903 must divide a_{j-i}, and a_{j-i} is one of the elements of the sequence. Therefore
a_{j-i} is the required element. That was one application of the pigeonhole principle. The
balls, or pigeons, are the remainders left by the elements of the sequence when divided by
1903, and the bins, or pigeonholes, are the numbers from 1 to 1902. We had 1903 remainders to
fit into 1902 bins. Now let us see another application of the pigeonhole principle. It is a
little more involved.
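
The argument above only shows that a suitable element exists. A minimal Python sketch can
find one directly; the function name first_sevens_multiple is our own, and the code tracks
only remainders so that the numbers stay small.

```python
def first_sevens_multiple(m):
    """Return the smallest n such that 77...7 (n sevens) is divisible by m.

    Only the remainder modulo m is tracked, so no huge integers are built.
    Returns None if no n <= m works (this can happen when m shares a
    factor with 10, a case the lecture's argument does not cover).
    """
    remainder = 0
    for n in range(1, m + 1):
        remainder = (remainder * 10 + 7) % m  # append one more digit 7
        if remainder == 0:
            return n
    return None


n = first_sevens_multiple(1903)
# Rebuild the actual number once, to confirm divisibility.
print(n, int("7" * n) % 1903)
```

The pigeonhole argument guarantees that the search succeeds for 1903 within the first 1902
elements, because 1903 has no common factor with 10.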

(Refer Slide Time: 10:12)

So, here we start with an irrational number, let us say e (you can replace it with any other
irrational number), and we consider its multiples e, 2e, 3e, 4e and so on. This is the
sequence that we are considering. If we had started with a rational number instead, then we
know that at some point we would get an integer. When we start with an irrational number, none
of the elements e, 2e, 3e and so on is going to be equal to an integer. But what we will show
is that some element of the sequence is arbitrarily close to an integer.

That is, you give a measure of closeness, and some element of the sequence will be that close
to some integer. Let us make the statement a little more concrete. Let us fix the degree of
closeness as delta = 10^(-100). We claim the following: for some n, n times e is delta-close
to an integer. To say what delta-close means, write [alpha] for the nearest integer to alpha.
If alpha is exactly midway between two integers, we could choose either one of them; let us
say we choose the larger one.

So this is the definition of the nearest integer. We say that ne is delta-close to an integer
if |ne - [ne]|, the absolute value of the number minus its nearest integer, is less than or
equal to delta. That is the definition of delta-close, and what we are claiming is that for
some n, ne is delta-close to an integer. So how do we prove this? What should be the balls and
what should be the bins? Let us look at the elements e, 2e, 3e and so on. Here delta is
10^(-100), and 1/delta, that is 10^100, is what we will choose as the number of bins. We have
not yet said what the bins are, but we are going to set up a balls and bins analogy wherein
the number of bins is 10^100. We are going to look at the first 10^100 + 1 numbers of this
sequence. Each of those numbers is going to be put in one of the bins, and since there are
only 10^100 bins, some bin is going to contain at least 2 numbers.

Now, let us decide how we are going to put these numbers in the bins. Consider any number m
times e and look at its floor, which is an integer. Then me - floor(me) is a number in the
interval [0, 1). In fact it cannot be 0, since me is irrational, so we could have said the
open interval (0, 1). Let us call this quantity f(m). So for the mth element, f(m) denotes the
amount by which me exceeds its integer part. Now take the interval from 0 to 1 and split it
into 10^100 equal parts, so each part is of size 10^(-100). These parts are going to be our
bins. If f(m) falls in one of these parts, then we will say that the number me falls in that
particular bin. For example, e is approximately 2.718, so the first element goes into the bin
which contains 0.718.

Now if you take 2e, that is approximately 5.436, so its fractional part is about 0.436 and it
is going to fall in a different bin, and so on. Each element is put in one of these bins.
Since we took 10^100 + 1 numbers, some bin is going to contain 2 numbers. So let us say n_1 e
and n_2 e fall in the same bin. Then n_1 e = k_1 + delta_1 and n_2 e = k_2 + delta_2, where
k_1 and k_2 are integers and delta_1 and delta_2 are numbers which fall in the same bin.

Let us say n_1 is greater than n_2. Then (n_1 - n_2) e = (k_1 - k_2) + (delta_1 - delta_2),
where k_1 - k_2 is an integer; call it k. Since delta_1 and delta_2 are in the same bin, whose
size is 10^(-100), their difference is less than 10^(-100) in absolute value. Write r for
n_1 - n_2, which is a positive integer. Then re = k + epsilon or re = k - epsilon, where
epsilon is a positive number guaranteed to be less than 10^(-100): if delta_1 is greater than
delta_2 it is k + epsilon, otherwise it is k - epsilon. Whatever the case, re is another
element of the sequence, and it is some integer plus or minus a very small number, where by
very small we mean less than 10^(-100). So we have seen 2 applications of the pigeonhole
principle. Now we will look at a slightly more general version of the pigeonhole principle.
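
The binning argument can be run for a much coarser delta. The sketch below is only an
illustration: it uses 1000 bins instead of 10^100 (which could never be enumerated), relies on
floating-point arithmetic, and the name close_multiple is our own.

```python
import math


def close_multiple(alpha, bins):
    """Return r >= 1 such that r * alpha is within 1/bins of an integer.

    Follows the pigeonhole argument: drop the fractional parts of
    alpha, 2*alpha, ..., (bins + 1)*alpha into `bins` equal sub-intervals
    of [0, 1) and return the difference of two multipliers that collide.
    """
    seen = {}  # bin index -> multiplier m whose fractional part landed there
    for m in range(1, bins + 2):
        frac = (m * alpha) % 1.0
        b = min(int(frac * bins), bins - 1)  # clamp against float round-up
        if b in seen:
            return m - seen[b]
        seen[b] = m
    raise AssertionError("unreachable: bins + 1 values cannot avoid a collision")


r = close_multiple(math.e, 1000)
gap = abs(r * math.e - round(r * math.e))
print(r, gap)
```

The returned r is at most the number of bins, since it is the difference of two multipliers
between 1 and bins + 1.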

(Refer Slide Time: 19:29)

This is the generalized pigeonhole principle. If you have n balls and k bins, then at least
one bin will have greater than or equal to n/k balls. The reasoning is the same as for the
pigeonhole principle. If we distribute n balls into k bins, then there surely should be a bin
which has at least n/k balls, because if every bin contained fewer than n/k balls, then the
total number of balls would be less than (n/k) × k, which is n, a contradiction.

So, now let us see another example, this time from geometry. We have a square S whose sides
are 1 centimeter long, and we place 10 points inside S. If we have a large number of points,
then there will be 2 points which are close to each other; that is the general statement. Here
we want to make the following claim: no matter how you place the 10 points inside S, there
will be 2 points which are within 0.5 cm of each other.

Now, how do we prove such a statement, and how do we see it as an instance of the pigeonhole
principle? One way to think about it would be to take circles. Consider circles of diameter
0.5. If you can somehow cover the entire square using 9 such circles, then we know that some
2 of the 10 points will be in the same circle, and they are within 0.5 cm of each other. With
circles that is a little tricky: there is going to be a lot of overlap, and it is not
straightforward to see how this can be done. But there is a simpler way. Split the square in
the natural way into 9 smaller squares, each of side 1/3. No matter how you place 10 points,
there is going to be one small square which contains 2 points. The maximum distance between
any 2 points in a small square is its diagonal, which is 1/3 times root 2. That is 1.414...
divided by 3, approximately 0.471, which is surely less than 0.5. So if we split the square in
this manner, we get 2 points which are within 0.5 centimeters of each other.
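
The 3 by 3 grid argument translates directly into a search procedure. The following Python
sketch assumes the points are given as (x, y) pairs with coordinates in [0, 1]; the name
close_pair and the random test points are our own choices.

```python
import math
import random


def close_pair(points):
    """Return two of the given points that share a cell of the 3x3 grid.

    With 10 or more points in the unit square such a pair always exists,
    and its distance is at most the cell diagonal sqrt(2)/3.
    """
    cells = {}  # (column, row) of the grid cell -> first point seen there
    for p in points:
        x, y = p
        cell = (min(int(x * 3), 2), min(int(y * 3), 2))  # clamp x = 1 or y = 1
        if cell in cells:
            return cells[cell], p
        cells[cell] = p
    return None  # only possible when every point sits in its own cell


random.seed(0)
pts = [(random.random(), random.random()) for _ in range(10)]
p, q = close_pair(pts)
print(math.dist(p, q), math.sqrt(2) / 3)
```

The pair returned is not necessarily the closest pair; it is only a pair that the pigeonhole
argument certifies to be within 0.5 of each other.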

(Refer Slide Time: 24:51)

We will end our discussion on the pigeonhole principle by looking at one more example. Here we
are looking at a combinatorial problem where we have 100 bins and 200 balls, and we distribute
the balls into the bins following 2 rules. The first rule: no bin is empty. The second rule:
no bin contains more than 100 balls. There are many ways of distributing the balls in this
way. The claim is that every such distribution has the following property: there is a set of
bins containing exactly 100 balls. When I say a set of bins contains 100 balls, I mean that we
take all the balls in those bins together. So no matter how you distribute the balls into the
bins, there will be a subset of bins which together account for exactly 100 balls. Now suppose
instead that you had, let us say, 700 balls.

We could give 7 balls to each bin, and if you distribute in that fashion, there is no way to
get exactly 100 by combining a few bins, since 100 is not a multiple of 7. So a natural
question is why such a claim is true when you have 200 balls and 100 bins. We will give a
proof via the pigeonhole principle. The techniques are very similar to the ones we have seen.
We want to say that some bins together contain exactly 100 balls. It is enough to show that
some bins together contain a multiple of 100 balls, because the only possible multiples under
these conditions are 100 and 200, and if we skip some bins, we are going to get something less
than 200.

So if we show that there is a strict subset of the bins which contains a multiple of 100
balls, then that would mean it contains exactly 100. When we want to look for a multiple of
100, we could try looking at remainders. The remainders of what? This is where the cleverness
comes in. So, the proof. Let the bins contain a_1, a_2, ..., a_100 balls, where a_i denotes
the number of balls in bin i. We have arranged the bins in some order, and we will insist on
one thing about this arrangement: a_1 is not equal to a_2. Can this always be done? If there
are two bins which contain different numbers of balls, we can of course put those 2 bins first
and proceed.

So the only case where we cannot get such an arrangement is when every bin contains the same
number of balls, which means every bin contains exactly 2 balls. But in that case our claim is
trivially true, because any 50 bins together contain 100 balls. So we will assume that is not
the case, and therefore we have bins 1 and 2 with different numbers of balls, a_1 and a_2.
Now let us consider the following sequence of 100 numbers. The first is a_1, the next is
a_1 + a_2, the next is a_1 + a_2 + a_3, and the last is a_1 + ... + a_100, which is 200. So
this is a sequence of 100 numbers, ranging from a_1 to 200. Let us call these numbers
b_1, b_2, ..., b_100, and consider the remainders r_1, r_2, ..., r_100, where r_i is b_i mod
100.

There are 100 remainders. The last one, r_100, is of course 0, since b_100 is 200, so let us
skip the 100th element and look only at the 99 elements r_1 to r_99. If any one of them is
zero, then we have a set of bins which together contain 100 balls and we are done. So we can
assume that none of them is zero, and therefore they are numbers between 1 and 99. Now let us
ask this question: are there any repetitions among them? There could be repetitions, but that
would be the good case for us. Suppose r_i equals r_j for some i less than j. The b_i's are
strictly increasing, because no bin is empty, and they are all less than 200, so b_j - b_i is
a multiple of 100 between 1 and 199. Therefore b_i + 100 = b_j, and the bins numbered i + 1 to
j together contain exactly 100 balls. That would validate our claim, so we may assume that is
also not the case. That means the 99 remainders are distinct numbers from 1 to 99, so every
number from 1 to 99 appears among them exactly once. Now what do we do? Here is where the
assumption that a_1 is not equal to a_2 comes in handy.

So let us add a_2 into the mix and look at a_2 mod 100. If a_2 were exactly 100, we would
have one bin with exactly 100 balls and we would be done. Otherwise, since no bin contains
more than 100 balls, a_2 is between 1 and 99, so a_2 mod 100 is a_2 itself. Now we have 100
numbers between 1 and 99, so there must be a repetition. Without a_2 there was no repetition,
so some remainder r_j is equal to a_2. Since a_1 is not equal to a_2, this is not r_1, so j is
at least 2 and b_j is greater than a_2. As b_j is less than 200 and leaves remainder a_2, b_j
is equal to 100 + a_2. So if you take the bins from 1 to j and remove the second bin, you get
a set of bins which contains exactly 100 balls. That concludes the proof, and we will stop
here.
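
The proof is constructive, so it can be turned into a procedure that actually returns the
bins. The Python sketch below follows the case analysis of the proof step by step; the names
hundred_subset and random_distribution are our own, and bins are indexed from 0 rather than
from 1.

```python
import random


def hundred_subset(a):
    """Return indices of bins whose ball counts add up to exactly 100.

    `a` must list 100 counts, each between 1 and 100, with total 200.
    """
    assert len(a) == 100 and sum(a) == 200 and all(1 <= x <= 100 for x in a)
    if all(x == 2 for x in a):
        return list(range(50))  # every bin holds 2 balls: any 50 bins work
    # Reorder the bins so that the first two hold different counts.
    order = list(range(100))
    j = next(i for i in range(1, 100) if a[i] != a[0])
    order[1], order[j] = order[j], order[1]
    vals = [a[i] for i in order]
    if vals[1] == 100:
        return [order[1]]  # the second bin alone holds 100 balls
    seen = {}  # remainder -> length of the prefix that produced it
    prefix = 0
    for length in range(1, 100):  # prefix sums b_1 .. b_99
        prefix += vals[length - 1]
        r = prefix % 100
        if r == 0:
            return order[:length]  # the prefix itself sums to 100
        if r in seen:
            return order[seen[r]:length]  # b_j - b_i = 100
        seen[r] = length
    # All 99 remainders are distinct, so one of them equals a_2.
    length = seen[vals[1]]  # b_length = 100 + a_2
    return order[:1] + order[2:length]  # bins 1..length without bin 2


def random_distribution(rng):
    """Random valid distribution: start with 1 ball per bin, add 100 more."""
    a = [1] * 100
    extra = 100
    while extra:
        i = rng.randrange(100)
        if a[i] < 100:
            a[i] += 1
            extra -= 1
    return a


a = random_distribution(random.Random(1))
chosen = hundred_subset(a)
print(sum(a[i] for i in chosen))
```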

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Lecture 28 - Stirling Numbers, Bell Numbers

In today's lecture, we will look at problems involving balls and bins. The basic question
that we are trying to study is how to distribute n balls into k bins, and how to count the
number of ways of doing this. Stated in this form, the question is a bit underspecified. We
could say that the balls are distinguishable, or that the bins are distinguishable, or that
the bins need to be nonempty; you can impose a lot of conditions on what the distributions
should look like. We will look at many such variants, and they will give us different kinds of
problems.

(Refer Slide Time: 01:08)

So, the first type of problem that we will discuss is counting the number of compositions.
Here we will assume that we have n identical, that is indistinguishable, balls, and that there
are k bins. You can think of the bins as numbered from 1 to k, so they are distinguishable.
Distributing identical toffees to children, who are distinguishable, is a similar problem.

Okay, so how do we do this? There are two variants here as well, depending on whether the bins
are allowed to be empty or not. The first problem that we will address is what is known as
weak compositions. A weak composition is a split of n identical balls into k distinguishable
bins, where bins may contain zero balls. So basically we need to find numbers a_1 to a_k such
that a_1 + a_2 + ... + a_k = n and each a_i is greater than or equal to zero. The word weak
comes from the fact that some of these a_i's could actually be zero. When we talk about
compositions, we still want k integers which add up to n, but now each a_i is greater than 0,
that is, greater than or equal to 1. Every bin should be nonempty. In both cases we want to
count the number of ways of doing this.

So, let us first look at the problem of weak compositions. We look at it in the following
way. We have all these balls, n of them, arranged in a line. In order to split them into k
bins, we draw partitioning walls in between them. To split into k bins, k - 1 partitioning
walls are required. Once the partitioning walls are in place, whatever is between the start
and the first wall can be thought of as a_1, whatever is between the first and second walls
as a_2, and so on; whatever is between wall k - 1 and the end is a_k.

And you can see that this is a one-to-one correspondence: any distribution of n balls into k
bins can be viewed in this format. As an example, say there are 10 balls and we want to split
them into three blocks. All that we have to do is place two partitioning walls, and each
placement gives one particular split. So basically we want to count the number of ways of
placing partitioning walls when the balls are arranged along a straight line. We can think
about the same thing in a slightly different manner. Note that after the partitioning walls
have been placed, there are precisely n + k - 1 objects in the arrangement: n balls and k - 1
partitioning walls, placed along a line. We will exploit that observation to count the number
of ways of distributing balls into bins. Let us say we have n + k - 1 blanks. Out of these
blanks, some are selected as positions for the balls and the others are where the partitioning
walls go.

So we just need to identify k - 1 positions to place the walls, or equivalently n positions
to place the balls; the number of ways of doing either is exactly the same. There are
n + k - 1 positions, out of which you have to choose k - 1 positions for the walls; once you
choose those, the remaining positions are the positions of the balls. So the total number of
ways of doing this is n + k - 1 choose k - 1, and as you can see this is the same as n + k - 1
choose n; these quantities are equal. So the number of weak compositions of n into k parts is
n + k - 1 choose k - 1. We can write this as a theorem.

(Refer Slide Time: 7:37)

The number of weak compositions of n into k parts is n + k - 1 choose k - 1. This is also
equal to n + k - 1 choose n. So now we are in a position to count the number of compositions.
When counting compositions, we want a_1 + a_2 + ... + a_k to be equal to n with each a_i
greater than or equal to 1. Define b_i to be a_i - 1. Then each b_i is greater than or equal
to 0, and the b_i's sum to n - k. So we can look at a different problem. Instead of counting
the number of ways of obtaining a_1 to a_k such that they add up to n with each a_i greater
than or equal to 1, we can count the number of weak compositions of n - k into k parts.

Take a weak composition of n - k into k parts; if you add 1 to each of its parts, what you
get is a composition of n. Conversely, if you have a composition of n into k parts, you can
convert it into a weak composition of n - k by subtracting 1 from each part. So their numbers
are equal. Counting weak compositions is something that we have already done. Here n - k plays
the role of n, so the count is (n - k) + k - 1 choose k - 1, which is n - 1 choose k - 1.
This is the number of compositions of n into k parts. We can write it as a corollary: the
number of compositions of n into k parts is equal to n - 1 choose k - 1.
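
Both counts can be checked by brute force for small n and k. This is a minimal sketch in
Python; count_solutions is a hypothetical helper that simply enumerates all tuples, so it is
only practical for small values.

```python
from itertools import product
from math import comb


def count_solutions(n, k, minimum):
    """Count tuples (a_1, ..., a_k) with every a_i >= minimum and sum n."""
    return sum(
        1 for t in product(range(minimum, n + 1), repeat=k) if sum(t) == n
    )


for n in range(1, 7):
    for k in range(1, 5):
        # weak compositions: parts may be zero
        assert count_solutions(n, k, 0) == comb(n + k - 1, k - 1)
        # compositions: every part is at least 1
        assert count_solutions(n, k, 1) == comb(n - 1, k - 1)
print("formulas agree with brute force for n < 7, k < 5")
```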

This also lets us find the total number of compositions, where you do not say how many parts
you have to split n into; you are just splitting it into an arbitrary number of parts. In this
setting we cannot really count weak compositions, because if parts are allowed to be empty,
you can have one empty part, two empty parts and so on, and there are infinitely many ways of
doing that. So the total number of compositions makes sense, whereas the total number of weak
compositions does not really yield anything meaningful.

So, the total number of compositions of n is the summation of n - 1 choose k - 1, where k,
the number of parts, varies from 1 to n. That is equal to 2^(n-1), which is just the binomial
identity. Now that we have done compositions, we will move on to a slightly different
problem.
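
The total count of compositions derived above, 2^(n-1), can be confirmed by listing the
compositions themselves. A small Python sketch, with a generator name of our own choosing:

```python
def compositions(n):
    """Yield every composition of n as a tuple of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        # choose the first part, then compose what is left
        for rest in compositions(n - first):
            yield (first,) + rest


print(list(compositions(3)))  # [(1, 1, 1), (1, 2), (2, 1), (3,)]
for n in range(1, 13):
    assert len(list(compositions(n))) == 2 ** (n - 1)
```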

(Refer Slide Time: 11:27)

So, this is known as set partitions. In the earlier case the balls were indistinguishable,
whereas the bins were distinguishable; that is, you were distributing identical objects to k
people. Here the balls are distinguishable, say a red ball, a green ball and so on, all of
different colours. But we are putting them in, let us say, cartons which are
indistinguishable from each other. There is no first box, second box and third box; all the
boxes look identical. Suppose you have two boxes, and let us say we put three items in one
and four items in the other.

Now it matters which three items you put together. If you had 7 balls, you could choose any 3
out of the 7 for one box, and each of those choices would give a distinct arrangement. But
whether the three that you have chosen go in the first box or the second box does not really
matter, because the boxes look identical. You can think of them as identical boxes which are
shuffled around after the balls have been distributed. So how many ways are there of doing
this? Again we will have n balls and k bins. The problem we are looking at is to distribute n
distinguishable balls into k identical bins, or to partition n objects into k bins. We can
think of the objects as a set, because each object is distinguishable from the others, and we
need to split this set into some number of parts. On the slide, for example, six objects are
split into four parts. We will denote the number of ways of doing this by S(n, k). We write
[n] for the set of numbers from 1 to n; we can think of its elements as the distinguishable
balls. Then S(n, k) is the number of ways of partitioning [n] into k nonempty subsets.

So, let us see some examples. S(n, 1) is the number of ways of partitioning [n] into one
subset, and there is precisely one way of doing it. S(n, n) means you have n objects and split
them into n nonempty bins, and again there is only one way of doing it. S(n, n - 1) is a
little more interesting. Here you have n objects and you need to split them into n - 1
nonempty boxes. So all except one box contain a single element, and there is precisely one
box which contains two elements. Which two elements go into that box can be decided in n
choose 2 ways, and that is the total number of ways as well. So S(n, n - 1) is equal to n
choose 2. S(n, 2) is something different: it is the total number of ways of splitting n
objects into two parts.

You can think of one part as a subset and the other as its complement. There are 2^n ways of
choosing a subset. But both parts have to be nonempty, so we have to remove 2 choices: if you
take the full set or the empty set, one of the parts would be empty. Also, it does not matter
whether you take the subset or its complement: choosing A with complement A', or choosing A'
with complement A, gives the same partition. So we divide by 2, and S(n, 2) is (2^n - 2)/2,
which is 2^(n-1) - 1.
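
These special values can be checked by listing set partitions explicitly. The sketch below is
a brute-force enumeration in Python; set_partitions and stirling_brute are names of our own,
and the approach is only feasible for small n.

```python
from math import comb


def set_partitions(items):
    """Yield every partition of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition


def stirling_brute(n, k):
    """S(n, k): number of partitions of {1, ..., n} into k nonempty blocks."""
    return sum(1 for p in set_partitions(list(range(1, n + 1))) if len(p) == k)


for n in range(2, 8):
    assert stirling_brute(n, 1) == 1
    assert stirling_brute(n, n) == 1
    assert stirling_brute(n, n - 1) == comb(n, 2)
    assert stirling_brute(n, 2) == 2 ** (n - 1) - 1
print(stirling_brute(4, 2))  # 7
```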

What is a general formula for S(n, k)? We can say that S(n, k) is zero if k is greater than
n, so the only interesting case is where k is less than or equal to n. By convention we choose
S(0, 0) as 1; this will just make our formulas look better later on. It means that if you have
zero objects and you want to split them into zero parts, there is exactly one way of doing it.
S(n, k) is also known as the Stirling number of the second kind. We need to find a formula for
S(n, k).

(Refer Slide Time: 18:18)

The following theorem gives a recursive formula for S(n, k): S(n, k) is equal to
S(n - 1, k - 1) + k × S(n - 1, k). We need to prove that this theorem is correct. This is a
combinatorial identity, and as with many other identities in combinatorics, what you can do is
find a set and count it in two different ways; both ways count the same objects, so the two
counts are equal. Let us try to do that. S(n, k) is the number of ways of partitioning [n],
the set of numbers 1 to n, into k parts. When you split [n] into k parts, clearly the number n
has to go into one of the parts, and we are going to count based on that.

Now if you look at the number n, it might either be in a part of its own, or it might be with
some other elements. These are the only two choices: n is in the partition either alone or in
company. If you take the number of partitions in which n is alone and the number of partitions
in which n is in company and add them up, you get the total count. If n is alone, then the
remaining n - 1 objects have to be split into k - 1 parts. So the alone case corresponds to
S(n - 1, k - 1). The in-company case should be k times S(n - 1, k).

Let us see why that is the case. We know that n is in company. If you take n out of the
partition, what you can see is that the remaining n - 1 elements are split into k nonempty
parts, and putting n into any one of these parts gives a different partition. So in the
in-company case we split [n - 1] into k parts and choose one part as n's company. The
splitting can be done in S(n - 1, k) ways, and the choosing of n's company can be done in k
different ways because there are k parts. The product is the number of ways of doing this, and
therefore the total number of ways of partitioning [n] into k parts is the sum of the counts
for the two cases. That proves the theorem. We will see some interesting consequences of this
identity.
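
The recurrence, together with the conventions S(0, 0) = 1 and S(n, k) = 0 for k greater than
n, is enough to compute S(n, k). A minimal memoized sketch in Python; the function name
stirling2 is our own, and treating S(n, 0) as 0 for positive n is an assumption consistent
with the definition (a nonempty set cannot be split into zero parts).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the recurrence of the theorem."""
    if n == 0 and k == 0:
        return 1  # convention: one way to split nothing into zero parts
    if n == 0 or k == 0 or k > n:
        return 0
    # n alone in its own part, or n joins one of the k existing parts
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


for n in range(1, 6):
    print([stirling2(n, k) for k in range(1, n + 1)])
```

The last row printed is S(5, 1) to S(5, 5), that is 1, 15, 25, 10, 1, which matches the
special values S(n, 1) = 1, S(n, 2) = 2^(n-1) - 1, S(n, n - 1) = n choose 2 and S(n, n) = 1.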

(Refer Slide Time: 22:45)

So, S(n, k) is equal to S(n - 1, k - 1) + k × S(n - 1, k). As an application, let us look at
the number of surjective, or onto, functions. We are interested in functions from an n-element
set to a k-element set which are onto: every element of the codomain is part of the image.
How many such functions are there? We can think of any such function in the following way.
Look at any element of the codomain. It has a pre-image: say, all these elements map to this
particular element, these others map to some other element, and so on.

So the pre-images split the domain into different parts; that is, any surjective function
induces a partition of the domain [n] into k nonempty parts. So to count the surjective
functions, you can count partitions. A partition can be converted into a function once you
decide which part is assigned to which element of the codomain. So one way of constructing
surjective functions is to first choose a partition of the domain into k nonempty parts, and
then assign to each part a distinct element of [k]. You cannot assign the same element to two
parts, because then the function would not be surjective.

The number of ways of splitting the domain is S(n, k). Assigning distinct numbers from 1 to k
to these k parts can be done in k factorial ways. So the total number of surjective functions
from [n] to [k] is k factorial times S(n, k). As a corollary of this, we can prove the
following polynomial identity: x^n is equal to the summation over k of S(n, k) × x_k, where
x_k is the falling factorial x × (x - 1) × ... × (x - k + 1).
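
The count of surjective functions, k factorial times S(n, k), can be compared against a direct
enumeration for small n and k. In this Python sketch a function from [n] to [k] is represented
as a tuple of n values; count_surjections and stirling2 are names of our own, and stirling2
repeats the recurrence so that the block is self-contained.

```python
from functools import lru_cache
from itertools import product
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def count_surjections(n, k):
    """Brute force: tuples of length n over k values that use every value."""
    return sum(1 for f in product(range(k), repeat=n) if len(set(f)) == k)


for n in range(1, 7):
    for k in range(1, 5):
        assert count_surjections(n, k) == factorial(k) * stirling2(n, k)
print(count_surjections(4, 2), factorial(2) * stirling2(4, 2))
```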

Now, note that this is a polynomial identity. It is not just for integers or natural numbers,
and the summation is over k from 0 to n. For instance, it says that pi^n is the summation, for
k from 0 to n, of S(n, k) × pi × (pi - 1) × ... × (pi - k + 1). This is a fairly complicated
expression, but what the polynomial identity means is that it is true for all real numbers. In
that sense it is an interesting formula: starting from the combinatorics of finite objects, we
will show that something is true for a much larger class of objects. The idea is very simple.
If you want to show that two polynomials are equal, you only have to show that they are equal
at a large enough number of points. Here the LHS is a polynomial of degree n, and the RHS is
also a polynomial of degree n. If you can show that they agree at more than n distinct points,
then they must be the same at all points. So how do we show that? We will show that the
equation holds at all positive integer values of x. Once it is true at all positive integers,
it must be true for all real numbers.

So, we can now assume that x is a positive integer and n is a positive integer, and
therefore we can use our combinatorial insights to prove this. x^n is nothing but the number
of functions from n to x. Take the numbers from 1 to n, and for each of them choose an image;
that gives a particular function, and in that way you get all functions. The total number of
ways of doing that is x into x into x, n times, so that is x^n. So the LHS is the number of
functions from n to x. Now we want to count this set in a different way. If you look at the
set of all functions, we could look at functions whose image is of size 0, 1, 2 and so on.
The count for size 0 will of course be 0; there are no functions whose image is a set of
size 0.

(Refer Slide Time: 30:36)

So, if you look at functions from n to x whose image is of size i, how do we count them? We
will show that the number of functions from n to x whose image is a set of size i is exactly
the ith term in the RHS. Here x is the target set, and we are looking at functions which map
into this particular set.

Now if the image is of size i, then it is a subset of x of size i. That subset can be chosen
in x choose i ways, so the number of ways of choosing the image is equal to x choose i. Once
the image has been chosen, we need to find a surjective function from the set n onto the
subset that we have chosen.

The number of ways of doing that is S(n, i) into i factorial, because that is the total
number of surjective functions onto a set of size i, and the image could have been chosen in
x choose i ways. So the count is x choose i into i factorial into S(n, i), and since x choose
i into i factorial is x(x - 1)...(x - i + 1), this is equal to S(n, i) multiplied by the
falling factorial (x)_i. So this is the ith term in the summation. What we did is, instead of
looking at all possible functions from n to x at once, we summed over the size of the image.
If the image is of size 1, how many functions are there? If the image is of size 2, how many
functions are there? So, we sum over the sizes of the image.

The size of the image can vary from 1 to n. If x is less than n, it can only go up to x, but
then the terms with i greater than x vanish anyway, because the falling factorial (x)_i
contains the factor x - x. Once we have fixed the size of the image, the image set can be
chosen in x choose i ways, and then we just need to count the surjective functions from n
onto that particular set. So the number of functions with image of size i is S(n, i) into
i factorial into x choose i, which is equal to S(n, i) (x)_i. This, summed up over all values
of i, is the total number of functions, and that is equal to x^n. So, that concludes the
proof of this particular identity.
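
The identity can be checked numerically. The following is a minimal Python sketch, not part of
the lecture: the helper names stirling2, falling, surjections and rhs are made up for
illustration, and S(n, k) is computed with the standard recurrence
S(n, k) = k S(n-1, k) + S(n-1, k-1), which is assumed here rather than derived in this passage.
Exact rational arithmetic is used so that a non-integer x can be tried as well.

```python
from fractions import Fraction
from math import factorial


def stirling2(n, k):
    """S(n, k): number of ways to split an n-set into k non-empty parts.

    Uses the standard recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1)
    (an assumption of this sketch; fine for small n only).
    """
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def falling(x, k):
    """Falling factorial x(x-1)...(x-k+1); the empty product (k = 0) is 1."""
    result = 1
    for j in range(k):
        result *= x - j
    return result


def surjections(n, k):
    """Number of surjective functions from an n-set onto a k-set: k! * S(n, k)."""
    return factorial(k) * stirling2(n, k)


def rhs(x, n):
    """Right-hand side of the identity: sum over k of S(n, k) * (x)_k."""
    return sum(stirling2(n, k) * falling(x, k) for k in range(n + 1))


x = Fraction(7, 3)          # a non-integer point, to illustrate the polynomial identity
print(rhs(5, 4), 5 ** 4)    # both sides at x = 5, n = 4
print(rhs(x, 5) == x ** 5)  # exact comparison with rationals
```

The check at a rational point is only an illustration; the argument in the lecture is what
shows that the identity holds for every real x.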

(Refer Slide Time: 34:43)

The next object that we will see is the total number of partitions. When we talked about
S(n, k), that was nothing but the number of ways of splitting n into k parts. If we sum this
up over all values of k, with k going from 0 to n, we get the total number of ways of
partitioning n, and this we denote by B(n), the Bell number. So, the next thing on the agenda
is to show that the Bell numbers satisfy the following identity: B(n + 1) is the summation,
over i going from 0 to n, of n choose i into B(i). Why is this true? The LHS is the total
number of ways of partitioning n + 1 elements, so we need to show that the RHS counts exactly
that. Again, we can look at the element n + 1. There are many possibilities: the element
n + 1 could be in a block together with many other elements.

So, in order to prove this theorem, we look at the element n + 1. In any partition, n + 1
must be in one of the parts. What we are looking at is the block in which n + 1 is present
and the complement of that block, that is, all the remaining elements taken together. The
complement can be of any size from 0 to n.

If you look at all the elements outside the block containing n + 1, that is going to be a
set of some size i. So, i is the size of the complement of the part containing n + 1. This i
can be 0, because n + 1 could be in a block which contains everything else, and it can go all
the way up to n. It cannot be n + 1, because one element is taken off, namely n + 1 itself.
These other elements now have to be split into some number of parts; they have to be
partitioned, that is all. How many ways are there of doing this? If the complement is of
size i, the number of ways of partitioning it is B(i). But these i elements could have been
selected in n choose i ways: n + 1 is surely not among them, and out of the remaining n
elements, i had to be selected. So n choose i into B(i) is the number of ways of partitioning
the n + 1 elements such that the complement of the block containing n + 1 has size i. And if
you sum this over all possible values of i going from 0 to n, we get the total number of
ways of partitioning. So that proves this combinatorial identity. Let us stop here.
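
As a quick illustration of the recurrence B(n + 1) = sum over i of (n choose i) B(i), here is
a small Python sketch. The function name bell_numbers is hypothetical, and the starting value
B(0) = 1 (the empty set has exactly one partition) is an assumption of the sketch, since the
lecture does not state the base case in this passage.

```python
from math import comb


def bell_numbers(count):
    """Return [B(0), ..., B(count - 1)] for count >= 1.

    Built with the recurrence B(n + 1) = sum_{i=0}^{n} C(n, i) * B(i),
    starting from the assumed base case B(0) = 1.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    bell = [1]
    for n in range(count - 1):
        # i is the size of the complement of the block containing element n + 1
        bell.append(sum(comb(n, i) * bell[i] for i in range(n + 1)))
    return bell


print(bell_numbers(6))
```

For example, B(3) = 5 matches the five partitions of {1, 2, 3}: one with a single block, three
with two blocks, and one with three blocks.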

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 29
Generating Functions

In this lecture, we will learn about generating functions. Generating functions are a tool
used to solve recurrence relations.
(Refer Slide Time: 00:36)

Let us first see some examples of recurrence relations. The first recurrence relation that we
will see is a very simple one. The first term a_0 is equal to 1, and a_n is defined in terms
of the previous value as 3 times a_{n-1}. This is the familiar geometric progression, where
the ratio of successive terms is 3. This recurrence relation gives a sequence of numbers: the
first number is 1, the next is 3 times 1, that is 3, then 9, and so on. For this recurrence
it is very easy to give the nth term: each term is 3 times the previous one, so we can write
a_n in closed form as 3^n. This can be viewed as a solution of the recurrence relation.

We could take another recurrence relation, say a_0 is equal to 1 and a_n is equal to a_{n-1}
plus 5. Now what is the nth term? This is an arithmetic progression, and the nth term is
again easy to determine. If you write down the sequence you get 1, 6, 11, 16 and so on.
Therefore a_n is 5n + 1. You can think of this as a closed form solution to the recurrence
relation that we were given. Now we will look at a slightly more complicated recurrence
relation. Let us say a_0 is equal to 100 and a_n is defined in terms of a_{n-1} as
4 a_{n-1} minus 100. Now it is difficult to say, by mere inspection, what the closed form
solution is. We can say that a_0 is 100, the next term is 400 minus 100, that is 300, the
next term is 1200 minus 100, that is 1100, the next term is 4400 minus 100, that is 4300,
and so on.

So, the general term is slightly difficult to write down by mere inspection. How do we find
a formula for the nth term? For this, we will use a very powerful method known as generating
functions, which can be used to solve these kinds of recurrences as well as much more
complicated ones.
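
The three recurrences above can be unrolled mechanically. The sketch below is illustrative
only; the helper name iterate and the variable names are made up here, and it simply applies
each rule repeatedly to list the first few terms.

```python
def iterate(a0, step, terms):
    """Return the first `terms` values of the sequence a_0 = a0, a_n = step(a_{n-1})."""
    seq = [a0]
    for _ in range(terms - 1):
        seq.append(step(seq[-1]))
    return seq


geometric = iterate(1, lambda a: 3 * a, 5)          # a_n = 3 a_{n-1}
arithmetic = iterate(1, lambda a: a + 5, 5)         # a_n = a_{n-1} + 5
third = iterate(100, lambda a: 4 * a - 100, 5)      # a_n = 4 a_{n-1} - 100

print(geometric)   # matches the closed form 3^n
print(arithmetic)  # matches the closed form 5n + 1
print(third)       # no obvious closed form by inspection
```

Listing terms like this confirms the two easy closed forms, but it does not by itself suggest
a formula for the third sequence, which is where generating functions come in.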

(Refer Slide Time: 04:04)

So, let us see the basics of generating functions. What is a generating function? You can
think of the following analogy: think of it as a clothesline. The starting point of the
clothesline is known, and it is an infinite clothesline. The first number of the sequence,
a_0, is put at the first position, then a_1 is attached immediately after it, a_2 is attached
immediately after a_1, then a_3, and so on. The reason for the clothesline analogy is this:
we could wrap up the clothesline, we could tangle it, and then pass it on to somebody else,
and all they need to do is stretch out the clothesline, find its starting point, and read
off the values of the sequence.

So, we have an infinite sequence and we can think of the generating function as a
clothesline on which the various values are held at specific positions. More formally, a
generating function is a formal power series. Given a sequence a_0, a_1, a_2 and so on, with
general term a_n, the generating function corresponding to this sequence is the formal power
series a_0 + a_1 x + a_2 x^2 + a_3 x^3 and so on, with general term a_n x^n. So, given any
sequence we can associate a formal power series with it. For some sequences, the formal power
series can be compressed and written in a nice closed form, whereas for others it may not
have a closed form expression.

For example, if your sequence is the factorial sequence, so a_n is equal to n factorial, the
power series is going to be 1 + x + 2x^2 + 6x^3 + 24x^4 + 120x^5 and so on. It is difficult
to imagine any nonzero value of x at which this particular power series converges. But we
will not bother about the convergence aspects. This expression is what we will call the
formal power series or the generating function corresponding to the sequence a_0, a_1, a_2
and so on.

So, let us take some more examples. Look at the sequence 1, 1, 1 and so on; in other words,
a_n is equal to 1. The corresponding formal power series is 1 + x + x^2 and so on. If we
assume the absolute value of x to be less than 1, we can simply write it as 1/(1 - x). You
can think of this as rolling the generating function together into a simpler expression.
Instead of writing out the whole power series, we can just say that we are looking at the
generating function 1/(1 - x) when we are thinking about the sequence 1, 1, 1, 1 and so on.
And if you have the sequence 1, 2, 4, 8 and so on, whose nth term is 2^n, that corresponds to
the series 1 + 2x + 4x^2 and so on, and if you assume suitable values for x, namely the
absolute value of x less than 1/2, you can say that it is 1/(1 - 2x).

So, 1/(1 - x) is the generating function of the constant sequence 1, and 1/(1 - 2x) is the
generating function of 2^n, n greater than or equal to 0. Now that we have the definition of
generating functions in place, we can think of how to solve the recurrence relation
a_n = 4 a_{n-1} - 100 using generating functions.

(Refer Slide Time: 09:01)

So, a little bit about the notation: whenever we think of a sequence a_0, a_1, a_2 and so
on, we will denote its generating function by A(x). So, A(x) is equal to
a_0 + a_1 x + a_2 x^2 and so on. Now, we have the following recurrence relation: a_0 is equal
to 100 and a_n is equal to 4 a_{n-1} minus 100. Let us call this equation 1. If we multiply
both sides by x^n we get a_n x^n = 4 a_{n-1} x^n - 100 x^n; call this equation 2. Then we sum
equation 2 over values of n from 1 to infinity. On the LHS we get the summation, n from 1 to
infinity, of a_n x^n, and on the RHS the summation, n from 1 to infinity, of 4 a_{n-1} x^n,
minus the summation, n from 1 to infinity, of 100 x^n. The reason we took n from 1 to
infinity instead of 0 to infinity is that the term a_{n-1} would not make sense when n is
equal to 0; in other words, equation 1, which we started with, holds only for values of n
greater than or equal to 1.

If we look at this equation closely, we can see that some parts of it are starting to
resemble A(x). For example, the term on the LHS is almost equal to A(x); it is just that the
first term is missing. So the LHS is equal to A(x) minus a_0. For the first term on the RHS,
if we take 4 and x outside, what we have is 4x times the summation, n from 1 to infinity, of
a_{n-1} x^{n-1}. Since n starts from 1, the first term of this summation is a_0 x^0, the next
term is a_1 x^1 and so on, so the summation is exactly A(x), and the term is equal to
4x A(x).

And for the last term on the RHS, the summation, n from 1 to infinity, of 100 x^n, we can
take 100x outside, and we are left with the summation of x^{n-1}, which is 1/(1 - x). So this
term is minus 100x/(1 - x). Putting a_0 as 100, we get
A(x) - 100 = 4x A(x) - 100x/(1 - x). So we have used the recurrence to write a new equation
involving the generating function of the sequence. This we can rewrite as
A(x)(1 - 4x) = 100 - 100x/(1 - x), and therefore
A(x) = 100/(1 - 4x) - 100x/((1 - x)(1 - 4x)). So now, instead of thinking about the
recurrence relation, we have the generating function A(x), which contains all the
information corresponding to the sequence in a succinct form.
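
One way to see that this A(x) really encodes the sequence is to expand it as a power series
and compare with the recurrence. The sketch below is an illustration under stated
assumptions: the function name series_coefficients is hypothetical, and the two fractions are
first combined into the single fraction (100 - 200x)/(1 - 5x + 4x^2), a step worked out for
this example rather than taken from the lecture. Polynomials are lists of coefficients,
lowest degree first.

```python
def series_coefficients(numerator, denominator, terms):
    """First `terms` power-series coefficients of numerator(x) / denominator(x).

    Both arguments are coefficient lists, lowest degree first, and the
    denominator is assumed to have constant term 1. The coefficients c_n
    are found by matching powers of x in  c(x) * denominator(x) = numerator(x).
    """
    if denominator[0] != 1:
        raise ValueError("this sketch assumes the denominator has constant term 1")
    coeffs = []
    for n in range(terms):
        value = numerator[n] if n < len(numerator) else 0
        for j in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[j] * coeffs[n - j]
        coeffs.append(value)
    return coeffs


# A(x) = 100/(1 - 4x) - 100x/((1 - x)(1 - 4x)) = (100 - 200x)/(1 - 5x + 4x^2)
print(series_coefficients([100, -200], [1, -5, 4], 5))

# The simpler example from earlier: 1/(1 - 2x) should give the powers of 2
print(series_coefficients([1], [1, -2], 5))
```

The first few coefficients agree with the terms 100, 300, 1100, 4300 computed directly from
the recurrence.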

(Refer Slide Time: 13:50)

So, what we know is A(x) = 100/(1 - 4x) - 100x/((1 - x)(1 - 4x)). Now, from this, how do we
get the value of the nth term? We look at this generating function carefully and try to
determine the coefficient of x^n. If we can write A(x) as
alpha_0 + alpha_1 x + alpha_2 x^2 and so on, then a_n must be equal to alpha_n. There are two
terms. The first, 100/(1 - 4x), we can simply write as 100 into 1 + 4x + (4x)^2 and so on,
because 1 + 4x + (4x)^2 and so on is a geometric progression that sums up to 1/(1 - 4x).

The next term is a little tricky, but we will use what is known as the method of partial
fractions. We will write it as a/(1 - x) + b/(1 - 4x). Written in this format, each piece is
a geometric series whose expansion is easy to write. So we want 100x/((1 - x)(1 - 4x)) to be
equal to a/(1 - x) + b/(1 - 4x). Now, what are the values of a and b which make this equation
true? We can multiply both sides by 1 - x and substitute x equal to 1; that will give the
value of a. When you multiply, the left hand side is 100x(1 - x)/((1 - x)(1 - 4x)), and the
1 - x factors cancel, leaving 100x/(1 - 4x).

Now, when you substitute x equal to 1, the left hand side becomes 100/(1 - 4), that is
minus 100/3. On the right hand side, the 1 - x factor cancels in the first term, leaving a,
whereas in b(1 - x)/(1 - 4x), when you put x equal to 1, the 1 - x factor becomes 0. Hence
the right hand side is equal to a, and so a is minus 100/3. Similarly, if you multiply both
sides by 1 - 4x and put x equal to 1/4, what you get is the value of b. So b is equal to
100 into 1/4 divided by 1 - 1/4, that is 25 divided by 3/4, which is 100/3. Therefore we can
write the generating function A(x) as 100 into 1 + 4x + (4x)^2 and so on, plus 100/3 into
1/(1 - x) minus 1/(1 - 4x).

From this you can read off the nth term: a_n is the coefficient of x^n. The first part
contributes 100 into 4^n. In the second part, 1/(1 - x) contributes 1 and minus 1/(1 - 4x)
contributes minus 4^n, so it gives 100/3 into 1 minus 4^n. So a_n is equal to
(100 - 100/3) into 4^n plus 100/3, that is, a_n = (200/3) 4^n + 100/3. Just for this
particular case, we will verify that our answer actually agrees with the recurrence relation.
(Refer Slide Time: 19:06)

So, we got a_n = (200/3) 4^n + 100/3. We will verify this using induction. Look at a_0,
which is given to be 100 by the recurrence relation. If we substitute n equal to 0 in the
formula, we get 200/3 into 4^0, that is 200/3, plus 100/3, that is 300/3, which is equal to
100. That is the base case of the induction. Now, if we assume that the formula is true up to
n - 1, we need to check that it agrees with a_n = 4 a_{n-1} - 100. Using the formula for
a_{n-1}, the right hand side is equal to 4 into ((200/3) 4^{n-1} + 100/3), minus 100.

The recurrence relation gives this as the answer, and it is equal to
(200/3) 4^n + 400/3 - 100, that is, (200/3) 4^n + 100/3, which agrees with the formula that
we computed via the generating function method. So, we have seen the generating function
method used to solve a particular recurrence relation. We will see one more recurrence
relation just for practice, slightly different from the one we have seen. Here, we had just
one term, 4 times a_{n-1}, and a constant term.
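
The same verification can be done numerically for many values of n at once. This is a small
sketch with made-up function names (closed_form, by_recurrence); it uses exact fractions so
that the thirds in the formula do not introduce rounding error.

```python
from fractions import Fraction


def closed_form(n):
    """a_n = (200/3) * 4^n + 100/3, the formula obtained from the generating function."""
    return Fraction(200, 3) * 4 ** n + Fraction(100, 3)


def by_recurrence(n):
    """a_n computed directly from a_0 = 100, a_n = 4 a_{n-1} - 100."""
    a = 100
    for _ in range(n):
        a = 4 * a - 100
    return a


print([int(closed_form(n)) for n in range(5)])
print(all(closed_form(n) == by_recurrence(n) for n in range(20)))
```

A finite check like this is not a proof; the induction argument above is what establishes the
formula for all n.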

(Refer Slide Time: 21:07)

So, we will look at a more famous recurrence relation, namely the Fibonacci recurrence. The
Fibonacci sequence is given by a_0 equal to 1, a_1 equal to 1, and a_n equal to a_{n-1} plus
a_{n-2}, where n is greater than or equal to 2. If you write the first few terms, the
sequence looks like 1, 1, then the sum of these two, which is 2, then 3, 5, 8, 13 and so on.
Add up two consecutive terms and their sum is the next term, 21, and so on. So, we can find
the nth term by just repeating this process, if you were to write a computer program to do
this.

If n is the input and we want to output the nth term of the sequence, this is going to take
a long time. It is an exponential time algorithm in the size of the input, because the input
n is given in decimal, so it has some k digits, and the time taken is proportional to n,
which is something like 10^k. So, if you just apply the recurrence relation, we get an
exponential time algorithm to compute the value of a_n, or f_n here, the nth Fibonacci
number. Let us see how we can handle this recurrence via generating functions. The method is
identical to the one we looked at earlier. We start with the main recurrence relation, which
says a_n is equal to a_{n-1} plus a_{n-2}.

So, multiply both sides by x^n, and we get a_n x^n = a_{n-1} x^n + a_{n-2} x^n. This
equation is valid only for n greater than or equal to 2, so we sum it up over all values of
n greater than or equal to 2. The summation, n greater than or equal to 2, of a_n x^n is
equal to the summation, n greater than or equal to 2, of a_{n-1} x^n, plus the summation,
n greater than or equal to 2, of a_{n-2} x^n. Let A(x) be the generating function. Then the
summation on the left is A(x) with two terms missing, namely a_0 and a_1 x, so it can be
written as A(x) - a_0 - a_1 x.

Now take the first summation on the right hand side. It resembles A(x): if you take an x
outside, you get x times the summation, n greater than or equal to 2, of a_{n-1} x^{n-1}. In
this summation every term of the generating function is present except one: when n equals 2
what you get is a_1 x^1, the next term is a_2 x^2, and so on, so the only term missing is
a_0. For the second summation, if we take x^2 outside, we get x^2 times the summation,
n greater than or equal to 2, of a_{n-2} x^{n-2}.

So, the first summation is x into A(x) - a_0. In the second, n starts from 2, so the first
term is a_0 x^0, the next is a_1 x^1 and so on, and the summation is exactly A(x); the second
term is x^2 A(x). Altogether, A(x) - a_0 - a_1 x = x(A(x) - a_0) + x^2 A(x). We can bring all
the A(x)'s together and substitute the values of the constants a_0 and a_1, which are both 1.
Rearranging terms gives A(x)(1 - x - x^2) = 1, so we get the generating function as
A(x) = 1/(1 - x - x^2).

(Refer Slide Time: 26:00)

Now, what remains is to determine the coefficient of x^n: if we expand out the function
1/(1 - x - x^2), what will be the coefficient of x^n? We will use the method of partial
fractions. We write 1/(1 - x - x^2) as A/(alpha - x) + B/(beta + x). Now A, B, alpha and beta
have to satisfy some conditions. If you combine the two fractions, what you get is
(A beta + B alpha + (A - B)x) divided by (alpha beta + alpha x - beta x - x^2). Comparing
numerators, we get A equal to B, because (A - B)x should be 0: there is no x term in the
numerator 1. So we can write this as A/(alpha - x) + A/(beta + x). Alpha and beta have to
satisfy additional conditions: comparing the coefficients in the denominators, alpha beta
should be equal to 1 and alpha minus beta should be equal to minus 1.

These conditions say that alpha and minus beta are the roots of the denominator polynomial
1 - x - x^2. Therefore alpha is equal to (root 5 - 1)/2, and beta is equal to
(root 5 + 1)/2. If we plug in these values, we can compute the coefficient of x^n. It is the
coefficient of x^n in A/(alpha - x) plus the coefficient of x^n in B/(beta + x). Writing
A/(alpha - x) as A/alpha into 1/(1 - x/alpha), the first coefficient is A/alpha into
(1/alpha)^n. For the other term, B is the same as A, and because the denominator is
beta + x, the coefficient is A/beta into (-1/beta)^n.

(Refer Slide Time: 29:48)

We can further infer the value of A by observing that the constant term of the numerator,
A times (alpha + beta), should be equal to 1. So A is equal to 1/(alpha + beta), and
alpha + beta is equal to root 5, so A is 1/root 5. We know that alpha and beta are inverses
of each other, because alpha beta is 1. So 1/alpha is beta and 1/beta is alpha, and the
coefficient can be written as a_n = (1/root 5) into (beta^{n+1} + (-1)^n alpha^{n+1}), which
is the same as (1/root 5) into (beta^{n+1} - (-alpha)^{n+1}). Observe that beta is a quantity
which is greater than 1 and alpha is a quantity whose absolute value is less than 1: alpha
is (root 5 - 1)/2, which is about 0.618.

So, the dominant term is the one with beta^{n+1}. Although both alpha and beta are
irrational numbers, the irrational parts cancel each other out and the result is an integer.
If you want an approximation, you can even ignore the alpha^{n+1} part, because alpha, being
less than 1 in absolute value, makes it go quickly towards 0. In fact, since that part is
always smaller than 1/2, if you round beta^{n+1}/root 5 off to the nearest integer, you get
the exact answer without any approximation. Now, we were counting a combinatorial object,
namely the nth Fibonacci number, and we saw that these strange numbers alpha and beta come
in.
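
The rounding remark can be tried out directly. The sketch below uses hypothetical function
names (fib_iterative, fib_rounded) and the indexing of this lecture, a_0 = a_1 = 1; with that
indexing the power of beta that matches a_n is beta^(n+1). It relies on floating point, so it
is only trustworthy for moderate n (an assumption of the sketch, roughly up to n around 70
with double precision).

```python
from math import sqrt


def fib_iterative(n):
    """a_n with a_0 = a_1 = 1, computed by applying the recurrence n times."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_rounded(n):
    """a_n obtained by rounding beta^(n+1) / sqrt(5) to the nearest integer.

    Floating-point arithmetic limits this to moderate n.
    """
    beta = (sqrt(5) + 1) / 2
    return round(beta ** (n + 1) / sqrt(5))


print([fib_iterative(n) for n in range(10)])
print(all(fib_iterative(n) == fib_rounded(n) for n in range(40)))
```

For larger n an exact method is needed, since the rounding trick is limited by the precision
of the floating-point square root.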

(Refer Slide Time: 32:00)

We can also see the same numbers arise in another way. Let us look at the linear algebraic
formulation. Instead of thinking of one particular Fibonacci number, the nth Fibonacci
number, we will think of a vector x_n which denotes a pair of consecutive Fibonacci numbers,
namely f_n and f_{n-1}. So x_n is a vector of size 2. Its first entry f_n is equal to
f_{n-1} plus f_{n-2}, and its second entry f_{n-1} is just f_{n-1}. Therefore x_n can be
written as the matrix A with rows (1, 1) and (1, 0), multiplied by the vector
(f_{n-1}, f_{n-2}); this product gives you (f_n, f_{n-1}). So we can simply write
x_n = A x_{n-1}, and applying this repeatedly, x_{n+1} = A^n x_1. Now A^n is a 2 by 2
matrix. If we can somehow compute that matrix, from that information we can determine the
nth Fibonacci number.
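
The relation x_{n+1} = A^n x_1 already gives a practical exact algorithm if A^n is computed
by repeated squaring. Note that repeated squaring is a technique swapped in for this
illustration; the lecture itself goes on to diagonalize A. The names mat_mul, mat_pow and
fib_matrix are hypothetical, and the sketch assumes x_1 = (f_1, f_0) = (1, 1), following the
indexing used earlier in this lecture.

```python
def mat_mul(p, q):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(m, e):
    """m raised to the non-negative integer power e by repeated squaring."""
    result = [[1, 0], [0, 1]]  # identity matrix
    while e > 0:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def fib_matrix(n):
    """f_n with f_0 = f_1 = 1, read off from x_{n+1} = A^n x_1.

    x_{n+1} = (f_{n+1}, f_n) and x_1 = (1, 1), so f_n is the second entry
    of A^n applied to (1, 1).
    """
    power = mat_pow([[1, 1], [1, 0]], n)
    return power[1][0] + power[1][1]


print([fib_matrix(n) for n in range(10)])
```

This uses about log n matrix multiplications, so the number of arithmetic operations grows
with the number of digits of n rather than with n itself.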

Now, in order to determine the nth power of this matrix, we can diagonalize it. Suppose we
can write A as P D P^{-1}; then A^n is equal to P D^n P^{-1}, and we can write
x_{n+1} = P D^n P^{-1} x_1. Here D is a diagonal matrix, and if its diagonal values are
lambda_1 and lambda_2, each entry of this entire expression will be something of the form
A lambda_1^n + B lambda_2^n. Of course, this A and B are not the same A and B as before;
they are constants which depend on P and x_1.

But you can see that the lambda_1 and lambda_2 which come in here are essentially the same
numbers that we saw earlier. In fact, they are the roots of lambda^2 - lambda - 1, namely
beta and minus alpha, and the coefficients again involve 1/root 5. We will stop here, and
learn more about generating functions in the next lecture.

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Lecture 30:
Product of Generating Function

(Refer Slide Time: 00:28)

Earlier we learnt about generating functions. If you have a sequence, let us say a_n,
n greater than or equal to 0, we can capture the information in that sequence of real
numbers by its generating function, which we denote by A(x). A(x) is nothing but the formal
power series a_0 + a_1 x + a_2 x^2 and so on. We saw an example where we looked at a
sequence coming out of some combinatorial property, computed its generating function, and
used the generating function to obtain a closed form solution for the nth term of the
sequence. Today, we will look at combining generating functions; in particular, we will look
at products of generating functions and understand their significance from a combinatorial
standpoint.

So, let us look at two sequences: one is a_n, with generating function A(x), and the other
is b_n, with generating function B(x). Now define a new sequence c_n which is equal to a_n
plus b_n. What will the generating function of c_n be? Will C(x) be equal to A(x) plus B(x)?
It will, because in A(x) plus B(x) both a_n x^n and b_n x^n are present, so the coefficient
of x^n is their sum a_n plus b_n. So adding two generating functions is simple: pointwise
addition of two sequences results in a new sequence whose generating function is simply the
sum of the generating functions of the original sequences.

Now, if you take the product of generating functions, what happens? Instead of looking at
A(x) plus B(x), let us say we have D(x) equal to A(x) into B(x). D(x) is clearly the
generating function of some sequence. If we denote the coefficient of x^n in D(x) by d_n,
will this be equal to a_n multiplied by b_n? Unfortunately, this is not the case. What d_n
actually is, we will understand in more detail today, and we will see some examples of using
this understanding to solve combinatorial problems. The math is very simple. We need to
compute the coefficient of x^n in A(x) times B(x), where by definition A(x) is
a_0 + a_1 x + a_2 x^2 and so on, and B(x) is b_0 + b_1 x + b_2 x^2 and so on.

When you take the product A(x) times B(x), which is what we call D(x), the constant term is
just a_0 b_0. The coefficient of x^1 comes from a_0 b_1 and a_1 b_0: both these products
result in x, and their sum is the coefficient of x^1. So we write this term as
(a_0 b_1 + a_1 b_0) x. The coefficient of x^2 comes from a_0 b_2, a_1 b_1 and a_2 b_0, so
there is a nice pattern, and the next term is (a_0 b_2 + a_1 b_1 + a_2 b_0) x^2. In general,
the x^n term is (a_0 b_n + a_1 b_{n-1} + ... + a_n b_0) x^n. So D(x) can be written as the
summation of d_i x^i, where d_i is the summation, k going from 0 to i, of a_k b_{i-k}. The
sequence obtained in this manner is called the convolution of a and b. So when you multiply
two generating functions, the product is the generating function of the convolution of the
two underlying sequences. Now, let us see some particular applications of this.
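
The convolution formula d_i = sum over k of a_k b_{i-k} translates directly into code. In the
sketch below the function name convolve is made up for illustration, and the sequences are
truncated to their first few terms, so only as many product coefficients are returned as both
inputs fully determine.

```python
def convolve(a, b):
    """Coefficients d_0, d_1, ... of A(x) * B(x) from truncated coefficient lists.

    d_i = sum_{k=0}^{i} a[k] * b[i - k]; only the first min(len(a), len(b))
    coefficients are returned, since later ones would need terms that were cut off.
    """
    n = min(len(a), len(b))
    return [sum(a[k] * b[i - k] for k in range(i + 1)) for i in range(n)]


ones = [1, 1, 1, 1, 1]          # sequence 1, 1, 1, ...   with A(x) = 1/(1 - x)
powers = [1, 2, 4, 8, 16]       # sequence 2^n            with B(x) = 1/(1 - 2x)

print(convolve(ones, ones))     # coefficients of 1/(1 - x)^2
print(convolve(ones, powers))   # coefficients of 1/((1 - x)(1 - 2x))
```

Convolving with the all-ones sequence produces running sums, which is why the second result
consists of the partial sums 1, 1 + 2, 1 + 2 + 4 and so on.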

(Refer Slide Time: 09:10)

We will see a combinatorial use. Let a_n denote the number of ways of forming some
particular combinatorial object or structure using n elements. For instance, the n elements
could be n letters of some particular alphabet, and you want to count the words which can be
formed without containing a certain pattern. And b_n is the number of ways of forming a
possibly different structure, it could even be the same structure, so let us call it another
structure, using n elements. Now, let us look at the following problem. We are given the
elements 1 to n. We split them into a subset A containing the elements 1 to i and a subset B
containing the elements i + 1 to n. On the set A we form a structure of the first kind, call
it type A, and on the set B we form a structure of the second type.

What we are interested in is this whole new object: how many ways are there to form objects of
this kind? That is what we want to count. Let us denote that by Cn. So Cn is the number of
ways of splitting a set of n elements into 2 parts, so that the first part is 1 to, let’s
say, k (this k is your choice) and the second part is from k plus 1 to n, and then forming a
structure of the first type using the first k elements and a structure of the second type
using the remaining elements.

The total number of ways of doing this is what we denote by Cn, and what we know is that the
generating function of Cn, which we denote by C(x), is going to be equal to A(x) times B(x).
So, this is the combinatorial use of whatever we had learnt about generating functions
earlier. Let us see a more concrete application.

(Refer Slide Time: 13:07)

So, let us say that you are studying in a university and there are n working days in the
semester. You have to split the semester into 2 halves; in the first half you have 1 exam,
and in the second half you have 2 exams. So, the semester consists of n days, and out of
these n days you have to select 3 days for conducting examinations, and out of these 3 exams,
1 exam should be in the first half. So there is a designated first half and a designated
second half. The first half will have precisely 1 exam and the second half will have precisely
2 exams. Although we call it a half, you are free to split the semester into any 2 parts, so
we should probably refer to them as the first part and the second part. The first part will
have 1 exam and the second part will have 2 exams. How many ways are there of designing the
semester with these exams? That is what we need to compute. Let Cn denote the number of ways
of organizing the semester in this particular way.

Now, let an denote the number of ways of designing a first part of n days, and bn the number
of ways of designing a second part of n days. By looking at the problem, we can say what Cn
is going to be: first you split the semester into 2 parts, then for each part you design the
examination days in whatever way you please, and then you add up over all splits. Let k be
the number of days in the first part. So k goes from 1 to n minus 2, because if the first
part contains n minus 1 days, the second part can contain only 1 day, and that does not give
a meaningful split, since the second part needs 2 examination days. Therefore k goes from 1
to n minus 2, and Cn is the summation of ak times bn-k. Here ak is the number of ways of
picking 1 day out of k days, so that is k, and bn-k is n minus k choose 2. So Cn is
summation, k equal to 1 to n minus 2, of k times n minus k choose 2.

So this is what we need to compute: summation of k times n minus k choose 2. You can expand
it out and work it out. That is going to be summation, k equals 1 to n minus 2, of k into
(n minus k) into (n minus k minus 1) by 2. n is fixed, so you can take n outside the
summation. There are going to be 6 terms in the expansion, and you will have k raise to 1,
k square and k cube appearing. You can sum them up, and you will get a closed form solution.
What we will see now is how to do the same calculation without using the formulas for summing
up polynomials of degree 2, 3, etc. We will do it with the help of generating functions,
because the generating functions for an and bn are fairly easy to compute. So our approach is
the following. Compute A(x). Compute B(x). These will be easy to compute, as we will see. And
then, since Cn is equal to the summation of ak times bn-k, from what we know about the
product of generating functions, we can say that C(x) will be equal to A(x) times B(x).

And then, from C(x), obtain Cn. Because Cn is the convolution of an and bn, we can compute
the generating function of Cn by just taking the product of A(x) and B(x). So, let us see
what A(x) is. Now, ak is equal to k. Therefore, A(x) is going to be summation of k x raise to
k, k going from 0 to infinity. The 0 term does not contribute, so this is equal to 1 times x
plus 2 times x square plus 3 times x cube and so on. And bn is equal to n choose 2, that is,
the number of ways of picking 2 days for exams. Therefore, B(x) is equal to summation of
k choose 2 times x raise to k, k going from 0 to infinity. The k equals 0 and k equals 1
terms are going to be absent. So this is going to be equal to (2 into 1 by 2) into x square
plus (3 into 2 by 2) into x cube plus (4 into 3 by 2) into x raise to 4 and so on.

So we need to somehow obtain a nice closed form for these expressions. And that is what we
will do. Their closed forms can be easily obtained from the following generating function.
Let us say T(x) is this particular series, 1 plus x plus x square, and so on. T(x) clearly is
1 by (1 minus x). Now, if you differentiate both sides, what we will get is that 1 by
(1 minus x) raise to 2 is equal to 1 plus 2x plus 3x square and so on. That is pretty much
the same as A(x); just one factor of x is missing. So from this, we can conclude that A(x) is
equal to x by (1 minus x) the whole square. And if we differentiate this once more, what we
will get is that 2 by (1 minus x) raise to 3 is going to be equal to 2 plus (3 into 2) x plus
(4 into 3) x square and so on. So this is almost the same as B(x), but a factor of x square
and a division by 2 are missing. To supply them, all that one has to do is multiply by
x square by 2.

So, multiplying both sides by x square by 2, we will get B(x) is equal to x square by
(1 minus x) the whole cube, and therefore C(x) is just going to be the product of A(x) and
B(x), that is, x cube by (1 minus x) raise to 5. So now, what we know is that the summation
we were looking at, which gives us the number of ways of splitting a semester of n days, is a
sequence whose generating function has a simple form: x cube by (1 minus x) raise to 5. Now,
how do we find the nth term of this? You can use the generalized binomial expansion. We just
need to expand 1 by (1 minus x) raise to 5, and then shift everything by 3, because there is
an x cube. The coefficient of x raise to n in C(x) is going to be equal to the coefficient of
x raise to n minus 3 in the expansion of (1 minus x) raise to minus 5.

So, let us look at how we can write this as a series. Again we will look at the equation for
2 by (1 minus x) raise to 3. Differentiating it once more, what we will get is that
(2 into 3) by (1 minus x) raise to 4 is equal to (3 into 2) plus (4 into 3 into 2) into x
plus so on. Differentiating yet another time, we will get (2 into 3 into 4) by (1 minus x)
raise to 5 on the left hand side, and that will be equal to (4 into 3 into 2) plus
(5 . 4 . 3 . 2) into x plus so on. And what we need is this multiplied by x cube. So we can
write this in the following manner.

(Refer Slide Time: 25:04)

(2 into 3 into 4) by (1 minus x) raise to 5 is equal to (4 . 3 . 2 . 1) plus (5 . 4 . 3 . 2)
into x plus (6 . 5 . 4 . 3) into x square and so on. So if we just multiply by x cube on both
sides and divide by (2 . 3 . 4), what we will get is that x cube by (1 minus x) raise to 5 is
equal to (4 . 3 . 2 . 1) by (1 . 2 . 3 . 4) into x cube, plus (5 . 4 . 3 . 2) by
(1 . 2 . 3 . 4) into x raise to 4, and so on. So, basically this is going to be equal to
summation, n going from 3 to infinity, of (n plus 1 choose 4) x raise to n. That would mean
that the coefficient of x raise to n is n plus 1 choose 4. This applies only when n is
greater than or equal to 3. The other terms are going to be 0, and rightly so, because to
split with the first part having 1 exam and the second part having 2 exams you need at least
3 days. So that is basically the answer, in the form of a combinatorial identity: summation
of k into n minus k choose 2, k going from 1 to n minus 2, is equal to n plus 1 choose 4.
Let's look at one more problem.
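
The identity for the exam problem can be checked numerically for small n. This is a minimal sketch; the function name exam_schedules is a made-up label for the direct summation, and the range of n tested is arbitrary.

```python
from math import comb


def exam_schedules(n):
    """Direct count: the first part has k days (1 exam, k choices) and the
    second part has n - k days (2 exams, (n - k) choose 2 choices)."""
    return sum(k * comb(n - k, 2) for k in range(1, n - 1))


# Compare the direct summation with the closed form (n + 1) choose 4.
for n in range(12):
    assert exam_schedules(n) == comb(n + 1, 4)

print([exam_schedules(n) for n in range(3, 8)])
```

For n less than 3 both sides are 0, which matches the remark that at least 3 days are needed.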

(Refer Slide Time: 27:16)

We will not do it in this much detail. We will just quickly run through the problem. We again
need to split the semester into 2 parts. The semester has n days, and now any day of the
first part can be chosen for a surprise test. So the possibilities are: there could be a
surprise test on all days, there could be surprise tests on some days, or there could be no
surprise test on any day. And in the second part, there could be surprise holidays: any day
can be chosen as an off day. It is a strange way to have a semester, but that's our problem.
So we have n days, and these n days must be split into 2 parts. In the first part, each day
could have a test or no test, and any choice is okay. In the second part, each day could be a
working day or an off day, and that could be chosen in any way that the Dean pleases. So, we
want to know how many ways there are to organize such a semester.

So, again, if Cn is the number of ways, Cn is going to be summation, k going from 0 to n, of
ak bn-k, where ak is the number of ways of organizing a first part of k days and bk is the
number of ways of organizing a second part of k days. And what we know is that the generating
function of Cn is going to be equal to A(x) times B(x). A(x) is the generating function of
the sequence an, and ak is going to be 2 raise to k, because if you had k days, on every day
you could either have a test or no test.

So there are 2 possibilities for each day, and in total there are 2 to the power k
possibilities. Therefore, A(x) is equal to summation of 2 raise to k x raise to k, k going
from 0 to infinity, and this is going to be 1 by (1 minus 2x). B(x) is exactly the same
thing: instead of a surprise test or no test, we have an off day or a working day. So B(x) is
also equal to 1 by (1 minus 2x). Therefore, we can write C(x) is equal to 1 by (1 minus 2x)
the whole square. Now, from this, how do we extract the nth term? We could again use the
binomial expansion and infer it from that. But here, there is an easy way. A prime of x, the
derivative of A(x), is nothing but 2 by (1 minus 2x) the whole square. So C(x) is equal to
A prime of x by 2, and Cn is equal to half of the coefficient of x raise to n in A prime of
x. Now, A(x) is equal to 1 plus 2x plus 2 square x square and so on, that is, summation of
2 raise to k x raise to k.

So A prime of x is equal to summation of 2 raise to k into k into x raise to k minus 1, and
A prime of x by 2 is equal to summation of 2 raise to k minus 1 into k into x raise to k
minus 1. The coefficient of x raise to n in A prime of x by 2 comes from k equal to n plus 1,
so it is going to be equal to 2 raise to n into n plus 1. So, we can conclude that Cn is
equal to n plus 1 into 2 raise to n. So, we have seen 2 examples of the use of products of
generating functions to compute combinatorial quantities.
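
As a sanity check of the second example, one can enumerate the semesters by brute force for small n. This is only a sketch; the function name semester_plans and the True/False encoding of each day are assumptions made for the illustration.

```python
from itertools import product


def semester_plans(n):
    """Brute-force count: choose the length k of the first part, then a
    test/no-test flag for each of its k days, then an off/working flag
    for each of the remaining n - k days."""
    count = 0
    for k in range(n + 1):
        for _tests in product((False, True), repeat=k):
            for _offs in product((False, True), repeat=n - k):
                count += 1
    return count


# Compare with the closed form (n + 1) * 2^n read off from C(x).
for n in range(9):
    assert semester_plans(n) == (n + 1) * 2 ** n

print([semester_plans(n) for n in range(5)])
```

Each split contributes 2 raise to k times 2 raise to n minus k, which is 2 raise to n, and there are n plus 1 splits, so the enumeration agrees with the formula for an obvious reason; the generating function route reaches the same answer without that observation.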

(Refer Slide Time: 32:32)

So, the third example is a more classical example. This involves what are known as the
Catalan numbers. Catalan numbers arise in a wide variety of contexts. Here we will see 2
examples. Both of them are essentially the same combinatorial object masquerading as
different things. The first is balanced parentheses. We have, let us say, n pairs of
parentheses. A left and a right parenthesis form 1 pair, and we have n such pairs. We want to
arrange them in any manner, but the result should be a balanced string of parentheses.

That means it should come out of parenthesizing some expression. In particular, all it means
is that a right parenthesis should come only after its matching left parenthesis. So, if we
look at any prefix of the string, the number of left parentheses that have appeared in that
prefix should be greater than or equal to the number of right parentheses, and over the whole
string the two counts should be equal. So this is an example of a balanced string of
parentheses. Here, at the start the count of lefts is 1, and the number of lefts is always
strictly greater, except at the very end where they become equal. And if we take this other
one, the left minus right count becomes 0 at this point, and 0 again here, and finally when
the full expression is read, at that point also it is 0. So we want to find the number of
ways of arranging these parentheses so that the string is balanced.

So, let us take the example where the number of pairs, n, is equal to 2. There is one way,
and there is another way, and these are the only possible ways, namely (()) and ()(). Because
the string should begin with a left, the next one can either be a right or a left. If the
next one is a left, there is the possibility (()). If it is a right, the only way the whole
expression can be made balanced is by having ()(). So, these are the only 2 possibilities.
And when n equals 3, you have this as one possibility. This is yet another possibility. So
when n is equal to 3, there are these 5 possibilities: ((())), (()()), (())(), ()(()) and
()()(). We need to compute the number of balanced strings of parentheses for general n.
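
The prefix condition described above translates directly into a check, and the small cases can be listed by brute force. This is a rough sketch; the names is_balanced and balanced_strings are illustrative only.

```python
from itertools import product


def is_balanced(s):
    """True if every prefix has at least as many '(' as ')' and the
    total counts are equal."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:  # a right parenthesis appeared before its matching left
            return False
    return depth == 0


def balanced_strings(n):
    """All balanced strings built from n pairs of parentheses."""
    return ["".join(p) for p in product("()", repeat=2 * n) if is_balanced(p)]


print(balanced_strings(2))
print(len(balanced_strings(3)))
```

The enumeration tries all 2 raise to 2n strings, so it is only practical for small n; that is exactly why a formula for general n is wanted.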

(Refer Slide Time: 36:17)

There is a very similar problem, which is also known as the Gambler's Ruin or the Drunkard's
Walk. Here the x axis denotes time and the y axis denotes the amount of money, or the
position relative to where you started; let us just call it the amount of money. You start
with 0 units of money. At each time step, the total amount of money that you have will either
go up by one unit or come down by one unit. So you can go up, and again up, and keep going
up, and then you could come down. So here is a sequence of steps such that you started at
time 0 with 0 amount of money, and after 10 units of time you are still at 0, but in between
you had gone negative. We want to find the number of ways of starting at time 0 with 0 amount
of money and ending at 0 after, let's say, 2n steps. It always has to be an even number of
steps, because you are starting with some amount of money and ending with the same amount.

The total number of steps that you would have taken should be even. So after 2n steps, how
many ways are there by which you can reach the same starting position? And the additional
constraint is that you should never go negative. We can also think of this as the Drunkard's
Walk. Essentially, a drunkard is starting at some particular position, and it is a one
dimensional walk: with each step he takes, he either goes one step closer to his home or one
step back. He is never allowed to be at a distance from his home greater than what he was at
the start, at any point of time.

So under that restriction, that is, he is never allowed to go towards the negative side of
the axis, how many ways are there to do this? Again we will use the notion of generating
functions. But it is slightly trickier than what we have done in the earlier cases. Let us
say Cn denotes the number of ways. So look at all possible sequences of steps that you can
take. A walk like this one, which goes negative in between, we will call a bad walk, whereas
if you had the following walk, this is a good walk of length 8. So we want to find the total
number of good walks. We will also define what is known as a very good walk. Let me just give
an example of a very good walk. So this is an example of a very good walk. I haven't formally
defined what a very good walk is. We will do that shortly. So let us look at any of these
walks. We can split it into 2 parts: the first part will be a very good walk, and the second
part will be a good walk.

And we will essentially be taking the convolution of these 2 things to get to our answer. So
let us look at some particular walk. Two things can happen. It starts at 0, and then it hits
the x axis at some point of time and then again goes up, or it hits the x axis only at the
very end. So, let's look at the first time when the walk comes back to the x axis. Now, if
you look at the remaining portion of the walk, that is essentially just a good walk of the
remaining length. So we can think of every walk as split into 2 parts. The first part is:
start at 0, and hit the x axis at some time i, where this i is the first time it is hitting
the x axis. And then there is the remaining portion. Now, the remaining portion just has to
be a good walk. What about the first portion? The first portion is also a good walk. But it
is a special kind of good walk.

In what sense? Let us denote by i the time when the walk first hits the x axis. Now, take any
good walk of length 2n minus i and put it here, in the region between i and 2n. Take these 2
portions together, and you will essentially get a good walk. Whereas if you take the region
from the start to i and replace it with an arbitrary good walk, you will not get something of
the kind that we are talking about. Because what we want here is that the walk should not
have touched the x axis at any point of time in between. So the number of ways of
constructing walks by combining these 2 portions is not going to be just the product of the
counts of 2 simple good walks. It is the product of the count of a simple good walk and the
count of something which is very good, in the sense that it has never hit the x axis in
between.

(Refer Slide Time: 43:28)

We will formally define what a very good walk is. A very good walk is a good walk that does
not hit the x axis, except at the start and at the end. For good walks, the number of good
walks of length 2n is Cn, and its generating function is, let's say, C(x). What is the
generating function of very good walks? That is what we need to determine. Suppose we call
the generating function of very good walks D(x). Then, since for n equals 0 the split does
not work, we will have to write C(x) minus 1 is equal to D(x) times C(x). Now, from this
equation we can solve for C(x), assuming we know what D(x) is. So, let us first compute D(x).
So let us look at the good walks of length 2i, and at the very good walks of length, let us
say, 2 times (i plus 1).

To compute this, we will show that the number of good walks of length 2 times (i minus 1) is
equal to the number of very good walks of length 2i. Why is this so? Let us take a very good
walk of length 2i. Any walk, good or very good, has to start with an upward step, and its
last step should be a downward step. The first one should be upward and the last one should
be downward, and the entire length is going to be, let us say, 2i. The very good walk has the
additional property that it never touches the x axis anywhere in between. So let us take a
very good walk and strip off its first and last moves. What you will get is some good walk of
length 2 times (i minus 1). So, if we look at the set of all very good walks of length 2i and
the set of all good walks of length 2 times (i minus 1), we can have a one to one
correspondence. Take a particular very good walk and strip off its first and last moves, and
you will get something on the other side.

If you take 2 distinct very good walks and strip off their first and last moves, what you get
will be distinct good walks. Further, if you take any good walk and add these first and last
moves, you will get a very good walk. So, every element of GW can be generated from some
element of VGW, and every element of VGW gives rise to a unique element of GW. Therefore,
these sets are in one to one correspondence. That would also mean that the generating
function can be quickly computed. If you denote by di the number of very good walks of length
2i, then di is going to be equal to Ci-1. And therefore, the generating function D(x) is just
going to be equal to x times C(x). So that is the key property. From this, we can write C(x)
minus 1, minus 1 because the split works only for walks of length greater than or equal to 1,
and this is going to be equal to x times C(x) times C(x). We can rewrite this as x times C(x)
square minus C(x) plus 1 equals 0. Think of C(x) as a variable, and if you solve this
quadratic equation in C(x), you will get C(x) is equal to (1 minus under root of
(1 minus 4x)) by 2x.
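
The two facts used here, that di equals Ci-1 and that every good walk splits into a very good walk followed by a good walk, can be checked by enumerating short walks. This is a brute-force sketch only; the names good_walks and is_very_good, and the cutoff N, are assumptions made for the illustration.

```python
from itertools import accumulate, product


def good_walks(n):
    """All walks of 2n steps (+1 or -1) that never go negative and end at 0."""
    result = []
    for steps in product((1, -1), repeat=2 * n):
        heights = list(accumulate(steps))
        if all(h >= 0 for h in heights) and (not heights or heights[-1] == 0):
            result.append(steps)
    return result


def is_very_good(steps):
    """A nonempty good walk that touches the axis only at its start and end."""
    heights = list(accumulate(steps))
    return len(steps) > 0 and all(h > 0 for h in heights[:-1])


N = 6
c = [len(good_walks(n)) for n in range(N)]                           # good walks of length 2n
d = [sum(is_very_good(w) for w in good_walks(n)) for n in range(N)]  # very good walks

# d_i = C_(i-1): stripping the first and last step is a bijection.
assert all(d[i] == c[i - 1] for i in range(1, N))
# C_n = sum over i of d_i * C_(n-i): first return to the axis after 2i steps.
assert all(c[n] == sum(d[i] * c[n - i] for i in range(1, n + 1)) for n in range(1, N))
print(c, d)
```

The second assertion is the coefficient form of C(x) minus 1 equals D(x) times C(x).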

When you solve the quadratic equation, there are 2 possible roots. But only one root will
make sense for this particular generating function, and you can check that it is the root
with the negative sign. Now we have the generating function C(x). From this, we can obtain
the closed form solution. For (1 minus 4x) raise to half, you can use the binomial expansion.
That will have a 1 as the leading term, and it will get subtracted from this 1, and therefore
the constant term vanishes. Once the constant term vanishes, every remaining term has a
factor of x and an even coefficient, so you can divide by 2x, and then you can show that the
coefficient of x raise to n in C(x) is 2n choose n into 1 by n plus 1.

So this is the nth Catalan number. This number has a name, the Catalan number, and we can see
that many other recurrences which give rise to generating functions of this kind give the
Catalan numbers as the answer. We will stop here.
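
The closed form can be compared against the recurrence hidden in the equation C(x) minus 1 equals x times C(x) square, namely C0 equals 1 and Cn equals the summation of Ck times Cn-1-k for k from 0 to n minus 1. A minimal sketch, with the function name catalan_by_recurrence chosen only for this illustration:

```python
from math import comb


def catalan_by_recurrence(count):
    """First `count` Catalan numbers from C(x) - 1 = x * C(x)^2, i.e.
    C_0 = 1 and C_n = sum_{k=0}^{n-1} C_k * C_{n-1-k}."""
    c = [1]
    for n in range(1, count):
        c.append(sum(c[k] * c[n - 1 - k] for k in range(n)))
    return c


c = catalan_by_recurrence(12)
closed_form = [comb(2 * n, n) // (n + 1) for n in range(12)]
assert c == closed_form
print(c)
```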

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 31
Composition of Generating Function

So in earlier classes we had learnt about product of generating functions.

(Refer Slide Time: 0:38)

In particular, if we looked at the generating function A(x) and the generating function B(x),
multiplied them, and called the product C(x), then C(x) is the generating function of the
sequence cn given by summation, i going from 0 to n, of ai bn-i. In other words, if you look
at the sequence obtained by the convolution of 2 sequences, its generating function is going
to be equal to the product of the generating functions of the underlying sequences. We want
to understand more operations on generating functions. In particular, we want to look at 2
generating functions, let us say A(x) and B(x), and make sense of, let us say, A composed
with B. So if you look at A(B(x)), this is a0 plus a1 B(x), where instead of x you now have
B(x), plus a2 B(x) square plus so on.

When does this even make sense? Because now, if you look at the nth term, an B(x) raise to n,
each such term is itself an infinite series. B(x) raise to n is going to be the product of n
copies of B(x), and you are adding infinitely many of them. Does it even make sense? When can
we make sense of these kinds of objects, and what combinatorial objects do they represent?
This is what we want to understand today.

So let us just take some simple examples. Let us say A(x) is equal to the simplest of
generating functions, 1 by (1 minus x). Then A(B(x)) would be equal to 1 by (1 minus B(x)),
and this we can think of as 1 plus B(x) plus B(x) the whole square and so on. What object
does this denote? That is the first thing that we would like to understand. Let us look at
this B(x) a little more carefully. Suppose we call it b0 plus b1 x plus b2 x square and so
on, and let us look at the constant term of A(B(x)). What would it be? Well, A(B(x)) is equal
to 1 plus B(x) plus B square x and so on. The constant term from the first summand is 1. The
constant term from B(x) is going to be b0. The constant term from B square x is going to be
the product of the 2 constant terms, that is, b0 whole square, and so on. If you look at
B^n(x), its constant term is going to be b0 raise to n, and this goes on.

So if this b0 was some nonzero number, then there are going to be infinitely many nonzero
terms in this expression. If you denote the composition by, let us say, H(x), the h0 term
would be this infinite sum. So in order to compute any particular coefficient you might have
to sum up an infinite series, and that may not really be a convergent series. In all our
previous applications, when we were looking at adding generating functions, multiplying
generating functions and so on, the nth term of the resulting generating function could be
computed by doing just finitely many operations. Here that’s not the case, because we might
have to do an infinite number of operations. If you have infinitely many terms to add up,
then that might not converge to any suitable value. But here, if we insist that b0 is equal
to 0, then things fall in place properly.

And not just for the constant term. If you look at the nth term, let us say hn, that is going
to be coming from all these individual summands. But what we can say is this. Look at B^n(x)
and every power that comes after B^n(x). Since the constant term of B(x) is 0, every term of
B^n(x) is going to be multiplied by at least x raise to n. If you take, let us say, the power
n plus 1, every term is going to be multiplied by at least x raise to n plus 1. Because B(x)
can be written as x into (b1 plus something), and when you raise it to the power n plus 1,
you have a factor of x raise to n plus 1. Therefore all the terms in that formal power series
will have x appearing with a power greater than or equal to n plus 1, and they won’t
contribute to the term hn.

So for each coefficient hn, there are only finitely many summands to look at: 1, B(x), and so
on, and the last one you might have to consider is B^n(x). There are only finitely many
generating functions to be considered and added up, and therefore things work out very well
when b0 is equal to 0. So while studying these compositions of generating functions, we will
assume that b0 is 0 for all of them.
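
The finiteness argument can be turned into a small computation on truncated coefficient lists. This is a sketch under stated assumptions: the helper names multiply and compose are made up here, and the sample B(x), taken to be x plus x square plus x cube and so on, is chosen only for illustration and does not come from the lecture.

```python
def multiply(p, q, n_terms):
    """First n_terms coefficients of the product of two power series."""
    return [sum(p[k] * q[i - k] for k in range(i + 1)) for i in range(n_terms)]


def compose(a, b, n_terms):
    """First n_terms coefficients of A(B(x)), assuming b[0] == 0.

    Since B(x)^j starts at x^j, only the powers j = 0 .. n_terms - 1 can
    contribute to the coefficients below x^n_terms.
    """
    if b[0] != 0:
        raise ValueError("composition needs b0 = 0 to be well defined here")
    result = [0] * n_terms
    power = [1] + [0] * (n_terms - 1)  # coefficients of B(x)^0
    for j in range(n_terms):
        for i in range(n_terms):
            result[i] += a[j] * power[i]
        power = multiply(power, b, n_terms)  # move on to B(x)^(j+1)
    return result


# Illustrative input: A(x) = 1/(1 - x) and B(x) = x + x^2 + x^3 + ...
a = [1] * 6
b = [0, 1, 1, 1, 1, 1]
h = compose(a, b, 6)
print(h)
```

For this choice A(B(x)) is (1 minus x) by (1 minus 2x), so the coefficients after the constant term are powers of 2.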

(Refer Slide Time: 7:40)

So now let us understand what this 1 by (1 minus B(x)) looks like. If you call this
generating function, let us say, H(x), what quantity does it denote? Clearly H(x) is equal to
1 plus B(x) plus B square x plus so on. We will describe a combinatorial quantity and show
that this combinatorial quantity or object has generating function H(x). So, let bn denote
the number of ways of forming a combinatorial structure on n elements. You take a set of n
elements and count the total number of ways of forming a particular combinatorial structure
on it. It could be, let us say, graphs or connected graphs on n vertices, it could be trees
on n vertices, whatever combinatorial structure that you can imagine on n elements.

Let us fix some particular combinatorial structure on n elements. The number of ways of
forming it on some set of size n we denote by bn, and B(x) is the generating function of the
sequence given by bn. Now our product rule says the following. B square x is just B(x) times
B(x), and the coefficient of x raise to n in B square x is nothing but the number of ways of
splitting a set of n elements into 2 intervals and forming a structure on each.

Let us give this combinatorial structure a name; we will call it B. So there is a lot of
overloading of notation: the generating function is B(x), the combinatorial object we call by
the name B, and the number of such combinatorial structures that you can form on n elements
we call small bn. So the coefficient of x raise to n in B square x is the number of ways of
splitting a set of n elements into 2 intervals and forming B on these intervals. One of the
intervals could be empty, in which case the other interval would be the full set. This is
what we know by the product rule. A generalization of this would say that the coefficient of
x raise to n in B raise to k x is the number of ways of splitting n elements into k intervals
and forming B on these intervals. Now the particular combinatorial quantity that we want to
count is: how many ways are there to split n elements into nonempty intervals and form B on
each of the intervals? If you take an n element set, the number of ways of doing this we will
denote by hn. The generating function of the sequence hn we denote by capital H(x), and we
will show that this is equal to 1 by (1 minus B(x)). So let us see a more concrete example
wherein we spell out what the particular combinatorial structure B is.

(Refer Slide Time: 13:13)

So here is a concrete combinatorial problem. We have, let us say, soldiers numbered 1 to n,
placed on a straight line from 1 to n, and the general wants to split them into some number
of units. Let's say this is the first unit, this is the second unit, this is the third unit,
and so on, up to, let’s say, k units. So k could be anything from 1 to n. But each unit
should be an interval. You cannot have, let us say, all the even numbered people going into 1
unit and the odd numbered people going into another unit; that is not allowed.

Because the units themselves have to be intervals, 1 to 5 could be one unit, 6 to 7 could be
one unit, and 8 could be in a unit by itself. There could be k units, and you can choose k to
be whatever you want. Once you have split the line into units, you need to pick a captain for
each unit. So the question is: how many ways are there of splitting n soldiers into some
number of units and choosing a captain for each unit? Here bn would denote the number of ways
of selecting a captain for a unit of n soldiers. Note that, since we are leaving the number
of units unspecified, we will have to insist that each of the units is nonempty. Because if
empty units were allowed, then we could have infinitely many such splits, as there is no
bound on the number of empty units.

So each unit will now have to be nonempty, as opposed to the earlier case where we were
splitting into 2 units. When we are splitting into 2 units, we cannot have infinitely many
empty units, because the number of units is bounded. So bn is this, and hn is the number of
ways of doing this splitting. Now we need to argue that H(x) is equal to 1 by (1 minus B(x)).
Why is this so? We will argue it for a general B, and then we will solve our particular
problem using this method and compute the exact value of hn using generating functions. So
let us look at B(x), which is equal to b0 plus b1 x plus so on, plus bn x raise to n, plus so
on. Here we can assume that b0 is equal to 0. The number of ways of selecting a captain for a
unit with 0 soldiers we will just take to be 0; there is no natural meaning for b0, so we can
set it to 0 by convention.

What we have argued so far is that the coefficient of x^n in B(x)^k is equal to the number of ways of splitting n into k parts and forming a combinatorial structure B on each part, where the parts are intervals. So the total number of ways, where the number of parts could be anything, is equal to the coefficient of x^n in B(x), plus the coefficient of x^n in B(x)^2, and so on. That is equal to the coefficient of x^n in B(x) + B(x)^2 + .... So the coefficient of x^n in this expression is h_n, for n >= 1. By convention we take h_0 = 1, and therefore H(x) = 1 + B(x) + B(x)^2 + ..., and this is equal to 1/(1 - B(x)).

(Refer Slide Time: 18:55)

So in our particular case, where we were splitting a group of n soldiers into units, we can say that H(x) = 1/(1 - B(x)), where B(x) is nothing but the generating function of the sequence 1, 2, 3, and so on. This is because a captain for a unit with k soldiers can be selected in k ways. So B(x) is the sum of k x^k, with k going from 0 to infinity, and that is equal to x/(1 - x)^2.

Therefore H(x) = 1/(1 - x/(1 - x)^2). Multiplying numerator and denominator by (1 - x)^2, we get H(x) = (1 - x)^2/((1 - x)^2 - x), which is equal to (1 - 2x + x^2)/(1 - 3x + x^2), and that is equal to 1 + x/(1 - 3x + x^2). On this last term we can do partial fractions. So H(x) = 1 + A/(1 - alpha x) + B/(1 - beta x), where A and B are now constants (not the generating function B(x)), and (1 - alpha x)(1 - beta x) should be equal to 1 - 3x + x^2. Comparing coefficients, that means alpha beta = 1 and alpha + beta = 3. So alpha = 1/beta; that is something we know. Now, what will be the value of A?

So A/(1 - alpha x) + B/(1 - beta x) should be equal to x/(1 - 3x + x^2). If we multiply both sides by 1 - alpha x and put x = 1/alpha, we get the value of A. So A is x/(1 - beta x) evaluated at x = 1/alpha. That is (1/alpha)/(1 - beta/alpha), which is equal to 1/(alpha - beta). Similarly, B will be equal to 1/(beta - alpha). So the entire expression is H(x) = 1 + (1/(alpha - beta)) (1/(1 - alpha x) - 1/(1 - beta x)). From this we can simply write down the nth term. The leading 1 is a constant term, so it only affects h_0; for n >= 1 we can write h_n = (alpha^n - beta^n)/(alpha - beta).

So that is the final formula. If you substitute beta = 1/alpha, what you get is h_n = (alpha^n - alpha^(-n))/(alpha - 1/alpha). Here you can think of alpha as the larger root of 1 - 3x + x^2 = 0, that is, alpha = (3 + sqrt(5))/2.
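
The closed form can be checked against a direct count. The following is a minimal Python sketch, not part of the lecture: the helper names `compositions`, `h_brute` and `h_closed` are made up for illustration. It enumerates every split of n soldiers into consecutive non-empty units, multiplies the unit sizes (one captain per unit), and compares the total with (alpha^n - beta^n)/(alpha - beta) for n >= 1.

```python
from math import sqrt


def compositions(n):
    """Yield each way of cutting 1..n into consecutive non-empty units,
    as a tuple of unit sizes (left to right)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def h_brute(n):
    """Count splits of n soldiers into units, with one captain per unit."""
    total = 0
    for sizes in compositions(n):
        ways = 1
        for size in sizes:
            ways *= size  # a unit of this size has `size` possible captains
        total += ways
    return total


def h_closed(n):
    """Closed form (alpha^n - beta^n) / (alpha - beta), intended for n >= 1."""
    alpha = (3 + sqrt(5)) / 2
    beta = (3 - sqrt(5)) / 2
    # Floating point is used here, so this is only reliable for small n.
    return round((alpha ** n - beta ** n) / (alpha - beta))


for n in range(1, 8):
    print(n, h_brute(n), h_closed(n))
```

The brute-force count grows exponentially in n, so it is only meant as a sanity check for small values.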

(Refer Slide Time: 23:31)

Now we will look at a slightly more general problem. What we did here is, we looked at a collection of n soldiers arranged linearly, we split it into intervals, and on each interval we picked a captain. Now suppose we could do an additional operation on the intervals themselves. For example, for each interval we could decide whether or not to give that unit night duty. How many ways are there to do that? Clearly, you can first split, which gives some number of splits, and for each split you can do additional operations on the units, like choosing some of them for special duty.

(Refer Slide Time: 24:23)

So let us look at that particular problem. This is the combinatorial problem that we are interested in. There are n elements. The first step is to split them into some number of non-empty intervals, and then on each interval we form a particular combinatorial structure, which we will call A. Suppose the split of n into non-empty intervals has k intervals. Then, on the set of k intervals, we form a combinatorial structure B. So you can think of the n elements as the soldiers, and the combinatorial structure A that we were interested in is: pick a captain. If an interval is of size k, there are k ways of doing this. And on the set of intervals we had only the trivial structure initially, which means there is just one way of doing things.

Now suppose we had more ways of forming combinatorial structures on the set of intervals. Say we had split the soldiers into 3 blocks. Earlier we did not have any extra structure on the blocks; we were happy with the split. But now call the units u1, u2 and u3, and say that some of them could be chosen for special duty. Maybe u1 alone is chosen, or maybe u2 alone is chosen; there are 8 ways of doing it. In general, with k units there are 2^k ways of choosing the second combinatorial structure B. So this is the general question that we want to answer, and we will see that if we compose the generating functions for the combinatorial structures A and B in the correct order, we get a generating function for this combinatorial problem.

So this is our theorem. Let a_n be the number of ways of forming A on an n-element set, with a_0 = 0, and let b_n be the number of ways of forming B on an n-element set. Let g_n be the number of ways of forming this composite combinatorial structure involving both A and B; let us call that structure C. Then the generating function G for g_n is equal to B composed with A, that is, G(x) = B(A(x)).

(Refer Slide Time: 28:53)

How do we see that? We could split into any number of intervals. Suppose the split is into k intervals. The generating function for the number of ways of splitting into k intervals and forming A on each of them is A(x)^k. The number of ways of forming B on these k intervals is b_k. So if we restrict to splits into exactly k intervals, the generating function for the number of ways of forming C is b_k A(x)^k.

Therefore the complete generating function for C is the sum over k of b_k A(x)^k. And this is nothing but B composed with A. So that is the proof. Now let us see how this solves our problem of selecting captains and assigning night duty.

(Refer Slide Time: 30:56)

So now the concrete combinatorial problem is this. We have n elements, 1 to n. We need to split them into some number of intervals, and each interval is given a captain. Let's say C1, C2, ..., Ck are the captains. Each interval is additionally told whether or not it will be doing night patrolling. We want the total number of ways of doing this. Here A(x) is the generating function for the number of ways of selecting the captain in a group of n people. The sequence is 1, 2, 3, and so on, so A(x) = x/(1 - x)^2. And B(x) is the generating function for the number of ways of assigning the special duties. Each group can either be given night patrolling duty or not, so for n groups there are 2^n ways, and the corresponding generating function is 1/(1 - 2x). If we denote by G(x) the generating function for the composite structure, which takes into account both A and B, then G(x) is B composed with A: G(x) = 1/(1 - 2x/(1 - x)^2), where instead of x we have put x/(1 - x)^2.

Multiplying numerator and denominator by (1 - x)^2, we get G(x) = (1 - 2x + x^2)/(1 - 4x + x^2), which is equal to 1 + 2x/(1 - 4x + x^2). Again we can use the partial fractions method and show that the coefficients are a sum of two exponentials. If you work out the partial fractions, you get, for n >= 1, g_n = A alpha^n + B beta^n for suitable constants A and B, where 1 - 4x + x^2 = (1 - alpha x)(1 - beta x). So that will be the end of this lecture.
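
The composition result can also be checked numerically. The sketch below is illustrative Python, not from the lecture; the names `compositions`, `g_brute` and `g_series` are assumptions introduced here. It counts the composite structures directly (a captain in every unit, and a night-duty yes/no for every unit) and compares the counts with the coefficients of 1 + 2x/(1 - 4x + x^2), which satisfy c_n = 4 c_(n-1) - c_(n-2) once the constant term is set aside.

```python
def compositions(n):
    """Yield each split of 1..n into consecutive non-empty units (as sizes)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def g_brute(n):
    """Count splits with a captain per unit and a night-duty choice per unit."""
    total = 0
    for sizes in compositions(n):
        ways = 2 ** len(sizes)  # structure B: each unit is on night duty or not
        for size in sizes:
            ways *= size  # structure A: pick a captain inside the unit
        total += ways
    return total


def g_series(n_max):
    """Coefficients g_0..g_{n_max} of 1 + 2x / (1 - 4x + x^2)."""
    c = [0, 2]  # coefficients of 2x / (1 - 4x + x^2)
    while len(c) <= n_max:
        c.append(4 * c[-1] - c[-2])
    g = c[: n_max + 1]
    g[0] = 1  # the leading "1 +" contributes only to the constant term
    return g


print([g_brute(n) for n in range(6)])
print(g_series(5))
```

Both lines should print the same list if the composition formula G(x) = B(A(x)) has been applied correctly.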

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science & Engineering,
Indian Institute of Technology, Guwahati
Lecture 32
Principle of Inclusion Exclusion

In this lecture, we will learn about the Principle of Inclusion Exclusion.

(Refer Slide Time: 00:32)

So, let us start with the following example. Let us say this is the set of students who play football, and this is the set of students who play cricket, and suppose we know their cardinalities. Let us say there are 20 people who play football, 25 people who play cricket, and 7 people who play both football and cricket. We want to find the number of people who play at least one sport among football and cricket. So we basically want to compute the cardinality of a certain set. If we denote the football players by F and the cricket players by C, we want the size of F union C. We know that this is the number of people who play football plus the number of people who play cricket, except that here we have double counted the people who play both sports. So we need to subtract that number.

So the size of F intersection C has to be subtracted. Therefore we get 20 plus 25 minus 7, and that is 38. Now, in this case we had just two games, namely football and cricket. But we could have something more complex. Let us take an example where there are three sports; just call them A, B and C. So this is the set of people who play A, this is the set of people who play B, and the third collection is the people who play C. Now there are intersection regions: the red region is A intersection C, the orange region is A intersection B, the blue region is B intersection C, and the middle region is A intersection B intersection C. As before, we want to compute the size of A union B union C. If we just add the sizes of A, B and C, the elements in the intersection regions are counted multiple times.

For example, the elements in the red intersection region are counted at least twice. The pink region in the middle, which is also part of the red region, is counted three times. All of this has to be accounted for, and the principle of inclusion exclusion gives us a way of doing this accounting in a systematic manner. The correct formula is: |A union B union C| = |A| + |B| + |C| - |A intersection B| - |B intersection C| - |C intersection A| + |A intersection B intersection C|. When we added A, B and C, the pairwise intersections were counted extra, and subtracting the three pairwise intersections removes those extras. But when we subtract in this manner, the pink region, which was counted three times, is also removed three times. So it has to be reintroduced, and that is the term |A intersection B intersection C|. Now we have accounted for everything. In general, when we have n such sets, how do we compute the size of their union? That is what the principle of inclusion exclusion helps us do.

(Refer Slide Time: 05:36)

So, let us look at a small problem and illustrate the principle by means of that example. This is known as the derangement problem, so we need to understand what a derangement is. When we write [n], this denotes the set of numbers from 1 to n, and we are interested in counting the permutations of [n] which satisfy a certain special property: we want the count of permutations which do not have a fixed point. So, what is a fixed point? Let us take an example. The sequence 3, 1, 5, 6, 4, 2 is a permutation of the numbers from 1 to 6. At position 1 we have 3, at position 2 we have 1, at 3 we have 5, at 4 we have 6, at 5 we have 4, and at 6 we have 2. Any permutation can be thought of as a bijective function from [n] to [n].

Now, in this bijective function, there is no point i such that f(i) = i. Let us call this permutation sigma 1. Whereas if we take sigma 2 to be the permutation 3, 1, 5, 4, 6, 2, then, viewing the permutation as a function, sigma 2 of 4 is equal to 4. So 4 is a fixed point, and sigma 2 is not a derangement. A derangement is nothing but a permutation such that i does not appear at the ith position, for any value of i. How many such permutations are there? That is what we need to count.

So, the approach is this. The total number of permutations is n factorial. From this we will remove all the bad permutations. Instead of counting the derangements directly, we look at the complement set and count that; once that count is subtracted from n factorial, we get the number of derangements. So, let us introduce some definitions. For a fixed i, let A_i be the set of permutations which have i as a fixed point. So A_1 is the set of all permutations which start with 1, A_n is the set of all permutations which end with n, and similarly for every other i. The set A_1 union A_2 union ... union A_n is the set of all permutations which are not derangements. Any derangement lies outside this union, and in fact every permutation outside the union of the A_i's is a derangement. So we need to compute the size of A_1 union A_2 union ... union A_n. That is what we will do.

(Refer Slide Time: 10:10)

So, how do we compute the size of A_1 union A_2 union A_3, all the way up to A_n? This is where the principle of inclusion exclusion comes in. Clearly, if we just take the sizes of the A_i's and add them up, we are going to overcount. Here, when we are discussing the size of the union, we need not really think of the A_i's as sets of permutations; you can think of any collection of sets A_1 to A_n.

So |A_1| + |A_2| + ... + |A_n| is going to be an overcount of the union. Here we have included many things, and now we need to exclude the things which have been counted twice. So we subtract |A_1 intersection A_2|, |A_1 intersection A_3|, all the way up to |A_(n-1) intersection A_n|. Note that there are going to be n choose 2 terms here: look at every pair of sets from A_1 to A_n and take their intersection. Everything in such an intersection has been counted twice, once for each A_i, so it has to be removed. But now this may be removing too many things, because there may be elements which lie in the intersection of three sets, and they have to be reintroduced. That gives us the next group of terms, |A_1 intersection A_2 intersection A_3| and so on, and these come with a positive sign, because in the second step we removed too many elements. Some of them we have to reintroduce, and this process keeps on going.

There would be n choose 3 such terms, and finally, with sign (-1)^(n+1), the term |A_1 intersection A_2 intersection ... intersection A_n|, and there is precisely one such term. Now, this is an unwieldy formula; it looks very complicated, so we will try to write it in a much nicer manner. But what are we really doing here, and how do we know that this formula is correct? We can look at each element. Take an element x belonging to A_1 union A_2 union ... union A_n. On the left hand side each such element is counted exactly once, and we need to show that on the right hand side also each element is counted exactly once. Suppose x belongs to exactly k of the sets A_1 to A_n. Only the terms built from those k sets are going to contribute to the count of x on the right hand side.

So, you can see that from the first line there are k contributing terms, from the second line there are k choose 2 terms, and so on. The signs come appropriately: k minus k choose 2 plus k choose 3 minus k choose 4 and so on. To see why, say the element appears in A_(i1), A_(i2), ..., A_(ik). The terms on the right hand side that count it correspond to the subsets of these k sets. There are 2^k subsets, and if you ignore the empty subset, there are 2^k - 1 of them. Their contributions are k choose 1, minus k choose 2, all the way up to (-1)^(k+1) times k choose k, and if you add these up the total is 1. To see this, put a minus 1 in front: minus 1 plus k minus k choose 2 and so on. Up to an overall sign, those are the binomial coefficients in the expansion of (1 - 1)^k.

Since that sum is 0, the other terms must add up to 1. So that is the proof of why the inclusion exclusion principle is correct. Now, what we need to look at is how to write this large expression in a nicer format. What are we really doing here? Every subset of {1, ..., n} except the empty subset appears on the right hand side, so there are 2^n - 1 terms there, whereas the left hand side involves just n sets. That is why the formula looks really complicated.
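
The alternating sum can be written down mechanically. The following is a minimal Python sketch, not part of the lecture (the function name `union_size_by_inclusion_exclusion` and the sample sets are made up for illustration). It adds the sizes of all k-fold intersections with sign (-1)^(k+1) and compares the result with the size of the union computed directly.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets):
    """Size of the union of `sets`, computed as the alternating sum over
    all non-empty sub-collections of the size of their intersection."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = 1 if k % 2 == 1 else -1  # (-1)^(k+1)
        for chosen in combinations(sets, k):
            total += sign * len(set.intersection(*chosen))
    return total


# Football/cricket example: 20 and 25 players, 7 of them in both.
football = set(range(20))
cricket = set(range(13, 38))
print(union_size_by_inclusion_exclusion([football, cricket]))  # 20 + 25 - 7

# A three-set example, compared against the direct union.
a, b, c = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}
print(union_size_by_inclusion_exclusion([a, b, c]), len(a | b | c))
```

The loop visits all 2^n - 1 non-empty sub-collections, so this is a check of the formula rather than an efficient way to compute a union.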

(Refer Slide Time: 16:20)

So, we will try to write it in a simpler way. What we will do is organize the terms by subsets. The size of A_1 union A_2 union ... union A_n is basically a sum over subsets S of [n]. We associate each term on the right hand side with a subset of {1, ..., n}, and we ignore the subset S equal to the empty set. For example, the term A_1 intersection A_2 corresponds to the subset {1, 2}. The term A_2 intersection A_3 intersection A_7 appears somewhere on the third row with a plus sign, and it corresponds to the subset {2, 3, 7}. Note that whenever we consider a subset of size 2 the sign is negative, and whenever we consider a subset of size 3 the associated sign is plus.

So there is going to be a factor (-1)^(|S| + 1); you could also write (-1)^(|S| - 1), it does not really matter. Only the parity counts: |S| + 1 and |S| - 1 are either both even or both odd. This factor multiplies the number of elements in the intersection, that is, the size of the intersection of A_i over i belonging to S. So the earlier expression that we had written for the inclusion exclusion principle becomes much simpler once we choose the right notation: |A_1 union ... union A_n| is the sum, over non-empty subsets S of [n], of (-1)^(|S| + 1) times |intersection of A_i over i in S|. Now, this again is a sum over 2^n - 1 terms. For some purposes we could also enumerate the subsets by size, and then we get a double summation. That is, we first look at all subsets of size 1, then all subsets of size 2, and so on, and when we do that the factor (-1)^(|S| + 1) becomes a fixed quantity within each group.

So this would be a sum over k going from 1 to n, where k is the size of the subset, of (-1)^(k + 1) times an inner summation. The inner summation is again over subsets S of [n], but now with the additional constraint that the size of S is equal to k, and the summand is the size of the intersection of A_i over i belonging to S. This is a much more succinct expression than the previous one; the meaning is the same, and the intuition is also the same. So this is the general principle of inclusion and exclusion. Now we can look at our specific case, where the A_i's were sets of permutations. What we need to look at is the innermost term of this expression, the intersection of the A_i's. What is the size of the intersection of A_i over i belonging to S?

For example, if S is the set {2, 3, 7}, then this corresponds to A_2 intersection A_3 intersection A_7, and we need the size of this set. Now, what really is this set? It is just all those permutations in which 2 appears at the second position, 3 appears at the third position and 7 appears at the seventh position. If there are n numbers in all, the other n - 3 positions can be filled arbitrarily with the remaining n - 3 numbers. So the size is (n - 3) factorial, and this depends only on the size of S. It does not depend on which elements are in S. For example, if instead of S we had taken another set T = {1, 2, 8}, that would have given exactly the same count. So the size of the intersection of A_i over i belonging to S is just (n - |S|) factorial, and we need to sum this up over all possible sets S.

(Refer Slide Time: 21:49)

The total number of subsets of size k is n choose k. The size of the union of A_i over i belonging to [n] is what we need to determine. This is equal to the sum, for k going from 1 to n, of (-1)^(k + 1) times n choose k times (n - k) factorial: there are n choose k subsets of size k, and for each of those subsets the size of the intersection is (n - k) factorial.

Writing n choose k as n!/((n - k)! k!) and multiplying by (n - k)!, each term simplifies to n!/k!. So the sum is, for k going from 1 to n, (-1)^(k + 1) n!/k!. Written out, this is n!/1! - n!/2! + n!/3! - n!/4! and so on, going on till (-1)^(n + 1) n!/n!. This is the number of bad permutations, in the sense that these are the permutations we need to avoid. So the total number of derangements is just n factorial minus this number. If we denote the number of derangements by D_n, then, writing n factorial as n!/0!, we get D_n = n!/0! - n!/1! + n!/2! - n!/3! and so on, all the way up to (-1)^n n!/n!. So, that concludes the discussion on inclusion and exclusion.
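
As a quick check of the derangement formula, here is a small Python sketch (illustrative only; the function names are assumptions, not from the lecture). It evaluates the alternating sum of n!/k! and compares it with a brute-force count of permutations that have no fixed point.

```python
from itertools import permutations
from math import factorial


def derangements_formula(n):
    """D_n = sum over k = 0..n of (-1)^k * n!/k!."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def derangements_brute(n):
    """Count permutations p of 0..n-1 with p[i] != i for every position i."""
    return sum(
        1
        for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )


for n in range(1, 8):
    print(n, derangements_formula(n), derangements_brute(n))
```

The brute-force version uses positions 0 to n-1 instead of 1 to n, which does not change the count.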

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering,
Indian Institute of Technology Guwahati
Lecture 33
Rook placement problem

(Refer Slide Time: 00:27)

In this lecture, we will see how to place rooks on an n cross n chessboard. Let us look at this problem carefully; it uses many of the techniques that we have learnt so far. We will use the principle of inclusion exclusion, the method of generating functions, and so on. So, let us start with an example. This is a 3 by 5 chessboard, and the problem that we begin our discussion with is: place 3 rooks on a 3 by 5 chessboard. A rook is a piece in chess, and its property is that if you place a rook in a particular position, every position in that row and that column is under attack by that rook. So you cannot place any other piece along that row or column. We want to place the rooks on the chessboard such that the 3 rooks are non-attacking. One way of placing them is shown here. You can see that no rook is under attack from any other rook.

So, what is the total number of ways of doing it? Note that the maximum number of rooks that we can place is 3. There are only 3 rows, and if you place more than 3 rooks, some row is going to contain more than one rook, and those rooks will attack each other. So our objective is to count the number of ways of placing non-attacking rooks on a board. Now, what is a board? The simplest board is an n cross m board, which has n rows and m columns.

But the board could be much more complicated than that. For example, the red region that we are marking out is also a board. We could look at the problem of placing rooks only in the squares that make up that board. Now, how do we count this? Given a board, that is, a subset of the squares of an n cross n chessboard, we want the number of ways of placing non-attacking rooks on it, and we will learn three general methods for counting this. One is based on decomposition, another is a recursive way of counting, and the third is based on looking at the complementary board. These three methods are what we will look at in this lecture.

(Refer Slide Time: 04:22)

So, let us first look at the simplest case, where the board is an n cross m board. We will assume that n is less than or equal to m. We need to place k rooks, where k is less than or equal to n, and the number of ways to do this is what we need to find. The k rooks occupy k different rows. Suppose those rows are fixed, say the first k rows. When it is a regular rectangular board, with all positions allowed, the rook in the first row can be placed in m ways. The rook in the second row can go in any column other than the one used by the first rook.

So the red squares are gone, and the other columns you can use. The number of ways of placing the second rook is m - 1, and so on, all the way down to m - k + 1 for the kth rook. So for a fixed set of k rows the count is the falling factorial m(m - 1)...(m - k + 1), and if k is n, this is the complete answer. For k less than n we can additionally choose which k of the n rows are used, in n choose k ways, so the total is n choose k times m(m - 1)...(m - k + 1). That is the number of ways of placing k rooks on a regular rectangular board. We will introduce some notation. B is a board, and by a board we mean a set of allowed positions from an n cross m chessboard. Given a particular collection of allowed positions, the number of ways of placing k non-attacking rooks on that board is denoted by r_k(B). Whenever a rook is placed, you cannot place anything else along the same column or along the same row. Now look at the sequence r_0(B), r_1(B), r_2(B) and so on. The kth term gives the number of ways in which you can place k rooks on the board B.

If you consider this sequence, it is essentially a finite sequence, because the number of allowed positions is surely an upper bound on the number of rooks: all the terms after that are 0. The generating function for r_k(B), k >= 0, is called the rook polynomial. So, look at the number of ways of placing k rooks on a board B. If you vary k, you get a sequence, and the generating function of that sequence is r_0(B) + r_1(B) x + r_2(B) x^2 and so on, up to r_n(B) x^n. This generating function is called the rook polynomial.

(Refer Slide Time: 08:29)

Let us do a couple of examples. Look at this particular board: there are 4 positions, arranged as a 2 by 2 square, and let us say all the positions are allowed. For the rook polynomial, r_0(B) is going to be 1, because there is only one way in which you do not place anything. And r_1(B) is going to be 4; in fact, for any board, r_1(B), the number of ways of placing 1 rook, is equal to the size of the board, that is, the number of allowed positions. For r_2(B), if you place a rook at the top left corner, then the only place where you can put a second rook is diagonally opposite.

So this is one way, and the other way of placing them is indicated by the red placements. There are only two ways, and therefore r_2(B) = 2. We call this board B, we denote the rook polynomial by R, and the variable is x. So R(x, B) = 1 + 4x + 2x^2. Now take a slightly different board, namely a board with three allowed positions; call it B1. Now r_0(B1) is equal to 1. And r_1(B1) is 3, because there are three positions and you can keep a single rook on any of them. For r_2(B1), the only way of placing 2 rooks is on the available diagonal, so there is only one way of doing it. So R(x, B1) = 1 + 3x + x^2. The first thing that we will look at is how to compute the rook polynomial.
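
For small boards the rook polynomial can be found by brute force, which gives a way to check the two examples. The sketch below is illustrative Python, not from the lecture: a board is represented as a set of (row, column) pairs of allowed squares, and the function name `rook_polynomial_brute` is made up here. It tries every k-subset of allowed squares and keeps those with no two squares in the same row or column.

```python
from itertools import combinations


def rook_polynomial_brute(board):
    """Return [r_0, r_1, ...] for a board given as a set of (row, col) squares."""
    cells = sorted(board)
    coeffs = []
    for k in range(len(cells) + 1):
        count = 0
        for chosen in combinations(cells, k):
            rows = {r for r, _ in chosen}
            cols = {c for _, c in chosen}
            # Non-attacking means all rows distinct and all columns distinct.
            if len(rows) == k and len(cols) == k:
                count += 1
        coeffs.append(count)
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()  # drop the trailing zero coefficients
    return coeffs


full_2x2 = {(0, 0), (0, 1), (1, 0), (1, 1)}
three_squares = {(0, 0), (0, 1), (1, 0)}  # the 2 by 2 board with one corner removed
print(rook_polynomial_brute(full_2x2))       # coefficients of 1 + 4x + 2x^2
print(rook_polynomial_brute(three_squares))  # coefficients of 1 + 3x + x^2
```

This enumeration is exponential in the number of allowed squares, so it is only useful for checking small cases by hand.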

We could of course compute the values of r_k(B) for all possible values of k, and from that write down the rook polynomial. But the purpose of the rook polynomial is basically to do the reverse: can it help us figure out the number of possible non-attacking rook configurations? So we will look at ways to generate the rook polynomial directly, and once we have it we will read off the coefficients. Let us say we have a board. We will think of it as an n cross m board in which certain positions are marked as not allowed; all the other positions are the allowed positions. Call this board B. We are interested in calculating r_k(B). To do so, we will compute R(x, B), the rook polynomial of this board, and once we have computed the rook polynomial, from that we can derive r_k(B).

The first method is based on recursion. We identify one particular position in the board; let us say this is a position s. Every placement of rooks on this board either has a rook on s or does not have a rook on s. Those are the two possibilities. So, let us draw the board in red. The red outline indicates the allowed positions; the marked positions, the crossed ones, are the forbidden positions. We are interested in the number of ways of placing k rooks on this board. Now mark out one particular square, namely s, and classify the placements by whether a rook is placed at s or not. There are placements which involve a rook at position s, and there are others where there is no rook at s. So we will construct two sub-boards. The first one we consider is the board where we ignore the position s.

That board we will call B prime. So, B prime is the board without position s in it, and it
will have its own rook polynomial, R(x, B prime). Alternatively, we could have a rook at
position s. Suppose we place a rook at s. Then everything in the row and column of s is gone,
because these positions are ruled out if we have a rook on s. Let us draw the remaining board
separately. That will be our board B_s. So B_s is the board obtained by removing the row and
column containing s. And that will also have a rook polynomial; let us call that R(x, B_s). Our
objective was to compute R(x, B). But let us first look at the number of ways of placing k
rooks on this board B. Now, rk(B) is going to be rk(B prime) plus rk minus 1(B_s). Let us see
why this recursive formula is true.

The number of ways of placing k rooks on the board B splits into two cases. Either you do not
put anything on position s and place all k rooks on the remaining positions, that is, on B
prime; or you put 1 rook on position s, and then put the other k minus 1 rooks on the remaining
board B_s. So that is k minus 1 in B_s and 1 in s. This justifies the recursive formula. Now,
from this recursive formula we can derive an expression for the rook polynomial of B.

(Refer Slide Time: 17:35)

So, what we know is rk(B) is equal to rk(B prime) plus rk minus 1 (B_s). We can multiply this
expression by x raise to k, and we get rk(B) x raise to k is equal to rk(B prime) x raise to k
plus rk minus 1 (B_s) x raise to k. Now, you can sum this over all possible values of k
starting from 1. We do not want to take k equals 0, because r minus 1 does not make sense. So
summation k equals 1 to infinity rk(B) x raise to k is equal to summation k equals 1 to
infinity rk(B prime) x raise to k plus summation k equals 1 to infinity rk minus 1 (B_s) x
raise to k. Putting j equals k minus 1, this last term is equal to summation j equals 0 to
infinity rj (B_s) x raise to j plus 1.

And the left hand side is almost the rook polynomial of B; just the first term r0 is missing.
So, we can write it as R(x, B) minus r0(B), and r0 is always 1. On the right hand side, the
first term is similarly the rook polynomial of B prime minus 1, that is, R(x, B prime) minus 1.
The second term is x times summation j equals 0 to infinity rj (B_s) x raise to j, and this sum
is the rook polynomial of B_s. So we have R(x, B) minus 1 is equal to R(x, B prime) minus 1
plus x times R(x, B_s). The minus 1 on the two sides cancels, and we can write this as R(x, B)
is equal to R(x, B prime) plus x times R(x, B_s). What we have managed is this: in order to
calculate the rook polynomial of B, we can write it in terms of the rook polynomial of a board
with one less position, and another board which has one row and one column less.

So, we write the rook polynomial of B in terms of the rook polynomials of smaller boards. We
can do this recursively and compute the rook polynomial of the entire board. But this could be
very cumbersome if there are a lot of positions. So this can be useful for certain kinds of
boards, and we will see other methods as well.
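The recursion R(x, B) = R(x, B prime) + x R(x, B_s) can be sketched directly in Python. This is only an illustrative sketch: the function name `rook_poly`, the representation of a board as a set of (row, column) cells, and the choice of s as the smallest cell are assumptions, not something fixed by the lecture.

```python
def rook_poly(cells):
    """Coefficients [r0, r1, ...] of R(x, B), via R(x, B) = R(x, B') + x * R(x, B_s)."""
    cells = frozenset(cells)
    if not cells:
        return [1]  # empty board: only the empty placement
    s = min(cells)  # any allowed square may play the role of s
    b_prime = cells - {s}  # board without the position s
    # board with the whole row and column of s removed
    b_s = frozenset((r, c) for r, c in cells if r != s[0] and c != s[1])
    without_s = rook_poly(b_prime)
    with_s = [0] + rook_poly(b_s)  # prepending 0 multiplies the polynomial by x
    out = [0] * max(len(without_s), len(with_s))
    for i, v in enumerate(without_s):
        out[i] += v
    for i, v in enumerate(with_s):
        out[i] += v
    return out

full_3x3 = [(r, c) for r in range(3) for c in range(3)]
print(rook_poly(full_3x3))  # [1, 9, 18, 6]
```

Each call removes at least one cell, so the recursion terminates, but the number of calls can grow exponentially with the number of cells, which matches the remark that this method becomes cumbersome on large boards.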

(Refer Slide Time: 20:37)

The next method we will see is called decomposition. So, let us look at a special kind of
board. Let us say our board contains these positions. So this is 6 positions. So, this is
the board with 9 positions. Let us call this B, and we need to find out the number of ways of
placing rooks on this board. We know that it is sufficient to find the rook polynomial of this
board. So, what is R(x, B)? Now, note that this board is in some sense peculiar: it naturally
breaks into two parts. What does it mean to break into two parts, or decompose? The formal
definition is that the board can be split into two parts such that rooks in one part do not
attack rooks in the other part. For example, if I place a rook here, and a rook anywhere in the
other part, they can never be attacking. So if you can break down the board into constituent
parts such that no square in one part shares a row or a column with any square in the other
part, then we will call that a decomposition.

Why decomposition is helpful to us is because of the following fact. If the board decomposes
into parts, the rook polynomial of the board is the product of the rook polynomials of the
constituents. So, let us call this part B1, and this part B2. We will see that R(x, B) is going
to be equal to R(x, B1) into R(x, B2). The proof is simple. If you look at the definition of
the rook polynomial, R(x, B) is equal to r0(B) plus r1(B) times x plus r2(B) times x square and
so on. Similarly, the numbers of ways of placing k rooks in B1 are the coefficients rk(B1) of
the rook polynomial of B1, and likewise rk(B2) for B2. So, we ask ourselves this question: how
do we place k rooks in B? We could think of it as 0 rooks in B1 and k in B2, or 1 in B1 and k
minus 1 in B2, or 2 in B1 and k minus 2 in B2, all the way up to k in B1 and 0 in B2. All of
these are going to be different arrangements, because they have different numbers of rooks in
B1 and B2. And notice that we can do the B1 part and the B2 part independently.

This is because any placement of rooks in B1 is not going to attack any position in B2. And
this list is exhaustive because, if you have placed k rooks in B, there has to be some number
of rooks in B1 and some number of rooks in B2, and their sum should be k. Therefore the total
number of ways is going to be equal to r0(B1) into rk(B2) plus r1(B1) into rk minus 1 (B2), all
the way up to rk(B1) into r0(B2). And this is just the convolution of the two sequences: the
sequence corresponding to B1, that is r0(B1), r1(B1), up to rk(B1), and the sequence
corresponding to B2.

If you take the convolution of the two, you will get the kth term for B. So, rk(B) is
summation i going from 0 to k of ri (B1) times rk minus i (B2). If the terms have this
particular property, then we know that the generating function, here the rook polynomial, is
just the product of the constituent generating functions. So, this is a very useful thing if
the board can be naturally decomposed into parts. And it applies even if you have more parts.
If you had a board with a third part B3, say three more positions here, again the rook
polynomial will be just the product of the rook polynomials of the constituents.
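The product rule can be checked numerically on a small example. The sketch below is illustrative only: the two parts are assumed layouts (a full 2 cross 2 block and three cells in an L shape placed in disjoint rows and columns), and `rook_numbers` and `poly_mul` are hypothetical helper names.

```python
from itertools import combinations

def rook_numbers(cells):
    """Brute force: r_k = number of ways to place k non-attacking rooks on `cells`."""
    cells = list(cells)
    counts = [1]
    for k in range(1, len(cells) + 1):
        ways = sum(
            1
            for chosen in combinations(cells, k)
            if len({r for r, _ in chosen}) == k and len({c for _, c in chosen}) == k
        )
        if ways == 0:
            break
        counts.append(ways)
    return counts

def poly_mul(p, q):
    """Multiply coefficient lists; entry k is the convolution sum of p[i] * q[k - i]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# Assumed parts: they share no row and no column, so the board decomposes.
part1 = [(0, 0), (0, 1), (1, 0), (1, 1)]
part2 = [(2, 2), (2, 3), (3, 2)]
whole = part1 + part2

print(poly_mul(rook_numbers(part1), rook_numbers(part2)))  # [1, 7, 15, 10, 2]
print(rook_numbers(whole))                                 # [1, 7, 15, 10, 2]
```

The two printed lists agree, as the convolution argument predicts: (1 + 4x + 2x^2)(1 + 3x + x^2) = 1 + 7x + 15x^2 + 10x^3 + 2x^4.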

(Refer Slide Time: 26:33)

The third technique that we will see is called taking the complementary board. Here we will
use the principle of inclusion exclusion. In general, if the number of allowed positions
becomes too large, then decomposition is going to become difficult and the recursive
formulation is also going to be difficult, because there are too many positions to handle. But
if the board has a lot of allowed positions and its complement has significantly fewer, then we
can use the complementary board technique. For example, take a board with all these positions
and very few missing positions. If you have a board of this form, you can see that the
complementary board is a very small one. It has just three allowed positions. So, we will
denote a board by B and, for this discussion, B prime denotes the complementary board. The
complementary board contains all the positions which are absent in the current board. We can
look at rook placements in the complementary board as well; it is just another board.

Let us look at how this technique is used. We will introduce some notation. Let A be the set
of all rook placements on the full board, for example if this was an n cross m board. Here we
do not look at the forbidden positions; we allow rooks in all positions. The only requirement
is that they should be non-attacking. So, A is the set of all possible rook placements without
considering what is allowed and what is not allowed.

So, we need to think of a rectangular bounding box for the board, and the set of all possible
placements in that box is what we will call A. And Ai will denote the set of placements in
which the rook in the ith column sits on a forbidden square. Look at any particular placement.
Suppose we had placed rooks in the following manner: one here, one here and one here. This is a
rook placement on the bounding box, but it is not a placement which we want to include in our
count, because two of its rooks are on forbidden positions, so it is not a valid placement on
B. So, consider the orange positions as a placement of rooks. If we consider A4 and A7, this is
a placement which belongs to the set A4 and to the set A7, because in the fourth column there
is a forbidden square on which there is a rook, and in the seventh column also there is a
forbidden square on which there is a rook.

The total number of rook placements is easy to count. We can denote it by P(m, n). This is the
number of ways of placing n non-attacking rooks on a full n cross m board, that is, one rook in
each of the n rows, using n of the m columns. We can assume that m is at least n. If m is less
than n, then we cannot place n rooks, and the count will automatically be 0. So, the size of A
is just going to be P(m, n). Now, we need to find the number of placements of n rooks on B. And
that is the same as finding the size of the complement of A1 union A2 up to union Am. The union
A1 up to Am consists of all placements in which some forbidden square is used. If you take the
complement of that, none of the rooks will be on a forbidden square, and the size of this set
is what we need. That is where we will use the inclusion exclusion principle. We know the size
of the set A. If you compute the size of A1 union up to Am and subtract it from the size of A,
you will get the number of ways of placing n rooks on the board B. So, how do we compute these
things?

(Refer Slide Time: 32:59)

So, we will be using the inclusion exclusion principle. While using it, the terms that we need
to estimate are the sizes of intersections such as A i1 intersection A i2 up to intersection
A ik. The size of this set, summed up over all choices of k columns i1, i2, up to ik, is what
we need. Let us denote this sum by N(k). If you can find the value of N(k), then the count that
we want is just the size of A, plus or minus these N(k)'s, by means of inclusion exclusion.

So, look at the inner term, A i1 intersection A i2 up to intersection A ik. This is the set of
rook placements such that a forbidden square is used in each of the columns i1, i2, up to ik.
Now, if you consider the complementary board, these forbidden squares are exactly its allowed
positions. So, we are placing k rooks on the complementary board in the columns i1 to ik. For
this particular choice of i1, i2, up to ik, let us say there are t ways of doing that. The
remaining n minus k rooks can then be placed in P(m minus k, n minus k) ways, because once k
rooks have been placed, we have taken away k rows and k columns, and the remaining m minus k
columns and n minus k rows can be used freely. So the total number of ways of placing n rooks
with forbidden squares used in columns i1 to ik is t times P(m minus k, n minus k).

So, there are t into P(m minus k, n minus k) such placements, where t is the number of ways of
placing k rooks on the complementary board in the columns i1, i2, up to ik. This is for one
particular choice of columns. If you sum t over all possible i1, i2, up to ik, that gives you
the total number of ways of placing k rooks on the complementary board, which is rk(B prime).
So, we can write N(k) as P(m minus k, n minus k), which is the common factor, times rk(B
prime). And therefore rn(B) is P(m, n), the total number of placements, minus N(1) plus N(2)
and so on, by inclusion exclusion. That is, rn(B) is equal to P(m, n) minus P(m minus 1, n
minus 1) times r1(B prime) plus P(m minus 2, n minus 2) times r2(B prime) minus and so on, with
alternating signs. So, if we can compute the rook polynomial of the complementary board, that
is, the numbers of ways of placing rooks on the complementary board, then from those counts we
can compute rn(B) by using the principle of inclusion exclusion. We will stop here.
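The complementary board formula can be compared against direct enumeration on a small case. This is a sketch under stated assumptions: the board has n rows and m columns with n at most m, cells are (row, column) pairs, the forbidden squares chosen below are an arbitrary example, and all function names are hypothetical.

```python
from itertools import combinations, permutations
from math import perm

def rook_numbers(cells):
    """Brute force: r_k = number of ways to place k non-attacking rooks on `cells`."""
    cells = list(cells)
    counts = [1]
    for k in range(1, len(cells) + 1):
        ways = sum(
            1
            for chosen in combinations(cells, k)
            if len({r for r, _ in chosen}) == k and len({c for _, c in chosen}) == k
        )
        if ways == 0:
            break
        counts.append(ways)
    return counts

def count_via_complement(n, m, forbidden):
    """r_n(B) = sum over k of (-1)^k * P(m-k, n-k) * r_k(B'), B' = forbidden squares."""
    r_comp = rook_numbers(forbidden)
    return sum((-1) ** k * perm(m - k, n - k) * rk for k, rk in enumerate(r_comp))

def count_brute_force(n, m, forbidden):
    """Place one rook per row in distinct columns and reject forbidden squares."""
    forbidden = set(forbidden)
    return sum(
        1
        for cols in permutations(range(m), n)
        if all((row, col) not in forbidden for row, col in enumerate(cols))
    )

forbidden = [(0, 0), (1, 1), (2, 2)]          # assumed example on a 3 x 4 board
print(count_via_complement(3, 4, forbidden))  # 11
print(count_brute_force(3, 4, forbidden))     # 11
```

Here the complementary board has rook numbers 1, 3, 3, 1, so the formula gives 24 minus 3 into 6 plus 3 into 2 minus 1, which is 11. `math.perm` needs Python 3.8 or later.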

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering,
Indian Institute of Technology, Guwahati
Lecture 34
Solution of Congruences

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the fifth lecture on number
theory.

(Refer Slide Time: 00:40)

So let us begin with a theorem. Let's call it theorem 5.1. Let’s say, GCD of a and m is 1. Then
the congruence ax equals b mod m has a solution x1. Moreover, all solutions are given by x
equals x1 plus jm where, j is an integer. In other words, there is a solution x1 and every other
solution is congruent to x1 modulo m. So that is what the theorem states.

(Refer Slide Time: 01:51)

So let us prove the theorem. The theorem follows from the generalization of Fermat's
theorem. Let us say x1 is a power phi of m minus 1, times b, where phi of m is the totient of
m. Let us plug x1 into the congruence and check whether it is indeed a solution. The congruence
says a x1 is congruent to b mod m. We have a times x1 is a times a power phi of m minus 1 times
b, which is a power phi of m times b. By the generalization of Fermat's theorem, a power phi of
m is 1 mod m, since GCD of a and m is 1. So a x1 is congruent to 1 into b, which is b mod m. So
the congruence is satisfied. That is, if you plug in x1 equal to a power phi of m minus 1 times
b, the congruence holds good. So x1 is indeed a solution of the congruence.

(Refer Slide Time: 03:26)

Let x equal x1 plus jm, for some integer j. Plugging this x in, we find that ax equals a x1
plus a jm, and since a jm is a multiple of m, this is congruent to a x1, which is congruent to
b mod m. So x is also a solution. So, all numbers that are congruent to x1 modulo m are
solutions. But are these the only solutions?

(Refer Slide Time: 4:21)

Suppose y is some solution. If y is a solution, then ay minus a x1 is congruent to b minus b,
which is congruent to 0 mod m. This is because y is a solution, so ay is congruent to b mod m,
and x1 is a solution, so a x1 is congruent to b mod m. Therefore, when you subtract, we have a
times y minus x1 is 0 mod m. Or in other words, m divides a times y minus x1. But since a and m
are relatively prime, they do not have a common factor. Therefore it must be that m divides y
minus x1, which means y is congruent to x1 modulo m.

(Refer Slide Time: 05:26)

Or in other words, y is equal to x1 plus jm, for some j, which is an integer. Therefore we
know that every solution of the congruence is congruent to x1 modulo m, and these are the
only solutions. So let us consider an example.

(Refer Slide Time: 05:48)

Let us say this is what we want to solve: 353 x is congruent to 254 mod 400. 353 is a prime,
so GCD of (353, 400) is 1, and we also know that phi of 400 is 160. We will see a closed form
expression for phi later on, and from that we will be able to calculate phi. But a manual
verification will in any case show that phi of 400 is 160.

(Refer Slide Time: 06:42)

So consider 353 to the power of 159, that is phi of 400 minus 1, times 254. As the theorem
shows, x1 equal to a power phi of m minus 1 times b is a solution. Therefore, this would be a
solution. But how much would this be?

(Refer Slide Time: 07:20)

353 to the power of 159 times 254 is what we have to calculate, of course, mod 400. First, let
us calculate 353 power 159. This can be written as 353 times 353 square the whole power 79. But
353 is minus 47 mod 400, so modulo 400 this can be written as minus 47 into 47 square the whole
power 79. But 47 square, as you can verify, is 2209. Therefore, 47 squared is 209 mod 400. So
modulo 400 you can write this as minus 47 into 209 the whole power 79.

(Refer Slide Time: 08:35)

That is, minus 47 into 209 into 209 square the whole power 39. But then 209 square is 81
mod 400. Therefore this is minus 47 into 209 into 81 power 39, which is minus 47 into 209 into
81 into 81 square the whole power 19. But 81 square is 161 mod 400: 80 squared is 0 mod 400, 2
into 80 into 1 is 160, and plus 1 is 161. So 161 power 19 is what we need to find next, all mod
400. These are all congruences. Taking one 161 out, we have 161 power 18, which is 321 power 9,
because 161 square is 321 mod 400. Taking one 321 out, we have 321 power 8. Since 321 squared
is 241 mod 400, we can write this as 241 power 4.

(Refer Slide Time: 10:45)

But 241 square is 240 square plus 480 plus 1. 240 square is of course 0 mod 400, so we need
not consider it, and we have 480 plus 1, which is 81 again. So 241 power 4 is 81 squared, which
we have already seen is 161. Collecting everything, the expression is now minus 47 into 209
into 81 into 161 into 321 into 161. We have already seen that 161 squared is congruent to 321,
and we know that 321 squared is congruent to 241. Computing 241 into 81, 240 into 80 is 0 mod
400, so we need not count that, and the rest is 240 plus 80 plus 1, so this is 321 once again.
What remains is minus 47 into 209 into 321, and 209 into 321 can be written as 200 plus 9 into
320 plus 1. Here 200 into 320 is a multiple of 400, therefore that does not feature in the
answer, and 9 into 1 is 9. So we have 9 into 320 plus 200 plus 9, which comes to 289 mod 400.
So the expression now reduces to minus 47 into 289.

(Refer Slide Time: 13:13)

289 is 300 minus 11, so that into minus 47 is minus 14100 plus 517, since 300 into 47 is 14100
and 11 into 47 is 517. All this is modulo 400. Now 14100 is 100 mod 400, so minus 14100 is
minus 100, and 517 is 117 mod 400. So this is minus 100 plus 117, which is 17 mod 400, and that
is 353 power 159. But what we need to find is 353 power 159 into 254, which is 17 into 254 mod
400, and that, you can verify, is 318 mod 400.
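The hand computation above repeatedly halves the exponent and squares the base. A minimal Python sketch of that square-and-multiply idea follows; the function name `power_mod` is hypothetical, and Python's built-in three-argument `pow` does the same job.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring (exponent >= 0)."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                 # odd exponent: take one factor of base out
            result = result * base % modulus
        base = base * base % modulus     # square the base, as with 47^2, 209^2, 81^2, ...
        exponent >>= 1                   # halve the exponent, as with 159, 79, 39, ...
    return result

print(power_mod(353, 159, 400))               # 17
print(power_mod(353, 159, 400) * 254 % 400)   # 318
```

Only about log2(159) squarings are needed, so the intermediate numbers never exceed 400 squared.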

(Refer Slide Time: 14:39)

So, 318 is a solution for 353 x congruent to 254 mod 400, and we know that all solutions to
this congruence are congruent to 318 mod 400. In other words, in the interval 0 to 399, 318 is
the only solution.
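The method of theorem 5.1 can be sketched in a few lines of Python. This is an illustrative sketch, not an efficient implementation: the totient is computed by counting, which is the "manual verification" mentioned above, and the names `totient` and `solve_coprime` are hypothetical.

```python
from math import gcd

def totient(m):
    """phi(m): how many integers in 1..m are relatively prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def solve_coprime(a, b, m):
    """Solution of a*x = b (mod m) in 0..m-1, assuming gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("this method needs gcd(a, m) = 1")
    # x1 = a^(phi(m) - 1) * b, reduced mod m
    return pow(a, totient(m) - 1, m) * b % m

print(totient(400))                  # 160
print(solve_coprime(353, 254, 400))  # 318
print(353 * 318 % 400)               # 254
```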

(Refer Slide Time: 15:30)

The other solutions are obtained by moving forward and backward in steps of the modulus m.
So, 718, 1118, 1518 are all solutions. Moving backwards, 318 minus 400 is minus 82, and then
minus 482, minus 882; these are also solutions. So there is an infinite number of solutions,
but all solutions are congruent to 318 modulo 400. So this is one way of finding solutions for
congruences of the form ax congruent to b mod m, with GCD of (a, m) equal to 1. But it involves
computing powers of this form, which is a long and tedious process. So this is not exactly
practical.

(Refer Slide Time: 16:40)

We will see an easier method later in the lecture. Now, let us see another theorem, which is
called Wilson's theorem. What Wilson's theorem asserts is this: if p is a prime, then p minus 1
factorial is minus 1 mod p, or in other words p minus 1 factorial plus 1 is divisible by p. So
let us prove this theorem.

(Refer Slide Time: 17:30)

The proof goes like this. We assume that p is a prime. Then for any a in the range 1 to p
minus 1, GCD of (a, p) is equal to 1; a and p do not have any common factor, since a prime does
not have a common factor with any positive integer less than it. So ax congruent to 1 mod p has
exactly one solution in the interval 0 to p minus 1. This is what we have seen in theorem 5.1;
here we have taken b as 1 and m as p.

(Refer Slide Time: 18:49)

So consider the integers 1 to p minus 1. In this sequence we have a, and also the solution x
of ax congruent to 1 mod p; x cannot be 0, so it also lies in 1 to p minus 1. So we can think
of a and x as being paired off, so that ax is 1 mod p. But then could a and x be the same? If a
and x are the same, we have a squared congruent to 1 mod p. There are two a's which satisfy
this. 1 squared is 1 mod p, and p minus 1 the whole square is congruent to minus 1 the whole
square mod p, which is 1 mod p. So we find that 1 and p minus 1 pair with themselves. That is,
1 into 1 is 1 mod p, and p minus 1 into p minus 1 is 1 mod p.

(Refer Slide Time: 20:01)

But what about the remaining numbers? Consider the numbers in the range 2 to p minus 2. If a
is in this range, then a minus 1 is at least 1 and at most p minus 3, so GCD of (a minus 1, p)
is 1, as p is a prime. Similarly, GCD of (a plus 1, p) is also 1, since a plus 1 is at least 3
and can be at most p minus 1. Therefore, a minus 1 and a plus 1 are both relatively prime to p.
So p divides neither a minus 1 nor a plus 1, and since p is a prime, p does not divide their
product, which is a squared minus 1. That means a squared minus 1 is not congruent to 0 mod p,
or a squared is not congruent to 1 mod p. Therefore no a within the range 2 to p minus 2 will
have this property. Here we were seeking all the a's so that a squared is 1 mod p. We found
that 1 squared is 1 mod p and p minus 1 the whole squared is also 1 mod p. So 1 and p minus 1
are the only ones that pair with themselves.

(Refer Slide Time: 21:48)

But if you consider the other numbers in the range 2 to p minus 2, we find that these do not
pair with themselves. So they pair with some other number: any a here will pair with an x not
equal to a, so that ax is 1 mod p. Now let us consider the product p minus 1 factorial.
Consider the integers in the range 2 to p minus 2. We know that all of them pair with each
other: for every a in this range, there is an x within this range that is not equal to a, so
that ax is 1 mod p. Therefore the product of all these together is 1 mod p. Therefore the
entire product, which is p minus 1 factorial, can be written as 1 into p minus 1 mod p.

(Refer Slide Time: 23:06)

Or in other words, p minus 1 factorial is p minus 1 mod p. But p minus 1 is minus 1 mod p.
Moving minus 1 to the other side, we have p minus 1 factorial plus 1 is 0 mod p, or p divides p
minus 1 factorial plus 1, which is what the theorem asserts. Wilson's theorem asserts that p
minus 1 factorial is minus 1 mod p.
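Both the statement of Wilson's theorem and the pairing used in its proof can be checked for small primes. The sketch below is illustrative; `is_prime` and `inverse_pairs` are hypothetical helper names, and `pow(a, -1, p)` (available from Python 3.8) is used to find the x with ax congruent to 1 mod p.

```python
from math import factorial

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def inverse_pairs(p):
    """Pairs (a, x) with a < x in 2..p-2 and a*x = 1 (mod p), for a prime p."""
    pairs = []
    for a in range(2, p - 1):
        x = pow(a, -1, p)  # the unique x in 1..p-1 with a*x = 1 (mod p)
        if a < x:
            pairs.append((a, x))
    return pairs

# (p - 1)! is congruent to -1, that is p - 1, modulo every prime p tested
for p in range(2, 60):
    if is_prime(p):
        assert factorial(p - 1) % p == p - 1

print(inverse_pairs(11))  # [(2, 6), (3, 4), (5, 9), (7, 8)]
```

For p equal to 11 the numbers 2 to 9 split into four pairs whose products are each 1 mod 11, leaving only 1 and 10 unpaired, so 10 factorial is 1 into 10, which is minus 1 mod 11.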

(Refer Slide Time: 23:53)

Now, let us talk about solutions of congruences. Suppose f of x is a polynomial with integer
coefficients. Let's say u and v are integers. If f of u is congruent to 0 mod m and u is
congruent to v mod m, you can readily verify that f of v is congruent to 0 mod m.

(Refer Slide Time: 24:56)

Therefore, when we talk about solutions of the congruence f of x congruent to 0 mod m, we
treat such u and v as indistinguishable. They are congruent to each other mod m, so we
essentially regard them as the same solution and do not count them as separate solutions
modulo m. For example, consider x squared minus x plus 4 congruent to 0 mod, let us say, 10.
Its solutions are 3, 8, 13, 18, 23, 28, etc. For example, substituting 3 in this we have 9
minus 3 plus 4, which is 10. This is 0 mod 10. Substituting 8 here we have 64 minus 8 plus 4,
which is 60, which is again 0 mod 10. So, these are all solutions. And you can also verify that
if x is a solution then x plus 10 is also a solution.

(Refer Slide Time: 26:23)

But then, 3, 13, 23 are all congruent to 3 mod 10. Similarly 8, 18, 28 are all congruent to 8
mod 10. Therefore, we do not consider them distinct solutions; we say that we have only two
solutions in 0 to 9. That is, we have only two solutions modulo 10 for this congruence.
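Counting solutions modulo m amounts to testing one representative from each residue class. A small sketch, assuming the polynomial is given as a list of integer coefficients with the constant term first (the name `solutions_mod` is hypothetical):

```python
def solutions_mod(coeffs, m):
    """All x in 0..m-1 with f(x) = 0 (mod m); coeffs[i] is the coefficient of x**i."""
    def f(x):
        return sum(c * x ** i for i, c in enumerate(coeffs))
    return [x for x in range(m) if f(x) % m == 0]

# f(x) = x^2 - x + 4, modulo 10
print(solutions_mod([4, -1, 1], 10))  # [3, 8]
```

So the congruence has exactly two solutions modulo 10, represented by 3 and 8.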

(Refer Slide Time: 26:56)

In general, if S is a complete residue system modulo m, then every u such that f of u is
congruent to 0 mod m and u belongs to S is what we consider as a solution. So the size of this
set is the number of solutions modulo m of f of x congruent to 0 modulo m, and we say that this
congruence has that many solutions. So within the CRS, we consider all u that satisfy the
congruence, and the number of them is the number of solutions that we say this congruence has.

(Refer Slide Time: 28:22)

Consider a polynomial of this form. Say j is the largest integer so that aj is not equal to
0; then j is the degree of this polynomial.

(Refer Slide Time: 29:04)

So let us consider the congruences of degree 1, that is, congruences of the form ax congruent
to b mod m. By the first theorem of today, we showed that this has a solution, and in fact a
unique solution in the interval 0 to m minus 1, if GCD of (a, m) is equal to 1.

(Refer Slide Time: 29:50)

But now, let us assume that GCD of (a, m) is small g, which is not equal to 1. So a and m
have common factors. Suppose x is a solution of the congruence ax congruent to b mod m. As ax
is congruent to b modulo m, ax must be b plus mz for some integer z. Now g is the GCD of a and
m, so g divides a and g divides m. Therefore, g must divide b too, since b is ax minus mz. In
other words, if after finding the GCD of a and m we find that g does not divide b, then there
is no solution.

Once again, the argument is this: if GCD of (a, m) is g, which is not equal to 1, and the
congruence has a solution x, then ax must be mz plus b for some integer z. Since g divides a
and g divides m, g must divide b too. Conversely, if g does not divide b, there can be no
solution.

(Refer Slide Time: 31:36)

So let us assume that g divides b. If g divides b, then ax congruent to b mod m can be
simplified. g is a common factor of a, b and m; in fact it is the GCD of a and m. So dividing
by g, I can rewrite this congruence in this fashion: a by g times x is congruent to b by g mod
m by g, by the theorem that we saw in the previous lecture. So, it is enough to solve this
congruence.

(Refer Slide Time: 32:25)

To solve this congruence, first let us consider a by g times x congruent to 1 mod m by g. GCD
of a and m is g, so when you divide both a and m by g, the resulting numbers are relatively
prime to each other; that is, GCD of (a by g, m by g) is 1. So this congruence has exactly one
solution in the interval 0 to m by g minus 1. Say x0 is that solution, so x0 is a solution of
a by g times x congruent to 1 mod m by g.

(Refer Slide Time: 33:37)

Then let us consider x0 times b by g, and substitute this in the reduced congruence, which is
a by g times x congruent to b by g mod m by g. Since x0 is a solution of a by g times x
congruent to 1 mod m by g, we know that a by g times x0 is 1 mod m by g. So a by g times x0
times b by g is congruent to b by g mod m by g. Therefore x0 times b by g is a solution of a by
g times x congruent to b by g mod m by g.

(Refer Slide Time: 34:54)

Reduced modulo m by g, this is the only solution of that congruence in 0 to m by g minus 1,
and every solution of that congruence is of the form x0 times b by g plus t times m by g, for
an integer t. Consider all integers of this form in the interval 0 to m minus 1. They are all
solutions of ax congruent to b mod m, and there are g of them. So ax congruent to b mod m has
multiple solutions in the range 0 to m minus 1, and all the solutions can be found this way.
That is, when the GCD of a and m is not 1 and it divides b, there are multiple solutions for
this congruence in the interval 0 to m minus 1. So let us consider our previous example once
again.

(Refer Slide Time: 35:58)

There we wanted to solve 353 x congruent to 254 mod 400. We know that GCD of 353 and 400 is
1; 353 is in fact a prime. Therefore 1 can be expressed as a linear combination of 353 and 400
using Euclid's algorithm. We find that 1 is 17 into 353 minus 15 into 400. Taking modulo 400 on
both sides, we have that 17 is a solution for 353 x congruent to 1 mod 400. In other words, 17
is the only solution for this congruence in 0 to 399. But what we want is a solution for 353
times x congruent to 254 mod 400 instead of 1. So we multiply this solution by 254.

(Refer Slide Time: 37:28)

That gives 17 into 254, which is 4318. This is congruent to 318 mod 400. We find that 318 is a
solution for the original congruence 353 x congruent to 254 modulo 400. So here we have managed
to find the solution without resorting to large exponents.
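Combining the Euclid step just used with the reduction by g = GCD(a, m) described earlier in this lecture, the whole method for ax congruent to b mod m can be sketched as follows. The function names are hypothetical, and the sketch assumes a and m are positive integers.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b (extended Euclid)."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def solve_linear_congruence(a, b, m):
    """All solutions of a*x = b (mod m) in 0..m-1; empty list if there are none."""
    g, s, _ = extended_gcd(a, m)
    if b % g != 0:
        return []                     # g does not divide b: no solution
    m_g = m // g
    x0 = s % m_g                      # solves (a/g)*x = 1 (mod m/g)
    base = x0 * (b // g) % m_g        # solves (a/g)*x = b/g (mod m/g)
    return [base + t * m_g for t in range(g)]

print(solve_linear_congruence(353, 254, 400))  # [318]
```

For 353 and 400 the extended Euclid step yields the coefficient 17, matching the linear combination above, and the function returns the single solution 318. When g is larger than 1 and divides b, the returned list has g entries spaced m by g apart.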

(Refer Slide Time: 38:05)

Let us consider one more example. Let us say we want to solve the congruence 15 x congruent
to 25 mod 35. This is of the form ax congruent to b mod m, where a and m are not relatively
prime. GCD of 15 and 35 is 5, and 5 divides 25. Therefore, it is enough to solve the congruence
obtained by dividing this by 5. a by g is 3, b by g is 5 and m by g is 7. So let us solve 3x
congruent to 5 mod 7.

(Refer Slide Time: 39:00)

To solve this, first consider, 3x is congruent to 1 mod 7. So let us solve this first. GCD of 3
and 7 is 1. Therefore 1 can be expressed as a linear combination of 3 and 7. So 1 is 7 plus 3
into minus 2. Therefore minus 2 is a solution for 3x congruent to 1 mod 7. This 7 can be
ignored since we are taking mod 7 on both sides. So we have 3 into minus 2 equals 1 mod 7.
So minus 2 is a solution for this. But, minus 2 is the same as 5 mod 7.

(Refer Slide Time: 40:00)

Therefore, 5 is the unique solution of 3x congruent to 1 mod 7 in the interval 0 to 6. But
what we need is a solution not for this congruence but for 3x congruent to 5 mod 7; this is
what we want to solve. Since 5 is a solution for 3x congruent to 1 mod 7, 5 into 5, which is
25, is a solution for 3x congruent to 5 mod 7. But 25 is 4 mod 7, as it is 21 plus 4. Therefore
4 is a solution for the congruence 3x congruent to 5 mod 7. Of course, substituting 4 here, you
can readily verify that 3 into 4 is 12, which is 5 mod 7. So it is indeed a solution.

(Refer Slide Time: 41:16)

So we have found a solution for 3x is 5 mod 7. But we want the solutions for 15x is 25 mod
35. Since we want the solutions mod 35, we would like solutions in the range 0 to 34, both
inclusive. We know that 4 is the unique solution for 3x is 5 mod 7 in 0 to 6. But then 4
plus t into 7, where t belongs to Z, the set of integers, is also a solution. So any integer of the
form 4 plus 7t is a solution. So 4 is a solution, 4 plus 7, 11 is a solution, 11 plus 7, 18 is a
solution, 25 is also a solution, and 32 is a solution. These are the solutions in 0 to 34, so these
are all the solutions for 15x is 25 mod 35. Thus we find all solutions within the interval 0 to
34. That is it from the lecture. Hope to see you in the next. Thank you.
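
The method used in this example (divide through by the GCD, invert modulo the reduced modulus, then step by the reduced modulus) can be sketched in Python as follows. This is only an illustrative sketch and not part of the lecture: the function name solve_linear_congruence is made up here, and it assumes Python 3.8 or later, where pow(a, -1, m) returns a modular inverse.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """Return all x in 0..m-1 with a*x congruent to b (mod m), in increasing order."""
    g = gcd(a, m)
    if b % g != 0:
        return []                      # no solution unless g divides b
    a_r, b_r, m_r = a // g, b // g, m // g
    inv = pow(a_r, -1, m_r)            # inverse of a/g modulo m/g (Python 3.8+)
    x0 = (inv * b_r) % m_r             # unique solution in 0..m/g - 1
    return [x0 + t * m_r for t in range(g)]

print(solve_linear_congruence(15, 25, 35))   # [4, 11, 18, 25, 32]
```

The list has g entries, one for each t from 0 to g minus 1, which matches the five solutions found above.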

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering,
Indian Institute of Technology Guwahati
Lecture 35:
Chinese Remainder Theorem

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the sixth lecture on number
theory.

(Refer Slide Time: 00:40)

Today we study the Chinese remainder theorem. The Chinese remainder theorem is one of the
oldest mathematical theorems. It has been known since ancient times. The first record of this
problem is found in a Chinese treatise on mathematics, the Sunzi Suanjing, dated to the 3rd to
5th century of the Common Era. Only a statement of the problem is found there. The
first known algorithmic solution is due to the Indian mathematician Aryabhata, who lived in the
6th century of the Common Era. The Indian mathematician Brahmagupta, who lived almost a
century later, is also known to have been aware of the problem as well as its solution.

(Refer Slide Time: 1:49)

The Chinese treatise in which the problem first appears states it thus. There are certain things
whose number is unknown, that is, there is an unknown variable. Count them by threes, 2
are left. Count them by fives, 3 are left. Count them by sevens, 2 are left. How many are there?

(Refer Slide Time: 02:50)

In other words, let us say we have an unknown integer x. What we know is that x mod 3 is 2,
x mod 5 is 3 and x mod 7 is 2. What is x? This is the question that the Chinese mathematician
posed. Basically we have to solve these congruences simultaneously, that is, we need a
simultaneous solution of a set of congruences.
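
For a puzzle this small, one can simply try every candidate. The following is a brute-force sketch in Python, added here only as an illustration; it searches the range 0 to 104, since 3 into 5 into 7 is 105.

```python
# Try every x in 0..104 and keep those leaving the stated remainders.
solutions = [x for x in range(3 * 5 * 7)
             if x % 3 == 2 and x % 5 == 3 and x % 7 == 2]
print(solutions)   # [23]
```

So 23 is the only answer below 105, and every other answer differs from it by a multiple of 105.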

(Refer Slide Time: 03:41)

So now let us look at the general statement of the problem. Let m1 through mr be
positive integers that are pair wise coprime. Let a1 through ar be integers.
We have a set of congruences, r congruences to be precise. The ith congruence says that x is
congruent to ai mod mi.

(Refer Slide Time: 04:47)

Then the Chinese remainder theorem says that, in this context, the congruences have a
simultaneous solution, and any two such solutions are congruent modulo m, where m
is the product of m1 through mr. So that is what the Chinese remainder theorem says. To
repeat, we have a set of r positive integers m1 through mr which are
pair wise coprime, which means the GCD of any pair is 1. Also given are a1 through ar,
which are r integers, and we have r congruences, x is congruent to ai mod mi.
In this context, what the theorem says is that these congruences do have simultaneous
solutions and any two solutions are congruent modulo m, where m is the product of m1
through mr.

(Refer Slide Time: 06:03)

So let us see a proof of this. Small m is the product of m1 through mr. Let us define capital Mj
as small m divided by small mj, which means capital Mj is the product of m1 through mj-1 and
mj+1 through mr, that is, we take the product of all the m's except mj.

(Refer Slide Time: 06:44)

Then what would be the GCD of capital Mj and small mj? That is, we are seeking the GCD of the
product m1 through mj-1, mj+1 through mr, and mj. Now what we know is that the
mj's are all pair wise coprime. So mj is coprime with m1, m2, etc, each of the other m's, which
means mj is coprime with their product, which means we have 1 as the
GCD of these. In other words, capital Mj and small mj are coprime, or relatively prime;
coprime is another word for it.

(Refer Slide Time: 07:46)

Let us consider this congruence now. capital Mj x is 1 mod small mj. So here we know that
Mj and small mj are coprime, that is what we have just shown. Therefore by the discussion
that we had in the previous classes we know that this congruence has a unique solution in 0 to
mj minus 1. So there is a bj within this range which is a solution for this.

(Refer Slide Time: 08:40)

So consider that solution, that unique solution in the range 0 to mj minus 1. Let us denote it
by Mj inverse, what we find is that Mj multiplied by Mj inverse is 1 mod small mj. So that is
why we call it the inverse. Mj multiplied by Mj inverse is 1 mod small mj.

(Refer Slide Time: 09:15)

But then Mj multiplied by Mj inverse can be written as m1 through mj-1, mj+1 through mr,
that is all the small m's except mj, multiplied by Mj inverse. Therefore, Mj Mj inverse is 0 mod
mi when i is not equal to j, because mi features in this product when i is not equal to j.
Therefore mi divides this factor, and so it divides the entire product. Therefore Mj Mj inverse
is divisible by mi when i is not equal to j.

(Refer Slide Time: 10:22)

So this naturally suggests that we should try an x0 of this form. Take the sum, for j
varying from 1 to r, of Mj Mj inverse aj. So this is what x0 is. Now x0 can be written as M1 M1
inverse a1 plus M2 M2 inverse a2 and so on through Mr Mr inverse ar. This we find is congruent
to M1 M1 inverse a1 mod small m1. That is because M2 is a multiple of small m1, M3 is a
multiple of small m1, and so on, and similarly Mr is also a multiple of small m1. Therefore the
entire quantity within the brackets will go to 0. But then M1 multiplied by M1 inverse, we know,
is 1 mod m1. Therefore this is congruent to a1 mod m1. In other words, x0 is congruent to a1
mod m1.

(Refer Slide Time: 11:53)

x0 can also be written as M2 M2 inverse a2 plus M1 M1 inverse a1 plus M3 M3 inverse a3 plus
all the way to Mr Mr inverse ar, that is the remaining terms. If I take the congruence of this
modulo m2, we find that the quantity within the bracket again goes to 0, because small m2
divides capital M1, capital M3, etc. Therefore, the quantity within the bracket is 0, and M2 M2
inverse is 1 mod m2. Therefore this is a2 mod m2. So continuing like this we find that for
every i with 1 less than or equal to i less than or equal to r, x0 is congruent to ai mod mi. That
proves one part of the theorem. So x0 is indeed a simultaneous solution.

So looking back at the theorem, we know that the theorem says the congruences do have
simultaneous solutions, so we have found one simultaneous solution, and then the rest of the
theorem says that any two solutions are congruent modulo m where small m is m1 through
mr. So let us prove that now.

(Refer Slide Time: 13:45)

If x0 and x1 are both solutions both simultaneous solutions of the system, then for every i, 1
less than or equal to i less than or equal to r, we know that x0 is ai mod mi, and x1 is ai mod
mi, which means x1 minus x0 is 0 mod mi.

(Refer Slide Time: 14:36)

In other words, mi divides x0 minus x1, for every i. Since every mi divides x0 minus x1,
the least common multiple of m1 through mr must also divide x0 minus x1. But then what is
the LCM of m1 through mr? This is nothing but m, which is the product of m1 through mr,
because mi and mj are coprime with each other for any i not equal to j. So the LCM of
these is nothing but m. So what we have is that m divides x0 minus x1. In other words, x1 is
congruent to x0 mod m, and that is precisely what the theorem says: any
two solutions are congruent to each other modulo m. So that completes the proof of the
theorem.
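
The construction in the proof translates almost line by line into code. The following Python sketch is an illustration only: the function name crt and its argument order are assumptions made here, and it relies on Python 3.8 or later for math.prod and for pow with exponent minus 1.

```python
from math import gcd, prod

def crt(residues, moduli):
    """Return (x0, m) where x0 in 0..m-1 satisfies x0 = a_j (mod m_j) for all j.

    The moduli must be pairwise coprime; m is their product.
    """
    r = len(moduli)
    for i in range(r):
        for j in range(i + 1, r):
            if gcd(moduli[i], moduli[j]) != 1:
                raise ValueError("moduli must be pairwise coprime")
    m = prod(moduli)
    x0 = 0
    for a_j, m_j in zip(residues, moduli):
        M_j = m // m_j                    # product of all moduli except m_j
        M_j_inv = pow(M_j, -1, m_j)       # inverse of M_j modulo m_j
        x0 += M_j * M_j_inv * a_j
    return x0 % m, m

print(crt([2, 3, 2], [3, 5, 7]))   # (23, 105)
```

Applied to the puzzle from the Chinese treatise, it returns 23 together with the modulus 105; every other solution is 23 plus a multiple of 105.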

(Refer Slide Time: 15:50)

So now, let us work out an example. Let us say we have congruences of this form. x is
congruent to a1 mod 11, x is also congruent to a2 mod 16, x is congruent to a3 mod 21, x is
congruent to a4 mod 25. 11 is a prime. 16 is 2 power 4. 21 is 3 into 7, so 3 power 1 into 7
power 1. 25 is 5 power 2. No prime is shared between any two of them, so they are all
relatively prime. That is, 11, 16, 21, 25 are all pair wise coprime. So these are respectively
m1, m2, m3 and m4 of the theorem. Here r is 4. So we are considering a problem of size 4.

(Refer Slide Time: 17:12)

Now, small m is defined as the product of this, 11 into 16 into 21 into 25, which is 92400.
Then capital M1 would be small m divided by small m1, which is 92400, divided by 11 which
is 8400. M2 is small m divided by small m2 which is 5775.

(Refer Slide Time: 17:55)

M3 is small m divided by 21, 92400 divided by 21, this is small m3, which is 4400 and
capital M4 is small m divided by small m4 which is 92400 divided by 25, which is 3696. So
we have to now find the inverses of M1, M2, M3 and M4, modulo small m1, small m2, small
m3 and small m4 respectively. So let us try to find those inverses.

(Refer Slide Time: 18:47)

First we have to solve this: M1 x is 1 mod small m1, that is, we have to find the inverse of
capital M1 with respect to small m1, that is 8400x is congruent to 1 mod 11. Now 8400 is 7
mod 11, because 8400 is 8393 plus 7 and 8393 is divisible by 11: taking alternate digits, 8 plus
9 is 17 and 3 plus 3 is 6, and 17 minus 6 is 11, which is divisible by 11. So this congruence
can be written as 7x is 1 mod 11. So we only need to find the inverse of 7 with respect to 11;
that would also be the inverse of 8400 with respect to 11.

(Refer Slide Time: 19:44)

So this is what we have to solve, 7x is 1 mod 11. Of course you could use Euclid’s
algorithm for doing this. If we use Euclid’s algorithm on 11 and 7, we find that 4 is 11 minus
7, then 3 is 7 minus 4, and 1 is 4 minus 3, which is 4 minus (7 minus 4), which is 2 into 4
minus 7. But 4 is 11 minus 7. So 1 is 2 into (11 minus 7) minus 7, which is 2 into 11 minus 3
into 7. If you take modulo 11 on both sides of the equation, we find that 1 is congruent to
minus 3 into 7. We want the solution for 7x is 1 mod 11, so we find that minus 3 is a
solution. But minus 3 is 8 mod 11. So 8 is a solution as well.
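
The back-substitution above is what the extended form of Euclid's algorithm does mechanically. A minimal Python sketch follows; the names extended_gcd and inverse_mod are chosen here for illustration and do not come from the lecture.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def inverse_mod(a, m):
    """Return the inverse of a modulo m in 0..m-1, if gcd(a, m) is 1."""
    g, s, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return s % m

print(extended_gcd(11, 7))   # (1, 2, -3), that is 1 = 2*11 - 3*7
print(inverse_mod(7, 11))    # 8
```

The coefficients 2 and minus 3 are the same ones obtained by hand, and reducing minus 3 modulo 11 gives 8.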

(Refer Slide Time: 21:06)

Of course an easier way of solving would be to count the multiples of 7: 0, 7, 14, 21, 28, 35,
42, 49 and 56. 56 is 55 plus 1, which is 1 mod 11, and 56 is 7 into 8. So 7 inverse mod 11 is 8.

(Refer Slide Time: 21:37)

Then we have to solve the congruence M2 x is 1 mod m2, or in other words 5775x is
congruent to 1 mod 16. 5760 is a multiple of 16, and 5775 is 5760 plus 15, so 5775 is 15 mod
16. So it is 15x congruent to 1 mod 16. When you run through the multiplication table for 15,
you find that 0, 15, 30, etc are none of them 1 mod 16, until you come to 225. 225 is 224 plus
1, and 224 is 14 into 16. Since 225 is 15 into 15, we find that 15 is the inverse of 15 mod 16.
So the second solution is 15.

(Refer Slide Time: 22:38)

And the 3rd one is M3 x is congruent to 1 mod small m3. Capital M3 is 4400, so this is
4400x is 1 mod 21. 4400 is 4200 plus 200, where 4200 is a multiple of 21, and 200 is 189 plus
11, where 189 is 9 into 21. So 4400 is 11 mod 21, and we have 11x is 1 mod 21. As you can
readily see, 22 is 1 mod 21, so x equal to 2 is a solution, which means 11 inverse, which is
also 4400 inverse, is 2 mod 21. So that is the 3rd solution.

(Refer Slide Time: 23:34)

And then coming to the 4th one, M4 x is 1 mod small m4, which is 3696x is 1 mod 25. 3696
is 3700 minus 4, and 3700 is a multiple of 25, so 3696 is minus 4, which is 21 mod 25. So
21x is 1 mod 25 is what we have to solve. Running through the multiples of 21, when we come
to 126, which is 21 into 6, we find that it is 1 mod 25. So x equal to 6 is the solution, the 4th
solution.

(Refer Slide Time: 24:30)

So we have now the 4 solutions: M1 inverse is 8, M2 inverse is 15, M3 inverse is 2, M4 inverse
is 6. So a solution then would be M1 M1 inverse a1 plus M2 M2 inverse a2 plus M3 M3 inverse
a3 plus M4 M4 inverse a4, which means 8400 into 8 into a1, plus 5775 into 15 into a2, plus 4400
into 2 into a3, plus 3696 into 6 into a4. The whole of this modulo 92400 is the solution that we
want. I have deliberately avoided choosing a1, a2, a3, a4 to show that the computation remains
the same irrespective of these values. So whatever a1, a2, a3, a4 are, the solution will take on
this form. Now we only have to plug in a1, a2, a3, a4.

(Refer Slide Time: 26:11)

So let us assume that a1 equal to 1, a2 equal to 2, a3 equal to 3, a4 equal to 4. If this is the case,
plugging in these values, we find that the solution x0 is 78354. Verifying, we find that 78354
mod 11 is 1, 78354 mod 16 is 2, 78354 mod 21 is 3, mod 25 is 4. So this is indeed the
solution that we seek.
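
The arithmetic of this example is easy to check by machine. The short Python sketch below is added only as a check; it takes the four inverses exactly as found in the lecture rather than recomputing them, and the variable names are chosen here.

```python
moduli   = [11, 16, 21, 25]
inverses = [8, 15, 2, 6]          # the inverses found in the lecture
residues = [1, 2, 3, 4]           # a1..a4

m = 11 * 16 * 21 * 25
big_m = [m // m_j for m_j in moduli]          # capital M1..M4

# Each capital Mj times its inverse must be 1 modulo small mj.
inverse_checks = [(M_j * inv) % m_j
                  for M_j, inv, m_j in zip(big_m, inverses, moduli)]

x0 = sum(M_j * inv * a_j
         for M_j, inv, a_j in zip(big_m, inverses, residues)) % m

print(m, big_m)                        # 92400 [8400, 5775, 4400, 3696]
print(inverse_checks)                  # [1, 1, 1, 1]
print(x0, [x0 % m_j for m_j in moduli])   # 78354 [1, 2, 3, 4]
```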

(Refer Slide Time: 27:12)

And then, 78354 plus j into 92400, for any integer j, is the general solution.

So let us consider another example now. This is a familiar problem. Let us say we want to
solve 353x is congruent to 254 modulo 400. We have seen two ways of solving this before, so
this is a 3rd way. We can convert this into simultaneous congruences in this manner: 353x is
congruent to 254 mod 16, and 353x is congruent to 254 mod 25, separately. This is because 16
into 25 is 400, and 16 and 25 are relatively prime to each other.

(Refer Slide Time: 28:27)

But then, 352 is 320 plus 32, so that is a multiple of 16. So 353x is congruent to x mod 16,
which is congruent to 254 mod 16. But then 254 is 240 plus 14, and 240 is a multiple of 16. So
we have 14 mod 16. So the first congruence, namely 353x is congruent to 254 mod 16, reduces
to x congruent to 14 mod 16. So this is one congruence that we have.

(Refer Slide Time: 29:17)

The other congruence, namely 353x is congruent to 254 mod 25, can be simplified like this.
353x is 3x mod 25, since 353 is 350 plus 3, and 254 is 250 plus 4, so that is 4 mod 25. So this
congruence simplifies to 3x congruent to 4 mod 25. But this is not in the desired form, because
we would have liked it to be in the form x congruent to a2 mod m2. Here we have 3x on the
left hand side.

(Refer Slide Time: 30:01)

So let us solve 3x is 1 mod 25 first. Consider the numbers that are a multiple of 25 plus 1. For
example, consider 26; 3 does not divide 26. But 3 divides 51, which is 2 into 25 plus 1, which
means 3 into 17 is 1 mod 25. So 17 is a solution for this. So we have now a solution for 3x is 1
mod 25. But we are looking at the congruence 3x is 4 mod 25. So if 17 is a solution for
3x is 1 mod 25, then 17 into 4, which is 68, is a solution for 3x is 4 mod 25.

(Refer Slide Time: 31:07)

In other words, x congruent to 68 mod 25 is a solution, or x congruent to 18 mod 25. Now this
is in the desired form, the x congruent to a2 mod m2 form. So this is our second congruence.

(Refer Slide Time: 31:34)

So now putting the two congruences together, we have x congruent to 14 mod 16 and x
congruent to 18 mod 25. So now we can apply the Chinese remainder theorem. Here m1 is 16,
m2 is 25, so small m is 400. M1 is 400 by 16, which is 25. M2 is 400 by 25, which is 16. a1 is
14, a2 is 18.

(Refer Slide Time: 32:19)

So now we have to solve M1 x is 1 mod small m1, which means we have to find the inverse of
capital M1 with respect to small m1, or 25 x is 1 mod 16. So taking the multiples of 25, we
find that 225 is 1 mod 16. But, 225 is 25 into 9. So 9 is a solution, so 9 is 25 inverse mod 16.

(Refer Slide Time: 32:55)

The other congruence we have to solve is M2 x is 1 mod small m2, which is 16 x is 1 mod 25.
Counting through the multiples of 16, we find that 16 into 11 is 176, which is 175 plus 1 so, 1
mod 25, which means 11 is 16 inverse mod 25. So we now have the inverses.

(Refer Slide Time: 33:37)

The solution would be M1 M1 inverse a1 plus M2 M2 inverse a2, mod m, which means 25
into 9 into 14, plus 16 into 11 into 18, mod 400. This works out to 6318 mod 400, which is 318
mod 400. So this is the only solution in the range 0 to 399. So that is about the Chinese
remainder theorem. That is it from this lecture. Hope to see you in the next. Thank You.
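
This 3rd way of solving 353x congruent to 254 mod 400 can be retraced in a few lines of Python. The sketch below is illustrative only: the helper name solve_unit_congruence is made up here, it assumes the coefficient is coprime to the modulus, and it needs Python 3.8 or later for pow with exponent minus 1.

```python
def solve_unit_congruence(a, b, m):
    """Solve a*x = b (mod m) for x in 0..m-1, assuming gcd(a, m) is 1."""
    return (pow(a, -1, m) * b) % m

# Split modulo 16 and modulo 25, and bring each part to the form x = a (mod m).
a1 = solve_unit_congruence(353 % 16, 254 % 16, 16)   # x = 14 (mod 16)
a2 = solve_unit_congruence(353 % 25, 254 % 25, 25)   # x = 18 (mod 25)

# Recombine with the Chinese remainder theorem: M1 = 25, M2 = 16.
M1, M2 = 400 // 16, 400 // 25
x = (M1 * pow(M1, -1, 16) * a1 + M2 * pow(M2, -1, 25) * a2) % 400

print(a1, a2, x)          # 14 18 318
print((353 * x) % 400)    # 254
```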

Discrete Mathematics
Professor Sajith Gopalan
Professor Benny George
Department of Computer Science and Engineering,
Indian Institute of Technology Guwahati
Lecture 36
Totient; Congruence, Floor and Ceiling Functions

Welcome to the NPTEL MOOC on Discrete Mathematics. This is the seventh lecture on
number theory.

(Refer Slide Time: 00:36)

First we will consider a theorem regarding Euler's phi function. If m and n are relatively
prime, then, phi of mn is equal to phi of m into phi of n. For any 2 positive integers, m and n,
that are relatively prime to each other, phi of mn is phi of m into phi of n.

(Refer Slide Time: 1:23)

So let us prove this. Let us say A, B, C are three sets; they are reduced residue systems
modulo m, n and mn respectively. Then the size of A is phi of m, the size of B is phi of n and
the size of C is phi of mn.

(Refer Slide Time: 02:04)

Let us say x is some member of C, the reduced residue system modulo mn. Then the GCD of x
and mn is 1 by definition. So there is no common factor between x
and mn; therefore the GCD of x and m should be 1 and the GCD of x and n should be 1 too,
which means x is r mod m and x is s mod n, for unique r and s, where r belongs to A and s
belongs to B. If x is relatively prime to m and x is relatively prime to n, then x is congruent
to a member of the reduced residue system modulo m and also to a member of the reduced
residue system modulo n, which means in the residue systems A and B that we consider, there
are unique elements r and s so that x is congruent to r mod m and x is congruent to s mod n.

(Refer Slide Time: 03:23)

Which means the size of C is less than or equal to the size of A cross B. Because we take an
arbitrary element x belonging to C and correspondingly we find an ordered pair (r, s)
belonging to A cross B, and by the Chinese remainder theorem two distinct members of C
cannot give the same pair, since they would then be congruent modulo mn. Therefore, the size
of C must be less than or equal to the size of A cross B. But what is the size of C? That is phi
of mn, and this is less than or equal to phi of m into phi of n. So that is what we have
established; that is one direction.

(Refer Slide Time: 04:00)

Now let us take an ordered pair (r, s) belonging to A cross B. Then by the Chinese Remainder
Theorem, the two congruences x is r mod m and x is s mod n have a unique solution in 0
to mn minus 1. If x0 is that solution, then the GCD of (x0, mn) is 1. That is because x0 is r
mod m and the GCD of (r, m) is 1, as r belongs to A, which is a reduced residue system modulo
m, and x0 is s mod n and the GCD of (s, n) is 1, because s belongs to B, which is a reduced
residue system modulo n. Therefore, x0 is relatively prime to mn.

(Refer Slide Time: 05:25)

Which means there exists an x0 prime belonging to C, such that x0 is congruent to x0 prime
modulo mn. C is a reduced residue system modulo mn, therefore there should be an x0 prime
in it which is congruent to x0. This establishes that the size of C is greater than or equal to
the size of A cross B. That is, for each ordered pair (r, s) belonging to A cross B, we have
been able to find an element x0 prime of C, and different pairs give different elements. In
other words, phi of mn is greater than or equal to phi of m into phi of n. So combining both, we
have the theorem. When m and n are relatively prime, phi of mn is equal to phi of m into phi
of n.
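
The correspondence used in this proof can be observed directly on a small case. The Python sketch below is only an illustration; the pair m equal to 4 and n equal to 9 is an arbitrary choice made here, and the residue systems are taken inside 0 to n minus 1.

```python
from math import gcd
from itertools import product

def reduced_residues(n):
    """Members of 0..n-1 that are relatively prime to n."""
    return [x for x in range(n) if gcd(x, n) == 1]

m, n = 4, 9                      # any relatively prime pair would do
A = reduced_residues(m)
B = reduced_residues(n)
C = reduced_residues(m * n)

# Map each x in C to the pair (x mod m, x mod n).
pairs_from_C = {(x % m, x % n) for x in C}

print(len(A), len(B), len(C))                 # 2 6 12
print(pairs_from_C == set(product(A, B)))     # True
```

The pairs obtained from C are exactly the members of A cross B, each hit once, so the two sets have the same size.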

(Refer Slide Time: 06:30)

Another interesting theorem allows us to calculate phi easily. For n greater than 1, phi
of n is equal to n times the product, over all primes p which divide n, of 1 minus 1 by p. So
this makes it easy to calculate the phi function for even fairly large numbers.

(Refer Slide Time: 07:04)

From the definition, we know that phi of m is the number of positive integers
less than or equal to m that are relatively prime to m. We know phi of 1
is 1. Say the given number n is prime factorized in this fashion: p1 power e1 etcetera
up to pr power er, where p1 through pr are the distinct primes that divide n. So this is the
prime factorization. Therefore pi and pj are not the same when i is not equal to j, which means
the GCD of pi power ei and pj power ej is 1 when i is not equal to j.

(Refer Slide Time: 08:09)

So by the previous theorem, we can express phi of n as the product, over i varying from 1 to r,
of phi of pi power ei. But what is this quantity? Let us compute this first, phi of p power e. So
here we consider all positive integers less than or equal to p power e, and from this we remove
all numbers that have a common factor with p power e. The size of the resultant set is phi of p
power e. The only prime factor of p power e is p, so the numbers that we remove are precisely
the multiples of p. That is, the reduced residue system will have this many elements: p power e
minus the number of multiples of p up to p power e. But what are these multiples? They are p,
2p, 3p, etc up to p power e. How many of them are there? That is p power e divided by p. So
phi of p power e is p power e minus p power e by p, which can be written as p power e into 1
minus 1 by p. So that is the phi value of p power e.

(Refer Slide Time: 09:35)

Therefore, coming back to this equation, we can write phi of n as the product, for i varying
from 1 to r, of pi power ei into 1 minus 1 by pi, using this form. But this is the product, for i
varying from 1 to r, of pi power ei, multiplied by the product, for i varying from 1 to r, of 1
minus 1 by pi. The quantity within the first bracket is nothing but n. So this is n multiplied by
the product over i varying from 1 to r of 1 minus 1 by pi, which means we are considering all
prime numbers that divide n. So this product is actually over all prime numbers that divide n.
For each such prime number, 1 minus 1 by p has to be multiplied together. So this product is
what phi of n is.

(Refer Slide Time: 10:46)

For example, let us calculate phi of 10. Taking the prime factorization of 10, this is
phi of 2 into 5. That would be 10 multiplied by 1 minus 1 by 2, multiplied by 1 minus 1 by 5,
since 2 is a prime which divides 10 and 5 is a prime which divides 10. Therefore this is 10
into half into 4 by 5. So phi of 10 is 4. Let us now compute phi of 400. phi of 400 is phi of 2
power 4 into 5 power 2. So there are two primes here, 2 and 5 again, so this will be n into 1
minus 1 by 2 into 1 minus 1 by 5. This is 400 into half into 4 by 5, that is 200 into 0.8, which
is 160. So phi of 400 is 160, which we had used in an example in one of the earlier lectures.

(Refer Slide Time: 12:03)

Phi of 2 power 4 multiplied by 7 power 6 multiplied by 13 power 3 is this number, 2 power 4
into 7 power 6 into 13 power 3, multiplied by 1 minus 1 by 2, which is 1 by 2, 1 minus 1 by 7,
which is 6 by 7, and 1 minus 1 by 13, which is 12 by 13. That would be 2 power 3, 7 power 5,
13 power 2, multiplied by 6 and 12. So this is the phi value of this number.
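
The product formula can be sketched in Python as below. This is an illustration, not the lecture's own code: the names prime_factors and phi are chosen here, and the primes are found by plain trial division, which is adequate only for moderately sized n.

```python
def prime_factors(n):
    """Distinct primes dividing n, found by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)      # whatever is left is itself a prime
    return primes

def phi(n):
    """Euler's phi via n times the product of (1 - 1/p) over primes p dividing n."""
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)    # exact, since p divides result
    return result

print(phi(10), phi(400))   # 4 160
print(phi(2**4 * 7**6 * 13**3) == 2**3 * 7**5 * 13**2 * 6 * 12)   # True
```

Multiplying by p minus 1 after dividing by p keeps everything in integers, which avoids the rounding that 1 minus 1 by p would introduce as a floating point number.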

(Refer Slide Time: 12:51)

Now, from this, we can prove another interesting result. If n is a positive integer, the sum of
phi of d over all divisors d of n is n. The proof goes this way. First consider numbers of the
form p power e. So n is p power e, let us say. Then we are summing, over all divisors d of n,
phi of d. So this would be phi of 1 plus phi of p plus phi of p square etcetera, all the way up to
phi of p power e. These are the divisors of n when n is of the form p power e: 1, p, p square
etcetera are the divisors of n. But phi of 1 is 1, phi of p is p minus 1, and as we have seen
before, phi of p power e is p power e into 1 minus 1 by p. So phi of p square would be p square
into 1 minus 1 by p, which is p square minus p. Then we have p cube minus p squared,
culminating with p power e minus p power e minus 1. So we find that this sum telescopes. 1
and minus 1 cancel, p and minus p cancel, p squared and minus p squared cancel, p cube
cancels, p power e minus 1 cancels. What remains is p power e, which is nothing but n. So the
theorem holds when n is of the form p power e.

(Refer Slide Time: 14:44)

Now, suppose n is of the form k times p power e, for an integer k such that p does not divide
k. So k and p power e are relatively prime to each other. Then our required sum, the sum over
all divisors d of n of phi of d, can be written like this. First consider all divisors d of k; this
is a part of the sum, but it is not the whole of it. We also consider all divisors of the form pd,
which have not been considered before, that is, for every divisor d of k we consider pd.
Continuing like this with p square d and so on up to p power e into d, we exhaust all divisors
of n. So all these sums are over the divisors d of k. But then, since d and every power of p are
relatively prime to each other, this can be written in this fashion. The first term does not
change, and the second term can be written as phi of p into phi of d.

In the last term it is phi of p power e into phi of d. Phi of d is then a common factor. If this
is taken outside, we have 1 plus phi of p plus phi of p square plus etcetera all the way up to
phi of p power e. But this is a sum that we have just seen. Since 1 is the same as phi of 1, it is
identical to the sum which we know evaluates to p power e.

(Refer Slide Time: 17:00)

Therefore, the quantity within the square brackets is p power e, and the sum is sigma, over d
that divides k, of phi of d, into p power e. But this sigma, inductively we assume, is k. Then
we have the sum reducing to k into p power e, which is nothing but n, and that is what we seek
to prove. So the case of p power e is the basis, and inductively we apply the hypothesis that
the sum over the divisors of k evaluates to k; therefore the induction holds. So inductively we
prove the statement.

(Refer Slide Time: 17:51)

For example, let us say n equals 24. The divisors are 1, 2, 3, 4, 6, 8, 12 and 24. Phi of 1 is 1,
phi of 2 is 1, phi of 3 is 2, phi of 4 is 2, counting 1 and 3. Now phi of 6, using the formula,
would be 6 multiplied by 1 minus 1 by p for every prime p that divides 6. 2 and 3 are the
primes which divide 6, so you have 1 minus 1 by 2 into 1 minus 1 by 3, which gives 6 into half
into 2 by 3, which is 2. So phi of 6 is 2. Phi of 8 similarly is 4, phi of 12 is 12 into half into 2
by 3, which is 4, and phi of 24 is 8. Summing all these values, 1, 1, 2, 2, 2, 4, 4 and 8, you find
that the sum comes to 24.

(Refer Slide Time: 19:19)

Taking a larger example, consider n equals 3000. Then the factors would be 1, 3000, 2,
1500, 3, 1000, 4, 750, 5, 600, 6, 500, 8, 375, 10, 300, 12, 250, 15, 200. Those are some of
the factors. If you compute the corresponding phi values, you find these: for 3000, you can
see that the phi value is 800. For 1500 it is 400. For 1000 again it is 400. For 750 it is 200. For
600 it is 160. For 500 it is 200. For 375 it is 200. For 300 it is 80. For 250 it is 100, and for
200 it is 80.

(Refer Slide Time: 20:59)

The remaining factors would be 20, 24, 25, 30, 40, 50 and their cofactors: 50 into 60 is 3000,
40 into 75, 30 into 100, 25 into 120, 24 into 125, 20 into 150. The corresponding phi values
would be 8, 8, 20, 8, 16, 20 for the smaller factors; for 150 it is 40, for 125 it is 100, for 120 it
is 32, for 100 it is 40, for 75 it is 40 again, and for 60 it is 16. So these are the phi values. If
you add up all the phi values together, you find that the sum comes to 3000 again, which is
what n is.
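
Both examples can be checked by brute force. In the Python sketch below, phi is computed straight from the definition by counting, and the divisors are found by trial; the function names are chosen here for illustration.

```python
from math import gcd

def phi(n):
    """Count the integers in 1..n that are relatively prime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

for n in (24, 3000):
    ds = divisors(n)
    print(n, len(ds), sum(phi(d) for d in ds))
# 24 8 24
# 3000 32 3000
```

The middle column shows that 3000 has 32 divisors, which agrees with the 20 factors listed first and the 12 remaining ones.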

(Refer Slide Time: 22:16)

By Zm we denote the integers modulo m, that is, Zm is 0 through m minus 1. The cardinality of
Zm would be m. So as you can see, Zm is a complete residue system modulo m.

(Refer Slide Time: 22:48)

We can define the addition operation on Zm thus. You can draw up a table, so here let us
consider the table of Z5. So 0 plus X is X, 1 plus 4 is 5 which is 0 modulo 5, 2 plus 4 is 6
which is 1 modulo 5, 3 plus 4 is 7 which is 2 modulo 5, 4 plus 4 is 8 which is 3 modulo 5. So
this is the addition table for Z5.

(Refer Slide Time: 24:00)

When we come to the multiplication operation on Z5, 0 into X is 0. So the top row is all 0,
and similarly the left most column is also all 0. 1 into X is X, so here we have values in this
fashion. Now 2 into 2 is 4, which is 4 mod 5. 2 into 3 is 6, so we have 1 mod 5 here. 2 into 4
is 8, which is 3 mod 5. 3 into 2 is 6, which is 1 mod 5. 3 into 3 is 9, which is 4 mod 5. 3 into 4
is 12, which is 2 mod 5. 4 into 2 is 8, which is 3 mod 5. 4 into 3 is 12, which is 2 mod 5. 4
into 4 is 16, which is 1 mod 5. So this is the multiplication table on Z5. From the table you
find that 4 into 4 is 16, which is 1 mod 5, for example.
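
Since the tables themselves were on the slide, here is a small Python sketch that regenerates them. It is added only as an illustration; the names add_table and mul_table are chosen here, and row a, column b holds the result for a and b.

```python
m = 5
add_table = [[(a + b) % m for b in range(m)] for a in range(m)]
mul_table = [[(a * b) % m for b in range(m)] for a in range(m)]

print("addition on Z5")
for row in add_table:
    print(row)

print("multiplication on Z5")
for row in mul_table:
    print(row)
```

For instance, the last row of the multiplication table comes out as 0, 4, 3, 2, 1, whose final entry is the 4 into 4 equal to 1 mod 5 noted above.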

(Refer Slide Time: 25:23)

That is 4 is the solution of 4x equals 1 mod 5. Recall we solved such equations using Euclid’s
algorithm. GCD of (5, 4) is the same as GCD of (4, 1), which is 1. Therefore 1 is 1 into 5
minus 1 into 4. If you take mod 5 on both sides, you have 1 congruent to 4 into minus 1 mod
5. So minus 1 is a solution for this equation. So x congruent to minus 1 mod 5 is a solution.
But then minus 1 is congruent to 4 mod 5, so this is the same as x congruent to 4 mod 5.

(Refer Slide Time: 26:37)

Let us now consider some quadratic congruences. x squared plus 2x plus 1 is congruent to 0
mod 4. The complete residue system mod 4 has 4 members. 0, 1, 2, 3 is one complete residue
system, so we can try its members. When x is equal to 0, we have 0 squared plus 2 into 0
plus 1, which is 1. This is not congruent to 0 mod 4, so 0 is not a solution. If you put 1
here, we have 1 squared plus 2 into 1 plus 1, which is 1 plus 2 plus 1, which is 4, that is 0 mod
4, which means 1 is a solution. If you put 2 in, we have 2 squared, which is 4, plus 2 into 2,
which is 4, plus 1, so that is 8 plus 1, 9, which is 1 mod 4. So this is not a solution. When you
put 3 in, we have 3 into 3, 9, plus 2 into 3, which is 6, plus 1, which is 16, which is 0 mod 4.
So this is a solution too. So 1 and 3 are two solutions for this quadratic congruence, and
modulo 4 these are the only solutions. Modulo 4 there are only 4 candidates; we have tried all
of them and we found that only 1 and 3 are solutions. By what we have seen earlier, in any
complete residue system modulo 4 there will be exactly 2 members that are solutions for this
quadratic congruence, and those two solutions would be congruent to 1 and 3 respectively
modulo 4.

(Refer Slide Time: 28:35)

Consider another quadratic congruence, x squared plus x plus 1 congruent to 0 mod 4. Again
we are considering a CRS modulo 4, so we can consider the members of the CRS 0, 1, 2, 3,
that is the members of Z4, and plug in these values for x. From 0 you get 1, from 1 you get 1
plus 1 plus 1, which is 3. From 2, you have 4 plus 2, 6, plus 1, 7, and 7 mod 4 is 3. And then
from 3, you find 9 plus 3, 12, plus 1, 13, which is 1 mod 4. So we find that none of them
provides a solution. So this quadratic congruence is without a solution. So it is possible for
congruences to have no solution.

(Refer Slide Time: 29:32)

Consider another one, x squared plus 3x plus 1 is 0 mod 4; this is what we want to solve.
Again when you consider 0, 1, 2, 3, you find that the values are 1, 1, 3, 3, so again there is no
solution. If you consider 2x squared plus x plus 1 congruent to 0 mod 4,
then for 0, 1, 2, 3 you find that the values are 1, 0, 3 and 2. So there is a unique solution
here; 1 is the unique solution.

(Refer Slide Time: 30:22)

Consider x squared minus 1 congruent to 0 mod 8. Here the members of Z8 would be 0, 1,
2, 3, 4, 5, 6 and 7. If you evaluate here, for 0, x squared minus 1 is minus 1, which is 7
mod 8. For 1 it is 1 minus 1, 0. For 2 it is 4 minus 1, 3. For 3 it is 9 minus 1, 8, which is 0.
For 4 it is 16 minus 1, 15, which is 7 again. For 5, it is 25 minus 1, 24, which is 0. For 6, it is
35, which is 3. For 7, it is 49 minus 1, 48, which is 0 mod 8. So you find that there are 4
solutions. The solution set is 1, 3, 5, 7. So you find that the number of solutions could be
larger than the degree of the polynomial; here we have a quadratic polynomial where the
number of solutions is 4.
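
Trying every member of a complete residue system, as done in these examples, is easy to automate. The following Python sketch is illustrative only; the function name solve_poly_congruence and the convention of listing coefficients from the highest degree down are choices made here.

```python
def solve_poly_congruence(coeffs, m):
    """Return all x in 0..m-1 with f(x) = 0 (mod m).

    coeffs lists the coefficients of f from the highest degree down,
    e.g. [1, 2, 1] stands for x^2 + 2x + 1.
    """
    def value_mod_m(x):
        value = 0
        for c in coeffs:
            value = (value * x + c) % m     # Horner's rule, reduced at each step
        return value
    return [x for x in range(m) if value_mod_m(x) == 0]

print(solve_poly_congruence([1, 2, 1], 4))    # [1, 3]
print(solve_poly_congruence([1, 3, 1], 4))    # []
print(solve_poly_congruence([2, 1, 1], 4))    # [1]
print(solve_poly_congruence([1, 0, -1], 8))   # [1, 3, 5, 7]
```

The last line shows the quadratic with four solutions modulo 8, more than its degree.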

(Refer Slide Time: 31:38)

Regarding polynomial addition and multiplication, consider these 2 polynomials over Z5. Let
us say f of x is 6x cube minus 4x squared plus 5x minus 4, and g of x is 3x cube plus x squared
minus 6x plus 1. These can be simplified in this fashion. The first term of f of x is
6x cube, but since 5x cube for any integer x is divisible by 5, this can be written as x cube.
Minus 4 is congruent to 1 modulo 5, so minus 4x squared can be written as x squared. 5x is
divisible by 5, so we have 0x, and then minus 4 is congruent to plus 1 again, so we have plus 1.
So f of x can be written in the simplified form x cube plus x squared plus 1 modulo 5. So we
say f of x is congruent to this polynomial modulo 5. Similarly, g of x is congruent to 3x cube
plus x squared plus 4x plus 1, since minus 6x is the same as minus x, which is the same as plus
4x. If you were to add these polynomials, the sum would be 4x cube plus 2x squared plus 4x
plus 2 mod 5.

(Refer Slide Time: 33:44)

Once again, if f of x is 6x cube minus 4x squared plus 5x minus 4 and h of x is 2x plus 7, then
as before, we can simplify them. f of x is congruent to x cube plus x squared plus 1, this is of
course mod 5, and h of x is congruent to 2x plus 2, mod 5 again. Then if you take the product
f of x into h of x, you get 2x power 4 plus 2x cube plus 2x, plus 2x cube plus 2x squared plus
2, which is 2x power 4 plus 4x cube plus 2x squared plus 2x plus 2. So that is how you add and
multiply polynomials in modular arithmetic.
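
As a sketch of the arithmetic above, polynomials can be stored as coefficient lists and every coefficient reduced mod 5. The representation (constant term first) and the helper names `poly_reduce`, `poly_add` and `poly_mul` are assumptions chosen for this illustration.

```python
def poly_reduce(p, m):
    """Reduce every coefficient of p modulo m."""
    return [c % m for c in p]


def poly_add(p, q, m):
    """Add two polynomials (constant term first) modulo m."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [(a + b) % m for a, b in zip(p, q)]


def poly_mul(p, q, m):
    """Multiply two polynomials (constant term first) modulo m."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] = (result[i + j] + a * b) % m
    return result


f = [-4, 5, -4, 6]   # 6x^3 - 4x^2 + 5x - 4
g = [1, -6, 1, 3]    # 3x^3 + x^2 - 6x + 1
h = [7, 2]           # 2x + 7

print(poly_reduce(f, 5))   # [1, 0, 1, 1]    i.e. x^3 + x^2 + 1
print(poly_add(f, g, 5))   # [2, 4, 2, 4]    i.e. 4x^3 + 2x^2 + 4x + 2
print(poly_mul(f, h, 5))   # [2, 2, 2, 4, 2] i.e. 2x^4 + 4x^3 + 2x^2 + 2x + 2
```

Reducing the inputs first or only reducing the results gives the same answer, since reduction mod 5 respects addition and multiplication.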

(Refer Slide Time: 35:05)

Now, let us study the ceiling and floor functions. The floor of x is defined as the greatest
integer less than or equal to x, and the ceiling of x is defined as the smallest integer greater
than or equal to x. For example, floor of 7.1 is 7 and ceiling of 7.1 is 8.

(Refer Slide Time: 35:53)

So, let us see a few results regarding the ceiling and floor functions. The first theorem says
three things: floor of x is less than or equal to x, which is less than floor of x plus 1; x
minus 1 is less than floor of x, which is less than or equal to x; and 0 is less than or equal
to x minus floor of x, which is less than 1. To prove this, suppose x is n plus epsilon for an
integer n and an epsilon which is at least 0 and less than 1.

(Refer Slide Time: 37:30)

Epsilon could be 0, but epsilon is less than 1. Then the floor of x is the floor of n plus
epsilon, which is n, and this is less than or equal to n plus epsilon, because epsilon is at
least 0. But n plus epsilon is what x is. So we have that the floor of x is less than or equal
to x. And x is n plus epsilon, which is less than n plus 1, because epsilon is less than 1. As
we have seen, n is floor of x, so this is floor of x plus 1. That proves the first line. x minus
1 is n plus epsilon minus 1, which is less than n, which is less than or equal to n plus
epsilon; that proves the second line. 0 is less than or equal to n plus epsilon minus n, because
epsilon is greater than or equal to 0, and this is of course epsilon, which is less than 1,
which proves the third line.

(Refer Slide Time: 37:57)

For x greater than or equal to 0, floor of x is also equal to the sum of 1 over all integers i
with 1 less than or equal to i less than or equal to x. As the proof, consider x as n plus
epsilon, where epsilon is as before. Then floor of x is n, which is what the left hand side is.
The integers i with 1 less than or equal to i less than or equal to x are 1, 2, up to n, so the
sum on the right hand side is also n.

(Refer Slide Time: 38:38)

Thirdly, the floor of the sum x plus j is the same as floor of x, plus j, for any j belonging
to Z. Say x is n plus epsilon, where epsilon is as before. Then floor of x plus j would be the
floor of n plus j plus epsilon. n is an integer and j is also an integer, so n plus j is an
integer, and so this would be n plus j. Floor of x is n, so the right hand side is n plus j,
which is the same as the left hand side.

(Refer Slide Time: 39:30)

The next theorem says that floor of x plus floor of y is less than or equal to floor of x plus
y, which is less than or equal to floor of x plus floor of y plus 1. Let us say x is n plus
epsilon and y is m plus delta, where epsilon and delta are each at least 0 and less than 1.
Then floor of x plus floor of y is n plus m. Floor of x plus y is the floor of n plus m plus
epsilon plus delta. Since epsilon plus delta is at least 0 and less than 2, this is either n
plus m or n plus m plus 1, depending on epsilon and delta. So indeed the first inequality
holds, n plus m is less than or equal to this, and this is less than or equal to n plus m plus
1, which is what floor of x plus floor of y plus 1 is, and therefore the second inequality
holds too.
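
The first four results can be spot-checked numerically. The sketch below uses exact fractions rather than floating point so that boundary cases are not blurred by rounding error; the function name `check_floor_theorems` and the sample range are arbitrary choices, and a finite check of this kind illustrates the theorems but does not prove them.

```python
import math
from fractions import Fraction


def check_floor_theorems():
    """Spot-check the first four floor results on quarters between -5 and 5."""
    samples = [Fraction(k, 4) for k in range(-20, 21)]
    for x in samples:
        fx = math.floor(x)
        # Theorem 1: the three inequalities
        assert fx <= x < fx + 1
        assert x - 1 < fx <= x
        assert 0 <= x - fx < 1
        # Theorem 2 (for x >= 0): floor(x) counts the integers i with 1 <= i <= x
        if x >= 0:
            assert sum(1 for i in range(1, 10) if i <= x) == fx
        # Theorem 3: an integer j can be pulled out of the floor
        for j in range(-3, 4):
            assert math.floor(x + j) == fx + j
        # Theorem 4: floor(x) + floor(y) <= floor(x + y) <= floor(x) + floor(y) + 1
        for y in samples:
            fy = math.floor(y)
            assert fx + fy <= math.floor(x + y) <= fx + fy + 1
    return True


print(check_floor_theorems())   # True
```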

(Refer Slide Time: 40:53)

Floor of x plus floor of minus x is equal to 0 if x is an integer, and is minus 1 otherwise.
Let us consider a non-integer first. Suppose x is n plus epsilon, where epsilon is strictly
between 0 and 1. In this case floor of x is n, and floor of minus x is floor of minus n minus
epsilon. So we are looking for the greatest integer that is not larger than this, and that will
be minus of (n plus 1). Therefore, when you take the sum, you have n plus minus of (n plus 1),
which is minus 1. Now if x is an integer n, floor of x is n, minus x is minus n, and the floor
of minus n is minus n. Therefore the sum would be 0, as the theorem claims.

(Refer Slide Time: 42:21)

The floor of floor of x by j is the same as floor of x by j, if j is a positive integer. To
prove this, let us assume that x is n plus delta, for an integer n and 0 less than or equal to
delta less than 1. Suppose n is qj plus r. For any n and any positive j we can write n as qj
plus r, where r is an integer greater than or equal to 0 but less than j. Then x is qj plus r
plus delta. Now let us consider the left hand side. The floor of x is qj plus r. So we have the
floor of (qj plus r) by j, which is the floor of q plus r by j. Since r is less than j, the
fraction r by j is strictly less than 1, and therefore this would be q. So the left hand side
is q.

(Refer Slide Time: 43:50)

Now the right hand side is floor of x by j, which is the floor of (qj plus r plus delta)
divided by j, which is the floor of q plus (r plus delta) by j. But r is less than j, so r can
be at most j minus 1, and delta is strictly less than 1, so r plus delta is less than j.
Therefore the fraction (r plus delta) by j is less than 1, so this would be q, which is the
same as the left hand side. So the theorem is proved.
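
The last two results, on floor of x plus floor of minus x and on the nested floor, can be spot-checked in the same spirit. This is only a sketch on a finite sample of exact fractions; the function name is an assumption made for illustration.

```python
import math
from fractions import Fraction


def check_negation_and_nested_floor():
    """Spot-check floor(x) + floor(-x) and floor(floor(x)/j) == floor(x/j)."""
    samples = [Fraction(k, 4) for k in range(-40, 41)]
    for x in samples:
        # 0 when x is an integer, -1 otherwise
        expected = 0 if x.denominator == 1 else -1
        assert math.floor(x) + math.floor(-x) == expected
        # nested floor, for positive integers j
        for j in range(1, 8):
            assert math.floor(Fraction(math.floor(x), j)) == math.floor(x / j)
    return True


print(check_negation_and_nested_floor())   # True
```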

(Refer Slide Time: 44:37)

The seventh result would be this. The negative of the floor of minus x is the same as the
ceiling of x. Say x is n plus epsilon, where n is an integer and 0 is less than epsilon, which
is less than or equal to 1. Note that here I have taken epsilon as strictly greater than 0 and
allowed it to be 1. Then ceiling of x is n plus 1, which is what the right hand side is. Minus
x is minus n minus epsilon, so the floor of minus x is minus n minus 1, and the negative of
that is what the left hand side is. That will be n plus 1, which is the right hand side. Hence
the result holds.

(Refer Slide Time: 45:40)

The floor of x plus half is round of x, where halves are rounded upwards. For example, round
of 7.5 is 8, because a half is rounded upwards. To prove this, first consider x equal to n, so
there is no fractional part. Then floor of x plus half is the floor of n plus half, which is n,
and this is the same as round of x, since x is the integer n. If x is n plus epsilon, where
epsilon is greater than 0 but less than half, then floor of x plus half is the floor of n plus
epsilon plus half, which is n: epsilon is less than half, so epsilon plus half together will
not make up one, and so the floor here is n, which is the same as round of x. Since epsilon is
not large enough, the round function will not round upwards. If x is n plus epsilon for an
epsilon that is greater than or equal to half but less than 1, floor of x plus half would be
the floor of n plus epsilon plus half, which is n plus 1, which is the same as round of x. In
this case epsilon could be half, in which case x would be rounded upwards.

(Refer Slide Time: 47:58)

Analogously, the negative of the floor of (minus x plus half) is round of x, where halves are
rounded downwards. The proof is a dual of the previous one, so I leave it as an exercise.
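
The ceiling and rounding results translate directly into short functions built only from the floor. The names `ceiling_via_floor`, `round_half_up` and `round_half_down` are hypothetical helpers for this sketch. Note that Python's built-in `round` uses a different convention (halves go to the nearest even integer), so it is not used here.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def ceiling_via_floor(x):
    """Ceiling of x computed as the negative of the floor of minus x."""
    return -math.floor(-x)


def round_half_up(x):
    """Round to the nearest integer, halves rounded upwards."""
    return math.floor(x + HALF)


def round_half_down(x):
    """Round to the nearest integer, halves rounded downwards."""
    return -math.floor(-x + HALF)


x = Fraction(15, 2)              # 7.5
print(round_half_up(x))          # 8
print(round_half_down(x))        # 7
print(ceiling_via_floor(Fraction(71, 10)))   # 8, the ceiling of 7.1
```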

(Refer Slide Time: 48:21)

Another result about ceiling and floor is this. For n and m belonging to Z plus, the set of
positive integers, floor of n by m is the number of integers in the range 1 to n divisible by
m. Say n is qm plus r. For any 2 positive integers n and m, we can write n as qm plus r, for an
r that is greater than or equal to 0 but less than m. Then floor of n by m is the floor of q
plus r by m, which is q, because r by m is less than 1. The integers in 1 to n divisible by m
form the set m, 2m, 3m, etcetera, up to qm: n is qm plus r, therefore n is greater than or
equal to qm, but n is less than (q plus 1) m. So there are q of them, hence the theorem.
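
The counting claim is easy to confirm by brute force for small values. In this sketch `count_multiples` is a hypothetical helper that counts directly, and the comparison uses Python's integer floor division `n // m` for floor of n by m.

```python
def count_multiples(n, m):
    """Count the integers in 1..n that are divisible by m."""
    return sum(1 for k in range(1, n + 1) if k % m == 0)


# Compare the direct count with floor(n / m) for small positive n and m.
agree = all(count_multiples(n, m) == n // m
            for n in range(1, 60) for m in range(1, 60))
print(agree)                    # True
print(count_multiples(20, 6))   # 3, namely 6, 12 and 18
```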

(Refer Slide Time: 50:00)

For n and m in Z+, floor of n by m is the same as the ceiling of (n minus m plus 1) divided by
m. Again assume that n can be expressed as qm plus r as before. Floor of n by m is the floor of
(qm plus r) by m, which is the floor of q plus r by m, which is q as we have seen before. Then
the ceiling of (n minus m plus 1) by m would be the ceiling of (qm plus r minus m plus 1)
divided by m, which is the ceiling of q minus 1 plus (r plus 1) by m. When r plus 1 is less
than m, (r plus 1) by m is a fraction that is greater than 0 and less than 1, so this is q
minus 1 plus a fraction epsilon, and therefore the ceiling will be q. If r plus 1 is equal to
m, this would be the ceiling of q minus 1 plus 1, which is the ceiling of q, which is again q.

Since r ranges from 0 to m minus 1, both inclusive, r plus 1 is either less than m or is equal
to m. So these are the only 2 possibilities. In both the cases, we find that the right hand
side evaluates to q, and hence the theorem. So that is it, from this
lecture. This is the last lecture of the module on number theory. Hope to see you in the other
modules. Thank you.

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering,
Indian Institute of Technology Guwahati
Lecture 37
Introduction to Groups

(Refer Slide Time: 00:46)

So, we will begin our study of algebraic structures with an introduction to the algebraic
structure known as groups. We will introduce the notion of groups by solving a puzzle. I will
first describe a game that many of you may be familiar with. This is known as the peg solitaire
game. It is a single player game, and it is played on a board which looks like this. You can
imagine these holes as places where a marble can be placed. In the figure, let us say that the
darkened positions are devoid of marbles.

So if I mark a position in black, that means there is no marble at that particular position.
This is the starting configuration: there are marbles at all the positions except at the very
center of the board. There are certain moves that are available, which you can perform on the
board, and a final configuration is given. The question is whether you can reach that
particular configuration. So we need to describe what the moves are. Let us look at what a
general move is like. Let us say there are 2 positions which are filled with marbles and there
is a vacant position next to them.

These positions are one after another; they are adjacent positions. They can be in a
horizontal arrangement, this is the horizontal case, or you can have an arrangement where there
is a vertical alignment of 3 positions, such that 2 of them are filled with marbles and the
third one is empty. This is the vertical case. The move is described as follows: the left most
marble can jump over the middle marble to the vacant position, and the middle marble, the one
that was jumped over, is removed. So from this configuration, we will end up at the following
configuration, wherein you have 2 empty positions and 1 position filled with a marble.

So basically if you name these as A, B and C, where A and B are occupied positions and C is an
unoccupied position, then the marble from position A can be transferred to the position C,
which is empty, and in the process you remove the marble that is there at position B. In case
of vertical alignment also you can do the same thing. If you name these positions as alpha,
beta and gamma, where alpha is the empty one, the marble at gamma can jump over the marble at
beta to the position alpha, and in the process the marble at beta is removed. Now, there are
other moves as well. For example, here the move is left to right; you can have it from right to
left, top to bottom, bottom to top, all those are fine. So basically any move is of the
following kind. You need to find 3 consecutive positions of the board, either horizontal or
vertical, such that the middle position contains a marble, one of the extreme positions
contains a marble and the other extreme position is empty. In that case, you can transfer the
marble from the occupied end to the empty end and remove the middle marble. So that is the
move.

The question is, if you start with a particular configuration, can you reach another
configuration? Note that any move, horizontal or vertical, will reduce the number of marbles on
the board by one. In the starting configuration, there are 6 positions along each of the 4
arms, so 6 into 4 plus 9 positions, that is 33 positions, out of which 1 is empty. So 32
marbles are there on the board to start with, and then the question is, can you reach a
configuration with exactly 1 marble? You can try this out. It is slightly difficult, but you
can figure out that you can reach a configuration where there is only 1 marble remaining. Let
me also describe some configurations from which there are no moves. For example, suppose you
had marbles at alternate positions, let's say marbles in the chess board pattern. Suppose after
moving some number of marbles around, you reach this particular configuration.

Here you can see that there are no further moves possible. Let us call the starting
configuration position number 1, and this one position number 2. In position number 1, there
are many moves. For example, you can consider these 3 positions and make a move, or you can
consider these 3, or these 3, or these 3. So many moves are there, and after the first move is
made, there are again many moves that are left. You could choose any of those and keep on
doing this repeatedly. Suppose we reach configuration 2; then from there no further moves are
possible, and there are no undos, so that will be a position in which you get stuck.

Also if you have just 1 marble, there is no way you can move anything around. So that is also
a dead end. So the question is, starting from position 1, can you reach a configuration with
precisely 1 marble? The answer to this question is yes, you can do that. But what we will
consider in this lecture is a variant of this: can we reach a configuration where there is only
1 marble, but that marble is at, let's say, this corner position? So the only marble remaining
appears at one of the corners; is this possible? If we wanted to answer this question by
listing out all the possible moves, then for the first move itself there are 4 possible moves,
and at each later step there could be multiple moves, so there would be a very large number of
move sequences to explore before you could conclude that it is impossible to reach a
particular configuration.

You can model it as a search through the configuration space. Each configuration can be
described by a bit vector of length 33, which tells whether a marble is present at each
position or not, and from each bit vector you can move to certain other bit vectors: you can
move from one configuration to another if there is a valid move which takes you from one to the
other. Then this can be reformulated as a search problem on a graph, but the search space is
really huge here. So we cannot really hope to do a computerized check to figure out whether
certain configurations are reachable or not, and if the same game is imagined on a larger
board with more positions, then it is hopelessly intractable.
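
To make the bit vector model concrete, here is a minimal sketch. It assumes the standard cross-shaped 33-hole board laid out in a 7 by 7 grid with the center at row 3, column 3; the cell numbering and the names `CELLS`, `TRIPLES`, `initial_config` and `successors` are choices made for this illustration only.

```python
# Cells of the cross-shaped board as (row, col) pairs inside a 7x7 grid (assumed layout).
CELLS = [(r, c) for r in range(7) for c in range(7)
         if 2 <= r <= 4 or 2 <= c <= 4]
INDEX = {cell: i for i, cell in enumerate(CELLS)}   # cell -> bit position

# Every (source, middle, destination) triple of consecutive cells, in all 4 directions.
TRIPLES = []
for (r, c) in CELLS:
    for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        mid = (r + dr, c + dc)
        dst = (r + 2 * dr, c + 2 * dc)
        if mid in INDEX and dst in INDEX:
            TRIPLES.append((INDEX[(r, c)], INDEX[mid], INDEX[dst]))


def initial_config():
    """Bit vector with a marble everywhere except the center."""
    full = (1 << len(CELLS)) - 1
    return full & ~(1 << INDEX[(3, 3)])


def successors(config):
    """All configurations reachable from config in exactly one move."""
    result = []
    for src, mid, dst in TRIPLES:
        src_full = (config >> src) & 1
        mid_full = (config >> mid) & 1
        dst_full = (config >> dst) & 1
        if src_full and mid_full and not dst_full:
            # flipping the three bits empties src and mid and fills dst
            result.append(config ^ (1 << src) ^ (1 << mid) ^ (1 << dst))
    return result


start = initial_config()
print(len(CELLS))               # 33 positions
print(bin(start).count("1"))    # 32 marbles
print(len(successors(start)))   # 4 possible first moves
```

A graph search would call `successors` repeatedly; since a bit vector of length 33 has 2 to the power 33 possible values, the number of configurations to consider grows very quickly.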

We will see that in some cases we can use some clever arguments to say that you cannot reach
certain configurations. We will use some standard tools from computer science while we argue
about these. The moves are non-deterministic, in the sense that we have no control over the
choices made by a player while he is playing this particular game.

So in order to argue that none of the sequences of moves can reach a certain configuration, we
need to argue about all possible moves. The main tool that we will use to solve this is a key
idea that is used in many proofs, called an invariant. We think of the game as an algorithm:
some adversary is performing some sequence of operations on the configuration space, and each
operation has to follow certain rules, it is either the horizontal case or the vertical case.
We will identify some quantity associated with each configuration such that the quantity
remains invariant: it does not change even after the moves are made. Once we have established
an invariant, what we can argue is that no matter what moves are made by the adversary, and
whatever the sequence of steps that takes you from the first configuration to the final
configuration, the invariant must be satisfied throughout. If we can then argue that the
configuration that we want to end up with does not have the invariant property, then we can say
that that configuration is unreachable. So let us identify what the invariant property is here.

(Refer Slide Time: 13:14)

In order to describe the invariant, what we will need is the idea of groups. We will not
initially formulate it in terms of group theory, but we will explain it via simpler methods.
What we are going to do is the following. Look at these positions. We are going to label these
positions in a certain way, and based on the labels that we give to these positions and the
presence or absence of marbles at the individual positions, we will come up with a quantity
associated with the board.

There are 33 positions on the board. What we will do is, we will look at this board and to each
position on the board we will give a certain colour. Let us call these colours A, B and C.
This position gets colour A, the next one gets B and the next one gets C, and we do this
systematically. So A, B, C, and this is again going to be A. A, B, C, A, B, C, A; A, B, C, and
this is also going to be an A. This is going to be a B, and these positions are going to be the
C positions. So we looked at the board and we labeled each of these positions as A, B or C.
Now we are in a position to determine the value of the board. The value of the board is
determined by combining the colours given to the positions which have marbles in them. The
positions which do not have a marble we can cross out. Initially only the center position did
not have a marble, and all the other positions contained marbles.

So, at any point, the value of the board is determined by combining the colours at the
positions where there are marbles. Initially there are 32 such positions; they are combined and
we get a value. Now how are these colours combined? We will follow the simple rule that
whenever 2 distinct colours combine, you get the third colour. Think of the 3 colours as red,
green and blue. If a red and a green combine, you will get a blue. If a blue and a red combine,
you will get a green, and so on. So 2 distinct colours when they combine give the third colour,
and if a colour combines with itself you will get a brand new colour. So there is a fourth
colour in the picture; let us denote it by pink. So let us just write down our setup. We have
defined the value of a board: it is obtained by combining the colours at the occupied
positions.

So given any board, we know how to compute the value. Now this combination means the
following. If you think of C1 and C2 as colours and C1 is not equal to C2, then they combine to
produce the third colour C3, and if the colours are the same, if C1 is equal to C2, then they
give rise to C4. So C4 is a new colour. Now rule 1, when you think about it, applies only to
the colours red, green and blue. If one of the colours is the fourth colour, the pink colour,
then when it combines with any colour it does not change anything. So that is the third rule:
C4 does not change any colour. In the sense, if you combine a red and a pink, what you will get
is a red. If you combine a green and a pink, you will get a green. If you combine a blue and a
pink, you will get a blue, and pink and pink when they combine still give a pink. So we have
described the rules of combining colours. There are 3 colours on the board, but to determine
the value we have introduced a fourth colour. Two of the colours red, green and blue, if they
are distinct, combine to give the one among the three that is not present.

The fourth colour, whenever it is used, will not change the colour of anything that it
combines with. So now we have defined formally what the value of a board is. Now let us look at
the initial value of the board. The order in which these colours are combined does not really
matter. You can convince yourself of this: whether red first combines with blue and then with
green, or the other way round, really does not matter. Let us say we write down the colours at
the occupied positions as alpha 1, alpha 2, up to alpha k. Combining alpha 1 and alpha 2 is the
same as combining alpha 2 and alpha 1, so any rearrangement of the colours is still okay. Any
order of combining them will give rise to the same colour.
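
The three combination rules can be written down as a small function, which also makes it easy to confirm that the order of combining does not matter. This is a sketch: the string names for the colours and the helpers `combine` and `value` are assumptions made for illustration.

```python
from functools import reduce
from itertools import permutations

PINK = "pink"                       # the fourth colour
COLOURS = ("red", "green", "blue")


def combine(c1, c2):
    """Combine two colours according to the three rules."""
    if c1 == PINK:                  # rule 3: pink changes nothing
        return c2
    if c2 == PINK:
        return c1
    if c1 == c2:                    # rule 2: a colour with itself gives pink
        return PINK
    (third,) = set(COLOURS) - {c1, c2}   # rule 1: two distinct colours give the third
    return third


def value(colours):
    """Combine a whole sequence of colours, left to right."""
    return reduce(combine, colours, PINK)


print(combine("red", "green"))      # blue
print(combine("blue", "blue"))      # pink
print(combine("green", "pink"))     # green

# Every ordering of the same colours gives the same value.
sample = ["red", "red", "green", "blue", "blue"]
print({value(p) for p in permutations(sample)})   # {'green'}
```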

So for this board, what is the initial value of the board? Think about it for a few seconds.
We can determine it by looking at the rows of this diagram carefully. If you look at the top
most row, the red and green combine to give a blue, and blue and blue combine to give a pink.
So any 3 consecutive positions that are all occupied give rise to pink, and the pinks when they
combine with each other do not bring in any change. So you are going to get a pink by
combining all of these. In the middle, the red and blue were occupied positions, whereas the
green in the center was unoccupied. So the total value of the board is going to be the value
that you get when you combine a red with a blue, and that is going to be a green, and the
others combine and give a pink. Green and pink combine to give you a green, so the initial
value is going to be green.
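
The initial value can also be computed mechanically. The sketch below makes two assumptions that are not taken from the slides: the board is the 33-hole cross in a 7 by 7 grid, and the cell in row r, column c gets the colour indexed by (r + c) mod 3, named so that the center is green. It also encodes the colours as the 2-bit numbers 0 to 3 so that combining becomes bitwise XOR, which reproduces the three rules (equal codes give 0, two distinct nonzero codes give the third, and 0 changes nothing).

```python
from functools import reduce
from operator import xor

CODE = {"pink": 0, "red": 1, "green": 2, "blue": 3}   # assumed encoding
NAME = {code: name for name, code in CODE.items()}

# Assumed layout: cross-shaped board inside a 7x7 grid, center at (3, 3).
CELLS = [(r, c) for r in range(7) for c in range(7)
         if 2 <= r <= 4 or 2 <= c <= 4]


def colour(cell):
    """Assumed diagonal colouring; three consecutive cells always get three colours."""
    r, c = cell
    return ("green", "blue", "red")[(r + c) % 3]


def board_value(occupied):
    """Combine the colours of all occupied cells."""
    return NAME[reduce(xor, (CODE[colour(cell)] for cell in occupied), 0)]


start = [cell for cell in CELLS if cell != (3, 3)]
print(board_value(CELLS))   # pink: a completely full board
print(board_value(start))   # green: the colour of the empty center
```

Under this colouring each colour appears 11 times, so the full board has value pink, and removing the green center leaves green.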

Now here is a crucial observation, which is where we use the invariant property. Moves do not
affect the value of the board. If you look at the value of a board before and after a move is
performed, the value remains the same. Why is this so? Let us look at any move; think of it as
horizontal, and vertical is similar. You have 3 consecutive positions, which have 3 different
colours. Let us denote the colours by alpha 1, alpha 2 and alpha 3. Two of the positions are
going to be occupied; let's say the occupied positions are the ones with colours alpha 1 and
alpha 2, and alpha 3 is the unoccupied position.

So this is the unoccupied position, and the move will change this to 2 unoccupied positions
and 1 occupied position. Let us say this is the before-move configuration. The occupied
positions in the remainder of the board, that is, the positions other than alpha 1, alpha 2 and
alpha 3, combine to give some value; let us call it A. When alpha 1 and alpha 2 combine, what
you will get is clearly alpha 3. So the value before the move is A combined with alpha 3. I
will write this as A followed by alpha 3. After the move, the positions alpha 1 and alpha 2 are
unoccupied, the other positions remain unchanged, and alpha 3, which was initially an
unoccupied position, now becomes an occupied position. So the value after the move is again A,
the value of the other positions, followed by the only occupied position among the three, that
is alpha 3.

So if you combine these, whatever value you get is going to be the value of the board, and it
is the same before and after the move. So any individual move does not change the value of the
board, and we argued that at the start, the value is B. So after any sequence of moves, no
matter what moves you perform and in what order you perform them, the value of the board
should still remain B, or green. If you call this the green position, the value of the final
board should be green. So now if we ask whether we can obtain a board with precisely 1 marble
at this corner position, we can say that is not possible, because that would be a board whose
value is red. But the only allowable boards are the ones whose value is green. So this was a
neat little puzzle, which we solved by identifying a crucial invariant property. What has this
got to do with group theory? Here we talked about objects combining in a certain way, a
seemingly arbitrary way of combining objects. We found this combining method very useful to
solve our problem, and many different objects can be studied by this kind of methodology,
wherein we identify some structure in the problem and use that to solve it.
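
The invariant argument can be watched in action by playing random games and recording the value of the board after every move. As before, this is only a sketch under stated assumptions: the 33-hole cross in a 7 by 7 grid, the colouring (r + c) mod 3 with the center green, and colours encoded as 0 to 3 so that combining is XOR. The function names are hypothetical.

```python
import random
from functools import reduce
from operator import xor

CELLS = [(r, c) for r in range(7) for c in range(7)
         if 2 <= r <= 4 or 2 <= c <= 4]
CENTRE = (3, 3)
PINK, RED, GREEN, BLUE = 0, 1, 2, 3      # assumed encoding; combining is XOR


def colour(cell):
    """Assumed diagonal colouring with the center green."""
    r, c = cell
    return (GREEN, BLUE, RED)[(r + c) % 3]


def value(occupied):
    """Combine the colours of all occupied cells."""
    return reduce(xor, (colour(cell) for cell in occupied), PINK)


def legal_moves(occupied):
    """All (source, middle, destination) jumps available in this configuration."""
    moves = []
    for (r, c) in CELLS:
        if (r, c) not in occupied:
            continue
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            mid, dst = (r + dr, c + dc), (r + 2 * dr, c + 2 * dc)
            if mid in occupied and dst in CELLS and dst not in occupied:
                moves.append(((r, c), mid, dst))
    return moves


def random_playout(seed):
    """Play random legal moves until stuck; return the values seen and the final board."""
    rng = random.Random(seed)
    occupied = set(CELLS) - {CENTRE}
    values = [value(occupied)]
    while True:
        moves = legal_moves(occupied)
        if not moves:
            return values, occupied
        src, mid, dst = rng.choice(moves)
        occupied = (occupied - {src, mid}) | {dst}
        values.append(value(occupied))


values, final = random_playout(seed=0)
print(set(values) == {GREEN})      # True: the value never changes
print(value({(0, 2)}) == GREEN)    # False: a lone marble at this corner is not green
```

With this single colouring the argument excludes every final position whose colour differs from that of the center, which under the assumed colouring includes all eight corner cells of the arms.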

(Refer Slide Time: 26:30)

So let us say what an algebraic structure is. An algebraic structure is a set, in our case
this was the set of colours, together with some operations on it, in our case the way of
combining. Depending upon the properties that you insist of the operations, you will get
various algebraic structures. In our case we had a set of 4 colours, and we said how 2 of them
can combine to give rise to a colour. The particular structure that we used is called a group.
So let us understand what groups are. Formally, we first need to understand what binary
operations are, because we are restricting ourselves to combining 2 elements at a time. You
could combine more at a time and get more complex objects, but we will restrict ourselves to
binary operations. So let us understand what binary operations are.

You have a set, let us call the set G, and a binary operation on it. We could use any symbol
for the operation; plus, multiplication, subtraction are all examples of binary operations. A
binary operation is thought of as a function from G cross G to G. That means it takes 2
elements and maps them to a third element of the set. So it can be thought of as a function
which takes a pair of objects from a set and gives rise to an element of the same set. In the
functional form, this can be written as star followed by (x, y), that means it is a function
with 2 parameters x and y which combines x and y. But when we look at algebraic structures, the
most common notation for star (x, y) is x star y. We will write the symbol in between and not
use the functional notation. As a usual binary operation, you can think of plus defined on the
set of natural numbers, or the set of integers, or the set of real numbers or complex numbers,
and so on.

So that is a binary operation: it takes 2 elements, combines them and gives a third element in
the same set. Now if you think of the subtraction operation on the set of natural numbers, that
is not well defined. We should not really talk about subtraction on natural numbers, because if
you look at 3 minus 8, the result is not a natural number. When we say binary operation, by
definition it takes 2 elements of the set and returns an element of the set. For 3 minus 8 the
answer is minus 5, and that is not an element of N. So minus cannot be viewed as a binary
operation on this particular set, but of course it can be viewed as a valid binary operation
on the set of integers.
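
Closure can be probed on a finite sample. Such a test can only refute closure (by finding a pair whose result leaves the set), never prove it for an infinite set. In this sketch the helper names are hypothetical, and treating 0 as a natural number is an assumption that does not affect the 3 minus 8 example.

```python
def closed_on_sample(op, sample, belongs):
    """False if op sends some pair from the sample outside the set."""
    return all(belongs(op(a, b)) for a in sample for b in sample)


def is_natural(n):
    return isinstance(n, int) and n >= 0   # assumption: 0 counts as natural


def is_integer(n):
    return isinstance(n, int)


naturals = range(0, 20)
integers = range(-20, 20)

print(closed_on_sample(lambda a, b: a + b, naturals, is_natural))   # True
print(closed_on_sample(lambda a, b: a - b, naturals, is_natural))   # False, e.g. 3 - 8
print(closed_on_sample(lambda a, b: a - b, integers, is_integer))   # True
```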

In order to define what groups are, we will need another property known as associativity. If
you have, let us say, elements A, B and C belonging to G, we could first combine A and B and
then combine the result with C, or we could first combine B and C and then combine A with the
result of that. If these are both equal, then we say that the operation is associative.
Formally, if for every A, B, C belonging to G, (A star B) star C is equal to A star (B star C),
where we use parentheses to indicate which of the operations is done first, then we say that
star is associative. Note that in our colour combination example, the operation was indeed
associative. Let's say A, B, C are 3 colours; some of them may be equal or unequal, and each of
them could be one of the 4 colours.

You can verify that (A star B) star C is equal to A star (B star C). So that was indeed an
associative operation. In fact it had an additional property. We were just bothered about
combining 2 colours; which one is written as the first colour and which one is written as the
second colour does not really matter. That is called the commutativity property: A star B is
the same as B star A. When we looked at the binary operation as a function, it was taking 2
parameters, and when you talk about 2 parameters, which is the first parameter and which is the
second parameter is important. But if the operation is commutative, then the order does not
really matter: A star B is equal to B star A. In our definition of combining colours we did not
really think of it in the functional way, because we wanted this operation to be commutative.
So now that we have understood what binary operations, associativity and commutativity are, we
can define what a group is.
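
Because there are only 4 colours, associativity and commutativity of the combining rule can be verified exhaustively: 64 triples and 16 pairs. The sketch below uses hypothetical helper names, and includes subtraction on a few integers as a contrasting operation that has neither property.

```python
from itertools import product

PINK = "pink"
COLOURS = ("red", "green", "blue")
ELEMENTS = (PINK,) + COLOURS


def combine(x, y):
    """The colour combination rule from the puzzle."""
    if x == PINK:
        return y
    if y == PINK:
        return x
    if x == y:
        return PINK
    (third,) = set(COLOURS) - {x, y}
    return third


def is_associative(elements, op):
    """Check (a op b) op c == a op (b op c) for every triple."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(elements, repeat=3))


def is_commutative(elements, op):
    """Check a op b == b op a for every pair."""
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))


print(is_associative(ELEMENTS, combine))   # True
print(is_commutative(ELEMENTS, combine))   # True

# Subtraction, by contrast: (5 - 3) - 1 = 1 but 5 - (3 - 1) = 3.
def subtract(a, b):
    return a - b

print(is_associative(range(-3, 4), subtract))   # False
print(is_commutative(range(-3, 4), subtract))   # False
```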

(Refer Slide Time: 32:55)

So the formal definition: a group is a set G with an operation; we call it star, but you can
use any other symbol. So it is a pair, the first is a set and the second is an operation, such
that there are 4 properties. First, star is a binary operation. Some books might say that there
is closure under the operation star, but when you say binary operation, that is taken care of.
Second, you want this operation to be associative. That means for any 3 elements A, B and C
from G, in A star B star C, no matter how you parenthesize it, you will get the same result.
The third property is existence of identity. That means there exists an element, let's call it
e, such that for all a belonging to G, a star e is equal to e star a, and that is equal to a.

So there is a special element. We will later on prove that there can be only one such element;
if it is a group, then there cannot be 2 such elements. In the definition, the property of
identity only says that there exists a special element such that any element of the group, when
multiplied with that element, gives the same element; it does not change things. The fourth
property is that of inverse. Formally this means, for every element a belonging to G, there
exists an element b such that a star b is equal to b star a, and that is equal to the identity.
So if you can find a set equipped with a binary operation which has these properties, that is,
it is associative, it has an identity and every element has an inverse, then we say that this
is a group.
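
For a finite set, the four properties in the definition can be checked by brute force. The following is a minimal sketch with hypothetical function names. It encodes the 4 colours as 0, 1, 2, 3 with XOR as the operation (an assumed encoding that matches the combination rules), and adds two standard examples, addition and multiplication mod 5, for contrast.

```python
from itertools import product
from operator import xor


def find_identity(elements, op):
    """Return an element e with a*e == e*a == a for all a, or None if there is none."""
    for e in elements:
        if all(op(a, e) == a and op(e, a) == a for a in elements):
            return e
    return None


def is_group(elements, op):
    """Check closure, associativity, identity and inverses on a finite set."""
    elements = list(elements)
    # 1. star is a binary operation on the set (closure)
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # 2. associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # 3. existence of identity
    e = find_identity(elements, op)
    if e is None:
        return False
    # 4. every element has an inverse
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


print(is_group(range(4), xor))                          # True: the 4 colours
print(is_group(range(5), lambda a, b: (a + b) % 5))     # True: Z5 under addition
print(is_group(range(5), lambda a, b: (a * b) % 5))     # False: 0 has no inverse
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # True: nonzero residues mod 5
```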

So let us look at our example. We had 4 colours. Let's call these colours by letters. This is
our special colour, pink, which will play the role of e, the identity, and let us call the
others a, b and c. You can check that for any 3 elements x, y, z, x star (y star z) is equal to
(x star y) star z. We have to verify this for all cases. If x is the identity, then what we
have on the left hand side is y star z, because e combines with anything to leave it
unchanged. On the right hand side, x combined with y is going to be y, and so the entire
expression on the right hand side is going to be y star z. So when x is equal to e, this is
clear. If any one of y and z is e, then the same reasoning works out.

So we may assume that they are all different from e. One case is when all 3 of them are the
same. If all 3 of them are the same, then it does not really matter, because the left hand
side and the right hand side are both x star x star x. x star x is going to give you e, and
x star e is going to be x, and the other side is also going to be the same thing. So when all
of them are the same, the bracketing does not matter. If x, y and z are all distinct, then y
and z will combine to give x, and the entire product x star x is going to be e. The right hand
side is similar: x star y is going to be z, and z star z is going to be e. So when all of them
are the same or all of them are different, those cases are taken care of. The only remaining
case is when exactly 2 of them are the same.

So if y equals z, then y star z will give you e, and x star e is going to be x. So the left
hand side is going to be x. On the right hand side, x into y will be the third, missing
element, which we will call w, and w multiplied by z will give you the element missing from
w and z, which is nothing but x. So in that case also, both sides are the same. Here we
assumed that y and z are equal. If x and y are the same, then the same reasoning applies, and
so on. So we can verify in a systematic manner that the operation is associative. The pink
colour, or e, serves as the identity, and every element has an inverse, namely the element
itself. You take any element; its inverse is itself, because when you combine it with itself
you always get the identity. So in our example, in the solitaire problem, we had basically
assigned an element of the group to each position on the board, and the value of the board
was the product of the occupied positions.
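
The case analysis above can also be checked mechanically. The following is a minimal sketch in Python, not part of the lecture: the function names are made up for illustration, and the operation is encoded directly from the three rules stated above (e leaves everything unchanged, every element combined with itself gives e, and two distinct non-identity elements give the third one).

```python
from itertools import product

ELEMENTS = ["e", "a", "b", "c"]

def star(x, y):
    """Combine two elements using the rules described in the lecture."""
    if x == "e":
        return y
    if y == "e":
        return x
    if x == y:
        return "e"
    # Two distinct non-identity elements give the remaining third one.
    return next(z for z in "abc" if z not in (x, y))

def is_associative():
    """Brute-force check of x*(y*z) == (x*y)*z over all 64 triples."""
    return all(star(x, star(y, z)) == star(star(x, y), z)
               for x, y, z in product(ELEMENTS, repeat=3))

def is_identity(e):
    """True if e leaves every element unchanged on both sides."""
    return all(star(e, x) == x == star(x, e) for x in ELEMENTS)

def inverse_of(x):
    """Return an element y with x*y == y*x == e."""
    return next(y for y in ELEMENTS
                if star(x, y) == "e" and star(y, x) == "e")

print(is_associative())
print(is_identity("e"))
print({x: inverse_of(x) for x in ELEMENTS})
```

The brute-force loop replaces the case-by-case argument; it only works because the set is finite and small.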

(Refer Slide Time: 40:09)

So let us look at some properties of groups. The first property that we will state is that the
identity is unique. Every group will have a unique identity. Look at the definition: we just
said that there exists an element. There could be multiple; how do we rule out that case? So
suppose e1 and e2 were both identities of a group, and we want to show that e1 is equal to e2.
We just need to look at e1 star e2. Because e1 is an identity, and an identity multiplied by
any other element should leave that other element unchanged, this should be equal to e2. Also,
because e2 is an identity, e1 combined with e2 should leave e1 unchanged. So this should be
equal to e1, and therefore e1 is equal to e2. So there is a unique identity.

The next property is that inverses are also unique. So we will have a notation for the
inverse. If you have an element x, since its inverse is unique, there is a specific element,
and once we have this particular fact proven we can denote it by x inverse; that will be the
unique inverse of the element x. How do we prove that? Suppose a and b are both inverses of
x. So what do we know? We can say that xa is equal to identity, ax is equal to identity, xb
is identity and bx is also equal to identity. From these we need to argue that a and b are
the same. So let us look at the equation xa is equal to identity. If we multiply with b on
the left, what we get is b times xa, or b star xa. When I write xa, it means I am just
swallowing up the star; instead of writing star each time, I will just write the letters.

So when I write bxa, it means b star x star a, and associativity guarantees that the
bracketing does not matter. So bxa, read as b times (xa), should be equal to b times e, and
that is b. But we can also read bxa as (bx) times a. Since bx is identity, this is equal to
identity times a, and that is equal to a. So we get a is equal to b. So every element x will
have a unique inverse. In particular, if you look at the identity element e, its inverse has
to be itself: e times e is equal to e, therefore e is an inverse of e, and since there is
only 1 inverse, e has to be its own unique inverse.
(Refer Slide Time: 44:18)

And another property of inverses is this. If you look at x inverse, that is some particular
element, and its inverse is equal to x. That is straightforward, because by definition of
inverse, x times x inverse is equal to x inverse times x, and that is equal to identity. Now
if you let x inverse play the role of x, then this equation says that the inverse of x inverse
is equal to x. Because if you denote x inverse by, let us say, alpha, then alpha times x is
equal to x times alpha, and that is equal to identity. Therefore, the inverse of alpha must
be x.
(Refer Slide Time: 45:12)

The third property of inverses is this: if you multiply 2 elements a and b and take the
inverse of the product, that is going to be equal to the product of the inverses, but taken
in the reverse order. So this won't be equal to a inverse times b inverse unless the group is
commutative. But in any case it will be equal to b inverse times a inverse. That is because
if you look at the product (b inverse a inverse) times (ab), this will be equal to identity,
and so will (ab) times (b inverse a inverse). These have been bracketed in a certain way, but
because of associativity we can bracket them in other ways, combine the terms, and see that
they give identity. So we just need to look at b inverse times a inverse times a times b. The
inner combination of a inverse and a will give you identity. b inverse times identity will
give you b inverse, and that multiplied by b gives identity. The same thing applied on the
other side will again give you identity. So (ab) inverse is b inverse times a inverse. So we
will stop here and will continue our exploration of algebraic structures in the coming class.
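
The reversed-order rule only shows its teeth in a non-commutative group, which the lecture has not introduced yet. As a hedged illustration, the sketch below uses the six rearrangements of three objects under composition; this choice of group and the helper names `compose` and `inverse` are assumptions made for the example, not something taken from the lecture.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# All 6 rearrangements of (0, 1, 2); composition is not commutative here.
GROUP = list(permutations(range(3)))

# (ab)^-1 == b^-1 a^-1 holds for every pair.
reversed_rule_holds = all(
    inverse(compose(a, b)) == compose(inverse(b), inverse(a))
    for a in GROUP for b in GROUP
)

# (ab)^-1 == a^-1 b^-1 fails for at least one pair.
naive_rule_fails_somewhere = any(
    inverse(compose(a, b)) != compose(inverse(a), inverse(b))
    for a in GROUP for b in GROUP
)

print(reversed_rule_holds, naive_rule_fails_somewhere)
```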

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering,
Indian Institute of Technology Guwahati
Lecture 38
Modular Arithmetic and Groups

So we will continue our study of groups.

(Refer Slide Time: 00:30)

A group is basically an algebraic structure. Let's say we denote it by (G, star). So it has
two components: the first is a set, and the second is a binary function from G cross G to G.
So it takes two elements, combines them, and gives an element of the set. That makes it a
valid binary operation, and this binary operation should have certain properties. The first
property was that it should be associative, the second requirement is that there should be
an identity, and the third property was that every element should have an inverse. If these
three properties are met for a binary operation on G, then we say that G together with the
binary operation is a group. One of the examples we saw can be described by its
multiplication table. When we specify the binary operation, we basically need to tell, for
each pair of elements, what is the result of combining them.

So that can be given by a table. If we take G to be e, a, b, and c, we want to describe here
the group that we used to solve our puzzle on reachable configurations on a solitaire board.
The group can be described as follows: e is the identity and the others are the non-identity
elements. e combined with e gives you e, and e combined with anything else leaves the other
element unchanged. Any element combined with itself gives the identity: a with a is identity,
b with b is identity, c with c is identity. a combined with b gives c, a combined with c
gives b. So when any two distinct non-identity elements combine, you get the third element as
the result. You can complete the table; this is what we would get. The table that we have
written is called a multiplication table. It is called a multiplication table because the
group operation is usually viewed as multiplication, though it could be very different from
the arithmetic operation of multiplication.

Now, this group has certain interesting properties. First, it is a finite group, because there
are only four elements in the group. In general, a finite group means that the size of G is
finite. A group with finitely many elements in its underlying set is called a finite group.
So this is an example of a finite group. So far all the groups that we have seen are finite
groups. Now we could look at another property of this particular group, namely that it is a
commutative group. A commutative group, by definition, means that a star b should be equal to
b star a for all a, b belonging to G. If this is the case, then it is called a commutative
group, which is clearly the case for this particular group that we have considered. In fact
all the groups that we have seen so far are commutative groups.

Let us see some examples. We will see that there are many cases in mathematics where you come
across objects which behave like the algebraic structure that we defined, including many
familiar examples. For example, take the set of integers. They do form a group. But when we
say group, we need to mention the operation as well. So Z with plus, the usual arithmetic
addition, is a group. We can check each and every property. Clearly you can add any two
integers and the result is a new integer. So the binary operation plus is well defined on
the set of integers. Further, we can verify that it is associative, and we claim that there
is an identity element. That means there is a special element e in the set of integers such
that x plus e is equal to x for every integer x; let us call this equation 1. So if we can
show that there is one element that makes equation 1 true, then of course that element is
going to be the unique identity.

So in particular here e will be 0. The integer 0 can act as the identity. So the identity
property is checked. Further, we need to argue that for every element x, there exists y such
that x combined with y gives the identity. Here the combination operation is addition, so
x plus y should be equal to the identity, which we discovered is 0. In particular, the
integer minus x will do the task. You take x and add it to minus x, and you are going to get
0. So every element has an inverse. So that was an easy example. Now, not just this: if you
look at the set of rational numbers, they also form a group under addition. If you look at
the set of real numbers, they also form a group under addition. If you take the set of
complex numbers, they also form a group under addition.

So these are examples of groups under the usual arithmetic. But note that, although we refer
to all of them as addition, the addition operation on integers is different from the addition
operation on rational numbers, which is again different from the addition on real numbers and
on complex numbers, because if you think of them as operations, their domains are very
different. Of course, there is a subset relationship between Z, Q, R, and C. So if you look
at the addition on complex numbers and restrict it to just the rational numbers, then of
course we are talking about the same operation as addition on the rationals. Now, let's see
some examples of algebraic structures which are not groups. For example, if we take the set
of integers and look at multiplication, this is a well-defined binary operation: you can take
any two integers and combine them, and when you combine them, you get an integer.

And clearly that operation is associative. Is there an element which can act as identity? Of
course there is. If you look at the element 1, 1 multiplied by any element x is going to give
you the same element. Even if x is 0, 1 multiplied by 0 is going to be 0, so this is
trivially true. So 1 is the only element which will have this property. Now look at the
inverse property, which should be satisfied by every group. Look at elements other than 1
and minus 1; let us say 5. Is there an element y such that 5 times y is equal to 1? There
is a number that does the task, namely 1 by 5. But 1 by 5 is not an integer. So Z under
multiplication is not a group, although multiplication is a well-defined operation. If you
look at Z under division, that also is not a group, because division is not even a binary
operation on Z: you cannot take 2 arbitrary integers and divide one by the other.

In particular, when the number by which you divide is 0, division is not defined, and
therefore that would not be a group. Now if you look at the set of rational numbers under
multiplication, you still have the associative property, and 1 is going to serve as the
identity. But only the elements other than 0 have an inverse. So this is almost a group, but
not quite, because of 0. So if you take Q and remove the element 0 from it, whatever you get
is going to be a group under multiplication. But the same set is not going to be a group
under addition. So it is a group with respect to multiplication, whereas it is not a group
with respect to addition, because the identity under addition was 0, and that is the element
that we have removed from the set, and no other element can serve as the identity.
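
The contrast between Z and Q without 0 under multiplication can be seen in a small sketch. This is only an illustration: `integer_inverse` is a hypothetical helper that searches a finite window of integers, so it demonstrates the failure for 5 rather than proving it, and Python's `fractions.Fraction` stands in for the rational numbers.

```python
from fractions import Fraction

def integer_inverse(x, candidates=range(-100, 101)):
    """Search a finite window of integers for y with x * y == 1.

    Returns None if no such y is found in the window.
    """
    return next((y for y in candidates if x * y == 1), None)

print(integer_inverse(1))    # 1 is its own inverse
print(integer_inverse(-1))   # so is -1
print(integer_inverse(5))    # no integer y satisfies 5 * y == 1

# In the rationals, 1/5 is available and does the task.
five = Fraction(5)
print(1 / five, five * (1 / five))

# Every nonzero rational in this small sample has a multiplicative inverse.
sample = [Fraction(p, q) for p in range(-5, 6) for q in range(1, 6) if p != 0]
print(all(x * (1 / x) == 1 for x in sample))
```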

You can look at the real numbers; they are not going to be a group under multiplication. But
if you remove 0, then we have a group under multiplication. Same with the complex numbers: if
you remove 0, they will form a group under multiplication. All the groups we have looked at
here have this property that they are infinite groups. Z, Q, R, and C are all infinite
groups. So we have seen examples of groups which are finite and groups which are infinite,
and both the finite groups and the infinite groups that we have considered are all
commutative groups. Later on, we will see examples of non-commutative groups as well.

(Refer Slide Time: 13:29)

Now let us look at a special group which arises out of number theory. The operation is a
very familiar operation; it almost mimics addition. We will look at modulo n arithmetic,
which will be used to define the special groups that we are interested in. So let us look at
the set of integers. We start by fixing an n, and we are going to define a relationship
between two integers. We will say that a is related to b if a minus b is equal to k times n
for some integer k. So, if n divides a minus b, then we will say that a and b are related.
If your n was 100, then 1 and 101 are related, because 1 minus 101 is minus 100, that is
k times 100 for k equal to minus 1. So 1, 101, 201, 301, all these elements are related to
each other, whereas 1 and 102 are not related. This relation is a special relation; it is
called an equivalence relation.

So as you would have learnt from your set theory lectures, an equivalence relation is a
relation which has these three properties: it should be reflexive, it should be symmetric,
and it should be transitive. If you look at the set of integers Z, clearly for any n the
relation that we have considered is reflexive: a is related to a for any a, because a minus
a is zero, and that is 0 times n. It is symmetric: if b is related to a, this would imply
that a is related to b, because if one of b minus a and a minus b is a multiple of n, the
other is also. And it is transitive: a related to b and b related to c would mean that a is
related to c. This can be seen by observing that if a is related to b then a minus b is equal
to k1 n, and b minus c is going to be equal to k2 n, and if you add these equations up you
will get a minus c is equal to (k1 plus k2) times n. We have not really talked about modulo n
arithmetic yet, but this relation, which we denote by tilde, is an equivalence relation. And
one thing that we know about equivalence relations is that an equivalence relation splits up
the underlying set into disjoint parts. So if your set is Z and you look at tilde, you can
partition Z into some number of parts. Here we can state more specifically what the parts
are.

So we will call them 0 bar, 1 bar, 2 bar, up to (n minus 1) bar, where a bar consists of all
elements x such that x is related to a. If you take all the elements related to any
particular element, that forms one particular equivalence class, and here, therefore, there
can be only n equivalence classes. Take any arbitrary element alpha: it has to be related to
one of the elements 0, 1, up to n minus 1, and therefore must fall in one of these
equivalence classes. So here we started off with Z, and by looking at this equivalence
relation we have partitioned Z into a finite number of equivalence classes. For any
particular element i, if you look at i mod n, that is, the remainder that you get when i is
divided by n, that identifies the equivalence class containing i.
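
As a small, hedged illustration of this partition, the sketch below groups a finite window of integers by remainder. The function names and the window from -20 to 20 are choices made for the example; the real classes are infinite, so only a slice of each one is shown.

```python
def related(a, b, n):
    """True if n divides a - b, i.e. a and b are related modulo n."""
    return (a - b) % n == 0

def classes_mod(n, low=-20, high=20):
    """Group the integers in [low, high] by their remainder modulo n."""
    classes = {r: [] for r in range(n)}
    for i in range(low, high + 1):
        # For n > 0, Python's % always returns a value in 0..n-1,
        # even when i is negative.
        classes[i % n].append(i)
    return classes

print(related(1, 101, 100))   # the lecture's example with n = 100
print(related(1, 102, 100))

parts = classes_mod(5)
for remainder, members in parts.items():
    print(remainder, members)
```

With n = 5 the window splits into exactly 5 lists, one per remainder 0 to 4, matching the count of n classes.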

So that number we can think of as a representative element of the equivalence class
containing i. So these are all our classes. Let us say our n is 100, so our classes are
0 bar, 1 bar, 2 bar, up to 99 bar. We could still write 101 bar; this is defined as the
equivalence class containing 101, and that is going to be equal to 1 bar. So now what we
have is a partition into equivalence classes. Let us see if we can do arithmetic with these
classes which will mimic our normal arithmetic.

So, suppose I want to multiply two equivalence classes; does it make sense? For example, what
is going to be 15 bar into 13 bar? I could define that as (15 into 13) whole bar. Now this is
going to be a valid definition only if it gives the same answer when I take other elements of
the same classes, let us say 115, which is in the same equivalence class as 15, and 213,
which is in the same class as 13. By definition 115 bar into 213 bar should be (115 into 213)
the whole bar. But since 15 bar and 115 bar are the same class, and 13 bar and 213 bar are
the same class, the classes on the right hand side also must be the same. In other words,
(15 into 13) the whole bar should be equal to (115 into 213) the whole bar if this arithmetic
is to make sense. One can easily verify that this is the case for both addition and
multiplication. So we can say a bar into b bar is going to be equal to (a into b) the whole
bar.

Because a bar is all those elements which leave the same remainder as a, and b bar is all
those elements which leave the same remainder as b. So you can write a as r_a plus k1 times
n and b as r_b plus k2 times n, where r_a and r_b are the remainders. Then ab is going to be
equal to r_a times r_b plus other terms, all of which contain n. So ab is r_a r_b plus alpha
times n for some integer alpha, and the equivalence class of ab is the same as the
equivalence class of r_a r_b. So the remainder that a leaves when you divide it by n alone
determines the equivalence class containing a. You can work out the arithmetic and verify
that the operations of addition and multiplication are indeed well defined. In all our
earlier examples, we started with an operation which was already well defined; except in the
case of division, we had well defined operations of addition and multiplication over the
integers, complex numbers, rationals and so on.
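
The well-definedness claim can be spot-checked numerically. The sketch below is an illustration under stated assumptions: `same_class` and `operations_well_defined` are hypothetical helper names, and the check only tries a few representatives per class (shifts by small multiples of n), so it is evidence, not a proof.

```python
def same_class(x, y, n):
    """True if x and y lie in the same equivalence class modulo n."""
    return (x - y) % n == 0

def operations_well_defined(n, shifts=range(-3, 4)):
    """Check that changing representatives never changes the class of a+b or a*b."""
    for a in range(n):
        for b in range(n):
            for k1 in shifts:
                for k2 in shifts:
                    a2, b2 = a + k1 * n, b + k2 * n   # other representatives
                    if not same_class(a * b, a2 * b2, n):
                        return False
                    if not same_class(a + b, a2 + b2, n):
                        return False
    return True

# The lecture's example with n = 100: both products land in the same class.
print((15 * 13) % 100, (115 * 213) % 100)

print(operations_well_defined(12))
```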

Here we have a new set, which consists of equivalence classes, and on these equivalence
classes we have defined two operations, one of addition and one of multiplication. In fact,
we have defined infinitely many such operations, one pair for each number n. Fix a number n:
if n equals 100, you have a particular split into equivalence classes. If n is 113, we have
yet another split into equivalence classes. Once you fix the collection of equivalence
classes, you can multiply them and you can add them in a certain way, and those operations
make sense. They are valid binary operations on a finite set, because although each of these
equivalence classes is infinite, the number of equivalence classes is always going to be
finite. If you are doing mod n arithmetic, where the equivalence relation is defined by some
finite number n, the number of equivalence classes that you have is a finite number.

And therefore this is a finite set on which we are operating. So we can ask ourselves this
question: is this finite set of equivalence classes, along with one of these operations, a
group? We will see that under addition they do form a group, but under multiplication they
do not. But there is a way of extracting a group out of these classes for multiplication, as
we will see in the remainder of this lecture.

(Refer Slide Time: 24:46)

So let us first see the easy part. We will first give some names. Z mod nZ is a notation:
when we write Z slash nZ, what we mean is the set of equivalence classes arising out of the
equivalence relation tilde. I am going to write tilde n, because we say that a is related to
b, a tilde n b, if a minus b is a multiple of n. So this is dependent on the particular n
that you choose. Z mod nZ, that is how this is read. So Z mod nZ is going to be the set of
equivalence classes arising out of this particular relation, and now we look at the addition
operation on these classes. Z mod nZ forms a group under plus. This plus is not the usual
addition of numbers; it is the mod n addition. You add two equivalence classes in the way
we described earlier, and that is the same as modulo n addition.

Why is it a group? We need to check. First of all, this is a valid binary operation; that is
something that we have seen earlier. Clearly it is associative, the equivalence class
corresponding to 0 will act as the identity, and for any element x, the equivalence class
containing minus x will be the inverse. So we can easily verify the inverse property also.
Once you have a finite collection with the operation being associative and identity and
inverses being present, we know that it forms a finite group. Now let us look at Z mod nZ,
the same set, under multiplication. Let us take an example. If we take n equal to 10, then
our equivalence classes are 0, 1, 2 up to 9.

So we have 0 bar, 1 bar, 2 bar, up to 9 bar. 0 bar consists of all the multiples of 10, 1 bar
consists of the numbers which are one more than a multiple of 10, and so on. So 0 bar is 0,
plus or minus 10, plus or minus 20 and so on. 1 bar is 1, 1 plus or minus 10, 1 plus or
minus 20 and so on. So these are your different equivalence classes, and 2 bar is one of the
elements. Is there an element x such that 2 bar multiplied by x bar will give you the
identity? Clearly we need to check with only the elements 1 to 9. But one thing we can
observe is that when you take 2 bar and multiply it with x bar, you get the equivalence
class containing 2x.

So this is equal to (2x) bar. 2x is always going to be an even number. And we can check that
if there is an identity, then 1 bar has to be that identity; the identity has to be unique
and it has to be 1 bar. But since 10 is even, (2x) bar is going to consist of only even
numbers, and therefore it cannot be equal to 1 bar. So clearly this set cannot form a group
under multiplication, at least when n is equal to 10. So in general Z mod nZ cannot be a
group under the multiplication operation.
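
Both observations for n = 10 can be reproduced with a few lines of Python. This is a minimal sketch using the representatives 0 to 9 in place of the classes; the variable names are illustrative only.

```python
n = 10
classes = range(n)   # representatives 0..9 stand in for 0 bar .. 9 bar

# Under addition, the class of -x undoes the class of x.
additive_inverse = {x: (-x) % n for x in classes}
all_have_additive_inverse = all(
    (x + additive_inverse[x]) % n == 0 for x in classes
)

# Under multiplication, every product 2 * x lands in an even class,
# so it can never equal 1.
multiples_of_two = sorted({(2 * x) % n for x in classes})
two_has_inverse = 1 in multiples_of_two

print(all_have_additive_inverse)
print(multiples_of_two)
print(two_has_inverse)
```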

Is there some way to extract a subset of these such that they form a group? From now on we
will stop writing the equivalence classes as sets; instead we will just work with the
representative elements. That is much more convenient than thinking of 1 bar, 2 bar and so
on. We will just replace them with the numbers 0, 1, 2, up to 9, or, when we are looking at
mod n arithmetic in general, we will think of the equivalence classes as denoted by the
numbers 0 to n minus 1. We will not put the bar above a number to indicate that it is an
equivalence class. Whatever we do with the numbers themselves can be restated in the
language of combining equivalence classes. So if we take these numbers 0 to n minus 1, can
we somehow make this a group under multiplication? Maybe the element 0 is going to be the
troublemaker.

So let us first work out an example with n equal to 10. If you have 0, then clearly a
multiplicative inverse is going to be difficult to find, because any element multiplied by 0
is going to be 0, or a multiple of 10. So x into 0 will always be equal to 0; it will never
give you 1. So 0 cannot have an inverse. What are the other elements which may not have an
inverse? We will try to remove the elements which do not have an inverse, look at the
remaining set, and hope that will do the trick. Here, at least, the evenness is causing
trouble. So let us say we remove all the even numbers; then we will get 1, 3, 5, 7, 9. But 5
is also a troublemaker here, in the sense that 5 times x is always going to be a multiple of
5, so you will never get 1 mod 10.

So maybe we will remove 5 as well, and we will look at 1, 3, 7, 9. Now if we just have these
four elements, let us try to write the multiplication table corresponding to them. Our
elements are 1, 3, 7 and 9. 1 into 1 is 1, 1 into 3 is 3, 1 into 7 is 7, 1 into 9 is 9. This
is going to be commutative, so we need to look at only about half the entries. 3 into 3 is 9;
7 into 7 is 49, which is also 9; 9 into 9 is 81, which is 1. So this almost looks like our
e, a, b, c. But that is not the case, because 3 times 3 is not 1; instead 7 times 3 is 1,
because 7 times 3 is 21, which is 1 mod 10. 7 times 9 is 63, which is 3. 3 times 9 is 27,
which is 7. We can fill this up. So we get a multiplication table, and from this table we can
infer that all the properties that are required of a group are true. It has an identity,
which is 1, and inverses are also there.

Namely, the inverse of 1 is itself; 3 into 7 is 1, so the inverse of 3 is 7 and the inverse
of 7 is 3; and the inverse of 9 is itself. So once you have verified the identity and
inverses, associativity comes for free, and we have checked that this is a valid operation.
Therefore these elements form a group. This is denoted by Z / 10Z, because we were doing mod
10 operations. The reason why we are writing it in this particular form, Z divided by 10Z or
Z mod 10Z, will become clear in one of the later classes.
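
The multiplication table for 1, 3, 7, 9 under mod 10 multiplication can be generated rather than filled in by hand. The following is a minimal sketch; the variable names are illustrative and nothing here is taken from the lecture beyond the set and the operation.

```python
n = 10
elements = [1, 3, 7, 9]

# table[x, y] is the product of x and y reduced modulo 10.
table = {(x, y): (x * y) % n for x in elements for y in elements}

# Print the table row by row.
print("*  " + "  ".join(str(y) for y in elements))
for x in elements:
    print(f"{x}  " + "  ".join(str(table[x, y]) for y in elements))

# Closure: every product stays inside {1, 3, 7, 9}.
closed = all(value in elements for value in table.values())

# Inverses: for each x, find the y with x * y == 1 (mod 10).
inverses = {x: next(y for y in elements if table[x, y] == 1) for x in elements}

# Commutativity: the table is symmetric.
commutative = all(table[x, y] == table[y, x] for x in elements for y in elements)

print(closed, inverses, commutative)
```

Unlike the e, a, b, c group, here 3 and 7 are inverses of each other rather than of themselves.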

(Refer Slide Time: 36:26)

But right now we will just view this as a notation. Note that we are not taking all the
elements. Although we are writing Z divided by 10Z, with the multiplication operation this
basically means we are considering the elements 1, 3, 7 and 9. So what is the natural
generalisation of this? If it is nZ, what does it mean: which are the elements that stay and
which are the elements that go out? Here, in the case of 10, we could extract a subset of
elements which form a group among themselves. In the case of general n, can we do this? We
can, and when we write Z mod nZ star, this is the multiplicative group under mod n
multiplication. What are the elements that go into this? We will make it clear in a short
while. You can try to guess which elements are going to be there in Z mod nZ star.

It is going to have at most n minus 1 elements, because 0 is automatically not going to be
present there. So the elements from 1 to n minus 1 could all be present, and that happens in
some special cases; try and figure out what the special case is. But right now we will
describe more explicitly what the elements in Z mod nZ under multiplication are. Let us look
at all the elements a which are less than n and for which GCD of (a, n) is equal to 1. We
will prove that this collection does form a group under multiplication. But if you want to
prove that Z mod nZ star is equal to this collection, we need to show more. Our process was:
look at all the elements in Z mod nZ; there are n elements, 0 to n minus 1. From these we
threw out all the elements which could not possibly have inverses, and whatever remains will
all have inverses, and that is precisely this collection. Why is this so?

So take any element alpha belonging to Z mod nZ such that GCD of (alpha, n) is not equal to
1. Then for any x, alpha x has a common factor with n, because alpha has a common factor with
n. Therefore alpha x can never be equal to 1 mod n. Because if it was equal to 1 mod n, then
alpha x minus 1 is equal to k times n for some integer k. This would mean alpha x minus k n
is equal to 1. Now take the common factor d of alpha and n out: d into ((alpha by d) x minus
k (n by d)) is equal to 1. Here d is some number which is greater than 1, and the other
factor is an integer. So how can we multiply two integers, one of which is greater than 1,
and get 1? This is impossible, and therefore all the elements which are outside the set that
we have described here must be excluded.
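
The claimed description of the invertible elements can be compared against a brute-force search for small n. This is a hedged sketch: `invertible_by_search` and `coprime_to` are hypothetical helper names, `math.gcd` is the standard library function, and agreement on a range of n is supporting evidence rather than a proof.

```python
from math import gcd

def invertible_by_search(n):
    """Elements a in 1..n-1 for which some x gives a * x == 1 (mod n)."""
    return [a for a in range(1, n)
            if any((a * x) % n == 1 for x in range(1, n))]

def coprime_to(n):
    """Elements a in 1..n-1 with gcd(a, n) == 1."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

for n in (10, 12, 7):
    print(n, invertible_by_search(n), coprime_to(n))

# The two descriptions agree for every n in a small range.
print(all(invertible_by_search(n) == coprime_to(n) for n in range(2, 60)))
```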

In other words, any element a such that GCD (a, n) is not equal to 1 cannot have a
multiplicative inverse. So that is one part. Now we need to show that every element whose
GCD with n is 1 will have an inverse. We will use the following lemma: GCD (a, b) is the
smallest positive number in the set of numbers of the form alpha a plus beta b. If you look
at alpha a plus beta b for alpha, beta belonging to the integers, and take all the
possibilities, you will get an infinite set. In that infinite set, the smallest positive
number is equal to the GCD. We will not prove this statement; it can be easily proved from
basic number theoretic facts. You can even treat this as the definition of GCD. Let's work
out one example. Look at, let us say, 12 and 15. Let's write a few of the numbers that you
could form. 12 is possible, 24 is possible, 36, and the negatives of these are all possible:
minus 12, minus 24, minus 36.

15, minus 15, 30, minus 30, all these numbers are possible, and this set has a nice property:
look at numbers of the form alpha a plus beta b; if you take two of them and add them,
you still get another number in the collection, for a different alpha and beta. So minus 12 and
15, if you add them up, you will get 3, so 3 is also going to be an element in the set, and you
can argue that there is going to be no strictly positive number smaller than 3 in the set,
because every number in the set is a multiple of 3. So at least in this case, you can easily
verify that the GCD is in fact the smallest
positive number that can be obtained as a linear combination of 12 and 15. So we can use the
fact that GCD (a, b) is the smallest positive number of the form alpha a plus beta b. So if GCD (a,
b) is equal to d, then d can be written as alpha a plus beta b. So from this fact, if GCD (a, n) is
equal to 1, this would imply that alpha a plus beta n is equal to 1, for some alpha, beta
belonging to integers.
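The lemma can be tried out on small numbers. What follows is only a sketch in Python, not something from the lecture: the names extended_gcd and smallest_positive_combination are made up for illustration, the inputs are assumed to be positive integers, and the extended Euclidean algorithm is used as one standard way of finding a suitable alpha and beta.

```python
def extended_gcd(a, b):
    """Return (d, alpha, beta) with d = gcd(a, b) and alpha*a + beta*b == d.

    Assumes a and b are positive integers.
    """
    old_r, r = a, b
    old_x, x = 1, 0   # coefficients of a
    old_y, y = 0, 1   # coefficients of b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def smallest_positive_combination(a, b, bound=10):
    """Brute force: smallest positive alpha*a + beta*b with |alpha|, |beta| <= bound."""
    return min(
        alpha * a + beta * b
        for alpha in range(-bound, bound + 1)
        for beta in range(-bound, bound + 1)
        if alpha * a + beta * b > 0
    )


d, alpha, beta = extended_gcd(12, 15)
print(d, alpha, beta)                         # d is 3, and alpha*12 + beta*15 == 3
print(smallest_positive_combination(12, 15))  # also 3 within the searched range
```

The brute-force search only looks at a finite window of alpha and beta, so it illustrates the lemma for 12 and 15 rather than proving it.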

So now look at this alpha. Alpha is an integer; look at the equivalence class mod n which
contains alpha. That class, alpha bar, is
going to be the inverse of a, and that is clear from this equation: alpha times a is equal to
1 minus beta n. So if you look at this
equation mod n, alpha a is equal to 1 mod n. So it is straightforward
that any element of the set defined in this particular manner must have a multiplicative
inverse. The set is also closed under multiplication: if a and b both have inverses, then the
product of those inverses is an inverse for a b.
Associativity comes for free, and the identity 1 is surely an element of this set, because
GCD (1, n) is of course 1, and therefore this forms a group under multiplication.
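Both directions of this argument can be checked by brute force for small n. The sketch below is in Python and is not part of the lecture; the function names units, brute_force_inverse and is_closed are invented for illustration, and n is assumed to be at least 2.

```python
from math import gcd


def units(n):
    """Elements a of 1..n-1 with gcd(a, n) == 1 (assumes n >= 2)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def brute_force_inverse(a, n):
    """Return x in 0..n-1 with a*x == 1 (mod n), or None if no such x exists."""
    for x in range(n):
        if (a * x) % n == 1:
            return x
    return None


def is_closed(elements, n):
    """Check that the product mod n of any two elements is again in the set."""
    members = set(elements)
    return all((a * b) % n in members for a in elements for b in elements)


n = 10
print(units(n))                                   # elements coprime to 10
print({a: brute_force_inverse(a, n) for a in range(n)})
print(is_closed(units(n), n))
```

For every n tried, the elements that have an inverse should be exactly those returned by units(n), which is what the two halves of the argument claim.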

(Refer Slide Time: 44:15)

So we have two new groups, in fact, infinitely many new finite groups: Z mod nZ under
addition mod n, and the invertible elements of Z mod nZ under multiplication mod n. Although
the arithmetic is mod n in both cases, the sets
are going to be different. In the first case, the set will contain 0 and it will
contain all the other n minus 1 elements as well, whereas if you consider
the multiplicative group, it
will have strictly fewer than n elements. If n is prime, then and only then,
it will have n minus 1 elements, because if n is prime, the numbers which are relatively prime
to n, that is, the numbers which have GCD with n as 1, are all the numbers less than n, except 0.
So this gives us many examples of finite groups. We will now see some other examples, from
the set of complex numbers. So let us look at the complex plane. Each complex number can be
thought of as, let us say, r e raise to i theta, where
theta is the angle and r is the distance from the origin. Now, the complex
numbers, if you remove the complex number 0, are of course going to be a group under
multiplication. Let us look at another example, the roots of unity.

So let us look at the unit circle, and let us look at the nth roots of unity. So let us say
n is equal to 10. If one nth root is, let us say,
omega, you can check that the other roots are going to be omega square, omega cube and so
on up to omega raise to n minus 1. So 1, omega, omega square, ..., omega raise to n minus 1
are going to be the nth roots of unity, where omega is e raise to i 2 Pi by n, and
i is the square root of minus 1. So now let us just focus on these numbers. Do they form a
group? Of course we have to think about what the operation is. If you multiply two of them, if
you take omega raise to 3 and, let us say n is 10, multiply it by omega raise to 9, this is
going to be omega raise to 12, that is omega raise to 10 into omega square, and that
is going to be just omega square, since omega raise to 10 is 1. So here the multiplication is
complex multiplication, restricted to these ten numbers.

So we are looking at these ten numbers, 1, omega, omega square up to omega raise to 9. You can
check that these can be multiplied as
complex numbers, and the result is going to lie in the same collection. So we have a
valid binary operation on the set of nth roots of unity, and because complex
multiplication is associative, we have an associative operation. You can verify that since 1
is there, there is a multiplicative identity, and also every element here can be inverted: the
inverse of omega raise to k is omega raise to n minus k. Therefore these form a group under
multiplication. So this is again
a finite group, which is a subgroup of an infinite group.
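The closure of the 10th roots of unity under multiplication can be checked numerically. This is a Python sketch, not from the lecture: the helper name index_of and the tolerance 1e-9 are assumptions made here, and the check uses floating-point complex numbers, so it is a numerical illustration and not an exact proof.

```python
import cmath

n = 10
omega = cmath.exp(2j * cmath.pi / n)      # e raised to i 2 pi / n
roots = [omega ** k for k in range(n)]    # 1, omega, ..., omega^(n-1)


def index_of(z, tol=1e-9):
    """Return k such that z equals omega**k up to rounding, or None."""
    for k, w in enumerate(roots):
        if abs(z - w) < tol:
            return k
    return None


# omega^3 times omega^9 should be omega^12 = omega^2.
print(index_of(roots[3] * roots[9]))

# Every product of two roots should again be a root, with exponents added mod n.
closed = all(
    index_of(roots[a] * roots[b]) == (a + b) % n
    for a in range(n)
    for b in range(n)
)
print(closed)
```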

(Refer Slide Time: 49:03)

If you look at matrices, the collection of all square matrices of a fixed size is not going to
be a group under matrix multiplication, because there are a lot of matrices which cannot
be inverted. But if you look at the invertible matrices, we can show that they form a group
under matrix multiplication. We will stop here for today, and continue our discussion on
groups in the next class.

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 39
Dihedral Groups, Isomorphisms

(Refer Slide Time: 0:32)

So we have seen many examples of groups. Today we will see one more example, a special
group called the dihedral group. While we are studying this group, we will also learn that
it is a non-commutative group. We will learn the concept of generators and how a group is
presented or described. So let us understand what the dihedral group is. Consider a regular
polygon having n vertices. So let us think that in R2, in the two-dimensional plane, there is a
regular polygon with n sides. We are interested in the symmetries of this polygon.
By symmetries, what we mean is all those rigid motions which will leave this polygon in the
same location; they leave the polygon unchanged. Here we have given labels to the corners
of the polygon. When we are talking about the regular n-gon, the vertices do not really carry
labels; we label the vertices for a reason that will become clear shortly.

So we want to look at all those rigid motions which will leave the regular
n-gon unchanged. One such motion would be: if there are n sides and we rotate
this n-gon by an angle 2 Pi by n, that is the angle in radians,
about the centre of the regular n-gon, then the starting configuration and the
ending configuration would look identical. Of course, if you have labelled the vertices, then
whatever was at vertex 1 will go to vertex 2 if our rotation is anti-clockwise. So that is an
example of a rigid motion which leaves the n-gon unchanged. So let us consider the
collection of all such rigid motions of this regular n-gon. Fix an n, let us say n is equal
to 8 or 10, pick any number, and consider a regular polygon
of that many sides; we are interested in the set of all the rigid motions which will leave the
n sided polygon unchanged. If you look at this collection, does this collection
form a group, and if so under what operation?

Since we can think of these rigid motions as functions which are
taking one point to another, we could think of function
composition as the natural operation. So, suppose f is a rigid motion; it means it is
taking the polygon, moving it around in 3D space, in R3, and finally
placing it back on the plane so that the diagram looks exactly the same; that is
what we mean by a symmetry. So if you have one such f and another such g, and if you do the
operation g and then follow it by f, that can be written as f composed with g. If you do
these operations one after the other, then you can reason out that the resultant operation will
also leave the polygon unchanged: the first operation leaves it unchanged, and on the result,
which is the same as the original polygon, you apply another operation, and that again gives you
back the original polygon. Therefore if you combine 2 symmetries, the result will naturally be
a symmetry.

Now how do we describe these symmetries? We could of course think of them as maps from
R2 to R2, but that is a huge object: it means you are specifying, for every point in R2, the
point in R2 to which it is mapped. But here what really matters is where
each vertex is being sent. So we can in fact represent every symmetry of the regular n-gon by
means of a permutation of the vertices.

The permutation records this: if 1 was this particular vertex, after the rigid motion this
vertex has to end up at the position of some vertex; which vertex is it? Vertex 2 has to end up
at one of the other vertices; where does it go? So, each symmetry can be
described using a permutation. That is a key observation. So here we are looking at
symmetries of the regular n-gon. If you fix the polygon, then every symmetry can be
viewed as a permutation on the set 1 to n: label the vertices of the polygon as 1, 2, 3
up to n, and then we can look at permutations, or arrangements, of 1 to n, and
each rigid motion which is a symmetry can be viewed as a permutation. In fact the
anti-clockwise rotation by 2 Pi by n radians would be the permutation which maps 1 to 2, 2 to 3,
3 to 4 and so on, n minus 1 to n, and n back to 1.

So if you look at the permutation given by 2, 3 up to n followed by 1, this permutation
represents the symmetry that is obtained by
rotating the n-gon by an angle 2 Pi by n. So now we have done just one example; one of
the symmetries can be viewed in this way. You can convince yourself that every symmetry can
be viewed as a permutation, and therefore the collection of symmetries is going to be a finite
set. So this is an example of a finite group. The simplest way in which
any finite group could be described is by means of
what is called its multiplication table. So if a, b, c, d, e are the elements, we have a square
matrix whose rows and columns are indexed by a, b, c, d, e, and at the position i, j we
give the value of combining i with j.

So i star j will be written at the position i, j in the matrix; such a matrix is known as a
multiplication table. So here, in an abstract sense, we know that if you take all the symmetries
of this regular n-gon, you get a collection, and this collection has a nice property: 2 of them
can be combined by means of function composition. If you view them as permutations,
permutations are again functions from the set 1, 2, 3 up to n to the set 1, 2, 3 up to
n, so these functions can be composed, and that operation is an associative operation.
So under function composition, the requirements for the set of symmetries to be a group
are satisfied.

The required properties were, first, closure: 2 elements should combine and give
an element in the collection, and that is of course the case. The operation is associative; you
can verify that. The remaining properties were the existence of inverses and of an identity. If
you look at any permutation, it has an inverse permutation, and that inverse permutation will
correspond to a particular symmetry, obtained by reversing the motion. If 1 was sent to n,
look at the permutation which sends n to 1; the symmetry corresponding to that
is the inverse symmetry that we are interested in. And the permutation which maps every
element to itself, the identity map, will serve as the identity. So the inverse and identity
properties that are required of groups have been checked as well.
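The permutation view of a symmetry, and the group properties just checked, can be sketched in a few lines of Python. Nothing here is from the lecture beyond the idea itself: the representation (a tuple whose entry at index i minus 1 is the image of vertex i) and the names rotation, compose and inverse are choices made for this illustration.

```python
def rotation(n):
    """The permutation 2, 3, ..., n, 1: vertex i goes to i + 1, and n goes to 1."""
    return tuple(list(range(2, n + 1)) + [1])


def compose(f, g):
    """f composed with g: apply g first, then f (vertices are labelled 1..n)."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def inverse(p):
    """The permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


n = 5
identity = tuple(range(1, n + 1))
r = rotation(n)

print(r)                                   # (2, 3, 4, 5, 1)
print(compose(r, r))                       # rotating twice: (3, 4, 5, 1, 2)
print(compose(r, inverse(r)) == identity)  # a symmetry followed by its inverse
```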

So these form a group; that is clear in an abstract sense. But can we present this in a more
concrete manner? What are all the elements, can we lay them out, can we enumerate them?
We will take the regular n-gon and assume that n is
greater than or equal to 3, because a 2 sided polygon does not really make much sense.
So we will consider n greater than or equal to 3 for all the
discussions that we are doing. So let us look at any such polygon, and look at the special
symmetry which rotates by 2 Pi by n. Let us call that r. So r is the
symmetry which rotates by 2 Pi by n. If you look at the elements 1, r, r
square, up to r raise to n minus 1, they are all symmetries. Here 1 means do not do anything,
that is, the identity symmetry; r is rotate by 2 Pi by n; r square is rotate by 2 times 2 Pi by
n; r raise to i is rotate by i times 2 Pi by n, and so on.

So these are some of the symmetries. Are there more symmetries for the regular n-gon? Let
us introduce one more symmetry, which we will call s. By
s, we mean the symmetry obtained by looking at the line joining vertex 1 to the centre and
flipping the polygon about that line. In some sense, the full collection of
symmetries can be described using r and s. Why is this so? So, how many symmetries can the
regular n-gon have?

(Refer Slide Time: 13:13)

Since we agreed that every symmetry can be described by a permutation,
every symmetry of the regular n-gon has a corresponding permutation, and because of this
we know that the total number of symmetries is going to be at most n factorial. It is
going to be much less than that in our case. Look
at vertex 1: where does vertex 1 go, does it go to vertex i, does it go to vertex n minus 3,
and so on; that is one piece of information that we would need. So let us say 1 goes to some
location. Now, if 1 goes to the vertex numbered i, then 2, being a neighbour of 1 in the original
configuration, should be sent to either i plus 1 or i minus 1 (taken mod n). That
is the only possibility, because anywhere else would break the rigidity of the polygon;
the motion would no longer be a rigid motion.

So there are n options for where vertex 1 can land, and then only 2 options for where vertex 2
can land. Further, once you have decided
to which vertex 1 goes, and to which vertex 2 goes,
the position of every other vertex is fixed. If 1 has gone to some position, let us say i,
and 2 has gone to, let us say, i minus 1, then 3 has only one position left and 4 again has only
one position left, and so you can argue that every position gets fixed. So clearly from this,
we can conclude that there are at most 2n symmetries. You take any regular n-gon, the
maximum number of symmetries that it can have is 2n, and if we can
describe some 2n distinct symmetries for the regular
n-gon, we know that that is going to be the full set, because there cannot be anything outside
that, since 2n is the maximum possible.
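The counting argument can be cross-checked by brute force for small n. The Python sketch below is an illustration under stated assumptions, not the lecture's method: vertices are labelled 0 to n minus 1 instead of 1 to n, the name polygon_symmetries is invented, and "rigid" is modelled combinatorially as "neighbouring vertices are sent to neighbouring vertices".

```python
from itertools import permutations


def polygon_symmetries(n):
    """All permutations of 0..n-1 that send neighbouring vertices to neighbouring vertices."""
    edges = {frozenset((i, (i + 1) % n)) for i in range(n)}
    result = []
    for p in permutations(range(n)):
        if all(frozenset((p[i], p[(i + 1) % n])) in edges for i in range(n)):
            result.append(p)
    return result


for n in range(3, 8):
    # The second number printed should equal 2 * n.
    print(n, len(polygon_symmetries(n)))
```

The search goes through all n factorial permutations, so it is only practical for small n; it is meant to confirm the bound 2n, not to replace the argument.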

So if you want to visualise the 2n symmetries, they are going to look
slightly different for the n even case and the n odd case. If you look at the square, which is 4
sided, you can rotate by 90 degrees, and 1 will go to 2; rotating by 180 degrees means 1 will
go to 3 and 2 will go to 4, and so on. So there are going to be n symmetries of that kind, and
you can flip about 4 different axes, the 2 diagonals and the 2 lines joining the midpoints of
opposite sides. So that will give
you an additional n symmetries. If this was a pentagon, there are 5 rotations, and then you can
flip about the line through each vertex: when you join the vertex to the centre, the line is
going to pass through the midpoint of the opposite side. So that is going to give you 5
different symmetries, and these
symmetries are going to be different from the rotations, because in a rotation, except for the
identity, all positions will be mapped to a different position; there will not be
an i such that i maps to i after the symmetry has been applied.

The position where i lands is going to be different from i. There is not
going to be a single position which remains invariant under the application of a non-identity
rotation. Whereas, when we flip about a particular axis, one or more
vertices could remain fixed. So in the square, 1 and 3 are going to be fixed when you flip about
the diagonal through them, and when you flip about an axis through the midpoints of two sides,
there is an exchange of 2 neighbouring vertices such as 1 and 2, whereas a rotation never
exchanges two neighbouring vertices.

(Refer Slide Time: 18:48)

So you can convince yourself that these are the only symmetries. But we will argue about it
in a formal setting. So let us say that r is this particular rotation by 2
Pi by n. Then 1, r, r square, up to r raise to n minus 1 are all going to be distinct elements,
because if you look at any two of them, r raise to i and r raise to j with i not equal to j,
then r raise to i sends vertex 1 to vertex i plus 1, whereas r raise to
j sends vertex 1 to vertex j plus 1. So r raise to i and r raise to j have to be different, and
this accounts for n symmetries. Now
let s be the flip that we described. So if you look at a regular n-gon, this is
vertex number 1 and this is the centre of the polygon; join them. If it is an even sided
polygon, this line will pass through another vertex. If it is an odd sided polygon, it will pass
through the middle of a side; it does not matter which one is the case. We flip about this
particular line, and that is what we call s. Now note that s, s r, s r square, all the way up to
s r raise to n minus 1, are n symmetries. These n symmetries are also going to be all different,
that means s r raise to i is not equal to s r raise to j when i and j are different.

Clearly these are going to be symmetries, because the flip
about this particular axis leaves
the polygon unchanged, and if you combine it with a rotation,
which also leaves the polygon unchanged, you are finally going to get an operation which leaves
the polygon unchanged. Can two of them be equal? If s r raise to i is equal to s r raise to j,
that would mean that even without applying the s, the two would have been the same, since the
s can be undone by flipping once more. That
means r raise to i will be equal to r raise to j. This is not the case, because we argued that r
raise to i and r raise to j are different when i is not equal to j. So here we have n symmetries
and here there is another set of n symmetries; the first n are all
different from each other and the second n are all different from each other, but there could
still be a problem: there could be an element
common to the two sets. So if you call the first set A and the second set B, maybe A
intersection B is not empty.
If A intersection B is empty, then we have accounted for all 2n elements. So let us argue that
no element of A is the same as an element of B. Suppose s r raise to i is equal to r raise to j.

Now if you flip twice, that is the same as the starting configuration; that means s square is
the identity. So if we have s r raise to i equal to r raise to j, then multiplying by s on both
sides we have r raise to i equal to s r raise to j, which is a statement of the same form with
i and j exchanged. So we could assume without loss of generality that j is the larger quantity,
j is greater than or equal to i. Now in s r raise to i equal to r raise to j, both sides start
by rotating i times by 2 Pi by n, so we can cancel these off:
we reduce the number of rotations on each side by the smaller amount, i. So we can argue that s
must be equal to r raise to j minus i. But this clearly cannot happen: s is a flip, r raise to j
minus i is some rotation, and a flip can never be equal to a rotation, because a flip leaves a
particular vertex unchanged.

(Refer Slide Time: 24:064)

The vertex about which you are flipping, which here is vertex 1, is left unchanged by the flip,
whereas r raise to j minus i sends 1 to the vertex 1 plus j minus i, which is a different vertex
unless j equals i, and in that case s would be the identity, which it is not, since s moves
vertex 2. So there are
no common elements. So we argued that the elements of the dihedral group are 1, r, r square, up
to r raise to n minus 1, and s, s r, s r square, up to s r raise to n minus 1. These are 2n
distinct elements; the distinctness
we have already argued. We also argued that these elements do form a group. Now we want
to see how this group can be understood in terms of just r and s. Let us try and figure out the
multiplication table for this particular group. For the identity
element, how it multiplies is clear: 1 multiplied with anything gives the same thing back.
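The list of 2n elements can be generated and checked for distinctness on a computer. The Python sketch below is illustrative only: vertices are labelled 0 to n minus 1 (so the lecture's vertex 1 is vertex 0 here), r is modelled as v goes to v plus 1 mod n and s as v goes to minus v mod n, and the names compose and dihedral_elements are invented for this example.

```python
def compose(f, g):
    """f composed with g: apply g first, then f (vertices are labelled 0..n-1)."""
    return tuple(f[g[v]] for v in range(len(g)))


def dihedral_elements(n):
    """Return ([1, r, ..., r^(n-1)], [s, s r, ..., s r^(n-1)]) as permutations."""
    identity = tuple(range(n))
    r = tuple((v + 1) % n for v in range(n))   # rotation by 2 pi / n
    s = tuple((-v) % n for v in range(n))      # flip about the axis through vertex 0
    rotations = [identity]
    for _ in range(n - 1):
        rotations.append(compose(r, rotations[-1]))
    flips = [compose(s, rot) for rot in rotations]   # s r^i: rotate first, then flip
    return rotations, flips


rotations, flips = dihedral_elements(6)
elements = set(rotations) | set(flips)
print(len(elements))   # number of distinct symmetries found for n = 6
print(all(compose(x, y) in elements for x in elements for y in elements))
```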

If you have an element r raise to i, it could multiply with r raise to j, or r raise to
i could multiply with s r raise to j, or you could have, let us say, s r raise to i and s r
raise to j. So there are 3 types of multiplications that can happen. If you call the rotations
A and the elements involving a flip B, these are
elements of A with A, elements of A with B, and elements of B with B. Now, r was a
rotation by some angle theta. If we find where exactly 1 and 2 are going after the symmetry is
performed, that will fix the entire polygon, and therefore while trying to understand
the composition of these symmetries it is enough if we know where the composition maps 1
and 2. So let us say one rotation is by angle theta, where theta is 2 Pi by n. If you call r the
rotation by angle theta, r raise to i is the rotation by i times theta. So r raise to i and r
raise to j, if you compose them, that is equivalent to first rotating by j theta and then
rotating by i theta, and that is a rotation by j plus i theta, which is r raise to i plus j,
with the exponent taken mod n. So this is a simple multiplication.

Now the other 2 types have s coming in between the rotations, that is, flips are coming in
between the rotations. So s is a flip: if you look at the vertices 1, 2, 3, the flip about
vertex 1 takes these to the other side. So after the flip, 1 stays where it is, 2 will be in the
place of n, 3 will be in the place of n minus 1, and in place of 2
there will be n, in place of 3 there will be n minus 1, and so on. In order to understand
r raise to i times s r raise to j, we will first look at the simplest of
symmetries involving both, namely s r. So s r means we rotate by r and then
do a flip; when we say s r, it means first apply r and then apply s. Let us draw this. The
rotation will take 1 to the position of 2 and 2 to the
position of 3, and whatever was at n would come to the 1st position. Then if we do an s,
that is, flip about the line through the 1st position, what we will get is: n will remain
wherever it is, 1 will go to the other side, to the position where n originally was, and 2 will
be just beyond 1, at the position where n minus 1 originally was. Having fixed where 1 and 2 go,
all the other positions get automatically fixed.

Now, can we view this s r as some other expression? I claim that s r is equal to r inverse s,
that means first you flip and then you do a reverse rotation. Let us verify that this is the
case. If you look at 1, 2, the initial part of the polygon, and we
apply an s, we will get 1 at the same position, 2 goes to the other side, to the position of n,
and the position of 2 will be taken by n. So this is how the polygon will look after we perform
an s. If we rotated this by r, then 1 would go to where n is now sitting, but if we do a reverse
rotation, that is r inverse, there will be a clockwise rotation. Therefore
n will come to the 1st position, 1 will come to the position where n originally was, and 2 will
go to the position where n minus 1 originally was.

So we can see that whatever we obtained by s r is the same as what we get by r inverse s. So we
can claim that s r is equal to r inverse s, and this would also mean that s r raise to i is
going to be r raise to minus i s. This can be seen by repeatedly applying this rule. If you have
s r square, that is s r r, which is equal to r inverse s r; apply the same rule again, and you
will get r inverse times r inverse times s, that is r raise to minus 2 s. So basically what it
means is: if we have an expression which involves just r, like r raise to i times r raise to j,
we can just add the exponents, and if you have an s somewhere, those s's can be propagated to
one side.
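The rule s r equal to r inverse s, and its consequence for higher powers of r, can be verified on permutations. This Python sketch rests on the same modelling assumptions as before, none of which come from the lecture: vertices are labelled 0 to n minus 1, r is v goes to v plus 1 mod n, s is v goes to minus v mod n, and the names rotation_and_flip, power and check_relation are invented here.

```python
def compose(f, g):
    """f composed with g: apply g first, then f (vertices are labelled 0..n-1)."""
    return tuple(f[g[v]] for v in range(len(g)))


def power(p, k):
    """p composed with itself k times, for k >= 0."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def rotation_and_flip(n):
    """Return (r, r_inverse, s) for the regular n-gon."""
    r = tuple((v + 1) % n for v in range(n))
    r_inverse = tuple((v - 1) % n for v in range(n))
    s = tuple((-v) % n for v in range(n))
    return r, r_inverse, s


def check_relation(n):
    """Check s r^i == r^(-i) s for every i in 0..n-1."""
    r, r_inverse, s = rotation_and_flip(n)
    return all(
        compose(s, power(r, i)) == compose(power(r_inverse, i), s)
        for i in range(n)
    )


print(all(check_relation(n) for n in range(3, 10)))

# r s and s r differ for n >= 3, so the group is not commutative.
r, r_inverse, s = rotation_and_flip(5)
print(compose(r, s) == compose(s, r))
```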

So send all the s's to the left side; each time an s moves past r raise to i, the exponent i
changes to minus i. Once all the s's
have been accumulated at one end we will have some expression of the form s raise to k r
raise to j prime, and s raise to k we can simplify to either the identity or s, because s square
is doing 2 flips, which is equal to doing nothing, the identity operation. So by using these
rules any expression that could come out of multiplying r's and s's can be simplified, and we
have the complete multiplication table for the group.

(Refer Slide Time: 32:18)

So the product of s r raise to i and s r raise to j, in particular, would be r raise to minus i
s s r raise to j, and since s
square is the identity, that is r raise to minus i plus j. All these additions you can think of
as being carried out mod n. So what this means is, the entire dihedral group can be described as
consisting of r and s and their combinations. So we will say that r and s generate D 2n, written
D 2n equals <r, s>,
with the relations that r raise to n is equal to identity, s square is equal to identity and r s
is equal to s r inverse. You can see that this is going to be a non-commutative group, because
r s is s r inverse, which is different from s r when n is at least 3. So here if you
look at the group, the group is of order 2n, that is, there are 2n elements. And if you look at
the element r, it also has an order; the word order we are using in multiple ways.

When we say order of a group, that means the total number of elements, and when we say
order of a particular element, we mean the smallest positive power of that element which gives
the identity. So for example the order of r is equal to n, and the order of s is equal
to 2. So now that we have seen many examples of groups, we will move on to learn more
about the abstract concept of a group.
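The rewriting rules and the two notions of order can be put together in a short Python sketch. It is an illustration under assumptions that are not in the lecture: each element s raise to a times r raise to i is stored as a pair (a, i) with a in {0, 1} and i taken mod n, and the names multiply and order are invented for this example.

```python
def multiply(x, y, n):
    """Product of s^a r^i and s^b r^j in D_2n, each given as a pair (a, i)."""
    a, i = x
    b, j = y
    # Moving s^b to the left past r^i turns r^i into r^(-i) when b == 1.
    moved = -i if b == 1 else i
    return ((a + b) % 2, (moved + j) % n)


def order(x, n):
    """Smallest positive k such that x multiplied with itself k times is the identity."""
    identity = (0, 0)
    current, k = x, 1
    while current != identity:
        current = multiply(current, x, n)
        k += 1
    return k


n = 6
r, s = (0, 1), (1, 0)
print(multiply((1, 2), (1, 5), n))        # (s r^2)(s r^5) = r^(5 - 2)
print(multiply(r, s, n), multiply(s, r, n))   # r s versus s r
print(order(r, n), order(s, n))           # order of r and order of s
print(len({(a, i) for a in range(2) for i in range(n)}))  # order of the group
```

Storing only the pair (a, i) works because every product of r's and s's reduces to the form s raise to a times r raise to i by the rules above.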

(Refer Slide Time: 33:46)

We will introduce the notion of isomorphism of groups. So let me describe 4 different
groups, and you can think about whether these groups are the same in a certain sense. The first
collection of elements is a subset of the complex numbers, namely plus or minus 1 and plus or
minus i. These are the fourth roots of unity.
For these 4 elements,
complex multiplication is well-defined, you can combine them, and you can verify that
they form a group. Let me call this G1. If I write the multiplication table for this,
it will look something like this. Here 1 is the element which serves as the identity. 1,
minus 1, i, minus i are the 4 elements, and 1 will multiply and leave the elements
invariant. Minus 1 squared is 1, i squared is minus 1, minus i squared is also minus 1.

And we can fill up this table. My second group, G2, consists of the elements 0, 1, 2, 3, and I
am considering addition mod 4. The identity in this case
would be 0. 1 plus 1 would be 2, 1 plus 2 would be 3, 1 plus 3 will be 4, which mod 4 is 0. So
both these are groups with 4 elements. Another group, G3, is the familiar group that we had
introduced when we started, consisting of 4 elements e, a, b and c. Their
multiplication was given as follows: e is the identity, any
element multiplied by itself gives the identity, and two different non-identity elements
when combined give the third one. So b times a is c, b times c is a, a times b is c,
a times c is b, c and a combine to give b, and c and b combine to give a.

The 4th group that we will consider, G4, has elements 1, 3, 7 and 9, and the operation that we
are considering here is multiplication mod 10. So 1, 3, 7 and 9 are going to serve as the
elements and 1 is going to play the role of identity. 3 into 3 is 9; 3 into 7 is 21, which mod
10 is going to be 1; 3 into 9 is 27, that gives 7. 7 into 3 is 1; 7 into 7 is 49, that
is going to be 9 mod 10. 9 into 7 is 63, that is 3 mod 10. So we can fill up this table. So these
are all 4 element groups, but are they really different groups? For instance, in G2,
instead of writing 0, 1, 2 and 3, if I
had written this entire thing in binary, so 00, 01, 10, 11 for the rows and 00, 01, 10, 11 for
the columns, and filled up the table, it would be exactly the same group in a certain sense; it
is just that the names are different. So up to renaming, these 2 groups are the same. Here also,
although whatever is used to write down those groups looks different, are they just renamings of
each other?
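The tables described above are easy to fill in mechanically. Here is a small Python sketch for G2 and G4; the helper name cayley_table is an invention for this illustration and does not come from the lecture.

```python
def cayley_table(elements, op):
    """Rows of the multiplication table: entry (x, y) holds op(x, y)."""
    return [[op(x, y) for y in elements] for x in elements]


g2 = cayley_table([0, 1, 2, 3], lambda x, y: (x + y) % 4)    # addition mod 4
g4 = cayley_table([1, 3, 7, 9], lambda x, y: (x * y) % 10)   # multiplication mod 10

for row in g2:
    print(row)
print()
for row in g4:
    print(row)
```

Comparing the two printed tables entry by entry under the renaming 0 to 1, 1 to 3, 2 to 9, 3 to 7 is one way to see that they describe the same group.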

So can we find a renaming of one of these groups and get the other groups, or is that not
possible? We will argue that some of these are the same up to renaming and certain others are
not. The 3 groups G1, G2 and G4 are nothing but renamings of each other,
whereas G3 is fundamentally different.
In group theory, to state this fact we say
that G1, G2 and G4 are isomorphic groups, whereas G1 and G3 are non-isomorphic, and
therefore G3 is not isomorphic to G2 or G4 either. So how do we see that they are
isomorphic? To show that 2 groups are isomorphic
is conceptually straightforward, because all you have to do is find a renaming. Some
part of the renaming we already have: 0 acts as the identity in G2, so 0 should be
mapped to 1, the identity of G1.

So 0 we will map to 1, 1 we will map to i, 2 we will map to minus 1 and 3 we will map to
minus i. If you do this mapping, you can verify that this is an isomorphism from G2 to G1.
So G1 and G2 are isomorphic, because with this mapping
multiplication in G1 behaves exactly like addition mod 4. And if you want to
convert G2 to G4, 0 will be mapped to 1, 1 will be mapped to 3, 2 will be mapped to 9 and
3 will be mapped to 7. An easier way of seeing this would be: G2 can be viewed as 1, 1
plus 1, 1 plus 1 plus 1 and 1 plus 1 plus 1 plus 1. So if we call this element a and write the
operation multiplicatively, then G2 is
equal to a, a square, a cube, a raise to 4. If you look at G1 and take the element i, the
entire collection can be seen as i, i square, i cube, i raise to 4, and if you look at G4, its
elements are nothing but 3, 3 square, 3 cube and 3 raise to 4 mod 10: 3 square is 9, 3 cube is
27, 3 raise to 4 is 81, so 3, 9, 7 and 1.
So all these 3 are just different ways of representing the cyclic group which contains
exactly 4 elements.

Now all that we showed is that these 3 are the same. How do we argue that the remaining group,
G3, is really different from G4? In G3, there
is no element which generates the complete collection. It is not a cyclic group, because if you
take any element and square it, you get the identity: alpha square is equal to identity for
every element alpha. Therefore it cannot be one of the other 3 groups under any renaming. If you
look at the diagonal of the multiplication table, for G3 the diagonal contains only the
identity, whereas in all the other cases the diagonal contains
exactly 2 different elements. So these are non-isomorphic groups.

(Refer Slide Time: 44:33)

Okay, so formally, a group G is isomorphic to a group H if, first, there exists a bijection,
let us call it f, from G to H, and further f(g1 g2) is equal to f(g1) times f(g2). Here g1 and
g2 are combined using the binary operation in G. f(g1) is an element of H, f(g2) is an element
of H, and when these are combined using the operation that makes H a group, the result should
be equal to f(g1 g2). This should be true for all g1, g2 belonging to G. If this is the case,
then we say that f is an isomorphism between G and H, and if there is an isomorphism we say
that the groups themselves are isomorphic. Okay, so we will stop. This is the end of this
lecture. We will continue on group theory in the coming lectures.

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering
Indian Institute of Technology, Guwahati
Lecture 40
Cyclic groups, Direct Products, Sub groups

(Refer Slide Time: 00:40)

Last lecture, we were studying isomorphism of groups. We will do one more example of
isomorphism. The first group G that we consider will be the symmetries of the equilateral
triangle. So consider an equilateral triangle in the plane, and let us name the vertices A, B
and C. There are 6 symmetries: 3 corresponding to rotations and 3 corresponding to reflections
about an axis through a vertex. Let us name these elements. The first is the identity
permutation, which does not do anything, and the second is the rotation by 120 degrees about
the centre. So if we look at the centroid of the equilateral triangle and consider rotation by
120 degrees about it, we get one particular element of this group.

Let us call that r. So r is rotation by 120 degrees, and then there is a rotation by 240
degrees. r is the permutation that sends A to the position of C, C to the position of B and B
to the position of A. If you rotate once more, let us denote the result by s. So these are the
3 elements corresponding to rotations, and then you can reflect about any one of the axes.

If you reflect about the axis joining A to the centroid, B and C flip. Let us call that x. So
x is the permutation which keeps A wherever it is and swaps B and C: C comes here and B comes
here. There are 2 more axes. The flip about the axis through B we will call y, and the flip
about the axis through the vertex C we will call z. So our group G consists of these 6
elements: i, r, s, x, y and z. These elements multiply in a certain way (this is an example of
a dihedral group), and you can verify that they do form a group under composition of these
operations.

For example, let us combine r with x, starting with A, B, C as the labels. When you say r times
x, that means you are first applying x. When you apply x, B and C get flipped, so you end up in
the configuration A C B. Then you rotate, and you reach the configuration C B A, in which B is
back in its own place and A and C have been exchanged. This is the same as flipping about B: if
you had taken the vertex B and flipped about it, this is what you would get. So r x is equal to
y. You can do all the other operations and verify that these do indeed form a group. So this is
our first group.
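
The computation of r x can be replayed in a few lines. This is only a sketch under one assumed encoding: the three positions are numbered 0, 1, 2 and initially hold A, B, C, and a symmetry is stored as a tuple p where whatever sits at position k moves to position p[k]. The helper name compose is hypothetical.

```python
from itertools import product

# A symmetry is a tuple p: whatever sits at position k moves to position p[k].
# Positions 0, 1, 2 initially hold the vertices A, B, C.
i = (0, 1, 2)          # identity
r = (2, 0, 1)          # A -> position of C, C -> position of B, B -> position of A
x = (0, 2, 1)          # keeps A, swaps B and C
y = (2, 1, 0)          # keeps B, swaps A and C
z = (1, 0, 2)          # keeps C, swaps A and B

def compose(p, q):
    """Return 'p times q': apply q first, then p."""
    return tuple(p[q[k]] for k in range(3))

s = compose(r, r)      # rotation by 240 degrees
G = {i, r, s, x, y, z}

print(compose(r, x) == y)                                        # r x = y
closed = all(compose(p, q) in G for p, q in product(G, repeat=2))
print(closed)                                                    # products stay inside G
```

With a different convention for the direction of rotation or the order of composition, the product r x may come out as a different reflection, so the encoding above should be read as one consistent choice rather than the only one.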

(Refer Slide Time: 05:00)

The second group that we consider, which we will denote by H, will consist of matrices. These
are a special kind of 2 by 2 matrices whose entries come from Z3; by Z3 we mean the numbers
from 0 to 2. The first row is alpha, beta, where alpha and beta are members of Z3, and the
second row is fixed as 0, 1. We also have the restriction that alpha is not equal to 0. In
particular, the determinant of such a matrix is alpha, which is non-zero. So H consists of all
matrices of the form alpha, beta, 0, 1 where alpha is not equal to 0 and alpha, beta are
elements of Z3.

So now we have specified the matrices. What is the operation that makes these elements a
group? We will consider the usual matrix multiplication, but carried out mod 3. For example,
take the matrix 2, 2, 0, 1. This is one of the elements of H, because 2 and 2 belong to Z3 and
2 is not equal to 0. Multiply it with the same element 2, 2, 0, 1. With the usual matrix
multiplication you would get 4, 6, 0, 1, but this is not an element of H. Since we do the
matrix multiplication mod 3, we get the product as 1, 0, 0, 1. In general, alpha 1, beta 1, 0, 1
times alpha 2, beta 2, 0, 1 gives alpha 1 alpha 2, alpha 1 beta 2 plus beta 1, 0, 1, and this
reduced mod 3 is our final answer.

So clearly this multiplication is a well-defined operation: you can verify that if alpha 1 and
alpha 2 are not equal to 0, then their product is still non-zero mod 3. So this is our second
set H. How many elements does H have? H also has 6 elements, because there are 2 choices for
alpha, namely 1 and 2, and beta has 3 choices, 0, 1 and 2. So the total number of choices is 2
times 3, that is 6. We can write these 6 elements as 1 0 0 1, 1 1 0 1, 1 2 0 1 and then 2 0 0 1,
2 1 0 1 and 2 2 0 1. You can check that under the operation that we have defined these 6
elements do form a group.
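
The check that these 6 matrices form a group can be done by brute force. The following is a minimal sketch, assuming each matrix is stored as the pair (alpha, beta) of its first row, since the second row is always 0, 1; the function name mul is illustrative.

```python
from itertools import product

# An element of H is stored as (alpha, beta), standing for the matrix [[alpha, beta], [0, 1]].
H = [(a, b) for a in (1, 2) for b in (0, 1, 2)]

def mul(m1, m2):
    """Matrix product mod 3: [[a1, b1], [0, 1]] . [[a2, b2], [0, 1]]."""
    (a1, b1), (a2, b2) = m1, m2
    return ((a1 * a2) % 3, (a1 * b2 + b1) % 3)

identity = (1, 0)
closed = all(mul(p, q) in H for p, q in product(H, repeat=2))
has_identity = all(mul(identity, p) == p == mul(p, identity) for p in H)
has_inverses = all(any(mul(p, q) == identity == mul(q, p) for q in H) for p in H)
associative = all(mul(mul(p, q), t) == mul(p, mul(q, t)) for p, q, t in product(H, repeat=3))

print(len(H), mul((2, 2), (2, 2)))                      # 6 (1, 0)
print(closed, has_identity, has_inverses, associative)  # True True True True
```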

So now we have 2 groups G and H which both contain 6 elements, so it is easy to get a
bijection between them. But can we get a bijection which is also an isomorphism between the 2
groups? I will give the isomorphism; you can check that the bijection I give is indeed an
isomorphism. So i maps to, let us say, capital I, this is capital R, this is capital S, and
these we call X, Y and Z. Check that mapping the small letters to the capital letters is
indeed an isomorphism.
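
The transcript does not record which matrix is labelled R, S, X, Y or Z on the slide, so the sketch below does not assume any particular labelling. Instead it tries every bijection from G to H and keeps those that respect the operations. The encodings (permutations of positions for G, pairs (alpha, beta) for H) and all function names are assumptions made for illustration.

```python
from itertools import permutations, product

# G: symmetries of the triangle as permutations of the positions 0, 1, 2.
G = list(permutations(range(3)))

def g_mul(p, q):
    """Apply q first, then p."""
    return tuple(p[q[k]] for k in range(3))

# H: pairs (alpha, beta) standing for the matrix [[alpha, beta], [0, 1]] over Z3.
H = [(a, b) for a in (1, 2) for b in (0, 1, 2)]

def h_mul(m1, m2):
    (a1, b1), (a2, b2) = m1, m2
    return ((a1 * a2) % 3, (a1 * b2 + b1) % 3)

def isomorphisms():
    """Try every bijection G -> H and keep those with f(p q) = f(p) f(q)."""
    found = []
    for image in permutations(H):
        f = dict(zip(G, image))
        if all(f[g_mul(p, q)] == h_mul(f[p], f[q]) for p, q in product(G, repeat=2)):
            found.append(f)
    return found

isos = isomorphisms()
print(len(isos) > 0)            # at least one isomorphism exists
print(isos[0][(0, 1, 2)])       # the identity symmetry goes to the identity matrix (1, 0)
```

Trying all 720 bijections is feasible only because the groups are tiny; for the lecture's particular mapping one would check just that single bijection in the same way.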

(Refer Slide Time: 10:40)

In order to check that the 2 groups G and H are isomorphic, we need to check 2 things. The
first condition is that there exists a bijection, let us say f, from G to H. The second is
that, for this function f, f(g1 g2) is equal to f(g1) times f(g2). That is, if you combine g1
and g2 as per the operation inside G you get another element of G, and its image under f should
be the same as what you get by taking the images of g1 and g2 and combining them under the
operation in H. If these 2 results are the same for all g1 and g2, then we say that G and H are
isomorphic. If only the second of these conditions is satisfied, then we call the function f
which has this property a homomorphism. That means f need not be a bijection, but it is a
function which satisfies condition 2.

So a homomorphism is a function for which only condition 2 is required. We will study
homomorphisms in more detail in the coming lectures. So now we have understood what an
isomorphism is. In the earlier examples, maybe it was straightforward to see that the groups we
considered were isomorphic. Here the groups involved are small, with just 6 elements each, so
one can actually try the mapping out, but it is not obvious to the naked eye that these groups
are indeed isomorphic under this particular mapping.

(Refer Slide Time: 12:50)

We will learn some more terms from group theory. We have already mentioned this particular
term, cyclic groups. The definition says that a group is cyclic if all its elements can be
written as powers of some element. So G is cyclic if there exists an element small g of G such
that the set G is equal to g power 0, g power plus-minus 1, g power plus-minus 2 and so on.
When we write g power minus 1, we mean the inverse of g, and g power minus i is the inverse of
g power i. We could also think of g power minus i as g power minus 1 multiplied with itself i
times; since G is a group, these two descriptions agree. So a cyclic group is a group where
every element can be written in the form g power i for some integer i; in particular, i could
be 0.

So this is a general cyclic group. In the case of a finite group, this condition amounts to G
being equal to g power 1, g power 2 and so on up to g power n, for some particular n. Here n is
the order of the group, and it is also the order of the element g. So a finite cyclic group is
one where every element can be generated by multiplying one element with itself enough number
of times. In particular, the nth complex roots of unity form a finite cyclic group of order n.
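
To see the definition at work on small finite groups, here is a minimal sketch that looks for generators. As an assumed example it uses the numbers relatively prime to m under multiplication mod m, which for m = 10 is the group {1, 3, 7, 9} met earlier; the function names are hypothetical.

```python
from math import gcd

def units(m):
    """Numbers in 1..m-1 relatively prime to m (a group under multiplication mod m)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def generated(g, m):
    """Set of powers g, g^2, g^3, ... under multiplication mod m."""
    seen, p = set(), 1
    while True:
        p = (p * g) % m
        if p in seen:
            break
        seen.add(p)
    return seen

def generators(m):
    """Elements whose powers give the whole group; empty if the group is not cyclic."""
    U = set(units(m))
    return [g for g in sorted(U) if generated(g, m) == U]

print(generators(10))  # [3, 7]
print(generators(8))   # []
```

For m = 8 every element squares to the identity, so no single element generates the group, which is the same obstruction seen for the non-cyclic group of 4 elements in the previous lecture.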

And if you look at the nth roots of unity, you get a finite cyclic group of order n. We can
verify that, up to isomorphism, there is only one finite cyclic group of order n. In fact, one
does not require the condition of finiteness: up to isomorphism there is only one cyclic group
of any order. The next concept that we will learn is that of the direct product of groups. So
consider a group A and consider a group B.

We can take the direct product of these 2 groups, which we will write as A direct product B.
This consists of all pairs of the form (a, b) such that a belongs to capital A and b belongs to
capital B. If we take this collection of elements, what operation makes them a group? We
multiply the elements coordinate-wise, doing the operation in the corresponding group. For
example, take the group Z6, which consists of the elements 0, 1 up to 5 under addition mod 6.
And consider Z6 star, which consists of all elements that are relatively prime to 6, under
multiplication mod 6: 2, 3 and 4 are gone, so it is just 1 and 5. So Z6 direct product Z6 star
consists of 12 elements, namely (0,1), (1,1), (2,1), (3,1), (4,1), (5,1) and (0,5), (1,5),
(2,5), (3,5), (4,5) and (5,5). So these are the 12 elements. Suppose you want to multiply 2
elements, let us say (1,5) with (4,5). In the first coordinate, 1 is combined with 4.

So 1 and 4 are combined in Z6, and 1 plus 4 gives you 5. If instead of (1,5) you had taken
(3,5) and (4,5), then 3 and 4 combine to give 7; the operation is addition, but we have to do
it mod 6, and therefore we get 1. And 5 and 5 combine to give 5 times 5, which is 25, but mod 6
that is 1. So the product is (1, 1). Note that this is not the identity; the identity of this
particular group is in fact (0, 1). So this is how the direct product of groups is defined.
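
The coordinate-wise operation in this example can be written out directly. This is a small sketch, assuming the first coordinate lives in Z6 under addition mod 6 and the second in Z6 star under multiplication mod 6; the name combine is illustrative.

```python
from math import gcd

Z6 = list(range(6))                                   # addition mod 6
Z6_star = [a for a in range(1, 6) if gcd(a, 6) == 1]  # multiplication mod 6, gives [1, 5]

pairs = [(a, b) for b in Z6_star for a in Z6]

def combine(p, q):
    """Coordinate-wise operation: add mod 6 in the first slot, multiply mod 6 in the second."""
    return ((p[0] + q[0]) % 6, (p[1] * q[1]) % 6)

identity = (0, 1)
print(len(pairs))                                      # 12
print(combine((1, 5), (4, 5)))                         # (5, 1)
print(combine((3, 5), (4, 5)))                         # (1, 1)
print(all(combine(identity, p) == p for p in pairs))   # True
```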

(Refer Slide Time: 18:38)

Let us look at some simple direct products. Take C2, by which we mean the cyclic group of
order 2, with 2 elements. We can write this as {1, x}. And take C3, which is a cyclic group of
order 3, whose elements we can refer to as {1, y, y square}. Now, if you take the direct
product of these two, what you get is a set consisting of (1, 1), (1, y), (1, y square), (x,
1), (x, y), (x, y square); these are the 6 elements. Let us look at the element (x, y) and its
powers. (x, y) multiplied with (x, y) gives (x square, y square), and then we get (x cube, y
cube), (x raised to 4, y raised to 4), (x raised to 5, y raised to 5), (x raised to 6, y raised
to 6) and so on. Now x raised to 6 is the identity because x square is the identity, and y
raised to 6 is the identity because y cube is the identity.

So we do not have to continue any further. But are we sure that these 6 elements are all
distinct? The first power is (x, y) itself. x square is the identity, so the second power is
(1, y square). The third power is just (x, 1), since x cube is x and y cube is 1. The fourth
power is (1, y), because y cube is 1, and the fifth power is (x, y square). The sixth power is
the identity, which instead of (e, e) we write as (1, 1). So these are 6 distinct elements of a
group with 6 elements. Therefore, if you look at the set generated by (x, y), that is, all the
powers of (x, y), you get the complete collection, and so the direct product C2 times C3 is
isomorphic to the cyclic group with 6 elements.

So, we can wonder whether this is a general rule. That is, if you take the cyclic groups of
order i and order j and take their direct product, do you get the cyclic group of order i times
j? That is not always the case, as we will see by an example, and you can try to answer the
question: when will C i times C j be isomorphic to C i times j? What are the necessary and
sufficient conditions for this to happen? First we will see why this is not always the case.
Take C2 and C4 and ask: is C2 times C4 isomorphic to C8? We can immediately see that this is
not the case, because the cyclic group of order 8 has an element of order 8, that is, one
element which generates the entire group. Now take any element in C2 times C4 and call it
(alpha, beta).

If you raise it to the fourth power, (alpha, beta) power 4 is (alpha raised to 4, beta raised
to 4). Now alpha is from C2, so alpha raised to 4 is the identity of C2. And beta is from C4, a
group of size 4, so beta raised to 4 is the identity of C4: you can check that if you take any
element in C4 and perform the group operation with itself 4 times, you get the identity.
Therefore there is no element of order 8 in C2 times C4, and so the two groups are different.
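
One way to experiment with the question of when C i times C j is cyclic is to compute the largest order of any element. The sketch below assumes the additive model, writing C i times C j as pairs in Z i times Z j with coordinate-wise addition; order, max_order and is_cyclic are hypothetical helper names.

```python
def order(elem, i, j):
    """Order of elem = (a, b) in C_i x C_j, written additively as Z_i x Z_j."""
    a, b = elem
    k, cur = 1, (a % i, b % j)
    while cur != (0, 0):
        cur = ((cur[0] + a) % i, (cur[1] + b) % j)
        k += 1
    return k

def max_order(i, j):
    """Largest order of any element of C_i x C_j."""
    return max(order((a, b), i, j) for a in range(i) for b in range(j))

def is_cyclic(i, j):
    """The product is cyclic exactly when some element has order i * j."""
    return max_order(i, j) == i * j

print(max_order(2, 3))   # 6, so C2 x C3 is cyclic
print(max_order(2, 4))   # 4, so C2 x C4 has no element of order 8
cyclic_pairs = [(i, j) for i in range(2, 7) for j in range(i, 7) if is_cyclic(i, j)]
print(cyclic_pairs)
```

The printed list of pairs can be compared against one's own guess for the necessary and sufficient condition.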

(Refer Slide Time: 23:33)

The next concept that we will see is that of subgroups. But first, one thing that we did not
explicitly mention: if we define the direct product of groups in this particular manner, why is
it always a group? So this is the question you can ask: is A direct product B always a group?
We have to check 4 conditions. First, is the group operation well defined? Clearly it is: take
2 elements, say (alpha 1, beta 1) and (alpha 2, beta 2). alpha 1 and alpha 2 can be combined in
A, and beta 1 and beta 2 can be combined in B, and the resulting pair is clearly an element of
the set that we have described. Now that the operation is well defined, we can check whether it
is associative. This is also yes, because the operations of the underlying groups A and B are
associative, and you can verify that this translates into associativity of A direct product B.

The third is the identity property. Let us say e a is the identity of A and e b is the identity
of B. Then (e a, e b) acts as the identity of A direct product B. This is easy to check: (e a,
e b) multiplied with any element (a, b) is (e a combined with a, e b combined with b). By the
property of e a being the identity of A, the first coordinate is a, and similarly e b times b
gives b. The inverse can be verified similarly: if you have an element (a, b), its inverse is
(a inverse, b inverse). So the direct product will always be a group, and a lot of properties
of the groups translate into properties of the direct product. For example, if you take abelian
groups A and B, their product is also abelian; that is, commutativity is preserved. But this is
not true for all properties. If A and B are cyclic, that does not necessarily mean that A
direct product B is cyclic.

(Refer Slide Time: 26:40)

The next concept that we will learn is that of subgroups. The notion is very simple. Suppose
you have a group G. When we say G is a group, we mean that G is the name of the set and there
is an operation defined on it which makes it a group. Let H be a subset of G. We do not take
the elements of H and look at a completely different operation; we look at the same operation
as on G, but restricted to the elements of H.

If H is a group under that operation, then H is a subgroup of G; this is the definition of a
subgroup. For example, consider 0, 1, 2, 3, 4, 5. This is C6, the cyclic group of order 6, and
we can also think of it as Z6, the additive group mod 6. Now look at the elements 1 and 5 under
mod 6 multiplication. {1, 5} is a subset of the group that you have considered, but the
operation that you are considering has changed, so this will not be called a subgroup of Z6.
Whereas if you take the elements 0, 2 and 4, this is a subgroup of Z6. You are considering just
the even elements of Z6, and under the same operation, that is mod 6 addition, these elements
form a subgroup, with 0 acting as the identity; 2's inverse is 4 and 4's inverse is 2.

(Refer Slide Time: 28:51)

So now let us consider an arbitrary group G, and let us define Z(G). This is defined as the set
of all elements z belonging to G such that z g is equal to g z for all g belonging to G. So we
are given a group G, and we are picking those elements which commute with every other element.
This collection has a name: it is called the centre of G. Our first question would be, is this
a non-empty collection? Clearly the identity is one element which has this property: identity
times g is equal to g times identity. So this is a non-empty collection. Now suppose you have 2
elements z1 and z2 which belong to Z(G). What about z1 times z2? Let us look at z1 z2 times any
element g. By associativity you can write it as z1 times (z2 g), and because z2 is an element
of the centre, z2 g is equal to g z2, so this is equal to z1 times (g z2).

Now again we can apply associativity and write it as (z1 g) times z2, and since z1 is an
element of the centre, this is equal to (g z1) times z2. So z1 z2 g is equal to g z1 z2.
Whatever we did here was true for all g, and therefore if you have 2 elements of Z(G), their
product also belongs to Z(G). So Z(G) is a non-empty collection, and if you take any 2 elements
in Z(G) and combine them, the resulting element is still an element of Z(G). We can write this
as: Z(G) is closed under multiplication. Let us ask another question. What about z inverse?

So if z belongs to Z(G), does z inverse also belong to Z(G)? This would mean that z inverse g
is equal to g z inverse for every g; this is what we need to prove. It is almost the same
statement as the corresponding statement for z, namely z g is equal to g z. So consider this
equation and multiply both sides with z inverse on both the left and the right. What you get is
z inverse z g z inverse is equal to z inverse g z z inverse. Now z inverse z on the left-hand
side combines to give the identity, and z z inverse on the right-hand side also combines to
give the identity. Therefore you get g z inverse is equal to z inverse g, and that is what we
wanted at the start. So Z(G) has this additional property, that it is closed under taking
inverses. So you have a collection with the property that if you take 2 elements and combine
them by the operation of the group, the result is still inside the collection, and if you take
any element, its inverse is also present. Now, any non-empty collection which has these 2
properties will in fact be a group.

(Refer Slide Time: 33:35)

So take a group G, and consider any non-empty subset of G which is closed under inverses and
closed under multiplication. Multiplication here means the group operation. Let us call the
subset H. Then H will be a subgroup of G. That is a theorem, and we will prove it. But if you
assume this theorem, we now automatically know that Z(G) is in fact a group.
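
For a small group the centre can be computed straight from its definition. The following sketch assumes two example groups that are not fixed by the lecture at this point: the 6 symmetries of the triangle, encoded as permutations, and Z6 under addition. The function name centre is illustrative.

```python
from itertools import permutations

def centre(elements, op):
    """All z with op(z, g) == op(g, z) for every g in the group."""
    return [z for z in elements if all(op(z, g) == op(g, z) for g in elements)]

# Symmetries of the triangle as permutations of positions; compose applies q first, then p.
S3 = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[k]] for k in range(3))

# Z6 under addition mod 6, an abelian group.
Z6 = list(range(6))
add6 = lambda a, b: (a + b) % 6

print(centre(S3, compose))   # [(0, 1, 2)]
print(centre(Z6, add6))      # [0, 1, 2, 3, 4, 5]
```

In the first case only the identity commutes with everything, while in an abelian group the centre is the whole group; both outputs are subgroups, as the theorem predicts.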

This holds for any G, finite or infinite, it does not matter: take any arbitrary group, and if
you can find a non-empty sub-collection of it which is closed under these 2 operations, then
that will be a subgroup. In particular you could take the entire group; trivially these 2
properties are true, and the entire group is a subgroup of itself. There are 2 trivial
subgroups of a group, namely the full group and the group consisting of only the identity
element. Clearly the statement is true for both these cases; it is true for all the other cases
as well, as we will see in a while. How do we show that H is a subgroup? First of all, the
operation on H is well defined. Let us say star is the operation. Star is well defined on H
itself, because H is closed under the multiplication operation.

Also, as star was associative on G, it clearly remains associative on H. Will the identity be
present inside H? Here we mean that H is a non-empty subset, so there is at least one element a
in H. Since H is closed under inverses and a belongs to H, a inverse also belongs to H. And
since H is closed under multiplication, a star a inverse also belongs to H; therefore the
identity belongs to H. So we have a collection which contains the identity, on which the
operation is well defined and associative. The only additional property that we need to verify
is that every element has an inverse, but that is something we have already taken care of in
one of our assumptions. Therefore H is a subgroup of G.

The inverse condition is a special condition that is required only when the group is infinite.
If G is a finite group, then the only condition that you have to check is that the non-empty
subset is closed under multiplication; the second condition suffices. Why is that so? If H
contains only the identity element, then clearly it is a subgroup. Otherwise, take any
non-identity element of H and call it a. If we look at a, a square and so on, at some point a
power a to the k has to appear which is equal to the identity, because otherwise this would be
an infinite collection. We can argue this simply by considering a, a square and so on.

At some point there should be a repetition. Let us say that the first repetition is at i and j
with i less than j, so a raised to i is equal to a raised to j. Look at the inverse of a raised
to i in the original group, and call it a raised to minus i. If we multiply both sides by a
raised to minus i, we get that a raised to j minus i is equal to the identity. So clearly, if
you consider one element and its powers, at some point the identity is surely going to appear.
So we can assume that for some k, a raised to k is the identity, and therefore the inverse of a
is a raised to k minus 1, because a raised to k minus 1 multiplied by a is a raised to k, which
is equal to the identity. Since a raised to k minus 1 is a power of a, it lies in H because H
is closed under multiplication. Since this is true for any element a, every element of H has an
inverse in H when G is a finite group. We will stop here and continue our study of group theory
in the next lecture.
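
The last argument, that in a finite group the inverse of a is the power a raised to k minus 1, can be tried out numerically. This is only a sketch, assuming the group is the numbers relatively prime to m under multiplication mod m; inverse_by_powers is a hypothetical helper name.

```python
from math import gcd

def inverse_by_powers(a, m):
    """Inverse of a modulo m, found as a^(k-1) where k is the first power with a^k = 1."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("need m >= 2 and a relatively prime to m")
    prev, cur = 1, a % m            # prev = a^(k-1), cur = a^k, starting at k = 1
    while cur != 1:
        prev, cur = cur, (cur * a) % m
    return prev

print(inverse_by_powers(3, 10))   # 7, since 3 * 7 = 21 = 1 mod 10
print(inverse_by_powers(5, 6))    # 5, since 5 * 5 = 25 = 1 mod 6
```

The loop is guaranteed to stop only because the group is finite, which is the same reason the closure condition alone suffices in the finite case.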

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 41
Cosets, Lagrange’s theorem

(Refer Slide Time: 0:32)

In this lecture, we will learn about cosets and quotient groups. So let us consider a group G,
and suppose it has a subgroup H. Let us define what are known as the cosets of H in G. Take an
element a of G and consider the set aH, which is defined as all those g belonging to G such
that g is equal to a times h for some h belonging to capital H. So we take our set H and
multiply all its elements by a; once we do that, we get another set, which we call aH. This is
called a coset. In particular, it is a left coset, because we are multiplying on the left. If
you take arbitrary elements of the group and construct all the left cosets, you get different
cosets; let us call them a1H, a2H and so on. If you take the union of this collection of
cosets, you get the complete collection. In other words, the union of aH over all a belonging
to G is equal to the set G.

But what is more interesting is that if you take 2 cosets, they are either one and the same or
they are disjoint. So this is a claim: the cosets partition G. What this means is, suppose aH
and bH are cosets; then either aH is equal to bH, or aH intersection bH is empty. How do we
prove this? Given this claim, it is easy to see that the union of all the cosets is the entire
set. H is a subgroup, so in particular the identity e belongs to H. If you take g times e, for
any arbitrary g belonging to the group, you get the element g, so the coset gH contains g. So
the union of gH over all g in G certainly exhausts the complete collection of group elements.
Why is it true that two cosets are either disjoint or the same? Let us assume that the cosets
have a non-empty intersection. If they did not have any intersection, then we are fine, because
that is just the condition that aH intersection bH is empty.

(Refer Slide Time: 4:08)

So suppose alpha belongs to aH intersection bH. That would mean that there exist elements h1
and h2 of H such that alpha is equal to a h1 and alpha is equal to b h2. Therefore a h1 is
equal to b h2, and so a is equal to b h2 h1 inverse. Since h2 and h1 are members of H, and H is
a subgroup, h2 times h1 inverse is some element h3 belonging to H. So a is equal to b h3.

Let k equal to a h be an arbitrary element of aH. We want to show that k belongs to bH; exactly
the same reasoning will show that any element of bH also belongs to aH. So let us first do the
proof that any element of aH belongs to the set bH. a can be written as b times h3, so k, which
is a h, can be written as b times h3 times h, and that is equal to b times h4, where h4 is h3
h, an element of H. This means that k must belong to bH. So we took an arbitrary element of aH
and showed that it belongs to bH, provided there is an intersection between aH and bH.
Similarly you can show that any arbitrary element of bH must belong to aH if there is a common
element. That concludes the proof that the cosets are either disjoint or one and the same. This
will help us prove the following theorem, known as Lagrange's theorem. Let G be a finite group,
and let H be a subgroup of G. Clearly H will also be a finite group. Then the order of H
divides the order of G.

Why is this true? All that one has to do is consider the cosets of H in G. The order of H is
the number of elements in H, and the order of G is the number of elements in G. If we take aH
for all the elements a of G, some of the cosets may coincide, but here we are just enumerating
the distinct cosets.

Let us say there are k of them, which we call a1H, a2H, up to akH. These cosets partition G, so
if you add up their sizes, you get the size of G: size of a1H plus size of a2H plus, and so on,
plus size of akH is equal to size of G. But note that every coset is of identical size: size of
aiH is equal to size of ajH, and each is equal to the size of H. Why is this so? If we assume
this fact, we can immediately conclude that the total sum on the left-hand side is equal to k
times the size of H, and that is equal to the right-hand side, which is the size of G. So this
would be the proof. All that we need to see now is why every coset has the same size as H.

(Refer Slide Time: 8:55)

Now the set H is a finite set; let us say the elements of this subgroup of G are h1 up to hm.
If you look at a h1, a h2, up to a hm, these are m elements and they are all distinct, because
a hi cannot be equal to a hj for i not equal to j. This is because, if we assume a hi equals a
hj, you can multiply both sides of the equation on the left with a inverse, and you get hi
equals hj, which is not possible. Therefore there are exactly m elements in aH, where m is the
number of elements in the subgroup.
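
The counting argument can be watched on a concrete case. The sketch below assumes, purely as an example, that G is the group of 6 symmetries of the triangle encoded as permutations and that H is the subgroup consisting of the identity and one flip; left_coset is an illustrative name.

```python
from itertools import permutations

G = list(permutations(range(3)))                 # the 6 symmetries of the triangle
compose = lambda p, q: tuple(p[q[k]] for k in range(3))   # apply q first, then p
H = [(0, 1, 2), (0, 2, 1)]                       # identity and one flip: a subgroup of order 2

def left_coset(a):
    """The set aH = {a h : h in H}."""
    return frozenset(compose(a, h) for h in H)

cosets = {left_coset(a) for a in G}              # equal cosets collapse to one entry

print(len(cosets))                               # 3 distinct cosets
print({len(c) for c in cosets})                  # {2}: every coset has the size of H
print(len(cosets) * len(H) == len(G))            # True: k times |H| equals |G|
```

Six elements give only three distinct cosets, each of size 2, and together they cover G exactly once.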

(Refer Slide Time: 10:06)

So that concludes the proof. The fact that G is a finite group was used when we said that the
partition has a finite number of parts and that each coset is of finite size. So let us
consider a group, with the cosets corresponding to H being aH, bH and so on. Suppose G is the
set of integers under addition and H is the set of multiples of n. For example, if we take n as
10, then H is all the elements of the form 0, plus or minus 10, plus or minus 20 and so on. Now
the cosets are basically the equivalence classes of the relation where a is related to b if a
minus b is equal to 0 mod n; here n is 10. We had seen earlier that we could do arithmetic on
these equivalence classes, that is, we could add and multiply the classes themselves. So the
cosets could themselves be added and multiplied. When can we really do this? Can we impose a
group structure on the cosets? That is the question that we will try to answer. In some cases,
when the underlying subgroup has some nice properties, we can have a group structure on the
cosets, and that is known as a quotient group.

In some sense, you are using the subgroup H to divide the group G into different parts, and
then we define a group structure on the parts obtained. In order to look at these things more
closely, we will look at the concept of homomorphism. We had seen the notion of isomorphism
earlier. Let us say G is a group and H is another group. A function f from G to H is called a
homomorphism if f(g1 g2) is equal to f(g1) times f(g2). Here g1 g2 is computed in G, and f(g1)
times f(g2) is computed in H.

If this is satisfied for all g1, g2 belonging to G, then we will say that f is a homomorphism from G
to H. So suppose G is this, and H is some other group, and there is a map f which satisfies this
equation, which we call equation 1; then we say that this is a homomorphism. Now while we are studying
homomorphisms, we can restrict our attention to the image of f, that is, the
subset of H to which f maps G. We will restrict our attention to just those elements; the other
elements of H do not really matter. So if you call this the image under f, we will study the effect of the
homomorphism by restricting our attention to just the image. In other words, we can say that we are
looking at homomorphisms which are onto
functions, that is, we will just study those homomorphisms where G maps onto the
full set H.

Now, there is an easy claim that you can verify. The image of a homomorphism will always be a
subgroup. So even if we had taken a larger group H, if we just look at the
image, that will be a subgroup of H. Okay, so that is the reason why we can restrict our
attention to homomorphisms which are onto.

(Refer Slide Time: 15:15)

So let us see some properties of homomorphisms. Let us say f is a homomorphism from G to H.
Then f of identity will be equal to identity: it maps the identity in G to the identity in H. And the
second thing is about inverses. Let us say g is any element of G. If you look at its inverse and take
the image, f of g inverse, that will be equal to the inverse of f of g. Both of these statements are easy
to verify; we will just verify the second part. So let us look at f of g times g inverse. By virtue of f
being a homomorphism, this must be equal to f of g times f of g inverse. On the other hand, g
times g inverse is the identity, so this is f of identity, and f of identity is equal to
identity. Here e is the identity in G and e prime is the identity in H, so f of e is e prime. So what we can
conclude is that f of g times f of g inverse is equal to the identity in H. Now f of g is an element of H
and f of g inverse is another element of H, and when we multiply them we get the identity; that means
those elements are inverses of each other. So we can write f of g inverse is equal to f of g, the whole
inverse, and that basically is fact 2.
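
The two facts, and the claim that the image is a subgroup, can be checked by brute force on a small case. This is only a sketch under an assumed example that does not appear in the lecture: G and H are both the integers mod 12 under addition, and f sends x to 3x mod 12.

```python
# Hypothetical example: G = H = Z_12 under addition mod 12, f(x) = 3x mod 12.
n = 12
G = range(n)

def f(x):
    return (3 * x) % n

# Homomorphism property: f(a + b) = f(a) + f(b), both sums taken mod 12.
is_hom = all(f((a + b) % n) == (f(a) + f(b)) % n for a in G for b in G)

# Fact 1: the identity (0 here) maps to the identity.
maps_identity = f(0) == 0

# Fact 2: the image of an inverse is the inverse of the image.
respects_inverses = all(f((-a) % n) == (-f(a)) % n for a in G)

# The image is closed under "u times v inverse" (here u - v), so it is a subgroup.
image = sorted({f(a) for a in G})
image_is_subgroup = all((u - v) % n in image for u in image for v in image)

print(is_hom, maps_identity, respects_inverses, image, image_is_subgroup)
```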

(Refer Slide Time: 17:02)

The next concept that we will see is the kernel of a homomorphism. A homomorphism
is a map from one group to another which preserves the group structure. And we argued that
when we have a homomorphism, the identity will map to the identity. Now there might be multiple
elements of G mapping to the identity. The collection of all the elements
mapping to the identity is known as the kernel.

Okay, so the kernel is the set of elements in G whose image is the identity. So the kernel of a
homomorphism is the collection of all those elements in G which map to the identity of H. There is a
simple fact: the kernel is a subgroup of G. If we denote the entire group by G, this is the kernel K.
Okay, so the kernel will invariably be a subgroup of G. How do we verify that? In order to verify
that a non-empty subset K of a group is a subgroup, what we need to do is verify that for every a, b
belonging to K, a b inverse belongs to K. The kernel is non-empty because the identity maps to the
identity. So if we verify this part, then we can conclude that the
subset K is going to be a subgroup.

Now it is easy to check: if we can show that f of a b inverse is the identity, then that
means for every a, b belonging to K, a b inverse also belongs to K. f of a b inverse is nothing but
f of a times f of b inverse, which is f of a times f of b, the whole inverse. Now f of a is identity
because a belongs to the kernel, and f of b also is identity because b belongs to the kernel, and the
inverse of identity is identity, so this entire thing is e prime. So the kernel is always a subgroup of G.
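
As a small illustration of the subgroup test used above, here is a sketch under an assumed example (not taken from the lecture): the map f(x) = 3x mod 12 on the integers mod 12 under addition. It computes the kernel and checks that "a b inverse", which is a minus b in additive notation, stays inside the kernel.

```python
# Hypothetical example: f(x) = 3x mod 12 on Z_12 under addition.
n = 12

def f(x):
    return (3 * x) % n

# The kernel is everything that maps to the identity, which is 0 here.
kernel = [a for a in range(n) if f(a) == 0]

# Subgroup test: for all a, b in the kernel, a - b is again in the kernel.
passes_subgroup_test = all((a - b) % n in kernel for a in kernel for b in kernel)

print(kernel, passes_subgroup_test)
```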

(Refer Slide Time: 20:09)

Now we are in a position to state our main result, which is a connection between kernels
and the quotient structure. We were asking this question: when can we define a group structure on
the cosets? If we look at kernels, the kernel we know is a subgroup, and we can construct cosets with
respect to a kernel K. So let us say this is a kernel K, and then you consider its cosets a1K, a2K and
so on. We define the operation on these
cosets as follows. Take aiK times ajK, where aiK is the coset containing the element ai and ajK is the
coset containing the element aj. We define this product as the coset ai aj K, that is, the coset
containing the element ai aj.

Now we need to verify that this is a valid definition. Let us say ai prime is some other element of the
coset aiK and aj prime is some other element of the coset ajK. If we multiply them out, the rule
says that the product should be ai prime aj prime K. But this should also be equal to ai aj K, so
this is what we need to verify. This is a set and this is another set. When
are these sets identical? When each one is contained inside the other: when A is contained inside
B and B is contained inside A, then the sets A and B are the same. We will just verify one
direction; for the other direction the same reasoning would work. So an arbitrary
element of the LHS is ai prime aj prime k, for some k in K.

Now note that ai prime is equal to ai into k1 and aj prime is aj into k2, for some k1 and k2 in K, and
therefore ai prime aj prime k can be written as ai k1 aj k2 k. Suppose we could somehow show that
k1 aj is equal to aj k1 prime, for some k1 prime in K. If this were true, then what we can do is
rearrange the elements and get ai aj times k1 prime k2 k, and K being a subgroup, these things
multiply to some k3 in K. So we will get ai aj k3, and that is going to be an element of the set
ai aj K. So what we really need to show is that k1 aj is equal to aj k1 prime. This may not be
true for arbitrary subgroups, but the subgroups for which this is true have a special name: those are
called normal subgroups. A subgroup K is normal if aK is equal to Ka for all a. If aK is equal to Ka
for all a, then k1 aj, being an element of K aj, is equal to aj k1 prime for some k1 prime in K. So
assuming that the kernel is a normal subgroup we are done with our proof, because if this is a normal
subgroup then in k1 aj you can flip the order, pushing the aj to the left side and the k1 to the right
side, where it becomes k1 prime.

So, all that remains is to show that the kernel is a normal subgroup. So how do we show this? In
order to show that some subgroup is a normal subgroup, what we need to show is that aK is equal
to Ka. So consider an arbitrary element ak of aK; we need to show that ak belongs to Ka.
Consider the element a k a inverse. We claim that this belongs to K. This is
because, if you look at the homomorphism f and consider f of a k a inverse, this is equal to f
(a) times f (k) times f of a inverse. Now f (k) is identity because k belongs to the kernel, so this is
equal to f(a) times f of a inverse, and this is equal to identity because f (a) and f of a inverse are
inverses of each other. So a k a inverse belongs to K. Therefore, if we take a k a inverse as k prime,
k prime into a is equal to a k a inverse times a, which is equal to ak.

So we have written ak as a product of an element belonging to the kernel and a, where the
multiplication with a is on the right side. So aK is contained in Ka, and the same argument applied
to a inverse k a shows that Ka is contained in aK. So the kernel of a homomorphism is a normal
subgroup. So to put things together, what we verified is that if you take any homomorphism and
consider its kernel, it is going to be a subgroup, and not just that, it is going to be a
normal subgroup.
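
Normality only becomes interesting in a non-commutative group, so here is a minimal sketch on an assumed example that the lecture does not use: the group of permutations of three symbols, with the sign map (plus 1 for even permutations, minus 1 for odd ones) as the homomorphism. The names S3, compose and sign are introduced only for this illustration.

```python
from itertools import permutations

# All permutations of {0, 1, 2}; p[i] is the image of i under p.
S3 = list(permutations(range(3)))

def compose(p, q):
    """Return the permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(3))

def sign(p):
    """+1 for an even number of inversions, -1 for an odd number."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

# sign is a homomorphism from S3 to the multiplicative group {+1, -1}.
is_hom = all(sign(compose(p, q)) == sign(p) * sign(q) for p in S3 for q in S3)

# Its kernel is everything that maps to the identity +1.
K = [p for p in S3 if sign(p) == 1]

# Normality: the left coset aK equals the right coset Ka for every a.
kernel_is_normal = all(
    {compose(a, k) for k in K} == {compose(k, a) for k in K} for a in S3
)

print(is_hom, len(K), kernel_is_normal)
```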

(Refer Slide Time: 27:12)

And once you have a normal subgroup, the multiplication of cosets is well defined, and therefore
the homomorphism induces a quotient structure, that is, a group structure on
the cosets. What we will show next is that this is the only possibility, in the sense that if
this definition of multiplication of cosets is
to be valid, then the subgroup must indeed be a normal subgroup. So that is the second part: suppose the
coset multiplication is well defined, then the subgroup is a normal subgroup. Coset
multiplication we defined as follows: if you have aH and bH, their product is defined as ab times H. If
this definition is to make sense, that is, if it is to be well defined, then H must indeed be a
normal subgroup. The way we will prove this theorem is by showing that H is the
kernel of some homomorphism.

So we will construct a homomorphism for which H is the kernel. It is easy to construct. Here
we have a group G, and H is a subgroup, and the subgroup gives rise to various cosets. Let
me call the cosets a1H, a2H and so on. Now, let me just consider the collection of these cosets. So let
C be the collection of cosets. So C is a set, and each element of C is a coset. What
we are assuming is that the coset multiplication is well defined, and you can verify that the set C is
a group under coset multiplication. Since the operation is well defined, we can check whether
that operation is associative, whether inverses are present, whether the identity is present
and so on. You can verify all these things and conclude that this collection C under
coset multiplication is a group. So now, let us consider our main group G, and consider a function f
which maps elements of G to this collection C.

So an element x is mapped to the coset xH. Clearly every element of C is the image of some
element in G, and we can verify that this function f is a homomorphism
from G to C. Why is that so? Consider f of xy. By definition f maps
any element to the coset containing that element, so f of xy is xyH. And since our
multiplication was well defined, we know that this is nothing but xH times yH, and xH is f(x) and
yH is f(y). So we have verified that the function f is a homomorphism from G to C, and the kernel of
that homomorphism is precisely H. The kernel of f is all those elements which map to the identity
element of C. The identity element of C is the coset H itself. You can verify that every
element of H maps to H, and that if xH is equal to H then x, which is x times the identity, belongs to
H. Therefore the kernel of f is H, and by our earlier
theorem H has to be a normal subgroup. So this concludes our study of groups and subgroups
and the quotient structure that is induced by a normal subgroup.
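
The two directions proved above can be seen side by side on a small case. The sketch below assumes the group of permutations of three symbols (an example chosen here, not in the lecture) and compares a normal subgroup A3 with a non-normal subgroup H2; the helper names are hypothetical. It tests whether the rule aH times bH equals abH gives the same coset for every choice of representatives.

```python
from itertools import permutations

S3 = list(permutations(range(3)))

def compose(p, q):
    """Return the permutation 'first q, then p'."""
    return tuple(p[q[i]] for i in range(3))

def left_coset(a, H):
    return frozenset(compose(a, h) for h in H)

def is_normal(H):
    return all(
        {compose(a, h) for h in H} == {compose(h, a) for h in H} for a in S3
    )

def product_well_defined(H):
    """Check that the coset of a*b does not depend on the representatives a, b."""
    return all(
        left_coset(compose(a, b), H) == left_coset(compose(a2, b2), H)
        for a in S3 for b in S3
        for a2 in left_coset(a, H) for b2 in left_coset(b, H)
    )

A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # identity and the two 3-cycles
H2 = [(0, 1, 2), (1, 0, 2)]              # identity and one transposition

print(is_normal(A3), product_well_defined(A3))
print(is_normal(H2), product_well_defined(H2))
```

In this sketch the two checks agree for both subgroups, which is what the theorem predicts.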

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 42
Rings and Fields

In this lecture we will learn about 2 new algebraic structures, namely rings and fields. We had
learnt about the groups. One way in which rings and fields are different from groups is, while we
were talking about the groups there was only one operation, but when we talk about rings and
fields there are two operations and we will call these operations as addition and multiplication
operations. So let us formally understand what a ring is.

(Refer Slide Time: 1:01)

So there is a set. A ring is a set R equipped with 2 operations; let us call them the addition and
the multiplication operations. And these operations have to satisfy some properties. The first
property is, if you just consider the set with the addition operation, this should be an abelian
group. The second property is, if you look at the operation of multiplication, that should be
associative and there should be a multiplicative identity. So that would mean if you have elements a,
b and c, a times b, the whole into c, should be equal to a into, b times c, and there should exist an
element e which can act as the
multiplicative identity. So a times e should be equal to e times a should be equal to a. There
should exist an element of this kind.

The third property is how these operations interact with each other. So that is the
distributive laws, namely, if you multiply a with b plus c, the product should be equal to a times b
plus a times c, for every a, b and c. And no matter whether you
multiply on the left or on the right, the distributive law holds: if you have a plus b, the whole into
c, this will be equal to a into c plus b into c. So if you have a set equipped with 2 operations which
behave in this manner, then that is called a ring. Let us look at some key differences, or some
non-requirements. The multiplication need not be an invertible operation. When we are looking at
addition, for every element of the ring there is an element which can act as the additive
inverse. The multiplicative inverses need not be present. Not just that, even cancellation laws
need not hold. We will see examples of all those. So that is the way in which the general
ring is different from, let us say, the ring of real numbers.
The ring of real numbers has many additional properties that we do not insist on for a general
ring. So let us see some examples. If you just take the set of real numbers with the usual addition and
the usual multiplication, then clearly under addition the real numbers form an abelian group, and the
multiplication is associative, it distributes over addition, and 1 can serve as the
identity, so this is a ring. Now let us look at the set of integers under the usual addition and
multiplication; this is also a ring. But here you can see that under multiplication
inverses need not exist. For example, if you take the element 9, there is no multiplicative
inverse: there is no integer which you can multiply with 9 to get 1. But this is a commutative
ring, in the sense that the multiplication operation is commutative.

The same applies to R: both R and Z are commutative rings. But in Z, if you look at the non-zero
elements, not all of them are invertible; the only invertible elements in this ring are plus 1 and minus 1.
Let us look at another ring: the set of numbers modulo 10. The operations are
mod 10 operations, so the addition is mod 10 addition and the multiplication is mod 10
multiplication. You can verify that this is
a ring, because under addition it is an abelian group, there is a multiplicative identity,
the multiplication is associative and the distributive laws hold. But now you can see that even
cancellation laws do not really work. For example, if you have, let us say, 4 into 5, that will
be 20, which is equal to 0 mod 10, and that is equal to, let us say, 6 into 5, which is 30.
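
The failure of cancellation mod 10 is easy to reproduce. The following short sketch lists all pairs of non-zero elements of Z10 whose product is 0, and also checks the left distributive law by brute force; the variable names are chosen only for this illustration.

```python
n = 10

# 4 * 5 and 6 * 5 are both 0 mod 10, although 4 is not equal to 6.
print((4 * 5) % n, (6 * 5) % n)

# All pairs of non-zero elements (a <= b) that multiply to 0 mod 10.
zero_divisor_pairs = [
    (a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0
]
print(zero_divisor_pairs)

# Left distributive law a(b + c) = ab + ac, checked for every triple in Z_10.
distributive = all(
    (a * ((b + c) % n)) % n == ((a * b) % n + (a * c) % n) % n
    for a in range(n) for b in range(n) for c in range(n)
)
print(distributive)
```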

Both are 0, but we just cannot cancel off 5 and say that 4 equals 6. But this is again a
commutative ring. Now take the set of matrices, let's say 2
cross 2 matrices where the entries are from the ring of integers, the addition is the usual
matrix addition and the multiplication is matrix multiplication. Clearly, this multiplication is not a
commutative operation. For example, take the matrix A with rows 2 1 and 0 1, and B with rows 1 0 and 2 2.
If you compute AB, the first element, or the 1-1 element, will be 2 into 1 plus 1 into 2, which is 4,
and the other elements would be something, whereas if you compute BA, its 1-1 element is going to be 1
into 2 plus 0 into 0, so this is going to start with 2. So clearly AB is not equal to BA. So matrix
multiplication is not commutative, and therefore these 2 cross 2 matrices with integer
entries do form a ring, but it does not have many of the properties
that we would have in other rings. It is not commutative, and cancellation may or
may not be possible. So this is an example of a non-commutative ring.
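
The two products can be computed in full. This is a minimal sketch using plain nested lists, reading the matrices row by row as A with rows 2 1 and 0 1 and B with rows 1 0 and 2 2; the helper matmul2 is a name introduced here.

```python
def matmul2(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

A = [[2, 1], [0, 1]]
B = [[1, 0], [2, 2]]

AB = matmul2(A, B)
BA = matmul2(B, A)
print(AB)  # the 1-1 entry is 2*1 + 1*2 = 4
print(BA)  # the 1-1 entry is 1*2 + 0*0 = 2
```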

(Refer Slide Time: 9:20)

Now let us further explore rings. We will first define what are called the units of the
ring. Since we have a multiplication operation and we have a multiplicative identity, we could
consider all the elements that are invertible. So units are nothing but the invertible elements of the
ring. The first question that we should ask is, if an element is invertible, is the
inverse unique? So what do we mean by invertible elements?

So x is invertible if there exists y such that xy is equal to yx is equal to 1, where 1 is the identity.
We know that in a ring there is an identity. Now, if there is one such y, will it be unique? It
will be unique. Suppose there is a y prime with similar
properties for the invertible element x, so x y prime equals y prime x equals 1. Consider xy equals 1
and multiply both sides of this equation on the left with y prime. The right side is y prime into 1,
which is equal to y prime. On the left side we have y prime into xy; we can use associativity here and
say that this is y prime x, the whole into y, and y prime times x is going to be the identity
because y prime was an inverse. So the left side will be 1 into y, and that is going to be equal to y,
because 1 is the identity, and the identity multiplied by any element gives that element. So this would
imply that y is equal to y prime. So if an element is invertible, its inverse is unique. Now let us
collect all the invertible elements together; they are going to be called units, and you can verify
that the invertible elements of R form a group under multiplication.

We have already seen this when we were looking at certain groups. If we look at Z10,
that is a ring, and its invertible elements are 1, 3, 7 and 9. These are the numbers which
are relatively prime to 10, and those alone are the invertible elements.
You can check that the inverse of 1 is going to be itself, 3 inverse is 7, because 3 into 7 is 21,
which is 1 mod 10, and 7 inverse is 3, and 9 inverse is 9 itself, because 9 into 9 is 81, which is 1 mod 10.
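
The units of Z10 and their inverses can be found by exhaustive search. A minimal sketch, with variable names chosen for this illustration:

```python
from math import gcd

n = 10

# For every x that has an inverse mod 10, record that inverse.
inverse = {x: y for x in range(n) for y in range(n) if (x * y) % n == 1}
units = sorted(inverse)
print(inverse)

# The units are exactly the elements relatively prime to 10.
coprime = [x for x in range(n) if gcd(x, n) == 1]

# The units are closed under multiplication mod 10.
closed = all((u * v) % n in units for u in units for v in units)
print(units == coprime, closed)
```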

(Refer Slide Time: 13:18)

Now we can define what a field is. We will rule out the trivial cases by saying that whenever
we are thinking about rings or fields, the additive identity and the multiplicative identity
are going to be 2 distinct
elements. So there will be at least 2 elements in all the rings that we are looking at. So can 0 be a
unit? Can 0 be inverted? You can verify that this cannot be the case, because 0 times any element is 0,
which is not 1. Therefore in any ring
the best that we can hope for is that the non-zero elements are invertible, and such a ring with the
additional property that the multiplication is commutative is called a field. So let us
define what a field is. Again we will denote the set by F, and there are 2 operations, namely
plus and multiplication. This is a field if, first, F is a ring for these operations; second, F minus
the 0 element forms a group under multiplication; and the third
requirement is that multiplication is commutative.

So if these conditions are satisfied then that is called a field. We will see some examples of
fields. If you look at the real numbers under the usual addition and multiplication, this is a field,
because if you take the non-zero elements, each of them is invertible and they form a group
under multiplication, and of course multiplication is commutative. We will see far more
interesting fields when we look at finite cases. If we look at, let us say, Zp under mod p
addition and multiplication, this is also a field if and only if p is prime. In our group theory
lectures, we had checked that if you consider multiplication modulo p, the elements 1 to
p minus 1 will form a group when p is a prime number, and if it is not a prime number, they will
not form a group.
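
The statement about Zp can be tested for small moduli. The sketch below, with helper names chosen for this illustration, compares "every non-zero element of Zn has a multiplicative inverse" against "n is prime" for n from 2 to 59.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def nonzero_elements_invertible(n):
    """True when every non-zero element of Z_n has an inverse mod n."""
    return all(
        any((x * y) % n == 1 for y in range(1, n)) for x in range(1, n)
    )

agree = all(is_prime(n) == nonzero_elements_invertible(n) for n in range(2, 60))
print(agree)
print(nonzero_elements_invertible(7), nonzero_elements_invertible(10))
```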

So this automatically means that we will have fields of size p, for any prime p. Can there be a field
of size 6? Can there be a field of size 9, and so on? These are important questions and we will see
the answers to some of these questions. So let us look at some more examples of fields. Let us
look at matrices. Let us consider 2 cross 2 matrices which have a
very specific form: let us say the first row is x y and the second row is minus y x, so the
off-diagonal part is skew symmetric and the two diagonal entries are equal.

So look at 2 cross 2 matrices of this form, where x and y are elements of a field. Does this
collection of matrices form a field? Clearly they form a ring, because if you have a
matrix x y minus y x and you add it to another matrix of that kind, x prime y prime minus y
prime x prime, what you get is another 2 cross 2 matrix which is of the same form: x plus x
prime will appear on the diagonal, and y plus y prime and its negative will appear off the
diagonal. So the closure property is there, and you can verify that under addition these do form a
group. The multiplication is also well defined, and you can see that the multiplication will in fact
be commutative if you just consider the matrices of this kind. So let us
take the matrix x y minus y x and find its product with another such matrix, x prime y prime minus y
prime x prime.

So the 1, 2 entry of the product is going to be x y prime plus y x prime, and the 2, 1 entry is going
to be minus y x prime minus x y prime, which is the negative of the 1, 2 entry. And the diagonal
elements are both going to be x x prime minus y y prime. So the product is again of the same form. And
you can see that the result is symmetric in x and x prime, and in y and y prime. So if you compute the
product of x prime y prime minus y prime x prime with x y minus y x, in that order, that is going to be
exactly the same. If you call the first product AB, the second one is BA, so AB is equal to BA, and
therefore for these matrices the multiplication is
commutative. But is every non-zero matrix invertible? That is the question. The answer is going to
depend on the particular field from which we are choosing x and y. So suppose you have a matrix x y
minus y x which is invertible; let us first try and do the symbolic inversion of this matrix. If you
have another matrix which inverts this, then it should be of the form a, b, minus b, a, and the
product should be equal to 1 0 0 1, because for matrix multiplication this is going to be the
multiplicative identity; nothing else can act as the multiplicative identity.

(Refer Slide Time: 18:48)

So when will these matrices be invertible? Here the 1, 1 entry of the product would be xa
minus yb, and that should be equal to 1; that is one of the requirements. The other requirement is
that the 1, 2 entry, xb plus ay, should be equal to 0. We can solve these simultaneous equations for a
and b. If you call the first one equation 1 and the second one equation 2, and compute x times
equation 1 plus y times equation 2, what we will get is x square a plus y square a is equal to x, and
therefore a is equal to x by x square plus y square. Similarly, you can show that b is equal to minus
ay by x, and that is going to be equal to minus y by x square plus y square. Therefore we can carry out
this inversion as long as x square plus y square is not equal to 0. So if we call the matrix x y minus
y x the matrix A, then A is invertible if x square plus y square is
not equal to 0.
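
The formulas for a and b can be checked with exact arithmetic. This is a small sketch assuming the field is the rational numbers, with the sample values x = 3 and y = 4 chosen only for illustration; inverse_entries is a hypothetical helper name.

```python
from fractions import Fraction

def inverse_entries(x, y):
    """Return (a, b) with a = x / (x^2 + y^2) and b = -y / (x^2 + y^2)."""
    d = x * x + y * y
    if d == 0:
        raise ZeroDivisionError("x^2 + y^2 is 0, so the matrix is not invertible")
    return x / d, -y / d

x, y = Fraction(3), Fraction(4)
a, b = inverse_entries(x, y)

# Entries of the product of [[x, y], [-y, x]] with [[a, b], [-b, a]].
top_left = x * a - y * b       # equation 1: should be 1
top_right = x * b + y * a      # equation 2: should be 0
bottom_left = -y * a - x * b   # negative of the 1, 2 entry
bottom_right = -y * b + x * a  # same as the 1, 1 entry

print(a, b, top_left, top_right, bottom_left, bottom_right)
```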

We had assumed that the entries come from a field, and in a field all the non-zero
elements are invertible. So if x square plus y square is non-zero, its inverse can be calculated;
multiplied with x you will get the value of a, and similarly you can find the value of b. So look at
our field and ask whether x square plus y square is not equal to 0 for every choice of x and y. Of
course, when x and y are both 0, x square plus y square will be equal to 0, but in that case the matrix
we are talking about is the all 0 matrix. The all 0 matrix will not have to be
inverted, so all that we have to check is: are all the non-zero matrices invertible? Okay, and that
is the case if x square plus y square is not equal to 0 whenever x and y are not both 0.

Now take Z3; this has 3 elements, namely 0, 1 and 2. So x has 3 possibilities and y has 3
possibilities. Out of these 9, the 0 0 possibility we can discount, and the other 8 possibilities we
can manually check. The x y values are 0 1, 0 2, 1 0, 1 1, 1 2, 2 0, 2 1 and 2 2, and we compute x
square plus y square mod 3 for each. For 0 1 this is 1; for 0 2 it is 4 mod 3, which is 1; for 1 0 it
is 1; for 1 1 it is 2; for 1 2, 1 square plus 2 square is 5 mod 3, that is again 2; for 2 0 it is 4 mod
3, which is 1; for 2 1, 2 square is 4, plus 1 is 5, so this is 2; and for 2 2 it is 8 mod 3, that is
again 2. So none of these are equal to 0, and if you choose entries from Z3,
these matrices are going to form a field. So what we have proved is the following: matrices of the
form x y minus y x whose entries are elements of Z3 form a field, and this field has exactly 9
elements, because x has 3 choices and y has 3 choices, and once x and y have been fixed the matrix
gets fixed. So we have
constructed a field with 9 elements. We will see other methods for constructing fields.
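
The manual check over Z3 can also be done exhaustively on the matrices themselves. The following is a minimal sketch, with helper names (mat, mul) chosen for this illustration, that builds all 9 matrices and verifies closure, commutativity and invertibility of the non-zero ones.

```python
from itertools import product

p = 3

def mat(x, y):
    """The matrix with rows (x, y) and (-y, x), entries reduced mod p."""
    return ((x % p, y % p), ((-y) % p, x % p))

def mul(A, B):
    """2x2 matrix product with entries reduced mod p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

elements = [mat(x, y) for x, y in product(range(p), repeat=2)]
identity = mat(1, 0)
zero = mat(0, 0)

closed = all(mul(A, B) in elements for A in elements for B in elements)
commutative = all(mul(A, B) == mul(B, A) for A in elements for B in elements)
nonzero_invertible = all(
    any(mul(A, B) == identity for B in elements)
    for A in elements if A != zero
)

print(len(elements), closed, commutative, nonzero_invertible)
```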

Now maybe a similar thing would work if we take Z5, and we would get a field with 25 elements. So
let us try and choose x and y from Z5. Okay, so that has the elements
0, 1, 2, 3, 4. But here, if we take x as 2 and y as 1, x square plus y square is 2 square plus 1
square, that is 5, which is equal to 0 mod 5. So if we take entries from Z5, these matrices do not form
a field. So we will
probably have to explore other methods to come up with fields of size 25.
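
To see how the construction fails over Z5, one can list every non-zero pair x, y with x square plus y square equal to 0 mod 5. This short sketch does that, and the variable name bad_pairs is chosen here for illustration.

```python
p = 5

# Non-zero (x, y) for which x^2 + y^2 is 0 mod 5; the matching matrices
# with rows (x, y) and (-y, x) are non-zero but not invertible.
bad_pairs = [
    (x, y)
    for x in range(p) for y in range(p)
    if (x, y) != (0, 0) and (x * x + y * y) % p == 0
]
print(bad_pairs)
```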

(Refer Slide Time: 25:00)

In order to do that, we will learn something about polynomials. So what are polynomials? They are
familiar objects. We would write a polynomial in one variable x as a0 plus a1 x plus, and so on, up to
an x raise to n. So this is a polynomial of degree n if an is not equal to 0. So let us assume that an
is not equal to 0, and then this becomes a polynomial of degree n. If you think of this as a function
of x, we can say that f(x) is a polynomial of degree n. So the degree of a polynomial is the largest d
such that, if you consider x to the power d, its coefficient is non-zero.

So the largest d is such that x to the d has a non-zero coefficient. This a0, a1, up to an is what we
refer to as the coefficients. We are not really interested in evaluating these polynomials at
particular points x; we are just interested in the form of the polynomial. So x you can think of as a
formal variable, and therefore f(x) you can put in one to one correspondence with a sequence of, let us
say, n plus 1 terms. So a polynomial for us is now just a sequence. A polynomial is a finite
sequence of elements from a ring. We will insist that these elements a0, a1 up to an are
coming from a ring, because we want to add and multiply polynomials. So let us look
at these operations of sum and product. You have one sequence a0 up to an and another
sequence b0 to bm, and we can just look at the terms and add them. Suppose m is not equal to n, and
let us say m is greater than n; then we can pad the shorter polynomial with enough
0s, and then we may assume that m is equal to n. Then in the new sequence, the
ith term ci will be equal to ai plus bi, so that is the sum.

So suppose you had the polynomial x square plus 2x plus 5 and another
polynomial x cube plus 8x plus 9. This is your first polynomial, let us say a(x), and this is
b(x). Then a(x) plus b(x) would be x cube plus x square plus 10x plus 14. Here we were
looking at the coefficients, and the addition of these coefficients was happening in the ring: 8
plus 2 gave us 10 and 9 plus 5 gave us 14, because we assumed that these numbers were from the
ring of integers. We could also define their product. a(x) times b(x) is computed term by term: x cube
into a(x) gives x raise to 5 plus 2x raise to 4 plus 5x cube; 8x into a(x) gives 8x cube plus 16x
square plus 40x; and 9 into a(x) gives 9x square plus 18x plus 45. These can
be combined and written as x raise to 5 plus 2x raise to 4 plus, the cube terms combine to give us
13x cube, plus, the square terms combine to give us 25x square, plus 58x plus 45.
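
Treating a polynomial as a sequence of coefficients makes this example easy to reproduce. The sketch below stores coefficients from the constant term upwards (an ordering chosen here for convenience), and the helper names poly_add and poly_mul are introduced only for this illustration.

```python
def poly_add(a, b):
    """Add coefficient lists (constant term first), padding the shorter with 0s."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    """Multiply coefficient lists: the x^(i+j) coefficient collects a[i] * b[j]."""
    d = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            d[i + j] += ai * bj
    return d

a = [5, 2, 1]      # x^2 + 2x + 5
b = [9, 8, 0, 1]   # x^3 + 8x + 9

print(poly_add(a, b))  # coefficients of x^3 + x^2 + 10x + 14
print(poly_mul(a, b))  # coefficients of x^5 + 2x^4 + 13x^3 + 25x^2 + 58x + 45
```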

So multiplication is a little more complex operation, but we are essentially looking at the
coefficients and adding and multiplying the coefficients in the ring of integers. Formally,
if we denote by di the coefficient of x raise to i in the product,
then di is equal to the sum, over j varying from 0 to i, of aj times bi-j. That is equal to
a0 into bi plus a1 into bi-1 plus a2 into bi-2, all
the way up to ai times b0. So the ith term's coefficient will be this sum of products. Again,
all those operations are carried out in the ring of integers in this particular
example. Now, instead of carrying out these operations in the ring of
integers, if we had to carry them out mod 10, so instead of integers we were
considering Z10, then the sum would be different. In that case the answer would be x
cube plus x square plus 4, because 10x is 0 times x, which is just 0, and 14 mod 10 is
4. And if you look at the product, that will be equal to x raise to 5 plus 2x raise to 4 plus 3x cube
plus 5x square plus 8x plus 5. So that is a different polynomial from what
we got when we considered the ring of integers. So when we are considering arbitrary rings,
the way the degree behaves is different from the usual
behaviour, in the sense that if you add two polynomials, their degrees could come down.
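
The same two polynomials can be added and multiplied with coefficients reduced mod 10. This is a minimal sketch with coefficients stored constant term first; the helper names and the modulus parameter m are choices made for this illustration. The product uses the sum-of-products formula for di directly.

```python
def poly_add_mod(a, b, m):
    """Add coefficient lists (constant term first) with coefficients mod m."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [(x + y) % m for x, y in zip(a, b)]

def poly_mul_mod(a, b, m):
    """d_i = sum over j of a_j * b_(i-j), with all arithmetic mod m."""
    d = []
    for i in range(len(a) + len(b) - 1):
        total = 0
        for j in range(i + 1):
            if j < len(a) and i - j < len(b):
                total += a[j] * b[i - j]
        d.append(total % m)
    return d

a = [5, 2, 1]      # x^2 + 2x + 5
b = [9, 8, 0, 1]   # x^3 + 8x + 9

print(poly_add_mod(a, b, 10))  # coefficients of x^3 + x^2 + 4
print(poly_mul_mod(a, b, 10))  # coefficients of x^5 + 2x^4 + 3x^3 + 5x^2 + 8x + 5
```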

(Refer Slide Time: 32:19)

For example, take x square plus 3x and 9x square plus 9x, with coefficients from the ring of integers
mod 10. When you add them, the result would be 10x square, which is 0, plus 12x, and
that is just 2x. So the sum has degree 1 although both polynomials have degree 2. Now suppose you had
the polynomials 2x raise to 4 plus 7x and 5x square plus
5. If you multiply them out in the usual way, the answer would be 2 into 5 x raise to 6, plus 7 into 5
x cube, plus 2 into 5 x raise to 4, plus 5 into 7 x. When the multiplication is carried out mod 10, the
first coefficient would give us 0, the second 5, the third 0 and the fourth 5.

So mod 10, the answer is 5 x cube plus 5x. We had taken a polynomial of degree 4, and when you multiply
it with some other non-zero polynomial, you
are getting a result of degree 3, so the degree is reducing. So the degree could behave in a very
funny manner. We can take care of these kinds of situations by just working in a field. We
insisted that the coefficients of the polynomials were coming from a ring; if
you take these coefficients to come from a field, then we can show that when you multiply
two non-zero polynomials the degrees add up, so the degree cannot decrease. So let us have some
notation. If you have a
ring R and you look at polynomials whose coefficients come from R, then we will write it as
R[x]. So this is the notation for polynomials in x where the coefficients are from R.

And if we have a field, we will usually write F[x], to indicate the difference between a ring and
a field. So when we say F[x] or R[x], it means we are considering polynomials in the variable x
whose coefficients come from F or R. If we consider polynomials whose coefficients are from a
field, then we have the following observation. If a(x) and b(x) are polynomials in F[x], where F
is a field, then the degree of a(x) times b(x) is equal to the degree of a(x) plus the degree of
b(x). We should assume that a(x) is not equal to 0 and b(x) is not equal to 0, because if either
were 0, the product would be 0. So if they are non-zero polynomials, then the degrees add up. This
is easy to check. If a(x) has degree m, its leading term is a_m x^m with a_m non-zero, and if b(x)
has degree n, its leading term is b_n x^n with b_n non-zero. When you take the product, the term
of highest degree is a_m b_n x^(m+n). Since a_m and b_n are non-zero elements of a field, they
cannot multiply to give 0. So x^(m+n) has a non-zero coefficient, it is the term of the highest
degree, and therefore the degree of the product is equal to m + n. So we will stop here and
continue in the next class.
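
The contrast between ring and field coefficients can be checked with a small sketch. The two factors, 2x^3 + 7 and 5x^3 + 5x, are inferred from the expansion quoted above and are an assumption; the function names and the choice of Z_11 as the comparison field are illustrative only.

```python
def poly_mul_mod(a, b, n):
    """Multiply coefficient lists a and b (lowest degree first), coefficients mod n."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % n
    return c

def degree(a):
    """Degree of a coefficient list; -1 stands for the zero polynomial."""
    for i in range(len(a) - 1, -1, -1):
        if a[i] != 0:
            return i
    return -1

a = [7, 0, 0, 2]   # 2x^3 + 7 (assumed first factor)
b = [0, 5, 0, 5]   # 5x^3 + 5x (assumed second factor)

prod10 = poly_mul_mod(a, b, 10)   # coefficients in the ring Z_10
prod11 = poly_mul_mod(a, b, 11)   # coefficients in the field Z_11

print(prod10, degree(prod10))     # the x^6 and x^4 terms vanish mod 10
print(prod11, degree(prod11))     # degrees add up: 3 + 3
```

Over Z_10 the product collapses to 5x^3 + 5x, while over the field Z_11 the degree is 6, as the observation predicts.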

Discrete Mathematics
Professor Benny George
Professor Sajith Gopalan
Department of Computer Science and Engineering
Indian Institute of Technology Guwahati
Lecture 43
Construction of Finite Fields
(Refer Slide Time: 0:39)

So we will study finite fields in this lecture. A field is a commutative ring in which the
non-zero elements form a multiplicative group. Examples are the set of complex numbers, the set of
real numbers and the set of rational numbers. A non-example is the set of integers: the integers
do form a commutative ring, but the only invertible elements are 1 and minus 1. C, R and Q are all
infinite sets, so they are examples of infinite fields, and in this lecture we will concentrate on
trying to understand the finite fields. An example of a finite field is the integers under mod p
addition and mod p multiplication. And here we have to insist that p is prime. If we look at the
elements of Z10, that is, doing mod 10 arithmetic, they do form a commutative ring. But since 10
is not a prime number, this will not be a field. If you multiply 2 and 5, they are non-zero
elements and they multiply to give you 0. So this would in particular mean that 2 is not
invertible, and neither is 5. There is no element with which you can multiply 2 or 5 and get the
identity, because 2 and 5 are both factors of 10. On the other hand, take Z101. Since 101 is a
prime number, you can verify that under mod 101 operations these form a field of size 101. The
elements 0 to 100 form an additive group and the non-zero elements 1 to 100 form a multiplicative
group. Every non-zero element, you can check, is invertible mod 101.
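
The claims about Z10 and Z101 can be verified by brute force. This is a minimal sketch; the helper name `units` is hypothetical and the search is deliberately naive.

```python
def units(n):
    """Return the elements of Z_n that have a multiplicative inverse mod n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]

print(units(10))                # 2 and 5 do not appear in this list
print((2 * 5) % 10)             # two non-zero elements multiplying to 0
print(len(units(101)) == 100)   # every non-zero element of Z_101 is invertible
```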

So if you take any prime p and look at Zp, this will have p elements, so that is a finite field of
size p. Can we have a finite field of size, let's say, 75? Can we have a finite field of size 128?
So, finite fields of what sizes exist? That is the question which we will try to answer. We will
provide some partial answers to this question. What we will show is the following.

(Refer Slide Time: 3:30)

The first thing that we will show is that every finite field is of size p to the r, for some prime
p and positive integer r. In particular, after this we can conclude that there is no finite field
of size 75, because 75 is not a prime power. And then we will give a construction of a finite
field of size p to the r. What is also true, but we will not show in this lecture, is that there
is essentially only one finite field of a given size. That means, if we construct a finite field
of size p to the r, that field can be thought of as a representative finite field of that size.
Every other finite field of that size is isomorphic to the one that we will construct. So let us
continue with this thread. First we define the notion of the characteristic of a field. Look at
the multiplicative identity 1, and keep on adding it to itself. In other words, consider the
additive subgroup generated by the unit element. This is clearly a finite collection, because
every element that you generate belongs to the field and the field itself is a finite collection,
and therefore this is a finite cyclic group.

The size of this cyclic group is what we call the characteristic of the field. Note that when you
consider an infinite field, the additive subgroup generated by the unit element could possibly be
infinite. In that case we say that the field has characteristic 0. But here we are restricting
ourselves to finite fields, therefore the characteristic is always a positive integer.

Now we will claim the following thing about the characteristic. The characteristic of every finite
field is a prime number. How do we show that? By the definition, we are considering 1, 1 plus 1,
1 plus 1 plus 1 and so on. Look at the first time this becomes 0. Because this is a finite cyclic
group, the process should at some stage give rise to 0. The elements generated up to that point
are distinct, and after 0 has been generated, if you keep adding 1, the same structure is going to
repeat again and again. So the number of times 1 has to be added to itself to get 0 is the
characteristic of the field. Let us say this is t, so t is the smallest such number, the minimum
number of times that you have to add 1 to itself so that you get 0. Note that t is at least 2,
because 1 is not equal to 0 in a field. Now if t is not a prime, we can write t equal to a into b,
where a and b are both greater than 1 and smaller than t, and then we can look at the product of
1 plus 1 plus 1 taken a times and 1 plus 1 plus 1 taken b times.

By the distributive law this is exactly equal to 1 plus 1 plus 1 added ab times, that is t times,
which is 0. We can denote 1 added to itself a times by a, and 1 added to itself b times by b.
These are two elements of the field, and they are non-zero, because a and b are smaller than t,
and if either of them were 0, then t would not be the smallest number such that 1 added t times
gives 0. So now we have two non-zero elements which multiply to give 0. That is not possible in a
field. So this contradicts the assumption that t was a composite number. So the characteristic of
every finite field is a prime number p. Further, p times f is equal to 0 for every f belonging to
the field. Take any element f of the field and add it to itself p times; you will get 0, because
p times 1 is 0, and p times f is equal to p times 1, multiplied by f, so that is also going to
be 0.
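
The argument can be illustrated on the rings Z_n, which are fields only when n is prime. The sketch below counts how often 1 must be added to itself before reaching 0; the function name is hypothetical, and Z_n is used only as a convenient stand-in for the general setting.

```python
def additive_order_of_one(n):
    """Number of times 1 must be added to itself in Z_n before the sum is 0."""
    total, count = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        count += 1
    return count

# In the field Z_7 the count is the prime 7, and 7 * f = 0 for every f.
print(additive_order_of_one(7))
print(all((7 * f) % 7 == 0 for f in range(7)))

# In Z_6 the count is 6 = 2 * 3, and the elements 2 and 3 are non-zero
# yet multiply to 0, which is exactly what the proof rules out in a field.
print(additive_order_of_one(6), (2 * 3) % 6)
```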

(Refer Slide Time: 9:46)

So now we are in a position to show why every finite field is of size p to the r for some positive
integer r and prime p. Let F be a finite field, and let us look at a collection of elements of F
which generates F. Consider a minimal subset of F which generates F. What does it mean for a
collection to generate F? We are looking at a set f_1, f_2, ..., f_k, some finite set, such that
if we take these elements and add them up, each one enough number of times, we can generate the
entire field. So we are looking at sums of the form n_1 f_1 plus n_2 f_2 plus ... plus n_k f_k.

Suppose S is the set f_1, f_2, ..., f_k, and let P(S) be the set of all elements of the form
n_1 f_1 plus n_2 f_2 plus ... plus n_k f_k, where each n_i is an integer. That is the set
generated by S. We are looking at a set S which generates the complete field F, and we want that
set to be minimal, in the sense that none of its proper subsets can generate F. Clearly, there are
sets which generate the entire collection. If you take the entire F, that will generate F, because
the n_i can all be chosen as 1s and 0s. So the full field generates itself. You can then prune off
elements systematically and get a subset such that no smaller subset of it generates the entire F.
Let S be one such collection. Further, recall our previous observation that if p is the
characteristic, then p times any f is equal to 0. So while we are considering elements of this
form, we can restrict each n_i to the range 0 to p minus 1. Because if we write n_1 as, let's say,
q times p plus r, where r is the remainder, then n_1 f_1 is q times p plus r, times f_1, which can
be written as q times p f_1 plus r times f_1. Since p f_1 is 0, n_1 f_1 is nothing but r times
f_1, where r lies between 0 and p minus 1.

So when we are looking to generate the set F, the coefficients of f_1, f_2, ..., f_k can be chosen
from 0 to p minus 1. Now if you look at everything that is generated by f_1, f_2, ..., f_k, n_1
can be anything from 0 to p minus 1, n_2 can be anything from 0 to p minus 1, and so on, so there
are p choices for each n_i. So in all, the total number of elements that we can form is no more
than p to the k. So, the number of elements in F is no more than p to the k, because each of n_1,
n_2, ..., n_k is selected from 0 to p minus 1.

What we will show now is that it is exactly p to the power k. Our claim is that distinct choices
of the n_i generate distinct elements. Why is this so? Suppose this is not the case. Then, let us
say n_1 f_1 plus n_2 f_2 plus ... plus n_k f_k is equal to m_1 f_1 plus m_2 f_2 plus ... plus
m_k f_k, where the tuple (n_1, n_2, ..., n_k) is different from the tuple (m_1, m_2, ..., m_k),
and all entries lie between 0 and p minus 1. Since these two expressions generate the same
element, if you subtract them, you must get 0. So we can rewrite this as (n_1 minus m_1) times f_1
plus (n_2 minus m_2) times f_2 plus all the way up to (n_k minus m_k) times f_k, equal to 0. Now,
at least one of the n_i minus m_i should be non-zero, because if all of them were 0, the two
tuples would be equal. So look at the first non-zero entry, say at position i. The earlier terms
are 0, so (n_i minus m_i) times f_i plus some other terms involving f_(i+1) all the way up to f_k
is equal to 0.

We can rewrite this equation by keeping (n_i minus m_i) times f_i on one side, so that it is equal
to, let us say, alpha_(i+1) f_(i+1) plus alpha_(i+2) f_(i+2), all the way up to alpha_k f_k. Now
n_i minus m_i is a non-zero number strictly between minus p and p, so it is not a multiple of p
and it has an inverse mod p. Multiply both sides by this inverse. Since p times f_i is 0, the left
side becomes f_i, and we have expressed f_i as a linear combination of f_(i+1) up to f_k. This
would mean that in the collection f_1 to f_k, which we assumed is minimal, one of the elements can
be expressed as a linear combination of the other elements. So there is a smaller set, namely S
minus f_i, which generates the same collection. That contradicts our initial assumption that S was
a minimal generating set, and therefore we can conclude that our claim is valid: there cannot be
two distinct tuples of n_i's which generate the same element. So this basically means that, if you
take any finite field, its size has to be of the form p to the power k, or p to the power r as we
called it earlier, where p is a prime.
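
The result answers the earlier questions about sizes 75 and 128. As a minimal sketch, the hypothetical helper below decides whether a number is a prime power by trial division, which is adequate only for small inputs.

```python
def prime_power_decomposition(n):
    """Return (p, r) with n == p**r and p prime, or None if n is not a prime power."""
    if n < 2:
        return None
    p = 2
    while p * p <= n:
        if n % p == 0:
            break          # p is the smallest prime factor of n
        p += 1
    else:
        p = n              # no divisor up to sqrt(n): n itself is prime
    r = 0
    while n % p == 0:
        n //= p
        r += 1
    return (p, r) if n == 1 else None

print(prime_power_decomposition(75))    # None: 75 = 3 * 5^2, so no field of size 75
print(prime_power_decomposition(128))   # (2, 7): size 128 is not ruled out
print(prime_power_decomposition(101))   # (101, 1)
```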

(Refer Slide Time: 18:18)

So the next question is, can we really construct such a field? That is the last part of this
lecture, constructing a finite field of size p to the r. Let us consider polynomials. The way we
will do this is, we will construct a set of polynomials and define an addition and a
multiplication on them, and with respect to that addition and multiplication, this collection of
polynomials will be a finite field.

The set consists of polynomials whose coefficients are from 0 to p minus 1. So consider Zp and
look at polynomials whose coefficients are in Zp. The polynomials that we will consider are the
polynomials of degree less than r with coefficients from Zp. If we choose r equal to 2 and p equal
to 3, then our polynomials are 0, 1, 2, which are the constant polynomials, and then we have the
polynomials x, x plus 1 and x plus 2, and then 2x, 2x plus 1 and 2x plus 2. There are 9 elements
here, namely p to the r, where p is 3 and r is 2. This is the collection of polynomials. Of
course, if you multiply two of these polynomials, the result need not be in this collection. For
example, if we take x plus 1 into x plus 1, that is x square plus 2x plus 1, which is not an
element of this collection.
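
The collection can be listed mechanically. In this sketch a polynomial of degree less than r is stored as a tuple of coefficients (a_0, ..., a_(r-1)); the representation and the function name are assumptions made for illustration.

```python
from itertools import product

def field_elements(p, r):
    """All coefficient tuples (a_0, ..., a_(r-1)) with entries in Z_p."""
    return list(product(range(p), repeat=r))

elems = field_elements(3, 2)
print(len(elems))   # 9 = 3^2
print(elems)        # (1, 2) stands for 1 + 2x, (0, 1) for x, and so on
```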

So under the normal multiplication, these do not form a finite field. We have to define a new
multiplication and addition. The first thing is, coefficients are computed mod p. In particular,
if we multiply 2x and 2x we will get 4x square, and since 4 mod 3 is 1, that will be x square. But
x square is still not an element of this collection. So what we will do is, we will reduce the
degree by taking the remainder upon dividing by a degree two polynomial, or in general a degree r
polynomial.

This cannot be an arbitrary polynomial, but for the time being, let's just say we will divide by
x square plus 1. So when you multiply two of these polynomials, we reduce the coefficients of the
result mod p, and the resulting polynomial is divided by a fixed polynomial of degree r. When you
divide a polynomial a(x) by another polynomial b(x), the remainder is a polynomial whose degree is
less than the degree of b(x). So when we divide by a polynomial of degree r, the remainder will be
of degree at most r minus 1. So now at least we have valid addition and multiplication operations
defined on this collection. We need to verify that this is indeed a field. It will not be a field
when the divisor is an arbitrary polynomial; in general this makes it a ring, because addition is
well defined and invertible, whereas multiplication may not be invertible.

We will introduce a concept called an irreducible polynomial, and if the polynomial by which we
are dividing is irreducible, then we will show that this collection of polynomials is a field.
Before we introduce the notion of irreducible polynomial, let me state what we were doing in a
general setting. Consider polynomials of degree less than r with coefficients in Zp. So we are
looking at a_0 plus a_1 x plus a_2 x square, all the way up to a_(r-1) x raised to r minus 1. Each
a_i has p choices, and each choice gives a different polynomial; they remain distinct even when
you take remainders on dividing by a degree r polynomial, because their degree is already less
than r. So these are distinct polynomials, and there are precisely p raised to r of them.

Addition is component wise. If a is one polynomial of degree less than r, with coefficients a_i,
and b is another polynomial of degree less than r, with coefficients b_i, then the coefficients of
a plus b are a_i plus b_i mod p. That is simple enough. And a times b starts with the normal
polynomial multiplication. If we denote the coefficients of the product by c_i, then c_i is the
sum of a_j b_(i-j), mod p, over all j from 0 to i. Let c(x) be the polynomial we obtain, c_0 plus
c_1 x plus all the way up to c_d x raised to d. Note that this degree d could be greater than or
equal to r. Now divide c(x) by a special polynomial k(x) of degree r and declare the remainder to
be the product of a and b. This k(x) has to be an irreducible polynomial. We have not yet defined
what an irreducible polynomial is; we will come to that.
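
The addition rule and the first step of the multiplication rule can be sketched as follows. Polynomials are coefficient lists of length r, lowest degree first; the function names are hypothetical, and the reduction by k(x) is intentionally left out here.

```python
def add_mod_p(a, b, p):
    """Component-wise sum: coefficient i is (a_i + b_i) mod p."""
    return [(x + y) % p for x, y in zip(a, b)]

def mul_coeffs(a, b, p):
    """Coefficients c_i = sum over j of a_j * b_(i-j), reduced mod p (before dividing by k(x))."""
    r = len(a)                      # both inputs are assumed to have length r
    c = []
    for i in range(2 * r - 1):
        s = 0
        for j in range(i + 1):
            if j < r and i - j < r:  # terms outside the lists are zero
                s += a[j] * b[i - j]
        c.append(s % p)
    return c

a = [2, 2]   # 2x + 2
b = [1, 2]   # 2x + 1
print(add_mod_p(a, b, 3))    # 4x + 3 reduces to x
print(mul_coeffs(a, b, 3))   # 4x^2 + 6x + 2 reduces to x^2 + 2, degree not yet below r
```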

And when we do this, the property of irreducibility makes sure that the collection forms a field.
In the case of our field of size 9, x square plus 1 happens to be an irreducible polynomial. So
let us look at an example of one particular multiplication. If you take 2x plus 2 and multiply it
with 2x plus 1, this is 4x square plus 2x plus 4x plus 2, that is, 4x square plus 6x plus 2. The
coefficients have to be reduced mod 3: 4 becomes 1, and 6 becomes 0, so the 6x term is gone, and
we get x square plus 2. When we divide this by x square plus 1, it can be written as 1 into
x square plus 1, plus 1. So the remainder is 1, and the product of 2x plus 2 and 2x plus 1 is
reported as 1, which is one of the elements in the collection.
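
The worked multiplication can be reproduced with a short sketch of multiplication followed by remainder. The function names are hypothetical, coefficient lists are lowest degree first, and the leading coefficient of k(x) is inverted with Fermat's little theorem, which assumes p is prime.

```python
def poly_rem(c, k, p):
    """Remainder of c(x) on division by k(x), coefficients in Z_p, p prime."""
    c = [x % p for x in c]
    dk = len(k) - 1                       # degree of k(x); k[-1] must be non-zero
    lead_inv = pow(k[-1], p - 2, p)       # inverse of the leading coefficient
    for i in range(len(c) - 1, dk - 1, -1):
        q = (c[i] * lead_inv) % p         # next quotient coefficient
        for j in range(dk + 1):
            c[i - dk + j] = (c[i - dk + j] - q * k[j]) % p
    return c[:dk]

def field_mul(a, b, k, p):
    """Multiply a(x) and b(x) mod p, then take the remainder mod k(x)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % p
    return poly_rem(c, k, p)

k = [1, 0, 1]                              # x^2 + 1
print(field_mul([2, 2], [1, 2], k, 3))     # (2x + 2)(2x + 1) gives [1, 0], the constant 1
print(field_mul([0, 2], [0, 2], k, 3))     # 2x * 2x = x^2, which reduces to -1 = 2
```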

So whatever you multiply, since we are taking the remainder on dividing by x square plus 1, where
the division is polynomial division, the remainder is a polynomial of degree less than 2, that is,
of degree at most 1. And we have listed all the polynomials of degree at most 1 in our collection.
So this is a well-defined operation. The only thing we really need to check is that, under this
multiplication, every non-zero element in the collection has a multiplicative inverse. This
collection of polynomials will be a ring irrespective of whether the polynomial we divide by is
reducible or irreducible, but the irreducibility property will ensure that each non-zero element
has an inverse.
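
The role of irreducibility can be seen by brute force for p = 3 and r = 2. The sketch below compares x^2 + 1 with x^2 + 2, which factors as (x + 1)(x + 2) mod 3; the second polynomial is chosen here only as a contrasting example and does not appear in the lecture. Elements are pairs (a_0, a_1) and the function names are hypothetical.

```python
from itertools import product

def mul_mod(a, b, k, p):
    """(a * b) mod k(x) for a = (a0, a1), b = (b0, b1) and monic k = (k0, k1, 1)."""
    c0 = a[0] * b[0]
    c1 = a[0] * b[1] + a[1] * b[0]
    c2 = a[1] * b[1]
    # Modulo k(x), x^2 equals -k1*x - k0, so fold the x^2 term back in.
    return ((c0 - c2 * k[0]) % p, (c1 - c2 * k[1]) % p)

def elements_without_inverse(k, p):
    """Non-zero elements that have no multiplicative inverse modulo k(x)."""
    elems = list(product(range(p), repeat=2))
    one = (1, 0)
    return [a for a in elems
            if a != (0, 0) and not any(mul_mod(a, b, k, p) == one for b in elems)]

print(elements_without_inverse((1, 0, 1), 3))   # x^2 + 1: empty list, so a field
print(elements_without_inverse((2, 0, 1), 3))   # x^2 + 2: multiples of x + 1 and x + 2 fail
```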

(Refer Slide Time: 28:43)

So now let us define the notion of irreducibility. Let us look at a non-constant polynomial k(x).
Suppose that whenever we write k(x) equal to h(x) times g(x), either h(x) is a constant polynomial
or g(x) is a constant polynomial. A polynomial which satisfies this condition is called an
irreducible polynomial. Let us call this condition 1; any polynomial which satisfies condition 1
is called an irreducible polynomial. This is a notion very similar to that of primality. If a
number p is written as a into b, and it is always the case that a or b is 1, then we say that the
number p is prime. The only way in which you can write a prime number as a product of two numbers
is by choosing one of them as 1. But here, instead of 1, we have a constant polynomial. So if the
only factorizations possible for k(x) involve a constant polynomial, then we say k is an
irreducible polynomial. In other words, k(x) does not have common factors, other than constants,
with any non-zero polynomial of smaller degree. So now let us see how the notion of irreducibility
helps us in constructing the finite field. What we need to prove is that every non-zero element in
the collection is invertible. Let us say a(x) is one particular non-zero element. Since k(x) is an
irreducible polynomial and a(x) has smaller degree, the greatest common divisor of a(x) and k(x)
must be equal to 1.

Strictly, the GCD is determined only up to a constant factor, but we can adjust the constant so
that it is exactly 1. So the GCD of a(x) and k(x) is equal to 1, and the reason is irreducibility:
if it were anything other than a constant, then a(x) and k(x) would have a non-constant common
factor, and that would be a factor of k(x) of degree less than r. So GCD(a(x), k(x)) equals 1, and
just as for integers, the GCD can be written as a linear combination. What this means is, we can
find alpha(x) and beta(x) such that alpha(x) a(x) plus beta(x) k(x) is equal to 1. This would mean
that alpha(x) into a(x) is equal to 1 minus beta(x) times k(x). Let's call this polynomial g(x).
What we know is, alpha(x) multiplied by a(x) gives g(x), and if we look at this mod k(x), g(x) is
nothing but 1. So alpha(x) into a(x) is 1 mod k(x). Now consider the remainder obtained on
dividing alpha(x) by k(x), and call it alpha prime(x). Then alpha prime(x) belongs to our
collection and is the multiplicative inverse of a(x).
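
The existence argument is constructive if the linear combination is computed with the extended Euclidean algorithm for polynomials. The lecture does not name that algorithm, so the sketch below is one possible way to obtain alpha prime(x), under the assumptions that p is prime, coefficients are already reduced mod p, and polynomials are coefficient lists with the lowest degree first. All function names are hypothetical.

```python
def trim(a):
    """Drop trailing zero coefficients."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_divmod(a, b, p):
    """Quotient and remainder of a(x) / b(x) over Z_p, with b(x) non-zero."""
    a, b = trim([x % p for x in a]), trim([x % p for x in b])
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv = pow(b[-1], p - 2, p)             # inverse of b's leading coefficient
    while len(a) >= len(b):
        shift = len(a) - len(b)
        coef = (a[-1] * inv) % p
        q[shift] = coef
        for j, bj in enumerate(b):
            a[shift + j] = (a[shift + j] - coef * bj) % p
        a = trim(a)                        # the leading term has been cancelled
    return trim(q), a

def poly_mul(a, b, p):
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % p
    return trim(c)

def poly_sub(a, b, p):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])

def inverse_mod(a, k, p):
    """Find alpha'(x) with alpha'(x) * a(x) = 1 mod k(x)."""
    r0, r1 = trim(list(k)), trim(list(a))
    s0, s1 = [], [1]                       # invariant: r_i = s_i * a(x) mod k(x)
    while r1:
        q, r = poly_divmod(r0, r1, p)
        r0, r1 = r1, r
        s0, s1 = s1, poly_sub(s0, poly_mul(q, s1, p), p)
    if len(r0) != 1:
        raise ValueError("a(x) and k(x) share a non-constant factor")
    c_inv = pow(r0[0], p - 2, p)           # scale the constant GCD to exactly 1
    return poly_divmod([(c_inv * x) % p for x in s0], k, p)[1]

print(inverse_mod([2, 2], [1, 0, 1], 3))   # inverse of 2x + 2 modulo x^2 + 1 over Z_3
```

For the field of size 9 this returns [1, 2], that is 2x + 1, which agrees with the product computed by hand in the lecture.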

So we took an arbitrary non-zero polynomial from our collection, and since its GCD with k(x) is 1,
we can find a linear combination of a(x) and k(x) which gives us 1, and from that we can extract
an inverse of a(x). That means every non-zero element is invertible, and therefore this collection
forms a field. What needs to be further shown is that for every degree, there is an irreducible
polynomial over Zp. It is a little beyond the scope of this course, so we will just assume that
for any particular degree there is an irreducible polynomial of that degree. So that is the end of
this lecture.
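
For small p and r the assumed irreducible polynomial can simply be searched for. The sketch below is a brute-force search, not a proof of existence: it tries monic polynomials of degree r and rejects any that is divisible by a monic polynomial of degree between 1 and r // 2. The function names are hypothetical, and coefficient lists are lowest degree first.

```python
from itertools import product

def poly_rem(c, d, p):
    """Remainder of c(x) on division by a monic d(x) over Z_p."""
    c = [x % p for x in c]
    dd = len(d) - 1
    for i in range(len(c) - 1, dd - 1, -1):
        q = c[i]                           # d is monic, so no inverse is needed
        for j in range(dd + 1):
            c[i - dd + j] = (c[i - dd + j] - q * d[j]) % p
    return c[:dd]

def is_irreducible(k, p):
    """True if the monic k(x) has no monic divisor of degree 1 .. deg(k) // 2."""
    r = len(k) - 1
    for deg in range(1, r // 2 + 1):
        for low in product(range(p), repeat=deg):
            d = list(low) + [1]
            if not any(poly_rem(k, d, p)):  # zero remainder: d(x) divides k(x)
                return False
    return True

def find_irreducible(p, r):
    """First monic irreducible polynomial of degree r over Z_p in search order, or None."""
    for low in product(range(p), repeat=r):
        k = list(low) + [1]
        if is_irreducible(k, p):
            return k
    return None

print(find_irreducible(3, 2))   # [1, 0, 1], the polynomial x^2 + 1 used for the field of size 9
print(find_irreducible(2, 7))   # a degree 7 polynomial over Z_2, usable for a field of size 128
```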
