1
Normalization
The biggest problem to solve in a database is data redundancy.
Why is data redundancy a problem? Because it causes:
•Insert Anomaly
•Update Anomaly
•Delete Anomaly
Teacher     Subject     Teacher Degree  Tel
Sok San     Database    Master's        012666777
Van Sokhen  Database    Bachelor's      017678678
Sok San     E-Commerce  Master's        012666777
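The update and delete anomalies can be made concrete with a small sketch. It is an illustration only: the rows come from the teacher table above, while the helper `tels_by_teacher` and the new phone number are hypothetical names and values chosen for this example.

```python
# Rows copied from the redundant teacher table above.
rows = [
    {"teacher": "Sok San", "subject": "Database", "degree": "Master's", "tel": "012666777"},
    {"teacher": "Van Sokhen", "subject": "Database", "degree": "Bachelor's", "tel": "017678678"},
    {"teacher": "Sok San", "subject": "E-Commerce", "degree": "Master's", "tel": "012666777"},
]


def tels_by_teacher(rows):
    """Collect every distinct phone number recorded for each teacher."""
    tels = {}
    for row in rows:
        tels.setdefault(row["teacher"], set()).add(row["tel"])
    return tels


# Update anomaly: the number is changed in only one of Sok San's two rows,
# so the table now records two different numbers for the same teacher.
rows[0]["tel"] = "012000000"  # hypothetical new number
conflicts = {t: n for t, n in tels_by_teacher(rows).items() if len(n) > 1}

# Delete anomaly: removing Van Sokhen's only subject row also removes
# the only record of that teacher's degree and phone number.
remaining = [r for r in rows if r["teacher"] != "Van Sokhen"]
known_teachers = {r["teacher"] for r in remaining}
```

Here `conflicts` ends up holding Sok San with two phone numbers, and `known_teachers` no longer contains Van Sokhen.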
2
Normalization (Cont.)
Normalization is the process of removing redundant
data from your tables to improve storage
efficiency, data integrity, and scalability.
Normalization generally involves splitting existing
tables into multiple ones, which must be re-joined
or linked each time a query is issued.
3
Steps of Normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
In practice, 1NF, 2NF, and 3NF are enough for most databases.
4
First Normal Form (1NF)
The official qualifications for 1NF are:
•Each attribute name must be unique.
•Each attribute value must be single (atomic).
•Values stored in a column should be of the same domain.
5
First Normal Form (1NF) (Cont.)
Example of a table not in 1NF:
Group    Topic          Student  Score
Group A  Intro MongoDB  raju     18 marks
                        srinu    17 marks
Group B  Intro MySQL    smith    19 marks
                        martin   16 marks
It violates 1NF because:
•Attribute values are not single.
•Repeating groups exist.
6
First Normal Form (1NF) (Cont.)
After eliminating the repeating groups:
Group  Topic          Student Name  Score
A      Intro MongoDB  raju          18
A      Intro MongoDB  srinu         17
B      Intro MySQL    smith         19
B      Intro MySQL    martin        16
Now it is in 1NF.
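The step from the repeating-group table to the 1NF table can be sketched in a few lines. This is a minimal illustration, assuming the unnormalized data is held as a list of groups, each with a nested list of students; the function name `to_1nf` and the nested layout are assumptions made for this example.

```python
# Non-1NF data: each group carries a repeating list of (student, score) pairs.
unnormalized = [
    {"group": "A", "topic": "Intro MongoDB",
     "students": [("raju", "18 marks"), ("srinu", "17 marks")]},
    {"group": "B", "topic": "Intro MySQL",
     "students": [("smith", "19 marks"), ("martin", "16 marks")]},
]


def to_1nf(groups):
    """Emit one row per student so that every value is single (atomic)."""
    rows = []
    for g in groups:
        for name, score in g["students"]:
            # Keep only the number from a value such as "18 marks".
            rows.append((g["group"], g["topic"], name, int(score.split()[0])))
    return rows


flat = to_1nf(unnormalized)
```

Each element of `flat` corresponds to one row of the 1NF table, with the group and topic repeated on every row.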
7
Functional Dependencies
We say an attribute B is functionally dependent on another attribute A if any two records that have the same value for A must also have the same value for B. We write this as:
A → B (read as: A determines B, or B depends on A)
8
Functional Dependencies (cont.)
EmpNum  EmpEmail       EmpFname  EmpLname
123     jdoe@[Link]    John      Doe
456     psmith@[Link]  Peter     Smith
555     alee1@[Link]   Alan      Lee
633     pdoe@[Link]    Peter     Doe
787     alee2@[Link]   Alan      Lee
If EmpNum is the PK, then the FDs
EmpNum → EmpEmail, EmpFname, EmpLname
must exist.
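The definition of a functional dependency can be checked mechanically against sample rows. The sketch below is one possible way to do it, not a standard library routine: the function name `holds` is an assumption, and the email column is left out because its values are incomplete in the table above.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # Two rows with the same lhs values must agree on the rhs values.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"EmpNum": 123, "EmpFname": "John", "EmpLname": "Doe"},
    {"EmpNum": 456, "EmpFname": "Peter", "EmpLname": "Smith"},
    {"EmpNum": 555, "EmpFname": "Alan", "EmpLname": "Lee"},
    {"EmpNum": 633, "EmpFname": "Peter", "EmpLname": "Doe"},
    {"EmpNum": 787, "EmpFname": "Alan", "EmpLname": "Lee"},
]

pk_determines_names = holds(employees, ["EmpNum"], ["EmpFname", "EmpLname"])
lname_determines_fname = holds(employees, ["EmpLname"], ["EmpFname"])
```

In this data `pk_determines_names` is true, while `lname_determines_fname` is false because two employees share the last name Doe but have different first names. Note that a sample can only disprove a dependency; whether one truly holds is a rule about the data, not just about the rows seen so far.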
9
Functional Dependencies (cont.)
3 different ways you might see FDs depicted:
•EmpNum → EmpEmail, EmpFname, EmpLname
•EmpNum → EmpEmail
 EmpNum → EmpFname
 EmpNum → EmpLname
•[Diagram: the attributes EmpNum, EmpEmail, EmpFname, EmpLname shown in a row]
10
Determinant
Functional Dependency: EmpNum → EmpEmail
The attribute on the left-hand side is known as the determinant.
•EmpNum is a determinant of EmpEmail
11
Second Normal Form (2NF)
The official qualifications for 2NF are:
•A table is already in 1NF.
•All nonkey attributes are fully dependent on the primary key.
All partial dependencies are removed and placed in another table.
12
Example of a table not in 2NF:
CourseID  SemesterID  Num Student  Course Name
IT101     201301      25           Database
IT101     201302      25           Database
IT102     201301      30           Web Prog
IT102     201302      35           Web Prog
IT103     201401      20           Networking
Primary Key: {CourseID, SemesterID}
The Course Name depends only on CourseID, which is a part of the primary key, not on the whole primary key {CourseID, SemesterID}. This is called a partial dependency.
Solution:
13
Remove Course Name from the table and place it, together with CourseID, in a new table.
CourseID  SemesterID  Num Student
IT101     201301      25
IT101     201302      25
IT102     201301      30
IT102     201302      35
IT103     201401      20
THE TABLE IN 2NF
Finally, connect the relationship.
CourseID  Course Name
IT101     Database
IT102     Web Prog
IT103     Networking
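The two tables and the relationship between them can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions: the table names `course` and `enrollment` and the lowercase column names are invented for the example, and SemesterID is stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (
    course_id   TEXT PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE enrollment (
    course_id   TEXT NOT NULL REFERENCES course(course_id),
    semester_id TEXT NOT NULL,
    num_student INTEGER NOT NULL,
    PRIMARY KEY (course_id, semester_id)
);
""")

# Each course name is now stored exactly once.
conn.executemany("INSERT INTO course VALUES (?, ?)", [
    ("IT101", "Database"), ("IT102", "Web Prog"), ("IT103", "Networking"),
])
conn.executemany("INSERT INTO enrollment VALUES (?, ?, ?)", [
    ("IT101", "201301", 25), ("IT101", "201302", 25),
    ("IT102", "201301", 30), ("IT102", "201302", 35),
    ("IT103", "201401", 20),
])

# Re-join the two tables to get back the original four-column view.
joined = conn.execute("""
    SELECT e.course_id, e.semester_id, e.num_student, c.course_name
    FROM enrollment AS e
    JOIN course AS c ON c.course_id = e.course_id
    ORDER BY e.course_id, e.semester_id
""").fetchall()
```

The join is the cost of normalization mentioned earlier: the course name is no longer repeated, but a query that needs it has to link the two tables again.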
14
Third Normal Form (3NF)
The official qualifications for 3NF are:
•A table is already in 2NF.
•Nonprimary key attributes do not depend on other nonprimary key attributes (i.e. no transitive dependencies).
All transitive dependencies are removed and placed in another table.
15
Example of a table not in 3NF:
StudyID  Course Name  Teacher Name  Teacher Tel
1        Database     Sok Piseth    012 123 456
2        Database     Sao Kanha     0977 322 111
3        Web Prog     Chan Veasna   012 412 333
4        Web Prog     Chan Veasna   012 412 333
5        Networking   Pou Sambath   077 545 221
Primary Key: StudyID
The Teacher Tel is a nonkey attribute, and the Teacher Name is also a nonkey attribute. But Teacher Tel depends on Teacher Name.
This is called a transitive dependency.
Solution:
Remove Teacher Name and Teacher Tel
together to create a new table.
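A minimal sketch of this solution follows. It moves each distinct teacher into a separate table and leaves only a reference in the study table. The function name `split_teachers` and the generated IDs of the form T1, T2, ... are assumptions made for this illustration.

```python
# Rows of the table that is not in 3NF:
# (StudyID, Course Name, Teacher Name, Teacher Tel)
study = [
    (1, "Database", "Sok Piseth", "012 123 456"),
    (2, "Database", "Sao Kanha", "0977 322 111"),
    (3, "Web Prog", "Chan Veasna", "012 412 333"),
    (4, "Web Prog", "Chan Veasna", "012 412 333"),
    (5, "Networking", "Pou Sambath", "077 545 221"),
]


def split_teachers(rows):
    """Split rows into a study table and a teacher table without duplicates."""
    teacher_ids = {}  # (name, tel) -> generated teacher ID
    study_rows = []
    for study_id, course, name, tel in rows:
        key = (name, tel)
        if key not in teacher_ids:
            teacher_ids[key] = f"T{len(teacher_ids) + 1}"
        study_rows.append((study_id, course, teacher_ids[key]))
    teacher_rows = [(tid, name, tel) for (name, tel), tid in teacher_ids.items()]
    return study_rows, teacher_rows


study_rows, teacher_rows = split_teachers(study)
```

After the split, Chan Veasna's phone number is stored once in `teacher_rows`, and both Web Prog rows in `study_rows` refer to the same teacher ID.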
16
Teacher Name  Teacher Tel
Sok Piseth    012 123 456
Sao Kanha     0977 322 111
Chan Veasna   012 412 333
Chan Veasna   012 412 333
Pou Sambath   077 545 221
Done?
Oh no, it is still not in 1NF yet. Remove the repeating row.
Teacher Name  Teacher Tel
Sok Piseth    012 123 456
Sao Kanha     0977 322 111
Chan Veasna   012 412 333
Pou Sambath   077 545 221
StudyID  Course Name  [Link]
1        Database     T1
2        Database     T2
3        Web Prog     T3
4        Web Prog     T3
5        Networking   T4
ID  Teacher Name  Teacher Tel
T1  Sok Piseth    012 123 456
T2  Sao Kanha     0977 322 111
T3  Chan Veasna   012 412 333
T4  Pou Sambath   077 545 221
Note about primary key:
-In theory, you can choose Teacher Name to be a primary key.
-But in practice, you should add