Student copy
Introduction to Database Systems
L6 – Normalization
Sem2 24-25
Recap
For the Crow’s Foot symbols below, what are
their corresponding cardinalities?
If an entity occurrence requires the
occurrence of another entity, then their
relationship is said to be _____________.
This is known as relationship (participation /
strength)
If one entity is existence-independent of
another one, then their relationship is said to
be ___________.
This is known as relationship (participation /
strength)
Normalization
Is a process for evaluating and correcting
table structures to minimize data
redundancies, thereby reducing data anomalies
Involves assigning attributes to tables based
on the concept of determination
Works through a series of stages called
Normal Forms (NF), from first normal form
(1NF) to third normal form (3NF)
From the structural point of view, 2NF is better
than 1NF, and 3NF is better than 2NF
Normalization - Objectives
To ensure that each table conforms to the
concept of well-formed relations, i.e. each table
should have the following characteristics:
Each table represents a single subject
No data item will be unnecessarily stored in more
than one table
All non-prime attributes in a table are dependent on
the primary key, i.e. all data are uniquely identifiable
by the primary key
Each table has no insertion, update or deletion
anomalies
Functional Dependency
Attribute B is fully functionally dependent on
Attribute A if each value of A determines one
and only one value of B
Partial Dependency – only part of the primary
key is needed to determine the value of a
dependent attribute
E.g. suppose primary key (A, B) determines
attributes (C, D) but B also determines C; then
B → C is a partial dependency
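A minimal sketch of how a functional dependency can be tested against sample rows. The function name `holds` and the sample data are hypothetical, chosen to mirror the (A, B) example above. Note that sample data can only refute a dependency; it cannot prove that the dependency holds for all future data.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Each determinant value must map to one and only one dependent value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample data with primary key (A, B).
rows = [
    {"A": 1, "B": "x", "C": "c1", "D": 10},
    {"A": 2, "B": "x", "C": "c1", "D": 20},
    {"A": 1, "B": "y", "C": "c2", "D": 30},
]

print(holds(rows, ["A", "B"], ["C", "D"]))  # True: the key determines C and D
print(holds(rows, ["B"], ["C"]))            # True: partial dependency B -> C
print(holds(rows, ["A"], ["D"]))            # False: A alone does not determine D
```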
Transitive Dependency – when there are
functional dependencies such that X → Y and
Y → Z, and X is the primary key; then the
dependency X → Z is a transitive dependency
as X determines Z via Y
Although the actual transitive dependency is
X → Z, it exists only because of the
dependency between two non-prime attributes,
Y → Z
For simplicity, this functional dependency between
non-prime attributes is referred to as a transitive
dependency
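The distinction between full, partial and transitive dependencies can be sketched as a small classifier. This is an illustration only: the function name `classify` is hypothetical, and it assumes a non-empty determinant, a non-prime dependent attribute, and a table whose only candidate key is the primary key. It follows the simplified usage above, where a dependency between non-prime attributes is itself called transitive.

```python
def classify(primary_key, determinant):
    """Classify a dependency by comparing its determinant with the primary key."""
    pk, det = set(primary_key), set(determinant)
    if det == pk:
        return "full"        # the whole key is the determinant
    if det < pk:
        return "partial"     # only part of a composite key is needed
    if det.isdisjoint(pk):
        return "transitive"  # the determinant is made of non-prime attributes
    return "other"           # mixes key and non-key attributes


print(classify(["A", "B"], ["A", "B"]))  # full
print(classify(["A", "B"], ["B"]))       # partial
print(classify(["X"], ["Y"]))            # transitive
```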
Normalization Process
Given a report:
Unnormalized table
Turn the report into table form
Normalization Process - Convert to 1NF
Step 1 – Eliminate the Repeating Groups
Repeating groups – a group of multiple entries of
the same type for any single key attribute
To eliminate repeating groups, fill in the nulls with
appropriate data values
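A minimal sketch of this step, assuming the appropriate value for a null cell is the value carried down from the preceding row of the same group. The function name `fill_repeating_groups` and the sample values are hypothetical.

```python
def fill_repeating_groups(rows, columns):
    """Copy the last non-null value down into null cells of the given columns."""
    filled, last = [], {}
    for row in rows:
        new_row = dict(row)  # leave the input rows unchanged
        for col in columns:
            if new_row[col] is None:
                new_row[col] = last[col]  # raises KeyError if the first row is null
            else:
                last[col] = new_row[col]
        filled.append(new_row)
    return filled


# Hypothetical unnormalized rows: PROJ_NUM is given once per group.
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 103},
    {"PROJ_NUM": None, "EMP_NUM": 101},
    {"PROJ_NUM": 18, "EMP_NUM": 114},
    {"PROJ_NUM": None, "EMP_NUM": 118},
]

filled = fill_repeating_groups(rows, ["PROJ_NUM"])
print([r["PROJ_NUM"] for r in filled])  # [15, 15, 18, 18]
```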
Step 2 – Identify the Primary Key
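One way to test a candidate primary key is to check that its values are unique across the rows. The sketch below assumes the composite key is (PROJ_NUM, EMP_NUM); the function name `is_unique_key` and the sample values are hypothetical. Uniqueness in sample data is necessary but not sufficient, so the final choice still depends on the meaning of the data.

```python
def is_unique_key(rows, key):
    """Return True if no two rows share the same values for the key attributes."""
    values = [tuple(row[a] for a in key) for row in rows]
    return len(values) == len(set(values))


# Hypothetical rows: employee 103 works on two projects.
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 103},
    {"PROJ_NUM": 15, "EMP_NUM": 101},
    {"PROJ_NUM": 18, "EMP_NUM": 114},
    {"PROJ_NUM": 18, "EMP_NUM": 103},
]

print(is_unique_key(rows, ["PROJ_NUM"]))             # False
print(is_unique_key(rows, ["EMP_NUM"]))              # False
print(is_unique_key(rows, ["PROJ_NUM", "EMP_NUM"]))  # True
```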
Step 3 – Identify all Dependencies
1NF table:
All of the key attributes are defined
There are no repeating groups in the table – each
row/column intersection contains one and only
one value, not a set of values
All attributes are dependent on the primary key
Normalization Process - Convert to 2NF
Step 1 – Make new tables to eliminate partial
dependencies
For each component of the primary key that acts as
a determinant in a partial dependency, create a new
table with a copy of that component as the primary
key; i.e. the component will become the primary key
in a new table
E.g. in fig 6.3, PROJ_NUM and EMP_NUM will
become primary keys in new tables; and the original
table will be divided into 3 tables, namely
PROJECT, EMPLOYEE, and ASSIGNMENT
Step 2 – Reassign corresponding dependent
attributes
The attributes that are dependent in a partial
dependency are removed from the original table
and placed in the new table with their determinant
Attributes that are not dependent in a partial
dependency will remain in the original table
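The two 2NF steps can be sketched as projections of the 1NF table. This is an illustration under stated assumptions: the helper `project`, the attributes PROJ_NAME, EMP_NAME and HOURS, and all sample values are hypothetical; only PROJ_NUM, EMP_NUM, JOB_CLASS, CHG_HOUR and the three table names come from the slides.

```python
def project(rows, columns):
    """Project rows onto columns and drop duplicate tuples, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[c] for c in columns)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(columns, t)))
    return out


# Hypothetical 1NF table with primary key (PROJ_NUM, EMP_NUM).
columns = ["PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
           "JOB_CLASS", "CHG_HOUR", "HOURS"]
data = [
    (15, "Alpha", 103, "Ann", "Engineer", 80.0, 20.0),
    (15, "Alpha", 101, "Bob", "Analyst", 60.0, 10.0),
    (18, "Beta", 103, "Ann", "Engineer", 80.0, 5.0),
]
table_1nf = [dict(zip(columns, row)) for row in data]

# Each key component that is a determinant gets its own table,
# together with the attributes that depend on it.
PROJECT = project(table_1nf, ["PROJ_NUM", "PROJ_NAME"])
EMPLOYEE = project(table_1nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"])
# Attributes that depend on the whole key stay with the composite key.
ASSIGNMENT = project(table_1nf, ["PROJ_NUM", "EMP_NUM", "HOURS"])

print(len(PROJECT), len(EMPLOYEE), len(ASSIGNMENT))  # 2 2 3
```

Project and employee details are now stored once each, instead of once per assignment.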
2NF table:
It is in 1NF, and
It includes no partial dependencies
It is possible that a 2NF table still has transitive
dependencies
Normalization Process - Convert to 3NF
Step 1 – Make new tables to eliminate
transitive dependencies
For each transitive dependency, make a copy of
its determinant as a primary key for a new table;
E.g. JOB_CLASS → CHG_HOUR
Hence a new table JOB is created with primary key
JOB_CLASS
Step 2 – Reassign corresponding dependent
attributes
Place the dependent attributes in the new tables
and remove them from their original tables
E.g. CHG_HOUR is removed from the
EMPLOYEE table and placed in the JOB table
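A minimal sketch of the 3NF steps for the transitive dependency JOB_CLASS → CHG_HOUR. The helper `project`, the attribute EMP_NAME and the sample values are hypothetical. JOB_CLASS stays in EMPLOYEE so that each employee can still be matched to a row in JOB.

```python
def project(rows, columns):
    """Project rows onto columns and drop duplicate tuples, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[c] for c in columns)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(columns, t)))
    return out


# Hypothetical 2NF EMPLOYEE table: CHG_HOUR depends on JOB_CLASS, not on EMP_NUM directly.
employee_2nf = [
    {"EMP_NUM": 101, "EMP_NAME": "Bob", "JOB_CLASS": "Analyst", "CHG_HOUR": 60.0},
    {"EMP_NUM": 103, "EMP_NAME": "Ann", "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0},
    {"EMP_NUM": 105, "EMP_NAME": "Cho", "JOB_CLASS": "Engineer", "CHG_HOUR": 80.0},
]

# New table keyed by the determinant, holding the dependent attribute.
JOB = project(employee_2nf, ["JOB_CLASS", "CHG_HOUR"])
# The original table drops CHG_HOUR but keeps JOB_CLASS.
EMPLOYEE = project(employee_2nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS"])

print(JOB)
# [{'JOB_CLASS': 'Analyst', 'CHG_HOUR': 60.0}, {'JOB_CLASS': 'Engineer', 'CHG_HOUR': 80.0}]
```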
3NF table:
It is in 2NF, and
It contains no transitive dependencies
Discussion
Is the following table in 3NF?
Boyce-Codd Normal Form (BCNF)
BCNF – where every determinant in the table
is a candidate key
E.g. a table has two candidate keys – A+B,
and A+C
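The BCNF condition can be checked mechanically with attribute closures. The sketch below is an illustration under assumptions: the function names are hypothetical, and the dependency C → B is an assumed example that is not stated above (only the candidate keys A+B and A+C are). The test used is that every determinant of a nontrivial dependency must determine all attributes of the table, i.e. be a superkey.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the nontrivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Candidate keys A+B and A+C, plus the assumed dependency C -> B.
fds = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("C",), ("B",)),
]

print(bcnf_violations(["A", "B", "C", "D"], fds))  # [(('C',), ('B',))]
```

Under these assumptions the table is not in BCNF, because C is a determinant but not a candidate key.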
Denormalization
When all tables are normalized to 3NF, the
number of tables in the database (increases /
decreases).
Data redundancy problems are minimized
Generating information from tables may
require joining many tables, which leads to
longer processing time
Denormalized relations – introduce a small
amount of redundant data in the model
3 cases:
1. Redundant data – e.g. suppose CUS_ZIP determines
CUS_CITY, then the following table is in (1NF / 2NF /
3NF)
Should it be normalized to 3NF?
CUS_CODE CUS_STREET CUS_CITY CUS_ZIP CUS_PHONE
2. Derived data – e.g. TOTAL_LINE_PRICE is a derived
attribute, i.e. PROD_PRICE and PROD_QTY
determine TOTAL_LINE_PRICE
Should it be normalized to 3NF?
INV_NUM LINE_NUM PROD_CODE PROD_PRICE PROD_QTY TOTAL_LINE_PRICE
TOTAL_LINE_PRICE = PROD_PRICE * PROD_QTY
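One way to weigh this question is to compare storing the derived value with computing it when needed. The sketch below uses Python's built-in `sqlite3` module and computes TOTAL_LINE_PRICE in the query instead of storing it. The table name `LINE` and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE LINE (
        INV_NUM    INTEGER,
        LINE_NUM   INTEGER,
        PROD_CODE  TEXT,
        PROD_PRICE REAL,
        PROD_QTY   INTEGER,
        PRIMARY KEY (INV_NUM, LINE_NUM)
    )
""")
conn.executemany(
    "INSERT INTO LINE VALUES (?, ?, ?, ?, ?)",
    [(1001, 1, "P-01", 10.0, 3), (1001, 2, "P-02", 2.5, 4)],
)

# The derived attribute is calculated on demand, so it can never
# disagree with PROD_PRICE and PROD_QTY.
totals = conn.execute("""
    SELECT INV_NUM, LINE_NUM, PROD_PRICE * PROD_QTY AS TOTAL_LINE_PRICE
    FROM LINE
    ORDER BY INV_NUM, LINE_NUM
""").fetchall()

print(totals)  # [(1001, 1, 30.0), (1001, 2, 10.0)]
```

Storing TOTAL_LINE_PRICE instead avoids the calculation in every query, at the cost of keeping the stored value consistent whenever the price or quantity changes.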
3. Information requirements – using a temporary
denormalized table to hold report data
This is fine as long as the table is used only for
reporting purposes, because then there are no data
anomaly problems.