FUNCTIONAL DEPENDENCY
A functional dependency is a constraint between two attributes or two sets of
attributes.
For any relation R, attribute B is functionally dependent on attribute A if each
value of A determines exactly one value of B. The functional dependency of B on
A is represented by an arrow, as follows.
A -> B
The following figure shows the functional dependency.
Emp_Id -> Name, Salary, Deptno
Emp_Id is the only determinant (the attribute on the left-hand side of the
arrow in a functional dependency) in this relation. All of the other attributes
are functionally dependent on Emp_Id.
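The definition above can be tested against sample data. The following is a
minimal sketch, not a definitive implementation: the helper name holds and the
sample rows are assumptions made for illustration (the notes give no data for
this relation). Note that a check like this can only show that a dependency is
violated by the data at hand; the dependency itself is a rule about every valid
state of the relation.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # A repeated determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows, invented for this sketch.
employees = [
    {"Emp_Id": 100, "Name": "Simpson", "Salary": 20000, "Deptno": 10},
    {"Emp_Id": 101, "Name": "Blake", "Salary": 10000, "Deptno": 20},
    {"Emp_Id": 102, "Name": "Chris", "Salary": 10000, "Deptno": 10},
]

# Emp_Id determines every other attribute in this sample.
print(holds(employees, ["Emp_Id"], ["Name", "Salary", "Deptno"]))
# Deptno does not determine Salary: department 10 has two different salaries.
print(holds(employees, ["Deptno"], ["Salary"]))
```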
NORMALIZATION
Normalization is the process of decomposing relations with anomalies to
produce smaller, well-structured relations.
Normalization can be accomplished and understood in stages, each of
which corresponds to a normal form. A normal form is a state of a relation that
results from applying simple rules regarding functional dependencies (or
relationships between attributes) to the relation.
We describe these rules in detail in the following sections.
1. First Normal Form : Any multi-valued attributes have been removed.
2. Second Normal Form : Any partial functional dependencies have been removed.
3. Third Normal Form : Any transitive dependencies have been removed.
4. Boyce/Codd Normal Form : Any remaining anomalies that result from functional
dependencies have been removed.
5. Fourth Normal Form : Any multi-valued dependencies have been removed.
6. Fifth Normal Form : Any remaining anomalies have been removed.
The following figure shows the steps in normalization.
First Normal Form: -
A relation that contains no multi-valued attributes.
The following table contains repeating groups.
a) Table with repeating groups:
Emp_Id  Name     Salary  Course_Title  Date_Completed
100     Simpson  20000   Oracle        2/2/04
                         C             10/2/04
101     Blake    10000   Accounts      11/2/04
102     Chris    12000   C++           13/2/04
                         Java          14/2/04
103     Davis    10000   NetWorks      20/2/04
The above table can be converted to the relation EMP by removing the
multi-valued attributes, that is, by repeating the employee data on every
course row. The following table illustrates the EMP relation.
b) EMP Relation:
Emp_Id  Name     Salary  Course_Title  Date_Completed
100     Simpson  20000   Oracle        2/2/04
100     Simpson  20000   C             10/2/04
101     Blake    10000   Accounts      11/2/04
102     Chris    12000   C++           13/2/04
102     Chris    12000   Java          14/2/04
103     Davis    10000   NetWorks      20/2/04
The above EMP relation is in first normal form because it contains no
multi-valued attributes.
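The conversion from table (a) to the EMP relation can be sketched in a few
lines. This is only an illustration under stated assumptions: the nested-list
layout and the function name to_first_normal_form are hypothetical choices for
representing the repeating groups, not something the notes prescribe.

```python
# Each employee carries a list of (Course_Title, Date_Completed) pairs,
# mirroring table (a) with its repeating groups.
unnormalized = [
    (100, "Simpson", 20000, [("Oracle", "2/2/04"), ("C", "10/2/04")]),
    (101, "Blake", 10000, [("Accounts", "11/2/04")]),
    (102, "Chris", 12000, [("C++", "13/2/04"), ("Java", "14/2/04")]),
    (103, "Davis", 10000, [("NetWorks", "20/2/04")]),
]


def to_first_normal_form(records):
    """Emit one flat row per (employee, course) pair."""
    return [
        (emp_id, name, salary, title, completed)
        for emp_id, name, salary, courses in records
        for title, completed in courses
    ]


emp = to_first_normal_form(unnormalized)
for row in emp:
    print(row)
```

Every cell of the result holds a single value, at the cost of repeating Name
and Salary for employees who completed more than one course.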
Second Normal Form: -
A relation in first normal form in which every non-key attribute is
fully functionally dependent on the primary key.
A relation that is in first normal form will be in second normal form if any
one of the following conditions applies:
1. The Primary Key consists of only one attribute.
2. Every non-key attribute is functionally dependent on the full set of
Primary Key attributes.
The following figure shows the functional dependencies in EMP.
Emp_Id, Course_Title -> Date_Completed
Emp_Id -> Name, Salary
The above relation is not in second normal form. The Primary Key for
this relation is the Composite Key (Emp_Id, Course_Title). The non-key
attributes Name and Salary are functionally dependent on part of the Primary
Key (Emp_Id) but not on Course_Title. These dependencies are shown in the
above figure.
A partial functional dependency is a functional dependency in which
one or more non-key attributes (Name, Salary) are functionally dependent on
part of the Primary Key (Emp_Id).
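The definition of a partial functional dependency translates directly into a
small check. The sketch below assumes the dependencies of EMP are written down
by hand as (determinant, dependents) pairs of sets; the function name
partial_dependencies is hypothetical.

```python
PRIMARY_KEY = {"Emp_Id", "Course_Title"}

# Functional dependencies of EMP as (determinant, dependents) pairs.
FDS = [
    ({"Emp_Id", "Course_Title"}, {"Date_Completed"}),
    ({"Emp_Id"}, {"Name", "Salary"}),
]


def partial_dependencies(key, fds):
    """Return the dependencies whose determinant is a proper subset of the key
    and whose dependents include at least one non-key attribute."""
    return [(lhs, rhs - key) for lhs, rhs in fds if lhs < key and rhs - key]


# Reports that Name and Salary depend on Emp_Id alone.
print(partial_dependencies(PRIMARY_KEY, FDS))
```

An empty result would mean that no partial dependency is present among the
listed dependencies.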
To convert the EMP relation to second normal form, we decompose it into new
relations that each satisfy one of the conditions described above. The EMP
relation is decomposed into the following EMPLOYEE and EMP_COURSE relations.
EMPLOYEE (Emp_Id, Name, Salary)
This relation satisfies condition 1 above, so it is in second normal form.
EMP_COURSE (Emp_Id, Course_Title, Date_Completed)
This relation satisfies condition 2 above, so it is also in second normal
form.
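The decomposition amounts to two projections of EMP with duplicate rows
removed. The following sketch uses the EMP data from table (b); the helper
project and the variable names are assumptions introduced for this example.

```python
EMP_COLUMNS = ("Emp_Id", "Name", "Salary", "Course_Title", "Date_Completed")
EMP = [
    (100, "Simpson", 20000, "Oracle", "2/2/04"),
    (100, "Simpson", 20000, "C", "10/2/04"),
    (101, "Blake", 10000, "Accounts", "11/2/04"),
    (102, "Chris", 12000, "C++", "13/2/04"),
    (102, "Chris", 12000, "Java", "14/2/04"),
    (103, "Davis", 10000, "NetWorks", "20/2/04"),
]


def project(rows, columns, keep):
    """Project rows onto the columns named in keep, dropping duplicate rows."""
    positions = [columns.index(c) for c in keep]
    return sorted({tuple(row[i] for i in positions) for row in rows})


employee = project(EMP, EMP_COLUMNS, ("Emp_Id", "Name", "Salary"))
emp_course = project(EMP, EMP_COLUMNS, ("Emp_Id", "Course_Title", "Date_Completed"))

print(employee)    # one row per employee
print(emp_course)  # one row per completed course
```

Name and Salary are now stored once per employee instead of once per course.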
Third Normal Form: -
A relation that is in second normal form and has no transitive
dependencies.
A transitive dependency in a relation is a functional dependency between
two (or more) non-key attributes. For example, consider the following relation.
SALES (Cust_Id, Name, Sales_Person, Region)
Sample data for this relation appears in the following figure.
SALES
Cust_Id  Name      Sales_Person  Region
1        Srinivas  Smith         South
2        Jones     Williams      West
3        Ramu      Smith         South
4        Vasu      Eagle         East
5        Kishore   Williams      West
6        Ford      Naren         North
The functional dependencies in the SALES relation are shown in the following
figure.
SALES
Cust_Id -> Name, Sales_Person, Region
Sales_Person -> Region
Cust_Id is the Primary Key, so all of the remaining attributes are
functionally dependent on it. However, Region is functionally dependent on
Sales_Person, and Sales_Person is functionally dependent on Cust_Id. This is a
transitive dependency.
The following are update anomalies in SALES.
1. Insertion Anomaly: - A new sales person, Robinson, assigned to the
North region cannot be entered until a customer has been assigned to him.
2. Deletion Anomaly: - If customer number 4 is deleted from the table,
we lose the information that Sales_Person Eagle is assigned to the East
region.
3. Modification Anomaly: - If sales person Smith is reassigned to the
East region, several rows must be changed.
These anomalies arise as a result of the transitive dependency. The
transitive dependency can be removed by decomposing SALES into two
relations. The two relations, SALES1 and SPERSON, are shown in the following
figure.
SALES1
Cust_Id  Name      Sales_Person
1        Srinivas  Smith
2        Jones     Williams
3        Ramu      Smith
4        Vasu      Eagle
5        Kishore   Williams
6        Ford      Naren

SPERSON
Sales_Person  Region
Smith         South
Williams      West
Eagle         East
Naren         North
Note that Sales_Person, which is the determinant in the transitive
dependency in SALES, becomes the Primary Key in SPERSON. Sales_Person
becomes a foreign key in SALES1.
The following figure shows that the new relations are in third normal form,
since no transitive dependencies exist. You should verify that the anomalies
that exist in SALES are not present in SALES1 and SPERSON.
SPERSON
Sales_Person -> Region
SALES1
Cust_Id -> Name, Sales_Person
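The verification suggested above can be sketched in code. This is an
illustrative sketch only: storing SPERSON as a dictionary keyed by Sales_Person
is an assumption of the example, chosen because Sales_Person is its Primary
Key.

```python
SALES = [
    (1, "Srinivas", "Smith", "South"),
    (2, "Jones", "Williams", "West"),
    (3, "Ramu", "Smith", "South"),
    (4, "Vasu", "Eagle", "East"),
    (5, "Kishore", "Williams", "West"),
    (6, "Ford", "Naren", "North"),
]

# Decompose: SALES1 keeps the customer data, SPERSON maps each sales person
# to exactly one region.
sales1 = sorted({(cust, name, person) for cust, name, person, _ in SALES})
sperson = {person: region for _, _, person, region in SALES}

# Joining the two relations on Sales_Person gives back the original rows.
rejoined = sorted((cust, name, person, sperson[person]) for cust, name, person in sales1)
print(rejoined == sorted(SALES))

# Modification: reassigning Smith now changes a single SPERSON entry.
sperson["Smith"] = "East"
# Insertion: Robinson can be recorded before any customer is assigned.
sperson["Robinson"] = "North"
# Deletion: removing customer 4 from SALES1 leaves Eagle's region in SPERSON.
sales1 = [row for row in sales1 if row[0] != 4]
print(sperson["Eagle"])
```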
Boyce-Codd Normal Form: -
A relation in which every determinant is a candidate key.
When a relation has more than one candidate key, anomalies may result
even though the relation is in 3NF. For example, consider the STUDENT_ADVISOR
relation shown in figure (a). This relation has the attributes Student_Id,
Major, Advisor and Maj_GPA.
Figure (a) Relation in 3NF, but not in BCNF.
STUDENT_ADVISOR
Student_Id  Major      Advisor  Maj_GPA
1           Physics    Prasad   4.0
1           Computers  Kumar    3.3
2           Maths      Venkat   3.2
3           Computers  Chandu   3.7
4           Physics    Prasad   3.5
Figure (b) Functional dependencies in STUDENT_ADVISOR
Student_Id, Major -> Advisor, Maj_GPA
Advisor -> Major
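In figure (b), the Primary Key for this relation is the composite key
consisting of Student_Id and Major. Thus, the two attributes Advisor and
Maj_GPA are functionally dependent on this key. Major is also functionally
dependent on Advisor. Because Major is part of the Primary Key rather than a
non-key attribute, this is not a transitive dependency, so the relation is in
3NF. However, Advisor is a determinant that is not a candidate key, so the
relation is not in BCNF.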
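Whether a determinant can act as a key can be checked by computing its
attribute closure, the set of attributes it determines directly or indirectly.
The sketch below uses the two dependencies of STUDENT_ADVISOR; the function
names closure and bcnf_violations are hypothetical, and the check uses the
common test that a determinant must determine every attribute of the relation.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its whole determinant is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ALL_ATTRS = {"Student_Id", "Major", "Advisor", "Maj_GPA"}
FDS = [
    (frozenset({"Student_Id", "Major"}), frozenset({"Advisor", "Maj_GPA"})),
    (frozenset({"Advisor"}), frozenset({"Major"})),
]


def bcnf_violations(all_attrs, fds):
    """List the determinants whose closure does not cover every attribute."""
    return [set(lhs) for lhs, _ in fds if closure(lhs, fds) != all_attrs]


# Advisor determines only itself and Major, so it cannot serve as a key.
print(closure({"Advisor"}, FDS))
print(bcnf_violations(ALL_ATTRS, FDS))
```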
Anomalies in STUDENT_ADVISOR
1. Insertion Anomaly: - Suppose we want to insert a row with the information
that Krishna advises in computer science. It cannot be done until at least
one student majoring in computer science is assigned Krishna as an advisor.
2. Update Anomaly: - Suppose that in physics the advisor Prasad is
replaced by Srinivas. This change must be made in two rows in the table.
3. Deletion Anomaly: - If student number 3 withdraws from college, we
lose the information that Chandu advises in computers.
STUDENT_ADVISOR is in 3NF but not in BCNF. The relation is converted
to BCNF using a simple two-step process.
In the first step the relation is revised so that the determinant that is not
a candidate key becomes part of the Primary Key. The revised STUDENT_ADVISOR
relation is shown in the following figure.
STUDENT_ADVISOR (Student_Id, Advisor, Major, Maj_GPA)
The determinant Advisor becomes part of the composite Primary Key
(Student_Id, Advisor). The attribute Major, which is functionally dependent on
Advisor, becomes a non-key attribute.
You will discover that the new relation has a partial functional
dependency: Major depends on Advisor, which is only part of the Primary Key.
So, the relation is only in first normal form.
The second step in the process is to decompose the relation to eliminate the
partial functional dependency. The result is the following two relations.
STUDENT (Student_Id, Advisor, Maj_GPA)
ADVISOR (Advisor, Major)
These relations are in 3NF and also in BCNF, since there is only one
candidate key (the primary key) in each relation.
The two relations, named STUDENT and ADVISOR, are shown with sample data in
the following figure. You should verify that these relations are free of the
anomalies.
STUDENT
Student_Id  Advisor  Maj_GPA
1           Prasad   4.0
1           Kumar    3.3
2           Venkat   3.2
3           Chandu   3.7
4           Prasad   3.5

ADVISOR
Advisor  Major
Prasad   Physics
Kumar    Computers
Venkat   Maths
Chandu   Computers
Fourth Normal Form: -
A relation in BCNF that contains no multi-valued dependencies.
When a relation is in BCNF, there are no longer any anomalies that result
from functional dependencies. However, there may still be anomalies that result
from multi-valued dependencies, which are defined below.
For example, consider the table shown in the following figure.
OFFERING
Course      Instructor  Textbook
Management  Mohan       Dennis
            Krishna     Ritchie
            Rama rao
Finance     Rajiv       Ching
                        Chang
(a) Table of Courses, Instructors and Textbooks

OFFERING
Course      Instructor  Textbook
Management  Mohan       Dennis
Management  Mohan       Ritchie
Management  Krishna     Dennis
Management  Krishna     Ritchie
Management  Rama rao    Dennis
Management  Rama rao    Ritchie
Finance     Rajiv       Ching
Finance     Rajiv       Chang
(b) Relation in BCNF
Table (a) shows, for each course, the instructors who teach that course and
the textbooks that are used.
In figure (b) the table has been converted to a relation by filling in all of
the empty cells, so OFFERING is in first normal form. The Primary Key of this
relation consists of all three attributes: Course, Instructor and Textbook.
Since there are no determinants other than the Primary Key, the relation
OFFERING is in BCNF.
The relation nevertheless contains much redundant data. For example, suppose
that we want to add a third textbook (Robinson) to the Management course. This
change would require the addition of three new rows to the relation in
figure (b), one for each instructor of the course.
The type of dependency shown in this example is called a multi-valued
dependency: there are at least three attributes A, B and C in a relation, and
for each value of A there is a well-defined set of values of B and a
well-defined set of values of C. However, the set of values of B is independent
of the set of values of C, and vice versa.
To remove the multi-valued dependency from OFFERING, we divide the relation
into two new relations, TEACHER and TEXT. These two relations are shown in the
following figure.
TEACHER
Course      Instructor
Management  Mohan
Management  Krishna
Management  Rama rao
Finance     Rajiv

TEXT
Course      Textbook
Management  Dennis
Management  Ritchie
Finance     Ching
Finance     Chang
TEACHER contains the Course and Instructor attributes, since for each course
there is a well-defined set of instructors. TEXT contains the attributes
Course and Textbook. However, there is no relation containing the attributes
Instructor and Textbook, since these attributes are independent.
A relation is in fourth normal form if it is in BCNF and contains no
multi-valued dependencies. You can easily verify that TEACHER and TEXT
are in 4NF. You can also verify that joining them reconstructs the original
relation OFFERING.
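The reconstruction can be checked with a short sketch. The set-of-tuples
representation and the helper name join_on_course are assumptions made for
this example.

```python
TEACHER = {
    ("Management", "Mohan"),
    ("Management", "Krishna"),
    ("Management", "Rama rao"),
    ("Finance", "Rajiv"),
}
TEXT = {
    ("Management", "Dennis"),
    ("Management", "Ritchie"),
    ("Finance", "Ching"),
    ("Finance", "Chang"),
}


def join_on_course(teacher, text):
    """Pair every instructor of a course with every textbook of that course."""
    return {
        (course, instructor, textbook)
        for course, instructor in teacher
        for text_course, textbook in text
        if course == text_course
    }


offering = join_on_course(TEACHER, TEXT)
print(len(offering))  # the eight rows of OFFERING in figure (b)

# Adding the Robinson textbook is one new row in TEXT, yet it stands for
# three rows of OFFERING, one per Management instructor.
with_robinson = join_on_course(TEACHER, TEXT | {("Management", "Robinson")})
print(len(with_robinson) - len(offering))
```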
Fifth Normal Form: -
A relation R is in fifth normal form if and only if every join
dependency in R is implied by the candidate keys of R. This is also
called project-join normal form.
To explain this normal form, let us consider the SPJ relation.
SPJ(S#,P#,J#)
S# P# J#
S1 P1 J2
S1 P2 J1
S2 P1 J1
S1 P1 J1
The SPJ relation is projected into the SP, PJ and JS projections.
SP(S#,P#)    PJ(P#,J#)    JS(J#,S#)
S#  P#       P#  J#       J#  S#
S1  P1       P1  J2       J2  S1
S1  P2       P2  J1       J1  S1
S2  P1       P1  J1       J1  S2
Combine SP and PJ by joining them over P#. The result is shown below.
Join of SP and PJ over P#
S#  P#  J#
S1  P1  J2
S1  P1  J1
S1  P2  J1
S2  P1  J2
S2  P1  J1
The row (S2, P1, J2) does not appear in the original SPJ relation.
The resulting relation is then joined with JS over (J#,S#). Only rows whose
(J#,S#) combination appears in JS are kept, so the row (S2, P1, J2) is
eliminated and the result is the original SPJ.
Join with JS over (J#,S#)
S#  P#  J#
S1  P1  J2
S1  P1  J1
S1  P2  J1
S2  P1  J1
In summary, the SPJ table is split into three projections, and joining those
projections over their common attributes reconstructs the original relation.
The above diagrams illustrate the project-join operations.
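The project-join steps can be replayed in code. This is a sketch under the
assumption that each relation is held as a set of tuples; the variable names
sp_pj, spurious and result are introduced only for this example.

```python
SPJ = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three binary projections of SPJ.
SP = {(s, p) for s, p, j in SPJ}
PJ = {(p, j) for s, p, j in SPJ}
JS = {(j, s) for s, p, j in SPJ}

# Step 1: join SP and PJ over P#.
sp_pj = {(s, p, j) for s, p in SP for p2, j in PJ if p == p2}
spurious = sp_pj - SPJ
print(sorted(spurious))  # rows produced by the join that SPJ never contained

# Step 2: join with JS over (J#, S#), keeping rows whose pair appears in JS.
result = {(s, p, j) for s, p, j in sp_pj if (j, s) in JS}
print(result == SPJ)
```

Joining only two of the projections produces an extra row, while joining all
three recovers SPJ exactly for this sample data.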