Unit 2
The Relational Data Model and
Relational Database Constraints
Relational Algebra
Copyright © 2004 Pearson Education, Inc.
Chapter Outline
Relational Model Concepts
Relational Model Constraints and Relational Database
Schemas
Update Operations and Dealing with Constraint
Violations
Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 5-2
Copyright © 2004 Ramez Elmasri and Shamkant Navathe
Relational Model Concepts
The relational Model of Data is based on the
concept of a Relation.
A Relation is a mathematical concept based on the
ideas of sets.
The strength of the relational approach to data
management comes from the formal foundation
provided by the theory of relations.
We review the essentials of the relational approach
in this chapter.
Relational Model Concepts
The model was first proposed by Dr. E.F. Codd of
IBM in 1970 in the following paper:
"A Relational Model of Data for Large Shared Data
Banks," Communications of the ACM, June 1970.
The above paper caused a major revolution in the field of
Database management and earned Ted Codd the coveted
ACM Turing Award.
INFORMAL DEFINITIONS
RELATION: A table of values
– A relation may be thought of as a set of rows.
– A relation may alternately be thought of as a set of columns.
– Each row represents a fact that corresponds to a real-world entity or
relationship.
– Each row has a value of an item or set of items that uniquely
identifies that row in the table.
– Sometimes row-ids or sequential numbers are assigned to identify the
rows in the table.
– Each column typically is called by its column name or column header
or attribute name.
FORMAL DEFINITIONS
A Relation may be defined in multiple ways.
The Schema of a Relation: R(A1, A2, ..., An)
Relation schema R is defined over attributes A1, A2, ..., An.
For example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
Here, CUSTOMER is a relation defined over the four
attributes Cust-id, Cust-name, Address, Phone#, each of
which has a domain, or set of valid values. For example,
the domain of Cust-id is six-digit numbers.
FORMAL DEFINITIONS
A tuple is an ordered set of values
Each value is derived from an appropriate domain.
Each row in the CUSTOMER table may be referred to as a
tuple in the table and would consist of four values.
<632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
is a tuple belonging to the CUSTOMER relation.
A relation may be regarded as a set of tuples (rows).
Columns in a table are also called attributes of the relation.
FORMAL DEFINITIONS
A domain has a logical definition: e.g.,
“USA_phone_numbers” is the set of 10-digit phone
numbers valid in the U.S.
A domain may have a data-type or a format defined for it.
The USA_phone_numbers domain may have the format (ddd)-ddd-dddd,
where each d is a decimal digit. Dates, for example, have
various formats such as monthname date, year; yyyy-mm-dd;
or dd mm, yyyy.
An attribute designates the role played by the domain. E.g.,
the domain Date may be used to define the attributes
“Invoice-date” and “Payment-date”.
FORMAL DEFINITIONS
The relation is formed over the cartesian product of the sets;
each set has values from a domain; that domain is used in a
specific role which is conveyed by the attribute name.
For example, attribute Cust-name is defined over the domain
of strings of 25 characters. The role these strings play in the
CUSTOMER relation is that of the name of customers.
Formally,
Given R(A1, A2, ..., An)
r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
R: schema of the relation
r of R: a specific "value" or population of R.
R is also called the intension of a relation
r is also called the extension of a relation
FORMAL DEFINITIONS
Let S1 = {0,1}
Let S2 = {a,b,c}
Let R ⊆ S1 × S2
Then for example: r(R) = {<0,a> , <0,b> , <1,c> }
is one possible “state” or “population” or
“extension” r of the relation R, defined over domains
S1 and S2. It has three tuples.
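The example above can be checked mechanically. The following is a minimal Python sketch, assuming tuples are modeled as Python tuples and a relation state as a Python set; S1, S2, and r mirror the slide, while `cartesian` and `is_valid_state` are illustrative names that do not come from the text.

```python
from itertools import product

S1 = {0, 1}
S2 = {"a", "b", "c"}

# All possible tuples over the two domains: S1 x S2 has 2 * 3 = 6 elements.
cartesian = set(product(S1, S2))

# One possible state (extension) of R: any subset of the Cartesian product.
r = {(0, "a"), (0, "b"), (1, "c")}

# r is a valid state exactly when it is a subset of S1 x S2.
is_valid_state = r <= cartesian
print(len(cartesian), len(r), is_valid_state)
```

Any other subset of the six possible tuples, including the empty set, would be an equally valid state of R.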
DEFINITION SUMMARY
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       Extension
Example - Figure 5.1
CHARACTERISTICS OF RELATIONS
Ordering of tuples in a relation r(R): The tuples are not
considered to be ordered, even though they appear to be in
the tabular form.
Ordering of attributes in a relation schema R (and of
values within each tuple): We will consider the attributes
in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to be
ordered.
(However, a more general alternative definition of
relation does not require this ordering.)
CHARACTERISTICS OF RELATIONS
Values in a tuple: All values are considered atomic
(indivisible). Multivalued and composite attributes are not
allowed (the flat relational model):
Multivalued attributes → represented by a separate relation
Composite attributes → represented by their component attributes
A special null value is used to represent values that are
unknown or inapplicable to certain tuples.
NULL can mean: not applicable,
value exists but is unknown, or
not known whether the value exists.
CHARACTERISTICS OF RELATIONS
Notation:
- We refer to component values of a tuple t
by t[Ai] = vi (the value of attribute Ai for
tuple t).
Similarly, t[Au, Av, ..., Aw] refers to the
subtuple of t containing the values of
attributes Au, Av, ..., Aw, respectively.
CHARACTERISTICS OF RELATIONS-
Figure 5.2
Relational Model
Constraints
Constraints are restrictions on data that can be specified on a
relational database
Three main categories:
1) Inherent Model based / Implicit Constraints
2) Schema based / Explicit Constraints
3) Application based / Semantic Constraints / Business Rules
Schema based / Explicit Constraints
1) Domain Constraints – Each attribute value must be
atomic and drawn from a domain - Datatype
2) Key Constraints – Super, Candidate & Primary Keys
3) Constraints on NULLs – Some attributes can’t be
NULL
4) Entity Integrity Constraints – Primary key value
cannot be NULL
5) Referential Integrity Constraints – A tuple in one
relation that refers to another relation must refer to an
existing tuple in that relation.
Relational Integrity Constraints
Constraints are conditions that must hold
on all valid relation instances. There are
three main types of constraints:
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
Key Constraints
Superkey of R: A set of attributes SK of R such that no
two tuples in any valid relation instance r(R) will have
the same value for SK. That is, for any distinct tuples t1
and t2 in r(R), t1[SK] ≠ t2[SK].
Key of R: A "minimal" superkey; that is, a superkey K
such that removal of any attribute from K results in a set
of attributes that is not a superkey.
Example: The CAR relation schema:
CAR(State, Reg#, SerialNo, Make, Model, Year)
has two keys Key1 = {State, Reg#}, Key2 = {SerialNo}, which are also
superkeys. {SerialNo, Make} is a superkey but not a key.
If a relation has several candidate keys, one is chosen
arbitrarily to be the primary key. The primary key
attributes are underlined.
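The superkey and key definitions can be tested against a sample relation state. The sketch below is only an illustration: the CAR tuples are made up, `is_superkey` and `is_key` are hypothetical helper names, and a check on one state can refute a key but never prove it, since a key must hold in every valid state.

```python
def is_superkey(rows, attrs):
    """True if no two rows of this state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """A key is a superkey that stops being one when any attribute is removed."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != dropped])
        for dropped in attrs
    )

# Made-up CAR tuples, for illustration only.
cars = [
    {"State": "TX", "Reg#": "ABC-739", "SerialNo": "A69352", "Make": "Ford"},
    {"State": "FL", "Reg#": "ABC-739", "SerialNo": "B43696", "Make": "Oldsmobile"},
    {"State": "TX", "Reg#": "TVP-347", "SerialNo": "X83554", "Make": "Oldsmobile"},
]

print(is_key(cars, ["State", "Reg#"]))          # minimal: both attributes are needed
print(is_key(cars, ["SerialNo"]))               # minimal: a single attribute
print(is_superkey(cars, ["SerialNo", "Make"]))  # unique, but not minimal
print(is_key(cars, ["SerialNo", "Make"]))
```

Removing one attribute at a time is enough to test minimality, because any superset of a superkey is itself a superkey.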
Key Constraints
Figure 5.4
Entity Integrity
Relational Database Schema: A set S of relation schemas
that belong to the same database. S is the name of the
database.
S = {R1, R2, ..., Rn}
Entity Integrity: The primary key attributes PK of each
relation schema R in S cannot have null values in any tuple
of r(R). This is because primary key values are used to
identify the individual tuples.
t[PK] ≠ null for any tuple t in r(R)
Note: Other attributes of R may be similarly constrained
to disallow null values, even though they are not members
of the primary key.
Referential Integrity
A constraint involving two relations (the previous
constraints involve a single relation).
Used to specify a relationship among tuples in two
relations: the referencing relation and the referenced
relation.
Tuples in the referencing relation R1 have attributes FK
(called foreign key attributes) that reference the primary
key attributes PK of the referenced relation R2. A tuple t1
in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
A referential integrity constraint can be displayed in a
relational database schema as a directed arc from R1 to
R2.
Referential Integrity
Constraint
Statement of the constraint
The value in the foreign key column (or columns)
FK of the referencing relation R1 can be either:
(1) an existing value of the
corresponding primary key PK in the referenced
relation R2, or
(2) a null.
In case (2), the FK in R1 should not be a part of its own
primary key.
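The two allowed cases for a foreign key value can be sketched as a small checker. This is an illustrative Python sketch, not part of the original slides: the function name `fk_violations` and the sample tuples are assumptions, and a composite foreign key counts as null here only when all of its components are null.

```python
def fk_violations(referencing, fk, referenced, pk):
    """Return rows of `referencing` whose FK is neither null nor an existing PK value."""
    pk_values = {tuple(row[a] for a in pk) for row in referenced}
    bad = []
    for row in referencing:
        value = tuple(row[a] for a in fk)
        if all(v is None for v in value):
            continue  # case (2): a null foreign key is allowed
        if value not in pk_values:
            bad.append(row)  # not case (1): no matching tuple in the referenced relation
    return bad

# Made-up tuples, for illustration only.
department = [{"DNUMBER": 4}, {"DNUMBER": 5}]
employee = [
    {"SSN": "999887777", "DNO": 4},     # references an existing department
    {"SSN": "677678989", "DNO": 7},     # department 7 does not exist
    {"SSN": "123456789", "DNO": None},  # null foreign key
]

print(fk_violations(employee, ["DNO"], department, ["DNUMBER"]))
```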
Other Types of Constraints
Semantic Integrity Constraints:
- based on application semantics and cannot be
expressed by the model per se
- E.g., “the max. no. of hours per employee for all
projects he or she works on is 56 hrs per week”
- A constraint specification language may have
to be used to express these
- SQL-99 provides triggers and ASSERTIONS to
express some of these
Figure 5.5
Figure 5.6
Figure 5.7
Update Operations and Dealing with
Constraint Violations
The operations of the relational model can be
categorized into retrievals and updates.
There are three basic update operations:
– Insert
– Delete
– Modify
The Insert Operation
The insert operation provides a list of attribute
values for a new tuple t that is to be inserted
into a relation R.
Insert can violate any of the four types of
constraints: domain, key, entity integrity, and
referential integrity.
Example…
1. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, null, ‘1960-04-05’, ‘6357
Windy Lane, Katy, TX’, F, 28000, null, 4> into EMPLOYEE.
2. Insert <‘Alicia’, ‘J’, ‘Zelaya’, ‘999887777’, ‘1960-04-05’, ‘6357
Windy Lane, Katy, TX’, F, 28000, ‘987654321’, 4> into
EMPLOYEE.
3. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’,
‘6357 Windswept, Katy, TX’, F, 28000, ‘987654321’, 7> into
EMPLOYEE.
4. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’,
‘6357 Windy Lane, Katy, TX’, F, 28000, null, 4> into
EMPLOYEE.
The Delete Operation
The delete operation is used to delete tuples from a
relation.
E.g.:
Delete the WORKS_ON tuple with ESSN = ‘999887777’
and PNO = 10.
Delete the EMPLOYEE tuple with SSN = ‘999887777’.
Delete the EMPLOYEE tuple with SSN = ‘333445555’.
The Update Operation
The update (or Modify) operation is used to change the values of one or
more attributes in a tuple (or tuples) of some relation R.
E.g.:
Update the SALARY of the EMPLOYEE tuple with SSN = ‘999887777’
to 28000.
Update the DNO of the EMPLOYEE tuple with SSN = ‘999887777’ to 1.
Update the DNO of the EMPLOYEE tuple with SSN = ‘999887777’ to 7.
Update the SSN of the EMPLOYEE tuple with SSN = ‘999887777’ to
‘987654321’.
The Transaction Concept
A database application program running against a relational database
typically runs a series of transactions.
A transaction involves reading from the database as well as doing
insertions, deletions and updates to existing values in the database.
These transactions must leave the database in a consistent state.
A single transaction may involve any number of retrieval operations
that read from the database and any number of update operations.
A large number of commercial applications running against relational
databases in Online Transaction Processing (OLTP) systems
execute transactions at rates of several hundred per second.
Update Operations on Relations
INSERT a tuple.
DELETE a tuple.
MODIFY a tuple.
Integrity constraints should not be violated by the update
operations.
Several update operations may have to be grouped
together.
Updates may propagate to cause other updates
automatically. This may be necessary to maintain integrity
constraints.
Update Operations on Relations
In case of integrity violation, several actions can
be taken:
– Cancel the operation that causes the violation (REJECT
option)
– Perform the operation but inform the user of the
violation
– Trigger additional updates so the violation is corrected
(CASCADE option, SET NULL option)
– Execute a user-specified error-correction routine
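The REJECT, SET NULL, and CASCADE options can be observed in a real engine. The sketch below uses Python's standard sqlite3 module with an in-memory database. The three-table schema is a cut-down assumption loosely modeled on the COMPANY examples, not the textbook schema, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE employee ("
    " ssn TEXT PRIMARY KEY,"
    " dno INTEGER REFERENCES department(dnumber) ON DELETE SET NULL)"
)
conn.execute(
    "CREATE TABLE works_on ("
    " essn TEXT REFERENCES employee(ssn) ON DELETE CASCADE,"
    " pno INTEGER,"
    " PRIMARY KEY (essn, pno))"
)
conn.execute("INSERT INTO department VALUES (4)")
conn.execute("INSERT INTO employee VALUES ('999887777', 4)")
conn.execute("INSERT INTO works_on VALUES ('999887777', 10)")

# REJECT: an insert that references a nonexistent department is cancelled.
try:
    conn.execute("INSERT INTO employee VALUES ('677678989', 7)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# SET NULL: deleting department 4 sets the referencing dno to null.
conn.execute("DELETE FROM department WHERE dnumber = 4")
dno_after = conn.execute(
    "SELECT dno FROM employee WHERE ssn = '999887777'"
).fetchone()[0]

# CASCADE: deleting the employee also deletes the works_on tuples that reference it.
conn.execute("DELETE FROM employee WHERE ssn = '999887777'")
remaining = conn.execute("SELECT COUNT(*) FROM works_on").fetchone()[0]

print(rejected, dno_after, remaining)
```

Which option is appropriate depends on the meaning of the relationship: here a WORKS_ON tuple makes no sense without its employee, while an employee can exist without a department.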
In-Class Exercise
(Taken from Exercise 5.15)
Consider the following relations for a database that keeps
track of student enrollment in courses and the books adopted
for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign
keys for this schema.
The Relational Algebra
Chapter Outline
Example Database Application (COMPANY)
Relational Algebra
– Unary Relational Operations
– Relational Algebra Operations From Set Theory
– Binary Relational Operations
– Additional Relational Operations
– Examples of Queries in Relational Algebra
Database State for COMPANY
All examples discussed below refer to the COMPANY database shown here.
Relational Algebra
The basic set of operations for the relational model is known
as the relational algebra. These operations enable a user to
specify basic retrieval requests.
The result of a retrieval is a new relation, which may have
been formed from one or more relations. The algebra
operations thus produce new relations, which can be further
manipulated using operations of the same algebra.
A sequence of relational algebra operations forms a
relational algebra expression, whose result will also be a
relation that represents the result of a database query (or
retrieval request).
Relational Algebra
Operations
Categories of RA Operations:
RA operations from Mathematical Set Theory
Union, Intersection, Set Difference and Cartesian
Product
RA operations for databases
Select, Project, Join
Aggregate Functions
Unary Relational Operations
SELECT Operation
SELECT operation is used to select a subset of the tuples from a relation that
satisfy a selection condition. It is a filter that keeps only those tuples that
satisfy a qualifying condition – those satisfying the condition are selected
while others are discarded. It does a horizontal partitioning of the relation.
Example: To select the EMPLOYEE tuples whose department number is
four or those whose salary is greater than $30,000 the following notation is
used:
σ DNO = 4 (EMPLOYEE)
σ SALARY > 30,000 (EMPLOYEE)
In general, the select operation is denoted by σ<selection condition>(R), where the
symbol σ (sigma) is used to denote the select operator, and the selection
condition is a Boolean expression specified on the attributes of relation R.
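The SELECT operation amounts to a filter over tuples. Here is a minimal Python sketch, assuming a relation is modeled as a list of dictionaries and the selection condition as a Boolean function; the `select` helper and the EMPLOYEE tuples are made up for illustration.

```python
def select(relation, condition):
    """sigma<condition>(relation): keep only the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]

# Made-up EMPLOYEE tuples, for illustration only.
employee = [
    {"SSN": "1", "DNO": 4, "SALARY": 25000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 43000},
    {"SSN": "4", "DNO": 5, "SALARY": 25000},
]

dno_4 = select(employee, lambda t: t["DNO"] == 4)
high_paid = select(employee, lambda t: t["SALARY"] > 30000)

# Cascaded selections give the same result in either order.
both_a = select(select(employee, lambda t: t["DNO"] == 4),
                lambda t: t["SALARY"] > 30000)
both_b = select(select(employee, lambda t: t["SALARY"] > 30000),
                lambda t: t["DNO"] == 4)

print([t["SSN"] for t in dno_4], [t["SSN"] for t in high_paid], both_a == both_b)
```

The result always has the same attributes as the input; only the number of tuples can shrink.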
SELECT…contd.
Unary Relational Operations
SELECT Operation Properties
– The SELECT operation σ<selection condition>(R) produces a relation S
that has the same schema as R
– The SELECT operation is commutative; i.e.,
σ<condition1>(σ<condition2>(R)) = σ<condition2>(σ<condition1>(R))
– A cascaded SELECT operation may be applied in any order; i.e.,
σ<condition1>(σ<condition2>(σ<condition3>(R)))
= σ<condition2>(σ<condition3>(σ<condition1>(R)))
– A cascaded SELECT operation may be replaced by a single selection
with a conjunction of all the conditions; i.e.,
σ<condition1>(σ<condition2>(σ<condition3>(R)))
= σ<condition1> AND <condition2> AND <condition3>(R)
Unary Relational Operations (cont.)
PROJECT Operation
This operation selects certain columns from the table and discards the other
columns. PROJECT creates a vertical partitioning: one part with the needed
columns (attributes), which contains the result of the operation, and another
with the discarded columns.
Example: To list each employee’s first and last name and salary, the
following is used:
π LNAME, FNAME, SALARY (EMPLOYEE)
The general form of the project operation is π<attribute list>(R), where
π (pi) is the symbol used to represent the project operation and <attribute list>
is the desired list of attributes from the attributes of relation R.
The project operation removes any duplicate tuples, so the result of the
project operation is a set of tuples and hence a valid relation.
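Duplicate elimination is what separates PROJECT from simply dropping columns. The following Python sketch, with a hypothetical `project` helper and made-up tuples, shows the result shrinking when the attribute list contains no key and keeping its size when it does.

```python
def project(relation, attrs):
    """pi<attrs>(relation): keep only the listed attributes and drop duplicate tuples."""
    result = []
    seen = set()
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:  # a relation is a set, so each tuple appears once
            seen.add(key)
            result.append({a: t[a] for a in attrs})
    return result

# Made-up EMPLOYEE tuples, for illustration only; SSN plays the role of the key.
employee = [
    {"SSN": "1", "SEX": "M", "SALARY": 30000},
    {"SSN": "2", "SEX": "M", "SALARY": 30000},
    {"SSN": "3", "SEX": "F", "SALARY": 25000},
]

no_key = project(employee, ["SEX", "SALARY"])  # two tuples collapse into one
with_key = project(employee, ["SSN", "SEX"])   # the key keeps every tuple distinct

print(len(no_key), len(with_key))
```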
Unary Relational Operations (cont.)
PROJECT Operation Properties
– The number of tuples in the result of projection π<list>(R) is
always less than or equal to the number of tuples in R.
– If the list of attributes includes a key of R, then the number of
tuples is equal to the number of tuples in R.
– π<list1>(π<list2>(R)) = π<list1>(R) as long as <list2> contains
the attributes in <list1>.
Unary Relational Operations (cont.)
Rename Operation (cont.)
The rename operator is ρ (rho).
The general Rename operation can be expressed in any of the
following forms:
ρ S(B1, B2, …, Bn)(R) is a renamed relation S based on R with column names B1, B2,
…, Bn.
ρ S(R) is a renamed relation S based on R (which does not specify column names).
ρ (B1, B2, …, Bn)(R) is a renamed relation with column names B1, B2, …, Bn, which
does not specify a new relation name.
Unary Relational Operations (cont.)
Rename Operation
We may want to apply several relational algebra operations one after the
other. Either we can write the operations as a single relational algebra
expression by nesting the operations, or we can apply one operation at a time
and create intermediate result relations. In the latter case, we must give
names to the relations that hold the intermediate results.
Example: To retrieve the first name, last name, and salary of all employees
who work in department number 5, we must apply a select and a project
operation. We can write a single relational algebra expression as follows:
π FNAME, LNAME, SALARY (σ DNO=5 (EMPLOYEE))
OR we can explicitly show the sequence of operations, giving a name to each
intermediate relation:
DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)
Relational Algebra Operations From
Set Theory
UNION Operation
The result of this operation, denoted by R ∪ S, is a relation that includes all
tuples that are either in R or in S or in both R and S. Duplicate tuples are
eliminated.
Example: To retrieve the social security numbers of all employees who either
work in department 5 or directly supervise an employee who works in
department 5, we can use the union operation as follows:
DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT1 ← π SSN (DEP5_EMPS)
RESULT2(SSN) ← π SUPERSSN (DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
The union operation produces the tuples that are in either RESULT1 or
RESULT2 or both. The two operands must be “type compatible”.
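The four-step query above can be traced with Python sets. This is a sketch under stated assumptions: the EMPLOYEE tuples are made up, and each one-attribute result is modeled as a set of 1-tuples so that both operands are type compatible.

```python
# Made-up EMPLOYEE tuples, for illustration only.
employee = [
    {"SSN": "111", "SUPERSSN": "333", "DNO": 5},
    {"SSN": "222", "SUPERSSN": "333", "DNO": 5},
    {"SSN": "333", "SUPERSSN": "888", "DNO": 5},
    {"SSN": "444", "SUPERSSN": "888", "DNO": 4},
    {"SSN": "888", "SUPERSSN": None, "DNO": 1},
]

# DEP5_EMPS: select the employees of department 5.
dep5_emps = [t for t in employee if t["DNO"] == 5]

# RESULT1: project on SSN.
result1 = {(t["SSN"],) for t in dep5_emps}

# RESULT2(SSN): project on SUPERSSN; the single column is treated as SSN.
result2 = {(t["SUPERSSN"],) for t in dep5_emps}

# RESULT: set union removes the duplicate ('333',) automatically.
result = result1 | result2
print(sorted(result))
```

Employee 333 both works in department 5 and supervises someone there, yet appears only once in the result.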
Relational Algebra Operations From
Set Theory
Type Compatibility
– The operand relations R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn)
must have the same number of attributes, and the domains of
corresponding attributes must be compatible; that is,
dom(Ai)=dom(Bi) for i=1, 2, ..., n.
– The resulting relation for R1 ∪ R2, R1 ∩ R2, or R1 – R2 has the
same attribute names as the first operand relation R1 (by
convention).
Relational Algebra Operations From
Set Theory
UNION Example
STUDENT ∪ INSTRUCTOR
Relational Algebra Operations From Set
Theory (cont.)
INTERSECTION OPERATION
The result of this operation, denoted by R ∩ S, is a relation that includes all
tuples that are in both R and S. The two operands must be "type compatible".
Example: The result of the intersection operation (figure below) includes
only those who are both students and instructors.
STUDENT ∩ INSTRUCTOR
Relational Algebra Operations From Set
Theory (cont.)
Set Difference (or MINUS) Operation
The result of this operation, denoted by R - S, is a relation that includes all
tuples that are in R but not in S. The two operands must be "type compatible”.
Example: The figure shows the names of students who are not instructors,
and the names of instructors who are not students.
STUDENT-INSTRUCTOR
INSTRUCTOR-STUDENT
Relational Algebra Operations From Set Theory (cont.)
Notice that both union and intersection are commutative operations; that is
R ∪ S = S ∪ R, and R ∩ S = S ∩ R
Both union and intersection can be treated as n-ary operations applicable to any number
of relations, as both are associative operations; that is,
R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)
The minus operation is not commutative; that is, in general
R – S ≠ S – R
Intersection can be expressed in terms of union and set difference:
R ∩ S = ((R ∪ S) – (R – S)) – (S – R)
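These identities are easy to spot-check with Python's built-in set operators (| for union, & for intersection, - for difference). The three small relations below are made-up sets of 1-tuples; a check on one example illustrates the identities but does not prove them.

```python
# Made-up, type-compatible relations, modeled as sets of 1-tuples.
R = {(1,), (2,), (3,), (4,)}
S = {(3,), (4,), (5,)}
T = {(4,), (5,), (6,)}

commutative = (R | S == S | R) and (R & S == S & R)
associative = (R | (S | T) == (R | S) | T) and ((R & S) & T == R & (S & T))
minus_differs = (R - S) != (S - R)

# Intersection rebuilt from union and set difference only.
rebuilt = ((R | S) - (R - S)) - (S - R)

print(commutative, associative, minus_differs, rebuilt == R & S)
```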
Relational Algebra Operations From Set
Theory (cont.)
CARTESIAN (or cross product) Operation
– This operation is used to combine tuples from two relations in a
combinatorial fashion. In general, the result of R(A1, A2, . . ., An) × S(B1,
B2, . . ., Bm) is a relation Q of degree n + m with attributes Q(A1, A2, . . ., An,
B1, B2, . . ., Bm), in that order. The resulting relation Q has one tuple for
each combination of tuples: one from R and one from S.
– Hence, if R has nR tuples (denoted as |R| = nR), and S has nS tuples, then
R × S will have nR * nS tuples.
– The two operands do NOT have to be "type compatible".
Example:
FEMALE_EMPS ← σ SEX=’F’ (EMPLOYEE)
EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
Binary Relational Operations
JOIN Operation
– The sequence of cartesian product followed by select is used
quite commonly to identify and select related tuples from two
relations, so a special operation, called JOIN, is defined for it. It is denoted by ⋈.
– This operation is very important for any relational database
with more than a single relation, because it allows us to process
relationships among relations.
– The general form of a join operation on two relations R(A1, A2,
. . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈<join condition> S
where R and S can be any relations that result from general
relational algebra expressions.
Binary Relational Operations (cont.)
Example: Suppose that we want to retrieve the name of
the manager of each department. To get the manager’s
name, we need to combine each DEPARTMENT tuple
with the EMPLOYEE tuple whose SSN value matches
the MGRSSN value in the department tuple. We do this
by using the join operation.
DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
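A join can be computed literally as a selection over a Cartesian product. The sketch below assumes relations are lists of dictionaries with distinct attribute names in the two operands; `theta_join` is a hypothetical helper and the tuples are made up.

```python
from itertools import product

def theta_join(r, s, condition):
    """R join<condition> S, computed as sigma<condition>(R x S)."""
    return [
        {**tr, **ts}             # concatenate the two tuples
        for tr, ts in product(r, s)  # every combination: the Cartesian product
        if condition(tr, ts)     # keep only the related combinations
    ]

# Made-up tuples, for illustration only.
department = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "987"},
]
employee = [
    {"SSN": "123", "LNAME": "Smith"},
    {"SSN": "333", "LNAME": "Wong"},
    {"SSN": "987", "LNAME": "Wallace"},
]

dept_mgr = theta_join(department, employee, lambda d, e: d["MGRSSN"] == e["SSN"])
print([(t["DNAME"], t["LNAME"]) for t in dept_mgr])
```

The product has 2 * 3 = 6 combinations, of which the condition keeps two. Real systems avoid building the full product, but the result is the same.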
Binary Relational Operations (cont.)
EQUIJOIN Operation
The most common use of join involves join conditions with equality comparisons only.
Such a join, where the only comparison operator used is =, is called an EQUIJOIN. In
the result of an EQUIJOIN we always have one or more pairs of attributes (whose
names need not be identical) that have identical values in every tuple.
The JOIN seen in the previous example was an EQUIJOIN.
NATURAL JOIN Operation
Because one of each pair of attributes with identical values is superfluous, a new
operation called natural join—denoted by *—was created to get rid of the second
(superfluous) attribute in an EQUIJOIN condition.
The standard definition of natural join requires that the two join attributes, or each pair
of corresponding join attributes, have the same name in both relations. If this is not the
case, a renaming operation is applied first.
Binary Relational Operations (cont.)
Example: To apply a natural join on the DNUMBER attributes of
DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write:
DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
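A natural join matches tuples on every attribute name the two relations share and keeps a single copy of each shared attribute. The following Python sketch illustrates this; `natural_join` is a hypothetical helper, the tuples are made up, and the attribute names are read from the first tuple of each relation.

```python
def natural_join(r, s):
    """R * S: equijoin on all same-named attributes, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]  # the shared attribute names
    return [
        {**tr, **ts}  # shared attributes hold equal values, so one copy survives
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]

# Made-up tuples, for illustration only.
department = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Headquarters", "DNUMBER": 1},
]
dept_locations = [
    {"DNUMBER": 1, "DLOCATION": "Houston"},
    {"DNUMBER": 5, "DLOCATION": "Bellaire"},
    {"DNUMBER": 5, "DLOCATION": "Houston"},
]

dept_locs = natural_join(department, dept_locations)
print([(t["DNAME"], t["DLOCATION"]) for t in dept_locs])
```

Each result tuple has three attributes (DNAME, DNUMBER, DLOCATION) rather than four, because DNUMBER appears only once.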
Complete Set of Relational Operations
The set of operations including select σ,
project π, union ∪, set difference –, and
cartesian product × is called a complete set
because any other relational algebra expression
can be expressed by a combination of these five
operations.
For example:
R ∩ S = (R ∪ S) – ((R – S) ∪ (S – R))
R ⋈<join condition> S = σ<join condition>(R × S)
Binary Relational Operations (cont.)
DIVISION Operation
– The division operation is applied to two relations
R(Z) ÷ S(X), where X ⊆ Z. Let Y = Z – X (and hence Z
= X ∪ Y); that is, let Y be the set of attributes of R that are
not attributes of S.
– The result of DIVISION is a relation T(Y) that includes a
tuple t if tuples tR appear in R with tR[Y] = t, and with
tR[X] = tS for every tuple tS in S.
– For a tuple t to appear in the result T of the DIVISION, the
values in t must appear in R in combination with every tuple
in S.
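The definition of DIVISION can be followed step by step in code. In this Python sketch, `divide` is a hypothetical helper, R is assumed non-empty, and the WORKS_ON and PROJECT tuples are made up. The query asks which employees work on every listed project.

```python
def divide(r, s, x_attrs):
    """R(Z) / S(X): the Y-values (Y = Z - X) that occur in R with every tuple of S."""
    y_attrs = [a for a in r[0] if a not in x_attrs]
    required = {tuple(ts[a] for a in x_attrs) for ts in s}
    # Group the X-values found in R by their Y-value.
    seen = {}
    for tr in r:
        y = tuple(tr[a] for a in y_attrs)
        seen.setdefault(y, set()).add(tuple(tr[a] for a in x_attrs))
    # Keep a Y-value only if it is paired with every required X-value.
    return [dict(zip(y_attrs, y)) for y, xs in seen.items() if required <= xs]

# Made-up tuples, for illustration only.
works_on = [
    {"ESSN": "1", "PNO": 1}, {"ESSN": "1", "PNO": 2},
    {"ESSN": "2", "PNO": 1},
    {"ESSN": "3", "PNO": 1}, {"ESSN": "3", "PNO": 2}, {"ESSN": "3", "PNO": 3},
]
projects = [{"PNO": 1}, {"PNO": 2}]

print(divide(works_on, projects, ["PNO"]))
```

Employee 2 is excluded because no tuple pairs that employee with project 2; employee 3 qualifies even though they also work on a project that is not in S.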
Division Operator (÷): The division A ÷ B (also written A / B)
can be applied if and only if the attributes of B are a
proper subset of the attributes of A.
• The relation returned by the division operator has
attributes = (all attributes of A – all attributes of B).
• The relation returned by the division operator contains
those tuples of A, restricted to these attributes, that are
associated with every tuple of B.
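A minimal Python sketch of division, assuming relations are lists of dicts and that the caller passes the attribute lists Y and X explicitly. The function name divide and the sample data are hypothetical.

```python
def divide(r, s, y_attrs, x_attrs):
    """R(Z) / S(X): return the Y-values of r that occur with every tuple of s.

    y_attrs is Y = Z - X, and x_attrs is X, the attributes of s.
    """
    required = {tuple(t[a] for a in x_attrs) for t in s}
    # For each Y-value, collect the X-values it appears with in r.
    seen = {}
    for t in r:
        y = tuple(t[a] for a in y_attrs)
        x = tuple(t[a] for a in x_attrs)
        seen.setdefault(y, set()).add(x)
    # Keep a Y-value only if it is combined with every tuple of s.
    return [dict(zip(y_attrs, y)) for y, xs in seen.items() if required <= xs]


# Illustrative data: which employees work on ALL of the listed projects?
works_on = [
    {"ESSN": "123", "PNO": 1},
    {"ESSN": "123", "PNO": 2},
    {"ESSN": "453", "PNO": 1},
    {"ESSN": "333", "PNO": 1},
    {"ESSN": "333", "PNO": 2},
    {"ESSN": "333", "PNO": 3},
]
projects = [{"PNO": 1}, {"PNO": 2}]

result = divide(works_on, projects, ["ESSN"], ["PNO"])
```

Employee 453 is excluded because it appears only with project 1, not with every tuple of projects.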
DIVISION Operation
[Figure: example relations A and B and the result of A / B]
Recap of Relational Algebra Operations
Additional Relational Operations
Aggregate Functions and Grouping
– A type of request that cannot be expressed in the basic relational algebra
is to specify mathematical aggregate functions on collections of values
from the database.
– Examples of such functions include retrieving the average or total salary
of all employees or the total number of employee tuples. These functions
are used in simple statistical queries that summarize information from
the database tuples.
– Common functions applied to collections of numeric values include
SUM, AVERAGE, MAXIMUM, and MINIMUM. The COUNT
function is used for counting tuples or values.
Additional Relational Operations (cont.)
Use of the aggregate function operator ℱ
ℱ MAX Salary (Employee) retrieves the maximum Salary value from
the Employee relation.
ℱ MIN Salary (Employee) retrieves the minimum Salary value from
the Employee relation.
ℱ SUM Salary (Employee) retrieves the sum of the Salary values from the
Employee relation.
DNO ℱ COUNT SSN, AVERAGE Salary (Employee) groups employees by
DNO (department number) and computes the count of
employees and the average salary per department. [Note: COUNT
just counts the number of rows, without removing duplicates.]
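The aggregate and grouping expressions above correspond roughly to the following Python sketch. The Employee tuples and the output attribute names COUNT_SSN and AVERAGE_SALARY are assumptions chosen for illustration.

```python
from collections import defaultdict

# Illustrative Employee relation (assumed sample data).
employee = [
    {"SSN": "1", "SALARY": 30000, "DNO": 5},
    {"SSN": "2", "SALARY": 40000, "DNO": 5},
    {"SSN": "3", "SALARY": 25000, "DNO": 4},
    {"SSN": "4", "SALARY": 55000, "DNO": 1},
]

# Aggregates over the whole relation (no grouping attributes).
max_salary = max(t["SALARY"] for t in employee)
min_salary = min(t["SALARY"] for t in employee)
sum_salary = sum(t["SALARY"] for t in employee)

# Grouping by DNO, then COUNT SSN and AVERAGE Salary per group.
groups = defaultdict(list)
for t in employee:
    groups[t["DNO"]].append(t)

per_dept = [
    {
        "DNO": dno,
        "COUNT_SSN": len(tuples),  # counts rows; duplicates are not removed
        "AVERAGE_SALARY": sum(t["SALARY"] for t in tuples) / len(tuples),
    }
    for dno, tuples in groups.items()
]
```

Each group produces exactly one result tuple, so per_dept has one entry per distinct DNO value.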
Additional Relational Operations (cont.)
Recursive Closure Operations
– Another type of operation that, in general, cannot be specified in the
basic original relational algebra is recursive closure. This operation is
applied to a recursive relationship.
– An example of a recursive operation is to retrieve all SUPERVISEES of
an EMPLOYEE e at all levels—that is, all EMPLOYEE e’ directly
supervised by e; all employees e’’ directly supervised by each employee
e’; all employees e’’’ directly supervised by each employee e’’; and so
on.
– Although it is possible to retrieve employees at each level and then take
their union, we cannot, in general, specify a query such as “retrieve the
supervisees of ‘James Borg’ at all levels” without utilizing a looping
mechanism.
– The SQL3 standard includes syntax for recursive closure.
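The looping mechanism mentioned above can be sketched in Python. This is an illustration only: the supervision relation is modelled as a set of (supervisee, supervisor) pairs, the function name all_supervisees is hypothetical, and last names stand in for SSNs in the sample data.

```python
def all_supervisees(supervision, boss):
    """Return everyone supervised by boss, directly or at any lower level."""
    result = set()
    frontier = {boss}  # employees whose direct supervisees are found next
    while frontier:
        # One level: everyone directly supervised by someone in the frontier.
        next_level = {e for (e, s) in supervision if s in frontier} - result
        result |= next_level
        frontier = next_level  # repeat until no new employees appear
    return result


# Illustrative (supervisee, supervisor) pairs.
supervision = {
    ("Wong", "Borg"),
    ("Wallace", "Borg"),
    ("Smith", "Wong"),
    ("Narayan", "Wong"),
    ("English", "Wong"),
    ("Zelaya", "Wallace"),
    ("Jabbar", "Wallace"),
}

borg_team = all_supervisees(supervision, "Borg")
```

Each pass of the loop corresponds to one join of the supervision relation with the previous level. The number of passes depends on the data, which is why a fixed relational algebra expression cannot express the query in general.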
Additional Relational Operations (cont.)
The OUTER JOIN Operation
– In NATURAL JOIN, tuples without a matching (or related) tuple are eliminated
from the join result. Tuples with null in the join attributes are also eliminated.
This amounts to loss of information.
– A set of operations, called outer joins, can be used when we want to keep all the
tuples in R, or all those in S, or all those in both relations in the result of the
join, regardless of whether or not they have matching tuples in the other
relation.
– The left outer join operation keeps every tuple in the first or left relation R in
R ⟕ S; if no matching tuple is found in S, then the attributes of S in the join
result are filled or “padded” with null values.
– A similar operation, right outer join, keeps every tuple in the second or right
relation S in the result of R ⟖ S.
– A third operation, full outer join, denoted by ⟗, keeps all tuples in both
the left and the right relations when no matching tuples are found, padding them
with null values as needed.
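The three outer joins can be sketched in Python, with None standing in for null. The function names, the match callback, and the sample tuples are assumptions for this illustration; a right outer join is obtained by swapping the operands of the left outer join.

```python
def left_outer_join(r, s, match, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None if nothing matches."""
    result = []
    for t1 in r:
        partners = [t2 for t2 in s if match(t1, t2)]
        if partners:
            result.extend({**t1, **t2} for t2 in partners)
        else:
            result.append({**t1, **{a: None for a in s_attrs}})
    return result


def full_outer_join(r, s, match, r_attrs, s_attrs):
    """Left outer join plus the tuples of s that match no tuple of r."""
    result = left_outer_join(r, s, match, s_attrs)
    for t2 in s:
        if not any(match(t1, t2) for t1 in r):
            result.append({**{a: None for a in r_attrs}, **t2})
    return result


# Illustrative sample data.
employee = [
    {"SSN": "1", "LNAME": "Borg"},
    {"SSN": "2", "LNAME": "Wong"},
    {"SSN": "3", "LNAME": "Smith"},
]
department = [
    {"DNAME": "Headquarters", "MGRSSN": "1"},
    {"DNAME": "Research", "MGRSSN": "2"},
    {"DNAME": "Newdept", "MGRSSN": "9"},
]


def manages(e, d):
    return e["SSN"] == d["MGRSSN"]


left = left_outer_join(employee, department, manages, ["DNAME", "MGRSSN"])
right = left_outer_join(department, employee, lambda d, e: manages(e, d), ["SSN", "LNAME"])
full = full_outer_join(employee, department, manages, ["SSN", "LNAME"], ["DNAME", "MGRSSN"])
```

In left, Smith is kept with DNAME and MGRSSN set to None; in right, Newdept is kept with SSN and LNAME set to None; full contains both padded tuples.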
Additional Relational Operations (cont.)
OUTER UNION Operations
– The outer union operation was developed to take the union of tuples from two
relations if the relations are not union compatible.
– This operation will take the union of tuples in two relations R(X, Y) and S(X, Z)
that are partially compatible, meaning that only some of their attributes, say X,
are union compatible.
– The attributes that are union compatible are represented only once in the result,
and those attributes that are not union compatible from either relation are also
kept in the result relation T(X, Y, Z).
– Example: An outer union can be applied to two relations whose schemas are
STUDENT(Name, SSN, Department, Advisor) and INSTRUCTOR(Name, SSN,
Department, Rank). Tuples from the two relations are matched based on having
the same combination of values of the shared attributes—Name, SSN,
Department. If a student is also an instructor, both Advisor and Rank will have a
value; otherwise, one of these two attributes will be null.
The result relation STUDENT_OR_INSTRUCTOR will have the following
attributes:
STUDENT_OR_INSTRUCTOR (Name, SSN, Department, Advisor, Rank)
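A minimal Python sketch of the outer union for the STUDENT and INSTRUCTOR example. It assumes the shared attributes identify at most one tuple in each relation; the function name outer_union and the sample tuples are hypothetical.

```python
def outer_union(r, s, shared, r_only, s_only):
    """Outer union of R(X, Y) and S(X, Z) into T(X, Y, Z), with None as null.

    Assumes the shared attributes X identify at most one tuple per relation.
    """
    merged = {}
    for relation, own in ((r, r_only), (s, s_only)):
        for t in relation:
            key = tuple(t[a] for a in shared)
            if key not in merged:
                row = dict(zip(shared, key))
                row.update({a: None for a in r_only + s_only})  # pad with nulls
                merged[key] = row
            # Fill in the attributes this relation actually provides.
            merged[key].update({a: t[a] for a in own})
    return list(merged.values())


# Illustrative sample data.
student = [
    {"Name": "Ann", "SSN": "111", "Department": "CS", "Advisor": "Lee"},
    {"Name": "Bob", "SSN": "222", "Department": "Math", "Advisor": "Kim"},
]
instructor = [
    {"Name": "Bob", "SSN": "222", "Department": "Math", "Rank": "Lecturer"},
    {"Name": "Lee", "SSN": "333", "Department": "CS", "Rank": "Professor"},
]

student_or_instructor = outer_union(
    student, instructor, ["Name", "SSN", "Department"], ["Advisor"], ["Rank"]
)
```

Bob is both a student and an instructor, so his tuple has values for both Advisor and Rank; Ann has a null Rank and Lee a null Advisor.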
Examples of Queries in Relational Algebra
Q1: Retrieve the name and address of all employees who
work for the ‘Research’ department.
RESEARCH_DEPT ← σ DNAME='Research' (DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)
RESULT ← π FNAME, LNAME, ADDRESS (RESEARCH_EMPS)
Q6: Retrieve the names of employees who have no
dependents.
ALL_EMPS ← π SSN (EMPLOYEE)
EMPS_WITH_DEPS(SSN) ← π ESSN (DEPENDENT)
EMPS_WITHOUT_DEPS ← (ALL_EMPS – EMPS_WITH_DEPS)
RESULT ← π LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)
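Queries Q1 and Q6 can be traced step by step in Python. This is a sketch under stated assumptions: relations are lists of dicts, only the attributes needed by the two queries are included, and the tuples are illustrative sample data.

```python
# Illustrative sample data (only the attributes the queries need).
department = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]
employee = [
    {"FNAME": "John", "LNAME": "Smith", "SSN": "123456789",
     "ADDRESS": "731 Fondren, Houston, TX", "DNO": 5},
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "333445555",
     "ADDRESS": "638 Voss, Houston, TX", "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999887777",
     "ADDRESS": "3321 Castle, Spring, TX", "DNO": 4},
]
dependent = [
    {"ESSN": "333445555", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "123456789", "DEPENDENT_NAME": "Michael"},
]

# Q1: select the department, join on DNUMBER = DNO, then project.
research_dept = [d for d in department if d["DNAME"] == "Research"]
research_emps = [
    {**d, **e} for d in research_dept for e in employee if d["DNUMBER"] == e["DNO"]
]
q1 = [(t["FNAME"], t["LNAME"], t["ADDRESS"]) for t in research_emps]

# Q6: set difference on SSN values, then join back to EMPLOYEE and project.
all_emps = {e["SSN"] for e in employee}
emps_with_deps = {d["ESSN"] for d in dependent}
emps_without_deps = all_emps - emps_with_deps
q6 = {(e["LNAME"], e["FNAME"]) for e in employee if e["SSN"] in emps_without_deps}
```

The set comprehension for emps_with_deps plays the role of renaming ESSN to SSN, since both sets then hold plain SSN values and can be subtracted.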