Chapter 1: Introduction
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See [Link] for conditions on re-use
Database Management System (DBMS)
Definition: A collection of interrelated data and a set
of programs to access those data.
The collection of data, usually referred to as the database,
contains information relevant to an enterprise.
The primary goal of a DBMS is to provide a way to store
and retrieve database information that is both convenient
and efficient.
A DBMS is designed to manage large volumes of information.
Managing data involves defining structures for storing information
and providing mechanisms for manipulating it.
Databases touch all aspects of our lives
Database System Concepts - 6th Edition 1.2 ©Silberschatz, Korth and Sudarshan
Database Applications:
Banking and Finance: customers, accounts, loans,
transactions
Airlines: reservations, schedules
Universities: Student information, course
registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized
recommendations
Manufacturing: production, inventory, orders, supply
chain
Human resources: employee records, salaries, tax
deductions
Databases can be very large.
Purpose of Database Systems
Drawbacks of storing data directly in file systems:
Data redundancy and inconsistency
Multiple file formats, duplication of information in
different files
Difficulty in accessing data
Need to write a new program to carry out each new
task
Data isolation — multiple files and formats
Integrity problems
Integrity constraints (e.g., account balance > 0)
become “buried” in program code rather than being
stated explicitly
Hard to add new constraints or change existing
ones
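As a minimal sketch of stating an integrity constraint explicitly rather than burying it in program code, the following uses Python's standard sqlite3 module. The account table, its column names, and the sample values are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule "balance > 0" is declared once, in the schema, instead of being
# re-implemented in every program that inserts or updates balances.
conn.execute(
    "create table account ("
    " account_number text primary key,"
    " balance numeric not null check (balance > 0))"
)

def try_insert(account_number, balance):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("insert into account values (?, ?)", (account_number, balance))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("A-101", 500)   # satisfies the constraint
rejected = try_insert("A-102", -50)   # violates balance > 0
row_count = conn.execute("select count(*) from account").fetchone()[0]
```

Because the constraint lives in the table definition, changing it means changing one declaration rather than hunting through application code.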
Purpose of Database Systems (Cont.)
Atomicity of updates
Failures may leave database in an inconsistent state with
partial updates carried out
Example: Transfer of funds from one account to another
should either complete or not happen at all
Concurrent access by multiple users
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
– Example: Two people reading a balance (say 100) and
updating it by withdrawing money (say 50 each) at the
same time; both may write back 50, leaving a balance of 50
instead of 0
Security problems
Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
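The funds-transfer example above can be sketched with a transaction, again using Python's sqlite3 module. The account table, the transfer helper, and the balances are hypothetical; the credit is deliberately executed before the debit so that a failure in the second step would leave a partial update if it were not rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (id text primary key,"
    " balance integer check (balance >= 0))"
)
conn.executemany("insert into account values (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "update account set balance = balance + ? where id = ?", (amount, dst)
            )
            conn.execute(
                "update account set balance = balance - ? where id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("select id, balance from account order by id"))

first = transfer("A", "B", 60)    # succeeds: A has enough funds
second = transfer("A", "B", 70)   # debit would make A negative, so all is undone
final = balances()
```

After the failed second transfer, the credit already applied to B is rolled back along with the rejected debit, so the balances are the same as after the first transfer.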
View of Data
• Data Abstraction
• Instances and Schemas
• Data Models
A database system is a collection of interrelated data
and a set of programs that allow users to access and
modify these data.
A major purpose of a database system is to provide
users with an abstract view of the data.
That is, the system hides certain details of how the data
are stored and maintained.
Data Abstraction
An architecture for a database system
Levels of Abstraction
Physical level: The lowest level of abstraction describes how the data are
actually stored.
Logical level: describes what data are stored in the database, and what
relationships exist among those data.
The logical level thus describes the entire database in terms of a small number
of relatively simple structures.
View level: The highest level of abstraction describes only part of the entire
database.
Even though the logical level uses simpler structures, complexity remains
because of the variety of information stored in a large database.
Many users of the database system do not need all this information; instead,
they need to access only a part of the database.
The view level of abstraction exists to simplify their interaction with the system.
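A rough illustration of the logical and view levels, using Python's sqlite3 module: the instructor table stands for the logical level, and a view exposes only part of it. The view name faculty and the sample rows are assumptions for this sketch; the physical level (how SQLite lays the rows out in storage) stays hidden behind both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full instructor relation, including salary.
conn.execute(
    "create table instructor (ID char(5), name varchar(20),"
    " dept_name varchar(20), salary numeric(8,2))"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("22222", "Einstein", "Physics", 95000),
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ],
)

# View level: a hypothetical view that hides the salary column.
conn.execute("create view faculty as select ID, name, dept_name from instructor")

cursor = conn.execute("select * from faculty order by ID")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```

A user who queries only faculty never sees salaries and does not need to know how instructor is defined or stored.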
Instances and Schemas
Instance – the actual content of the database at a particular point
in time.
Schema – the overall design of the database.
Schemas are changed infrequently.
Both can be understood by analogy to a program written in a
programming language.
A database schema corresponds to the variable declarations
(along with their type information) in a program.
An instance corresponds to the values of those variables at a
point in time.
Physical schema: database design at the physical level
Logical schema: database design at the logical level
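To make the schema/instance distinction concrete, here is a small sketch with Python's sqlite3 module. The department table and its sample row are illustrative assumptions; sqlite_master is SQLite's own catalog table that records table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table department (dept_name varchar(20), building varchar(15))")

def schema():
    # The stored table definition plays the role of the variable declaration.
    return conn.execute(
        "select sql from sqlite_master where name = 'department'"
    ).fetchone()[0]

def instance():
    # The rows present right now play the role of the variable's current value.
    return conn.execute("select * from department order by dept_name").fetchall()

schema_before, instance_before = schema(), instance()
conn.execute("insert into department values ('Physics', 'Watson')")
schema_after, instance_after = schema(), instance()
```

The insert changes the instance while the schema stays exactly as it was.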
Data Models
Data Model: A collection of conceptual tools for describing
Data
Data relationships
Data semantics
Consistency Constraints
A way to describe the design of a database at the physical, logical,
and view levels.
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (Object-oriented and Object-relational)
Semistructured data model (XML)
Other older models:
Network model
Hierarchical model
Relational Model
The relational model uses a collection of tables to represent
both data and the relationships among those data.
[Figure: example of tabular data in the relational model, with its columns and rows labeled]
A Sample Relational Database
Database Languages
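A database system provides a data-definition language (DDL) to
specify the database schema and a data-manipulation language (DML)
to express database queries and updates.
In practice these are not two separate languages; both are parts of
a single database language such as SQL (Structured Query Language).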
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the
appropriate data model
DML also known as query language
The types of access are:
Retrieval of information stored in the database
Insertion of new information into the database
Deletion of information from the database
Modification of information stored in the database
There are two classes of DMLs:
Procedural – user specifies what data is required and how to get
those data.
Declarative (nonprocedural) – user specifies what data is required
without specifying how to get those data.
SQL is the most widely used query language.
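The four types of access, and the contrast between declarative and procedural retrieval, can be sketched as follows with Python's sqlite3 module. The rows, names, and salary figures are invented for illustration; the loop at the end is only a stand-in for a procedural style, not a real procedural DML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID char(5), name varchar(20),"
    " dept_name varchar(20), salary numeric(8,2))"
)

# Insertion of new information
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("11111", "Alice", "Physics", 80000),
        ("22222", "Bob", "Physics", 60000),
        ("33333", "Carol", "History", 90000),
    ],
)
# Modification of stored information
conn.execute("update instructor set salary = salary + 5000 where dept_name = 'Physics'")
# Deletion of information
conn.execute("delete from instructor where ID = '33333'")

# Retrieval, declarative: say what is wanted, not how to find it.
declarative = conn.execute(
    "select name from instructor where salary > 70000 order by name"
).fetchall()

# Retrieval, procedural flavor: the program spells out how to get the answer.
procedural = []
for _id, name, _dept, salary in conn.execute("select * from instructor"):
    if salary > 70000:
        procedural.append((name,))
procedural.sort()
```

Both retrievals yield the same answer; in the declarative form the database system, not the user, decides how to locate the qualifying rows.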
Data Definition Language (DDL)
A database schema is specified by a set of definitions expressed in a
special language called a data-definition language (DDL).
The DDL is also used to specify additional properties of the data.
Specification notation for defining the database schema
Example:
    create table instructor (
        ID         char(5),
        name       varchar(20),
        dept_name  varchar(20),
        salary     numeric(8,2));
Database systems implement integrity constraints.
Data values stored in the database must satisfy certain
consistency constraints.
Domain Constraints:
These are the most elementary form of integrity constraint.
A domain of possible values must be associated with every
attribute (for example, integer types, character types, date/time
types).
They are tested easily by the system whenever a new data item is
entered into the database.
Referential Integrity.
Assertions.
Authorization.
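As one possible illustration of referential integrity, the sketch below declares that every instructor's dept_name must appear in department. It uses Python's sqlite3 module; the sample rows and the add_instructor helper are assumptions. Note that SQLite enforces foreign keys only after the foreign_keys pragma is switched on, and that its column typing is looser than the domain constraints described above, so only the referential check is demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific: enable foreign-key checks

conn.execute(
    "create table department (dept_name varchar(20) primary key,"
    " building varchar(15))"
)
conn.execute(
    "create table instructor ("
    " ID char(5) primary key,"
    " name varchar(20) not null,"
    " dept_name varchar(20) references department (dept_name))"
)
conn.execute("insert into department values ('Physics', 'Watson')")

def add_instructor(instructor_id, name, dept_name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "insert into instructor values (?, ?, ?)", (instructor_id, name, dept_name)
        )
        return True
    except sqlite3.IntegrityError:
        return False

valid = add_instructor("22222", "Einstein", "Physics")   # department exists
dangling = add_instructor("99999", "Nobody", "Alchemy")  # no such department
```

The second insert is refused by the database itself, without any checking code in the application.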
SQL
SQL: widely used non-procedural language
Example: Find the name of the instructor with ID 22222
    select name
    from instructor
    where instructor.ID = '22222'
Example: Find the ID and building of instructors in the Physics dept.
    select instructor.ID, department.building
    from instructor, department
    where instructor.dept_name = department.dept_name and
          department.dept_name = 'Physics'
Application programs generally access databases through one of:
Language extensions that allow embedded SQL
An application program interface (e.g., ODBC/JDBC) that allows SQL
queries to be sent to a database
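The two example queries can be tried from an application program. The sketch below sends them to an in-memory SQLite database through Python's sqlite3 module, which plays a role loosely comparable to ODBC or JDBC. The department table definition and all sample rows are assumptions added so that the queries have something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table department (dept_name varchar(20), building varchar(15));
    create table instructor (ID char(5), name varchar(20),
                             dept_name varchar(20), salary numeric(8,2));
    insert into department values ('Physics', 'Watson'), ('History', 'Painter');
    insert into instructor values
        ('22222', 'Einstein', 'Physics', 95000),
        ('32343', 'El Said', 'History', 60000);
    """
)

# Find the name of the instructor with ID 22222.
name_rows = conn.execute(
    "select name from instructor where instructor.ID = '22222'"
).fetchall()

# Find the ID and building of instructors in the Physics department.
physics_rows = conn.execute(
    "select instructor.ID, department.building"
    " from instructor, department"
    " where instructor.dept_name = department.dept_name"
    " and department.dept_name = 'Physics'"
).fetchall()
```

Each query comes back to the program as a list of rows, which the application can then process in its host language.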
Database Design
Database Design – Introduction
Design process
Database systems are designed to manage large bodies of
information.
They are part of the operation of some enterprise whose end
product may be information from the database.
The initial phase of database design, then, is to characterize
fully the data needs of the prospective database users.
The database designer needs to interact extensively with
domain experts and users to carry out this task.
The outcome of this phase is a specification of user
requirements.
Next, the designer chooses a data model, and by applying the
concepts of the chosen data model, translates these
requirements into a conceptual schema of the database.
The schema developed at this conceptual-design phase
provides a detailed overview of the enterprise.
The designer reviews the schema to confirm that all data
requirements are indeed satisfied and are not in conflict with
one another.
In the relational model, the conceptual-design process involves
decisions on what attributes we want to capture in the database
and how to group these attributes to form the various tables
Business decision – What attributes should we record in the
database?
Computer Science decision – What relation schemas should we
have and how should the attributes be distributed among the
various relation schemas?
There are two ways to tackle the problem.
The first is to use the entity-relationship model (Chapter 7).
Models an enterprise as a collection of entities and relationships
Entity: a “thing” or “object” in the enterprise that is
distinguishable from other objects
– Described by a set of attributes
Relationship: an association among several entities
Represented diagrammatically by an entity-relationship
diagram.
The other is to employ a set of algorithms (collectively known as
normalization) that takes as input the set of all attributes and
generates a set of tables.
E-R Diagram for a University Enterprise
Database Design for the University Database
A fully developed conceptual schema indicates the
functional requirements of the enterprise.
In a specification of functional requirements, users
describe the kinds of operations (or transactions) that
will be performed on the data.
Normalization:
Another method for designing a relational database is to use a
process commonly known as normalization.
The goal is to generate a set of relation schemas that allows us
to store information without unnecessary redundancy, yet also
allows us to retrieve information easily.
To understand the need for normalization, let us look at
what can go wrong in a bad database design.
Among the undesirable properties that a bad design
may have are:
• Repetition of information
• Inability to represent certain information
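The repetition problem can be sketched with a deliberately bad single-table design and a hand-made decomposition, using Python's sqlite3 module. The faculty table layout and all sample values are assumptions for illustration; the decomposition here is chosen by inspection, not produced by a normalization algorithm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical bad design: the department's building is repeated in every row.
conn.execute(
    "create table faculty (ID char(5), name varchar(20),"
    " dept_name varchar(20), building varchar(15))"
)
conn.executemany(
    "insert into faculty values (?, ?, ?, ?)",
    [
        ("22222", "Einstein", "Physics", "Watson"),
        ("33456", "Gold", "Physics", "Watson"),
    ],
)
# Updating only one copy leaves two conflicting buildings for Physics.
conn.execute("update faculty set building = 'Taylor' where ID = '22222'")
conflicting = conn.execute(
    "select distinct building from faculty"
    " where dept_name = 'Physics' order by building"
).fetchall()

# Decomposed design: each department's building is stored exactly once.
conn.execute(
    "create table department (dept_name varchar(20) primary key, building varchar(15))"
)
conn.execute(
    "create table instructor (ID char(5) primary key, name varchar(20),"
    " dept_name varchar(20) references department (dept_name))"
)
conn.execute("insert into department values ('Physics', 'Watson')")
conn.executemany(
    "insert into instructor values (?, ?, ?)",
    [("22222", "Einstein", "Physics"), ("33456", "Gold", "Physics")],
)
conn.execute("update department set building = 'Taylor' where dept_name = 'Physics'")
consistent = conn.execute(
    "select distinct building from instructor join department using (dept_name)"
).fetchall()
```

The decomposed design also addresses the second problem: a department with no instructors yet can be recorded as a row in department, which the single faculty table could not represent without placeholder values.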
Database Design?
Is there any problem with this design? – Faculty Table
Normalization Required
Data Storage and Querying
A database system is partitioned into modules that deal with each
of the responsibilities of the overall system.
The functional components of a database system are
Storage manager
Query processor
The storage manager is important because databases typically
require a large amount of storage space.
The query processor is important because it helps the database
system to simplify and facilitate access to data.
The storage manager is the component of a database system that
provides the interface between the low-level data stored in the
database and the application programs and queries submitted to the
system.
The storage manager is responsible for the following
tasks:
Interaction with the file manager.
Efficient storing, retrieving and updating of data.
The storage manager translates the various DML
statements into low-level file-system commands.
Database System Structure
Data Storage and Querying (Cont.)
The storage manager components include:
Authorization and integrity manager, which tests for the satisfaction
of integrity constraints and checks the authority of users to access data.
Transaction manager, which ensures that the database remains in a
consistent (correct) state despite system failures, and that concurrent
transaction executions proceed without conflicting.
File manager, which manages the allocation of space on disk storage
and the data structures used to represent information stored on disk.
Buffer manager, which is responsible for fetching data from disk
storage into main memory, and deciding what data to cache in main
memory.
The buffer manager is a critical part of the database system, since it
enables the database to handle data sizes that are much larger than
the size of main memory.
The storage manager implements several data structures as
part of the physical system implementation:
• Data files, which store the database itself.
• Data dictionary, which stores metadata about the structure
of the database, in particular the schema of the database.
• Indices, which can provide fast access to data items. Like
the index in this textbook, a database index provides
pointers to those data items that hold a particular value.
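A small experiment can show an index changing how a query is evaluated. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the table contents and the index name instructor_dept_idx are assumptions, and the exact wording of the plan text differs between SQLite versions, so only the presence of the index name is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID char(5), name varchar(20), dept_name varchar(20))")
conn.executemany(
    "insert into instructor values (?, ?, ?)",
    [
        (f"{i:05d}", f"name{i}", "Physics" if i % 10 == 0 else "History")
        for i in range(1000)
    ],
)

query = "select name from instructor where dept_name = 'Physics'"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " ".join(row[-1] for row in conn.execute("explain query plan " + sql))

plan_without_index = plan(query)
conn.execute("create index instructor_dept_idx on instructor (dept_name)")
plan_with_index = plan(query)

uses_index_before = "instructor_dept_idx" in plan_without_index
uses_index_after = "instructor_dept_idx" in plan_with_index
```

Before the index exists the system has to examine every row; afterwards the plan refers to the index, which points it at the rows holding the requested value.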
The Query Processor
The query processor components include:
• DDL interpreter, which interprets DDL statements and records
the definitions in the data dictionary.
• DML compiler, which translates DML statements in a query
language into an evaluation plan consisting of low-level instructions
that the query evaluation engine understands.
The DML compiler also performs query optimization.
• Query evaluation engine, which executes low-level instructions
generated by the DML compiler.
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Transaction Management
What if the system fails?
What if more than one user is concurrently updating the same
data?
A transaction is a collection of operations that performs a single
logical function in a database application.
Example: a funds transfer from account A to account B.
Atomicity: the transfer must happen in its entirety or not at all
(the all-or-none requirement).
Consistency: the value of the sum of the balances of A
and B must be preserved.
Isolation: each transaction is carried out and executed as if it
were the only transaction in the system.
Durability: after the successful execution of a funds
transfer, the new values of the balances of accounts A and
B must persist, despite the possibility of system failure.
The transaction-management component ensures that the database
remains in a consistent (correct) state despite system failures (e.g.,
power failures and operating system crashes) and transaction
failures.
The transaction manager consists of:
The recovery manager
The concurrency-control manager
Recovery Manager: Ensuring the atomicity and durability properties
is the responsibility of the database system itself—specifically, of the
recovery manager.
Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the database.
Database Users and Administrators
Naïve Users: They do not interact with the database system
directly; instead, they invoke one of the application programs
that have been written previously.
E.g., web users, agents, etc.
Application Programmers: They interact with the database by
writing application programs, for example in PHP, .NET, or
Java.
Sophisticated Users (Analysts): They interact with the database
system directly using a database query language.
Specialized Users: Sophisticated users who write specialized
database applications that do not fit into the traditional data-
processing framework.
Database Administrators
A person who has central control over the system is called a database
administrator (DBA).
The functions of a DBA include:
• Schema definition.
• Storage structure and access-method definition.
• Schema and physical-organization modification.
• Granting of authorization for data access.
• Routine maintenance:
Periodically backing up the database, either onto tapes or onto remote
servers, to prevent loss of data in case of disasters.
Ensuring that enough free disk space is available for normal operations,
and upgrading disk space as required.
Monitoring jobs running on the database and ensuring that performance is
not degraded.
Database Architecture
The architecture of a database system is greatly influenced by
the underlying computer system on which the database is running:
Centralized
Client-server
Parallel (multi-processor)
Distributed
Database applications are usually partitioned into two or
three parts.
In a two-tier architecture, the application resides at
the client machine, where it invokes database system
functionality at the server machine through query language
statements.
Application program interface standards like ODBC and
JDBC are used for interaction between the client and the
server.
In a three-tier architecture, the client machine acts as merely a
front end and does not contain any direct database calls.
The client end communicates with an application server,
usually through a forms interface.
The application server in turn communicates with a database
system to access data.
The business logic of the application, which says what actions
to carry out under what conditions, is embedded in the
application server.
Three-tier applications are more appropriate for large
applications, and for applications that run on the World Wide
Web.
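The division of labor in a three-tier application can be sketched in a single Python process. This is only a toy model: in a real deployment the tiers run on separate machines and talk over a network, whereas here each tier is a plain function. The account table, the withdraw rule, and the form fields are all assumptions.

```python
import sqlite3

# Database tier: holds the data and answers SQL statements.
db = sqlite3.connect(":memory:")
db.execute("create table account (id text primary key, balance integer)")
db.execute("insert into account values ('A-101', 100)")
db.commit()

# Application-server tier: the business logic, i.e. what actions to carry
# out under what conditions. Only this tier issues database calls.
def withdraw(account_id, amount):
    row = db.execute(
        "select balance from account where id = ?", (account_id,)
    ).fetchone()
    if row is None:
        return "unknown account"
    if amount <= 0 or amount > row[0]:
        return "rejected"
    db.execute(
        "update account set balance = balance - ? where id = ?", (amount, account_id)
    )
    db.commit()
    return "ok"

# Client tier: a front end that forwards form input and contains no SQL.
def submit_withdrawal_form(form):
    return withdraw(form["account"], int(form["amount"]))

responses = [
    submit_withdrawal_form({"account": "A-101", "amount": "60"}),
    submit_withdrawal_form({"account": "A-101", "amount": "60"}),
]
remaining = db.execute("select balance from account where id = 'A-101'").fetchone()[0]
```

The second request is refused by the application server's rule, so the client never needs to know how balances are stored or checked.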
End of Chapter 1