Database Management Systems
IC2307
Lecture – 02
Introduction to Course
Data:
Data describes a real-world entity or object. Data is a distinct
piece of information or a raw fact that needs to be processed.
It can be in any form like text, number, picture,
measurements, and bytes.
Example: Ankit, Delhi, 12, 80.
[Figure: data (factual information, observations, facts and
statistics, values and figures) is processed into information.]
Types of Data:
There are two types of Data:
Quantitative: Quantitative data refers
to numerical information like weight,
height, etc.
Qualitative: Qualitative data refers to
non-numeric information like opinions,
perceptions, etc.
Information:
When data are processed, organized, structured,
and interpreted in a given context, so as to make
them useful and meaningful, they are called
information.
Example:
Name - Ankit, City - Delhi, Class – 12, Marks – 80.
Difference between Data and Information:
Data | Information
Data is unorganised, unrefined facts. | Information comprises processed, organised data presented in a meaningful context.
Data is an individual unit of raw material that does not carry any specific meaning. | Information is a group of data that collectively carries a logical meaning.
Data doesn’t depend on information. | Information depends on data.
Raw data alone is insufficient for decision making. | Information is sufficient for decision making.
An example of data is a student’s test score. | The average score of a class is the information derived from the given data.
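The last row of the comparison can be made concrete with a small sketch. The
scores below are hypothetical values chosen only for illustration; the point
is that the individual numbers are data, while the computed average is
information.

```python
# Raw data: individual test scores (hypothetical values for illustration).
scores = [80, 65, 92, 73]

# Processing the data (here, averaging) yields information about the class.
class_average = sum(scores) / len(scores)

print(f"Class average: {class_average}")
```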
Big Data Types:
There are three types: structured data, unstructured data, and
semi-structured data.
Structured Data :
Structured data is generally tabular data that is represented by columns and rows
in a database.
Databases that hold tables in this form are called relational databases.
The mathematical term “relation” specifies a formed set of data held as a table.
In structured data, every row in a table has the same set of columns.
SQL (Structured Query Language) is the language used to query and
manage structured data.
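A minimal sketch of structured data, assuming Python's built-in sqlite3
module as the relational database. The table name, the column names, and the
second student are hypothetical; only the first row reuses the example values
from the Data slide.

```python
import sqlite3

# An in-memory relational database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Every row of the table shares the same set of columns.
conn.execute(
    "CREATE TABLE student (name TEXT, city TEXT, class_no INTEGER, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("Ankit", "Delhi", 12, 80), ("Riya", "Mumbai", 12, 91)],
)

# The column names come from the table definition, not from the rows.
columns = [d[0] for d in conn.execute("SELECT * FROM student").description]

# SQL is used to query the structured data.
rows = conn.execute("SELECT name, marks FROM student WHERE marks > 85").fetchall()

print(columns)
print(rows)
```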
Unstructured Data :
Unstructured data is information that either is not organized in a
pre-defined manner or does not have a pre-defined data model.
Unstructured information is typically text-heavy but may contain data
such as numbers, dates, and facts as well.
Videos, audio, and binary data files might not have a specific structure.
They are classified as unstructured data.
Semi Structured data:
Semi-structured data is information that does not fit the rigid
tabular structure of a relational database but still has some
structure to it.
Examples of semi-structured data include documents held in JavaScript
Object Notation (JSON) format. It also includes data held in
key-value stores and graph databases.
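A short sketch of semi-structured data, assuming Python's built-in json
module. The records and their field names are hypothetical. Each record
carries its own keys, so two records in the same collection need not share
the same fields, unlike rows in a relational table.

```python
import json

# Two JSON documents in one collection; the second has different fields.
text = """[
  {"name": "Ankit", "city": "Delhi", "marks": 80},
  {"name": "Riya", "phones": ["555-0100", "555-0101"]}
]"""

records = json.loads(text)

# Collect the keys of each record to show that the structure varies.
keys_per_record = [sorted(record) for record in records]

for keys in keys_per_record:
    print(keys)
```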
Database:
A database is an organized collection of inter-related
data, which helps in insertion, deletion, and
retrieval of data efficiently.
The database is also used to organize the data or
information in the form of tables, views, schemas,
reports, etc.
The main purpose of a database is to handle a large amount
of information (data) by efficiently storing, retrieving,
and managing the data in the database.
Database Management System:
A database management system (DBMS) is a
collection of programs that enables users to create and
maintain a database.
The DBMS is hence a general-purpose software system
that facilitates the processes of defining, constructing,
and manipulating databases for various applications.
Defining a database involves specifying the data types,
structures, and constraints for the data to be stored in the
database.
Constructing the database is the process of storing the data
itself on some storage medium that is controlled by the
DBMS.
Manipulating a database includes such functions as
querying the database to retrieve specific data, updating the
database to reflect changes in the miniworld, and generating
reports from the data.
Data Definition: It is used for creation, modification, and
removal of the definitions that define the organization of data
in the database.
Data Update: It is used for the insertion, modification, and
deletion of the actual data in the database.
Data Retrieval: It is used to retrieve the data from the
database which can be used by applications for various
purposes.
User Administration: It is used for registering and
monitoring users, maintaining data integrity, enforcing data
security, dealing with concurrency control, monitoring
performance, and recovering information corrupted by
unexpected failure.
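The first three functions above can be sketched as follows, assuming
Python's built-in sqlite3 module. The account table and its values are
hypothetical. User administration is not shown, because SQLite has no user
accounts; a server DBMS such as MySQL or Oracle would be needed for that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table, then modify its definition.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)")
conn.execute("ALTER TABLE account ADD COLUMN balance INTEGER DEFAULT 0")

# Data update: insert, modify, and delete actual data.
conn.execute("INSERT INTO account (id, owner, balance) VALUES (1, 'Ankit', 500)")
conn.execute("INSERT INTO account (id, owner) VALUES (2, 'Riya')")
conn.execute("UPDATE account SET balance = balance + 100 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 2")

# Data retrieval: read the data back for use by an application.
remaining = conn.execute("SELECT id, owner, balance FROM account").fetchall()
print(remaining)
```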
Example:
MySQL, MS SQL Server, Oracle, DB2, Microsoft Access, etc.
are examples of database management systems.
Components of DBMS:
Hardware:
This refers to the physical components of the computer system, such as
the hard drive, CPU, memory, and input/output devices, that are used to
store and access the database.
Software:
This encompasses the DBMS software itself, the operating system, and
any network software used to share data. It provides the interface between
the user and the hardware, allowing users to interact with the database.
Data:
This is the core of the DBMS, representing the raw facts and information
that is stored, organized, and managed. It also includes metadata, which
describes the structure and characteristics of the data.
Procedures:
These are the instructions and rules that govern how the DBMS is
designed, installed, and used. They cover aspects like data entry, backup
and recovery, and report generation.
Database Access Language:
This is a specialized language, like SQL, that allows users to interact with
the database. It enables users to query, insert, update, and delete data.
Architecture Of A Database Management System (DBMS):
End User: The End User interacts with the DBMS indirectly through
Application Software, which allows them to interact with the data
without needing to understand the database's inner workings.
DB Administrator: The Database Administrator (DB Administrator)
manages the database using Database Administration Software. This
role is responsible for maintaining, optimizing, and ensuring the smooth
operation of the DBMS.
DB Designer/DB Programmer: The DB Designer and DB Programmer
work with SQL to design and program the structure of the database and
write queries to interact with it.
SQL (Structured Query Language):SQL sits in the middle, serving as
the language used by the database professionals (DB Designer, DB
Programmer, and DB Administrator) to interact with the DBMS.
DBMS: The DBMS (Database Management System) is the core
system that manages the actual databases, facilitating the storage,
retrieval, and manipulation of data.
Databases: At the bottom of the diagram are the databases, where the
data is stored, and the DBMS manages them.
Architecture Of A Web Application System Using MySQL :
This diagram illustrates the architecture of a web application system using MySQL as the
database management system (DBMS).
Database (leftmost): Represents the data storage, which holds all the information
needed by the application.
Web Server: The server that hosts the web application. It handles incoming requests
from users and processes them.
DBMS (Database Management System): The system that manages and organizes the
database. In this case, it’s represented by MySQL, an open-source relational DBMS.
SQL (Structured Query Language): This is the language used to interact with the
database. It helps in querying, inserting, updating, and deleting data in the database.
Database Systems versus File Systems
One way to keep the information on a computer is to store it
in operating system files.
To allow users to manipulate the information, the system has
a number of application programs that manipulate the files.
Keeping organizational information in a file-processing
system has a number of major disadvantages, such as:
Data redundancy and inconsistency,
Difficulty in accessing data,
Atomicity problems,
Security problems, etc.
Characteristics of DBMS:
A database management system (DBMS) should be able to store
any kind of data in a database.
Any database management system should be able to support ACID
(atomicity, consistency, isolation, durability) properties.
The database management system allows more than one user to
access the same database at the same time.
Backup and recovery are the two main methods that allow users to
protect their data from damage or loss.
It provides multiple views for different users in one organization.
DBMS follows the concept of normalization to minimize the
redundancy of a relation.
It provides users with a query language, using which they can easily
insert, retrieve, update, and delete the data in a database.
Need of DBMS:
DBMS is useful in the following ways:
1. Ease of Accessing Data
In the file system, separate files are created for each user,
containing the data that user can access. Also, in the file system,
the user needs code or an application to extract data.
DBMS removes this redundancy by granting access to users and
deciding which parts of the data are accessible to each of
them from the database. Users can get easy access to data and
can also specify the type of data they want to extract.
2. Storage and Management of Data
Data cannot be stored in the form of objects in the file system.
The data in the practical world is generally stored in the form of
objects and not files.
So, an application is required to map the data into objects for
further usage. In DBMS, the data can be directly stored in the
form of objects.
3. Easy and Efficient File Management
In the file system, whole files may have to be scanned for every query
operation, because only the files are indexed and not their contents.
This takes a lot of time compared to DBMS, where objects are indexed
based on the attributes of the data. The complex management of memory
also becomes easy to handle. With this, retrieval of data is faster
than in the traditional file system.
4. Avoiding duplicates and Redundancy
Data normalization is used in DBMS to avoid duplicate data.
5. Concurrent Data Accessing
Users can access data simultaneously through different applications. In
the file system, this simultaneous access leads to inconsistency. DBMS
uses the ACID approach to tackle the issue.
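A minimal sketch of one ACID property, atomicity, assuming Python's built-in
sqlite3 module. The account table, the transfer function, and the amounts are
hypothetical. A transfer consists of two updates; if the second one fails,
the first one is rolled back, so the database is never left half-updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK constraint
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 is applied first and then
undone by the rollback, so the balances remain those left by the successful
transfer.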
Advantages of DBMS:
Minimal data redundancy or data duplication.
Easy access to data from the database using the query
language.
DBMS provides backup and recovery methods which automatically
back up data and restore it after software or hardware
failures, if required.
Minimized data inconsistency.
Better data integration.
DBMS can apply integrity constraints to the data in the
database.
DBMS increases consistency and reduces updating errors.
Advantages of DBMS:
Controls database redundancy: It can control data
redundancy because all the data is stored in one single
database rather than in separate files.
Data sharing: In DBMS, the authorized users of an organization
can share the data among multiple users.
Easy Maintenance: It is easy to maintain due to the
centralized nature of the database system.
Reduced time: It reduces development time and maintenance
needs.
Backup: It provides backup and recovery subsystems which
automatically back up data and restore it after hardware or
software failures, if required.
Multiple user interfaces: It provides different types of user
interfaces, such as graphical user interfaces and application
program interfaces.
Disadvantages of DBMS
A database management system is complex and time
consuming to design.
Cost of software and hardware is high to run DBMS software.
DBMS consumes a large amount of main memory as well as a
huge amount of disk space to make it run efficiently.
If the database is damaged because of any software or
hardware failure, all the application programs that depend
on it will be affected.
Initial training is required for all the users and programmers to
use the DBMS software.
Database System Applications:
Banking: For customer information, accounts, and loans,
and banking transactions.
Airlines: For reservations and schedule information.
Airlines were among the first to use databases in a
geographically distributed manner—terminals situated
around the world accessed the central database system
through phone lines and other data networks.
Universities: For student information, course registrations,
and grades.
Telecommunication: For keeping records of calls made,
generating monthly bills, maintaining balances on prepaid
calling cards, and storing information about the
communication networks.
Database System Applications (contd.)
Finance: For storing information about holdings,
sales, and purchases of financial instruments such
as stocks and bonds.
Sales: For customer, product, and purchase
information.
Manufacturing: For management of supply chain
and for tracking production of items in factories,
inventories of items in warehouses/stores, and
orders for items.
Human resources: For information about
employees, salaries, payroll taxes and benefits, and
for generation of paychecks
Data Abstraction:
Data Abstraction is a process of hiding unwanted or irrelevant
details from the end user. It provides a different view and
helps in achieving data independence which is used to
enhance the security of data.
The database systems consist of complicated data structures
and relations. For users to access the data easily, these
complications are kept hidden, and only the relevant part of
the database is made accessible to the users through data
abstraction.
1) Physical level
2) Logical Level
3) View Level
Physical level:- The lowest level of abstraction describes how
the data are actually stored. The physical level describes
complex low-level data structures in detail.
Logical level:- The next-higher level of abstraction describes
what data are stored in the database, and what
relationships exist among those data.
Database administrators, who must decide what information to
keep in the database, use the logical level of abstraction.
View level:- The highest level of abstraction describes only
part of the entire database. The view level of abstraction
exists to simplify users' interaction with the system. The
system may provide many views for the same database.
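A small sketch of the view level, assuming Python's built-in sqlite3 module.
The employee table, its contents, and the view name employee_directory are
hypothetical. The view exposes only part of the underlying table, so a user
of the view never sees the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a sensitive column.
conn.execute(
    "CREATE TABLE employee "
    "(id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ankit", "Sales", 50000), (2, "Riya", "HR", 60000)],
)

# View level: one of possibly many views, showing only part of the data.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT name, department FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```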
Physical Data Independence Rule
• Applications that access the database must be independent
of how the data is physically stored.
• Stored data should not depend on any particular
application.
• If the physical structure of the database is changed
(for example, the storage is reorganized), it will not
show any effect on external applications that are
accessing the data from the database.
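One way to see physical data independence in practice is to change a
physical storage structure and check that an application query is
unaffected. The sketch below assumes Python's built-in sqlite3 module; the
table, the rows, and the index name are hypothetical. Adding an index changes
how the data is accessed internally, but the query text and its result stay
the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ankit", "Delhi"), ("Riya", "Mumbai"), ("Kabir", "Delhi")],
)

# The application's query, written once and never changed.
query = "SELECT name FROM student WHERE city = ? ORDER BY name"

before = conn.execute(query, ("Delhi",)).fetchall()

# Physical change: add an index on the city column.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

after = conn.execute(query, ("Delhi",)).fetchall()

print(before == after, after)
```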
Logical Data Independence Rule
• It is similar to physical data independence. It means that if
any changes occur at the logical level (table
structures), they should not affect the user's view
(application).
• For example, suppose a table is split into two
tables, or two tables are joined to create a single table.
These changes should not have any impact on the user
view or application.
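The table-splitting example can be sketched as follows, assuming Python's
built-in sqlite3 module. All table, view, and column names are hypothetical.
After the original table is split in two, a view with the original name is
defined over the new tables, so the application's query keeps working
unchanged. This is one possible technique, not the only way a DBMS can
provide logical data independence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student "
    "(id INTEGER PRIMARY KEY, name TEXT, city TEXT, marks INTEGER)"
)
conn.execute("INSERT INTO student VALUES (1, 'Ankit', 'Delhi', 80)")
conn.commit()

# The application's query; it is not modified by the change below.
app_query = "SELECT name, marks FROM student WHERE id = 1"
before = conn.execute(app_query).fetchall()

# Logical change: split the table into two tables.
conn.execute(
    "CREATE TABLE student_profile (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("CREATE TABLE student_result (id INTEGER PRIMARY KEY, marks INTEGER)")
conn.execute("INSERT INTO student_profile SELECT id, name, city FROM student")
conn.execute("INSERT INTO student_result SELECT id, marks FROM student")
conn.commit()
conn.execute("DROP TABLE student")

# A view with the old name joins the two tables back together.
conn.execute(
    "CREATE VIEW student AS "
    "SELECT p.id, p.name, p.city, r.marks "
    "FROM student_profile AS p JOIN student_result AS r ON r.id = p.id"
)

after = conn.execute(app_query).fetchall()
print(before == after, after)
```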
Applications of Data Abstraction:
1. ATM:
When withdrawing money from an ATM, users interact with a simple interface to
select their account, enter the amount, and receive cash. They don't see the
complex processes happening behind the scenes, like how the machine reads
the card, verifies the PIN, or dispenses the money.
2. Online Shopping:
When buying clothes online, users see options like size, color, and brand, but they
don't see the details of how the inventory is managed, how the payment is
processed, or how the item is shipped.
3. College Database:
A student accessing a college database might see their course information, grades,
and attendance record, but they won't see how this data is stored, the faculty
information, or the database code. Similarly, a faculty member accessing the
same database will see their information but not the student data or the
database's internal structure.
4. Email:
When accessing an email client like Gmail, users see their emails, but they don't
know where the data is physically stored, the data model used to store it, or the
network infrastructure involved.
5. TV Remote:
A TV remote provides buttons for volume control, channel selection, and
power. Users don't need to know
DBMS Terminologies:
Data Independence
It is defined as a property of DBMS that helps you to
change the database schema at one level of a database
system without having to change the schema at the next
higher level. Data independence helps you to keep data
separated from all programs that make use of it.
Instance/database state or snapshot
Databases change over time as information is inserted and
deleted. The collection of information stored in the database
at a particular moment is called an instance of the
database.
Schema
The overall design of the database is called the database
schema.
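The difference between schema and instance can be sketched with Python's
built-in sqlite3 module (an assumption made for illustration; the table and
its row are hypothetical). SQLite keeps the table definitions in its
sqlite_master catalog. The schema stays fixed while the instance changes as
data is inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, marks INTEGER)")


def schema(conn):
    """Return the stored table definitions (the overall design)."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()


def instance(conn):
    """Return the data stored at this particular moment."""
    return conn.execute("SELECT name, marks FROM student ORDER BY name").fetchall()


schema_before = schema(conn)
instance_t0 = instance(conn)

conn.execute("INSERT INTO student VALUES ('Ankit', 80)")

instance_t1 = instance(conn)
schema_after = schema(conn)

print(instance_t0, instance_t1, schema_before == schema_after)
```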
DBMS Architecture:
A DBMS architecture defines how users interact with the
database to read, write, or update information. A well-
designed architecture and schema (a blueprint detailing
tables, fields, and relationships) ensure data
consistency, improve performance, and keep data
secure.
Types of DBMS Architecture:
1- Tier Architecture
2- Tier Architecture
3- Tier Architecture
In One-Tier Architecture the database is directly available to the
user, who works directly on the DBMS, i.e., the client, server, and
database are all present on the same machine. For example, to learn
SQL we set up an SQL server and the database on the local system.
1- Tier Architecture:
In 1-Tier Architecture, the user works directly with the database on the
same system. This means the client, server, and database are all in one
application. The user can open the application, interact with the data, and
perform tasks without needing a separate server or network connection.
A common example is Microsoft Excel. Everything from the user interface to
the logic and data storage happens on the same device. The user enters
data, performs calculations, and saves files directly on their computer.
This setup is simple and easy to use, making it ideal for personal or
standalone applications. It does not require an external server, a
network connection, or complex setup, which is why it's often used in
small-scale or individual use cases.
Advantages of 1-Tier Architecture:
Simple Architecture: 1-Tier Architecture is
the simplest architecture to set up, as
only a single machine is required to maintain
it.
Cost-Effective: No additional hardware is
required for implementing 1-Tier
Architecture, which makes it cost-effective.
Easy to Implement: 1-Tier Architecture can
be easily deployed, and hence it is mostly
used in small projects.
Disadvantages of 1-Tier Architecture:
Limited to Single User: Only one person can use
the application at a time. It’s not designed for multiple
users or teamwork.
Poor Security: Since everything is on the same
machine, if someone gets access to the system, they
can access both the data and the application easily.
No Centralized Control: Data is stored locally, so
there's no central database. This makes it hard to
manage or back up data across multiple devices.
Hard to Share Data: Sharing data between users is
difficult because everything is stored on one
computer.
2-Tier Architecture:
The 2-tier architecture is similar to a basic client-server model. The
application at the client end directly communicates with the database
on the server side. APIs like ODBC and JDBC are used for this
interaction. The server side is responsible for providing query
processing and transaction management functionalities.
On the client side, the user interfaces and application programs are run.
The application on the client side establishes a connection with the
server side to communicate with the DBMS.
For Example: A Library Management System used in schools or small
organizations is a classic example of two-tier architecture.
Client Layer (Tier 1): This is the user interface that library staff or
users interact with.
For example they might use a desktop application to search for
books, issue them, or check due dates.
Database Layer (Tier 2): The database server stores all the library
records such as book details, user information, and transaction logs.
The client layer sends a request (like searching for a book) to the
database layer which processes it and sends back the result. This
separation allows the client to focus on the user interface, while the
server handles data storage and retrieval.
Advantages of 2-Tier Architecture:
Easy to Access: 2-Tier Architecture gives the client
direct access to the database, which makes
retrieval fast.
Scalable: We can scale the database easily, by
adding clients or upgrading hardware.
Low Cost: 2-Tier Architecture is cheaper than 3-
Tier Architecture and Multi-Tier Architecture.
Easy Deployment: 2-Tier Architecture is easier
to deploy than 3-Tier Architecture.
Simple: 2-Tier Architecture is easily
understandable as well as simple because of only
two components.
Disadvantages of 2-Tier Architecture:
Limited Scalability: As the number of users
increases, the system performance can slow
down because the server gets overloaded with
too many requests.
Security Issues: Clients connect directly to the
database, which can make the system more
vulnerable to attacks or data leaks.
Tight Coupling: The client and the server are
closely linked. If the database changes, the client
application often needs to be updated too.
Difficult Maintenance: Managing updates, fixing
bugs, or adding features becomes harder when
the number of users or systems increases.
3-Tier Architecture:
In 3-Tier Architecture, there is another layer between the client
and the server. The client does not directly communicate with
the server. Instead, it interacts with an application server,
which in turn communicates with the database system, where
the query processing and transaction management take
place.
This intermediate layer acts as a medium for the exchange of
partially processed data between the server and the client.
This type of architecture is used in the case of large web
applications.
Example: E-commerce Store
User: You visit an online store, search for a product and add it to
your cart.
Processing: The system checks if the product is in stock,
calculates the total price and applies any discounts.
Database: The product details, your cart and order history are
stored in the database for future reference.
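The e-commerce example can be sketched as three layers of Python functions.
This is only an illustration in a single process, assuming Python's built-in
sqlite3 module as the database; in a real 3-tier system each tier would
usually run on a separate machine. All function names, the product, the
price, and the discount are hypothetical.

```python
import sqlite3


# Data tier: stores product details and answers queries.
def create_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (name TEXT PRIMARY KEY, price INTEGER, stock INTEGER)"
    )
    conn.execute("INSERT INTO product VALUES ('notebook', 50, 10)")
    conn.commit()
    return conn


def fetch_product(conn, name):
    return conn.execute(
        "SELECT price, stock FROM product WHERE name = ?", (name,)
    ).fetchone()


# Application tier: checks stock, calculates the total, applies a discount.
def quote_order(conn, name, quantity, discount_percent=0):
    product = fetch_product(conn, name)
    if product is None:
        return None
    price, stock = product
    if quantity > stock:
        return None
    return price * quantity * (100 - discount_percent) / 100


# Presentation tier: talks only to the application tier, never to the database.
def show_quote(conn, name, quantity):
    total = quote_order(conn, name, quantity, discount_percent=10)
    return "Out of stock" if total is None else f"Total: {total:.2f}"


conn = create_database()
message = show_quote(conn, "notebook", 2)
print(message)
```

Note that show_quote contains no SQL: the client layer depends only on the
application layer, which is the separation the 3-tier design aims for.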
Advantages of 3-Tier Architecture:
Enhanced scalability: Scalability is enhanced
due to the distributed deployment of application
servers. Now, individual connections need not be
made between the client and server.
Data Integrity: 3-Tier Architecture maintains Data
Integrity. Since there is a middle layer between
the client and the server, data corruption can be
avoided/removed.
Security: 3-Tier Architecture improves security.
This type of model prevents direct interaction of
the client with the server, thereby reducing
unauthorized access to data.
Disadvantages of 3-Tier Architecture:
More Complex: 3-Tier Architecture is more complex
in comparison to 2-Tier Architecture. Communication
points are also doubled in 3-Tier Architecture.
Difficult to Interact: Direct interaction between the
client and the database becomes difficult due to the
presence of the middle layer.
Slower Response Time: Since the request passes
through an extra layer (application server), it may
take more time to get a response compared to 2-Tier
systems.
Higher Cost: Setting up and maintaining three
separate layers (client, application server, and database)
requires more hardware, software, and skilled people.
This makes it more expensive.
Server System Architectures:
Server systems can be broadly categorized as
transaction servers and data servers.
Transaction-server systems, also called query-server
systems, provide an interface to which clients can send
requests to perform an action, in response to which they
execute the action and send back results to the client.
Usually, client machines ship transactions to the server
systems, where those transactions are executed, and
results are shipped back to clients that are in charge of
displaying the data.
Requests may be specified by using SQL, or through a
specialized application program interface.
Server System Architectures (contd.)
Data-server systems allow clients to interact with
the servers by making requests to read or update
data, in units such as files or pages.
For example, file servers provide a file-system
interface where clients can create, update, read,
and delete files.
Data servers for database systems offer much more
functionality; they support units of data—such as
pages, tuples, or objects—that are smaller than a
file. They provide indexing facilities for data, and
provide transaction facilities so that the data are
never left in an inconsistent state if a client
machine or process fails.
Classification of Database
Management Systems
Several criteria can be used to classify DBMS.
Classification Based on Data Model: relational data model,
object-oriented data model.
Classification Based on User Numbers: single-user system,
multi-user system.
Classification Based on Database Distribution: centralized system,
distributed system (homogeneous or heterogeneous).
Why is Database Design
important?
The Database Design Process
Database Design:
Database design can be generally defined as a
collection of tasks or processes that enhance
the designing, development, implementation,
and maintenance of an enterprise data
management system.
It simply means mapping of conceptual model
to implementation model.
The Database Design Process (contd.)
Database design consists of the following phases:
Phase 1: Requirements Collection and Analysis
Phase 2: Conceptual Database Design
Phase 3: Choice of a DBMS
Phase 4: Data Model Mapping (Logical
Database Design)
Phase 5: Physical Database Design
Phase 6: Database System Implementation and
Tuning
Phases of Database Design (contd.)
Requirements Collection and
Analysis
– Identifying Users
– Interacting with users to gather
requirements
– Time consuming BUT very important
Conceptual Database Design
– Produce a conceptual schema for the
database that is independent of a specific
DBMS
– Involves two parallel activities
• Conceptual Schema Design
• Transaction and Application Design
Approaches to Conceptual Schema
Design
Centralized Schema Design Approach
– Also known as one-shot approach
– Requirements of different applications and user
groups are merged into a single set of
requirements and a single schema is designed
– Time consuming, places the burden on DBA to
reconcile conflicts
View Integration Approach
– Schema is designed for each user group or
application
– These schemas are then merged into a global
conceptual schema during the view integration
phase
– More practical
Strategies for Schema Design
Top-Down Strategy
– Start with a schema containing high-level abstractions
and then apply successive top-down refinements
Strategies for Schema Design (contd.)
Bottom-Up Strategy
– Start with a schema containing basic abstractions
and then combine or add to these abstractions
Strategies for Schema Design (contd.)
Inside-out Strategy
– Start with central set of concepts and then
spread outward by considering new concepts
in the vicinity of existing ones
Mixed Strategy
– Use a combination of top-down and bottom-
up strategies
Phases of Database Design
Many factors to consider for Choice of
DBMS(Phase 3)
– Technical Factors
• Type of DBMS: Relational, object-relational, object
etc.
• Storage Structures
• Architectural options
– Economic Factors
• Acquisition, maintenance, training and operating
costs
• Database creation and conversion cost
– Organizational Factors
• Organizational philosophy
– Relational or Object Oriented
– Vendor Preference
• Familiarity of staff with the system
• Availability of vendor services
Phases of Database Design
(contd.)
Data model mapping (Phase 4): During this phase,
which is also called logical database design, we
map (or transform) the conceptual schema from
the high-level data model used in Phase 2 into the
data model of the chosen DBMS.
We can start this phase after choosing a specific type
of DBMS—for example, if we decide to use some
relational DBMS but have not yet decided on which
particular one. We call the latter system-
independent (but data model-dependent) logical
design.
Phases of Database Design
(contd.)
Physical database design (Phase 5):
During this phase, we design the
specifications for the stored database in
terms of physical storage structures,
record placement, and indexes.
This corresponds to designing the internal
schema in the terminology of the three-
level DBMS architecture.
Phases of Database Design (contd.)
Database system implementation and
tuning (Phase 6): During this phase, the database
and application programs are implemented, tested, and
eventually deployed for service.
Various transactions and applications are tested
individually and then in conjunction with each other.
This typically reveals opportunities for physical design
changes, data indexing, reorganization, and different
placement of data—an activity referred to as database
tuning.
Tuning is an ongoing activity—a part of system
maintenance that continues for the life cycle of a
database as long as the database and applications
keep evolving and performance problems are detected.