Database Management System Overview

The document provides a comprehensive overview of Database Management Systems (DBMS), explaining key concepts such as data, information, algorithms, and various database models including hierarchical, network, and relational models. It discusses the evolution of DBMS, its components, advantages, challenges, and emerging trends, emphasizing the importance of DBMS in modern computing for efficient data management. The document serves as an essential guide for understanding the fundamentals of databases and their applications in various fields.

Uploaded by

Keshav Kumar

17YCS402 – Database Management System

1. Data: Raw facts and figures without meaning.
Example: 101, "John", 25.
2. Information: Processed data that has meaning.
Example: "John is 25 years old."
3. Instruction: A command given to a computer to perform an operation.
Example: ADD A, B (adds values of A and B).
4. Problem: A task or challenge that needs to be solved using computing techniques.
Example: Finding the largest number in a list.
5. Solution: The method or process used to solve a problem.
Example: Using sorting to find the largest number.
6. Algorithm: A step-by-step procedure to solve a problem.
Example: Steps to sort a list in ascending order.
7. Flowchart: A graphical representation of an algorithm using symbols.
Example: A flowchart depicting the steps to calculate the area of a rectangle.
8. Pseudocode: A human-readable representation of an algorithm using simple
language.

BEGIN
INPUT A, B
SUM = A + B
PRINT SUM
END

9. Program: A set of instructions written in a programming language to perform a
task.
Example: A Python script to add two numbers.
10. Hardware: The physical components of a computer.
Example: CPU, RAM, Keyboard.
11. Software: A set of programs that run on a computer.
Example: Microsoft Word.
12. Application Software: Software designed for end users to perform specific tasks.
Example: Web browsers like Google Chrome.
13. System Software: Software that manages hardware and system operations.
Example: Operating Systems like Windows or Linux.
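
As an illustration of items 8 and 9, the pseudocode above could be written as a short Python program. This is a minimal sketch and not part of the original notes: the function names add and largest are chosen here for illustration, and largest solves the example problem from item 4 with a single scan instead of a sort.

```python
def add(a, b):
    """Return the sum of two numbers (the SUM = A + B step of the pseudocode)."""
    return a + b


def largest(numbers):
    """Return the largest value by scanning the list once."""
    if not numbers:
        raise ValueError("list must not be empty")
    best = numbers[0]
    for value in numbers[1:]:
        if value > best:
            best = value
    return best


print(add(2, 3))             # the PRINT SUM step
print(largest([7, 42, 19]))
```

Here add(2, 3) gives 5 and largest([7, 42, 19]) gives 42.
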
Module 1: Introduction to Database

1.1. Understanding Data and Its Importance

• Data is considered the new oil in the modern digital era, driving decision-making,
automation, and intelligent systems.
• Organizations generate, store, and analyze vast amounts of data to improve efficiency,
enhance customer experience, and gain a competitive edge.
• However, raw data is meaningless without proper structuring and management. This
necessity has led to the evolution of Database Management Systems (DBMS).

1.2. What is a Database?

• A database is an organized collection of related data that enables efficient
information retrieval, insertion, updating, and deletion.
• Instead of storing data in scattered files and documents, a database organizes it
systematically, allowing users to access and manipulate it easily.

Example:

• A university database may contain tables for students, courses, faculty, and
grades, where each table stores structured data meaningfully.

1.3. What is a Database Management System (DBMS)?

• A Database Management System (DBMS) is a software system that allows users to
create, manage, and manipulate databases efficiently.
• It serves as an interface between the user and the database, ensuring data is stored
securely and retrieved accurately.

Key Functions of DBMS:

• Data Storage, Retrieval, and Manipulation
• Data Security and Integrity Maintenance
• Concurrency Control for Multi-user Access
• Data Backup and Recovery
• Enforcement of Data Constraints and Rules

1.4. Evolution of DBMS

• Database management has evolved from simple file-based systems to sophisticated
cloud-based and distributed databases.

Historical Development:
1. File-Based Systems (Pre-1960s) – Data stored in flat files with no efficient retrieval
methods.
2. Hierarchical and Network Models (1960s-1970s) – Data organized in tree or
graph structures.
3. Relational Databases (1980s-Present) – Data stored in tables, supporting SQL
queries.
4. Object-Oriented & NoSQL Databases (1990s-Present) – Handling complex and
unstructured data.
5. Big Data & Cloud Databases (2010s-Present) – Scalable, high-performance
solutions for large datasets.

1.5. Components of DBMS

• A DBMS consists of several components that work together to manage and process
data efficiently.

Major Components:

• Database Engine: The core part of the DBMS, responsible for storing and retrieving
data.
• Database Schema: Defines the logical structure of the database.
• Query Processor: Interprets and executes database queries.
• Transaction Management System: Ensures atomicity, consistency, isolation, and
durability of transactions.
• Security and Authorization Module: Controls user access and maintains security.
• Backup and Recovery Module: Ensures data protection against failures.

1.6. Types of Database Models

• DBMS supports different data models to structure and represent data logically.

1. Hierarchical Model – Organizes data in a tree-like structure (e.g., IBM's IMS).
2. Network Model – Uses a graph structure allowing many-to-many relationships.
3. Relational Model (RDBMS) – Stores data in tables with predefined relationships
(e.g., MySQL, PostgreSQL).
4. Object-Oriented Model – Stores data as objects, supporting inheritance and
encapsulation.
5. NoSQL Model – Handles unstructured and semi-structured data, suited to big data
applications (e.g., MongoDB, Cassandra).

1.7. Database Languages

DBMS supports specialized languages for interacting with the database.

• DDL (Data Definition Language) – Defines database structure (CREATE, ALTER,
DROP).
• DML (Data Manipulation Language) – Manages data (INSERT, UPDATE, DELETE).
• DCL (Data Control Language) – Controls access (GRANT, REVOKE).
• TCL (Transaction Control Language) – Manages transactions (COMMIT,
ROLLBACK).

Example of SQL Query:

SELECT Name, age FROM Students WHERE age > 20;
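
The statement categories above can be tried out with Python's built-in sqlite3 module. The following is a sketch under stated assumptions: the Students columns and the sample rows are invented for illustration, and SQLite is used only because it runs without a server. SQLite has no user accounts, so DCL statements (GRANT, REVOKE) are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: define the structure (column names are assumptions).
cur.execute(
    "CREATE TABLE Students ("
    "Student_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Age INTEGER)"
)

# DML: insert hypothetical sample rows.
cur.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(1, "Alice", 22), (2, "Bob", 19), (3, "Carol", 25)],
)
conn.commit()  # TCL: make the inserts permanent

# DML followed by TCL: this delete is undone by the rollback.
cur.execute("DELETE FROM Students WHERE Age > 20")
conn.rollback()

# The example query from the text.
rows = cur.execute("SELECT Name, age FROM Students WHERE age > 20;").fetchall()
print(sorted(rows))
```

Because the DELETE is rolled back, the query still returns the rows for Alice and Carol.
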

1.8. Advantages of Using DBMS

Using a DBMS offers several benefits over traditional file-based storage.

• Data Integrity and Accuracy – Ensures data consistency through constraints.
• Reduced Data Redundancy – Minimizes duplication via normalization.
• Concurrent Access – Multiple users can access data simultaneously.
• Efficient Query Processing – Provides optimized retrieval of information.
• Scalability and Security – Supports large-scale data with access controls.

1.9. Challenges in Database Management

Despite its advantages, managing databases comes with specific challenges:

• Performance Bottlenecks – Large datasets require efficient indexing and
optimization.
• Security Threats – Databases are vulnerable to cyber-attacks and unauthorized
access.
• Complexity of Design – Poor schema design can lead to inefficiencies.
• Scalability Issues – Traditional databases struggle with rapid data growth.

1.10. Emerging Trends in DBMS

With advancements in technology, databases are continuously evolving.

• Cloud Databases (e.g., AWS RDS, Google Cloud Spanner) – Storing and managing
data on cloud platforms.
• Big Data & NoSQL (e.g., Hadoop, MongoDB) – Handling large-scale unstructured
data.
• Blockchain Databases – Secure, tamper-evident ledger systems.
• AI & Machine Learning Integration – Automating query optimization and
predictions.

A Database Management System (DBMS) is a critical component of modern computing,
enabling organizations to store, manage, and analyze data efficiently. DBMS continues to
evolve from relational models to cutting-edge NoSQL and cloud solutions, shaping how data
is handled in business, healthcare, finance, and various other fields. Understanding DBMS
fundamentals is essential for any aspiring data professional, software engineer, or IT
specialist, as databases remain the backbone of today's digital world.

1.11. Importance of DBMS

• Data Organization: Efficient storage and retrieval.
• Data Security: Protection against unauthorized access.
• Data Integrity: Ensures accuracy and consistency.
• Concurrency Control: Allows multiple users to access data simultaneously.
• Scalability: Handles growing data requirements.
2. Hierarchical Database Model

• The Hierarchical Database Model is one of the earliest database models, which
organizes data in a tree-like structure.
• It represents data in a hierarchy consisting of parent-child relationships, where
each parent can have multiple children, but each child has only one parent.
• This model was widely used in early database management systems such as IBM's
IMS (Information Management System).

Key Characteristics:

• Data is structured in a tree format.
• Relationships follow a one-to-many (1:M) structure.
• It uses pointers or links to establish relationships between records.
• Fast access and retrieval due to predefined relationships.
• Complex and rigid structure, requiring predefined relationships.

2.1. Architecture of the Hierarchical Database Model

The Hierarchical Database Architecture consists of various components, including:

2.1.1 Data Organization

Data is organized hierarchically, similar to a tree structure, where:

• The root node represents the topmost entity in the hierarchy.
• Each node represents a record, and links between nodes define relationships.
• Parent-child relationships are established using pointers.
• Each parent node can have multiple child nodes, but each child node has only one parent.

Example Hierarchy:

Consider a database for a university where:

University
├── Faculty
│ ├── Departments
│ │ ├── Courses
│ │ │ ├── Students

In this structure:

• The University is the root node.
• Faculty is a child of the University.
• Departments belong to a Faculty.
• Courses are associated with Departments.
• Students enroll in Courses.

2.1.2 Schema Representation

The schema in a hierarchical database is defined using a tree structure where each entity
type represents a node.

Example Schema Representation:

Root (University)
├── Faculty (FID, FName)
│ ├── Department (DID, DName, FID)
│ │ ├── Course (CID, CName, DID)
│ │ │ ├── Student (SID, SName, CID)

Each child node contains a reference to its parent, shown above as a key field such as FID or
DID; hierarchical systems typically implement these references as pointers.

2.1.3 Data Storage and Access Mechanism

Hierarchical databases use pointers to establish relationships. Data is stored using
sequential and indexed file structures.

• Accessing data follows a top-down approach.
• Navigation requires traversing the hierarchy from root to child nodes.
• Searching for a record requires following parent-child links, making random access
challenging.
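
The top-down access described above can be sketched in Python. This is an illustrative model only, not how IMS or any real hierarchical DBMS is implemented: the Node class, the find_path function, and the record names are all assumptions made for the example. Each node keeps one parent pointer and a list of child pointers, and a search starts at the root and follows child links.

```python
class Node:
    """A record with a single parent pointer and any number of child pointers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def find_path(root, target):
    """Search top-down from the root; return the names on the path, or None."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        path = find_path(child, target)
        if path is not None:
            return [root.name] + path
    return None


# Hypothetical records following the University example.
university = Node("University")
faculty = Node("Faculty of Engineering", university)
department = Node("Computer Science", faculty)
course = Node("Database Management System", department)
student = Node("Student S001", course)

print(find_path(university, "Student S001"))
```

Reaching the student record means visiting every level above it, which is why access that does not start from the root is awkward in this model.
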

2.1.4 Record Representation in Storage

Records in a hierarchical database are stored in files and linked using pointers.

Example: Employee Database Record Representation

Company (Root)
├── Department
│ ├── Employee
│ │ ├── Project

Each entity has a unique identifier, and pointers establish relationships between records.

2.2. Advantages of the Hierarchical Database Model

1. Efficient Data Retrieval: Predefined relationships enable fast queries.
2. Logical Parent-Child Relationship: Data is naturally categorized.
3. Integrity and Security: The hierarchy ensures referential integrity.
4. Fast Transaction Processing: Suitable for high-speed operations (e.g., banking,
airline reservation systems).
5. Reduced Redundancy: Data is stored in a structured, non-repetitive format.

2.3. Disadvantages of the Hierarchical Database Model

1. Complexity in Relationship Management: Rigid structure; complex to modify
relationships.
2. Redundant Data Storage: Data replication occurs if multiple parents are required.
3. Complex Data Modification: Updating or deleting records requires changes across
multiple nodes.
4. Limited Query Flexibility: Queries must follow predefined paths; cross-hierarchical
access is challenging.

2.4. Applications of Hierarchical Databases

Hierarchical databases are still used in systems requiring structured and high-speed data
access:

• Banking and Financial Systems
• Airline Reservation Systems
• Telecommunications Networks
• Manufacturing and Inventory Control

The Hierarchical Database Model is crucial in database management, particularly in
structured environments requiring fast access and data integrity. Despite its limitations, it
remains relevant in legacy systems and specific applications like IMS databases in banking
and mainframe systems.

• Organized in a tree-like structure.
• One-to-many relationships (1:M).
• Efficient for structured data access.
• Limited flexibility compared to relational and NoSQL databases.
• It is still in use for mission-critical applications requiring high-speed processing.

This model laid the foundation for modern database structures and remains an essential
concept in database management systems (DBMS).
3. Network Database Model

The Network Database Model is a database architecture in which a record can be linked to
multiple parent and multiple child records. Unlike the Hierarchical Model, which follows
a strict tree structure, the Network Model supports more complex, graph-like structures
with many-to-many relationships.

Definition

The Network Database Model organizes data using records and relationships, where a
record represents an entity, and relationships (also called sets) define the associations
between records.

Key Characteristics

• Records and Sets: Data is stored in records (comparable to rows, grouped into record
types) and linked through sets (relationships).
• Many-to-Many Relationships: Unlike the hierarchical model, a child can have
multiple parents.
• Pointer-Based Navigation: Data retrieval follows pointers from one record to
another.
• Improved Data Access: It allows flexible and faster navigation through data.
• Standardized Model: Formalized by CODASYL (Conference on Data Systems
Languages).

3.1. Architecture of Network Database Model

The architecture of a Network Database Model consists of several key components:

3.1.1 Components of the Network Model

1. Records – Equivalent to entities in an ER diagram.
2. Sets – Relationships that link records together.
3. Owner (Parent) and Member (Child) Relationships – Defines hierarchical
relationships but allows multiple parents for a single child.
4. Pointers/Links – Establish direct access paths for navigation.

3.1.2 Structure of the Network Model

A network database is typically represented using graphs where:

• Nodes represent records (entities).
• Edges represent relationships (sets).

Example: Consider a university database where a student can enroll in multiple courses, and
a course can have multiple students.

• Student (Record) → Enrollment (Set) → Course (Record)
• A student can be enrolled in multiple courses.
• A course can have multiple students.
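
The many-to-many enrollment example can be modeled with a small Python sketch. It is a simplification made for illustration: the NetworkDB class, its attribute names, and the sample students and courses are assumptions, and Python sets and dictionaries stand in for the pointer chains a real CODASYL system would maintain.

```python
from collections import defaultdict


class NetworkDB:
    """Toy many-to-many store: links can be followed in both directions."""

    def __init__(self):
        self.courses_of = defaultdict(set)   # student -> courses enrolled in
        self.students_of = defaultdict(set)  # course -> students enrolled

    def enroll(self, student, course):
        # Record the link on both sides so either record can be the start point.
        self.courses_of[student].add(course)
        self.students_of[course].add(student)


db = NetworkDB()
db.enroll("Asha", "DBMS")
db.enroll("Asha", "Networks")
db.enroll("Ravi", "DBMS")

print(sorted(db.courses_of["Asha"]))   # courses for one student
print(sorted(db.students_of["DBMS"]))  # students for one course
```

Unlike the tree in the hierarchical model, the course "DBMS" is reachable from two different students, so a child-like record effectively has more than one parent.
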

3.2. Representation of the Network Model

The network model is represented using the DBTG (Data Base Task Group) model, which
consists of three elements:

1. Schema – Defines the overall database structure.
2. Subschema – Defines user-specific views.
3. Data Manipulation Language (DML) – Used for accessing and managing data.

3.3. Advantages of the Network Model

• Efficient Data Access: Direct pointer-based access can make navigation along
predefined paths faster than join-based access in relational systems.
• Supports Complex Relationships: Many-to-many relationships can be effectively
represented.
• Data Integrity and Consistency: Relationships are explicitly defined, preventing
orphaned records.
• Flexible Data Representation: Data relationships are more adaptable than
hierarchical models.

3.4. Disadvantages of the Network Model

• Complex Implementation: Requires significant effort in database design and
maintenance.
• Complicated Query Processing: Querying data requires procedural navigation
rather than declarative SQL.
• Pointer-Based Navigation Issues: Changes in structure require updates to pointer
connections.
• Not Widely Used Today: Replaced by relational and NoSQL databases due to
usability concerns.

3.5. Comparison with Other Models


Feature | Network Model | Hierarchical Model | Relational Model
Data Structure | Graph-based | Tree-based | Table-based
Relationships | Many-to-Many | One-to-Many | Many-to-Many
Navigation | Pointer-based | Parent-child | SQL Queries
Flexibility | High | Low | Very High
Implementation Complexity | High | Medium | Low

The Network Database Model was an essential advancement in database design, allowing
flexible and efficient data relationships. However, its complexity and maintenance
overhead led to its gradual replacement by the Relational Model. Despite this, network
databases are still used in high-performance applications, such as telecommunications
and CAD systems, where complex relationships need efficient representation.
4. Relational Database Model Architecture

4.1. Introduction to Relational Database Model

The Relational Database Model is the most widely used database model, introduced by Dr.
E.F. Codd in 1970. It organizes data into tables (relations) consisting of rows (tuples) and
columns (attributes), where each row represents a unique record, and each column
represents a field within that record.

The relational model provides a mathematical foundation for storing, retrieving, and
managing structured data using relational algebra and calculus.

4.2. Key Components of Relational Database Model

a. Relation (Table)

A relation is a two-dimensional table that stores data. Each relation must have a unique
name and consists of attributes and tuples.

• Tuple (Row): A single record in a relation.
• Attribute (Column): A specific field within a relation.
• Domain: A set of valid values an attribute can take.

b. Schema and Instance

• Schema: The logical design of a relational database, specifying tables, attributes, and
relationships.
• Instance: A snapshot of the database at a particular moment containing data in tables.

c. Keys in Relational Model

Keys help maintain uniqueness and integrity in the relational model:

• Primary Key: Uniquely identifies a tuple (e.g., Student_ID in a Student table).
• Candidate Key: A minimal set of attributes that uniquely identifies tuples (the primary
key is chosen from the candidate keys).
• Super Key: A set of attributes that uniquely identifies a tuple but may contain extra
attributes.
• Foreign Key: A reference to the primary key in another table to establish
relationships.
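
The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is illustrative only: the function names, the Email attribute, and the rows are assumptions. Keys are really declared by the database designer as a property of the schema, so sample data can only show that an attribute set is not a key; it cannot prove that one is.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal attribute sets that are unique in this sample."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not contains_smaller_key:
                keys.append(attrs)
    return keys


# Hypothetical Student rows.
students = [
    {"Student_ID": 1, "Email": "alice@example.com", "Name": "Alice", "Age": 20},
    {"Student_ID": 2, "Email": "bob@example.com", "Name": "Bob", "Age": 20},
    {"Student_ID": 3, "Email": "alice2@example.com", "Name": "Alice", "Age": 21},
    {"Student_ID": 4, "Email": "alice3@example.com", "Name": "Alice", "Age": 20},
]

print(is_superkey(students, ("Student_ID", "Name")))  # unique, but not minimal
print(candidate_keys(students, ["Student_ID", "Email", "Name", "Age"]))
```

In this sample, (Student_ID, Name) is a super key because it contains Student_ID, while the candidate keys are Student_ID and Email on their own; either could be chosen as the primary key.
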

4.3. Relational Integrity Constraints

Relational databases enforce the following constraints to maintain data consistency:

a. Entity Integrity

• Ensures that every table has a primary key.
• Primary key values must be unique and cannot be NULL.

b. Referential Integrity

• Ensures consistency between related tables using foreign keys.
• Foreign key values must match primary key values in the referenced table or be NULL.

c. Domain Integrity

• Ensures attributes contain only valid data according to their defined domain.

d. Key Constraints

• A primary key must uniquely identify each record in a table.
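
These constraints can be seen in action with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the Department and Student tables, the age range in the CHECK clause, and the helper violates are invented for the example. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE Department (Dept_ID INTEGER PRIMARY KEY, DName TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE Student (
        Student_ID INTEGER PRIMARY KEY,                  -- entity integrity / key constraint
        Name       TEXT NOT NULL,
        Age        INTEGER CHECK (Age BETWEEN 16 AND 120),  -- domain integrity (assumed range)
        Dept_ID    INTEGER REFERENCES Department(Dept_ID)   -- referential integrity
    )
    """
)
conn.execute("INSERT INTO Department VALUES (10, 'Computer Science')")
conn.execute("INSERT INTO Student VALUES (1, 'Alice', 20, 10)")


def violates(sql):
    """Run a statement and report whether the DBMS rejected it as a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


print(violates("INSERT INTO Student VALUES (1, 'Bob', 21, 10)"))      # duplicate primary key
print(violates("INSERT INTO Student VALUES (2, 'Bob', 21, 99)"))      # no such department
print(violates("INSERT INTO Student VALUES (3, 'Carol', 22, NULL)"))  # NULL foreign key is allowed
print(violates("INSERT INTO Student VALUES (4, 'Dan', 5, 10)"))       # age outside the domain
```

The first, second, and fourth inserts are rejected; the third succeeds because a foreign key may be NULL.
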

4.4. Relational Database Architecture

a. Logical Architecture

The logical architecture consists of three main components:

1. Relations (Tables): The logical storage of data.
2. Keys and Constraints: Define relationships and maintain integrity.
3. Operations (Relational Algebra & SQL): Querying and manipulating data.

b. Physical Architecture

The physical architecture deals with how data is stored and managed:

1. Storage Engine: Manages physical storage on disk.
2. Buffer Management: Optimizes access speed using caching.
3. Indexing: Improves query performance.
4. Transaction Management: Ensures ACID properties (Atomicity, Consistency,
Isolation, Durability).

4.5. Relational Operations and Query Processing

a. Relational Algebra Operations

A set of operations used to retrieve and manipulate data:

• Selection (σ): Extracts specific rows (σ condition (Table))
• Projection (π): Extracts specific columns (π attribute_list (Table))
• Union (∪): Combines tuples from two relations.
• Intersection (∩): Retrieves common tuples from two relations.
• Difference (−): Finds tuples in one relation but not another.
• Cartesian Product (×): Pairs every tuple of one relation with every tuple of the other.
• Join (⋈): Combines related tuples from two relations based on a condition.
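
Three of these operations can be sketched in plain Python, treating a relation as a list of dictionaries. This is an illustrative model rather than how a DBMS evaluates queries: the function names and sample relations are assumptions, and the join shown is a natural join (matching on attributes with the same name).

```python
def select(relation, predicate):
    """Selection (σ): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection (π): keep the named columns and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Join (⋈): combine rows that agree on all shared attribute names."""
    result = []
    for left_row in left:
        for right_row in right:
            shared = set(left_row) & set(right_row)
            if all(left_row[a] == right_row[a] for a in shared):
                result.append({**left_row, **right_row})
    return result


# Hypothetical relations.
students = [
    {"Student_ID": 1, "Name": "Alice", "Age": 20},
    {"Student_ID": 2, "Name": "Bob", "Age": 17},
]
enrollments = [
    {"Student_ID": 1, "Course": "DBMS"},
    {"Student_ID": 1, "Course": "Networks"},
    {"Student_ID": 2, "Course": "DBMS"},
]

adults = select(students, lambda row: row["Age"] > 18)
print(project(natural_join(adults, enrollments), ["Name", "Course"]))
```

The last line composes the three operations: select the students older than 18, join them with their enrollments, and project the name and course columns.
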

b. Structured Query Language (SQL)

SQL is used to interact with relational databases. Common SQL operations include:

• SELECT: Retrieves data (SELECT * FROM Students WHERE Age > 18;)
• INSERT: Adds new data (INSERT INTO Students VALUES (1, 'Alice', 20);)
• UPDATE: Modifies existing data (UPDATE Students SET Age = 21 WHERE Student_ID = 1;)
• DELETE: Removes data (DELETE FROM Students WHERE Age < 18;)
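
The four statements above can be run in sequence with Python's built-in sqlite3 module. This is a minimal sketch: the table definition and the second row (Bob) are assumptions added so that every statement has something to act on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layout matching the three values in the INSERT example.
conn.execute("CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")

conn.execute("INSERT INTO Students VALUES (1, 'Alice', 20);")
conn.execute("INSERT INTO Students VALUES (2, 'Bob', 17);")        # hypothetical extra row
conn.execute("UPDATE Students SET Age = 21 WHERE Student_ID = 1;")  # Alice is now 21
conn.execute("DELETE FROM Students WHERE Age < 18;")                # removes Bob

remaining = conn.execute("SELECT * FROM Students WHERE Age > 18;").fetchall()
print(remaining)
```

After the update and delete, the only remaining row is Alice with the new age of 21.
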

4.6. Advantages of Relational Database Model

• Data Integrity and Accuracy: Enforces constraints and reduces redundancy.
• Data Consistency: Maintains referential integrity between related tables.
• Security: Role-based access controls prevent unauthorized modifications.
• Scalability: Efficiently handles large datasets.
• Flexibility: Supports complex queries and relationships.

4.7. Disadvantages of Relational Database Model

• Complexity: Requires structured schema design.
• Performance Issues: Joins and complex queries can be slow for massive datasets.
• Scalability Limitations: Scaling out across many servers is harder than with many
NoSQL systems, and a fixed schema is a poor fit for unstructured data.

The Relational Database Model is the foundation of modern Database Management
Systems (DBMS), ensuring structured storage, efficient retrieval, and robust data integrity.
It is crucial for database professionals, software engineers, and data analysts to understand
its architecture, constraints, and operations.
5. Database System Architecture

Database system architecture refers to the overall design and structure of a database
management system (DBMS), defining how data is stored, accessed, and managed. A
well-defined architecture ensures data integrity, security, and efficiency while enabling multiple
users to interact with the database concurrently.

5.1. Components of a Database System

A database system consists of several components that work together to provide data
management functionalities:

• Hardware: Physical devices such as servers, storage disks, and network components.
• Software: The DBMS software that facilitates data access and manipulation.
• Data: The raw information stored and managed within the database.
• Users: Different types of users, such as database administrators (DBAs), developers,
and end-users.
• Procedures: The rules and instructions for database operations.

5.2. Levels of Database Architecture

Database architecture is commonly divided into three levels according to the ANSI/SPARC
model:

5.2.1. Internal Level (Physical Level)

• Defines the physical storage structure of the database.
• Describes how data is stored on disk, including indexing, file organization, and data
compression.
• Determines efficiency in terms of storage space and retrieval speed.

Example: A B-tree index structure improves search performance in large databases.
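
The effect of an index can be observed with SQLite, whose indexes are stored as B-trees. The sketch below is illustrative: the table, the generated rows, the index name idx_students_age, and the helper plan are assumptions. It uses EXPLAIN QUERY PLAN, whose exact wording differs between SQLite versions, so only the presence of the index name is checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Students (Name, Age) VALUES (?, ?)",
    [(f"Student{i}", 18 + i % 10) for i in range(1000)],  # generated sample data
)

query = "SELECT Name FROM Students WHERE Age = 20"


def plan(sql):
    """Return SQLite's description of how it would execute the statement."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)
rows_before = sorted(conn.execute(query).fetchall())

conn.execute("CREATE INDEX idx_students_age ON Students (Age)")  # physical-level change

plan_after = plan(query)
rows_after = sorted(conn.execute(query).fetchall())

print(plan_before)  # a full table scan
print(plan_after)   # a search that uses idx_students_age
```

The query text and its result are identical before and after; only the access path chosen at the internal level changes.
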

5.2.2. Conceptual Level (Logical Level)

• Provides a high-level abstraction of the entire database structure.
• Defines the relationships between different data entities without concern for how
they are physically stored.
• Ensures data consistency and security by enforcing integrity constraints.

Example: A university database may define tables for students, courses, and faculty
members with relationships among them.

5.2.3. External Level (View Level)

• Represents the way individual users perceive the database.
• Allows different users to have customized views based on their roles and permissions.
• Enhances security by restricting access to sensitive data.

Example: A professor can access student grades, while students can only view their
academic records.
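
The professor and student example can be approximated with a SQL view and a filtered query. This is a sketch under assumptions: the Grades table, its rows, the view name ProfessorView, and both helper functions are invented for illustration. SQLite has no user accounts, so the sketch only shows what each role would see; a server DBMS would additionally restrict access with privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grades (Student_ID INTEGER, Name TEXT, Course TEXT, Grade TEXT)")
conn.executemany(
    "INSERT INTO Grades VALUES (?, ?, ?, ?)",
    [(1, "Alice", "DBMS", "A"), (1, "Alice", "Networks", "B"), (2, "Bob", "DBMS", "C")],
)

# External view for the professor teaching the (hypothetical) DBMS course.
conn.execute(
    "CREATE VIEW ProfessorView AS "
    "SELECT Student_ID, Name, Grade FROM Grades WHERE Course = 'DBMS'"
)


def professor_records():
    """All grades for the professor's course."""
    return conn.execute("SELECT * FROM ProfessorView ORDER BY Student_ID").fetchall()


def student_records(student_id):
    """Only the rows belonging to one student."""
    return conn.execute(
        "SELECT Course, Grade FROM Grades WHERE Student_ID = ? ORDER BY Course",
        (student_id,),
    ).fetchall()


print(professor_records())
print(student_records(2))
```

Both roles read the same underlying table, but each sees only the portion relevant to it.
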

5.3. Database System Architectures

Database systems can be categorized based on their architecture:

5.3.1. Centralized Database Architecture

• All data resides in a single central database.
• Users access data via a network connection.
• It is easier to manage but can become a performance bottleneck.

Example: A bank maintains all customer records in a central database server.

5.3.2. Client-Server Database Architecture

• The database server processes and stores data while client applications request data.
• Reduces network congestion by allowing client-side processing.
• Used in enterprise applications.

Example: An e-commerce platform where customers place orders through a web
application connected to a database server.

5.3.3. Distributed Database Architecture

• Data is distributed across multiple sites or servers.
• Improves reliability, availability, and performance.
• Requires synchronization and consistency mechanisms.

Example: A multinational corporation has database servers in different geographical
locations.

5.3.4. Cloud-Based Database Architecture

• Databases are hosted in cloud environments, offering scalability and accessibility.
• Managed by third-party providers such as AWS, Google Cloud, or Azure.
• Reduces the need for on-premises infrastructure.

Example: A SaaS (Software as a Service) company storing user data in cloud-hosted
databases.
6. Data Abstraction

DBMS data abstraction refers to simplifying complex database structures by providing only
essential details while hiding unnecessary implementation specifics. This concept helps
manage large-scale databases efficiently and ensures that users interact with data at
different levels of abstraction.

6.1. Objectives of Data Abstraction

• To provide a clear separation between data and applications.
• To minimize the complexity of database design and interaction.
• To improve security by limiting data access at different levels.
• To enhance database performance and manageability.

6.2. Levels of Data Abstraction

Data abstraction in DBMS is typically categorized into three levels:

6.2.1. Physical Level (Internal Level)

• The lowest level of abstraction describes how data is stored in the database.
• It deals with physical storage structures, indexing mechanisms, and file organization
techniques.
• Users at this level include database administrators (DBAs) who optimize
performance and storage utilization.

Example: Storing customer records as B-trees or hash indexes in a disk-based storage
system.

6.2.2. Logical Level (Conceptual Level)

• This level defines what data is stored and the relationships among them without
specifying physical storage details.
• It provides a global view of the database and ensures data integrity.
• Database designers work at this level to define schema, tables, constraints, and
relationships.

Example: Defining a Customer table with CustomerID, Name, and Email attributes without
considering how they are physically stored.

6.2.3. View Level (External Level)

• It is the highest level of abstraction, which presents data in a user-friendly manner.
• It defines multiple views for different users based on their needs.
• Application developers and end-users interact with this level.

Example: A bank teller may access customer balances, while a loan officer can view credit
history but not account balances.

Advantages of Data Abstraction

1. Encapsulation of Data Complexity – Users do not need to understand the
underlying storage and retrieval mechanisms.
2. Data Independence – Changes at one level do not affect higher-level users.
3. Security and Privacy – Access to data can be restricted based on user roles.
4. Efficient Data Management – Enables better organization and optimization of data
retrieval processes.
5. Improved Application Development – Developers can work with simplified data
representations, reducing errors and improving efficiency.

Data independence is a crucial concept related to data abstraction. It ensures that changes
in database schema at one level do not affect other levels.

• Physical Data Independence – Changes in storage structure do not impact the
logical schema.
• Logical Data Independence – Changes in the logical schema do not affect external
views or applications.
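
Both kinds of independence can be demonstrated on a small scale with SQLite. The sketch is illustrative and rests on assumptions: the sample customers, the view name CustomerContact, the index name, and the added Phone column are all invented here. An application query written against a view keeps returning the same result after a physical change (a new index) and after a logical change (a new column in the base table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, Email TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(1, "Alice", "alice@example.com"), (2, "Bob", "bob@example.com")],
)

# External view used by the application.
conn.execute("CREATE VIEW CustomerContact AS SELECT CustomerID, Name, Email FROM Customer")
app_query = "SELECT Name, Email FROM CustomerContact ORDER BY CustomerID"

before = conn.execute(app_query).fetchall()

# Physical change: add an index. The logical schema and the query are untouched.
conn.execute("CREATE INDEX idx_customer_email ON Customer (Email)")
after_physical = conn.execute(app_query).fetchall()

# Logical change: add a column. The view shields the application from it.
conn.execute("ALTER TABLE Customer ADD COLUMN Phone TEXT")
after_logical = conn.execute(app_query).fetchall()

print(before == after_physical == after_logical)
```

The application query never changes, which is the practical benefit the two definitions above describe.
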

DBMS data abstraction provides a structured approach to database management by
segregating concerns into multiple levels. This enhances security, efficiency, and ease of
use while allowing complex databases to be handled effectively. Understanding these levels
is essential for database designers, administrators, and application developers.

• Data abstraction simplifies database interaction.
• There are three levels: Physical, Logical, and View.
• It ensures data independence and security.
• It improves database efficiency and application development.
