
01 Introduction RDB

The document outlines a course plan on databases, covering topics such as data management, basic concepts, and the relational data model. It emphasizes the importance of databases in modern applications and the advantages of using a Database Management System (DBMS) over traditional file management systems. The course aims to provide a comprehensive understanding of database structures, operations, and user interactions.

Uploaded by

kingofwarlol123

24/2/26

Lesson 1: Introduction to databases
Nguyễn Thị Oanh
oanhnt@[Link]
SoICT, HUST

Plan
• 1. Introduction
• 2. Data management
• 3. Basic concepts on database
• 4. Relational data model

1. Introduction
• Major research field with a long history (since the beginning of computers)
• 90% of applications use databases
• Hot jobs in startups and large corporations
• Massive industry: Oracle, IBM, Microsoft, Google, AWS

Databases: core technology that powers modern software systems

1. Introduction
• How big is our digital universe?
Source: [Link]

1. Introduction
• Data science knowledge stack
Source: [Link]

2. Data management

2. Data management
• What is data?
  ‒ Data is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted.
• What is data management?
  ‒ Data management is the practice of collecting, keeping, and using data securely, efficiently, and cost-effectively.

2. Data management
• Case study: List of students of your class
  ‒ How to collect and store it?
    • Excel? Word? ...
  ‒ How can you share it?
    • Email? Google Drive? ...
  ‒ What will happen if you want to modify its content?
  ‒ What will happen if two people edit the file simultaneously?
  ‒ If you want to share full information with teachers and less information with students, how will you do it?

2. Data management
2.1. File management system approach
2.2. Database management system approach

2.1. File management system approach
[Figure: separate files labeled Student, Enrollment, Lecturer, Class, Note, Course]

2.1. File management system approach
• Limitations
  ‒ Uncontrolled redundancy
  ‒ Inconsistent data
  ‒ Inflexibility
  ‒ Limited data sharing
  ‒ Poor enforcement of standards
  ‒ Low programmer productivity
  ‒ Excessive program maintenance
  ‒ Excessive data maintenance

2.2. Database approach
[Figure: Lecturer and Student around a DBMS; Database containing Lecturer, Enrollment, Class, Course, Note, Student, and Metadata (Catalog)]

2.2. Database approach: Key advantages
• Controlled redundancy:
  ‒ maintain data consistency & enforce integrity constraints
• Data integration:
  ‒ Database is self-contained & represents the semantics of the application. All related data is logically connected
• Data and operation sharing:
  ‒ multiple users/interfaces access the same data effectively
• Flexibility:
  ‒ data storage independence, data accessibility, reduced program maintenance
• Services & controls
  ‒ Security & privacy controls
  ‒ backup & recovery
  ‒ enforcement of standards
• Ease of application development

2.2. Database approach: Characteristics
• Self-describing
  ‒ DBMS contains a catalog (or meta-data) that stores the description (structures, constraints) of the database
  ‒ This allows the DBMS to manage multiple DBs
• Data abstraction:
  ‒ Data model is used to hide storage details
  ‒ Users interact with a conceptual view of the DB rather than dealing with raw storage mechanisms
• Sharing of data
  ‒ Support multiple views of the same data (a DB) for different users or applications.
  ‒ Allow concurrent access to a DB while maintaining data integrity and security.
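The "self-describing" and "multiple views" characteristics can be tried out with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as the DBMS, which the slides do not prescribe; the table contents and the view name student_public are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id TEXT PRIMARY KEY,"
    " first_name TEXT, last_name TEXT, address TEXT)"
)
# Sample row, for illustration only.
conn.execute(
    "INSERT INTO student VALUES ('20160001', 'Ngọc An', 'Bùi', '1 Dai Co Viet')"
)

# A second view of the same data that hides the address (hypothetical name).
conn.execute(
    "CREATE VIEW student_public AS"
    " SELECT student_id, first_name, last_name FROM student"
)

# Self-describing: SQLite keeps its catalog in the sqlite_master table.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master"
    " WHERE type IN ('table', 'view') ORDER BY name"
).fetchall()

# Column names visible through the restricted view.
public_columns = [
    col[0] for col in conn.execute("SELECT * FROM student_public").description
]
```

Here catalog lists the table and the view that the database itself knows about, and public_columns shows that users of the view never see the address attribute.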

2.2. Database approach: Characteristics
• Persistence
  ‒ store data on secondary storage → ensure data durability
• Retrieval
  ‒ support a declarative query language (e.g. SQL), allowing users to specify what data they need without detailing how to retrieve it
  ‒ procedural database programming language for more complex operations and procedural logic within the DB
• Performance
  ‒ retrieve and store data quickly using indexing and query optimization techniques
  ‒ handle large volumes of data

2.2. Database approach
• Data Abstraction: 3-tier Schema Model (ANSI-SPARC Architecture)
[Figure: END USERS; EXTERNAL LEVEL (view level): EXTERNAL VIEW 1 ... EXTERNAL VIEW n; External/Conceptual mapping; CONCEPTUAL LEVEL (logical level): CONCEPTUAL SCHEMA; Conceptual/Internal mapping; INTERNAL LEVEL (physical level): INTERNAL SCHEMA; STORED DATABASE]
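The difference between declarative and procedural retrieval can be sketched as follows, again assuming SQLite via Python's sqlite3 module. SQLite has no stored-procedure language, so the procedural side is written in plain Python as a stand-in; the rows and the index name idx_student_clazz are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id TEXT, last_name TEXT, clazz_id TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("20160002", "Hoàng", "20162101"),
        ("20160003", "Trần", "20162101"),
        ("20170001", "Nguyễn", "20172201"),
    ],
)

# Declarative: state WHAT is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT student_id FROM student WHERE clazz_id = ? ORDER BY student_id",
    ("20162101",),
).fetchall()

# Procedural: spell out HOW, scanning every row by hand.
procedural = []
for student_id, _last_name, clazz_id in conn.execute("SELECT * FROM student"):
    if clazz_id == "20162101":
        procedural.append((student_id,))
procedural.sort()

# An index is one technique a DBMS can use to answer the declarative
# query faster; the query text itself does not change.
conn.execute("CREATE INDEX idx_student_clazz ON student (clazz_id)")
```

Both approaches return the same rows, but only the declarative form leaves the DBMS free to pick an access path such as the index.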

3. Basic concepts
3.1. Data
3.2. Database
3.3. Data model vs. schema vs. instance
3.4. Database management system (DBMS)
3.5. Database environment
3.6. Database users

3.1. Data
• Definitions
  ‒ Wikipedia: Data is any sequence of one or more symbols given meaning by specific act(s) of interpretation.
  ‒ Businessdict[Link]: Information in raw or unorganized form (e.g. alphabets, numbers, or symbols) that refer to, or represent conditions, ideas, or objects. Data is limitless and present everywhere in the universe
  ‒ E.g. data about a specific student: ID, Name, Age, Gender, Address, …

3.2. Database
• Example: Course management system
• Important information:
  ‒ Program, class, student, course, teacher, ...
  ‒ Student: personal information, study progress
  ‒ Course: hours, teacher, timetable, ...
  ‒ ...
→ We need to store this information
→ Database

3.2. Database
• Definitions
  ‒ Wikipedia: Database is a shared collection of related data designed to meet the information needs of an organization
  ‒ Intro to CS: A database is a collection of data that is organized so that it can be easily accessed, managed and updated
  ‒ E.g.: course management database, sales management database, library database, …

3.2. Database
• A database is logically coherent & internally consistent
• It's designed for a specific purpose
• It provides a structured representation of the real world
  ‒ Entities (e.g., Students, Courses)
  ‒ Relationships (e.g., Tam is enrolled in C++)
[Example] A course management system
• Entities
  ‒ Students
  ‒ Courses
  ‒ Teachers
• Relationships
  ‒ Students take some courses
  ‒ Courses are given by some teachers

3.3. Model vs. Schema vs. Instance
• Data Model
  ‒ Set of concepts used to describe the structure of a database: data types, relationships, constraints, semantics, …
  ‒ Tool for data abstraction
  ‒ Composed of structures and their operators
• Schema
  ‒ The data structure that represents all relevant features of the real-world domain of interest.
• Instance
  ‒ Data itself

3.3. Model vs. Schema vs. Instance
Data Model:
    type <type_name> = record
        <field_name> : <data_type>;
        <field_name> : <data_type>;
        …
    end;

Schema:
    type student = record
        ID : string;
        fullName : string;
        Birthday : date;
        Address : string;
        Class : string;
    end;

Instance:
    ("Stud001", "Nguyen", 1/4/1983, "1 Dai Co Viet", "1F VN K50")
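The schema/instance distinction above can be mirrored in a general-purpose language. The sketch below uses a Python dataclass as a rough analogue of the record type; it is an illustration, not how a DBMS stores a schema. The date 1/4/1983 is read as day/month (1 April 1983), which is an assumption.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class Student:
    """Schema: field names and types, mirroring the student record."""
    ID: str
    fullName: str
    Birthday: date
    Address: str
    Class: str


# Instance: one concrete value that conforms to the schema.
# Assumption: 1/4/1983 is day/month, i.e. 1 April 1983.
s = Student("Stud001", "Nguyen", date(1983, 4, 1), "1 Dai Co Viet", "1F VN K50")

# The schema can be inspected without looking at any instance.
schema = [f.name for f in fields(Student)]
```

The class definition stays the same while any number of instances such as s can be created, changed, or deleted over time.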

3.4. Database Management System
• We need a tool that can:
  ‒ store information correctly
  ‒ retrieve it efficiently
→ Tools that help build and manage databases
• Software: DataBase Management System (DBMS)

3.4. Database Management System (DBMS)
• Definitions
  ‒ Wikipedia: Software that facilitates the creation and maintenance of a database
  ‒ Techtarget: The DBMS provides users and programmers with a systematic way to create, retrieve, update and manage data

3.4. Database Management System (DBMS)
• Core functions:
  ‒ Defining ~ specifying data types, structures, and constraints of the DB
  ‒ Constructing ~ creating the DB structure and populating it with data
  ‒ Manipulating ~ querying, updating, generating reports from the data

Main modules of a DBMS
[Figure: App; DBMS containing Query processing and optimization, Transaction management, Storage management; Data]
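The three core functions (defining, constructing, manipulating) map directly onto SQL statements. A minimal sketch, assuming SQLite via Python's sqlite3 module; the subject codes S01 and S02 and their names are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: declare data types, structure, and constraints.
conn.execute(
    "CREATE TABLE subject ("
    " subject_id TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " credit INTEGER CHECK (credit > 0))"
)

# Constructing: populate the structure with data (hypothetical rows).
conn.executemany(
    "INSERT INTO subject VALUES (?, ?, ?)",
    [("S01", "Databases", 3), ("S02", "Programming", 4)],
)

# Manipulating: update the data, then query it for a small report.
conn.execute("UPDATE subject SET credit = 2 WHERE subject_id = 'S01'")
total_credits = conn.execute("SELECT SUM(credit) FROM subject").fetchone()[0]
```

After the update, total_credits is the sum of the stored credit values (2 + 4).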

Example: database usage
• Student:
  ‒ List of courses offered by the "Computer Science" department
  ‒ Grade in the "Database" course?
• Teacher
  ‒ List of students in class "124432" for semester 2022.2
  ‒ Timetable
• Staff
  ‒ List of students
  ‒ Success rate for each course
→ We need software to exploit a database
→ Application

Basic concepts
[Figure: Database System = Application + Database Management System (DBMS) + Database]

3.5. Database Environment
• A database environment (database system) is a collective system of components that regulates the management, the use of data, and the data itself
  ‒ Hardware
  ‒ Software
  ‒ Data
  ‒ Users
  ‒ Procedures/Manuals
[Figure: Application (use and control the content); DBMS (enable the database to be developed); DB]

3.6. Database Users
• Database administrators
  ‒ authorize access to the database
  ‒ coordinate and monitor its use
  ‒ acquire software and hardware resources, control their use and monitor system performance

3.6. Database Users
• Database Designers
  ‒ Define the database content, structure, constraints, and supported transactions
  ‒ Communicate with end users to understand their requirements
• End-users
  ‒ Use the database for queries and reports; some may update the data
  ‒ Types of end-users:
    • Casual end users (use the database occasionally, e.g. a manager)
    • Naive users (regularly interact with the database; users of the predefined applications/interfaces)
    • Sophisticated end users (data scientists, engineers)

Summary
• Overview
  ‒ Course overview
  ‒ Course objective
  ‒ Motivation for studying databases
• Data management
  ‒ File management system approach
  ‒ Database management system approach
• Basic concepts
  ‒ Data
  ‒ Database
  ‒ Data model vs. schema vs. instance
  ‒ Database management system (DBMS)
  ‒ Database environment
  ‒ Database users

4. Relational data model
4.1. Introduction
4.2. Database Basic concepts
4.3. Constraints
4.4. An example

4.1. Introduction
• Some data models:
  ‒ Hierarchical database model
  ‒ Network model
  ‒ Object-oriented database model
  ‒ Relational model
  ‒ Entity-relationship model
  ‒ Document model
  ‒ …

4.1. Introduction
• Relational data model:
  ‒ Is a very simple model, first introduced by Ted Codd of IBM Research in 1970
  ‒ Used by most commercial database systems
  ‒ Queried with high-level languages
  ‒ Efficient implementations
  ‒ Based on mathematical theory and close to file structures and data structures, so there are three sets of terminology:

Relation  | Table  | File
Tuple     | Row    | Record
Attribute | Column | Field

4.2. Basic concepts
• Relations: are saved in the format of tables, which have rows and columns. The uppercase letters Q, R, S, ... denote relation names.
• Relation instance/state: the actual contents at a given point in time. The lowercase letters q, r, s denote relation states.
• Database: a set of named relations (or tables)

clazz
clazz_id | name         | lecturer_id | monitor_id
20162101 | CNTT1.01-K61 | 02001       | 20160003
20162102 | CNTT1.02-K61 |             |
20172201 | CNTT2.01-K62 | 02002       | 20170001
20172202 | CNTT2.02-K62 |             |

student
student_id | first_name | last_name | … | clazz_id
20160001   | Ngọc An    | Bùi       |   |
20160002   | Anh        | Hoàng     |   | 20162101
20160003   | Thu Hồng   | Trần      |   | 20162101
20160004   | Minh Anh   | Nguyễn    |   | 20162101
20170001   | Nhật Ánh   | Nguyễn    |   | 20172201

4.2. Basic concepts: a simple database
[Figure: tables student, subject, and enrollment, with the primary key and the two foreign keys of enrollment marked]

4.2. Basic concepts
• Tuple
  ‒ A single row of a table, which contains a single record for that relation.
  ‒ The lowercase letters t, u, v denote tuples.
• Cardinality
  ‒ Is the number of tuples in a relation.
• Degree (arity)
  ‒ Is the number of attributes in a relation.
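Cardinality and degree can be read off any relation instance. The sketch below loads the clazz table from the slides into SQLite (an assumed choice of DBMS) and counts rows and columns; the blank cells of the slide are stored as NULL, which is an assumption about what the blanks mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clazz (clazz_id TEXT, name TEXT, lecturer_id TEXT, monitor_id TEXT)"
)
conn.executemany(
    "INSERT INTO clazz VALUES (?, ?, ?, ?)",
    [
        ("20162101", "CNTT1.01-K61", "02001", "20160003"),
        ("20162102", "CNTT1.02-K61", None, None),  # blank cells assumed NULL
        ("20172201", "CNTT2.01-K62", "02002", "20170001"),
        ("20172202", "CNTT2.02-K62", None, None),
    ],
)

cursor = conn.execute("SELECT * FROM clazz")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
```

For this instance both values happen to be 4. The degree is fixed by the schema, while the cardinality changes whenever tuples are inserted or deleted.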

4.2. Basic concepts
• An example
  ‒ student(student_id, first_name, last_name, dob, gender, address, note, clazz_id)
[Figure: the student table annotated with Relation / table name; Attributes / Fields / Columns; Tuples / Rows / Records; Cardinality = 6; Degree = 8]

4.2. Basic concepts
• Relational schema: structural description of the relations in a database.
  ‒ A relation schema R of degree n, denoted by R(A1, A2, ..., An), is made up of a relation name R and a list of attributes A1, A2, ..., An
    student(student_id, first_name, last_name, dob, gender, address, note, clazz_id)
  ‒ Each attribute Ai takes its values from the domain Di of Ai, denoted by dom(Ai)
    DOM(gender) = {'Female', 'Male'}

4.2. Basic concepts
• Relational schema: structural description of the relations in a database.
  ‒ An n-tuple t in a relation r(R) is denoted by t = <v1, v2, ..., vn>, where vi is the value corresponding to attribute Ai
    t = <'20220101', 'Hoai An', 'Vu', '2003-12-04', 'M', 'Hai Bà Trưng, Hà nội', '', '20220101'>
  ‒ Both t[Ai] and t.Ai (and sometimes t[i]) refer to the value vi in t for attribute Ai
    E.g.: t.student_id = '20220101'

4.3. Constraints
4.3.1. Introduction
4.3.2. Types of constraints
4.3.3. An example

4.3.1. Introduction
• Every relation has some conditions that must hold for it to be a valid relation
• These conditions are called Relational Integrity Constraints
• They provide a way of ensuring that changes made to the database by authorized users do not result in a loss of data consistency.

4.3.2. Types of constraints
• Key constraints
• Domain constraints
• Referential integrity constraints

4.3.2. Types of constraints
• Key constraints
  ‒ A key is an attribute or a set of attributes in the relation, which can identify a tuple uniquely.
  ‒ Key constraints force that:
    • in a relation with a key, no two tuples can have identical values for key attributes.
    • a key cannot have NULL values.
  ‒ Key constraints are also referred to as Entity Constraints.
student(student_id, first_name, last_name, dob, gender, address, note, clazz_id)
Key = {student_id, first_name}
Key = {student_id}

Some types of key
• Superkey / Key: An attribute, or a set of attributes, that uniquely identifies a tuple within a relation
  ‒ E.g.: student(student_id, first_name, last_name, dob, gender, address, note, clazz_id, citizen_id)
Super key = {student_id, first_name}
Super key = {student_id}
Super key = {student_id, first_name, last_name}
Super key = {student_id, first_name, last_name, dob}
Super key = {student_id, first_name, last_name, dob, gender, address, note, clazz_id, citizen_id}
Super key = {citizen_id}

Some types of key
• Candidate Key / Minimal key: Superkey (K) such that no proper subset is a superkey within the relation
  ‒ In each tuple of the relation, values of K uniquely identify that tuple (uniqueness)
  ‒ No proper subset of K has the uniqueness property (irreducibility)
  ‒ A minimal set of attributes that can be used to identify a single tuple (called Minimal key)
Candidate key = {student_id}
Candidate key = {citizen_id}

Some types of key
• Primary Key: Candidate key selected to identify tuples uniquely within a relation.
  ‒ "Good" candidate key
    Primary key = {student_id}
  ‒ Each key attribute of the primary key has its name underlined.
    student(student_id, first_name, last_name, dob, gender, address, note, clazz_id, citizen_id)
• Alternate Keys: Candidate keys that are not selected to be the primary key
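The superkey and candidate key definitions can be checked mechanically against one relation instance. The sketch below is plain Python over made-up rows; the helper names is_superkey and candidate_keys are hypothetical. Note the limitation: a key is a property of the schema (it must hold for every valid instance), so a single instance can only show that a set of attributes is not a key; it cannot prove that it is one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal superkeys of this particular instance."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found is not minimal: skip it.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Made-up rows: two students share a first name.
rows = [
    {"student_id": "S1", "first_name": "An", "citizen_id": "C1"},
    {"student_id": "S2", "first_name": "Binh", "citizen_id": "C2"},
    {"student_id": "S3", "first_name": "An", "citizen_id": "C3"},
]
keys = candidate_keys(rows, ["student_id", "first_name", "citizen_id"])
```

On these rows, {student_id, first_name} is a superkey but not a candidate key, because its proper subset {student_id} is already unique.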

4.3.2. Types of constraints
• Domain constraints:
  ‒ Attributes have specific values in real-world scenarios. Every attribute is bound to have a specific range of values.
  ‒ Within each tuple, the value of each attribute A must be an atomic value from the domain DOM(A).
  ‒ The data types associated with domains:
    • standard numeric data types for integers (short integer, integer, and long integer) and real numbers (float, double precision float). E.g. small integer (2 bytes): -32768 to +32767
    • characters, Booleans, fixed-length strings, variable-length strings, date, time, timestamp, money, or other special data types.
    • a subrange of values from a data type: e.g. age > 0; grade >= 0 and grade <= 10
    • an enumerated data type in which all possible values are explicitly listed: e.g. gender in {'F', 'M'}

4.3.2. Types of constraints
• Domain constraints
  ‒ NULL value
    • represents a value for an attribute that is currently unknown / undefined or not applicable for a tuple;
    • deals with incomplete or exceptional data;
    • represents the absence of a value and is not the same as zero or spaces
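Enumerated domains, subranges, and NULL can be expressed with column constraints. A minimal sketch, assuming SQLite via Python's sqlite3 module; the reduced student table (with an age column that is not in the slides' schema) is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id TEXT PRIMARY KEY,"
    " gender TEXT CHECK (gender IN ('F', 'M')),"  # enumerated domain
    " age INTEGER CHECK (age > 0),"               # subrange of the integers
    " note TEXT)"                                 # no constraint: may be NULL
)

conn.execute("INSERT INTO student VALUES ('20220101', 'M', 19, NULL)")

try:
    # 'X' is outside DOM(gender), so the DBMS refuses the tuple.
    conn.execute("INSERT INTO student VALUES ('20220102', 'X', 19, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# NULL is the absence of a value: comparing it with '' yields NULL, not true.
null_equals_empty = conn.execute(
    "SELECT note = '' FROM student WHERE student_id = '20220101'"
).fetchone()[0]
note_is_null = conn.execute(
    "SELECT note IS NULL FROM student WHERE student_id = '20220101'"
).fetchone()[0]
```

The comparison note = '' comes back as NULL (None in Python), which is why SQL provides the separate IS NULL test.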

4.3.2. Types of constraints
• Referential integrity constraints
  ‒ Referential integrity constraints work on the concept of foreign keys. A foreign key is an attribute (or set of attributes) of one relation that refers to a key of another relation or of the same relation.
  ‒ A referential integrity constraint states that if a relation refers to a key attribute of a different or the same relation, then that key value must exist.

4.3.2. Types of constraints
• Foreign key:
  ‒ Attribute, or set of attributes, within one relation that matches a candidate key of some relation
  ‒ Used to model relationships between relations
  ‒ Each attribute of a foreign key has its name in italics
clazz(clazz_id, name, lecturer_id, monitor_id)
student(student_id, first_name, last_name, dob, gender, address, note, clazz_id)

clazz
clazz_id | name         | lecturer_id | monitor_id
20162101 | CNTT1.01-K61 | 02001       | 20160003
20162102 | CNTT1.02-K61 |             |
20172201 | CNTT2.01-K62 | 02002       | 20170001
20172202 | CNTT2.02-K62 |             |

student
student_id | first_name | last_name | … | clazz_id
20160001   | Ngọc An    | Bùi       |   |
20160002   | Anh        | Hoàng     |   | 20162101
20160003   | Thu Hồng   | Trần      |   | 20162101
20160004   | Minh Anh   | Nguyễn    |   | 20162101
20170001   | Nhật Ánh   | Nguyễn    |   | 20172201
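Referential integrity can be observed by letting a DBMS enforce a foreign key. The sketch below assumes SQLite via Python's sqlite3 module, where enforcement must be switched on per connection with PRAGMA foreign_keys. The tables are reduced versions of clazz and student, and the rejected row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE clazz (clazz_id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE student ("
    " student_id TEXT PRIMARY KEY,"
    " last_name TEXT,"
    " clazz_id TEXT REFERENCES clazz (clazz_id))"
)
conn.execute("INSERT INTO clazz VALUES ('20162101', 'CNTT1.01-K61')")

# Allowed: the referenced clazz tuple exists.
conn.execute("INSERT INTO student VALUES ('20160002', 'Hoàng', '20162101')")
# Allowed: a NULL foreign key refers to nothing.
conn.execute("INSERT INTO student VALUES ('20160001', 'Bùi', NULL)")

try:
    # Rejected: no clazz tuple has this clazz_id (made-up value).
    conn.execute("INSERT INTO student VALUES ('20160009', 'Test', '99999999')")
    violation = False
except sqlite3.IntegrityError:
    violation = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Only the two valid tuples are stored, so every non-NULL clazz_id in student still matches a tuple in clazz.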

4.4. An example
clazz(clazz_id, name, lecturer_id, monitor_id)
student(student_id, first_name, last_name, dob, gender, address, note, clazz_id)
subject(subject_id, name, credit, percentage_final_exam)
enrollment(student_id, subject_id, semester, midterm_score, final_score)

4.4. An example
[Figure: tables student, subject, and enrollment, with the primary key and the two foreign keys of enrollment marked]
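The four relation schemas of this example can be written as table definitions. The sketch below assumes SQLite via Python's sqlite3 module. The column types and the composite primary key of enrollment (student_id, subject_id, semester) are assumptions, since the extracted slides do not show which attributes are underlined; the inserted rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE clazz (
        clazz_id TEXT PRIMARY KEY, name TEXT, lecturer_id TEXT, monitor_id TEXT
    );
    CREATE TABLE student (
        student_id TEXT PRIMARY KEY, first_name TEXT, last_name TEXT, dob TEXT,
        gender TEXT, address TEXT, note TEXT,
        clazz_id TEXT REFERENCES clazz (clazz_id)
    );
    CREATE TABLE subject (
        subject_id TEXT PRIMARY KEY, name TEXT, credit INTEGER,
        percentage_final_exam INTEGER
    );
    CREATE TABLE enrollment (
        student_id TEXT REFERENCES student (student_id),
        subject_id TEXT REFERENCES subject (subject_id),
        semester TEXT, midterm_score REAL, final_score REAL,
        -- Assumed composite key; the slides do not state it.
        PRIMARY KEY (student_id, subject_id, semester)
    );
    """
)

# Made-up sample data.
conn.execute("INSERT INTO student (student_id) VALUES ('20160002')")
conn.execute("INSERT INTO subject (subject_id) VALUES ('S01')")
conn.execute("INSERT INTO enrollment VALUES ('20160002', 'S01', '20171', 8.0, 9.0)")

try:
    # Same (student_id, subject_id, semester): violates the key constraint.
    conn.execute("INSERT INTO enrollment VALUES ('20160002', 'S01', '20171', 5.0, 5.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrollment_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```

With this choice of key a student can retake a subject in another semester, but cannot be enrolled twice in the same subject within one semester.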

Summary
• Relational data model
  ‒ Relations, relation instance/state, relation schema
  ‒ Database, tuple
  ‒ Cardinality, degree
• Constraints
  ‒ Key constraints
  ‒ Domain constraints
  ‒ Referential integrity constraints

Learning objectives
• Upon completion of this lesson, students will be able to:
  ‒ Recall the concepts of database, DBMS, data model, file system.
  ‒ Identify the characteristics of the database and file system approaches in data management
  ‒ Recall some basic concepts of the relational data model.
  ‒ Show some constraints of the relational data model.