Machine Learning:

Supervised Learning

Department of Computer
Science
Haramaya University

2025
Topics
• Introduction to Supervised Learning

• Classification
o Decision tree
o Bayes classification
o SVM
o KNN
o ANN
• Regression-continuous value prediction

Supervised Learning
• A supervised scenario is characterized by the concept of
a teacher or supervisor, whose main task is to provide
the agent with a precise measure of its error (directly
comparable with the output values).
• The goal is to infer a function or mapping from labeled
training data.
• The training data consist of input vectors X and output
vectors Y of labels or tags.
• Based on the training set, the algorithm generalizes to
respond correctly to all possible inputs; this is called
learning from examples.

Supervised Learning
• A data set is denoted in the form D = {(x_i, y_i)},
where the inputs are x_i, the outputs are y_i, and
i = 1 … N, with N the number of observations.
• Generalization: the algorithm should produce sensible
outputs for inputs that were not encountered during
learning.
Supervised learning is categorized into two tasks:
o Classification: data is classified into one of two or more classes
o Regression: the task of predicting a continuous quantity

Classification…
Classification:
o It is a supervised learning model: the classifier already has a
set of classified examples and, from these examples, it learns
to assign new, unseen examples to a class.
• Example:
o assigning a given email to the spam or non-spam category
o eye color classification into: blue, brown, or green
• Widely used classifiers: decision tree, support vector
machine, naïve Bayes, neural network, K-nearest
neighbors, etc.

Decision Tree (DT)
• A decision tree (DT) is a statistical model that
builds classification models in the form of a
tree structure.
• This model classifies data in a dataset by
flowing through a query structure from the
root until it reaches a leaf, which
represents one class.
• The root represents the attribute that plays
the main role in classification, and each leaf
represents a class.
o Given an input, at each node, a test is applied
and one of the branches is taken depending on
the outcome.
Decision Tree…
Classify whether a person will buy a laptop.

Age    | Income | Buys Laptop?
Young  | High   | No
Young  | Low    | Yes
Middle | High   | Yes
Old    | Medium | ?

            Age?
          /   |    \
     Young  Middle  Old
      /  \     |      |
   High  Low  High  Medium
    No   Yes  Yes    Yes

Since:
• "High" income tends to lead to Yes (Middle-aged case),
• "Low" income tends to lead to Yes (Young case),
• while only "High" income in the "Young" case gave No,
we can infer that "Medium" income generally leans toward Yes.
• Predicted classification: Old, Medium income → Buys Laptop = Yes
Decision Tree…
• DT learning is supervised, because it constructs the DT from
class-labeled training tuples.
• Two DT algorithms are ID3 (Iterative Dichotomiser 3) and C4.5
(the successor of ID3).
• The statistical measures used to select the attribute that best
splits the dataset in terms of the given classes are
information gain and gain ratio.
• Both measures have a close relationship with another
concept called entropy.

Decision Tree…
• Entropy: Measures uncertainty or randomness in the
dataset. Lower entropy means more pure (better)
classification.
• Information Gain: Measures how much an attribute
reduces entropy. Higher information gain means a better
attribute for splitting. (used in ID3, Iterative
Dichotomiser 3)
• Gain Ratio: An improved version of information gain that
adjusts for attribute bias (C4.5 ,Successor of ID3)
• Gini Index: Measures impurity in a dataset. Lower Gini
Index means a better attribute for classification. Used in
CART (Classification and Regression Trees)
Algorithm for decision tree learning
• Basic algorithm (a greedy divide-and-conquer algorithm)
o Assume attributes are categorical (continuous attributes can be handled
too)
o Tree is constructed in a top-down recursive manner/no backtracking
o At start, all the training examples are at the root
o Examples are partitioned recursively based on selected attributes
o Attributes are selected on the basis of an impurity function (e.g.,
information gain)
• Conditions for stopping partitioning
o All examples for a given node belong to the same class
o There are no remaining attributes for further partitioning – majority
class is the leaf
o There are no examples left

Choose an attribute to partition data
• The key to building a decision tree:
1. Choose an attribute from the dataset.
2. Calculate the significance of the attribute in splitting the data.
3. Split the data based on the value of the best attribute.
4. Go to step 1.
• The objective is to reduce impurity or uncertainty in the data as much as possible.
o A subset of data is pure if all instances belong to the same class.
• The best attribute is selected for splitting the training examples using a
goodness function: a mathematical function used to evaluate how good a split is.
• The best attribute:
o separates the classes of the training examples fastest, and
o yields the smallest tree.

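The greedy top-down procedure above can be sketched in Python. This is a minimal illustration only — the function names (`build_tree`, `entropy`) and the nested-dict tree representation are our own, and only information gain is used as the impurity function:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Greedy divide-and-conquer: pick the highest-gain attribute,
    partition the examples on its values, and recurse (no backtracking)."""
    if len(set(labels)) == 1:          # all examples belong to the same class
        return labels[0]
    if not attrs:                      # no attributes left: majority class leaf
        return Counter(labels).most_common(1)[0][0]

    def gain(a):                       # information gain of attribute a
        parts = {}
        for row, y in zip(rows, labels):
            parts.setdefault(row[a], []).append(y)
        return entropy(labels) - sum(len(p) / len(labels) * entropy(p)
                                     for p in parts.values())

    best = max(attrs, key=gain)
    branches = {}
    for value in {row[best] for row in rows}:
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        branches[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx],
                                     [a for a in attrs if a != best])
    return {best: branches}

# The labeled rows of the earlier laptop example:
rows = [{"age": "Young", "income": "High"},
        {"age": "Young", "income": "Low"},
        {"age": "Middle", "income": "High"}]
labels = ["No", "Yes", "Yes"]
tree = build_tree(rows, labels, ["age", "income"])
print(tree["age"]["Middle"])                    # Yes
print(tree["age"]["Young"]["income"]["High"])   # No
```

The recursion mirrors the stopping conditions on the previous slide: pure node, no attributes left (majority vote), or no further split needed.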
Decision Tree…
Entropy: a metric used to measure the impurity of the data
• It is the degree of randomness in the data
• It is a measure of the impurity or uncertainty of the data
• The higher the entropy, the higher the information content

Entropy(D) = - Σ (i = 1 to m) p_i log2(p_i)

Where
• i is the identifier for each class in the dataset,
• m is the number of class labels,
• p_i is the nonzero probability that an arbitrary tuple in D belongs to class C_i,
estimated by p_i = |C_i,D| / |D|,
• |C_i,D| is the number of tuples (data points) that belong to class C_i in dataset D,
• |D| is the total number of tuples in the dataset.
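The formula above can be sketched directly in Python (the function name and the example class counts are illustrative):

```python
import math

def entropy(counts):
    """Entropy of a class distribution, given the per-class tuple counts."""
    total = sum(counts)
    ent = 0.0
    for c in counts:
        if c > 0:                    # the formula is defined for nonzero p_i only
            p = c / total
            ent -= p * math.log2(p)
    return ent

print(entropy([4, 0]))              # 0.0 : pure node, no uncertainty
print(entropy([2, 2]))              # 1.0 : maximally mixed (two classes)
print(round(entropy([6, 4]), 3))    # 0.971 : mixed, but not completely
```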
Decision Tree…
• Example: a dataset with a 6 : 4 class split gives
Entropy(D) = -(0.6) log2(0.6) - (0.4) log2(0.4) ≈ 0.97
• Entropy ≈ 0.97 means the data are mixed (not completely pure).
Decision Tree…
Entropy: measures the level of disorder or uncertainty in a
given dataset or system

Decision Tree
• Suppose a set D contains a total of N examples, of which
p are positive and n are negative. The entropy is given by:

Entropy(D) = -(p/N) log2(p/N) - (n/N) log2(n/N)

• Some useful properties of the entropy:
o Entropy(D) = 0 means that all the examples are in the same class.
o Entropy(D) = 1 means that half the examples are of one class and half
are of the opposite class.
Decision tree…
Information Gain
• We want to determine which attribute in a given set of
training feature vectors is most useful for discriminating
between the classes to be learned.
• Information gain tells us how important a given attribute
of the feature vectors is.
• We use it to decide the ordering of attributes in the
nodes of a decision tree
Information Gain = Entropy before splitting -
Entropy after splitting

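The "entropy before minus entropy after" definition can be sketched as follows (the helper names and the example split are illustrative — a parent node with 6/4 class counts split by some attribute into a pure subset and a mixed one):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts_list):
    """Entropy before splitting minus the weighted entropy after splitting."""
    n = sum(parent_counts)
    before = entropy(parent_counts)
    after = sum(sum(child) / n * entropy(child) for child in child_counts_list)
    return before - after

# Parent: 6 yes / 4 no. The attribute splits it into [4 yes, 0 no] and [2 yes, 4 no].
print(round(information_gain([6, 4], [[4, 0], [2, 4]]), 3))  # 0.42
```

A useless attribute that leaves the distribution unchanged has zero gain, which is why the attribute with the highest gain is preferred for the split.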
Decision tree…
Information Gain:
• Select the attribute with the highest information gain,
i.e. the one that creates the smallest average disorder:
o First, compute the disorder using entropy: the expected
information needed to classify objects into classes.
o Second, measure the information gain: by how much the
disorder of a set would be reduced by knowing the value
of a particular attribute.
• The attribute with the highest information gain is
considered the best.
Gain Ratio
• Gain Ratio (or Uncertainty Coefficient) normalizes the
information gain of an attribute by the entropy of that
attribute's own split, called the split information:

Gain Ratio(A) = Information Gain(A) / Split Info(A)

where Split Info(A) is the entropy of the partition sizes
produced by splitting on A.
• From the formula: if the split information is very small, the
gain ratio will be high, and vice versa.
• First, determine the information gain of all the attributes, then
compute the average information gain.
• Second, calculate the gain ratio of all the attributes whose
information gain is greater than or equal to the computed average,
and pick the attribute with the highest gain ratio to split.
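A sketch of the normalization (function names are illustrative; the information-gain value is assumed to have been computed already, as in the previous slides):

```python
import math

def split_info(subset_sizes):
    """Entropy of the partition itself: how evenly the attribute splits the data."""
    n = sum(subset_sizes)
    return -sum((s / n) * math.log2(s / n) for s in subset_sizes if s > 0)

def gain_ratio(info_gain, subset_sizes):
    si = split_info(subset_sizes)
    return info_gain / si if si > 0 else 0.0

# An attribute that splits 10 tuples into two subsets of 5 has split info of
# exactly 1 bit, so its gain ratio equals its information gain:
print(gain_ratio(0.42, [5, 5]))              # 0.42
# A many-valued attribute (e.g. a unique ID) has large split info, which
# shrinks its ratio — this is the bias correction C4.5 adds over ID3:
print(round(gain_ratio(0.42, [1] * 10), 3))  # 0.126
```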
Gini Index
• The Gini index can also be used for feature selection.
• The tree chooses the feature that minimizes the Gini impurity;
a higher Gini index indicates higher impurity. ("Gini Index"
and "Gini Impurity" are used interchangeably.)
• It performs only binary splits. For categorical variables, it
gives the results in terms of "success" or "failure".
• The Gini index is calculated from the formula below, where c is
the number of classes and p_i is the probability associated with
the i-th class:

Gini(D) = 1 - Σ (i = 1 to c) p_i^2
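The formula translates almost line-for-line into Python (the function name and the example counts are illustrative):

```python
def gini(counts):
    """Gini index of a class distribution: 1 - sum of squared probabilities.
    0 means a pure node; 0.5 is the worst case for two classes."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([4, 0]))   # 0.0 : pure node
print(gini([2, 2]))   # 0.5 : maximally mixed (two classes)
```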
Example-Decision tree
The problem of "Sunburn": you want to predict whether a
person is likely to get sunburned if they go back to the beach.
How can you do this? Data collected: predict based on the
observed properties of the people.

Name  | Hair   | Height  | Weight  | Lotion | Result
Sarah | Blonde | Average | Light   | No     | Sunburned
Dana  | Blonde | Tall    | Average | Yes    | None
Alex  | Brown  | Short   | Average | Yes    | None
Annie | Blonde | Short   | Average | No     | Sunburned
Emily | Red    | Average | Heavy   | No     | Sunburned
Pete  | Brown  | Tall    | Heavy   | No     | None
John  | Brown  | Average | Heavy   | No     | None
Kate  | Blonde | Short   | Light   | Yes    | None

Decision Tree
• Expected information (weighted entropy after the split) for each attribute:

Attribute | Expected information
Hair      | 0.50
Height    | 0.69
Weight    | 0.94
Lotion    | 0.61
Decision tree
• Information gain of each attribute

Gain(hair)   = 0.954 - 0.50 = 0.454
Gain(height) = 0.954 - 0.69 = 0.264
Gain(weight) = 0.954 - 0.94 = 0.014
Gain(lotion) = 0.954 - 0.61 = 0.344
Which decision variable maximises the information gain? → Hair

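The slide's figures can be reproduced from the table (helper names are ours; the last digits differ slightly for height, weight and lotion because the slide rounds the expected information to two decimals before subtracting):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

data = [  # (hair, height, weight, lotion, result) from the table above
    ("Blonde", "Average", "Light",   "No",  "Sunburned"),
    ("Blonde", "Tall",    "Average", "Yes", "None"),
    ("Brown",  "Short",   "Average", "Yes", "None"),
    ("Blonde", "Short",   "Average", "No",  "Sunburned"),
    ("Red",    "Average", "Heavy",   "No",  "Sunburned"),
    ("Brown",  "Tall",    "Heavy",   "No",  "None"),
    ("Brown",  "Average", "Heavy",   "No",  "None"),
    ("Blonde", "Short",   "Light",   "Yes", "None"),
]

def expected_info(attr_index):
    """Weighted entropy of the partitions induced by one attribute."""
    groups = {}
    for row in data:
        groups.setdefault(row[attr_index], []).append(row[-1])
    n = len(data)
    return sum(len(g) / n * entropy([g.count("Sunburned"), g.count("None")])
               for g in groups.values())

base = entropy([3, 5])   # 3 sunburned, 5 none  ->  ~0.954
for name, idx in [("hair", 0), ("height", 1), ("weight", 2), ("lotion", 3)]:
    print(name, round(base - expected_info(idx), 3))
# hair 0.454, height 0.266, weight 0.016, lotion 0.348 — hair wins
```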
The best decision tree?

is_sunburned
         Hair colour
        /     |     \
   blonde    red    brown
      ?   Sunburned  None

Blonde branch: Sunburned = Sarah, Annie; None = Dana, Kate

• Once we have finished with hair colour, we then need to
calculate the remaining branches of the decision tree.
• Which attribute is best to classify the remaining examples
(the blonde branch)?
The best Decision Tree
• This is the simplest and optimal tree possible, and
it makes a lot of sense.
• It classifies 4 of the people on hair colour alone.

is_sunburned
         Hair colour
        /     |     \
   blonde    red    brown
      |   Sunburned  None
 Lotion used
    /    \
   no    yes
Sunburned None
Avoid overfitting in classification
• Overfitting: a tree may overfit the training data
o Good accuracy on training data but poor on test data
o Symptoms: tree too deep with too many branches, some of which may
reflect anomalies due to noise or outliers
• Two approaches to avoid overfitting
o Pre-pruning: stop the tree from growing
once it meets certain conditions.
o Post-pruning: remove branches or sub-trees from a "fully grown"
tree.
• First grow the tree completely.
• Then remove branches that don't improve accuracy on a validation
dataset.
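Both pruning styles map onto scikit-learn's DecisionTreeClassifier, assuming scikit-learn is available; the dataset and parameter values below are purely illustrative:

```python
from sklearn.tree import DecisionTreeClassifier

# A tiny toy dataset (illustrative only).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# Pre-pruning: stop growth early via depth / sample-count limits.
pre = DecisionTreeClassifier(max_depth=2, min_samples_leaf=1)
pre.fit(X, y)

# Post-pruning: grow fully, then apply minimal cost-complexity pruning;
# ccp_alpha > 0 removes branches whose accuracy gain does not justify
# their added complexity.
post = DecisionTreeClassifier(ccp_alpha=0.01)
post.fit(X, y)

print(pre.get_depth(), post.get_depth())
```

In practice `ccp_alpha` (and the pre-pruning limits) would be tuned on a validation set, matching the validation-accuracy criterion described above.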
Decision Tree

 You can view the decision tree as IF-THEN-ELSE
rules that tell us whether someone will suffer from
sunburn:

IF (hair-colour = "red") THEN
    return (sunburned = yes)
ELSE IF (hair-colour = "blonde" AND lotion-used = "no") THEN
    return (sunburned = yes)
ELSE
    return (sunburned = no)

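Those rules transcribe directly into runnable Python (the function name is ours):

```python
def predict_sunburn(hair_colour, lotion_used):
    """The sunburn tree's decision rules as plain IF-THEN-ELSE logic."""
    if hair_colour == "red":
        return "Sunburned"
    if hair_colour == "blonde" and lotion_used == "no":
        return "Sunburned"
    return "None"

print(predict_sunburn("blonde", "no"))   # Sunburned (e.g. Sarah)
print(predict_sunburn("blonde", "yes"))  # None (e.g. Dana)
print(predict_sunburn("brown", "no"))    # None (e.g. Pete)
```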
Decision Tree
• The benefits of decision trees are as follows:
o They do not require any domain knowledge.
o They are easy to understand.
o The learning and classification steps of a
decision tree are simple and fast.

Thank you!
?
