Machine Learning Performance Metrics Guide

The document discusses various techniques for evaluating machine learning model performance, including cross-validation, precision, recall, and ROC curves. It covers k-fold cross-validation and bootstrapping for evaluating classifier performance on limited datasets. Underfitting and overfitting are explained as well as the importance of training, validation, and test sets for accurate assessment of model generalization.

Performance Evaluation

Module-II
EC 19-203-0811
Introduction to Machine Learning

Course Outcomes
1. To understand various machine learning techniques.
2. To acquire knowledge about classification techniques.
3. To understand dimensionality reduction techniques and decision trees.
4. To understand unsupervised machine learning techniques.

7-Performance Evaluation Metrics 1:35 PM 2


Syllabus
Module I
Introduction: Machine Learning, Applications, Supervised Learning -Classification, Regression,
Unsupervised Learning, Reinforcement Learning, Supervised Learning: Learning a Class from Examples,
Vapnik - Chervonenkis (VC) Dimension, Probably Approximately Correct (PAC) Learning, Noise, Learning
Multiple Classes, Regression, Model Selection and Generalization, Dimensions of a Supervised Machine
Learning Algorithm

Module II
Multilayer Perceptrons: Introduction, The Perceptron, Training a Perceptron, Learning Boolean
Functions, Multilayer Perceptrons, Backpropagation Algorithm, Training Procedures. Classification: Cross-validation and re-sampling methods, K-fold cross validation, Bootstrapping. Measuring classifier
performance: Precision, recall, ROC curves. Bayes Theorem, Bayesian classifier, Maximum Likelihood
Estimation, Density Functions.




Syllabus
Module III
Dimensionality Reduction: Introduction, Subset Selection, Principal Components
Analysis, Factor Analysis, Multidimensional Scaling, Linear Discriminant Analysis,
Isomap, Locally Linear Embedding, Decision Trees: Introduction, Univariate Trees,
Pruning, Rule Extraction from Trees, Learning Rules from Data, Multivariate Trees,
Introduction to Linear Discrimination, Generalizing the Linear Model.

Module IV
Clustering: Introduction, Mixture Densities, k-Means Clustering, Expectation-
Maximization Algorithm, Mixtures of Latent Variable Models, Supervised Learning after
Clustering, Hierarchical Clustering, Choosing the Number of Clusters.



References

1. Stephen Marsland, "Machine Learning: An Algorithmic Perspective", 2nd Edition, CRC Press, 2015. [Ch. 2]
2. Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
3. Ethem Alpaydin, "Introduction to Machine Learning", 2nd Edition, 2010. [Ch. 11]
4. Images from various websites and NPTEL course slides.



Contents

• Cross validation and re-sampling methods
  • K-fold cross validation
  • Bootstrapping
• Measuring classifier performance
  • Precision
  • Recall
  • ROC curves





Purpose of Learning
• The goal of learning is to get better at predicting the outputs, be they class labels or continuous regression values.
• One way to assess this is to compare the predictions with known target labels, i.e., the error that the algorithm makes on the training set.
• However, the algorithm must generalise to examples that were not seen in the training set, so we need some different data: a test set.
• The test set does not modify the weights or other parameters; we use it only to decide how well the algorithm has learnt.
• The only problem with this is that it reduces the amount of data we have available for training.



Overfitting
• We need to make sure that enough training is done so that the algorithm generalises well.
• But there is at least as much danger in over-training as there is in under-training.
• The number of degrees of variability in most machine learning algorithms is huge: for a neural network there are lots of weights, and each of them can vary.
• This is usually far more variation than there is in the underlying function.
• If we train for too long, we will overfit the data, which means that we have learnt about the noise and inaccuracies in the data as well as the actual function. The model that we learn will therefore be much too complicated and won't be able to generalise.



Overfitting

[Figure: left, finding the generating function; right, overfitting, where the network matches the inputs perfectly, including their noise, which reduces the chance of generalisation.]
Underfitting

• A machine learning algorithm is said to underfit when it cannot capture the underlying trend of the data.

• I.e., it performs poorly on the training data itself, and consequently performs poorly on testing data as well.



Underfitting

• The model is too simple for the data.

• E.g., the data is quadratic but the model is linear.

[Link]



Underfitting

[Link]



Variance

• The difference between the error rate on training data and on testing data is called variance.

• If the difference is high, the model is said to have high variance; when the difference in errors is low, it has low variance.

• A well-generalised model usually has low variance.

[Link]



Training Set

• The sample of data used to fit the model.


• The actual dataset that we use to train the model (weights and
biases in the case of a Neural Network).
• The model sees and learns from this data.



Validation Set
• is a set of data, separate from the training set, that is used to validate our
model performance during training.

• This validation process gives information that helps us tune the model’s
hyperparameters and configurations accordingly.

• It is like a critic telling us whether the training is moving in the right


direction or not.

• The model is trained on the training set, and, simultaneously, the model
evaluation is performed on the validation set after every epoch.



Test Set

• The test set is a separate set of data used to test the model after training is complete.

• It provides an unbiased final measure of model performance in terms of accuracy, precision, etc.

• Simply put, it answers the question "How well does the model perform?"



Visualisation of the Splits

[Link]



How to split data?

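As a minimal sketch in plain Python: the helper below shuffles the data and cuts it into the three sets discussed on the previous slides. The 60/20/20 proportions and the function name are illustrative assumptions, not something the slides prescribe.

```python
import random

def train_val_test_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the data, then cut it into train/validation/test parts."""
    items = list(data)
    random.Random(seed).shuffle(items)  # fixed seed for reproducibility
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

# 100 samples -> 60 for training, 20 for validation, 20 for testing
train, val, test = train_val_test_split(range(100))
```

The three sets are disjoint, so the test set never influences training.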


Resampling Techniques

• Machine learning models often fail to generalize well on data they have not been trained on.

• Sometimes they fail miserably; sometimes they perform only somewhat better than that.

• To be sure that the model can perform well on unseen data, we use a re-sampling technique called Cross-Validation.



Cross Validation
• is a technique used to evaluate the performance of a model on unseen
data.

• It involves dividing the available data into multiple folds or subsets,


using one of these folds as a validation set, and training the model on
the remaining folds.

• This process is repeated multiple times, each time using a different fold
as the validation set.

• Finally, the results from each validation step are averaged to produce a
more robust estimate of the model’s performance



k- fold Cross Validation

[Link]



k- fold Cross Validation

• In K-Fold CV, we have a parameter 'k'.

• This parameter decides into how many folds the dataset is divided.

• Every fold appears in the training set (k-1) times and in the validation set exactly once, which ensures that every observation in the dataset is used for both training and validation, enabling the model to learn the underlying data distribution better.

• The value of 'k' used is generally 5 or 10.

[Link]

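The fold construction described above can be sketched in a few lines of plain Python (no library assumed); the function name is an illustrative choice:

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for each of the k folds.

    Each sample lands in the validation set exactly once and in the
    training set k-1 times, as the slides describe."""
    # distribute any remainder across the first n % k folds
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size

# 10 samples, k = 5: five folds of 2 validation samples each
folds = list(k_fold_indices(10, 5))
```

In each iteration the model would be trained on `train_idx` and evaluated on `val_idx`, with the k scores averaged at the end.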


Examples of k- fold CV

[Link]



Leave-One-Out Cross-Validation

Extreme case of k-fold CV, with k equal to the number of samples: the algorithm is validated on just one piece of data, training on all of the rest.

[Link]

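A minimal sketch of this extreme case (the function name is an illustrative assumption): each sample in turn becomes a one-element validation set.

```python
def leave_one_out(n):
    """k-fold CV with k = n: each sample is the validation set exactly once."""
    for i in range(n):
        train_idx = [j for j in range(n) if j != i]  # everything except sample i
        yield train_idx, [i]

# with 4 samples there are 4 splits, each holding out one sample
splits = list(leave_one_out(4))
```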


Bootstrapping

• A statistic is calculated on multiple bags of random samples drawn with replacement.

• Bootstrapping can be used to infer population-level results for machine learning models trained on random samples drawn with replacement.

[Link]

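A minimal sketch of the resampling step, using only the standard library; the toy data and function name are illustrative assumptions:

```python
import random

def bootstrap_statistic(sample, stat, n_resamples=1000, seed=0):
    """Recompute `stat` on many bags drawn with replacement from `sample`."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(sample)
    results = []
    for _ in range(n_resamples):
        # draw n items with replacement: some repeat, some are left out
        bag = [sample[rng.randrange(n)] for _ in range(n)]
        results.append(stat(bag))
    return results

data = [2, 4, 4, 4, 5, 5, 7, 9]  # toy sample, mean 5.0
means = bootstrap_statistic(data, lambda xs: sum(xs) / len(xs))
```

The spread of `means` estimates how much the statistic would vary across samples, which is the population-level inference the slide refers to.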


Confusion matrix

[Link]

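The four cells of the binary confusion matrix can be counted directly from label lists; the toy labels below are chosen arbitrarily for illustration:

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive class)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

y_true = [1, 1, 0, 0, 1, 0]  # actual labels (toy example)
y_pred = [1, 0, 0, 1, 1, 0]  # model predictions
counts = confusion_counts(y_true, y_pred)
```

All the metrics on the following slides (accuracy, error rate, precision, recall) are ratios of these four counts.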


Accuracy
The higher the accuracy, the better the model.

Accuracy = (TP + TN) / (TP + FP + FN + TN)

(100 + 150) / (100 + 20 + 30 + 150) = 0.83

This means that the machine learning algorithm is 83% accurate in its predictions.

[Link]

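The slide's worked example, reproduced as a two-line check using the same counts:

```python
# counts from the slide's confusion-matrix example
TP, FP, FN, TN = 100, 20, 30, 150
accuracy = (TP + TN) / (TP + FP + FN + TN)  # 250 / 300 ≈ 0.83
```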


Misclassification Rate
• Also referred to as the error rate.

• The misclassification rate defines how often the model makes incorrect predictions.

Error rate = (FP + FN) / (TP + FP + FN + TN)

(20 + 30) / (100 + 20 + 30 + 150) = 0.17

• Hence, the machine learning algorithm is 17% inaccurate in its predictions.

[Link]



Precision

Precision = TP / (TP + FP) = 100 / (100 + 20) = 0.83

This means that out of all the positive predictions, 83% were true.

[Link]



Recall

Recall = TP / (TP + FN) = 100 / (100 + 30) ≈ 0.77

This means that out of all the actual positive cases, only 77% were predicted correctly.

[Link]
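Both ratios, computed from the same counts used in the slides' worked examples:

```python
# counts from the slides' confusion-matrix example
TP, FP, FN, TN = 100, 20, 30, 150
precision = TP / (TP + FP)  # fraction of positive predictions that are correct
recall = TP / (TP + FN)     # fraction of actual positives that are found
```

Note the trade-off: precision penalises false positives, recall penalises false negatives.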
F1-score

[Link]

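The F1-score is the harmonic mean of precision and recall; with the counts from the previous slides it works out to exactly 0.8:

```python
# counts from the slides' confusion-matrix example
TP, FP, FN = 100, 20, 30
precision = TP / (TP + FP)                       # 100 / 120
recall = TP / (TP + FN)                          # 100 / 130
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean = 0.8
```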


Receiver Operating Characteristics

[Link]

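An ROC curve is traced by sweeping a decision threshold over the classifier's scores and recording the false-positive rate against the true-positive rate at each threshold. A minimal sketch (the function name and toy scores are illustrative assumptions):

```python
def roc_points(y_true, scores):
    """Sweep a threshold over the scores and record (FPR, TPR) pairs."""
    thresholds = sorted(set(scores), reverse=True)
    pos = sum(y_true)            # number of actual positives
    neg = len(y_true) - pos      # number of actual negatives
    points = [(0.0, 0.0)]        # threshold above every score
    for th in thresholds:
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= th)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= th)
        points.append((fp / neg, tp / pos))
    return points

# two negatives and two positives with toy scores
points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A perfect classifier passes through (0, 1); a random one follows the diagonal from (0, 0) to (1, 1).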


Evaluating Regression Models

[Link]





Conclusion

• Performance Evaluation Metrics
  • Training – Validation – Test Split
  • Overfitting – Underfitting
  • Confusion Matrix
  • Receiver Operating Characteristics



Thank You
