MACHINE
LEARNING-C
SC701
MODULE 1- INTRODUCTION TO ML
Machine learning is a growing technology that enables computers to learn automatically from past data. Machine learning uses various algorithms to build mathematical models and make predictions using historical data or information. Currently, it is used for various tasks such as image recognition, speech recognition, email filtering, Facebook auto-tagging, recommender systems, and many more.
What is Machine Learning
In the real world, we are surrounded by humans who can learn everything from their experiences, and by computers or machines that work on our instructions. But can a machine also learn from experiences or past data the way a human does? This is where Machine Learning comes in.
Machine Learning is a subset of artificial intelligence that is mainly concerned with the development of algorithms that allow a computer to learn on its own from data and past experience. The term machine learning was first introduced by Arthur Samuel in 1959.
WE CAN DEFINE IT IN A SUMMARIZED WAY AS:
“MACHINE LEARNING ENABLES A MACHINE TO
AUTOMATICALLY LEARN FROM DATA, IMPROVE
PERFORMANCE FROM EXPERIENCES, AND PREDICT
THINGS WITHOUT BEING EXPLICITLY
PROGRAMMED”.
HOW DOES MACHINE LEARNING WORK
A Machine Learning system learns from historical data, builds prediction models, and, whenever it receives new data, predicts the output for it. The accuracy of the predicted output depends on the amount of data: a larger amount of data helps to build a better model, which predicts the output more accurately.
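This learn-then-predict loop can be sketched in a few lines of Python. The sketch is purely illustrative: it "trains" a one-variable linear model on historical data by least squares (no ML library is assumed; all names are invented for the example) and then predicts the output for new input.

```python
# A minimal sketch of the learn-from-history, predict-on-new-data loop.

def train(xs, ys):
    """Fit y = a*x + b by ordinary least squares; returns the model (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

def predict(model, x):
    a, b = model
    return a * x + b

# Historical data: inputs with their known outputs (here exactly y = 2x).
history_x = [1, 2, 3, 4, 5]
history_y = [2, 4, 6, 8, 10]

model = train(history_x, history_y)   # build the prediction model
print(predict(model, 6))              # new data -> predicted output: 12.0
```

With more (and cleaner) historical pairs, the fitted coefficients stabilise, which is the slide's point that more data tends to yield a better model.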
MACHINE LEARNING MODEL
Training Data → Train Machine Learning algorithm → Trained Model → Test the model with new input → Is the model performing correctly?
If yes (Y): the Machine Learning model is ready.
If no (N): retrain with the training data.
DATA FORMATS
Structured data is stored in a predefined format and is highly specific.
Unstructured data is a collection of many varied data types that are stored in their native formats.
Semi-structured data does not follow the tabular data-structure models associated with relational databases or other data tables.
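The three formats can be contrasted in a short Python sketch (records and values are invented for illustration; JSON stands in for semi-structured data, free text for unstructured data):

```python
import json

# Structured: fixed schema, tabular rows (e.g. a relational table).
structured = [
    {"id": 1, "name": "Asha", "age": 30},
    {"id": 2, "name": "Ravi", "age": 25},
]

# Semi-structured: tagged keys but no rigid tabular schema;
# fields can vary from record to record.
semi_structured = json.loads('{"id": 3, "name": "Meena", "skills": ["ML", "SQL"]}')

# Unstructured: kept in its native format with no predefined fields.
unstructured = "Customer wrote: the delivery was late but the product is great."

print(structured[0]["name"])        # Asha
print(semi_structured["skills"])    # ['ML', 'SQL']
print(len(unstructured.split()))    # 11 words of free text
```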
DIKW PYRAMID
[Figure: the DIKW pyramid, from Data (events, records and transactions) at the base, through Information (meaning) and Knowledge (content), to Wisdom (understanding) at the top.]
CATEGORIES OF DATA ANALYTICS
Descriptive, diagnostic, predictive, and prescriptive analytics.
TYPES OF MACHINE LEARNING
Based on the methods and the way of learning, machine learning is divided into the following types:
Supervised Machine Learning
Unsupervised Machine Learning
Reinforcement Learning
Supervised Machine Learning
Supervised machine learning is based on supervision. In the supervised learning technique, we train the machine using a "labeled" dataset, and based on this training, the machine predicts the output. Here, the labeled data specifies that some of the inputs are already mapped to outputs.
Categories of Supervised Machine Learning
Supervised machine learning can be classified
into two types of problems, which are given
below:
Classification
Regression
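As a minimal, hand-rolled sketch of supervised classification (the dataset and the 1-nearest-neighbour rule are illustrative assumptions, not taken from any particular library):

```python
# Supervised learning sketch: a 1-nearest-neighbour classifier trained on a
# labelled dataset, where each input is already mapped to an output label.

def nearest_neighbour_predict(train_X, train_y, x):
    """Classify x with the label of its closest training example."""
    distances = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[distances.index(min(distances))]

# Labelled training data: [height_cm, weight_kg] -> "cat" / "dog"
train_X = [[25, 4], [30, 5], [60, 25], [70, 30]]
train_y = ["cat", "cat", "dog", "dog"]

print(nearest_neighbour_predict(train_X, train_y, [28, 4.5]))  # cat
print(nearest_neighbour_predict(train_X, train_y, [65, 28]))   # dog
```

The same labelled-data idea covers regression too: there, the mapped outputs are continuous numbers rather than class labels.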
ADVANTAGES
Since supervised learning works with a labeled dataset, we can have an exact idea about the data.
These algorithms are helpful in predicting outputs on the basis of prior experience.
DISADVANTAGES
These algorithms are not able to solve complex tasks.
The model may predict the wrong output if the test data differs from the training data.
Training the algorithm requires a lot of computational time.
APPLICATIONS
Image Segmentation
Medical Diagnosis
Fraud Detection
Spam detection
Speech Recognition
UNSUPERVISED MACHINE LEARNING
The main aim of an unsupervised learning algorithm is to group or categorize the unsorted/unlabeled dataset according to similarities, patterns, and differences.
Machines are instructed to find the hidden patterns in the input dataset.
CATEGORIES OF UNSUPERVISED MACHINE
LEARNING
Unsupervised Learning can be further classified
into two types, which are given below:
Clustering
Association
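Clustering, the first of these categories, can be sketched with a tiny hand-rolled k-means (the data and the naive initialisation are illustrative; production implementations choose starting centroids more carefully):

```python
# Unsupervised learning sketch: k-means (k = 2) groups an unlabelled dataset
# purely by similarity; no output labels are given to the algorithm.

def kmeans(points, k=2, iters=10):
    centroids = points[:k]                      # naive initialisation
    for _ in range(iters):
        # assign each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[d.index(min(d))].append(p)
        # move each centroid to the mean of its cluster
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

points = [[1, 1], [1.5, 2], [1, 0.5],      # one natural group
          [8, 8], [9, 9], [8.5, 7.5]]      # another natural group
centroids, clusters = kmeans(points)
print(len(clusters[0]), len(clusters[1]))   # the two hidden groups: 3 3
```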
ADVANTAGES:
These algorithms can be used for more complicated tasks than supervised ones because they work on unlabeled datasets.
Unsupervised algorithms are preferable for various tasks, as obtaining an unlabeled dataset is easier than obtaining a labeled one.
DISADVANTAGES:
The output of an unsupervised algorithm can be less accurate, as the dataset is not labeled and the algorithm is not trained with the exact output in advance.
Working with unsupervised learning is more difficult, as it uses an unlabeled dataset that is not mapped to outputs.
APPLICATIONS
Network Analysis
Recommendation Systems
Anomaly Detection
Singular Value Decomposition
REINFORCEMENT LEARNING
Reinforcement learning works on a feedback-based process in which an AI agent (a software component) automatically explores its surroundings by trial and error: taking actions, learning from experience, and improving its performance.
The agent gets rewarded for each good action and punished for each bad action; hence, the goal of a reinforcement learning agent is to maximize the rewards.
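The reward-and-punishment loop can be sketched with tabular Q-learning on a toy corridor environment (the environment, rewards, and hyper-parameters are all illustrative assumptions, not part of any standard API):

```python
import random

# RL sketch: states 0..4 in a line; the agent starts at 0.
# Action 0 = left, 1 = right. Reaching state 4 gives reward +1 (good action
# rewarded); every other step costs -0.01 (wasteful actions punished).
# The agent explores by trial and error and improves from the feedback.

random.seed(0)
N_STATES, GOAL = 5, 4
alpha, gamma, epsilon = 0.5, 0.9, 0.2
Q = [[0.0, 0.0] for _ in range(N_STATES)]     # Q[state][action]

for episode in range(200):
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        a = random.randrange(2) if random.random() < epsilon \
            else (0 if Q[s][0] > Q[s][1] else 1)
        s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
        r = 1.0 if s2 == GOAL else -0.01
        # Q-learning update: nudge Q towards reward + discounted best future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [("left", "right")[0 if Q[s][0] > Q[s][1] else 1] for s in range(GOAL)]
print(policy)   # the learned policy heads right, towards the reward
```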
ADVANTAGES
It helps in solving complex real-world problems that are difficult to solve with general techniques.
The learning model of RL is similar to how human beings learn; hence, highly accurate results can be obtained.
It helps in achieving long-term results.
DISADVANTAGES
RL algorithms are not preferred for simple problems.
RL algorithms require huge amounts of data and computation.
Too much reinforcement learning can lead to an overload of states, which can weaken the results.
APPLICATION
Video Games
Resource Management
Robotics
Text Mining
ISSUES IN MACHINE LEARNING
Inadequate Training Data
Poor quality of data
Massive training data
Overfitting and underfitting
Monitoring and maintenance
Getting bad recommendations
Lack of skilled resources
Limited possibilities to reuse a model
Data Bias
APPLICATION OF MACHINE LEARNING
STEPS IN DEVELOPING A MACHINE LEARNING
APPLICATION.
Collect Data
Prepare the input data
Analyze the input data
Train the algorithm
Test the algorithm
Use the algorithm
Periodic Revisit
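One possible walk-through of these steps on a toy pass/fail problem, using only the Python standard library (the data and the simple threshold model are invented for illustration):

```python
import statistics

# 1. Collect data (here: hard-coded exam scores with pass/fail labels).
raw = [(35, "fail"), (42, "fail"), (55, "pass"), (61, "pass"),
       (48, "fail"), (70, "pass"), (52, "pass"), (39, "fail")]

# 2. Prepare the input data: split into training and test portions.
train, test = raw[:6], raw[6:]

# 3. Analyze the input data: inspect the class balance.
print({label: sum(1 for _, l in train if l == label) for label in ("pass", "fail")})

# 4. Train the algorithm: learn a threshold between the two class means.
pass_mean = statistics.mean(x for x, l in train if l == "pass")
fail_mean = statistics.mean(x for x, l in train if l == "fail")
threshold = (pass_mean + fail_mean) / 2

# 5. Test the algorithm on the held-out data.
accuracy = sum(("pass" if x >= threshold else "fail") == l for x, l in test) / len(test)
print("test accuracy:", accuracy)

# 6. Use the algorithm on new input.
print("score 58 ->", "pass" if 58 >= threshold else "fail")
```

The "periodic revisit" step would repeat this cycle on fresh data to check that the threshold still holds.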
TRAINING ERROR
Training error is the error a model makes on the same data it was trained on. It can be inflated by problems during model training, e.g. a dataset handled inappropriately during preprocessing or in feature selection.
GENERALIZATION ERROR
In supervised learning applications in machine
learning and statistical learning theory,
generalization error (also known as the
out-of-sample error) is a measure of how
accurately an algorithm is able to predict
outcome values for previously unseen data.
Notice that the gap between predictions and observed data is induced by model inaccuracy, sampling error, and noise. Some of these errors are reducible, but some are not. Choosing the right algorithm and tuning its parameters can improve model accuracy, but we will never be able to make our predictions 100% accurate.
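The gap between training error and generalization error can be demonstrated with a model that simply memorises its training set: a 1-nearest-neighbour regressor (synthetic noisy data; all values here are illustrative) has zero training error, but its error on held-out, previously unseen data is not zero.

```python
import random
random.seed(1)

# Noisy samples from the true relationship y = 2x + 1.
data = [(x, 2 * x + 1 + random.gauss(0, 0.5)) for x in [i / 10 for i in range(40)]]
train, test = data[::2], data[1::2]          # alternate points: train vs held out

def predict(x):
    """1-NN regressor: return the y of the closest training x (pure memorisation)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mse(dataset):
    return sum((predict(x) - y) ** 2 for x, y in dataset) / len(dataset)

print("training error:", mse(train))              # exactly 0.0: memorised
print("generalization (test) error:", mse(test))  # > 0 on unseen data
```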
TRAINING ERROR AND GENERALIZATION
ERROR
OVERFITTING
A statistical model is said to be overfitted when it makes accurate predictions on the training data but not on testing data. When a model is trained with too much capacity or for too long, it starts learning from the noise and inaccurate entries in the dataset, and testing with test data then results in high variance. The model does not categorize the data correctly, because of too many details and noise.
REASONS FOR OVERFITTING:
High variance and low bias.
The model is too complex.
The training dataset is too small.
Bias: the difference between the actual or expected values and the predicted values is known as error, bias error, or error due to bias.
Low Bias: Low bias value means fewer
assumptions are taken to build the target
function. In this case, the model will closely
match the training dataset.
High Bias: High bias value means more
assumptions are taken to build the target
function. In this case, the model will not match
the training dataset closely.
Variance is the measure of spread in data from its mean position. In machine learning, variance is the amount by which the performance of a predictive model changes when it is trained on different subsets of the training data.
Low variance: Low variance means that the
model is less sensitive to changes in the training
data and can produce consistent estimates of the
target function with different subsets of data
from the same distribution.
High variance: High variance means that the
model is very sensitive to changes in the training
data and can result in significant changes in the
estimate of the target function when trained on
different subsets of data from the same
distribution.
EXAMPLE
Actual value: 9.2
Predicted values: 8.9, 12.2, 7.2, 7.8
Bias = |actual - predicted|
Low bias: |9.2 - 8.9| = 0.3
High bias: |9.2 - 7.2| = 2.0
Variance = spread among the predicted outputs
Low variance: 7.2 and 7.8 (close together)
High variance: 8.9 and 12.2 (far apart)
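The arithmetic of this example can be checked directly in Python (using the population variance as one reasonable measure of spread; the low/high pairings follow the example above):

```python
import statistics

# The example's numbers: actual value 9.2, four predictions.
actual = 9.2
preds = [8.9, 12.2, 7.2, 7.8]

errors = [abs(actual - p) for p in preds]
print(errors)                                   # approx [0.3, 3.0, 2.0, 1.4]

# Variance as spread: the close pair versus the far-apart pair.
print(statistics.pvariance([7.2, 7.8]))         # approx 0.09 -> low variance
print(statistics.pvariance([12.2, 8.9]))        # approx 2.72 -> high variance
```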
TECHNIQUES TO REDUCE OVERFITTING
Increase the amount of training data.
Reduce model complexity.
UNDERFITTING
A machine learning algorithm is said to underfit when it cannot capture the underlying trend of the data, i.e., it performs poorly even on the training data, and consequently also performs poorly on testing data.
REASONS FOR UNDERFITTING
High bias and low variance.
The size of the training dataset used is not
enough.
The model is too simple.
Training data is not cleaned and contains noise.
TECHNIQUES TO REDUCE UNDERFITTING
Increase model complexity
Increase the number of features by performing feature engineering
Remove noise from the data.
BIAS-VARIANCE TRADE-OFF.
Bias is the difference between the predictions of the Machine Learning model and the correct values. High bias gives a large error on training as well as testing data.
The variability of model predictions for a given data point, which tells us the spread of our predictions, is called the variance of the model. A model with high variance fits the training data very closely and thus is not able to predict accurately on data it hasn't seen before. As a result, such models perform very well on training data but have high error rates on test data.
If the algorithm is too simple (a hypothesis with a linear equation), it may end up with high bias and low variance, and thus be error-prone. If the algorithm is too complex (a hypothesis with a high-degree equation), it may end up with high variance and low bias; in this latter condition, it will not perform well on new entries. There is a sweet spot between these two conditions, known as the Trade-off, or Bias-Variance Trade-off.