CS 601- Machine Learning
Introduction to Machine Learning
Machine learning is a tool for turning
information into knowledge.
Machine learning techniques are used to
automatically find the valuable underlying
patterns within complex data that we would
otherwise struggle to discover.
The hidden patterns and knowledge about a
problem can be used to predict future
events and perform all kinds of complex
decision making.
Tom Mitchell gave a well-posed mathematical
and relational definition: “A computer
program is said to learn from experience E with
respect to some task T and some performance
measure P, if its performance on T, as measured
by P, improves with experience E.”
For Example:
A checkers learning problem:
Task(T): Playing checkers.
Performance measure (P): Percent of
games won.
Training Experience ( E ): Playing practice games
against itself.
Need For Machine Learning
Ever since the technical revolution, we’ve been
generating an immeasurable amount of data.
With the availability of so much data, it is finally
possible to build predictive models that can
study and analyze complex data to find useful
insights and deliver more accurate results.
Top Tier companies such as Netflix and Amazon
build such Machine Learning models by using
tons of data in order to identify profitable
opportunities and avoid unwanted risks.
ML Vs AI Vs DL
Figure: 1.1
Important Terms of Machine Learning
Algorithm: A Machine Learning algorithm is a set of rules and
statistical techniques used to learn patterns from data and draw
significant information from it.
It is the logic behind a Machine Learning model. An example of a
Machine Learning algorithm is the Linear Regression algorithm.
Model: A model is the main component of Machine Learning. A
model is trained by using a Machine Learning Algorithm. An algorithm
maps all the decisions that a model is supposed to take based on the
given input, in order to get the correct output.
Predictor Variable: One or more features of the data that can be used
to predict the output.
Response Variable: It is the feature or the output variable that
needs to be predicted by using the predictor variable(s).
Training Data: The Machine Learning model is built using the
training data. The training data helps the model to identify key trends
and patterns essential to predict the output.
Testing Data: After the model is trained, it must be tested to
evaluate how accurately it can predict an outcome. This is done
using the testing data set.
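The split between training and testing data can be sketched in a few lines of plain Python; the tiny dataset and the 80/20 ratio below are invented for illustration (libraries such as scikit-learn provide a `train_test_split` helper for real use).

```python
import random

# Sketch of splitting a dataset into training and testing sets.
# The ten "examples" and the 80/20 ratio are assumptions for the demo.
random.seed(1)
data = list(range(10))            # stand-in for 10 labeled examples
random.shuffle(data)              # shuffle so the split is unbiased
split = int(0.8 * len(data))      # 80% for training, 20% for testing
train_data, test_data = data[:split], data[split:]
print(len(train_data), len(test_data))  # -> 8 2
```

The model sees only `train_data` while learning; `test_data` is held back so the evaluation reflects performance on unseen examples.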
A Machine Learning process begins by
feeding the machine lots of data. Using
this data, the machine is trained to detect
hidden insights and trends. These insights
are then used to build a Machine Learning
model by using an algorithm, in order to
solve a problem, as shown in Figure 1.2.
Figure: 1.2
Scope
Increase in Data Generation: Due to the excessive production of
data, we need a method that can be used to structure, analyze and
draw useful insights from it. This is where Machine Learning
comes in. It uses data to solve problems and find solutions to the
most complex tasks faced by organizations.
Improve Decision Making: By making use of various algorithms,
Machine Learning can be used to make better business decisions.
For example, Machine Learning is used to
forecast sales, predict downfalls in the stock
market, identify risks and anomalies, etc.
Uncover patterns & trends in data: Finding
hidden patterns and extracting key insights from
data is the most essential part of Machine Learning.
By building predictive models and using statistical
techniques, Machine Learning allows you to dig
beneath the surface and explore the data at a
minute scale. Understanding data and extracting
patterns manually will take days, whereas Machine
Learning algorithms can perform such computations
in less than a second.
Solve complex problems: From building self-driving
cars to detecting fraud, Machine Learning can be
used to solve the most complex problems.
Limitations and Open Questions
Some fundamental questions in machine learning remain open:
What algorithms exist for learning general target
functions from specific training examples?
In what settings will a particular algorithm converge to
the desired function, given sufficient training data?
Which algorithm performs best for which types of
problems and representations?
How much training data is sufficient?
When and how can prior knowledge held by the
learner guide the process of generalizing from
examples?
What is the best way to reduce the learning task to
one or more function approximation problems?
Machine Learning Algorithms Require Massive Stores
of Training Data.
Labeling Training Data Is a Tedious Process.
Machines Cannot Explain Themselves.
Machine Learning Types:
A machine can learn to solve a problem by
any one of the following three approaches,
which are the three main paradigms of
machine learning:
◦ Supervised Learning
◦ Unsupervised Learning
◦ Reinforcement Learning
Supervised Learning
◦ The model learns from labeled data,
meaning each input has a corresponding
correct output.
◦ The goal is to map inputs to outputs based
on example data.
◦ Common tasks: Classification (e.g., spam
detection) and Regression (e.g., predicting
house prices).
◦ Examples: Linear Regression, Decision
Trees, Neural Networks.
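The idea of learning a mapping from labeled examples can be sketched with a minimal 1-nearest-neighbour classifier; the fruit measurements and labels below are invented for illustration.

```python
# Minimal sketch of supervised learning: a 1-nearest-neighbour
# classifier labels a new input with the label of the closest
# labeled training example.

def predict_1nn(train, new_point):
    """Return the label of the training example nearest to new_point."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(train, key=lambda ex: sq_dist(ex[0], new_point))
    return label

# Labeled examples: ((weight_g, diameter_cm), label) -- invented data.
train = [((150, 7.0), "apple"), ((170, 7.5), "apple"),
         ((120, 6.0), "lemon"), ((110, 5.5), "lemon")]

print(predict_1nn(train, (160, 7.2)))  # -> apple
print(predict_1nn(train, (115, 5.8)))  # -> lemon
```

The "correct output" for each training input is its label; prediction is simply a lookup of the most similar labeled example.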
Unsupervised Learning
◦ The model learns patterns from
unlabeled data, meaning there are no
predefined outputs.
◦ The goal is to discover hidden
structures or groupings in data.
◦ Common tasks: Clustering (e.g.,
customer segmentation) and
Dimensionality Reduction (e.g.,
PCA).
◦ Examples: K-Means, DBSCAN,
Autoencoders.
Reinforcement Learning (RL)
◦ The model learns by interacting with an
environment and receiving rewards or
penalties.
◦ The goal is to maximize cumulative
rewards over time through trial and error.
◦ Used in fields like robotics, game playing
(e.g., AlphaGo), and autonomous
systems.
◦ Examples: Q-Learning, Deep Q Networks
(DQN), Proximal Policy Optimization
(PPO).
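The trial-and-error loop of reinforcement learning can be sketched with tabular Q-Learning on a toy environment; the 5-state corridor, reward, and hyperparameters below are all invented for illustration.

```python
import random

# Minimal tabular Q-Learning sketch on an invented 5-state corridor:
# the agent starts in state 0 and earns reward 1 for reaching state 4.
random.seed(0)
n_states = 5
actions = [-1, +1]                 # move left / move right
Q = {s: {a: 0.0 for a in actions} for s in range(n_states)}
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if random.random() < eps:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[s][act])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Q-Learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2].values()) - Q[s][a])
        s = s2

# The learned greedy policy moves right (+1) in every non-terminal state.
policy = {s: max(actions, key=lambda act: Q[s][act]) for s in range(n_states - 1)}
print(policy)
```

The agent is never told the correct action; maximizing cumulative reward over repeated episodes is what drives it toward always moving right.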
Common Machine Learning Algorithms
Several machine learning algorithms are commonly used. These include:
Neural networks: Neural networks function similarly to the human brain,
comprising multiple linked processing nodes. Neural networks excel at
pattern identification and are used in different applications such as natural
language processing, image recognition, speech recognition, and creating
images.
Linear regression: This algorithm predicts numerical values using a linear
relationship between variables. For example, linear regression is used to
forecast housing prices based on past data in a particular area.
Logistic regression: This supervised learning method predicts categorical
variables, such as "yes/no" replies to questions. It is suitable for applications
such as spam classification and quality control on a production line.
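The core of logistic regression is a linear score pushed through the sigmoid function to give a probability of the positive class. In the sketch below the weight and bias are hand-chosen, not learned, and the "number of links" feature for spam classification is an assumption for the demo.

```python
import math

# Sketch of logistic regression's prediction step: a linear score
# squashed by the sigmoid into a probability between 0 and 1.
def sigmoid(z):
    return 1 / (1 + math.exp(-z))

w, b = 1.2, -3.0                       # hand-chosen weights, not fitted
spam_score = lambda n_links: sigmoid(w * n_links + b)

print(round(spam_score(5), 3))         # many links -> high spam probability
print(round(spam_score(0), 3))         # few links  -> low spam probability
```

Thresholding the probability (e.g. at 0.5) turns the continuous output into the categorical "yes/no" reply the slide describes.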
Clustering: Clustering algorithms use unsupervised learning to find patterns
in data and organise it accordingly. Computers can assist data scientists by
identifying differences between data items that humans have overlooked.
Decision trees: Decision trees are useful for categorising data and for
regression analysis, which predicts numerical values. A tree structure can be
used to illustrate the branching sequence of linked decisions used in decision
trees. Unlike neural networks, decision trees can be easily validated and
audited.
Random forests: A random forest predicts a value or category by
combining the results of many decision trees.
Applications of Machine
Learning
Nowadays, Machine Learning is used almost everywhere.
Some of the most common application areas of Machine Learning are:
Speech recognition: It is also known as automatic speech recognition (ASR), computer
speech recognition, or speech-to-text, and it is a capability that uses natural language
processing (NLP) to translate human speech into a written format.
Customer service: Chatbots are replacing human operators on websites and social media,
affecting client engagement. Chatbots answer shipping FAQs, offer personalized advice, cross-
sell products, and recommend sizes. Some common examples are virtual agents on e-
commerce sites, Slack and Facebook Messenger bots, and virtual and voice assistants.
Computer vision: This artificial intelligence technology allows computers to derive
meaningful information from digital images, videos, and other visual inputs that can then be
used for appropriate action. Computer vision, powered by convolutional neural networks, is
used for photo tagging on social media, radiology imaging in healthcare, and self-driving cars
in the automotive industry.
Recommendation engines: AI algorithms may help to detect trends in data that might be
useful for developing more efficient marketing strategies using past data patterns. Online
retailers use recommendation engines to provide their customers with relevant product
recommendations for the purchasing process.
Robotic process automation (RPA): Also known as software robotics, RPA uses intelligent
automation technologies to perform repetitive manual tasks.
Automated stock trading: AI-driven high-frequency trading platforms are designed to
optimize stock portfolios and make thousands or even millions of trades each day without
human intervention.
Fraud detection: Machine learning is capable of detecting suspicious transactions for banks
and others in the financial sector. A model can be trained by supervised learning, based on
knowledge of recent fraudulent transactions. Anomaly detection can then identify transactions
that appear unusual and need to be followed up.
How does Machine Learning
work?
The mechanism by which a machine learns
is divided into three main components −
Decision Process − Based on the input data and
output labels provided to the model, it produces
logic about the identified pattern.
Cost Function − It is the measure of error between
the expected value and the predicted value. This is
used to evaluate the performance of the machine
learning model.
Optimization Process − The cost function can be
minimized by adjusting the weights during the training
stage. The algorithm repeats the process of
evaluation and optimization until the error is minimized.
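The three components above can be sketched on a one-parameter model y = w·x; the data below is invented and follows y = 2x exactly, so the optimization should drive the weight toward 2.

```python
# Sketch of decision process, cost function, and optimization
# on a one-parameter model y_hat = w * x (invented data, y = 2x).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def cost(w):
    """Cost function: mean squared error between expected and predicted."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

w = 0.0      # decision process: predictions are produced by w * x
lr = 0.02    # learning rate for the optimization process
for step in range(200):
    # Optimization: adjust the weight against the gradient of the cost.
    grad = sum(-2 * x * (y - w * x) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 3))  # -> 2.0
```

Each loop iteration is one round of "evaluation and optimization": the gradient measures how the error changes with the weight, and the update moves the weight to reduce it.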
Machine Learning Model
Before discussing the machine learning model, we need to understand
the following formal definition of ML given by Professor Mitchell −
“A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its
performance at tasks in T, as measured by P, improves with
experience E.”
The above definition focuses on three parameters, which are also the
main components of any learning algorithm: Task (T), Performance (P)
and Experience (E). In this context, we can simplify the definition as −
ML is a field of AI consisting of learning algorithms that −
Improve their performance (P)
At executing some task (T)
Over time with experience (E)
Based on the above, the following diagram represents a Machine Learning
Model.
Task(T)
From the perspective of a problem, we may define the task T as the real-world problem to
be solved. The problem can be anything, such as finding the best house price in a specific
location or finding the best marketing strategy. In machine learning, however, the
definition of a task is different, because it is difficult to solve ML-based tasks by a
conventional programming approach.
A task T is said to be an ML-based task when it is based on a process that the system
must follow for operating on data points. Examples of ML-based tasks are
classification, regression, structured annotation, clustering, transcription, etc.
Experience (E)
As the name suggests, it is the knowledge gained from the data points provided to the
algorithm or model. Once provided with the dataset, the model runs iteratively and learns
some inherent pattern. The learning thus acquired is called experience (E). Making an
analogy with human learning, we can think of this as the situation in which a human being
is learning or gaining experience from various attributes such as situations, relationships,
etc. Supervised, unsupervised and reinforcement learning are some ways to learn or gain
experience. The experience gained by our ML model or algorithm is used to solve
the task T.
Performance (P)
An ML algorithm is supposed to perform its task and gain experience with the passage of
time. The measure that tells whether the ML algorithm is performing as expected
is its performance (P). P is basically a quantitative metric that tells how well a model
performs the task T using its experience E. There are many metrics that help to
understand ML performance, such as accuracy score, F1 score, confusion matrix,
precision, recall, sensitivity, etc.
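Several of the performance measures just listed can be computed by hand from a confusion-matrix count; the true and predicted labels below are invented (1 = positive class).

```python
# Sketch of common performance measures (P) computed from invented
# true/predicted labels for a binary classification task.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision = tp / (tp + fp)   # of predicted positives, how many are correct
recall = tp / (tp + fn)      # of actual positives, how many are found
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)  # -> 0.75 0.75 0.75 0.75
```

Recall is also called sensitivity; the F1 score is the harmonic mean of precision and recall, useful when the classes are imbalanced.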
Supervised Learning
Supervised learning uses labeled datasets to train
algorithms to understand data patterns and predict outcomes.
◦ For example, filtering a mail into the inbox or the spam folder.
Supervised learning can further be classified into two
types − classification and regression.
There are different supervised learning algorithms that are
widely used −
◦ Linear Regression
◦ Logistic Regression
◦ Decision Trees
◦ Random Forest
◦ K-nearest Neighbor
◦ Support Vector Machine
◦ Naive Bayes
◦ Linear Discriminant Analysis
◦ Neural Networks
Linear regression
Linear regression in machine learning is defined as a
statistical model that analyzes the linear relationship
between a dependent variable and a given set of
independent variables.
The linear relationship between variables means that
when the value of one or more independent variables will
change (increase or decrease), the value of the
dependent variable will also change accordingly (increase
or decrease).
In machine learning, linear regression is used for
predicting continuous numeric values based on learned
linear relation for new and unseen data.
It is used in predictive modeling, financial forecasting, risk
assessment, etc.
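For a single independent variable, the least-squares line can be fitted in closed form: slope = cov(x, y) / var(x). The house-size/price numbers below are invented for illustration and happen to lie exactly on a line.

```python
# Minimal sketch: simple linear regression fitted in closed form.
sizes = [50, 70, 90, 110]     # independent variable (square metres)
prices = [150, 190, 230, 270] # dependent variable (invented, in thousands)

n = len(sizes)
mx = sum(sizes) / n
my = sum(prices) / n
# Least-squares estimates: slope = cov(x, y) / var(x).
slope = sum((x - mx) * (y - my) for x, y in zip(sizes, prices)) \
        / sum((x - mx) ** 2 for x in sizes)
intercept = my - slope * mx

predict = lambda x: intercept + slope * x   # prediction for unseen data
print(slope, intercept, predict(100))       # -> 2.0 50.0 250.0
```

Because the invented data is exactly linear, the fitted line recovers it perfectly; on real data the residuals would be nonzero and the line minimizes their squared sum.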
What is Clustering?
The task of grouping data points based on their
similarity with each other is called Clustering or Cluster
Analysis.
This method is defined under the branch of
unsupervised learning, which aims at gaining insights from
unlabelled data points.
Think of it this way: you have a dataset of customers'
shopping habits.
Clustering can help you group customers with similar
purchasing behaviors, which can then be used for
targeted marketing, product recommendations, or
customer segmentation
For example, in the graph given below, we can clearly see that
there are 3 circular clusters forming on the basis of distance.
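The grouping-by-distance idea can be sketched with a bare-bones k-means loop; the nine 2-D points below are invented to fall into three well-separated groups, and the starting centers are hand-picked for determinism (a real run would pick them randomly, or use a library such as scikit-learn's KMeans).

```python
# Minimal k-means sketch: alternate assigning points to the nearest
# center and recomputing each center as its cluster's mean.
points = [(1, 1), (1.5, 2), (2, 1),      # invented group near (1.5, 1.3)
          (8, 8), (8.5, 9), (9, 8),      # invented group near (8.5, 8.3)
          (1, 9), (2, 8.5), (1.5, 9.5)]  # invented group near (1.5, 9.0)

def closest(p, centers):
    """Index of the center nearest to point p (squared distance)."""
    return min(range(len(centers)),
               key=lambda i: (p[0] - centers[i][0]) ** 2
                           + (p[1] - centers[i][1]) ** 2)

centers = [(1, 1), (8, 8), (1, 9)]       # hand-picked initial centers
for _ in range(10):                      # assignment / update iterations
    clusters = [[] for _ in centers]
    for p in points:
        clusters[closest(p, centers)].append(p)
    centers = [(sum(x for x, _ in c) / len(c),
                sum(y for _, y in c) / len(c)) for c in clusters]

print([len(c) for c in clusters])  # -> [3, 3, 3]
```

No labels are involved anywhere: the three groups emerge purely from the distances between points, which is what makes this unsupervised.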
Unit 5
🧠 Support Vector Machines (SVM)
Definition:
SVM is a supervised ML algorithm used for classification and regression tasks.
It aims to find the hyperplane that best separates data points of different
classes.
Key Concepts:
Margin: Distance between the hyperplane and the closest data points
(support vectors).
Kernel Trick: Allows SVM to operate in a high-dimensional space (e.g., RBF,
polynomial).
Support Vectors: Data points closest to the decision boundary.
Pros:
Effective in high-dimensional spaces.
Works well with clear margin of separation.
Cons:
Not suitable for large datasets.
Requires good kernel choice and parameter tuning.
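The margin and support-vector concepts above can be illustrated numerically: for a hyperplane w·x + b = 0, a point's distance to it is |w·x + b| / ||w||, and the support vectors are the closest points. The hyperplane and points below are hand-chosen for the sketch, not learned by an SVM solver.

```python
import math

# Sketch of the SVM margin idea on a hand-chosen (not learned)
# separating hyperplane x + y - 3 = 0 in 2-D.
w, b = (1.0, 1.0), -3.0
points = [(0, 1), (1, 1), (2, 2), (3, 3), (4, 2)]  # invented data

norm = math.hypot(*w)  # ||w||

def distance(p):
    """Perpendicular distance from point p to the hyperplane."""
    return abs(w[0] * p[0] + w[1] * p[1] + b) / norm

closest = min(distance(p) for p in points)
support_vectors = [p for p in points if math.isclose(distance(p), closest)]
print(support_vectors)  # -> [(1, 1), (2, 2)]
```

Training an SVM amounts to choosing w and b so that this minimum distance (the margin) is as large as possible; only the support vectors constrain that choice.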
🧮 Bayesian Learning
Definition:
Bayesian learning uses Bayes’ Theorem to update the
probability of a hypothesis as more evidence becomes
available.
Bayes’ Theorem:
P(H|D) = [P(D|H) · P(H)] / P(D)
Where:
H: Hypothesis
D: Data
P(H|D): Posterior probability
P(D|H): Likelihood
P(H): Prior probability
P(D): Evidence
Applications:
Naive Bayes classifiers
Probabilistic graphical models
Spam filtering, text classification
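Bayes' Theorem can be worked through numerically for the spam-filtering case; all the probabilities below are invented for illustration, with H = "message is spam" and D = "message contains the word 'free'".

```python
# Numeric sketch of Bayes' Theorem with invented probabilities.
p_h = 0.2            # prior P(H): 20% of messages are spam
p_d_given_h = 0.6    # likelihood P(D|H): spam often contains "free"
p_d_given_not_h = 0.05

# Evidence P(D) by total probability, then posterior P(H|D).
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
posterior = p_d_given_h * p_h / p_d
print(round(posterior, 3))  # -> 0.75
```

Seeing the word raises the spam probability from the 20% prior to a 75% posterior; as more evidence (more words) arrives, the posterior is updated again in the same way, which is exactly how a Naive Bayes spam filter accumulates evidence.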
Applications of ML
1. Computer Vision
Tasks: Object detection, image classification, segmentation
Techniques: CNNs, transfer learning
Examples: Facial recognition, medical imaging, self-driving
cars
2. Speech Processing
Tasks: Speech recognition, speaker identification, emotion
detection
Techniques: RNNs, LSTMs, Transformers
Examples: Siri, Google Assistant, call center automation
3. Natural Language Processing (NLP)
Tasks: Sentiment analysis, machine translation, question
answering
Techniques: BERT, GPT, LLMs, RNNs
Examples: Chatbots, translation services, summarization tools
Models pre-trained on ImageNet are now used in transfer
learning.
Case Study: ImageNet Competition (ILSVRC)
Overview:
Annual competition from 2010–2017 focused on large-
scale visual recognition.
Dataset: 14M+ labeled images in 1000 categories.
Significant Milestones:
2012 (AlexNet): Deep CNN by Krizhevsky et al. reduced
top-5 error from 26% to 15%.
2014 (GoogLeNet, VGG): Deeper architectures,
inception modules.
2015–2017: ResNet introduced residual learning,
reaching <3.6% error.
Impact:
Sparked the deep learning revolution.
Enabled breakthroughs in image classification and other
domains.