Unit 4 : Linear Discriminants for Machine Learning
Introduction to Linear Discriminants:
Linear discriminants (LDs) are a statistical method used in machine learning to reduce the
dimensionality of data and achieve optimal discrimination among different classes. The
method involves finding a linear combination of features that can effectively separate two or
more classes of objects.
LDs are widely used in various applications like pattern recognition and image retrieval.
The method is based on discriminant functions that are estimated from a set of data called the
training set. These discriminant functions are linear with respect to the characteristic vector,
and usually have the form
f(x) = wTx + b0,
where w represents the weight vector, x the characteristic vector, and b0 a threshold.
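As a tiny illustration, the discriminant above can be evaluated directly; the weight vector, threshold, and feature values below are hypothetical choices, not values from the text:

```python
import numpy as np

# Hypothetical weight vector, threshold, and feature vector for illustration.
w = np.array([2.0, -1.0])   # weight vector w
b0 = 0.5                    # threshold b0
x = np.array([1.0, 3.0])    # characteristic (feature) vector x

# Evaluate the linear discriminant f(x) = w^T x + b0.
f = w @ x + b0              # 2*1 + (-1)*3 + 0.5 = -0.5

# A common decision rule: class 1 if f(x) >= 0, else class 2.
label = 1 if f >= 0 else 2
print(f, label)             # -0.5 -> class 2
```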
Linear Discriminants for Classification:
Linear discriminants are linear functions used in classification to separate data points
belonging to different classes. They aim to find a linear combination of features that best
separates these classes, often used in Linear Discriminant Analysis (LDA).
LDA is a supervised learning technique that also serves as a dimensionality reduction
method.
Why they are used:
Linear discriminants are linear functions that act as decision boundaries in
classification.
They aim to find a line (in 2D) or a hyperplane (in higher dimensions) that best
separates different classes.
They are used in algorithms like Linear Discriminant Analysis (LDA).
Let’s suppose we have d-dimensional data points x1, …, xn belonging to two
classes Ci, i = 1, 2, having N1 and N2 samples respectively.
Consider W as a unit vector onto which we will project the data points. Since we
are only concerned with the direction, we choose a unit vector for this purpose.
Number of samples : N = N1 + N2
If x(n) are the samples in the feature space, then WTx(n) denotes the data points
after projection.
Means of classes before projection: mi
Means of classes after projection: Mi = WTmi
Figure: a data point X before and after projection.
Scatter matrix: used to make estimates of the covariance matrix. It is an m × m positive
semi-definite matrix.
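The projection direction that best separates the two classes (Fisher's criterion) can be sketched as follows; the toy data points are hypothetical, and w is obtained as the solution of S_W w = (m1 − m2), normalized to a unit vector since only the direction matters:

```python
import numpy as np

# Toy 2-D data for two classes (hypothetical values for illustration).
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # class 1, N1 = 3
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])   # class 2, N2 = 3

# Class means before projection (m_i).
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

# Within-class scatter matrix S_W = S_1 + S_2 (sum of per-class scatter).
S1 = (X1 - m1).T @ (X1 - m1)
S2 = (X2 - m2).T @ (X2 - m2)
SW = S1 + S2

# Fisher's criterion is maximized by w proportional to SW^{-1} (m1 - m2).
w = np.linalg.solve(SW, m1 - m2)
w = w / np.linalg.norm(w)          # unit vector: only the direction matters

# Means after projection: M_i = w^T m_i; they are well separated.
M1, M2 = w @ m1, w @ m2
print(M1, M2)
```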
Perceptron Classifier:
One of the earliest and most basic machine learning methods used for binary classification is
the perceptron. Frank Rosenblatt created it in the late 1950s, and it is a key component of
more intricate neural network topologies.
A simple binary linear classifier called a perceptron generates predictions based on
the weighted average of the input data. Based on whether the weighted total exceeds a
predetermined threshold, a threshold function determines whether to output a 0 or a 1.
Figure: a single-layer perceptron.
Components of a Perceptron:
1. Input Features (x): the characteristics or attributes of the input data on which
predictions are based. Each feature is represented by a numeric value. In binary
classification, the two classes are commonly represented by the numbers 0 (negative
class) and 1 (positive class).
2. Input Weights (w): each input feature has a weight (w), which establishes its
significance when formulating predictions.
The weights are also numeric values and are initialized either to zeros or to
small random values.
3. Weighted Sum (∑f(x)): the dot product of the input features (x) and their
associated weights (w).
Mathematically, it is written as ∑f(x) = w1x1 + w2x2 + ... + wnxn.
4. Activation Function (Step Function): the activation function, commonly a step
function, is applied to the weighted sum ∑f(x). The step function decides the
perceptron's output based on whether the weighted sum exceeds a predetermined
threshold.
The output is 1 (positive class) if ∑f(x) is greater than or equal to the
threshold, and 0 (negative class) otherwise.
Working of the Perceptron:
1. Initialization: the weights (w) are initialized, frequently with small random
values or zeros.
2. Prediction: to make a prediction for a particular input, the perceptron calculates
the weighted sum ∑f(x) of the input features and weights.
3. Activation Function: after the weighted sum ∑f(x) is computed, the step
activation function is applied. The perceptron outputs 1 (positive class)
if ∑f(x) is greater than or equal to a specific threshold; otherwise, it outputs 0
(negative class).
4. Updating Weights: if the perceptron makes a misclassification (an inaccurate
prediction), the weights are updated to reduce prediction error in the future.
The update rule shifts the weights in a direction that lowers the error. The most
widely used rule is the perceptron learning rule, which is based on the
discrepancy between the target and predicted class labels.
5. Repeat: steps 2 through 4 are repeated for each input data point in the training
dataset. This process continues until the model converges and correctly
classifies the training data, which may take a number of iterations.
Perceptron Learning Algorithm:
Steps:
1. Initialize weights wi and bias b (usually with zeros or small random numbers)
2. For each training sample (x,t):
o Compute the predicted output: y = step(w⋅x + b)
o Update the weights and bias:
wi = wi + η(t − y)xi
b = b + η(t − y)
o Where:
t = target output
y = predicted output
η = learning rate (typically small, e.g., 0.1)
3. Repeat until convergence or a maximum number of iterations is reached.
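The steps above can be sketched as a minimal NumPy perceptron; the AND function used as training data is a hypothetical (but linearly separable) toy task:

```python
import numpy as np

# Minimal perceptron trained on the AND function (a linearly separable toy task).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1])        # target labels

w = np.zeros(2)                   # Step 1: weights initialized to zeros
b = 0.0                           # bias
eta = 0.1                         # learning rate

def step(z):
    """Step activation: 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if z >= 0 else 0

# Step 3: repeat the update rule until convergence (or an iteration cap).
for epoch in range(100):
    errors = 0
    for xi, ti in zip(X, t):
        y = step(w @ xi + b)      # Step 2: predicted output
        w += eta * (ti - y) * xi  # wi = wi + eta * (t - y) * xi
        b += eta * (ti - y)      # b  = b  + eta * (t - y)
        errors += int(y != ti)
    if errors == 0:               # converged: every sample classified correctly
        break

preds = [step(w @ xi + b) for xi in X]
print(preds)                      # matches the AND targets: [0, 0, 0, 1]
```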
Support Vector Machines:
Support Vector Machine (SVM) is a supervised machine learning algorithm used for
classification and regression tasks. SVM is particularly well-suited for classification tasks.
SVM aims to find the optimal hyperplane in an N-dimensional space to separate data points
into different classes. The algorithm maximizes the margin between the closest points of
different classes.
How does Support Vector Machine Algorithm Work?
The key idea behind the SVM algorithm is to find the hyperplane that best separates two
classes by maximizing the margin between them. This margin is the distance from the
hyperplane to the nearest data points (support vectors) on each side.
Figure: multiple hyperplanes separating the data from two classes.
The best hyperplane, also known as the “hard margin” hyperplane, is the one that
maximizes the distance between the hyperplane and the nearest data points from both
classes. This ensures a clear separation between the classes. So, in the figure above, L2 is
chosen as the hard-margin hyperplane.
Mathematical Computation: SVM
Consider a binary classification problem with two classes, labeled as +1 and -1. We have a
training dataset consisting of input feature vectors X and their corresponding class labels Y.
The equation for the linear hyperplane can be written as: wTx+b=0
Where:
w is the normal vector to the hyperplane (the direction perpendicular to it).
b is the offset or bias term, representing the distance of the hyperplane from the origin
along the normal vector w.
Distance from a Data Point to the Hyperplane
The distance between a data point x_i and the decision boundary can be calculated as:
d_i = (wTx_i + b) / ||w||
where ||w|| represents the Euclidean norm of the normal vector w.
Linear SVM Classifier
Prediction (decision rule):
ŷ = 1 if wTx + b ≥ 0
ŷ = 0 if wTx + b < 0
Where ŷ is the predicted label of a data point.
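A minimal sketch of the distance formula and decision rule above; the values of w and b are hypothetical:

```python
import numpy as np

# Hypothetical hyperplane parameters (w, b) for illustration.
w = np.array([3.0, 4.0])
b = -12.0

def distance(x):
    """Distance from point x to the hyperplane w^T x + b = 0: |w^T x + b| / ||w||."""
    return abs(w @ x + b) / np.linalg.norm(w)

def predict(x):
    """Decision rule: class 1 if w^T x + b >= 0, else class 0."""
    return 1 if w @ x + b >= 0 else 0

x = np.array([2.0, 3.0])
print(distance(x), predict(x))   # |6 + 12 - 12| / 5 = 1.2, class 1
```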
Support Vector Machine (SVM) Terminology
Hyperplane: A decision boundary separating different classes in feature space,
represented by the equation wx + b = 0 in linear classification.
Support Vectors: The closest data points to the hyperplane, crucial for determining
the hyperplane and margin in SVM.
Margin: The distance between the hyperplane and the support vectors. SVM aims to
maximize this margin for better classification performance.
Kernel: A function that maps data to a higher-dimensional space, enabling SVM to
handle non-linearly separable data.
Hard Margin: A maximum-margin hyperplane that perfectly separates the data
without misclassifications.
Soft Margin: Allows some misclassifications by introducing slack variables,
balancing margin maximization and misclassification penalties when data is not
perfectly separable.
Linearly Non-Separable Case:
When data is not linearly separable (i.e., it can’t be divided by a straight line), SVM uses a
technique called kernels to map the data into a higher-dimensional space where it becomes
separable. This transformation helps SVM find a decision boundary even for non-linear data.
Figure: original 1D dataset for classification.
A kernel is a function that maps data points into a higher-dimensional space without
explicitly computing the coordinates in that space. This allows SVM to work efficiently with
non-linear data by implicitly performing the mapping.
For example, consider data points that are not linearly separable. By applying a kernel
function, SVM transforms the data points into a higher-dimensional space where they
become linearly separable.
Linear Kernel: For linear separability.
Polynomial Kernel: Maps data into a polynomial space.
Radial Basis Function (RBF) Kernel: Transforms data into a space based on
distances between data points.
The SVM algorithm is of two types:
Linear SVM: When the data points are linearly separable into two classes, the data is
called linearly-separable data. We use the linear SVM classifier to classify such data.
Non-linear SVM: When the data is not linearly separable, we use the non-linear SVM
classifier to separate the data points.
Non-linear SVM:
Nonlinear SVM was introduced when the data cannot be separated by a linear decision
boundary in the original feature space.
The kernel function computes the similarity between data points, allowing SVM to
capture complex patterns and nonlinear relationships between features. This enables
nonlinear SVM to form curved or circular decision boundaries with the help of kernels.
For example:
Suppose there are two classes of data points that a straight line cannot separate, but that a
circular boundary can. We can then introduce a new coordinate Z, derived from X and Y,
where Z = X² + Y². After introducing this third dimension, the two classes become
separable by a plane.
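A small numeric sketch of this transformation, assuming two hypothetical rings of points at radii 1 and 3:

```python
import numpy as np

# Points on an inner ring (class 0) and an outer ring (class 1): not separable
# by a straight line in the (X, Y) plane.
theta = np.linspace(0, 2 * np.pi, 20, endpoint=False)
inner = np.c_[1.0 * np.cos(theta), 1.0 * np.sin(theta)]   # radius 1
outer = np.c_[3.0 * np.cos(theta), 3.0 * np.sin(theta)]   # radius 3

# Add the third coordinate Z = X^2 + Y^2.
z_inner = inner[:, 0]**2 + inner[:, 1]**2                 # all equal to 1
z_outer = outer[:, 0]**2 + outer[:, 1]**2                 # all equal to 9

# In the new dimension a single threshold (e.g. Z = 5) separates the classes.
print(z_inner.max(), z_outer.min())
```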
Types of Kernels used in SVM
Here are some common types of kernels used by SVM. Let’s understand them one by one:
1. Linear Kernel
A linear kernel is the simplest form of kernel used in SVM. It is suitable when the
data is linearly separable meaning that a straight line (or hyperplane in higher
dimensions) can effectively separate the classes.
It is represented as: K(x,y)=x.y
It is used for text classification problems such as spam detection
2. Polynomial Kernel
The polynomial kernel allows SVM to model more complex relationships by
introducing polynomial terms. It is useful when the data is not linearly separable but
still follows a pattern. The formula of Polynomial kernel is:
K(x,y) = (x.y + c)^d, where c is a constant and d is the polynomial degree.
It is used in Complex problems like image recognition where relationships between
features can be non-linear.
3. Radial Basis Function Kernel (RBF) Kernel
The RBF kernel is the most widely used kernel in SVM. It maps the data into an
infinite-dimensional space making it highly effective for complex classification
problems. The formula of RBF kernel is:
K(x,y) = exp(−γ||x − y||²), where γ is a parameter that controls the influence of each
training example.
We use the RBF kernel when the decision boundary is highly non-linear and no prior
knowledge about the data’s structure is available.
4. Gaussian Kernel
The Gaussian kernel is a special case of the RBF kernel and is widely used for non-
linear data classification. It provides smooth and continuous transformations of data
into higher dimensions. It can be represented by:
K(x,y) = exp(−||x − y||² / (2σ²)), where σ is a parameter that controls the spread of the
kernel function.
It is used when data has a smooth, continuous distribution and requires a flexible
boundary.
5. Sigmoid Kernel
The sigmoid kernel is inspired by neural networks and behaves similarly to the
activation function of a neuron. It is based on the hyperbolic tangent function and is
suitable for neural networks and other non-linear classifiers. It is represented as:
K(x,y)=tanh(γ.xTy+r)
It is often used in neural networks and non-linear classifiers.
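The kernel formulas above can be written as plain functions; the parameter values (γ, c, d, r, σ) below are hypothetical choices for illustration:

```python
import numpy as np

# The five kernels listed above, as plain functions of two feature vectors.
def linear(x, y):
    return x @ y                                  # K(x,y) = x . y

def polynomial(x, y, c=1.0, d=3):
    return (x @ y + c) ** d                       # K(x,y) = (x . y + c)^d

def rbf(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))  # exp(-gamma ||x - y||^2)

def gaussian(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def sigmoid(x, y, gamma=0.1, r=0.0):
    return np.tanh(gamma * (x @ y) + r)           # tanh(gamma x^T y + r)

x, y = np.array([1.0, 2.0]), np.array([2.0, 1.0])
print(linear(x, y), polynomial(x, y), rbf(x, y))  # 4.0, 125.0, exp(-1)
```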
Kernel Trick:
The kernel trick computes the dot product of data points in the higher-dimensional
space directly, without ever transforming the data explicitly. This helps a model find
patterns in complex data: the data behaves as if it had been mapped into a higher-dimensional
space where it becomes easier to separate different classes or detect relationships.
For example, imagine we have data points shaped like two concentric circles: one circle
represents one class and the other circle represents another class. If we try to separate these
classes with a straight line it can’t be done because the data is not linearly separable in its
current form.
When we use a kernel function, it transforms the original 2D data (like the concentric circles)
into a higher-dimensional space where the data becomes linearly separable. In that higher-
dimensional space, the SVM finds a simple straight-line decision boundary to separate the
classes.
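A minimal sketch of why the trick works, using a degree-2 polynomial kernel K(x, y) = (x · y)² as a hypothetical example: its value equals the dot product under an explicit quadratic feature map φ, yet the map never has to be computed:

```python
import numpy as np

def phi(v):
    """Explicit map of a 2-D point to 3-D: (v1^2, sqrt(2)*v1*v2, v2^2)."""
    return np.array([v[0]**2, np.sqrt(2) * v[0] * v[1], v[1]**2])

def K(x, y):
    """Same value, computed directly in the original 2-D space."""
    return (x @ y) ** 2

x, y = np.array([1.0, 2.0]), np.array([3.0, 1.0])
print(K(x, y), phi(x) @ phi(y))   # both equal 25.0
```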
Logistic Regression:
Logistic regression is another supervised learning algorithm, used to solve classification
problems. In classification problems, the dependent variable is in a binary or discrete
format, such as 0 or 1.
Logistic regression algorithm works with the categorical variable such as 0 or 1, Yes or No,
True or False, Spam or not spam, etc.
It is a predictive analysis algorithm that works on the concept of probability. Logistic
regression is a type of regression, but it differs from the linear regression algorithm in
how it is used.
Logistic regression uses the sigmoid function, also called the logistic function, to model
the data. It is represented as:
f(x) = 1 / (1 + e^(−x))
Where f(x) = output between the 0 and 1 value, x = input to the function, and e = base of
the natural logarithm.
When we provide the input values (data) to the function, it produces an S-shaped curve.
It uses the concept of threshold levels: values above the threshold level are rounded up to 1,
and values below the threshold level are rounded down to 0.
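A minimal sketch of the sigmoid and threshold rule described above, using 0.5 as a hypothetical threshold:

```python
import numpy as np

# The sigmoid f(x) = 1 / (1 + e^{-x}) squashes any input into (0, 1); a
# threshold (0.5 here) then rounds the probability to a class label.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(x, threshold=0.5):
    return 1 if sigmoid(x) >= threshold else 0

print(sigmoid(0.0))                     # 0.5: the midpoint of the S-curve
print(classify(2.0), classify(-2.0))    # 1, 0
```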
There are three types of logistic regression:
• Binary (0/1, pass/fail) – only two possible outcomes. Example: assessing cancer risk
as either high or low.
• Multinomial (cats, dogs, lions) – two or more unordered outcomes.
• Ordinal (low, medium, high) – two or more ordered outcomes.
Linear Regression:
Linear regression is a statistical regression method which is used for predictive analysis.
It is one of the simplest algorithms; it works on regression and shows
the relationship between continuous variables.
It is used for solving the regression problem in machine learning.
Linear regression shows the linear relationship between the independent variable(X-
axis) and the dependent variable (Y-axis), hence called linear regression.
If there is only one input variable (x), then such linear regression is called simple linear
regression. And if there is more than one input variable, then such linear regression is
called multiple linear regression.
The relationship between variables in the linear regression model can be illustrated with
an example: predicting the salary of an employee on the basis of years of experience.
Below is the mathematical equation for linear regression: Y = aX + b
Here, Y = dependent variable (target variable), X = independent variable (predictor variable),
and a and b are the linear coefficients.
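A short sketch of fitting Y = aX + b by least squares; the experience/salary numbers below are hypothetical:

```python
import numpy as np

# Toy experience/salary data (hypothetical numbers for illustration).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])       # years of experience
Y = np.array([30.0, 35.0, 40.0, 45.0, 50.0])  # salary (in thousands)

# np.polyfit(X, Y, 1) returns the slope a and intercept b of the best-fit line.
a, b = np.polyfit(X, Y, 1)
print(a, b)                  # 5.0, 25.0 for this exactly linear data

# Predict the salary for 6 years of experience.
print(a * 6 + b)             # 55.0
```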
Some popular applications of linear regression are:
• Analyzing trends and sales estimates
• Salary forecasting
• Real estate prediction
• Arriving at ETAs in traffic.
Types of Perceptron Models
Based on the number of layers, perceptrons are broadly classified into two major categories:
Single Layer Perceptron Model:
It is the simplest Artificial Neural Network (ANN) model. A single-layer perceptron model
consists of a feed-forward network and includes a threshold transfer function applied to the
output. The main objective of the single-layer perceptron model is to classify linearly
separable data with binary labels.
Multi-Layer Perceptron Model:
The multi-layer perceptron learning algorithm has the same basic structure as a single-layer
perceptron but includes one or more additional hidden layers between the input and output
layers, which a single-layer perceptron lacks.
Multi-Layer Perceptrons (MLPs):
The main shortcoming of simple feed-forward networks such as the perceptron was
their inability to learn non-linear decision boundaries. Multi-layer perceptrons are
neural networks which incorporate multiple hidden layers and activation
functions. The learning takes place in a supervised manner, where the weights
are updated by means of gradient descent. A multi-layer perceptron is bi-
directional: forward propagation of the inputs, and backward
propagation of the weight updates. The activation function can be changed
with respect to the type of target: softmax is usually used for multi-class
classification, sigmoid for binary classification, and so on. These are also
called dense networks because all the neurons in a layer are connected to all the
neurons in the next layer.
Multilayer networks learned by the BACKPROPAGATION algorithm
are capable of expressing a rich variety of nonlinear decision surfaces.
A Differentiable Threshold Unit (Sigmoid unit)
Sigmoid unit: a unit very much like a perceptron, but based on a smoothed, differentiable
threshold function.
• The sigmoid unit first computes a linear combination of its inputs, then
applies a threshold to the result; here the thresholded output is a continuous
function of its input.
• More precisely, the sigmoid unit computes its output o as
o = σ(w ⋅ x), where σ(y) = 1 / (1 + e^(−y))
Backpropagation for Training an MLP:
Backpropagation is an algorithm that propagates the errors from the output
nodes back to the input nodes. Therefore, it is simply referred to as backward
propagation of errors. It is used in a wide range of neural-network applications,
such as character recognition and signature verification.
Working of Backpropagation:
Neural networks use supervised learning to generate output vectors from input
vectors that the network operates on. The network compares the generated output to the
desired output and computes an error if the two do not match. It then adjusts the
weights according to this error so as to approach the desired output.
Backpropagation Algorithm:
Step 1: Inputs X arrive through the preconnected path.
Step 2: The input is modeled using actual weights W. Weights are usually chosen
randomly.
Step 3: Calculate the output of each neuron from the input layer, through the
hidden layer, to the output layer.
Step 4: Calculate the error in the outputs
Backpropagation Error= Actual Output – Desired Output
Step 5: From the output layer, go back to the hidden layer to adjust the weights to
reduce the error.
Step 6: Repeat the process until the desired output is achieved.
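Steps 1 through 5 can be sketched for a tiny 2-2-1 network with sigmoid units; the input, target, and random initialization below are hypothetical, and a single gradient-descent update is shown:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny 2-2-1 network and one training pair (hypothetical values for illustration).
x = np.array([1.0, 0.0])         # Step 1: input
t = np.array([1.0])              # desired output

W1 = rng.normal(0, 0.5, (2, 2))  # Step 2: input -> hidden weights (random init)
W2 = rng.normal(0, 0.5, (1, 2))  # hidden -> output weights
eta = 0.5                        # learning rate

def forward(W1, W2):
    h = sigmoid(W1 @ x)          # Step 3: layer-by-layer outputs
    o = sigmoid(W2 @ h)
    return h, o

h, o = forward(W1, W2)
err_before = 0.5 * np.sum((t - o) ** 2)   # Step 4: squared error

# Step 5: propagate the error backward and adjust the weights.
delta_o = (o - t) * o * (1 - o)           # output-layer error term
delta_h = (W2.T @ delta_o) * h * (1 - h)  # hidden-layer error term
W2 -= eta * np.outer(delta_o, h)
W1 -= eta * np.outer(delta_h, x)

_, o_new = forward(W1, W2)
err_after = 0.5 * np.sum((t - o_new) ** 2)
print(err_before, err_after)     # the error shrinks after one update
```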
• The BACKPROPAGATION Algorithm learns the weights for a multilayer
network, given a network with a fixed set of units and interconnections. It
employs gradient descent to attempt to minimize the squared error between the
network output values and the target values for these outputs.
• In the BACKPROPAGATION algorithm, we consider networks with multiple
output units rather than single units as before, so we redefine E to sum the
errors over all of the network output units.