Understanding Support Vector Machines

Support Vector Machine (SVM) is a versatile supervised machine learning algorithm used for classification and regression, focused on finding the optimal hyperplane that maximizes the margin between classes. It can handle both linear and nonlinear data through the use of kernel functions, and its soft-margin formulation makes it robust to outliers. Additionally, the document discusses Bayesian classification based on Bayes' Theorem and various clustering methods, including partitioning, hierarchical, density-based, grid-based, and model-based approaches.


2) Support Vector Machine (SVM) is a powerful machine learning algorithm used for linear or nonlinear classification, regression, and even outlier detection. SVMs can be applied to a variety of tasks, such as text classification, image classification, spam detection, handwriting identification, gene expression analysis, face detection, and anomaly detection. SVMs are adaptable and efficient across applications because they can handle high-dimensional data and nonlinear relationships.
SVM algorithms are very effective because they find the maximum-margin separating hyperplane between the different classes of the target feature.
Support Vector Machine
Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. Although it can handle regression problems, it is best suited to classification. The main objective of the SVM algorithm is to find the optimal hyperplane in an N-dimensional space that separates the data points of different classes in the feature space. The hyperplane is chosen so that the margin between the closest points of different classes is as large as possible. The dimension of the hyperplane depends on the number of features: if the number of input features is two, the hyperplane is just a line; if the number of input features is three, the hyperplane becomes a 2-D plane. It becomes difficult to visualize when the number of features exceeds three.
Let’s consider two independent variables x1 and x2, and one dependent variable whose label is either a blue circle or a red circle.

Linearly Separable Data points

From the figure above it is clear that there are multiple lines (our hyperplane here is a line because we are considering only two input features, x1 and x2) that segregate our data points, i.e., classify the red and blue circles. So how do we choose the best line, or in general the best hyperplane, to segregate our data points?
How does SVM work?
One reasonable choice as the best hyperplane is the one that represents the largest
separation or margin between the two classes.
Multiple hyperplanes separate the data from two classes

So we choose the hyperplane whose distance to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane (hard margin). So from the above figure, we choose L2. Let’s consider a scenario like the one shown below.

Here we have one blue ball within the boundary of the red balls. So how does SVM classify the data? Simple: the blue ball among the red ones is an outlier of the blue class. The SVM algorithm ignores such outliers and finds the best hyperplane that maximizes the margin; in this sense SVM is robust to outliers.
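As a hedged sketch of this behavior, the toy trainer below minimizes the regularized hinge loss by sub-gradient descent. This is illustrative only, not the QP/SMO solver used by real SVM libraries, and the data, learning rate, and C value are all assumptions chosen for the example.

```python
# Minimal soft-margin linear SVM trained by sub-gradient descent on the
# regularized hinge loss. Illustrative sketch only; the toy data includes
# one blue outlier sitting among the red balls, as in the figure above.
def train_linear_svm(points, labels, C=1.0, lr=0.01, epochs=2000):
    """points: list of (x1, x2); labels: +1 (blue) or -1 (red)."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # Regularization sub-gradient: shrink ||w|| to widen the margin.
            gw = [w[0] / n, w[1] / n]
            gb = 0.0
            if margin < 1:  # point violates the margin: hinge loss is active
                gw[0] -= C * y * x[0]
                gw[1] -= C * y * x[1]
                gb -= C * y
            w[0] -= lr * gw[0]
            w[1] -= lr * gw[1]
            b -= lr * gb
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

# Blue balls near (0, 0), red balls near (4, 4), plus one blue outlier
# at (4, 3.5). The soft margin lets the fit tolerate that outlier instead
# of distorting the boundary to accommodate it.
pts = [(0, 0), (1, 0), (0, 1), (4, 3.5), (4, 4), (5, 4), (4, 5)]
ys  = [1, 1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(pts, ys)
```

With a moderate C, the learned boundary separates the two main clusters and effectively ignores the lone blue ball among the red ones.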

Support Vector Machine Terminology


1. Hyperplane: The hyperplane is the decision boundary used to separate the data
points of different classes in a feature space. In the case of linear classification, it is
the linear equation wx + b = 0.
2. Support Vectors: Support vectors are the data points closest to the hyperplane, which
play a critical role in deciding the hyperplane and the margin.
3. Margin: The margin is the distance between the support vectors and the hyperplane. The main
objective of the support vector machine algorithm is to maximize the margin; a
wider margin indicates better classification performance.
4. Kernel: The kernel is a mathematical function used in SVM to map the original
input data points into a high-dimensional feature space, so that the hyperplane can be
found even if the data points are not linearly separable in the original input
space. Some common kernel functions are linear, polynomial, radial basis
function (RBF), and sigmoid.
5. Hard Margin: The maximum-margin hyperplane or the hard margin hyperplane is a
hyperplane that properly separates the data points of different categories without any
misclassifications.
6. Soft Margin: When the data is not perfectly separable or contains outliers, SVM
permits a soft-margin technique. The soft-margin SVM formulation introduces a
slack variable for each data point, which relaxes the strict margin requirement and permits
certain misclassifications or violations. It finds a compromise between increasing
the margin and reducing violations.
7. C: The regularisation parameter C in SVM balances margin maximisation against
misclassification penalties. It decides the penalty for violating the margin or
misclassifying data items. A greater value of C imposes a stricter penalty, which
results in a smaller margin and perhaps fewer misclassifications.
8. Hinge Loss: Hinge loss is a typical loss function in SVMs. It penalises incorrect
classifications and margin violations. The SVM objective function is frequently
formed by combining it with the regularisation term.
9. Dual Problem: SVM can be solved through the dual of its optimisation problem, which
involves finding the Lagrange multipliers associated with the support vectors. The dual
formulation enables the use of the kernel trick and more efficient computation.
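The hinge loss from item 8 is simple enough to compute by hand. A minimal sketch, where the score stands for the decision value w·x + b:

```python
# Hinge loss: max(0, 1 - y * score), where y is the true label (+1 or -1)
# and score is the signed value of the decision function w·x + b.
def hinge_loss(y, score):
    return max(0.0, 1.0 - y * score)

# Correct classification, outside the margin: no loss.
assert hinge_loss(+1, 2.5) == 0.0
# Correct classification but inside the margin: small penalty.
assert hinge_loss(+1, 0.5) == 0.5
# Misclassification: the penalty grows linearly with the violation.
assert hinge_loss(-1, 0.5) == 1.5
```

Note that points classified correctly and beyond the margin contribute zero loss, which is why only the support vectors end up shaping the solution.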
Types of Support Vector Machine
Based on the nature of the decision boundary, Support Vector Machines (SVM) can be
divided into two main parts:
 Linear SVM: Linear SVMs use a linear decision boundary to separate the data points
of different classes. When the data can be precisely linearly separated, linear SVMs are
very suitable. This means that a single straight line (in 2D) or a hyperplane (in higher
dimensions) can entirely divide the data points into their respective classes. A
hyperplane that maximizes the margin between the classes is the decision boundary.
 Non-Linear SVM: A non-linear SVM can be used to classify data when it cannot be
separated into two classes by a straight line (in the 2D case). By using kernel
functions, non-linear SVMs can handle non-linearly separable data. These kernel
functions transform the original input data into a higher-dimensional feature space
where the data points can be linearly separated; a linear boundary found in this
transformed space corresponds to a non-linear decision boundary in the original space.
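The lifting idea can be sketched in a few lines. The feature map phi and the toy 1-D data below are illustrative assumptions, not part of the document:

```python
import math

# 1-D points: class A at -2 and 2, class B at 0. No single threshold on x
# separates them, but the polynomial-style feature map phi(x) = (x, x^2)
# lifts them into 2-D, where the horizontal line x2 = 2 separates the classes.
def phi(x):
    return (x, x * x)

lifted_a = [phi(x) for x in (-2.0, 2.0)]   # second coordinate: 4.0
lifted_b = [phi(x) for x in (0.0,)]        # second coordinate: 0.0
assert all(p[1] > 2 for p in lifted_a)
assert all(p[1] < 2 for p in lifted_b)

# The kernel trick computes inner products in the lifted space directly.
# The RBF kernel k(x, z) = exp(-gamma * ||x - z||^2) corresponds to an
# infinite-dimensional feature map that is never computed explicitly.
def rbf_kernel(x, z, gamma=0.5):
    return math.exp(-gamma * (x - z) ** 2)

assert rbf_kernel(1.0, 1.0) == 1.0                   # identical points: maximal similarity
assert rbf_kernel(0.0, 3.0) < rbf_kernel(0.0, 1.0)   # similarity decays with distance
```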

3) Bayesian classification

Bayesian classification is based on Bayes' Theorem. Bayesian classifiers
are statistical classifiers. Bayesian classifiers can predict class
membership probabilities, such as the probability that a given tuple
belongs to a particular class.

Bayes' Theorem
Bayes' Theorem is named after Thomas Bayes. There are two types of
probabilities −

 Posterior Probability [P(H|X)]
 Prior Probability [P(H)]

where X is a data tuple and H is some hypothesis.

According to Bayes' Theorem,

P(H|X) = P(X|H) P(H) / P(X)
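A short worked example of the theorem, with made-up numbers purely for illustration (the hypothesis, evidence, and all probabilities are assumptions, not from the document):

```python
# Worked example of Bayes' Theorem with illustrative (made-up) numbers:
# H = "tuple belongs to class 'spam'", X = "message contains the word 'offer'".
p_h = 0.2          # prior P(H): 20% of messages are spam (assumed)
p_x_given_h = 0.6  # likelihood P(X|H): 60% of spam contains 'offer' (assumed)
p_x = 0.15         # evidence P(X): 15% of all messages contain 'offer' (assumed)

# Posterior P(H|X) = P(X|H) * P(H) / P(X) = 0.6 * 0.2 / 0.15 = 0.8
p_h_given_x = p_x_given_h * p_h / p_x
assert abs(p_h_given_x - 0.8) < 1e-9
```

Seeing the word raises the spam probability from the 20% prior to an 80% posterior, which is exactly the update a Bayesian classifier performs per class.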


Bayesian Belief Network
Bayesian Belief Networks specify joint conditional probability distributions.
They are also known as Belief Networks, Bayesian Networks, or
Probabilistic Networks.

 A Belief Network allows class conditional independencies to be
defined between subsets of variables.
 It provides a graphical model of causal relationships on which
learning can be performed.
 We can use a trained Bayesian Network for classification.

There are two components that define a Bayesian Belief Network −

 Directed acyclic graph
 A set of conditional probability tables


Directed Acyclic Graph


 Each node in a directed acyclic graph represents a random variable.
 These variables may be discrete or continuous valued.
 These variables may correspond to actual attributes given in the
data.

Directed Acyclic Graph Representation


The following diagram shows a directed acyclic graph for six Boolean
variables.
The arcs in the diagram allow the representation of causal knowledge. For
example, lung cancer is influenced by a person's family history of lung
cancer, as well as by whether or not the person is a smoker. It is worth noting
that the variable PositiveXray is independent of whether the patient has a
family history of lung cancer or is a smoker, given that we
know the patient has lung cancer.

Conditional Probability Table


The conditional probability table for the values of the variable LungCancer
(LC), showing each possible combination of the values of its parent nodes,
FamilyHistory (FH) and Smoker (S), is as follows −
4) Cluster Analysis: The process of grouping a set of physical or abstract objects into classes of
similar objects is called clustering. A cluster is a collection of data objects that are similar to one
another within the same cluster and dissimilar to the objects in other clusters. A cluster of data
objects can be treated collectively as one group and so may be considered a form of data
compression. Cluster analysis tools based on k-means, k-medoids, and several other methods have also
been built into many statistical analysis software packages or systems, such as S-Plus, SPSS, and SAS.

Major Clustering Methods:

 Partitioning Methods
 Hierarchical Methods
 Density-Based Methods
 Grid-Based Methods
 Model-Based Methods

4.2.1 Partitioning Methods: A partitioning method constructs k partitions of the data, where each
partition represents a cluster and k ≤ n (n being the number of objects). That is, it classifies the data
into k groups, which together satisfy the following requirements: each group must contain at least one
object, and each object must belong to exactly one group. A partitioning method creates an initial
partitioning and then uses an iterative relocation technique that attempts to improve the partitioning
by moving objects from one group to another. The general criterion of a good partitioning is that
objects in the same cluster are close or related to each other, whereas objects in different clusters are
far apart or very different.
4.2.2 Hierarchical Methods: A hierarchical method creates a hierarchical decomposition of the given
set of data objects. A hierarchical method can be classified as being either agglomerative or divisive,
based on how the hierarchical decomposition is formed.

 The agglomerative approach, also called the bottom-up approach, starts with each object forming a
separate group. It successively merges the objects or groups that are close to one another, until all of
the groups are merged into one or until a termination condition holds.
 The divisive approach, also called the top-down approach, starts with all of the objects in the same
cluster. In each successive iteration, a cluster is split into smaller clusters, until eventually each
object is in its own cluster, or until a termination condition holds.

Hierarchical methods suffer from the fact that once a step (merge or split) is done, it can never be
undone. This rigidity is useful in that it leads to smaller computation costs by not having to worry
about a combinatorial number of different choices. There are two approaches to improving the
quality of hierarchical clustering:

 Perform careful analysis of object "linkages" at each hierarchical partitioning, such as in
Chameleon, or
 Integrate hierarchical agglomeration and other approaches by first using a hierarchical
agglomerative algorithm to group objects into microclusters, and then performing macroclustering
on the microclusters using another clustering method such as iterative relocation.
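The bottom-up merging can be sketched as single-linkage agglomerative clustering on a handful of 1-D points (the data and the choice of single linkage are assumptions for the example):

```python
# Sketch of bottom-up (agglomerative) clustering with single linkage:
# start with every object in its own group and repeatedly merge the two
# closest groups until the desired number of clusters remains.
def single_linkage(points, k):
    clusters = [[p] for p in points]  # each object starts as its own group
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest single-link distance
        # (the distance between their closest members).
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min((a - b) ** 2 for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge; this step is never undone
        del clusters[j]
    return clusters

groups = single_linkage([0.0, 0.2, 0.3, 5.0, 5.1], k=2)
```

Stopping at k clusters plays the role of the termination condition mentioned above; letting the loop run to a single cluster would complete the full hierarchy.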

4.2.3 Density-Based Methods:

 Most partitioning methods cluster objects based on the distance between objects. Such methods can
find only spherical-shaped clusters and encounter difficulty discovering clusters of arbitrary shapes.
 Other clustering methods have been developed based on the notion of density. Their general idea is
to continue growing the given cluster as long as the density in the neighborhood exceeds some
threshold; that is, for each data point within a given cluster, the neighborhood of a given radius has
to contain at least a minimum number of points. Such a method can be used to filter out noise
(outliers) and discover clusters of arbitrary shape.
 DBSCAN and its extension, OPTICS, are typical density-based methods that grow clusters according
to a density-based connectivity analysis. DENCLUE is a method that clusters objects based on the
analysis of the value distributions of density functions.
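The grow-while-dense idea can be sketched as a compact DBSCAN-style routine; the eps/min_pts values and toy points are assumptions, and real DBSCAN implementations add indexing and other refinements:

```python
# Compact DBSCAN-style sketch: grow a cluster from each unvisited core
# point (one whose eps-neighborhood holds at least min_pts points),
# absorbing density-reachable neighbors; sparse points are labeled noise.
def dbscan(points, eps, min_pts):
    labels = [None] * len(points)  # None = unvisited, -1 = noise
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if (points[i][0] - q[0]) ** 2 + (points[i][1] - q[1]) ** 2 <= eps ** 2]
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # not dense enough: noise (may be claimed later)
            continue
        labels[i] = cluster_id
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # border point claimed by the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            more = neighbors(j)
            if len(more) >= min_pts:  # j is itself a core point: keep growing
                seeds.extend(more)
        cluster_id += 1
    return labels

pts = [(0, 0), (0.5, 0), (0, 0.5), (5, 5), (5.5, 5), (5, 5.5), (20, 20)]
labels = dbscan(pts, eps=1.0, min_pts=3)
```

The isolated point at (20, 20) never reaches the minimum neighborhood count, so it is filtered out as noise rather than forced into a cluster.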

4.2.4 Grid-Based Methods:

 Grid-based methods quantize the object space into a finite number of cells that form a grid
structure.
 All of the clustering operations are performed on the grid structure, i.e., on the quantized space.
The main advantage of this approach is its fast processing time, which is typically independent of
the number of data objects and dependent only on the number of cells in each dimension of the
quantized space.
 STING is a typical example of a grid-based method. WaveCluster applies wavelet transformation
for clustering analysis and is both grid-based and density-based.

4.2.5 Model-Based Methods:

 Model-based methods hypothesize a model for each of the clusters and find the best fit of the data
to the given model.
 A model-based algorithm may locate clusters by constructing a density function that reflects the
spatial distribution of the data points.
 It also leads to a way of automatically determining the number of clusters based on standard
statistics, taking "noise" or outliers into account and thus yielding a robust clustering method.
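A common instance of the model-based idea is a Gaussian mixture fitted by EM: hypothesize a model (here, two 1-D Gaussians), then find the best fit of the data to it. The sketch below is illustrative; the data, starting means, and step count are assumptions:

```python
import math

# Two-component 1-D Gaussian mixture fitted by EM (model-based clustering):
# E-step computes each component's responsibility for each point; M-step
# re-estimates means, variances, and weights from those responsibilities.
def gaussian_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, mu=(0.0, 1.0), var=(1.0, 1.0), w=(0.5, 0.5), steps=50):
    for _ in range(steps):
        # E-step: responsibility of component 0 for each point.
        resp = []
        for x in data:
            p0 = w[0] * gaussian_pdf(x, mu[0], var[0])
            p1 = w[1] * gaussian_pdf(x, mu[1], var[1])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate the model from the weighted data.
        n0 = sum(resp)
        n1 = len(data) - n0
        mu = (sum(r * x for r, x in zip(resp, data)) / n0,
              sum((1 - r) * x for r, x in zip(resp, data)) / n1)
        var = (sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / n0 + 1e-6,
               sum((1 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / n1 + 1e-6)
        w = (n0 / len(data), n1 / len(data))
    return mu, var, w

data = [0.0, 0.1, -0.1, 0.2, 5.0, 5.1, 4.9, 5.2]
mu, var, w = em_two_gaussians(data, mu=(0.0, 4.0))
```

The fitted density function reflects the spatial distribution of the points, and comparing such fits across different component counts is one route to choosing the number of clusters automatically.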
