Module 3
Support Vector Machine Algorithm
• Support Vector Machine, or SVM, is one of the most
popular Supervised Learning algorithms. It can be used
for Classification as well as Regression problems, but in
Machine Learning it is primarily used for Classification.
• The goal of the SVM algorithm is to create the best line
or decision boundary that can segregate n-dimensional
space into classes so that we can easily put the new data
point in the correct category in the future. This best
decision boundary is called a hyperplane.
Example: SVM can be understood with the example that we used in the KNN
classifier. Suppose we see a strange cat that also has some features of dogs. If we
want a model that can accurately identify whether it is a cat or a dog, such a
model can be created using the SVM algorithm. We first train the model with
many images of cats and dogs so that it learns the distinguishing features of
each, and then we test it on this strange creature. The support vector machine
creates a decision boundary between the two classes (cat and dog) and chooses
the extreme cases (support vectors) from each class. On the basis of these
support vectors, it classifies the creature as a cat. Consider the below diagram:
Types of SVM
SVM can be of two types:
• Linear SVM: Linear SVM is used for linearly separable
data. If a dataset can be divided into two classes by a
single straight line, it is termed linearly separable data,
and the classifier used is called a Linear SVM classifier.
• Non-linear SVM: Non-linear SVM is used for non-linearly
separable data. If a dataset cannot be classified by a
straight line, it is termed non-linear data, and the
classifier used is called a Non-linear SVM classifier.
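The difference can be seen on a toy dataset. The following sketch (assuming scikit-learn is available, with its `SVC` classifier) fits both a Linear SVM and a Non-linear (RBF-kernel) SVM to data that no straight line can separate:

```python
# Sketch: linear vs non-linear SVM on concentric circles (assumes scikit-learn).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not separable by any single straight line.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_clf = SVC(kernel="linear").fit(X, y)  # Linear SVM
rbf_clf = SVC(kernel="rbf").fit(X, y)        # Non-linear SVM (RBF kernel)

print("linear accuracy:", linear_clf.score(X, y))  # poor: data is non-linear
print("rbf accuracy:", rbf_clf.score(X, y))        # near perfect
```

The RBF kernel implicitly maps the points into a higher-dimensional space where a separating hyperplane does exist, which is why it succeeds where the linear classifier cannot.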
Hyperplane and Support Vectors in the SVM algorithm:
Hyperplane: There can be multiple lines/decision boundaries that segregate the classes
in n-dimensional space, but we need to find the best decision boundary for classifying
the data points. This best boundary is known as the hyperplane of SVM.
The dimension of the hyperplane depends on the number of features in the dataset:
if there are 2 features (as shown in the image), the hyperplane is a straight line, and
if there are 3 features, the hyperplane is a 2-dimensional plane.
We always create the hyperplane that has the maximum margin, which means the
maximum distance between the hyperplane and the nearest data points of either class.
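For a linear SVM with weight vector w, the margin width works out to 2 / ||w||. As a rough illustration (assuming scikit-learn; the four training points below are made up so that the classes are cleanly separable), a near-hard-margin fit recovers exactly that distance:

```python
# Sketch: the margin width of a fitted linear SVM is 2 / ||w||
# (assumes scikit-learn; toy data invented for illustration).
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [1.0, 1.0],   # class 0
              [3.0, 3.0], [4.0, 4.0]])  # class 1
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C ~ hard margin
w = clf.coef_[0]
margin = 2.0 / np.linalg.norm(w)
print("margin width:", margin)  # distance between (1,1) and (3,3) along w
```

Maximizing this margin is equivalent to minimizing ||w||, which is exactly the optimization problem SVM training solves.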
Support Vectors:
The data points or vectors that lie closest to the hyperplane and that affect its
position are termed Support Vectors. Since these vectors support the hyperplane,
they are called support vectors.
Working of SVM
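As a minimal sketch of this workflow (assuming scikit-learn; the 2-D "cat vs dog" feature values are invented for illustration), we train a classifier, inspect the support vectors it chose, and classify a new point:

```python
# Sketch of the SVM workflow: train on labelled examples, let the classifier
# pick out the support vectors, then classify a new point.
# (Assumes scikit-learn; feature values are hypothetical.)
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2-D features: class 0 = cat, class 1 = dog.
X = np.array([[1.0, 1.0], [1.5, 1.2], [1.2, 0.8],   # cats
              [3.0, 3.2], [3.5, 3.0], [3.2, 3.5]])  # dogs
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear").fit(X, y)

# Only the extreme cases nearest the boundary define the hyperplane.
print("support vectors:\n", clf.support_vectors_)

# Classify a new, ambiguous creature by which side of the hyperplane it falls on.
print("prediction:", clf.predict([[2.0, 1.8]]))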
Advantages of Support Vector Machine (SVM)
• High-Dimensional Performance: SVM excels in high-dimensional spaces, making it suitable
for image classification and gene expression analysis.
• Nonlinear Capability: Utilizing kernel functions like RBF and polynomial, SVM effectively
handles nonlinear relationships.
• Outlier Resilience: The soft margin feature allows SVM to ignore outliers, enhancing
robustness in spam detection and anomaly detection.
• Binary and Multiclass Support: SVM is effective for both binary classification and multiclass
classification, suitable for applications in text classification.
• Memory Efficiency: SVM focuses on support vectors, making it memory efficient compared to
other algorithms.
Disadvantages of Support Vector Machine (SVM)
• Slow Training: SVM can be slow to train on large datasets, which affects its
performance in data mining tasks.
• Parameter Tuning Difficulty: Selecting the right kernel and adjusting parameters like C
requires careful tuning, impacting SVM algorithms.
• Noise Sensitivity: SVM struggles with noisy datasets and overlapping classes, limiting
effectiveness in real-world scenarios.
• Limited Interpretability: The complexity of the hyperplane in higher dimensions makes SVM
less interpretable than other models.
• Feature Scaling Sensitivity: Proper feature scaling is essential; otherwise, SVM models may
perform poorly.
Decision Tree Classification Algorithm
• Decision Tree is a Supervised learning
technique that can be used for both
classification and Regression problems, but
mostly it is preferred for solving Classification
problems.
• It is a tree-structured classifier, where internal
nodes represent the features of a dataset,
branches represent the decision rules and
each leaf node represents the outcome.
Decision trees are drawn upside down, which means the root is at the
top, and the root is then split into several nodes. In layman's terms,
they are nothing but a bunch of if-else statements: the tree checks
whether a condition is true, and if it is, it moves to the next node
attached to that decision.
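This if-else view can be made concrete. The sketch below (assuming scikit-learn, with a made-up one-feature "outlook" dataset) fits a small tree and prints it as nested rules:

```python
# Sketch: a decision tree is a stack of if-else rules (assumes scikit-learn;
# the weather dataset is invented for illustration).
from sklearn.tree import DecisionTreeClassifier, export_text

# Feature 0 encodes outlook: 0 = sunny, 1 = cloudy, 2 = rainy.
X = [[0], [0], [1], [1], [2], [2]]
y = [0, 0, 1, 1, 0, 1]  # 1 = play, 0 = don't play

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# export_text renders the fitted tree as indented if-else conditions.
print(export_text(tree, feature_names=["outlook"]))
```

Note that both cloudy examples are labelled "play", so that branch ends in a pure leaf and needs no further split, while the mixed rainy examples leave an impure leaf the tree cannot resolve with this single feature.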
• We see that if the weather is cloudy, then we must go to play. Why
didn't it split more? Why did it stop there?
• To answer this question, we need to know a few more concepts,
such as entropy, information gain, and the Gini index. But in simple
terms: the output in the training dataset is always "yes" for cloudy
weather, and since there is no disorder here, we don't need to split
the node further.
• The goal of machine learning here is to decrease uncertainty, or
disorder, in the dataset, and for this we use these trees.
• How do we know what should be the root node? What should be a
decision node? When should we stop splitting? To decide this,
there is a metric called "Entropy", which measures the amount of
uncertainty in the dataset.
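For a set of labels with class proportions p_i, entropy is H = -sum(p_i * log2(p_i)). A minimal sketch in plain Python:

```python
# Sketch: entropy as a measure of disorder in a set of class labels.
# H = -sum(p_i * log2(p_i)) over the class proportions p_i.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

print(entropy(["yes", "yes", "yes"]))       # a pure node has entropy 0
print(entropy(["yes", "no", "yes", "no"]))  # a 50/50 split has entropy 1
```

This is why the all-"yes" cloudy node above is not split further: its entropy is already zero, so no split can reduce the disorder.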