Understanding Support Vector Machines
6/22/2024

Support Vector Machines

Logistic Regression


Max margin classification


Instead of fitting all the points, focus on boundary points
Aim: learn a boundary that leads to the largest margin
(buffer) from points on both sides

Why: intuition; theoretical support; and works well in practice


The subset of training points that support (i.e. determine) the boundary are
called the support vectors

Linear SVM
Max margin classifier: inputs that fall inside the margin are of unknown class
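As an illustrative sketch (not from the slides), scikit-learn's SVC exposes the support vectors directly; the toy data and the very large C (to approximate a hard margin) are assumptions for illustration:

```python
# Sketch: fit a (near) hard-margin linear SVM and inspect the
# support vectors -- the boundary points that determine the separator.
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data: two clusters in 2-D.
X = np.array([[0.0, 0.0], [0.5, 0.5], [0.0, 1.0],
              [3.0, 3.0], [3.5, 2.5], [4.0, 4.0]])
t = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin classifier.
clf = SVC(kernel="linear", C=1e6).fit(X, t)

# Only a subset of the training points end up as support vectors.
print(clf.support_vectors_)
print(clf.predict([[1.0, 1.0], [3.5, 3.5]]))
```

Points far from the boundary can be moved or removed without changing the learned separator; only the support vectors matter.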


Maximizing the Margin


First note that the weight vector w is orthogonal to the +1 plane:
if u and v are two points on that plane, then wᵀ(u − v) = 0.
The same is true for the −1 plane.

Also, for a point x+ on the +1 plane and the nearest point x− on the −1 plane:

x+ = λw + x−


Computing the Margin


Define the margin M to be the distance between the +1 and −1 planes.

We can now express this in terms of w: substituting x+ = λw + x− into the two
plane equations gives M = 2/||w||,

so to maximize the margin we minimize the length of w
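A quick numeric sanity check of this claim (the particular w, b, and starting point are assumptions chosen for illustration):

```python
# Sketch verifying that the distance between the planes w·x + b = +1 and
# w·x + b = -1 equals 2 / ||w||, for an arbitrary w and b.
import numpy as np

w = np.array([3.0, 4.0])                # ||w|| = 5
b = 2.0

# Pick any point x_minus on the -1 plane, then step along the unit
# normal w/||w||; if the claim holds, a step of length 2/||w||
# lands exactly on the +1 plane.
x_minus = np.array([0.0, -0.75])        # w·x_minus + b = -1
u = w / np.linalg.norm(w)               # unit normal to both planes
M = 2.0 / np.linalg.norm(w)             # claimed margin
x_plus = x_minus + M * u

print(np.isclose(w @ x_minus + b, -1.0))   # → True
print(np.isclose(w @ x_plus + b, 1.0))     # → True
```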

Learning a Margin-Based Classifier


We can search for the optimal parameters (w and b) by
finding a solution that:
1. Correctly classifies the training examples {xi, ti}, i = 1, …, n
2. Maximizes the margin (the same as minimizing wᵀw):

min_{w,b} (1/2)||w||²   s.t.   (wᵀxi + b)ti ≥ 1 ∀i

Can optimize via projected gradient descent, etc.

Apply Lagrange multipliers: formulate equivalent problem


Learning a Linear SVM


Convert the constrained minimization to an unconstrained
optimization problem: represent constraints as penalty
terms:

For data {xi, ti}, use the following penalty term:

penalty_i = { 0 if (wᵀxi + b)ti ≥ 1;  ∞ otherwise } = max_{αi ≥ 0} αi[1 − (wᵀxi + b)ti]

Rewrite the minimization problem:

min_{w,b} { (1/2)||w||² + Σ_{i=1..n} max_{αi ≥ 0} αi[1 − (wᵀxi + b)ti] }

= min_{w,b} max_{αi ≥ 0} { (1/2)||w||² + Σ_{i=1..n} αi[1 − (wᵀxi + b)ti] }

where the {αi} are the Lagrange multipliers
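This penalty is the familiar hinge loss, and the objective can be optimized by simple (sub)gradient descent. A minimal sketch, not from the slides: the finite penalty weight C and the toy data are assumptions (the slides' hard-margin penalty is the C → ∞ limit):

```python
# (Sub)gradient descent sketch for the penalized objective
#   min_{w,b}  1/2 ||w||^2 + C * sum_i max(0, 1 - (w·x_i + b) t_i)
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
t = np.array([-1] * 20 + [1] * 20)

w, b, C, lr = np.zeros(2), 0.0, 1.0, 0.01
for _ in range(500):
    margins = (X @ w + b) * t
    viol = margins < 1                       # points violating the margin
    # Subgradient of the hinge term is -t_i x_i on violating points.
    grad_w = w - C * (t[viol, None] * X[viol]).sum(axis=0)
    grad_b = -C * t[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

acc = np.mean(np.sign(X @ w + b) == t)
print(acc)    # should approach 1.0 on this well-separated toy set
```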

What if data is not linearly separable?


• Introduce slack variables ξi, subject to the constraints (for all i):

ti(w · xi + b) ≥ 1 − ξi
ξi ≥ 0

• If an example lies on the wrong side of the hyperplane then ξi > 1, so Σi ξi
is an upper bound on the number of training errors

• λ trades off training error versus model complexity

• This is known as the soft-margin extension
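A sketch of the soft-margin trade-off in scikit-learn, whose C parameter plays roughly the inverse role of the λ above (the toy overlapping data is an assumption):

```python
# Soft-margin sketch: scikit-learn's C penalizes slack; a small C
# tolerates many margin violations, a large C penalizes them hard.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1.0, (50, 2)), rng.normal(1, 1.0, (50, 2))])
t = np.array([-1] * 50 + [1] * 50)          # overlapping classes

loose = SVC(kernel="linear", C=0.01).fit(X, t)   # many slacks allowed
tight = SVC(kernel="linear", C=100.0).fit(X, t)  # slacks heavily penalized

# A looser margin typically retains more support vectors.
print(len(loose.support_), len(tight.support_))
```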


Non-linear SVMs: Feature spaces

• General idea: the original feature space can always


be mapped to some higher-dimensional feature
space where the training set is separable:

Φ: x → φ(x)

Learning via Quadratic Programming


• The optimal separating hyperplane can be found by solving

argmax_{αj ≥ 0} ( Σj αj − (1/2) Σ_{j,k} αj αk yj yk (xj · xk) )

where the (xj, yj) are the training samples
- This is a quadratic function of the αj
- Once the αj are found, the weight vector is

w = Σj αj yj xj

and the decision function is h(x) = sign( Σj αj yj (x · xj) + b )

• This optimization problem can be solved by quadratic programming
QP is a well-studied class of optimization algorithms to maximize a quadratic
function subject to linear constraints
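A small sketch of solving this dual with a generic solver (SciPy's SLSQP here; a dedicated QP solver would be the usual choice). The equality constraint Σj αj yj = 0, which arises once the bias b is optimized, is added here as an assumption since the slide lists only αj ≥ 0:

```python
# Sketch: solve the dual QP for a tiny separable dataset, then recover
# the weight vector w = Σ_j α_j y_j x_j and the decision function.
import numpy as np
from scipy.optimize import minimize

X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

K = X @ X.T                          # Gram matrix of dot products
Q = (y[:, None] * y[None, :]) * K

def neg_dual(a):                     # minimize the negated dual objective
    return 0.5 * a @ Q @ a - a.sum()

res = minimize(neg_dual, np.zeros(len(y)), method="SLSQP",
               bounds=[(0, None)] * len(y),
               constraints=[{"type": "eq", "fun": lambda a: a @ y}])
alpha = res.x

w = (alpha * y) @ X                  # weight vector from the multipliers
sv = alpha > 1e-6                    # support vectors have α_j > 0
b = np.mean(y[sv] - X[sv] @ w)       # b from the margin conditions
print(np.sign(X @ w + b))            # should reproduce the labels y
```

Note that only the support vectors end up with nonzero αj, matching the earlier claim that boundary points alone determine the separator.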


Non-linear decision boundaries


• Note that both the learning objective and the decision function depend only
on dot products between patterns

• How to form non-linear decision boundaries in input space?

1. Map data into feature space

2. Replace dot products between inputs with dot products between feature vectors:

x i ⋅ x j → φ(x i )⋅ φ(xj )

3. Find a linear decision boundary in feature space

• Problem: what is a good feature function φ(x)?

Kernel trick + QP
• The max margin classifier can be found by solving

argmax_{αj ≥ 0} ( Σj αj − (1/2) Σ_{j,k} αj αk yj yk (φ(xj) · φ(xk)) )
= argmax_{αj ≥ 0} ( Σj αj − (1/2) Σ_{j,k} αj αk yj yk K(xj, xk) )

• the weight vector (no need to compute and store it):

w = Σj αj yj φ(xj)

• the decision function is

h(x) = sign( Σj αj yj (φ(x) · φ(xj)) + b ) = sign( Σj αj yj K(x, xj) + b )

Copyright © 2001, 2003, Andrew W. Moore


Kernel Trick
• Kernel trick: dot-products in feature space can be
computed as a kernel function
φ (x i)⋅ φ (x j ) = K (x i , x j )

• Idea: work directly on x, avoid having to compute ϕ(x)


Kernels
Examples of kernels (kernels measure similarity):
1. Polynomial: K(x1, x2) = (x1 · x2 + 1)²
2. Gaussian: K(x1, x2) = exp(−||x1 − x2||² / 2σ²)
3. Sigmoid: K(x1, x2) = tanh(κ(x1 · x2) + a)

Each kernel computation corresponds to a dot-product calculation for a
particular mapping φ(x): it implicitly maps to a high-dimensional space
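For the degree-2 polynomial kernel, the implicit map can be written out explicitly for 2-D inputs. The hand-expanded φ below is an assumption for illustration; the check confirms K(x1, x2) = φ(x1) · φ(x2):

```python
# Check that the degree-2 polynomial kernel (x·z + 1)^2 is a dot product
# in an explicit 6-dimensional feature space (for 2-D inputs).
import numpy as np

def poly_kernel(x, z):
    return (x @ z + 1.0) ** 2

def phi(x):
    # Expanding (ac + bd + 1)^2 for x = (a, b), z = (c, d) yields the
    # monomials 1, 2ac, 2bd, a²c², b²d², 2abcd -- i.e. this feature map:
    a, b = x
    return np.array([1.0,
                     np.sqrt(2) * a, np.sqrt(2) * b,
                     a * a, b * b,
                     np.sqrt(2) * a * b])

x1 = np.array([0.3, -1.2])
x2 = np.array([2.0, 0.5])
print(np.isclose(poly_kernel(x1, x2), phi(x1) @ phi(x2)))   # → True
```

One kernel evaluation costs a single 2-D dot product, yet it equals a dot product in the 6-D space; for higher degrees and dimensions the saving grows rapidly.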

Why is this useful?


1. Rewrite training examples using more complex features
2. Dataset not linearly separable in original space may be
linearly separable in higher dimensional space


Input transformation
Mapping to a feature space can produce problems:
• High computational burden due to high dimensionality
• Many more parameters

SVM solves these two issues simultaneously


• Kernel trick produces efficient classification
• Dual formulation only assigns parameters to samples, not
features

Doing multi-class classification


• Basic SVMs can only handle two-class outputs (i.e. a categorical output
variable with arity 2).
• To extend to output arity N, learn N SVMs:
– SVM 1 learns “Output==1” vs “Output != 1”
– SVM 2 learns “Output==2” vs “Output != 2”
– …
– SVM N learns “Output==N” vs “Output != N”
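A sketch of this one-vs-rest scheme, spelled out manually (scikit-learn's SVC handles multi-class internally; the three toy blobs are assumptions):

```python
# One-vs-rest sketch: train N binary SVMs, then predict the class whose
# SVM gives the largest decision value.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
centers = np.array([[0, 0], [5, 0], [0, 5]])
X = np.vstack([rng.normal(c, 0.5, (30, 2)) for c in centers])
y = np.repeat([0, 1, 2], 30)

# SVM k learns "Output == k" vs "Output != k".
models = [LinearSVC(C=1.0).fit(X, (y == k).astype(int)) for k in range(3)]

def predict(Xnew):
    # Stack each binary SVM's decision value; pick the most confident one.
    scores = np.column_stack([m.decision_function(Xnew) for m in models])
    return scores.argmax(axis=1)

print(np.mean(predict(X) == y))   # training accuracy on separable blobs
```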


Summary
Advantages:
• Kernels allow very flexible hypotheses
• Soft-margin extension permits misclassified examples
• Excellent results

Disadvantages:
• Must choose kernel parameters
• Very large problems computationally intractable

