Computer Vision Fundamentals and Techniques

Uploaded by Dnyanda Thorat
Unit I: Introduction


● Image Processing
● Computer Vision - Low-level, Mid-level, High-level
● Fundamentals of Image Formation
● Transformation: Orthogonal, Euclidean, Affine, Projective
● Fourier Transform
● Convolution and Filtering
● Image Enhancement
● Restoration
● Histogram Processing
Image Processing
Computer Vision

Computer Vision (CV) is a branch of Artificial Intelligence (AI) that enables
computers to interpret and understand visual information much as humans do. It
builds on key concepts such as Image Processing, Feature Extraction, Object
Detection, Image Segmentation, and other core techniques in CV.
Mathematical Prerequisites for Computer Vision
1. Linear Algebra

● Vectors
● Matrices and Tensors
● Eigenvalues and Eigenvectors
● Singular Value Decomposition (SVD)
2. Probability and Statistics

● Probability Distributions
● Bayesian Inference and Bayes' Theorem
● Markov Chains
● Kalman Filters
3. Signal Processing

● Image Filtering and Convolution
● Discrete Fourier Transform (DFT)
● Fast Fourier Transform (FFT)
● Principal Component Analysis (PCA)
Key Concepts in Computer Vision

1. Image Transformation

● Geometric Transformations
● Fourier Transform
● Intensity Transformation
2. Image Enhancement

● Histogram Equalization
● Contrast Enhancement
● Image Sharpening
● Color Correction
3. Noise Reduction Techniques

● Median Filtering
● Bilateral Filtering
● Wavelet Denoising
4. Morphological Operations

● Erosion and Dilation
● Opening
● Closing
● Morphological Gradient
2. Feature Extraction
1. Edge Detection Techniques

● Canny Edge Detector
● Sobel Operator
● Laplacian of Gaussian (LoG)

2. Corner and Interest Point Detection

● Harris Corner Detection


3. Feature Descriptors

● SIFT (Scale-Invariant Feature Transform)
● SURF (Speeded-Up Robust Features)
● ORB (Oriented FAST and Rotated BRIEF)
● HOG (Histogram of Oriented Gradients)
How Does Computer Vision Work?

1. Computer Vision works much like the human eye and brain. First, our eyes capture the
image and send the visual data to our brain. The brain then processes this information and
transforms it into a meaningful interpretation, recognizing and categorizing the object based
on its properties.
2. In a similar way, Computer Vision uses a camera (acting like the human eye) to capture
images. The visual data is then processed by algorithms to recognize and identify the
objects based on patterns it has learned. However, before the system can recognize objects
in new images, it needs to be trained on a large dataset of labeled images. This training
enables the system to identify and associate various patterns with their corresponding
labels.
What are the main steps in a typical Computer Vision Pipeline?
1. Image Acquisition

The first step in a computer vision pipeline is image acquisition. This involves
capturing images or videos using sensors or cameras. The quality and resolution
of the images significantly impact the performance of the subsequent steps.

● Devices Used: Cameras, smartphones, drones, satellite imagery, and medical
imaging devices.
● Considerations: Lighting conditions, focus, frame rate, and resolution.
2. Preprocessing
Preprocessing involves preparing the raw image data for further analysis. This step includes
several techniques to enhance image quality and normalize the data.

● Noise Reduction: Applying filters (e.g., Gaussian filter) to remove noise from the
image.
● Normalization: Adjusting the intensity values to a common scale, often between 0 and 1.
● Image Scaling: Resizing images to a fixed dimension required by the model.
● Data Augmentation: Techniques like rotation, flipping, cropping, and color
adjustments to artificially expand the dataset.
3. Image Segmentation
Image segmentation is the process of partitioning an image into multiple segments or regions to
simplify its analysis. This step is crucial for identifying objects and their boundaries.

● Thresholding: Simple method that converts grayscale images to binary images based on a
threshold value.
● Edge Detection: Using algorithms like Canny, Sobel, or Laplacian to detect edges within an
image.
● Region-Based Segmentation: Techniques like Region Growing or Watershed to segment an
image based on the similarity of pixels.
● Semantic Segmentation: Assigning a label to each pixel of the image using deep learning
models like U-Net or Fully Convolutional Networks (FCNs).
4. Feature Extraction

Feature extraction involves identifying and extracting relevant features from the image that
can be used for further analysis or classification.

● Keypoint Detection: Identifying key points of interest in the image, such as corners
or blobs, using algorithms like SIFT, SURF, or ORB.
● Descriptors: Creating feature descriptors that represent the local neighborhood of
key points.
● Deep Learning Features: Using convolutional neural networks (CNNs) to
automatically learn and extract features from images.
5. Object Detection
Object detection is the task of identifying and locating objects within an image. This step often
involves bounding box regression and object classification.

● Classical Methods: Techniques like Histogram of Oriented Gradients (HOG)
combined with Support Vector Machines (SVM).
● Deep Learning Methods: Models like Faster R-CNN, YOLO (You Only Look Once),
and SSD (Single Shot Multibox Detector) for real-time object detection.
6. Object Recognition and Classification

After detecting objects, the next step is to recognize and classify them into
predefined categories.

● Classification Algorithms: Using traditional machine learning algorithms like
SVM, k-NN, or deep learning models like CNNs.
● Transfer Learning: Fine-tuning pre-trained models like VGG,
ResNet, or Inception for specific classification tasks.
7. Post-Processing
Post-processing involves refining the results obtained from the previous steps to enhance
accuracy and usability.

● Non-Maximum Suppression: Used in object detection to eliminate redundant
bounding boxes.
● Result Aggregation: Combining results from multiple frames in video analysis to
improve stability and reduce false positives.
● Refinement: Techniques like conditional random fields (CRFs) for improving
segmentation boundaries.
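Non-maximum suppression is simple enough to implement from scratch. The following is a greedy sketch: keep the highest-scoring box, drop any remaining box whose Intersection over Union (IoU) with it exceeds a threshold, and repeat:

```python
import numpy as np

def iou(a, b):
    # Intersection over Union of two [x1, y1, x2, y2] boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy NMS: keep the best box, suppress heavily overlapping ones.
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(best)
        order = [i for i in order[1:] if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
# The second box heavily overlaps the first and is suppressed: kept == [0, 2]
```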
8. Visualization and Interpretation
The final step in the computer vision pipeline is visualizing and interpreting the results. This step
is crucial for understanding the performance and making decisions based on the visual data.

● Overlaying Results: Displaying bounding boxes, segmentation masks, and key points on
the original images.
● Metrics and Evaluation: Using metrics like accuracy, precision, recall, F1-score, and
Intersection over Union (IoU) to evaluate model performance.
● User Interface: Developing interactive dashboards or applications to visualize and
interpret the results in real-time.
Popular Libraries for Computer Vision
To implement computer vision tasks effectively, various libraries are used:

1. OpenCV: The most widely used open-source library for computer vision tasks like image
processing, video capture, and real-time applications.
2. TensorFlow: A popular deep learning framework that includes tools for building and training
computer vision models.
3. PyTorch: Another deep learning library that offers great flexibility for computer vision
research and development.
4. scikit-image: A part of the broader SciPy ecosystem, this library provides algorithms for
image processing and computer vision.
Deep Learning for Computer Vision
1. Convolutional Neural Networks (CNNs)

Convolutional Neural Networks are designed to learn spatial hierarchies of
features from images. Their key components include:

● Convolutional Layers
● Pooling Layers
● Fully Connected Layers
2. Generative Adversarial Networks (GANs)
A GAN consists of two networks (a generator and a discriminator) that work against each
other to create realistic images. There are various types of GANs, each designed for specific
tasks and improvements:

● Deep Convolutional GAN (DCGAN)
● Conditional GAN (cGAN)
● Cycle-Consistent GAN (CycleGAN)
● Super-Resolution GAN (SRGAN)
● StyleGAN
3. Variational Autoencoders (VAEs)

VAEs are the probabilistic version of autoencoders: the model is forced to learn a distribution
over the latent space rather than a fixed point. Other autoencoders used in computer vision
include:

● Autoencoders
● Denoising Autoencoders (DAE)
● Convolutional Autoencoders (CAE)
4. Vision Transformers (ViT)

Inspired by transformer models, they treat an image as a sequence of patches and
process the patches using self-attention mechanisms. Common vision transformers
include:

● Vision Transformer (ViT)
● Swin Transformer
● CvT (Convolutional Vision Transformer)
Computer Vision Tasks

1. Image Classification

It involves analyzing an image and assigning it a specific label or category based on its content,
such as identifying whether an image contains a cat, dog, or car.
Its techniques are as follows:

● Image Classification using Support Vector Machine (SVM)
● Image Classification using RandomForest
● Image Classification using CNN
● Image Classification using TensorFlow
● Image Classification using PyTorch Lightning
Computer Vision - Low-level, Mid-level, High-level

1. Low-Level Vision
Operates directly on raw image data (pixels). Focuses on extracting basic features.
Examples:
● Image Preprocessing: noise removal, smoothing, filtering

● Edge Detection: Canny, Sobel, LoG

● Color Space Conversion: RGB to HSV, Grayscale

● Thresholding: Binary, Otsu’s method

● Gradient Computation: intensity or color changes

● Corner Detection: Harris, Shi-Tomasi


2. Mid-Level Vision

Involves grouping and interpreting low-level features into meaningful structures.


Examples:
● Segmentation: dividing image into regions (e.g., watershed, superpixels)

● Object Proposals / Contour Grouping

● Motion Estimation: optical flow

● Depth Estimation: stereo vision, structure from motion

● Feature Matching: SIFT, SURF, ORB

● Tracking: Kalman filter, Mean-Shift, Optical flow tracking


3. High-Level Vision
Involves semantic understanding — interpreting scenes and recognizing objects.
Examples:
● Object Recognition & Classification (e.g., ResNet, YOLO)

● Face Detection & Recognition

● Scene Understanding: indoor vs outdoor, activity recognition

● Image Captioning: describing an image in natural language

● Pose Estimation

● Visual Question Answering


Sobel Edge Detection:

What is Sobel Edge Detection?

Sobel edge detection is one of the most widely used approaches in image
processing and computer vision for detecting edges in an image. It computes
the gradient of the image intensity at each pixel, which gives both the
direction and the rate of change of intensity. The Sobel operator consists of
two 3x3 convolution kernels: one detects changes in the horizontal direction
and the other in the vertical direction.
Features of Sobel Edge detection
● Directional Sensitivity: Sobel edge detection uses two different kernels, so it can detect edges in both the
horizontal and vertical directions.
● Noise Reduction: The operator incorporates smoothing (such as a Gaussian blur) that helps remove noise, making it
less sensitive to minor changes in the image.
● Gradient Magnitude Calculation: It computes the gradient per pixel, which helps enhance edges by quantifying
the change in intensity.
● Simple Implementation: The Sobel operator is easy to implement and can run in real-time software systems.
● Edge Orientation: Edges are not only detected but also annotated with their orientation, which is very useful
for later stages of image processing.
The basic steps of the Canny edge detection algorithm, which builds on gradient operators such as Sobel, are:

● Noise reduction using Gaussian filter

● Gradient calculation along the horizontal and vertical axis

● Non-Maximum suppression of false edges

● Double thresholding for segregating strong and weak edges

● Edge tracking by hysteresis


[Figure: input image and the corresponding edge-detected output image]
Contours

Contours are the edges or outlines of objects in an image. They are used in
image processing to identify shapes, detect objects, or measure their size.
We use OpenCV's findContours() function, which works best on binary
images.
There are three important arguments of this function:

● Source Image: This is the image from which we want to find the contours.
● Contour Retrieval Mode: This determines how contours are retrieved.
● Contour Approximation Method: This decides how much detail to keep when
storing the contours.
In OpenCV 4, the function gives us two outputs (OpenCV 3 also returned the image):

● Contours: A list of contours. Each contour is made up of the (x, y)
coordinates that outline a shape in the image.
● Hierarchy: Extra information about the contours, such as which ones
are inside others.
1. Importing Necessary Libraries

First, we need to import libraries like NumPy and OpenCV that help us
process images.
import cv2

import numpy as np
2. Reading the Image

Now, we load the image we want to work with. We use cv2.imread() to read
the image, and cv2.waitKey(0) pauses the program until you press a key.
image = cv2.imread('./input.png')  # placeholder path; use your own image
cv2.waitKey(0)
3. Converting the Image to Grayscale

To make it easier to process the image, we convert it from color (BGR) to
grayscale. Grayscale images are simpler to work with for tasks like detecting
edges.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
4. Edge Detection Using Canny

Next, we apply Canny edge detection, which highlights the edges of
objects in the image. This helps us find the boundaries of shapes and objects
easily.
edged = cv2.Canny(gray, 30, 200)
cv2.waitKey(0)
5. Finding Contours

We then find the contours, which are the boundaries of objects in the image.
This helps us detect the shapes in the image. We focus on the external
contours.
contours, hierarchy = cv2.findContours(edged,
cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
6. Displaying Canny Edges After Contouring

Now, we show the edges that we found using Canny edge
detection. This gives us a visual idea of where the edges of the
objects are.
cv2.imshow('Canny Edges After Contouring', edged)
cv2.waitKey(0)
7. Printing Number of Contours Found

print("Number of Contours Found = " + str(len(contours)))

Output - 3
8. Drawing Contours on the Original Image

We draw the contours on the original image to visualize the shapes we found.
The contours are drawn in green, and we display the updated image.

cv2.drawContours(image, contours, -1, (0, 255, 0), 3)
cv2.imshow('Contours', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Affine Transformation

A transformation that can be expressed as a matrix multiplication (linear
transformation) followed by a vector addition (translation). It can represent:
1. Rotations (linear transformation)
2. Translations (vector addition)
3. Scale operations (linear transformation)
How do we get an Affine Transformation?

1. We mentioned that an Affine Transformation is basically a relation between two images.
The information about this relation can come, roughly, in two ways:
1. We know both X and T, and we also know that they are related. Then our task is to find M.
2. We know M and X. To obtain T we only need to apply T = M · X. Our information for M may be
explicit (i.e. we have the 2-by-3 matrix) or it can come as a geometric relation between
points.
2. Let's explain this in a better way (b). Since M relates two images, we can analyze the simplest
case in which it relates three points in both images. Look at the figure below:
Projective Transformations (Perspective Transformations)

In mathematics, a linear transformation is a function that maps one vector space into another
and is often implemented by a matrix. A mapping is considered a linear transformation if it
preserves vector addition and scalar multiplication. To apply a linear transformation to a
vector (i.e., the coordinates of one point, in our case the x and y values of a pixel), you
multiply this vector by a matrix representing the linear transform. The output is a vector
with transformed coordinates.


Linear algebra for computer vision
● Vector spaces and linear transformations
● Eigendecomposition and singular value decomposition (SVD)
● Matrix factorizations and linear least squares
Fourier Transform

In image processing, the Fourier transform is an important tool used to
decompose an image into the frequency domain. The input to the Fourier
transform is the image in the spatial domain (x, y); the output represents
the same image in the frequency domain.
Fast Fourier Transform in Image Processing

Fast Fourier Transform (FFT) is a mathematical algorithm widely used in
image processing to transform images between the spatial domain and the
frequency domain. (It is like a special translator for images.)

● Spatial domain: Each pixel in an image has a color or brightness value,
and together these values form the image you see. This is the spatial
domain: the image described by its pixels.
● Frequency domain: Now imagine describing the same image in a different way, not
by the pixels directly, but by how patterns of light and dark change across the image.
For example:
○ Low frequencies represent smooth, gradual changes (like large shapes
or blurry areas).
○ High frequencies capture sharp changes (like edges or fine details).
The frequency domain shows how much of these patterns (or frequencies) are present in the
image.
Implementing Fast Fourier Transform in Image Processing

The Fast Fourier Transform (FFT) works in three main steps:

1. Forward FFT (Spatial to Frequency Domain): FFT converts pixel values from the spatial domain
into sine and cosine waves, mapping low frequencies to the center and high frequencies to the edges.
2. Filtering in the Frequency Domain: Filters are applied to modify certain frequency ranges for
purposes like noise removal (by eliminating high frequencies) or sharpening (by enhancing high
frequencies).
3. Inverse FFT (Frequency to Spatial Domain): The modified frequency data is transformed back into
the spatial domain, resulting in a processed version of the original image.
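The three steps above can be sketched with NumPy's FFT routines; the circular low-pass mask radius (8) is an arbitrary illustrative choice:

```python
import numpy as np

# Forward FFT: spatial domain -> frequency domain, low frequencies shifted to the center.
img = np.random.default_rng(0).random((64, 64))
F = np.fft.fftshift(np.fft.fft2(img))

# Filtering in the frequency domain: keep only a small disc of low frequencies
# around the center (a low-pass filter; removing them instead would sharpen).
yy, xx = np.indices(F.shape)
dist = np.hypot(yy - 32, xx - 32)
F_filtered = F * (dist <= 8)

# Inverse FFT: back to the spatial domain; the result is a blurred version of the input.
smoothed = np.real(np.fft.ifft2(np.fft.ifftshift(F_filtered)))
```

Removing high-frequency energy can only reduce the image's variance, which is why the result looks smoother than the input.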
Image Filtering Using Convolution in OpenCV
2-D Convolution
The most fundamental operation in image processing is convolution. It is
performed using kernels. A kernel is a matrix that is generally smaller than
the image; the center of the kernel coincides with the pixel being processed.
In a 2-D convolution, the kernel is a square A x B matrix, where both A and B
are odd integers.
Kernel
A kernel is a small matrix used to apply effects like blurring, sharpening, or
edge detection. Kernels perform mathematical operations on images,
modifying pixel values to achieve various effects.
The kernel is slid across the input image, and at each position, the kernel's values are
multiplied with the corresponding pixel values in the input image. The results are then
summed up, and the final value is assigned to the corresponding pixel in the output
image.
Examples of Kernels:
● Box (averaging) kernel: all elements equal; blurs the image by averaging each
pixel with its neighbors.
● Gaussian kernel: weights follow a Gaussian function; smooths the image while
preserving structure better than a box blur.
● Sharpening kernel: boosts the center pixel relative to its neighbors to
enhance fine detail.
● Sobel kernels: approximate horizontal and vertical intensity gradients for
edge detection.
Identity Kernel

The Identity Kernel is the simplest and most basic kernel operation
that can be performed. The output image produced is exactly like
the image that is given as the input; it does not change the input
image. It is a square matrix with the center element equal to 1 and
all the other elements equal to 0.
Image Enhancement Techniques using OpenCV - Python

Image enhancement is the process of improving the quality and appearance of an


image. It can be used to correct flaws or defects in an image, or to simply make
an image more visually appealing. Image enhancement techniques can be applied
to a wide range of images, including photographs, scans, and digital images.
Some common goals of image enhancement include increasing contrast,
sharpness, and colorfulness; reducing noise and blur; and correcting distortion
and other defects. Image enhancement techniques can be applied manually using
image editing software, or automatically using algorithms and computer
programs such as OpenCV.
Image Restoration

Image Restoration Using Spatial Filtering

Spatial filtering is the method of filtering out noise from images using a
specific choice of spatial filters. Spatial filtering is defined as the
technique of modifying a digital image by performing an operation on small
regions or subsets of the original image pixels directly. Frequently, we use
a mask to encompass the region of the image where this predefined operation
is performed.
Histogram Processing

The histogram of a digital image with gray levels in the range [0, L-1] is the
discrete function h(r_k) = n_k, where r_k is the k-th gray level and n_k is
the number of pixels in the image having gray level r_k.
Points about Histograms:

● The histogram of an image provides a global description of the
appearance of the image.
● The information obtained from the histogram is very useful for image analysis.
● The histogram of an image represents the relative frequency of
occurrence of the various gray levels in the image.
Explanation of 4x4 Matrix
Each value represents the intensity of a pixel, typically ranging from 0
(black) to 255 (white) in an 8-bit grayscale image. In this example,
values are simplified and range from 1 to 8.

A histogram shows the frequency of pixel intensities. It helps analyze the
brightness and contrast of an image.
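The original 4x4 example matrix is not reproduced here, but a hypothetical one with intensities 1 to 8 shows how such a histogram is computed:

```python
import numpy as np

# A hypothetical 4x4 block of pixel intensities in the range 1..8.
img = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [1, 2, 3, 4],
                [5, 6, 7, 8]])

# Count how often each gray level occurs: this is the (unnormalized) histogram.
levels, counts = np.unique(img, return_counts=True)
# Every level 1..8 appears exactly twice in this example.
```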