Computer Vision Fundamentals Explained

Uploaded by jpreethi0311

COMPUTER VISION

Chapter-1

Computer Vision
Computer Vision (CV) is a field of Artificial Intelligence (AI) that deals
with computational methods to help computers understand and
interpret the content of digital images and videos.

Hence, CV aims to make computers see and understand visual data input from cameras or sensors.
Chapter 1 Human Vision vs Computer Vision
Chapter 1 Computer Vision Processing Stages

1. Image Acquisition

2. Image Preprocessing

3. Feature Extraction

4. Segmentation

5. Object Detection and Recognition

6. Post-Processing

7. Analysis and Interpretation

8. Visualization and Output
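The eight stages above can be sketched as a tiny pipeline. The functions below (acquire, preprocess, extract_features, segment) are illustrative placeholders written in plain Python on a list-of-rows grayscale image, not part of any standard library; a real system would use a library such as OpenCV.

```python
# Minimal sketch of a CV processing pipeline on a tiny grayscale "image"
# (a list of rows of intensities). All function names are illustrative.

def acquire():
    # 1. Image Acquisition: here, a hard-coded 3x3 grayscale image.
    return [[10, 200, 10],
            [10, 200, 10],
            [10, 200, 10]]

def preprocess(img):
    # 2. Preprocessing: clamp intensities to the valid 0-255 range.
    return [[min(max(p, 0), 255) for p in row] for row in img]

def extract_features(img):
    # 3. Feature Extraction: mean intensity as a (trivial) global feature.
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def segment(img, threshold=128):
    # 4. Segmentation: threshold into foreground (1) / background (0).
    return [[1 if p > threshold else 0 for p in row] for row in img]

image = preprocess(acquire())
feature = extract_features(image)
mask = segment(image)
print(feature)   # mean intensity of the image
print(mask[0])   # first row of the segmentation mask: [0, 1, 0]
```

Later stages (detection, interpretation, visualization) would consume the mask and features produced here.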



Key tasks in computer vision

Image Recognition: Identifying objects, people, or patterns within images.

Object Detection: Locating and classifying multiple objects within an image or video
stream.

Image Segmentation: Dividing an image into meaningful segments or regions, often to identify boundaries and structures.

Face Recognition: Identifying and verifying individuals based on facial features.

Gesture Recognition: Understanding and interpreting human gestures from images or video.

Scene Understanding: Analyzing and comprehending the content and context of a scene.

Motion Analysis: Detecting and tracking movements within video sequences.

3D Reconstruction: Creating three-dimensional models of objects or scenes from two-dimensional images.
Chapter-2

Geometric primitives and transformations

Geometric primitives and transformations are fundamental concepts in computer graphics and computer vision.

They form the basis for representing and manipulating visual elements in both 2D and 3D spaces.
Chapter 2 Geometric Primitives

Points: Represented by coordinates (x, y) in 2D or (x, y, z) in 3D space.

Lines and Line Segments: Defined by two points or a point and a direction vector.

Polygons: Closed shapes with straight sides. Triangles, quadrilaterals, and other
polygons are common geometric primitives.

Circles and Ellipses: Defined by a center point and radii (or axes in the case of
ellipses).

Curves: Bézier curves, spline curves, and other parametric curves are used to
represent smooth shapes.
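A minimal sketch of how these primitives can be represented in code, using plain Python dataclasses (the class names Point, Segment, and Circle are illustrative choices, not a standard API):

```python
from dataclasses import dataclass
import math

@dataclass
class Point:
    # A 2D point given by its (x, y) coordinates.
    x: float
    y: float

@dataclass
class Segment:
    # A line segment defined by its two endpoints.
    a: Point
    b: Point
    def length(self):
        return math.hypot(self.b.x - self.a.x, self.b.y - self.a.y)

@dataclass
class Circle:
    # A circle defined by a center point and a radius.
    center: Point
    radius: float
    def area(self):
        return math.pi * self.radius ** 2

seg = Segment(Point(0, 0), Point(3, 4))
circ = Circle(Point(0, 0), 2.0)
print(seg.length())  # 5.0
print(circ.area())
```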
Chapter 2 Geometric Transformations
Geometric transformations involve modifying the position, orientation, and scale of
geometric primitives. Common transformations include:

Translation: Moves an object by a certain distance along a specified direction.

Rotation: Rotates an object around a specified point or axis.

Scaling: Changes the size of an object along different axes.

Shearing: Distorts the shape of an object by stretching or compressing along one or more axes.

Reflection: Mirrors an object across a specified plane.

Affine Transformations: Combine translation, rotation, scaling, and shearing.

Projective Transformations: Used for perspective transformations in 3D graphics.

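Two of these transformations, rotation and translation, can be sketched in a few lines of plain Python (the function names rotate and translate are illustrative):

```python
import math

def rotate(point, angle_deg, about=(0.0, 0.0)):
    """Rotate a 2D point by angle_deg degrees around the point `about`."""
    ox, oy = about
    x, y = point[0] - ox, point[1] - oy
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return (ox + c * x - s * y, oy + s * x + c * y)

def translate(point, dx, dy):
    """Move a 2D point by (dx, dy)."""
    return (point[0] + dx, point[1] + dy)

# Rotate (1, 0) by 90 degrees about the origin, then translate by (2, 0).
p = translate(rotate((1.0, 0.0), 90), 2, 0)
print(p)  # approximately (2.0, 1.0)
```

Composing transformations this way is exactly what affine transformation matrices capture in a single matrix product.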
Chapter 2 Applications:

Computer Graphics: Geometric primitives and transformations are fundamental for rendering 2D and 3D graphics in applications such as video games, simulations, and virtual reality.

Computer-Aided Design (CAD): Modeling objects in engineering and architecture.

Computer Vision: Geometric transformations are applied to align and process images, correct distortions, and perform other tasks in image analysis.

Robotics: Essential for robot navigation, motion planning, and spatial reasoning.
Chapter-3

Photometric image formation

Photometric image formation refers to the process by which light interacts with surfaces and is captured by a camera, resulting in the creation of a digital image.

This process involves various factors related to the properties of light, the surfaces of objects, and the characteristics of the imaging system.

Understanding photometric image formation is crucial in computer vision, computer graphics, and image processing.
Chapter 3 Here are some key concepts involved:

Illumination
Reflection
Shading
Surface Properties
Shadows
Chapter-4

The digital camera

A digital camera is an electronic device that captures and stores digital images.

It differs from traditional film cameras in that it uses electronic sensors to record images rather than photographic film.

Digital cameras have become widespread due to their convenience, the ability to instantly review images, and the ease of sharing and storing photos digitally.
Chapter 4 The digital camera

Here are key components and concepts related to digital cameras:

Image Sensor:
 Digital cameras use image sensors (such as CCD or CMOS) to convert light
into electrical signals.
 The sensor captures the image by measuring the intensity of light at each
pixel location.
Lens:
 The lens focuses light onto the image sensor.
 Zoom lenses allow users to adjust the focal length, providing optical zoom.

Aperture:
 The aperture is an adjustable opening in the lens that controls the amount of
light entering the camera.
 It affects the depth of field and exposure.
Chapter 4 The digital camera
Shutter:
 The shutter mechanism controls the duration of light exposure to the
image sensor.
 Fast shutter speeds freeze motion, while slower speeds create motion blur.
Viewfinder and LCD Screen:
 Digital cameras typically have an optical or electronic viewfinder for composing shots.
 LCD screens on the camera back allow users to review and frame images.
Image Processor:
 Digital cameras include a built-in image processor to convert raw sensor
data into a viewable image.
 Image processing algorithms may enhance color and sharpness and reduce noise.
Memory Card:
 Digital images are stored on removable memory cards, such as SD or CF
cards.
 Memory cards provide a convenient and portable way to store and transfer
images.
Chapter 4 The digital camera

Autofocus and Exposure Systems:
 Autofocus systems automatically adjust the lens to ensure a sharp image.
 Exposure systems determine the optimal combination of aperture, shutter
speed, and ISO sensitivity for proper exposure.
White Balance:
 White balance settings adjust the color temperature of the captured image to
match different lighting conditions.
Modes and Settings:
 Digital cameras offer various shooting modes (e.g., automatic, manual,
portrait, landscape) and settings to control image parameters.
Connectivity:
 USB, HDMI, or wireless connectivity allows users to transfer images to
computers, share online, or connect to other devices.
Battery:
 Digital cameras are powered by rechargeable batteries, providing the
necessary energy for capturing and processing images.
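The interplay of aperture and shutter speed described above is often summarized by the exposure value, EV = log2(N^2 / t), where N is the f-number and t the shutter time in seconds. Settings one stop apart in aperture and one stop apart in shutter speed give (nearly) the same exposure. A small sketch (the function name exposure_value is illustrative):

```python
import math

def exposure_value(f_number, shutter_seconds):
    """Exposure value: EV = log2(N^2 / t)."""
    return math.log2(f_number ** 2 / shutter_seconds)

# f/8 at 1/125 s and f/5.6 at 1/250 s are equivalent exposures
# (up to the rounding of f/5.6, whose exact value is sqrt(32)).
print(round(exposure_value(8, 1 / 125), 2))
print(round(exposure_value(5.6, 1 / 250), 2))
```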
Chapter-5

Point operators

Point operators, also known as point processing or pixel-wise operations, are basic image processing operations that operate on individual pixels independently.

These operations are applied to each pixel in an image without considering the values of neighboring pixels.

Point operators typically involve mathematical operations or functions that transform the pixel values, resulting in changes to the image's appearance.
Chapter 5 Point operators

Brightness Adjustment:
 Addition/Subtraction: Increase or decrease the intensity of all pixels by adding
or subtracting a constant value.
 Multiplication/Division: Scale the intensity values by multiplying or dividing
them by a constant factor.

Contrast Adjustment:
 Linear Contrast Stretching: Rescale the intensity values to cover the full
dynamic range.
 Histogram Equalization: Adjust the distribution of pixel intensities to enhance
contrast.

Gamma Correction:
 Adjust overall brightness and contrast of an image.

Thresholding:
 Convert a grayscale image to binary by setting a threshold value. Pixels with
values above the threshold become white, and those below become black.
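Thresholding and gamma correction from the list above can be sketched in plain Python on a list-of-rows grayscale image (the function names are illustrative):

```python
def threshold(img, t):
    """Binarize: pixels above t become 255 (white), others 0 (black)."""
    return [[255 if p > t else 0 for p in row] for row in img]

def gamma_correct(img, gamma):
    """Point-wise gamma correction on 0-255 intensities."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

img = [[0, 64, 128, 192, 255]]
print(threshold(img, 128))      # [[0, 0, 0, 255, 255]]
print(gamma_correct(img, 0.5))  # gamma < 1 brightens the mid-tones
```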
Chapter 5 Point operators

Bit-plane Slicing:
Decompose an image into its binary representation by considering individual
bits.

Color Mapping:
Apply color transformations to change the color balance or convert between color
spaces (e.g., RGB to grayscale).

Inversion:
Invert the intensity values of pixels, turning bright areas dark and vice versa.

Image Arithmetic:
Perform arithmetic operations between pixels of two images, such as addition,
subtraction, multiplication, or division.
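Inversion and image arithmetic (here, subtraction as used for change detection between two frames) can be sketched similarly (function names illustrative):

```python
def invert(img):
    """Invert intensities: bright becomes dark and vice versa."""
    return [[255 - p for p in row] for row in img]

def subtract(img_a, img_b):
    """Pixel-wise difference, clamped at 0 (useful for change detection)."""
    return [[max(a - b, 0) for a, b in zip(ra, rb)]
            for ra, rb in zip(img_a, img_b)]

frame1 = [[10, 10], [10, 200]]   # one pixel has changed...
frame2 = [[10, 10], [10, 10]]    # ...relative to this frame
print(invert(frame1))            # [[245, 245], [245, 55]]
print(subtract(frame1, frame2))  # [[0, 0], [0, 190]]
```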
Chapter-6

Linear filtering

Linear filtering is a fundamental concept in image processing that involves applying a linear operator to an image.

The linear filter operates on each pixel in the image by combining its value with the values of its neighboring pixels according to a predefined convolution kernel or matrix.

The convolution operation computes the weighted sum of pixel values in the image, producing a new value for the center pixel.
Chapter 6 Linear filtering

The general formula: for a kernel h of size (2k+1) x (2k+1), the output image g is

    g(i, j) = sum over u, v in [-k, ..., k] of f(i + u, j + v) * h(u, v)

where f is the input image and h holds the filter (kernel) weights.
Chapter 6 Linear filtering
Common linear filtering operations include:
Blurring/Smoothing:
 Average filter: Each output pixel is the average of its neighboring pixels.
 Gaussian filter: Applies a Gaussian distribution to compute weights for pixel
averaging.
Edge Detection:
 Sobel filter: Emphasizes edges by computing gradients in the x and y
directions.
 Prewitt filter: Similar to Sobel but uses a different kernel for gradient
computation.
Sharpening:
 Laplacian filter: Enhances high-frequency components to highlight edges.
 High-pass filter: Emphasizes details by subtracting a blurred version of the image.
Embossing:
 Applies an embossing effect by highlighting changes in intensity.
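A minimal plain-Python sketch of linear filtering: the filter2d function below (an illustrative name, not a library call) correlates a 3x3 kernel with the interior pixels of a small image, here with a box (average) kernel:

```python
def filter2d(img, kernel):
    """Correlate a grayscale image (list of rows) with a 3x3 kernel.
    Border pixels are skipped, so the output is 2 rows/cols smaller."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(1, h - 1):
        row = []
        for j in range(1, w - 1):
            acc = 0.0
            for u in (-1, 0, 1):
                for v in (-1, 0, 1):
                    acc += img[i + u][j + v] * kernel[u + 1][v + 1]
            row.append(acc)
        out.append(row)
    return out

# 3x3 box (average) filter: every weight is 1/9.
box = [[1 / 9] * 3 for _ in range(3)]
img = [[9, 9, 9],
       [9, 90, 9],
       [9, 9, 9]]
print(filter2d(img, box))  # [[18.0]]: the spike at the centre is smoothed
```

Swapping in a different kernel (e.g. a Sobel kernel [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]) turns the same routine into an edge detector.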
Chapter-7

More neighborhood operators

Neighborhood operators in image processing involve the consideration of pixel values in the vicinity of a target pixel, usually within a defined neighborhood or window.

Unlike point operators, which operate on individual pixels, neighborhood operators take into account the local structure of the image.
Chapter 7 More neighborhood operators

Here are some common neighborhood operators:

Median Filter

Gaussian Filter

Non-local Means Filter

Anisotropic Diffusion

Morphological Operators

Laplacian of Gaussian (LoG)

Homomorphic Filtering

Adaptive Histogram Equalization
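As a sketch of one of these operators, the median filter below (plain Python, illustrative function name) replaces a pixel with the median of its 3x3 neighborhood, which suppresses impulse (salt-and-pepper) noise while preserving edges:

```python
def median_filter(img, i, j):
    """Median of the 3x3 neighbourhood around pixel (i, j)."""
    neighbourhood = [img[i + u][j + v]
                     for u in (-1, 0, 1) for v in (-1, 0, 1)]
    return sorted(neighbourhood)[4]  # middle (5th) of 9 sorted values

img = [[10, 10, 10],
       [10, 255, 10],   # 255 is an impulse-noise pixel
       [10, 10, 10]]
print(median_filter(img, 1, 1))  # 10: the outlier is rejected
```

A linear averaging filter on the same neighbourhood would instead smear the outlier into its neighbours, which is why the median is preferred for this kind of noise.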


Chapter-8

Fourier transforms

Fourier transforms play a significant role in computer vision for analyzing and processing images.

They are used to decompose an image into its frequency components, providing valuable information for tasks such as image filtering, feature extraction, and pattern recognition.
Chapter 8 Fourier transforms
Here are some ways Fourier transforms are employed in computer vision:

Frequency Analysis:
Image Filtering:
Image Enhancement:
Texture Analysis:
Pattern Recognition:
Image Compression:
Image Registration:
Optical Character Recognition (OCR):
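At the core of all of these uses is the discrete Fourier transform (DFT). A naive 1-D version can be sketched in a few lines of plain Python; real systems use a fast Fourier transform (e.g. numpy.fft) and apply the same idea in 2-D, row- then column-wise:

```python
import cmath

def dft(signal):
    """Naive 1-D discrete Fourier transform (O(n^2))."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

# A constant signal has all its energy in the zero-frequency (DC) term.
spectrum = dft([1.0, 1.0, 1.0, 1.0])
print([round(abs(c), 6) for c in spectrum])  # [4.0, 0.0, 0.0, 0.0]
```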
Chapter 8 Fourier transforms

Homomorphic Filtering:
 Homomorphic filtering, which involves transforming an image to a
logarithmic domain using Fourier transforms, is used in applications such as
document analysis and enhancement.

Image Reconstruction:
 Fourier transforms are involved in techniques like computed tomography
(CT) or magnetic resonance imaging (MRI) for reconstructing images from their
projections.
Chapter-9

Pyramids and wavelets

Pyramids and wavelets are both techniques used in image processing for
multi-resolution analysis, allowing the representation of an image at
different scales.
They are valuable for tasks such as image compression, feature extraction,
and image analysis.
Chapter 9 Pyramids and wavelets

Image Pyramids:
Image pyramids are a series of images representing the same scene but at different
resolutions.

There are two main types of image pyramids:

Gaussian Pyramid:
 Created by repeatedly applying Gaussian smoothing and downsampling to
an image.

 At each level, the image is smoothed to remove high-frequency information, and then it is subsampled to reduce its size.

 Useful for tasks like image blending, image matching, and coarse-to-fine
image processing.
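One level of pyramid construction can be sketched in plain Python. For brevity, a 2x2 box average stands in for the Gaussian smoothing step (an approximation; a real Gaussian pyramid smooths with a larger Gaussian kernel before subsampling):

```python
def downsample(img):
    """One pyramid level: average each 2x2 block (a box filter stands in
    for the Gaussian here) and keep one value per block."""
    return [[(img[i][j] + img[i][j + 1] +
              img[i + 1][j] + img[i + 1][j + 1]) / 4
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]

level0 = [[10, 20, 30, 40],
          [10, 20, 30, 40],
          [10, 20, 30, 40],
          [10, 20, 30, 40]]
level1 = downsample(level0)   # 4x4 -> 2x2
level2 = downsample(level1)   # 2x2 -> 1x1
print(level1)  # [[15.0, 35.0], [15.0, 35.0]]
print(level2)  # [[25.0]]
```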
Chapter 9 Pyramids and wavelets

Laplacian Pyramid:
 Derived from the Gaussian pyramid.

 Each level of the Laplacian pyramid is obtained by subtracting the expanded (upsampled) version of the next coarser Gaussian level from the current Gaussian level.

 Useful for image compression and coding, where the Laplacian pyramid
represents the residual information not captured by the Gaussian pyramid.
Chapter 9 Pyramids and wavelets

Wavelets are mathematical functions that can be used to analyze signals and images.

Wavelet transforms provide a multi-resolution analysis by decomposing an image into approximation (low-frequency) and detail (high-frequency) components.
Chapter 9 Pyramids and wavelets

Key concepts include:


Wavelet Transform:
 The wavelet transform decomposes an image into different frequency
components by convolving the image with wavelet functions.
 The result is a set of coefficients that represent the image at various scales
and orientations.
Multi-resolution Analysis:
 Wavelet transforms offer a multi-resolution analysis, allowing the
representation of an image at different scales.
 The approximation coefficients capture the low-frequency information, while
detail coefficients capture high-frequency information.
Chapter 9 Pyramids and wavelets

Haar Wavelet:
 The Haar wavelet is a simple wavelet function used in basic wavelet
transforms.
 It represents changes in intensity between adjacent pixels.

Wavelet Compression:
 Wavelet-based image compression techniques, such as JPEG2000, utilize
wavelet transforms to efficiently represent image data in both spatial and
frequency domains.

Image Denoising:
 Wavelet-based thresholding techniques can be applied to denoise images by
thresholding the wavelet coefficients.

Edge Detection:
 Wavelet transforms can be used for edge detection by analyzing the high-frequency components of the image.
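The Haar transform and wavelet-based thresholding above can be sketched for a 1-D signal in plain Python (one decomposition level; the function name haar_step is illustrative):

```python
def haar_step(signal):
    """One level of the 1-D Haar transform: pairwise averages
    (approximation, low frequency) and differences (detail, high frequency)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

approx, detail = haar_step([9, 7, 3, 5])
print(approx)  # [8.0, 4.0]
print(detail)  # [1.0, -1.0]

# Denoising sketch: zero out small detail coefficients, then reconstruct.
detail = [d if abs(d) > 1.5 else 0 for d in detail]
rec = []
for a, d in zip(approx, detail):
    rec += [a + d, a - d]
print(rec)  # [8.0, 8.0, 4.0, 4.0]: small fluctuations removed
```

Repeating haar_step on the approximation coefficients yields the multi-resolution decomposition described above.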
Chapter-10

Geometric transformations

Geometric transformations are operations that modify the spatial configuration of objects in a digital image.

These transformations are applied to change the position, orientation, scale, or shape of objects while preserving certain geometric properties.

Geometric transformations are commonly used in computer graphics, computer vision, and image processing.

Here are some fundamental geometric transformations:


Chapter 10 Geometric transformations

Translation:
Description: Moves an object by a specified distance along the x and/or
y axes.

Transformation Matrix (2D, homogeneous coordinates, translation by (tx, ty)):

    [ 1  0  tx ]
    [ 0  1  ty ]
    [ 0  0  1  ]

Applications: Object movement, image registration.


Chapter 10 Geometric transformations

Rotation:
Description: Rotates an object by a specified angle about a fixed point.

Transformation Matrix (2D, rotation by angle θ about the origin):

    [ cos θ  -sin θ  0 ]
    [ sin θ   cos θ  0 ]
    [   0       0    1 ]

Applications: Image rotation, orientation adjustment.


Chapter 10 Geometric transformations

Scaling:

Description: Changes the size of an object by multiplying its coordinates by scaling factors.

Transformation Matrix (2D, scale factors sx and sy):

    [ sx  0   0 ]
    [ 0   sy  0 ]
    [ 0   0   1 ]

Applications: Zooming in/out, resizing.


Chapter 10 Geometric transformations

Shearing:

Description: Distorts the shape of an object by varying its coordinates linearly.

Transformation Matrix (2D, shear factors shx and shy):

    [ 1    shx  0 ]
    [ shy  1    0 ]
    [ 0    0    1 ]

Applications: Skewing, slanting.


Chapter 10 Geometric transformations

Affine Transformation:
Description: Combines translation, rotation, scaling, and shearing.

Transformation Matrix (2D, six free parameters):

    [ a  b  tx ]
    [ c  d  ty ]
    [ 0  0  1  ]

Applications: Generalized transformations
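Applying such a matrix to a point is a single matrix-vector product in homogeneous coordinates. A plain-Python sketch (the function name apply_affine is illustrative):

```python
def apply_affine(matrix, point):
    """Apply a 3x3 affine matrix to a 2D point in homogeneous coordinates."""
    x, y = point
    nx = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    ny = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    return (nx, ny)

# Scale by 2, then translate by (5, -1), combined in one affine matrix.
M = [[2, 0, 5],
     [0, 2, -1],
     [0, 0, 1]]
print(apply_affine(M, (3, 4)))  # (11, 7)
```

Because the combined operation is a single matrix, chains of transformations can be precomposed once and applied to every pixel or point cheaply.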


Chapter 10 Geometric transformations

Perspective Transformation:
Description: Represents a perspective projection, useful for simulating
three-dimensional effects.

Transformation Matrix (a 3x3 homography acting on 2D homogeneous coordinates):

    [ h11  h12  h13 ]
    [ h21  h22  h23 ]
    [ h31  h32   1  ]

Applications: 3D rendering, simulation.


Chapter 10 Geometric transformations

Projective Transformation:
Description: Generalization of perspective transformation with
additional control points.

Transformation Matrix (3D): More complex than the perspective transformation matrix.

Applications: Computer graphics, augmented reality.


Chapter-11

Global optimization

Global optimization is a branch of optimization that focuses on finding the global minimum or maximum of a function over its entire feasible domain.

Unlike local optimization, which aims to find the optimal solution within a specific region, global optimization seeks the best possible solution across the entire search space.

Global optimization problems are often challenging due to the presence of multiple local optima or complex, non-convex search spaces.
Chapter 11 Global optimization
Here are key concepts and approaches related to global optimization:
Concepts:
Objective Function:
 The function to be minimized or maximized.
Feasible Domain:
 The set of input values (parameters) for which the objective function is
defined.
Global Minimum/Maximum:
 The lowest or highest value of the objective function over the entire feasible
domain.
Local Minimum/Maximum:
 A minimum or maximum within a specific region of the feasible domain.
Chapter 11 Global optimization

Approaches:

Grid Search:
 Dividing the feasible domain into a grid and evaluating the objective function
at each grid point to find the optimal solution.
Random Search:
 Randomly sampling points in the feasible domain and evaluating the objective
function to explore different regions.
Evolutionary Algorithms:
 Genetic algorithms, particle swarm optimization, and other evolutionary
techniques use populations of solutions and genetic operators to iteratively
evolve toward the optimal solution.
Simulated Annealing:
 Inspired by the annealing process in metallurgy, simulated annealing
gradually decreases the temperature to allow the algorithm to escape local
optima.
Chapter 11 Global optimization

Ant Colony Optimization:
 Inspired by the foraging behavior of ants, this algorithm uses pheromone trails to guide the search for the optimal solution.
Genetic Algorithms:
 Inspired by biological evolution, genetic algorithms use mutation, crossover,
and selection to evolve a population of potential solutions.
Particle Swarm Optimization:
 Simulates the social behavior of birds or fish, where a swarm of particles
moves through the search space to find the optimal solution.
Bayesian Optimization:
 Utilizes probabilistic models to model the objective function and guide the
search toward promising regions.
Quasi-Newton Methods:
 Iterative optimization methods that use an approximation of the Hessian
matrix to find the optimal solution efficiently.
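As a sketch of the simplest approach listed above, random search, here is a plain-Python example minimizing a one-dimensional function with many local minima (the Rastrigin-style objective and all names are illustrative):

```python
import math
import random

def objective(x):
    """A 1-D function with many local minima; global minimum is 0 at x = 0."""
    return x * x + 10 * (1 - math.cos(2 * math.pi * x))

def random_search(lo, hi, n_samples, seed=0):
    """Global optimization by uniform random sampling of the feasible domain."""
    rng = random.Random(seed)
    best_x = rng.uniform(lo, hi)
    best_f = objective(best_x)
    for _ in range(n_samples - 1):
        x = rng.uniform(lo, hi)
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

x, f = random_search(-5.12, 5.12, 20000)
print(round(x, 3), round(f, 3))  # close to the global minimum at (0, 0)
```

A purely local method started near x = 1 would get stuck in the local minimum there (objective value 1); random search keeps sampling the whole domain and so eventually lands in the global basin around x = 0.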
Chapter 12 Real-world Computer Vision Applications

Computer vision AI applications and use cases span many industries:

Manufacturing
Healthcare
Security
Agriculture
Smart Cities
Retail
Logistics
Pharmaceutical
