Computer Vision Fundamentals Explained

Uploaded by jpreethi0311

COMPUTER VISION

Chapter-1

Computer Vision
Computer Vision (CV) is a field of Artificial Intelligence (AI) that deals
with computational methods to help computers understand and
interpret the content of digital images and videos.

Hence, CV aims to make computers see and understand visual data input from cameras or sensors.
Chapter 1 Human Vision vs Computer Vision
Chapter 1 Computer Vision Processing Stages

1. Image Acquisition

2. Image Preprocessing

3. Feature Extraction

4. Segmentation

5. Object Detection and Recognition

6. Post-Processing

7. Analysis and Interpretation

8. Visualization and Output
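The eight stages above can be sketched as a tiny pipeline. The functions below (acquire, preprocess, extract_features, segment) are illustrative placeholders written in plain Python on a list-of-rows grayscale image, not part of any standard library; a real system would use a library such as OpenCV.

```python
# Minimal sketch of a CV processing pipeline on a tiny grayscale "image"
# (a list of rows of intensities). All function names are illustrative.

def acquire():
    # 1. Image Acquisition: here, a hard-coded 3x3 grayscale image.
    return [[10, 200, 10],
            [10, 200, 10],
            [10, 200, 10]]

def preprocess(img):
    # 2. Preprocessing: clamp intensities to the valid 0-255 range.
    return [[min(max(p, 0), 255) for p in row] for row in img]

def extract_features(img):
    # 3. Feature Extraction: mean intensity as a (trivial) global feature.
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def segment(img, threshold=128):
    # 4. Segmentation: threshold into foreground (1) / background (0).
    return [[1 if p > threshold else 0 for p in row] for row in img]

image = preprocess(acquire())
feature = extract_features(image)
mask = segment(image)
print(feature)   # mean intensity of the image
print(mask[0])   # first row of the segmentation mask: [0, 1, 0]
```

Later stages (detection, interpretation, visualization) would consume the mask and features produced here.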



Key tasks in computer vision

Image Recognition: Identifying objects, people, or patterns within images.

Object Detection: Locating and classifying multiple objects within an image or video
stream.

Image Segmentation: Dividing an image into meaningful segments or regions, often to identify boundaries and structures.

Face Recognition: Identifying and verifying individuals based on facial features.

Gesture Recognition: Understanding and interpreting human gestures from images or video.

Scene Understanding: Analyzing and comprehending the content and context of a scene.

Motion Analysis: Detecting and tracking movements within video sequences.

3D Reconstruction: Creating three-dimensional models of objects or scenes from two-dimensional images.
Chapter-2

Geometric primitives and transformations

Geometric primitives and transformations are fundamental concepts in computer graphics and computer vision.

They form the basis for representing and manipulating visual elements in both 2D and 3D spaces.
Chapter 2 Geometric Primitives

Points: Represented by coordinates (x, y) in 2D or (x, y, z) in 3D space.

Lines and Line Segments: Defined by two points or a point and a direction vector.

Polygons: Closed shapes with straight sides. Triangles, quadrilaterals, and other
polygons are common geometric primitives.

Circles and Ellipses: Defined by a center point and radii (or axes in the case of
ellipses).

Curves: Bézier curves, spline curves, and other parametric curves are used to
represent smooth shapes.
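A minimal sketch of how these primitives can be represented in code, using plain Python dataclasses (the class names Point, Segment, and Circle are illustrative choices, not a standard API):

```python
from dataclasses import dataclass
import math

@dataclass
class Point:
    # A 2D point given by its (x, y) coordinates.
    x: float
    y: float

@dataclass
class Segment:
    # A line segment defined by its two endpoints.
    a: Point
    b: Point
    def length(self):
        return math.hypot(self.b.x - self.a.x, self.b.y - self.a.y)

@dataclass
class Circle:
    # A circle defined by a center point and a radius.
    center: Point
    radius: float
    def area(self):
        return math.pi * self.radius ** 2

seg = Segment(Point(0, 0), Point(3, 4))
circ = Circle(Point(0, 0), 2.0)
print(seg.length())  # 5.0
print(circ.area())
```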
Chapter 2 Geometric Transformations
Geometric transformations involve modifying the position, orientation, and scale of
geometric primitives. Common transformations include:

Translation: Moves an object by a certain distance along a specified direction.

Rotation: Rotates an object around a specified point or axis.

Scaling: Changes the size of an object along different axes.

Shearing: Distorts the shape of an object by stretching or compressing along one or more axes.

Reflection: Mirrors an object across a specified plane.

Affine Transformations: Combine translation, rotation, scaling, and shearing.

Projective Transformations: Used for perspective transformations in 3D graphics.

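Two of these transformations, rotation and translation, can be sketched in a few lines of plain Python (the function names rotate and translate are illustrative):

```python
import math

def rotate(point, angle_deg, about=(0.0, 0.0)):
    """Rotate a 2D point by angle_deg degrees around the point `about`."""
    ox, oy = about
    x, y = point[0] - ox, point[1] - oy
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return (ox + c * x - s * y, oy + s * x + c * y)

def translate(point, dx, dy):
    """Move a 2D point by (dx, dy)."""
    return (point[0] + dx, point[1] + dy)

# Rotate (1, 0) by 90 degrees about the origin, then translate by (2, 0).
p = translate(rotate((1.0, 0.0), 90), 2, 0)
print(p)  # approximately (2.0, 1.0)
```

Composing transformations this way is exactly what affine transformation matrices capture in a single matrix product.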
Chapter 2 Applications:

Computer Graphics: Geometric primitives and transformations are fundamental for rendering 2D and 3D graphics in applications such as video games, simulations, and virtual reality.

Computer-Aided Design (CAD): Modeling objects in engineering and architecture.

Computer Vision: Geometric transformations are applied to align and process images, correct distortions, and perform other tasks in image analysis.

Robotics: Essential for robot navigation, motion planning, and spatial reasoning.
Chapter-3

Photometric image formation

Photometric image formation refers to the process by which light interacts with surfaces and is captured by a camera, resulting in the creation of a digital image.

This process involves various factors related to the properties of light, the surfaces of objects, and the characteristics of the imaging system.

Understanding photometric image formation is crucial in computer vision, computer graphics, and image processing.
Chapter 3 Here are some key concepts involved:

Illumination
Reflection
Shading
Surface Properties
Shadows
Chapter-4

The digital camera

A digital camera is an electronic device that captures and stores digital images.

It differs from traditional film cameras in that it uses electronic sensors to record images rather than photographic film.

Digital cameras have become widespread due to their convenience, the ability to instantly review images, and the ease of sharing and storing photos digitally.
Chapter 4 The digital camera

Here are key components and concepts related to digital cameras:

Image Sensor:
 Digital cameras use image sensors (such as CCD or CMOS) to convert light
into electrical signals.
 The sensor captures the image by measuring the intensity of light at each
pixel location.
Lens:
 The lens focuses light onto the image sensor.
 Zoom lenses allow users to adjust the focal length, providing optical zoom.

Aperture:
 The aperture is an adjustable opening in the lens that controls the amount of
light entering the camera.
 It affects the depth of field and exposure.
Chapter 4 The digital camera
Shutter:
 The shutter mechanism controls the duration of light exposure to the
image sensor.
 Fast shutter speeds freeze motion, while slower speeds create motion blur.
Viewfinder and LCD Screen:
 Digital cameras typically have an optical or electronic viewfinder for composing shots.
 LCD screens on the camera back allow users to review and frame images.
Image Processor:
 Digital cameras include a built-in image processor to convert raw sensor
data into a viewable image.
 Image processing algorithms may enhance color and sharpness and reduce noise.
Memory Card:
 Digital images are stored on removable memory cards, such as SD or CF
cards.
 Memory cards provide a convenient and portable way to store and transfer
images.
Chapter 4 The digital camera

Autofocus and Exposure Systems:
 Autofocus systems automatically adjust the lens to ensure a sharp image.
 Exposure systems determine the optimal combination of aperture, shutter
speed, and ISO sensitivity for proper exposure.
White Balance:
 White balance settings adjust the color temperature of the captured image to
match different lighting conditions.
Modes and Settings:
 Digital cameras offer various shooting modes (e.g., automatic, manual,
portrait, landscape) and settings to control image parameters.
Connectivity:
 USB, HDMI, or wireless connectivity allows users to transfer images to
computers, share online, or connect to other devices.
Battery:
 Digital cameras are powered by rechargeable batteries, providing the
necessary energy for capturing and processing images.
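The interplay of aperture and shutter speed described above is often summarized by the exposure value, EV = log2(N^2 / t), where N is the f-number and t the shutter time in seconds. Settings one stop apart in aperture and one stop apart in shutter speed give (nearly) the same exposure. A small sketch (the function name exposure_value is illustrative):

```python
import math

def exposure_value(f_number, shutter_seconds):
    """Exposure value: EV = log2(N^2 / t)."""
    return math.log2(f_number ** 2 / shutter_seconds)

# f/8 at 1/125 s and f/5.6 at 1/250 s are equivalent exposures
# (up to the rounding of f/5.6, whose exact value is sqrt(32)).
print(round(exposure_value(8, 1 / 125), 2))
print(round(exposure_value(5.6, 1 / 250), 2))
```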
Chapter-5

Point operators

Point operators, also known as point processing or pixel-wise operations, are basic image processing operations that operate on individual pixels independently.

These operations are applied to each pixel in an image without considering the values of neighboring pixels.

Point operators typically involve mathematical operations or functions that transform the pixel values, resulting in changes to the image's appearance.
Chapter 5 Point operators

Brightness Adjustment:
 Addition/Subtraction: Increase or decrease the intensity of all pixels by adding
or subtracting a constant value.
 Multiplication/Division: Scale the intensity values by multiplying or dividing
them by a constant factor.

Contrast Adjustment:
 Linear Contrast Stretching: Rescale the intensity values to cover the full
dynamic range.
 Histogram Equalization: Adjust the distribution of pixel intensities to enhance
contrast.

Gamma Correction:
 Adjust overall brightness and contrast of an image.

Thresholding:
 Convert a grayscale image to binary by setting a threshold value. Pixels with
values above the threshold become white, and those below become black.
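Thresholding and gamma correction from the list above can be sketched in plain Python on a list-of-rows grayscale image (the function names are illustrative):

```python
def threshold(img, t):
    """Binarize: pixels above t become 255 (white), others 0 (black)."""
    return [[255 if p > t else 0 for p in row] for row in img]

def gamma_correct(img, gamma):
    """Point-wise gamma correction on 0-255 intensities."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]

img = [[0, 64, 128, 192, 255]]
print(threshold(img, 128))      # [[0, 0, 0, 255, 255]]
print(gamma_correct(img, 0.5))  # gamma < 1 brightens the mid-tones
```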
Chapter 5 Point operators

Bit-plane Slicing:
Decompose an image into its binary representation by considering individual
bits.

Color Mapping:
Apply color transformations to change the color balance or convert between color
spaces (e.g., RGB to grayscale).

Inversion:
Invert the intensity values of pixels, turning bright areas dark and vice versa.

Image Arithmetic:
Perform arithmetic operations between pixels of two images, such as addition,
subtraction, multiplication, or division.
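Inversion and image arithmetic (here, subtraction as used for change detection between two frames) can be sketched similarly (function names illustrative):

```python
def invert(img):
    """Invert intensities: bright becomes dark and vice versa."""
    return [[255 - p for p in row] for row in img]

def subtract(img_a, img_b):
    """Pixel-wise difference, clamped at 0 (useful for change detection)."""
    return [[max(a - b, 0) for a, b in zip(ra, rb)]
            for ra, rb in zip(img_a, img_b)]

frame1 = [[10, 10], [10, 200]]   # one pixel has changed...
frame2 = [[10, 10], [10, 10]]    # ...relative to this frame
print(invert(frame1))            # [[245, 245], [245, 55]]
print(subtract(frame1, frame2))  # [[0, 0], [0, 190]]
```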
Chapter-6

Linear filtering

Linear filtering is a fundamental concept in image processing that involves applying a linear operator to an image.

The linear filter operates on each pixel in the image by combining its value with the values of its neighboring pixels according to a predefined convolution kernel or matrix.

The convolution operation computes the weighted sum of pixel values in the image, producing a new value for the center pixel.
Chapter 6 Linear filtering

The general formula: for a kernel h of size (2k+1) x (2k+1), the output image g is

    g(i, j) = sum over u, v in [-k, ..., k] of f(i + u, j + v) * h(u, v)

where f is the input image and h holds the filter (kernel) weights.
Chapter 6 Linear filtering
Common linear filtering operations include:
Blurring/Smoothing:
 Average filter: Each output pixel is the average of its neighboring pixels.
 Gaussian filter: Applies a Gaussian distribution to compute weights for pixel
averaging.
Edge Detection:
 Sobel filter: Emphasizes edges by computing gradients in the x and y
directions.
 Prewitt filter: Similar to Sobel but uses a different kernel for gradient
computation.
Sharpening:
 Laplacian filter: Enhances high-frequency components to highlight edges.
 High-pass filter: Emphasizes details by subtracting a blurred version of the image.
Embossing:
 Applies an embossing effect by highlighting changes in intensity.
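A minimal plain-Python sketch of linear filtering: the filter2d function below (an illustrative name, not a library call) correlates a 3x3 kernel with the interior pixels of a small image, here with a box (average) kernel:

```python
def filter2d(img, kernel):
    """Correlate a grayscale image (list of rows) with a 3x3 kernel.
    Border pixels are skipped, so the output is 2 rows/cols smaller."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(1, h - 1):
        row = []
        for j in range(1, w - 1):
            acc = 0.0
            for u in (-1, 0, 1):
                for v in (-1, 0, 1):
                    acc += img[i + u][j + v] * kernel[u + 1][v + 1]
            row.append(acc)
        out.append(row)
    return out

# 3x3 box (average) filter: every weight is 1/9.
box = [[1 / 9] * 3 for _ in range(3)]
img = [[9, 9, 9],
       [9, 90, 9],
       [9, 9, 9]]
print(filter2d(img, box))  # [[18.0]]: the spike at the centre is smoothed
```

Swapping in a different kernel (e.g. a Sobel kernel [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]) turns the same routine into an edge detector.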
Chapter-7

More neighborhood operators

Neighborhood operators in image processing involve the consideration of pixel values in the vicinity of a target pixel, usually within a defined neighborhood or window.

Unlike point operators, which operate on individual pixels, neighborhood operators take into account the local structure of the image.
Chapter 7 More neighborhood operators

Here are some common neighborhood operators:

Median Filter

Gaussian Filter

Non-local Means Filter

Anisotropic Diffusion

Morphological Operators

Laplacian of Gaussian (LoG)

Homomorphic Filtering

Adaptive Histogram Equalization
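As a sketch of one of these operators, the median filter below (plain Python, illustrative function name) replaces a pixel with the median of its 3x3 neighborhood, which suppresses impulse (salt-and-pepper) noise while preserving edges:

```python
def median_filter(img, i, j):
    """Median of the 3x3 neighbourhood around pixel (i, j)."""
    neighbourhood = [img[i + u][j + v]
                     for u in (-1, 0, 1) for v in (-1, 0, 1)]
    return sorted(neighbourhood)[4]  # middle (5th) of 9 sorted values

img = [[10, 10, 10],
       [10, 255, 10],   # 255 is an impulse-noise pixel
       [10, 10, 10]]
print(median_filter(img, 1, 1))  # 10: the outlier is rejected
```

A linear averaging filter on the same neighbourhood would instead smear the outlier into its neighbours, which is why the median is preferred for this kind of noise.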


Chapter-8

Fourier transforms

Fourier transforms play a significant role in computer vision for analyzing and processing images.

They are used to decompose an image into its frequency components, providing valuable information for tasks such as image filtering, feature extraction, and pattern recognition.
Chapter 8 Fourier transforms
Here are some ways Fourier transforms are employed in computer vision:

Frequency Analysis:
Image Filtering:
Image Enhancement:
Texture Analysis:
Pattern Recognition:
Image Compression:
Image Registration:
Optical Character Recognition (OCR):
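At the core of all of these uses is the discrete Fourier transform (DFT). A naive 1-D version can be sketched in a few lines of plain Python; real systems use a fast Fourier transform (e.g. numpy.fft) and apply the same idea in 2-D, row- then column-wise:

```python
import cmath

def dft(signal):
    """Naive 1-D discrete Fourier transform (O(n^2))."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

# A constant signal has all its energy in the zero-frequency (DC) term.
spectrum = dft([1.0, 1.0, 1.0, 1.0])
print([round(abs(c), 6) for c in spectrum])  # [4.0, 0.0, 0.0, 0.0]
```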
Chapter 8 Fourier transforms

Homomorphic Filtering:
 Homomorphic filtering, which involves transforming an image to a
logarithmic domain using Fourier transforms, is used in applications such as
document analysis and enhancement.

Image Reconstruction:
 Fourier transforms are involved in techniques like computed tomography
(CT) or magnetic resonance imaging (MRI) for reconstructing images from their
projections.
Chapter-9

Pyramids and wavelets

Pyramids and wavelets are both techniques used in image processing for
multi-resolution analysis, allowing the representation of an image at
different scales.
They are valuable for tasks such as image compression, feature extraction,
and image analysis.
Chapter 9 Pyramids and wavelets

Image Pyramids:
Image pyramids are a series of images representing the same scene but at different
resolutions.

There are two main types of image pyramids:

Gaussian Pyramid:
 Created by repeatedly applying Gaussian smoothing and downsampling to
an image.

 At each level, the image is smoothed to remove high-frequency information, and then it is subsampled to reduce its size.

 Useful for tasks like image blending, image matching, and coarse-to-fine
image processing.
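One level of pyramid construction can be sketched in plain Python. For brevity, a 2x2 box average stands in for the Gaussian smoothing step (an approximation; a real Gaussian pyramid smooths with a larger Gaussian kernel before subsampling):

```python
def downsample(img):
    """One pyramid level: average each 2x2 block (a box filter stands in
    for the Gaussian here) and keep one value per block."""
    return [[(img[i][j] + img[i][j + 1] +
              img[i + 1][j] + img[i + 1][j + 1]) / 4
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]

level0 = [[10, 20, 30, 40],
          [10, 20, 30, 40],
          [10, 20, 30, 40],
          [10, 20, 30, 40]]
level1 = downsample(level0)   # 4x4 -> 2x2
level2 = downsample(level1)   # 2x2 -> 1x1
print(level1)  # [[15.0, 35.0], [15.0, 35.0]]
print(level2)  # [[25.0]]
```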
Chapter 9 Pyramids and wavelets

Laplacian Pyramid:
 Derived from the Gaussian pyramid.

 Each level of the Laplacian pyramid is obtained by subtracting the expanded (upsampled) version of the next coarser Gaussian level from the current Gaussian level.

 Useful for image compression and coding, where the Laplacian pyramid
represents the residual information not captured by the Gaussian pyramid.
Chapter 9 Pyramids and wavelets

Wavelets are mathematical functions that can be used to analyze signals and images.

Wavelet transforms provide a multi-resolution analysis by decomposing an image into approximation (low-frequency) and detail (high-frequency) components.
Chapter 9 Pyramids and wavelets

Key concepts include:


Wavelet Transform:
 The wavelet transform decomposes an image into different frequency
components by convolving the image with wavelet functions.
 The result is a set of coefficients that represent the image at various scales
and orientations.
Multi-resolution Analysis:
 Wavelet transforms offer a multi-resolution analysis, allowing the
representation of an image at different scales.
 The approximation coefficients capture the low-frequency information, while
detail coefficients capture high-frequency information.
Chapter 9 Pyramids and wavelets

Haar Wavelet:
 The Haar wavelet is a simple wavelet function used in basic wavelet
transforms.
 It represents changes in intensity between adjacent pixels.

Wavelet Compression:
 Wavelet-based image compression techniques, such as JPEG2000, utilize
wavelet transforms to efficiently represent image data in both spatial and
frequency domains.

Image Denoising:
 Wavelet-based thresholding techniques can be applied to denoise images by
thresholding the wavelet coefficients.

Edge Detection:
 Wavelet transforms can be used for edge detection by analyzing the high-frequency components of the image.
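The Haar transform and wavelet-based thresholding above can be sketched for a 1-D signal in plain Python (one decomposition level; the function name haar_step is illustrative):

```python
def haar_step(signal):
    """One level of the 1-D Haar transform: pairwise averages
    (approximation, low frequency) and differences (detail, high frequency)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

approx, detail = haar_step([9, 7, 3, 5])
print(approx)  # [8.0, 4.0]
print(detail)  # [1.0, -1.0]

# Denoising sketch: zero out small detail coefficients, then reconstruct.
detail = [d if abs(d) > 1.5 else 0 for d in detail]
rec = []
for a, d in zip(approx, detail):
    rec += [a + d, a - d]
print(rec)  # [8.0, 8.0, 4.0, 4.0]: small fluctuations removed
```

Repeating haar_step on the approximation coefficients yields the multi-resolution decomposition described above.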
Chapter-10

Geometric transformations

Geometric transformations are operations that modify the spatial configuration of objects in a digital image.

These transformations are applied to change the position, orientation, scale, or shape of objects while preserving certain geometric properties.

Geometric transformations are commonly used in computer graphics, computer vision, and image processing.

Here are some fundamental geometric transformations:


Chapter 10 Geometric transformations

Translation:
Description: Moves an object by a specified distance along the x and/or
y axes.

Transformation Matrix (2D, homogeneous coordinates, translation by (tx, ty)):

    [ 1  0  tx ]
    [ 0  1  ty ]
    [ 0  0  1  ]

Applications: Object movement, image registration.


Chapter 10 Geometric transformations

Rotation:
Description: Rotates an object by a specified angle about a fixed point.

Transformation Matrix (2D, rotation by angle θ about the origin):

    [ cos θ  -sin θ  0 ]
    [ sin θ   cos θ  0 ]
    [   0       0    1 ]

Applications: Image rotation, orientation adjustment.


Chapter 10 Geometric transformations

Scaling:

Description: Changes the size of an object by multiplying its coordinates by scaling factors.

Transformation Matrix (2D, scale factors sx and sy):

    [ sx  0   0 ]
    [ 0   sy  0 ]
    [ 0   0   1 ]

Applications: Zooming in/out, resizing.


Chapter 10 Geometric transformations

Shearing:

Description: Distorts the shape of an object by varying its coordinates linearly.

Transformation Matrix (2D, shear factors shx and shy):

    [ 1    shx  0 ]
    [ shy  1    0 ]
    [ 0    0    1 ]

Applications: Skewing, slanting.


Chapter 10 Geometric transformations

Affine Transformation:
Description: Combines translation, rotation, scaling, and shearing.

Transformation Matrix (2D, six free parameters):

    [ a  b  tx ]
    [ c  d  ty ]
    [ 0  0  1  ]

Applications: Generalized transformations
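Applying such a matrix to a point is a single matrix-vector product in homogeneous coordinates. A plain-Python sketch (the function name apply_affine is illustrative):

```python
def apply_affine(matrix, point):
    """Apply a 3x3 affine matrix to a 2D point in homogeneous coordinates."""
    x, y = point
    nx = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    ny = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    return (nx, ny)

# Scale by 2, then translate by (5, -1), combined in one affine matrix.
M = [[2, 0, 5],
     [0, 2, -1],
     [0, 0, 1]]
print(apply_affine(M, (3, 4)))  # (11, 7)
```

Because the combined operation is a single matrix, chains of transformations can be precomposed once and applied to every pixel or point cheaply.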


Chapter 10 Geometric transformations

Perspective Transformation:
Description: Represents a perspective projection, useful for simulating
three-dimensional effects.

Transformation Matrix (a 3x3 homography acting on 2D homogeneous coordinates):

    [ h11  h12  h13 ]
    [ h21  h22  h23 ]
    [ h31  h32   1  ]

Applications: 3D rendering, simulation.


Chapter 10 Geometric transformations

Projective Transformation:
Description: Generalization of perspective transformation with
additional control points.

Transformation Matrix (3D): More complex than the perspective transformation matrix.

Applications: Computer graphics, augmented reality.


Chapter-11

Global optimization

Global optimization is a branch of optimization that focuses on finding the global minimum or maximum of a function over its entire feasible domain.

Unlike local optimization, which aims to find the optimal solution within a specific region, global optimization seeks the best possible solution across the entire search space.

Global optimization problems are often challenging due to the presence of multiple local optima or complex, non-convex search spaces.
Chapter 11 Global optimization
Here are key concepts and approaches related to global optimization:
Concepts:
Objective Function:
 The function to be minimized or maximized.
Feasible Domain:
 The set of input values (parameters) for which the objective function is
defined.
Global Minimum/Maximum:
 The lowest or highest value of the objective function over the entire feasible
domain.
Local Minimum/Maximum:
 A minimum or maximum within a specific region of the feasible domain.
Chapter 11 Global optimization

Approaches:

Grid Search:
 Dividing the feasible domain into a grid and evaluating the objective function
at each grid point to find the optimal solution.
Random Search:
 Randomly sampling points in the feasible domain and evaluating the objective
function to explore different regions.
Evolutionary Algorithms:
 Genetic algorithms, particle swarm optimization, and other evolutionary
techniques use populations of solutions and genetic operators to iteratively
evolve toward the optimal solution.
Simulated Annealing:
 Inspired by the annealing process in metallurgy, simulated annealing
gradually decreases the temperature to allow the algorithm to escape local
optima.
Chapter 11 Global optimization

Ant Colony Optimization:
 Inspired by the foraging behavior of ants, this algorithm uses pheromone trails to guide the search for the optimal solution.
Genetic Algorithms:
 Inspired by biological evolution, genetic algorithms use mutation, crossover,
and selection to evolve a population of potential solutions.
Particle Swarm Optimization:
 Simulates the social behavior of birds or fish, where a swarm of particles
moves through the search space to find the optimal solution.
Bayesian Optimization:
 Utilizes probabilistic models to model the objective function and guide the
search toward promising regions.
Quasi-Newton Methods:
 Iterative optimization methods that use an approximation of the Hessian
matrix to find the optimal solution efficiently.
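As a sketch of the simplest approach listed above, random search, here is a plain-Python example minimizing a one-dimensional function with many local minima (the Rastrigin-style objective and all names are illustrative):

```python
import math
import random

def objective(x):
    """A 1-D function with many local minima; global minimum is 0 at x = 0."""
    return x * x + 10 * (1 - math.cos(2 * math.pi * x))

def random_search(lo, hi, n_samples, seed=0):
    """Global optimization by uniform random sampling of the feasible domain."""
    rng = random.Random(seed)
    best_x = rng.uniform(lo, hi)
    best_f = objective(best_x)
    for _ in range(n_samples - 1):
        x = rng.uniform(lo, hi)
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

x, f = random_search(-5.12, 5.12, 20000)
print(round(x, 3), round(f, 3))  # close to the global minimum at (0, 0)
```

A purely local method started near x = 1 would get stuck in the local minimum there (objective value 1); random search keeps sampling the whole domain and so eventually lands in the global basin around x = 0.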
Chapter 12 Real-world Computer Vision Applications

Computer vision AI applications and use cases span many industries:

Manufacturing
Healthcare
Security
Agriculture
Smart Cities
Retail
Logistics
Pharmaceutical
