Stages of Computer Vision Process

The document outlines the five stages of the computer vision process: image acquisition, preprocessing, feature extraction, detection/segmentation, and high-level processing. Each stage is crucial for enhancing image quality, identifying relevant features, and interpreting visual data for applications such as autonomous driving and medical imaging. The document details techniques and algorithms used at each stage to improve the effectiveness of computer vision systems.

Uploaded by Pratham Bhatia

Unit 3: Making Machines See

COMPUTER VISION – PROCESS:


The computer vision process typically involves five stages.
1. Image Acquisition:
 Image acquisition is the initial stage in the process of computer vision, involving the capture of digital images or videos.
 It provides the raw data on which all subsequent analysis is based.
 Digital images can be acquired through digital cameras, by scanning physical photographs or documents, or even by generating them with design software.
 The quality and characteristics of the acquired images greatly influence the effectiveness of subsequent processing and analysis.
 The resolution of the imaging device plays a significant role in determining the quality of acquired images: higher-resolution devices capture finer details and produce clearer images than lower-resolution ones.
 Lighting conditions and camera angles can also influence the effectiveness of image acquisition techniques.

In scientific and medical fields, specialized imaging techniques like MRI (Magnetic Resonance
Imaging) or CT (Computed Tomography) scans are employed to acquire highly detailed images of
biological tissues or structures.

2. Preprocessing:
Preprocessing in computer vision aims to enhance the quality of the acquired image. Some common techniques are:
a. Noise Reduction: Removes unwanted elements such as blurriness, random spots, or distortions. This makes the image clearer and reduces distractions for algorithms.
Example: Removing grainy effects from low-light photos.
b. Image Normalization: Adjusts the pixel values of an image so they fall within a consistent range (e.g., 0–1 or -1 to 1). This ensures all images in a dataset have a similar scale, helping the model learn better.
Example: Scaling pixel values down from 0–255 to 0–1.
c. Resizing/Cropping: Changes the size or aspect ratio of the image so that all images have the same dimensions for analysis.
Example: Resizing all images to 224×224 pixels before feeding them into a neural network.
d. Histogram Equalization: Adjusts the brightness and contrast of an image by spreading pixel intensity values more evenly, enhancing details in dark or bright areas.
Example: Making a low-contrast image look sharper and more detailed.

The main goal for preprocessing is to prepare images for computer vision tasks by:
Removing noise (disturbances).
Highlighting important features.
Ensuring consistency and uniformity across the dataset.
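The normalization and histogram-equalization steps above can be sketched in a few lines of NumPy (a minimal sketch; the tiny sample image and its pixel values are invented for illustration):

```python
import numpy as np

def normalize(img):
    # Scale 8-bit pixel values from the 0-255 range into 0-1.
    return img.astype(np.float32) / 255.0

def equalize_histogram(img):
    # Spread out pixel intensities using the cumulative distribution function.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float32)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # rescale CDF to 0-1
    return (cdf[img] * 255).astype(np.uint8)

img = np.array([[50, 50, 60], [60, 200, 200]], dtype=np.uint8)
n = normalize(img)           # all values now lie in 0-1
eq = equalize_histogram(img) # intensities spread across the full 0-255 range
```

In practice a library such as OpenCV or Pillow would perform these operations, but the arithmetic is the same.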
3. Feature Extraction:
Feature extraction involves identifying and extracting relevant visual patterns or attributes from the
pre-processed image.
(i). Edge detection identifies the boundaries between different regions in an image where there
is a significant change in intensity.
(ii). Corner detection identifies points where two or more edges meet. These points are areas
of high curvature in an image, focused on identifying sharp changes in image gradients, which
often correspond to corners or junctions in objects.
(iii). Texture analysis extracts features like smoothness, roughness, or repetition in an image.
(iv). Colour-based feature extraction quantifies colour distributions within the image, enabling
discrimination between different objects or regions based on their colour characteristics.
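Edge detection (item i) can be illustrated with a hand-rolled Sobel filter. This is a minimal sketch using a synthetic step-edge image; a real pipeline would use an optimized library routine:

```python
import numpy as np

def sobel_edges(img):
    # Sobel kernels respond to horizontal (kx) and vertical (ky) intensity changes.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = kx.T
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3].astype(np.float32)
            gx = (patch * kx).sum()
            gy = (patch * ky).sum()
            out[i, j] = np.hypot(gx, gy)  # gradient magnitude
    return out

# A vertical step edge: left half dark, right half bright.
img = np.zeros((5, 6), dtype=np.uint8)
img[:, 3:] = 255
edges = sobel_edges(img)  # strong response only along the boundary
```

The filter produces large values exactly where intensity changes sharply, which is the boundary the text describes.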

In deep learning-based approaches, feature extraction is often performed automatically by convolutional neural networks (CNNs) during the training process.
4. Detection/Segmentation:
Detection and segmentation are fundamental tasks in computer vision, focusing on identifying
objects or regions of interest within an image. These tasks play a pivotal role in applications like
autonomous driving, medical imaging, and object tracking. This crucial stage is categorized into two
primary tasks:
1. Single Object Tasks
2. Multiple Object Tasks
Single Object Tasks: Single object tasks focus on analysing or delineating individual objects within an image, with two main objectives:

i. Classification: This task involves determining the category or class to which a single object
belongs, providing insights into its identity or nature. KNN (K-Nearest Neighbour) algorithm
may be used for supervised classification while K-means clustering algorithm can be used for
unsupervised classification.
ii. Classification + Localization: In addition to classifying objects, this task also involves
precisely localizing the object within the image by predicting bounding boxes that tightly
enclose it.
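The KNN classification mentioned in (i) can be sketched as a majority vote over the nearest training samples (a minimal sketch; the 2-D feature vectors and class names are invented for illustration):

```python
import numpy as np

def knn_classify(train_X, train_y, query, k=3):
    # Distance from the query to every training sample.
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]                  # indices of the k closest samples
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]                 # majority vote

# Hypothetical feature vectors (e.g. colour and texture scores) for two classes.
train_X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
train_y = np.array(["cat", "cat", "dog", "dog"])
print(knn_classify(train_X, train_y, np.array([0.85, 0.85])))  # dog
```

In practice the features would come from the feature-extraction stage rather than being hand-written constants.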
Multiple Object Tasks: Multiple object tasks deal with scenarios where an image contains multiple
instances of objects or different object classes. These tasks aim to identify and distinguish between
various objects within the image, and they include:

i. Object Detection:
 Object detection focuses on identifying and locating multiple objects of interest within
the image.
 It involves analysing the entire image and drawing bounding boxes around detected
objects, along with assigning class labels to these boxes.
 The main difference between classification and detection is that classification
considers the image as a whole and determines its class whereas detection identifies
the different objects in the image and classifies all of them.
 In detection, bounding boxes are drawn around multiple objects and these are labelled
according to their particular class.
 Object detection algorithms typically use extracted features and learning algorithms to
recognize instances of an object category.
 Some of the algorithms used for object detection are: R-CNN (Region-Based
Convolutional Neural Network), R-FCN (Region-based Fully Convolutional Network),
YOLO (You Only Look Once) and SSD (Single Shot Detector).
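Detectors such as these compare predicted and ground-truth bounding boxes with intersection-over-union (IoU). A minimal sketch, assuming boxes in (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    # Overlapping region of the two boxes.
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    # Union = sum of areas minus the double-counted intersection.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
```

A predicted box is typically counted as a correct detection when its IoU with the ground-truth box exceeds a threshold such as 0.5.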

[Figure: Object Detection]

ii. Image segmentation:

 It creates a mask around pixels with similar characteristics and identifies their class in the given input image.
 Image segmentation helps to gain an understanding of the image at a more granular level.
 Pixels are assigned a class, and a pixel-wise mask is created for each object in the image.
 This makes it easy to identify each object separately from the others.
 Two popular types of segmentation are:

a. Semantic Segmentation: Classifies pixels as belonging to a particular class. Objects belonging to the same class are not differentiated. For example, all animal pixels may be identified under the class "animal" without identifying the type of animal.

b. Instance Segmentation: Classifies pixels as belonging to a particular instance. All objects in the image are differentiated, even if they belong to the same class. For example, two animals of the same class receive separate pixel masks.
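The difference between the two can be shown with toy label masks (a minimal sketch; the 4×4 grid, class ids, and instance ids are invented for illustration):

```python
import numpy as np

# A toy scene with two separate objects of the same class ("animal" = class 1).
semantic = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 1, 1],
                     [0, 0, 1, 1]])   # semantic mask: both objects share label 1

instance = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 2, 2],
                     [0, 0, 2, 2]])   # instance mask: each object gets its own id

num_classes = len(np.unique(semantic)) - 1    # ignore background (0)
num_instances = len(np.unique(instance)) - 1
print(num_classes, num_instances)             # 1 2
```

The semantic mask answers "what is here?" (one class), while the instance mask also answers "which one is it?" (two distinct objects).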

5. High-Level Processing:

 In the final stage of computer vision, high-level processing plays a crucial role in interpreting
and extracting meaningful information from the detected objects or regions within digital
images.
 This advanced processing enables computers to achieve a deeper understanding of visual
content and make informed decisions based on the visual data.
 Tasks involved in high-level processing include recognizing objects, understanding scenes,
and analysing the context of the visual content.
 Ultimately, high-level processing empowers computer vision systems to extract valuable
insights and drive intelligent decision-making in various applications, ranging from
autonomous driving to medical diagnostics.
