Introduction to Computer Vision Basics

The document provides an introduction to computer vision and image processing, explaining the differences between human vision and computer vision. It outlines the typical processes involved in computer vision, including image capturing, processing, and analysis, while also discussing the challenges faced in interpreting images. Additionally, it highlights the applications of computer vision across various sectors such as healthcare, transportation, and agriculture.

Uploaded by

mimitsegent

Dire Dawa University

Institute of Technology
School of Computing

Abdulmejid T. (MSc in CS)


Course Instructor

Feb 12, 2025


Chapter 1
Introduction to Computer Vision and Image Processing

Vision: the state of being able to see.
Computer Vision: a field of AI that trains computers to interpret and understand the visual world.

Human Vision: allows humans to perceive and understand the world surrounding them.
Computer Vision: CV aims to duplicate (mimic) the effect of human vision by electronically perceiving and understanding an image. It is concerned with the automatic extraction, analysis, and understanding of useful information from an image.

A typical computer vision process, illustrated in the image above, performs three main steps:

1 CAPTURING AN IMAGE
A CV system always includes a digital camera or CCTV that captures the image and stores it as a digital file consisting of zeros and ones.

2 PROCESSING THE IMAGE
Different CV algorithms are used to process the digital data stored in the file. These algorithms determine the basic geometric elements and generate the image using the stored digital data.

3 ANALYZING AND TAKING REQUIRED ACTION
Finally, the CV system analyzes the data and, according to this analysis, takes the required action for which it is designed.

The goal of computer vision is to develop algorithms that allow computers to “see”.
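
The three steps can be sketched as a tiny pipeline. This is a minimal illustration under stated assumptions, not a real system: `capture_image` returns a hard-coded 4x4 grayscale grid in place of a camera, and the function names, the threshold of 128, and the "raise alarm" action are all hypothetical choices made for the example.

```python
# Toy capture -> process -> analyze/act pipeline (all names are hypothetical).

def capture_image():
    """Step 1: stand-in for a camera; returns a 4x4 grayscale image (0-255)."""
    return [
        [10, 12, 11, 10],
        [10, 200, 210, 12],
        [11, 205, 220, 10],
        [10, 11, 12, 10],
    ]

def process_image(image, threshold=128):
    """Step 2: turn raw intensities into a binary mask of bright pixels."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]

def analyze_and_act(mask, min_pixels=3):
    """Step 3: count bright pixels and choose an action (assumed rule)."""
    bright = sum(sum(row) for row in mask)
    return "raise alarm" if bright >= min_pixels else "do nothing"

mask = process_image(capture_image())
action = analyze_and_act(mask)
print(action)
```

Each function takes the previous step's output as its input, which mirrors how the stages are chained on the slide.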
Why is computer vision difficult?

LOSS OF INFORMATION IN 3D → 2D: Occurs in typical image capture devices such as a camera. Their geometric properties have been approximated by a pinhole model.

INTERPRETATION OF IMAGE: Understanding an image requires previous knowledge. It can be seen as a mapping, interpretation: image data → model.

NOISE: Its existence calls for mathematical tools that are able to cope with uncertainty. More complex tools make the image analysis much more complicated.

TOO MUCH DATA: CV needs a large database to be truly effective.

BRIGHTNESS MEASURED: The radiance (≈ brightness, image intensity) depends on the irradiance (light source type, intensity, and position), the observer’s position, the surface local geometry, and the surface reflectance properties.
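
The loss of information in the 3D → 2D step can be made concrete with the pinhole model. The sketch below assumes the usual idealized projection, x = f·X/Z and y = f·Y/Z, with a focal length of 1; the function name `project` and the sample points are illustrative. Two different 3D points on the same ray through the pinhole land on the same image point, so depth cannot be recovered from a single image.

```python
# Idealized pinhole projection (assumed model: x = f*X/Z, y = f*Y/Z).

def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) with Z > 0 onto the 2D image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

near = (1.0, 2.0, 4.0)   # a point close to the camera
far = (2.0, 4.0, 8.0)    # a different point, twice as far along the same ray

print(project(near), project(far))  # both projections coincide
```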
Local window vs. need for global view

Image analysis algorithms analyze a pixel in the image and its local neighborhood.
The computer sees the image through a keyhole; this makes it very difficult to
understand more global context.
How context is taken into account is an important facet of image analysis.

Figure 1: World seen through several keyholes providing only a local context.
It is very difficult to guess what object is depicted.
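
The keyhole effect can be shown with a small sketch. The helper `neighborhood` below is a hypothetical function that returns the window an algorithm sees around one pixel; the two 5x5 images are made up for the example. Their 3x3 windows around the centre pixel are identical even though the images differ, so the local view alone cannot tell them apart.

```python
def neighborhood(image, row, col, radius=1):
    """Return the window centred on (row, col), clipped at the image borders."""
    rows = range(max(0, row - radius), min(len(image), row + radius + 1))
    cols = range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    return [[image[r][c] for c in cols] for r in rows]

image_a = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0],
]
image_b = [
    [0, 0, 0, 0, 9],
    [0, 9, 9, 9, 9],
    [0, 9, 9, 9, 9],
    [0, 9, 9, 9, 9],
    [9, 9, 9, 9, 9],
]

window_a = neighborhood(image_a, 2, 2)
window_b = neighborhood(image_b, 2, 2)
print(window_a == window_b)  # same local view
print(image_a == image_b)    # different global content
```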
It is easy for humans to interpret an image if it is seen globally.

Image understanding by a machine can be seen as an attempt to find a relation between input image(s) and previously established models of the observed world.
Image understanding by a machine
Four possible levels of image representation suitable for image analysis problems

The bottom layer contains raw image data and the higher levels interpret the data.
What is an Image?
A two-dimensional array arranged in rows and columns.

A two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.

It is composed of a finite number of elements, each of which has a particular value at a particular location.

These elements are referred to as picture elements, image elements, or pixels.
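
A minimal sketch of this definition, using a plain list of lists as the two-dimensional array. The 3x3 values are made up, and the convention that x is the column and y is the row is an assumption of this example (some textbooks use the opposite order).

```python
# A 3x3 grayscale image: each entry is one pixel's intensity (gray level).
image = [
    [0, 64, 128],
    [32, 96, 160],
    [64, 128, 255],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns

def f(x, y):
    """Intensity at column x, row y (convention assumed for this sketch)."""
    return image[y][x]

print(height * width)    # finite number of pixels
print(f(2, 0), f(0, 2))  # intensities at two different locations
```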
Related fields in CV

Computer Vision is a very interdisciplinary field.


Computer Vision Vs Image Processing
Image Processing is focused on 2D images, studying how to transform one image
into another.

Examples:
Pixel-wise operations, such as contrast enhancement.
Local operations, such as edge extraction, noise removal, smoothing, and sharpening.
Geometrical transformations, such as rotating or rescaling the image.

Image processing neither requires assumptions nor produces interpretations about the image content.
Usually, we use image-processing techniques as the first step in our applications.
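
The three kinds of operation listed above can each be sketched in a few lines. The functions below are illustrative choices, not the only possibilities: min-max contrast stretching for the pixel-wise case, a 3x3 mean filter for the local case, and a 90-degree rotation for the geometric case. Plain Python lists are used so the example has no dependencies; in practice a library such as NumPy or OpenCV would normally be used.

```python
def stretch_contrast(image):
    """Pixel-wise: map the range [min, max] linearly onto [0, 255]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: no contrast to stretch
        return [[0 for _ in row] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]

def smooth(image):
    """Local: replace each pixel by the mean of its clipped 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        out_row = []
        for c in range(w):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

def rotate_90_clockwise(image):
    """Geometric: rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[50, 100], [150, 200]]
print(stretch_contrast(image))
print(smooth(image))
print(rotate_90_clockwise(image))
```

In all three cases the input and the output are images, and nothing is said about what the image depicts.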
We can think of image processing as a black box that receives an image as input,
transforms it internally, and returns a new image as output.

Example: adjusting the brightness and contrast of an image.
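
A sketch of the black-box view: a function that takes an image and returns a new image. The linear rule g = contrast · f + brightness with clipping to 0-255 is a common way to express this adjustment; the function name `adjust` and the parameter values are assumptions for the example.

```python
def adjust(image, contrast=1.0, brightness=0):
    """Return a new image with g = contrast * f + brightness, clipped to 0-255."""
    return [[max(0, min(255, round(contrast * p + brightness))) for p in row]
            for row in image]

original = [[0, 100], [150, 250]]
adjusted = adjust(original, contrast=1.5, brightness=10)

print(adjusted)   # the output is again an image
print(original)   # the input image itself is left untouched
```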
Computer Vision aims to gain a high-level understanding from input digital images or videos, with the purpose of automating tasks that the human visual system can do.

Image Processing is the field of enhancing images by tuning their parameters and features.

So Image Processing is a subset of Computer Vision.

Computer Vision uses many techniques, and Image Processing is just one of them.

Computer vision will try to interpret what is represented in the picture or video.
Image processing, on the other hand, is a subfield of computer vision that
specifically deals with the manipulation and analysis of digital images.
The aim of Computer Vision is to replicate human vision.
Example: an object detection application. Unlike IP, which returns an image, the output for an input image is a bounding box and a label for the detected object.
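
To show only the shape of such an output, here is a deliberately naive sketch. Real object detectors are learned models; this toy stand-in thresholds the image and boxes every bright pixel, and the name `detect_bright_object`, the threshold, and the label text are all hypothetical.

```python
def detect_bright_object(image, threshold=128):
    """Return (label, (top, left, bottom, right)), or None if nothing is bright."""
    hits = [(r, c) for r, row in enumerate(image)
            for c, p in enumerate(row) if p > threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    box = (min(rows), min(cols), max(rows), max(cols))
    return ("bright object", box)

image = [
    [0, 0, 0, 0, 0],
    [0, 0, 200, 210, 0],
    [0, 0, 220, 230, 0],
    [0, 0, 0, 0, 0],
]
detection = detect_bright_object(image)
print(detection)  # a label and a bounding box, not an image
```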
Image Processing:
Mainly focused on processing the raw input images to enhance them or prepare them for other tasks.
It is a subset of Computer Vision.
The input and output are images.
Changes the input’s properties.
Does not interpret an image.
Often the first step of an application.

Computer Vision:
Focused on extracting information from the input images to understand and predict the visual input like a human.
It is a superset of Image Processing.
The input can be an image; the output can be a label or a bounding box.
It does not change the input’s properties.
Extracts useful information from the input.
We use it after the image-processing stages.
Classification of DIP and Computer Vision Processes
Low-level process (DIP): primitive operations where inputs and outputs are images. Major functions: image pre-processing such as noise reduction, contrast enhancement, image sharpening, etc.

Mid-level process (DIP, Computer Vision, and Pattern Recognition): inputs are images, outputs are attributes (e.g., edges). Major functions: segmentation, description, classification/recognition of objects.

High-level process (Computer Vision): making sense of an ensemble of recognized objects; performing the cognitive functions normally associated with vision.
Different image processing examples
Image processing involves applying various mathematical and computational operations to images to enhance their quality, extract useful information, or perform specific tasks.
Image processing techniques can include:
Image filtering
Edge detection
Image restoration
Image compression
Feature extraction
Image synthesis
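
As one concrete instance of the edge detection item in the list above, the sketch below applies the Sobel operator, a standard 3x3 gradient filter. The slides do not name a specific method, so this choice is an assumption. The kernels are applied by correlation (no kernel flip), which does not affect the gradient magnitude, and border pixels are simply left at 0.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def apply_kernel(image, kernel, r, c):
    """Weighted sum of the 3x3 window centred on the interior pixel (r, c)."""
    return sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels stay at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out

# Vertical step edge: dark left half, bright right half.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1])
```

A flat image gives zero response everywhere, while the step edge produces a strong response at the pixels next to the intensity jump.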
Applications of CV and IP

Applications of computer vision across different sectors

Computer Vision in Healthcare
X-Ray Analysis
State-of-the-art image recognition algorithms can be used to detect patterns in an X-ray image that are too subtle for the human eye.
Surgical Workflow
Analysis with AI by RSIP (Real Time Signal and Image Processing) Vision.
Cancer Detection
Computer vision is being successfully applied to breast and skin cancer detection.
With automated cancer detection, doctors can diagnose cancer faster from an MRI scan.
CT Scan and MRI
Computer vision is now widely applied in CT (Computed Tomography) scan and MRI (Magnetic Resonance Imaging) analysis.
AI with computer vision analyzes radiology images with a high level of accuracy and reduces the time for disease detection, improving the chances of saving a patient's life.
Computer Vision in Transportation
Self-driving (autonomous) cars
Pedestrian detection
Road condition monitoring and defect detection
Computer Vision in Manufacturing
Defect detection
Analyzing text and barcodes (OCR)
Computer Vision in Agriculture
Crop monitoring
Automatic weeding
Plant disease detection
Computer Vision in Retail
Automatic replenishment
Detecting PPE (Personal Protective Equipment)
Self-checkout
Computer Vision in Sport
Player performance insight
Analyzing a soccer game using TensorFlow Object Detection
Fundamental Image Processing Steps
Image Acquisition
This is the first step of image processing and may include preprocessing, such as scaling.
It involves retrieving the image from a source, usually a hardware-based source.
Image Enhancement
Image enhancement is the process of bringing out and highlighting certain features of interest in an image that have been obscured.
This can involve changing the brightness, contrast, etc.
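
One widely used enhancement technique is histogram equalization, which spreads the intensities of a low-contrast image over the full range. The slides do not name it, so it is offered here only as an illustrative example. The sketch uses the common mapping v → round((cdf(v) − cdf_min) / (N − cdf_min) · (L − 1)), where N is the number of pixels and L the number of gray levels.

```python
def equalize(image, levels=256):
    """Histogram-equalize an image whose pixels are integers in [0, levels)."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1
    # Cumulative distribution: number of pixels with value <= v.
    cdf, running = [0] * levels, 0
    for value in range(levels):
        running += histogram[value]
        cdf[value] = running
    cdf_min = min(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # flat image: nothing to spread out
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]

dull = [[100, 110, 110], [120, 120, 120]]   # values squeezed into 100-120
print(equalize(dull))
```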
Image Restoration
Image restoration is also the process of improving the appearance of an image. However, unlike image enhancement, image restoration is done using certain mathematical or probabilistic models.
Color Image Processing
Color image processing includes a number of color modeling
techniques in a digital domain. This step has gained prominence due
to the significant use of digital images over the internet.

Wavelets and Multiresolution Processing
Wavelets are used to represent images in various degrees of resolution. The images are subdivided into wavelets or smaller regions for data compression and for pyramidal representation.
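
A minimal sketch of the idea, using the Haar wavelet in its simple averaging form (pairwise averages and half-differences rather than the orthonormal scaling). It works on a single image row of even length, which is an assumption of this example. Each step halves the resolution, the repeated coarse versions form the pyramid, and the detail values are what would be stored to reconstruct the finer level.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (coarse) and half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details

def haar_inverse(averages, details):
    """Rebuild the finer level from its averages and details."""
    out = []
    for avg, det in zip(averages, details):
        out.extend([avg + det, avg - det])
    return out

row = [8, 6, 2, 2, 5, 5, 9, 1]           # one image row (made-up values)
coarse, detail = haar_step(row)           # half resolution
coarser, detail2 = haar_step(coarse)      # quarter resolution

print(coarse, detail)
print(coarser, detail2)
print(haar_inverse(coarse, detail) == row)
```

Smooth regions produce detail values of zero, which is why this kind of representation lends itself to compression.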
Compression
Compression is a process used to reduce the storage required to
save an image or the bandwidth required to transmit it. This is done
particularly when the image is for use on the Internet.
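
As a small illustration of reducing storage, the sketch below uses run-length encoding, one of the simplest lossless schemes. The slides do not name a compression method, so this is an assumed example. Runs of identical pixels are stored as (value, count) pairs and can be expanded back exactly.

```python
def rle_encode(row):
    """Encode a row of pixels as a list of (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    return [value for value, count in runs for _ in range(count)]

row = [0, 0, 0, 0, 255, 255, 0, 0, 0, 0, 0, 0]
encoded = rle_encode(row)

print(encoded)                      # 3 pairs instead of 12 pixels
print(rle_decode(encoded) == row)   # lossless round trip
```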
Morphological Processing
Morphological processing is a set of operations for processing images based on the shapes they contain.
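
The two basic morphological operations, erosion and dilation, can be sketched on a binary image with a 3x3 square structuring element. The border handling is an assumption of this example: neighbourhoods are clipped at the image edge, so pixels outside the image are simply ignored.

```python
def _morph(image, combine):
    """Apply `combine` (all or any) over each clipped 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    return [[int(combine(image[rr][cc]
                         for rr in range(max(0, r - 1), min(h, r + 2))
                         for cc in range(max(0, c - 1), min(w, c + 2))))
             for c in range(w)] for r in range(h)]

def erode(image):
    """Keep a pixel only if every pixel in its neighbourhood is set."""
    return _morph(image, all)

def dilate(image):
    """Set a pixel if any pixel in its neighbourhood is set."""
    return _morph(image, any)

square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
eroded = erode(square)
print(eroded)           # the 3x3 square shrinks to its centre pixel
print(dilate(eroded))   # dilation grows it back to the 3x3 square
```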
Segmentation
It is one of the most difficult steps of image processing. It involves
partitioning an image into its constituent parts or objects.
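
A minimal sketch of partitioning an image into objects: threshold the image, then group the bright pixels into 4-connected regions. Thresholding followed by connected-component labelling is only one simple approach and works here because the made-up example is clean; the threshold of 128 and the function name `label_regions` are assumptions.

```python
from collections import deque

def label_regions(mask):
    """4-connected component labelling; returns (labels, region_count)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not labels[r][c]:
                count += 1                      # start a new region
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # flood-fill the region
                    cr, cc = queue.popleft()
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                                   (cr, cc - 1), (cr, cc + 1)):
                        if (0 <= nr < h and 0 <= nc < w
                                and mask[nr][nc] and not labels[nr][nc]):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count

image = [
    [200, 200, 10, 10, 10],
    [200, 200, 10, 10, 180],
    [10, 10, 10, 10, 180],
]
mask = [[p > 128 for p in row] for row in image]
labels, count = label_regions(mask)
print(count)
print(labels)
```

Label 0 marks the background, and each object receives its own positive label.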
Representation and Description
After the segmentation process, each region is represented and described in a form suitable for further computer processing.
Representation deals with the image’s characteristics and regional
properties. Description deals with extracting quantitative information
that helps differentiate one class of objects from the other.

Recognition
Recognition assigns a label to an object based on its description.
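
The last two steps can be sketched together. Here a region is represented as a list of (row, column) pixel coordinates, described by two numbers (area and bounding-box aspect ratio), and recognized with a hand-written rule. The descriptor, the class names, and the 1.5 cut-off are all hypothetical; practical systems use richer descriptors and learned classifiers.

```python
def describe(region_pixels):
    """Descriptor: (area in pixels, width / height of the bounding box)."""
    rows = [r for r, _ in region_pixels]
    cols = [c for _, c in region_pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return (len(region_pixels), width / height)

def recognize(descriptor):
    """Assign a label from the descriptor (the 1.5 cut-off is arbitrary)."""
    _, aspect = descriptor
    if aspect >= 1.5:
        return "wide object"
    if aspect <= 1 / 1.5:
        return "tall object"
    return "compact object"

square = [(r, c) for r in range(2) for c in range(2)]   # 2x2 block
bar = [(0, c) for c in range(6)]                        # 1x6 horizontal bar

for region in (square, bar):
    descriptor = describe(region)
    print(descriptor, recognize(descriptor))
```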
