Computer Vision Overview and Applications


UNIT 6

COMPUTER VISION
1. Define Computer Vision

Computer Vision is the domain of Artificial Intelligence that enables machines to "see" through images or other visual data, and to process and analyse that data using algorithms and methods so that real-world phenomena can be interpreted from images.

2. Explain in detail the applications of Computer Vision.

Facial Recognition
Computer vision plays a central role in the era of smart cities and smart homes. One of its most important security applications is facial recognition, which can be used for visitor identification or for maintaining a visitor log.
Face Filters
Much of the functionality in today's apps, including Instagram and Snapchat, relies on computer vision. One example is facial filters: the algorithm recognises a person's facial dynamics through the camera and applies the chosen facial filter.
Google’s Search by Image
Most searches on Google's search engine involve textual information, but it also offers the intriguing option of searching by image. This makes use of computer vision: it examines numerous attributes of the input image and compares them to those of the images in its database to produce the search result.
Computer Vision in Retail
Retail is one of the fastest-growing industries, and it too uses computer vision to improve the user experience. Retailers can analyse navigational routes, find walking patterns, and track customer movements through stores using computer vision techniques.
Self-Driving Cars
Vision is the fundamental technology behind developing autonomous vehicles. Most leading car manufacturers in the world are reaping the benefits of investing in artificial intelligence for developing on-road versions of hands-free technology. This involves identifying objects, determining navigational routes and monitoring the environment, all at the same time.
Medical Imaging
Computer-supported medical imaging software has been a reliable resource for doctors over the past few decades. It doesn't just produce and analyse images; it also works as a doctor's assistant to aid in interpretation. The software is used to interpret and transform 2D scan images into interactive 3D models that give medical professionals a thorough insight into a patient's health.
Google Translate App
To read signs written in a foreign language, all you have to do is point your phone's camera at the text, and the Google Translate app will almost instantly translate it into the language of your choice. This is a useful application of computer vision: it uses optical character recognition to read the image and augmented reality to overlay an accurate translation.

3. Explain the different tasks used in a computer vision application.

The various applications of Computer Vision are based on a number of tasks which are performed to extract certain information from the input image; this information can be used directly for prediction or can form the basis for further analysis. The tasks used in a computer vision application are:
Image Classification: Classification is the task of assigning an input image one label from a fixed set of categories. This is one of the core problems in computer vision that, despite its simplicity, has a large variety of practical applications.
Classification + Localisation: This task involves both identifying what object is present in the image and, at the same time, identifying where in the image that object is located. It is used only for single objects.
Object Detection: Object detection is the process of finding instances of real-world objects such as
faces, bicycles, and buildings in images or videos. Object detection algorithms typically use extracted
features and learning algorithms to recognize instances of an object category. It is commonly used in
applications such as image retrieval and automated vehicle parking systems.
Instance Segmentation: Instance Segmentation is the process of detecting instances of the objects,
giving them a category and then giving each pixel a label based on that. A segmentation algorithm
takes an image as input and outputs a collection of regions (or segments).
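Object detection systems typically describe each detected object with a rectangular bounding box. A standard way to compare a predicted box with a reference box is intersection over union (IoU); the sketch below is a minimal plain-Python illustration (the function name and the `(x1, y1, x2, y2)` box format are assumptions for illustration, not from these notes):

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) corner coordinates.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlapping rectangle, if any.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes -> 1.0
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes  -> 0.0
```

An IoU of 1.0 means a perfect match, while 0.0 means the boxes do not overlap at all.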

4. Explain the following terms

a. Pixels
The word "pixel" means a picture element. Every photograph, in digital form, is made up of pixels. They are the smallest unit of information that make up a picture. Usually round or square, they are typically arranged in a 2-dimensional grid. The more pixels you have, the more closely the image resembles the original.

b. Pixel value
Each of the pixels that make up an image that is stored on a computer has a pixel value that specifies
its brightness and/or intended color. The byte image, which stores this number as an 8-bit integer with
a possible range of values from 0 to 255, is the most popular pixel format.
Zero is typically used to represent no color or black, and 255 is used to represent full color or white.
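As a minimal sketch (pure Python, no image library assumed), a tiny grayscale image can be represented as a grid of pixel values, with each value clamped to the valid 8-bit range described above; the helper name `clamp_to_byte` is an illustration, not a standard function:

```python
def clamp_to_byte(value):
    # Keep a pixel value inside the valid 8-bit range 0..255.
    return max(0, min(255, int(value)))

# A 2x2 grayscale image: 0 is black, 255 is white.
image = [
    [0, 128],
    [200, 300],  # 300 is out of range and must be clamped
]
image = [[clamp_to_byte(v) for v in row] for row in image]
print(image)  # [[0, 128], [200, 255]]
```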
c. Resolution
The resolution of an image is occasionally referred to as its number of pixels. One common approach is to state resolution as width × height in pixels, for example a monitor resolution of 1280×1024. Accordingly, there are 1280 pixels from side to side and 1024 pixels from top to bottom.
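The total pixel count of such a display follows directly from multiplying width by height:

```python
width, height = 1280, 1024  # pixels across and pixels down
total_pixels = width * height
print(total_pixels)  # 1310720 pixels in total
```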

d. Grayscale Images
Grayscale images are images which have a range of shades of gray without apparent colour. The darkest possible shade is black, which is the total absence of colour, or a pixel value of zero. The lightest possible shade is white, which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray are represented by equal brightness levels of the three primary colours.
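One common convention for producing a gray value from a colour pixel is a weighted sum of the three channels (the ITU-R BT.601 luma weights below are an assumption here, since these notes do not specify a formula):

```python
def rgb_to_gray(r, g, b):
    # Weighted sum of the three primary colours; the weights reflect
    # the eye's differing sensitivity to red, green and blue.
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(rgb_to_gray(0, 0, 0))        # 0   (black)
print(rgb_to_gray(255, 255, 255))  # 255 (white)
```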

e. RGB Images
Most images we encounter are coloured images. They are made up of three primary colours: Red, Green, and Blue. Combining red, green, and blue in various intensities can create all the colours that are visible.

f. Image Features
• A feature is a measurable piece of information that describes the content of an image.
• Features are the specific structures in the image such as points, edges or objects.
• Other examples of features relate to computer vision tasks such as motion in image sequences, or to shapes defined in terms of curves or boundaries between different image regions.

g. OpenCV
• OpenCV, or the Open Source Computer Vision Library, is a tool which helps a computer extract features from images. It is used for all kinds of image and video processing and analysis.
• It is capable of processing images and videos to identify objects, faces, or even handwriting.
• OpenCV is used for basic image processing operations on images such as resizing, cropping and many more.

5. How do computers store RGB images?


• Every RGB image is stored in the form of three different channels called the R channel, G
channel and the B channel.
• Each plane separately has several pixels with each pixel value varying from 0 to 255.
• All the three planes when combined form a colour image.
• This means that in an RGB image, each pixel has a set of three different values which together give colour to that pixel.
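A minimal pure-Python sketch of this three-channel storage (the nested-list layout is an illustration; real image libraries use packed arrays):

```python
# Each channel is a 2x2 plane of 0..255 intensity values.
r_channel = [[255, 0], [0, 255]]
g_channel = [[0, 255], [0, 255]]
b_channel = [[0, 0], [255, 255]]

# Superimposing the three planes gives one (R, G, B) triple per pixel.
image = [
    [(r_channel[y][x], g_channel[y][x], b_channel[y][x]) for x in range(2)]
    for y in range(2)
]
print(image[0][0])  # (255, 0, 0)     -> pure red pixel
print(image[1][1])  # (255, 255, 255) -> white pixel
```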

6. Why do computer systems have a maximum pixel value of 255?

Or
Why do pixel values range from 0 to 255?

In computer systems, data is stored in the form of ones and zeros, which we call the binary system. Each bit can hold either a zero or a one. Each pixel of an image uses 1 byte, which is equivalent to 8 bits of data. Since each bit has two possible values, 8 bits can represent 2⁸ = 256 possible values, starting from 0 and ending at 255.
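The arithmetic can be checked directly:

```python
bits_per_pixel = 8
possible_values = 2 ** bits_per_pixel   # two choices per bit, 8 bits
lowest = 0
highest = possible_values - 1           # counting starts at 0
print(possible_values, lowest, highest)  # 256 0 255
```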
