
Computer Vision Applications and Basics

The document discusses the field of Computer Vision within Artificial Intelligence, detailing its applications such as facial recognition, self-driving cars, and medical imaging. It also outlines key tasks in Computer Vision, including image classification and object detection, and explains fundamental concepts like pixels, resolution, and color representation in images. Overall, it highlights the significance of Computer Vision in various industries and its underlying technology.

Uploaded by

halawa0071

Unit-4 Computer Vision

Computer Vision is the domain of Artificial Intelligence that enables machines to see through images or visual data, and to process
and analyse that data using algorithms and methods in order to understand real-world phenomena.

Applications of Computer Vision


Facial Recognition:
1. With the advent of smart cities and smart homes, Computer Vision plays a vital role in making the home
smarter. Security is the most important application, and it relies on Computer Vision for facial recognition.
2. It can be used for guest recognition or for maintaining a log of visitors. It is also applied in schools for
attendance systems based on facial recognition of students.

Face Filters:
Modern apps like Instagram and Snapchat have many features based on Computer Vision, and face filters are
one of them. Through the camera, the algorithm identifies the facial dynamics of the person and applies the
selected filter.

Google’s Search by Image:

1. Most searches on Google’s search engine use textual data, but it also has an interesting feature for
getting search results from an image.
2. This uses Computer Vision: it analyses various features of the input image, compares them against a
database of images and returns the search results.

Computer Vision in Retail:

1. Retail has been one of the fastest-growing fields, and it uses Computer Vision to improve the
customer experience. Retailers can use Computer Vision techniques to track customers’
movements through stores, analyse navigational routes and detect walking patterns.
2. Inventory management is another application. Through security camera image analysis, a Computer
Vision algorithm can generate an accurate estimate of the items available in the store.
Self-Driving Cars:
1. Computer Vision is the fundamental technology behind autonomous vehicles. Most leading car
manufacturers in the world are investing in artificial intelligence to develop on-road versions of
hands-free technology.
2. This involves identifying objects, working out navigational routes and monitoring the environment, all at
the same time.

Medical Imaging:
1. Computer-supported medical imaging applications have been a trustworthy aid for physicians. They not only
create and analyse images, but also act as an assistant and help doctors with interpretation.
2. Such applications are used to read and convert 2D scan images into interactive 3D models that enable medical
professionals to gain a detailed understanding of a patient’s health condition.
Google Translate App:
1. To read signs in a foreign language, all you need to do is point your phone’s camera at the words and let
the Google Translate app tell you what they mean in your preferred language almost instantly.
2. The app uses optical character recognition to read the text in the image and augmented reality to overlay the
translation, which makes it a convenient tool built on Computer Vision.
Computer Vision Tasks
The various applications of Computer Vision are based on a small number of tasks that are performed to
extract information from the input image, which can then be used directly for prediction. The tasks used in a
Computer Vision application are:

1. Classification
Image Classification is the Computer Vision task of assigning a class label to an image based on its visual
content.

2. Classification + Localisation

This task involves identifying both what object is present in the image and where in the image
that object is located. It is used only for single objects.

3. Object Detection
a) Object detection is the process of finding instances of real-world objects such as faces, bicycles, and buildings
in images or videos.
b) Object detection algorithms typically use extracted features and learning algorithms to recognize instances of
an object category. It is commonly used in applications such as image retrieval and automated vehicle parking
systems.
4. Instance Segmentation

Instance Segmentation is the process of detecting instances of the objects, giving them a category and then giving
each pixel a label on the basis of that. A segmentation algorithm takes an image as input and outputs a collection of
regions (or segments).
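
To make the relationship between these outputs more concrete, here is a toy sketch in Python, not a real detector or segmentation model. It assumes a hypothetical instance mask (one integer label per pixel, with 0 meaning background) and derives a bounding box for each instance, which is the kind of location output that localisation and object detection produce. The function name `bounding_boxes` and the sample mask are illustrative assumptions.

```python
def bounding_boxes(mask):
    """Return {label: (x_min, y_min, x_max, y_max)} for each non-zero label.

    `mask` is a list of rows; each entry is an instance label (0 = background).
    """
    boxes = {}
    for y, row in enumerate(mask):
        for x, label in enumerate(row):
            if label == 0:
                continue  # background pixel, belongs to no instance
            if label not in boxes:
                boxes[label] = [x, y, x, y]
            else:
                box = boxes[label]
                box[0] = min(box[0], x)
                box[1] = min(box[1], y)
                box[2] = max(box[2], x)
                box[3] = max(box[3], y)
    return {label: tuple(box) for label, box in boxes.items()}


# Hypothetical 3x5 instance mask with two objects, labelled 1 and 2.
mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 2, 2],
]

print(bounding_boxes(mask))  # {1: (1, 0, 2, 1), 2: (3, 1, 4, 2)}
```

The mask carries more information than the boxes: it says exactly which pixels belong to each object, while a box only gives a rectangle around it.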
Basics of Pixels
The word “pixel” means a picture element. Every photograph, in digital form, is made up of pixels. They are the
smallest units of information that make up a picture.

Usually round or square, they are typically arranged in a 2-dimensional grid. In the image below, one portion has
been magnified many times over so that you can see its individual composition in pixels. As you can see, the pixels
approximate the actual image. The more pixels you have, the more closely the image resembles the original.

Resolution
The number of pixels in an image is sometimes called the resolution. When the term is used to describe pixel count,
one convention is to express resolution as the width by the height, for example a monitor resolution of 1280×1024.
This means there are 1280 pixels from one side to the other, and 1024 from top to bottom.

Another convention is to express the number of pixels as a single number, like a 5-megapixel camera (a megapixel is
a million pixels).
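
The two conventions are related by a simple multiplication. As a small sketch in Python (the function names are illustrative assumptions), the width-by-height form can be converted into a pixel count and a megapixel figure:

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height


def megapixels(width, height):
    """Pixel count expressed in millions of pixels."""
    return pixel_count(width, height) / 1_000_000


# The 1280x1024 monitor resolution used as the example above.
print(pixel_count(1280, 1024))  # 1310720
print(megapixels(1280, 1024))   # 1.31072
```

So a 1280×1024 display has roughly 1.3 megapixels.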

Pixel value
1. Each of the pixels that represents an image stored inside a computer has a pixel value which describes how
bright that pixel is, and/or what colour it should be.
2. The most common pixel format is the byte image, where this number is stored as an 8-bit integer giving a
range of possible values from 0 to 255.
3. Typically, zero is to be taken as no colour or black and 255 is taken to be full colour or white.
4. In computer systems, data is stored in the form of ones and zeros, which we call the binary system.
Each bit in a computer system can be either a zero or a one. In a byte image, each pixel uses 1 byte, which
is equivalent to 8 bits of data.
5. Since each bit can have two possible values, 8 bits can represent 2^8 = 256 different values,
which start at 0 and end at 255.
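
The counting argument can be checked directly. This short Python sketch (the variable names are illustrative) shows that 8 bits give 2^8 = 256 distinct values, so the largest pixel value is 255:

```python
BITS_PER_PIXEL = 8

# Each bit doubles the number of representable values.
num_values = 2 ** BITS_PER_PIXEL
max_value = num_values - 1

print(num_values)  # 256
print(max_value)   # 255

# The smallest and largest pixel values written as 8-bit binary numbers.
print(format(0, "08b"))          # 00000000
print(format(max_value, "08b"))  # 11111111
```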
Grayscale Images
1. Grayscale images are images which have a range of shades of gray without apparent colour.
2. The darkest possible shade is black, which is the total absence of colour or zero value of pixel.
3. The lightest possible shade is white, which is the total presence of colour or a pixel value of 255.
4. Intermediate shades of gray are represented by equal brightness levels of the three primary colours.
Here is an example of a grayscale image. As you can check, the pixel values are within the range 0-255.
Computers store the images we see in the form of these numbers.
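
As a minimal sketch of what "stored in the form of numbers" means, the Python snippet below represents a hypothetical 3×3 grayscale image as a grid of values and shows how one gray level corresponds to equal amounts of red, green and blue. The sample values and the helper name `gray_to_rgb` are assumptions made for illustration.

```python
# A hypothetical 3x3 grayscale image: one brightness value (0-255) per pixel.
image = [
    [0, 128, 255],
    [64, 128, 192],
    [255, 128, 0],
]

# Every value lies in the 8-bit range: 0 is black, 255 is white.
all_values = [value for row in image for value in row]
print(min(all_values), max(all_values))  # 0 255


def gray_to_rgb(value):
    """A gray shade has equal red, green and blue brightness."""
    return (value, value, value)


print(gray_to_rgb(128))  # (128, 128, 128)
```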

RGB Images
Most of the images that we see around us are colour images. These images are made up of three primary colours: Red,
Green and Blue. All the colours that are present can be made by combining different intensities of red, green and
blue.

Every RGB image is stored in the form of three different channels called the R channel, G channel and the B channel.
Each channel (or plane) has one value per pixel, with each pixel value varying from 0 to 255.

When all three planes are combined, they form a colour image.
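
The idea of separate planes can be sketched in plain Python. The example assumes a hypothetical 2×2 image stored as (R, G, B) tuples; the helper names `split_channels` and `merge_channels` are illustrative, not part of any particular library.

```python
# A hypothetical 2x2 colour image; each pixel is an (R, G, B) tuple.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red, green
    [(0, 0, 255), (255, 255, 0)],    # blue, yellow (red + green)
]


def split_channels(img):
    """Separate an RGB image into its R, G and B planes."""
    r = [[pixel[0] for pixel in row] for row in img]
    g = [[pixel[1] for pixel in row] for row in img]
    b = [[pixel[2] for pixel in row] for row in img]
    return r, g, b


def merge_channels(r, g, b):
    """Combine three planes of equal size back into one RGB image."""
    return [
        [(r[y][x], g[y][x], b[y][x]) for x in range(len(r[y]))]
        for y in range(len(r))
    ]


r, g, b = split_channels(image)
print(r)  # [[255, 0], [0, 255]]
print(g)  # [[0, 255], [0, 255]]
print(b)  # [[0, 0], [255, 0]]

# Merging the planes again reproduces the original colour image.
print(merge_channels(r, g, b) == image)  # True
```

Each plane on its own looks like a grayscale image; the colour only appears when the three are read together.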
