Computer Vision
INM460/IN3060
Lecture 2
Digital images and image processing
Dr Giacomo Tarroni
Slides credits: Giacomo Tarroni, Sepehr Jalali
Recap from the previous lecture
Human visual system:
• How the eye works
• Different theories for colour perception
• How the signal is transmitted from the eye to the visual cortex
• Different functions of the visual cortex parts
• Simple/complex cells
• Advantages & limitations of the human visual system
• Gestalt laws: interpreting how perception is created
• Depth and motion perception
Overview of today’s lecture
• Digital images:
◦ Formation
◦ Digital representation
• Digital image processing:
◦ Brightness and colour transformations
◦ Geometric transformations
◦ Filtering (next lecture)
Images and more images
• Images are produced at an extremely high (and growing) rate
Credits: [Link]
Imaging devices
Digital cameras, webcams, 360° cameras, smartphones, home surveillance, action cameras, camera arrays
What is an image?
“An image is a multi-dimensional signal that measures a physical quantity”
• Multi-dimensional signal:
◦ 2D: Image (a function of 𝑥 and 𝑦)
◦ 3D: Image (a function of 𝑥, 𝑦, and 𝑧, e.g. CT scan) or Video (a function of 𝑥, 𝑦, and 𝑡)
• Physical quantity:
◦ Typically visible light for standard photographs (called “natural images”)
◦ Many other possibilities (temperature, acoustic properties, etc.)
Digital photo Thermal image Ultrasound video
2D Digital image
A 2D digital image is described by a multi-dimensional function (mapping)
between spatial coordinates (𝑥 and 𝑦) and image intensity (for greyscale
images) or colour channel values (for colour images)
Greyscale image
𝐼(𝑥, 𝑦): ℤ² → ℤ
with
• 𝑥, 𝑦 spatial coordinates
• 𝐼 image intensity (typical range: [0, 255])
Colour image
𝐼(𝑥, 𝑦): ℤ² → ℤ³
with
• 𝑥, 𝑦 spatial coordinates
• 𝐼 = (𝐼_R, 𝐼_G, 𝐼_B) colour channel values (e.g. a single-channel read-out 𝐼(200, 400) = 178)
Digital camera
Typical components of modern DSLR (i.e. digital single-lens reflex) and
mirrorless cameras (also see [Link])
Credits: [Link]
From light to pixels
• The light impinging on the image sensor is a spatially continuous signal
• This signal is converted to a digital signal (i.e. a matrix of pixel values)
through spatial sampling and quantisation
Spatial sampling
Sampling converts the domain of a continuous signal into a discrete one:
• The light impinging the image plane is a continuous signal in 𝑥 and 𝑦
• In a digital camera, this signal is sampled on a regular 2D grid
• One pixel per grid cell
Let’s examine the actual process focusing on one dimension:
Spatial sampling
[Figure: output signal intensity (continuous values) along the sensor 𝑥 axis, sampled once per pixel width]
Image resolution
• Image resolution measures the detail an image holds
• It can be described in different ways:
◦ Pixel resolution: image dimensions in pixels (e.g. 640x480, 0.3 MP)
◦ Spatial resolution: size of each pixel (for each dimension) in length (e.g.
1.4x1.4 mm). Used in specific fields (e.g. medical imaging)
◦ Pixel density (for sensor/screens), pixels per inch (to be used when
printing the image), etc.
Credits: [Link]
Colour filter array and demosaicing
• Light captured by digital cameras is binned into separate red, green, and blue
values using a colour filter array (CFA)
• The Bayer pattern is the most common CFA design, based on GRBG: it has twice as
many G bins as R or B because the human eye is more sensitive to green light
• A demosaicing algorithm interpolates the data so that each pixel has a red,
green, and blue (RGB) value
◦ Final pixel resolution preserved!
◦ A raw file contains the data collected before demosaicing
• More detailed explanation: [Link]
Mosaic
Demosaiced image
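The mosaic-then-interpolate idea can be sketched with numpy alone. This is a toy illustration, not a real demosaicing algorithm: it simulates a GRBG mosaic on a random 4×4 image and then "demosaics" it with crude nearest-neighbour copying (real cameras use much smarter interpolation):

```python
import numpy as np

# Toy illustration (not a real demosaicing algorithm): simulate a GRBG
# Bayer mosaic from a small RGB image, then fill the missing channels with
# crude nearest-neighbour copying inside each 2x2 block.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Mosaic: each pixel keeps only the channel its colour filter passes
# (GRBG: G R / B G over every 2x2 block)
mosaic = np.zeros((4, 4), dtype=np.uint8)
mosaic[0::2, 0::2] = rgb[0::2, 0::2, 1]  # G
mosaic[0::2, 1::2] = rgb[0::2, 1::2, 0]  # R
mosaic[1::2, 0::2] = rgb[1::2, 0::2, 2]  # B
mosaic[1::2, 1::2] = rgb[1::2, 1::2, 1]  # G

# "Demosaic": copy each block's single R, G (top-left) and B sample to all
# four pixels of the block, so pixel resolution is preserved
demosaiced = np.zeros((4, 4, 3), dtype=np.uint8)
for dy in (0, 1):
    for dx in (0, 1):
        demosaiced[dy::2, dx::2, 0] = mosaic[0::2, 1::2]  # R samples
        demosaiced[dy::2, dx::2, 1] = mosaic[0::2, 0::2]  # G samples
        demosaiced[dy::2, dx::2, 2] = mosaic[1::2, 0::2]  # B samples
```

Note that every pixel ends up with a full RGB triple even though the sensor measured only one value per pixel.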
Quantisation
Quantisation limits the values a pixel can have to a finite set
Let’s examine the actual process focusing on one dimension:
[Figure: output pixel intensity (discrete values) along the sensor 𝑥 axis]
Quantisation
Quantisation limits the values a pixel can have to a finite set
• In a greyscale image, if we use one byte (8 bits) per pixel, then we can
represent 28 = 256 different intensities (range [0, 255])
8 bits per pixel: 2⁸ = 256 shades | 4 bits per pixel: 2⁴ = 16 shades | 1 bit per pixel: 2¹ = 2 shades (binary image)
Credits: [Link]
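Requantisation to fewer bits can be sketched in numpy by discarding the low-order bits; quantise below is an illustrative helper, not a library function:

```python
import numpy as np

# Requantise an 8-bit image to fewer bits by zeroing the low-order bits.
# quantise is an illustrative helper, not a library function.
def quantise(img, bits):
    step = 2 ** (8 - bits)       # width of each quantisation interval
    return (img // step) * step  # snap every value to its interval floor

img = np.arange(256, dtype=np.uint8)  # all 256 possible intensities
img4 = quantise(img, 4)               # 2**4 = 16 distinct shades
img1 = quantise(img, 1)               # 2**1 = 2 shades (binary image)
```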
Quantisation
• In standard colour photographic images, we use 8 bits for each colour
channel (red, green, blue), i.e. 24 bits per pixel (24-bit colour)
• That means there are 2²⁴ possible colours for a pixel: ~16.7 million colours!
• The human eye can distinguish roughly ~10 million colours
Colour
• Colour images typically have three colour channels corresponding to the
amount of red, green, and blue present at each pixel
• Combining them together produces the final colour
Credits: [Link]
Colour theory: human retina
• Trichromatic theory of vision:
◦ Human eyes perceive colour through the stimulation of three different types
of cones (S, M, L) in the retina
◦ The three types have peak sensitivities roughly corresponding to red, green and blue
• The RGB colour model was defined based on the physiology of the human eye
• The spectral responses of the colour filters in a CFA are quite similar to those of the
LMS cones
[Figure: spectral sensitivity curves of the S, M and L cones]
RGB colour model
• First introduced in the 1800s, now used in displays
• Additive mixing property of light: red, green, blue lights are summed to form
the final colour
• RGB colour model:
◦ One axis per colour channel
◦ Range from 0 (no light) to 255 (full intensity) along each axis
◦ A colour is a point in this space and is represented as a vector: (𝐼_R, 𝐼_G, 𝐼_B)
Credits: [Link]
RGB colour model
• Additive colour mixing: RGB describes what kind of light needs to
be emitted to produce a given colour
• Light is added together to move from black to white
Colour components and common colour names:

R    G    B    Common colour name
0    0    0    Black
0    0    255  Blue
0    255  0    Green
0    255  255  Cyan
255  0    0    Red
255  0    255  Magenta
255  255  0    Yellow
255  255  255  White
255  127  0    Orange
Credits: [Link]
CMY(K) colour model
• Developed for printing
• Subtractive mixing property of ink: cyan, magenta, yellow (and black) inks
are combined to form the final colour. Each ink subtracts a portion of the light
that would otherwise be reflected from a white background
• RGB colours are subtracted from white light (W):
W − R = G + B = C;  W − G = R + B = M;  W − B = R + G = Y
• Black (K) helps to create additional colours (mixing C, M and Y alone produces a
muddy dark colour instead of true black)
Credits: [Link]
[Link]
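With channel values scaled to [0, 1], the subtractive relationship above reduces to complementation, which can be sketched directly (rgb_to_cmy is an illustrative helper, not a library function):

```python
import numpy as np

# With channel values scaled to [0, 1], CMY is the complement of RGB:
# C = 1 - R, M = 1 - G, Y = 1 - B (rgb_to_cmy is an illustrative helper)
def rgb_to_cmy(rgb):
    return 1.0 - np.asarray(rgb, dtype=float)

red_cmy = rgb_to_cmy([1.0, 0.0, 0.0])    # no cyan; magenta + yellow make red
white_cmy = rgb_to_cmy([1.0, 1.0, 1.0])  # white paper needs no ink at all
```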
HSV colour model
• More intuitive and more perceptually relevant than RGB
• Cylindrical coordinate system with
◦ Hue (H): discernible colour based on the dominant wavelength
◦ Saturation (S): vividness of the colour (zero saturation means a greyscale
colour in the centre of the cylinder)
◦ Value (V): brightness
• Example: a bright red colour has a red hue, full saturation, and high value
“How to halve the saturation in RGB space?”
Credits: [Link]
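As a sketch of the question above for a single colour, the standard-library colorsys module can do the round trip RGB → HSV → RGB (for whole images, skimage's rgb2hsv/hsv2rgb would be the natural choice); halve_saturation is an illustrative helper:

```python
import colorsys  # standard library; works on single (r, g, b) triples in [0, 1]

# One answer to the question above: leave RGB space, halve S in HSV, go back.
# halve_saturation is an illustrative helper, not a library function.
def halve_saturation(r, g, b):
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return colorsys.hsv_to_rgb(h, s / 2, v)

bright_red = (1.0, 0.0, 0.0)  # red hue, full saturation, high value
paler_red = halve_saturation(*bright_red)  # (1.0, 0.5, 0.5)
```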
LAB colour model
• RGB and HSV are not perceptually uniform:
the distance between colours in the colour space does not match human
visual perception of colour difference
• CIE LAB attempts to be perceptually uniform
• It is based on opponent process theory: observation that for humans,
retinal colour stimuli are translated into distinctions between
◦ blue versus yellow
◦ red versus green
◦ black (dark) versus white (light)
• Larger gamut than both RGB and CMYK
Other common image types
• RGBD
In addition to a colour image (RGB), acquires a depth image (D), which
represents the distance of each pixel from the camera
𝐼(𝑥, 𝑦): ℤ² → ℤ⁴
• Volumetric images
◦ Image data stored as voxels (i.e. volume elements: small cubes or 3D pixels)
◦ Example: computed tomography (CT)
𝐼(𝑥, 𝑦, 𝑧): ℤ³ → ℤ
• Video
A sequence of images over time
𝐼(𝑥, 𝑦, 𝑡): ℤ³ → ℤ³
Image file formats
• BitMap (BMP): uncompressed, therefore large and lossless. If 24-bit colour
format, then:
◦ a 12MP image requires 36MB of storage
◦ a 30 fps video generates 30 × 36MB = 1.08GB per second
◦ a 2 hour video (image only) requires 7776GB (7.78TB) of storage
• JPEG: Joint Photographic Experts Group is a lossy compression method
• TIFF: Tagged Image File Format is a flexible format (both lossy and lossless)
• GIF: Graphics Interchange Format is usually limited to an 8-bit palette (256
colours). The GIF format is most suitable for storing graphics with few
colours, such as simple diagrams, shapes, logos and cartoon style images
• PNG: the PNG (Portable Network Graphics) file format was created as a free,
open-source alternative to GIF
Images in Python
• Some of the most common Python packages for handling images are
scikit-image, OpenCV (with its Python bindings), Pillow, and SciPy (with its
ndimage submodule)
• For most common tasks, we will use scikit-image (skimage), which is simple
to use and has a large collection of image processing algorithms
• For plotting images, we will instead use matplotlib
from [Link] import imread
import [Link] as plt
img = imread('[Link]')
[Link](img)
[Link]()
Credits: "Surinamese peppers" by Daveness_98 licensed under CC BY 2.0
Images in Python
What is the img variable? It’s a numpy ndarray object with:
• Shape:
print([Link])
>> (777, 1024, 3)
This means the image has a height of 777 (number of rows of the ndarray), width
of 1024 (number of columns), and 3 colour channels 𝐼𝑅 , 𝐼𝐺 , 𝐼𝐵
• Data type:
print([Link])
>> uint8
This means that each value in each colour channel is an unsigned integer with
8 bits of precision. Therefore there are 2⁸ = 256 possible values, in the range
[0, 255], for each channel
Credits: "Surinamese peppers" by Daveness_98 licensed under CC BY 2.0
Image data types in skimage
• Images are typically uint8 (range [0, 255]) when loaded
• Performing operations on uint8 images can generate unwanted results:
# Trying to increase red channel
# brightness
img_r = img[:, :, 0]
img_r_mod = img_r+100
fig, ax = [Link](1, 2)
ax[0].imshow(img_r, cmap='gray')
ax[1].imshow(img_r_mod, cmap='gray')
[Link]()
Values greater than 255 overflow: uint8 addition wraps around modulo 256
• In addition, some image processing operations (e.g. filtering) provide
accurate results only when using floating point data types
• Thus it is often required to convert images to float (range [0, 1] or [-1, 1])
• skimage has functions for data type conversions, e.g. img_as_float and
img_as_ubyte. These functions expect the input to be in the correct range
• Never use numpy’s astype!
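What those conversion helpers do can be sketched in plain numpy: work in float, clip to the valid range, and only then return to uint8. This manual round trip is roughly the bookkeeping img_as_float/img_as_ubyte handle for you, which is why plain astype (which does no rescaling or clipping) is discouraged:

```python
import numpy as np

# A uint8 image whose values we want to brighten by 100 without overflow
img = np.array([[200, 250],
                [10, 100]], dtype=np.uint8)

img_f = img / 255.0                                    # uint8 -> float in [0, 1]
img_f_mod = np.clip(img_f + 100 / 255.0, 0.0, 1.0)     # brighten, clip overflow
img_back = (img_f_mod * 255).round().astype(np.uint8)  # safe: range is known
```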
Image data types in skimage
Additional notes:
• The float data type is not restricted to the [-1, 1] range, but you should make sure
your images stay within it to avoid issues in later conversions
• Some skimage functions may return different data types for convenience.
You can use the conversion functions on the output if needed
• Some functions (e.g. filtering) can generate negative values, and that is
normal. However if you want to reconvert to an unsigned type you should first
rescale to the [0, 1] range, otherwise negative values will be clipped to 0
• Rescaling of intensities can be performed using the
exposure.rescale_intensity function
• Have a look at [Link]/docs/stable/user_guide/data_types.html for more details on these
aspects
Image plotting with matplotlib
The main plotting function that we will use is [Link]:
• For one-channel (height, width) inputs, the data is rescaled to the [0, 1]
range and then mapped to a chosen colormap
• For RGB (height, width, channels) inputs, the data has to be either:
◦ Float type in the [0, 1] range
◦ uint8 type ([0, 255] range)
• Out of range RGB values are clipped:
# Note: syntax not recommended!
img_r = img_as_float(img[:, :, 0])
img_r_mod = img_r+100
fig, ax = [Link](1, 2)
ax[0].imshow(img_r, cmap='gray')
ax[1].imshow(img_r_mod, cmap='gray')
[Link]()
One-channel images are rescaled before plotting
• [Link]
• [Link]
Colour space conversions
The color submodule provides many functions for colour space conversions
• rgb2gray converts from RGB to grayscale, reducing the channels from 3 to 1
and converting the data type from uint8 to float ([0, 1] range)
from [Link] import imread
from [Link] import rgb2gray
import [Link] as plt
img = imread('[Link]')
img_gray = rgb2gray(img)
fig, ax = [Link](1, 2)
ax[0].imshow(img)
ax[1].imshow(img_gray, cmap='gray')
[Link]()
Conversion to grayscale
print(img_gray.dtype)
>> float64
There are functions for the most common conversions:
• rgb2hsv and hsv2rgb
• rgb2lab and lab2rgb
Imperfections in images
• Low resolution
• Noise
• Bloom (i.e. light bleeding onto a darker background)
Imperfections in images
• Motion blur
• Poor contrast
• Compression artefacts
• Lens distortion
Digital image processing
• Digital image processing is the use of computer algorithms to transform an
image, sometimes trying to fix imperfections
• Example transformations:
◦ Brightness and colour transformations
◦ Geometric transformations
◦ Filtering (next lecture)
Brightness transformations
• Brightness transformations are position-independent and take the form
𝐽 = 𝑓(𝐼)
where
• 𝐼(𝑥, 𝑦) is the intensity (or colour channel value) of the original image
• 𝐽(𝑥, 𝑦) is the intensity (or colour channel value) of the transformed image
• 𝑓 describes the transformation between them (e.g. 𝐽 = 𝐼 + 100)
• The function 𝑓 is independent of position in the image (therefore the (𝑥, 𝑦)
arguments are dropped in the equation above): the result of 𝑓 depends on
the intensity of each pixel and not on its position
• Additional notes:
◦ Care must be taken to ensure data types and their ranges are respected
◦ Most transformations will be introduced for grayscale images. For colour
images, either apply the transformation channel-by-channel or convert to
HSV and apply to V channel
Negative
The negative of a 24-bit colour image can be formed simply as
𝐽 = 255 − 𝐼
[imports]
img = imread('[Link]')
img_neg = 255 - img
fig, ax = [Link](1, 2)
ax[0].imshow(img)
ax[1].imshow(img_neg)
[Link]()
Original image Negative
Tinting
Tinting (and colour balancing) applies an adjustment to the colours, normally
through a multiplication
𝐼_R′ = 𝑠_R · 𝐼_R
𝐼_G′ = 𝑠_G · 𝐼_G
𝐼_B′ = 𝑠_B · 𝐼_B
where 𝑠_R, 𝑠_G, and 𝑠_B scale the red, green, and blue colour of each pixel
Tinting
Example: increasing redness of an image
𝑠𝑅 = 2, 𝑠𝐺 = 1, and 𝑠𝐵 = 1:
[imports]
s_r, s_g, s_b = 2, 1, 1
# Load and convert to float
img = img_as_float(imread('[Link]'))
img_tint = np.empty_like(img)
img_r = img[:, :, 0]
img_g = img[:, :, 1]
img_b = img[:, :, 2]
img_tint[:, :, 0] = s_r * img_r
img_tint[:, :, 1] = s_g * img_g
img_tint[:, :, 2] = s_b * img_b
img_tint[img_tint > 1] = 1  # Clip values > 1
fig, ax = [Link](2, 1)
ax[0].imshow(img)
ax[1].imshow(img_tint)
[Link]()
[Figures: original image vs transformed image]
Histogram
• A histogram gives the count of pixel intensities in an image
• The full available range is divided into bins for easier interpretation
• Can be computed both for each colour channel or grayscale version
• It can be computed with matplotlib’s hist function (remember to flatten
your input!) or with numpy’s histogram function
• Note: the dynamic range of an image is easily defined using the histogram as
the difference between the min and max intensity values in an image
[Figure: image and its histogram; min 𝐼 = 103 and max 𝐼 = 210 delimit the dynamic range]
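A minimal numpy version of the histogram computation described above, on a tiny made-up image:

```python
import numpy as np

# Histogram of a tiny image: one bin per possible uint8 intensity
img = np.array([[103, 160, 160],
                [210, 160, 103]], dtype=np.uint8)

counts, bin_edges = np.histogram(img.ravel(), bins=256, range=(0, 256))
dynamic_range = int(img.max()) - int(img.min())  # 210 - 103 = 107
```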
Histogram
• A histogram can be normalised using the total number of pixels
• The result is a probability density function (pdf) representing the probability
that a random pixel has a particular intensity
• [Link] with density = True
• In the example in this slide, we could say that a pixel has a higher probability
of having an intensity of 160 than an intensity of 200
Contrast stretching
• If most of the intensities are clustered in the [100, 200] range, the image looks
“washed out” (lack of darker and brighter values)
• Contrast stretching is a linear transformation that aims at extending the
dynamic range of the image
• We can transform the values using a function
𝐽 = 𝛼𝐼 + 𝛽
with 𝛼 and 𝛽 user-defined parameters
• skimage function: exposure.rescale_intensity (where instead of 𝛼 and 𝛽, the
user defines the intensity values to be linearly stretched)
[Figures: original image vs after contrast stretching]
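A minimal numpy sketch of the linear stretch (stretch is an illustrative helper; exposure.rescale_intensity is the skimage equivalent). Here the observed [min, max] range is mapped onto the full [0, 255] range, i.e. 𝛼 = 255/(max − min) and 𝛽 = −𝛼·min:

```python
import numpy as np

# Linear contrast stretching: map the observed [min, max] onto [0, 255]
# (J = alpha * I + beta with alpha = 255/(max - min), beta = -alpha * min).
def stretch(img):
    lo, hi = float(img.min()), float(img.max())
    out = (img - lo) * 255.0 / (hi - lo)  # rescale first...
    return out.astype(np.uint8)           # ...then it is safe to cast

img = np.array([[100, 150],
                [175, 200]], dtype=np.uint8)  # "washed out": range [100, 200]
stretched = stretch(img)                      # now spans the full [0, 255]
```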
Gamma correction
• Gamma correction applies a non-linear transformation of pixel values
• It tries to take advantage of the non-linear manner in which humans perceive
light (greater sensitivity to differences between darker tones than lighter ones)
• Non-linear: relative distances between pixel values can both increase and
decrease for different ranges
• The mathematical form is
𝐽 = 𝐴·𝐼^𝛾
where 𝛾 is a parameter and usually 𝐴 = 255^(1−𝛾) for an image with range [0, 255]
• skimage function: exposure.adjust_gamma
[Figures: original image vs after gamma correction]
Credits: [Link]
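A sketch of the gamma mapping on a float image in [0, 1] (so 𝐴 = 1); exposure.adjust_gamma implements the same idea:

```python
import numpy as np

# Gamma correction J = A * I**gamma on a float image in [0, 1], so A = 1.
# gamma < 1 brightens darker tones; gamma > 1 darkens them.
img = np.array([[0.0, 0.25],
                [0.5, 1.0]])

brighter = img ** 0.5  # gamma = 0.5
darker = img ** 2.0    # gamma = 2.0
```

Note that 0 and 1 are fixed points of the mapping: only the tones in between move.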
Cumulative distribution function
• The cumulative distribution function (cdf) gives, for a given intensity value,
the percentage of pixels in the image that have that value or a lower one
• It can be easily computed from the normalised histogram (pdf) by adding, for
each bin, the values of all previous bins (a cumulative sum)
• [Link] with density = True and cumulative = True
[Figure: cdf]
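The pdf-to-cdf step is a cumulative sum, e.g. with numpy on a tiny 4-intensity image:

```python
import numpy as np

# pdf -> cdf on a tiny 4-intensity image: the cdf is just a running sum
img = np.array([[0, 1, 1, 3]], dtype=np.uint8)

counts, _ = np.histogram(img.ravel(), bins=4, range=(0, 4))
pdf = counts / counts.sum()  # normalised histogram
cdf = np.cumsum(pdf)         # P(intensity <= value); ends at 1.0
```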
Histogram equalization
• Histogram equalisation applies a non-linear transformation that tries to
uniformly distribute the pixel values in the dynamic range
• Pixels with different intensities can be assigned to the same bin as a
consequence of this transformation
• Histogram equalisation leverages the information in the cdf. The
output’s cdf will be roughly a straight line
• skimage functions:
◦ exposure.equalize_hist
◦ exposure.equalize_adapthist
• [Link]ns/26818568/whats-the-difference-between-histeq-and-adapthisteq
[Figures: original image vs after histogram equalisation]
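A minimal numpy sketch of the idea (equalise is an illustrative helper; the skimage functions above are the production versions): each pixel is mapped through the image's own cdf:

```python
import numpy as np

# Minimal histogram equalisation sketch: map each pixel through the image's
# own cdf so that the output cdf becomes roughly a straight line.
def equalise(img):
    counts, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    cdf = np.cumsum(counts) / img.size        # cdf in [0, 1]
    return (cdf[img] * 255).astype(np.uint8)  # look up each pixel's cdf value

img = np.array([[100, 100],
                [150, 200]], dtype=np.uint8)
eq = equalise(img)
```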
Geometric transformations
• Geometric transformations change the spatial position of pixels in the
image
• Geometric transformations (generically called image warpings) have a variety
of practical uses, including
◦ Registration: aligning different (but similar) images
◦ Removing distortion
◦ Simplifying further processing
[Figures: distorted image vs corrected image]
Geometric transformations
• The positions of pixels in the image are transformed
• Mathematically, this is expressed as
𝒙′ = 𝑇(𝒙)
where:
◦ 𝒙 = (𝑥, 𝑦) is the position of a point in the distorted image 𝐼
◦ 𝒙′ = (𝑥′, 𝑦′) is the position of a point in the corrected image 𝐽
◦ 𝑇 𝒙 is a mapping function
• The easiest way to implement an image warping from 𝐼 to 𝐽 is as follows:
For every pixel position 𝒙′ in the corrected image:
◦ Using 𝑇 −1 , determine 𝒙 (i.e. where 𝒙′ came from in the distorted image)
◦ Interpolate a value from 𝐼(𝒙) to produce 𝐽(𝒙′)
Geometric transformations
• You may notice this is applied somewhat backwards: rather than using a
(forward) mapping 𝑇 to transform pixels from the distorted image to the
corrected image, we use the (inverse) transform 𝑇 −1
𝑇 (forward mapping) may result in gaps
𝑇⁻¹ (inverse mapping) ensures no gaps
Interpolation
• This ensures that all the pixels in the corrected image will be filled
• However, it’s often necessary to interpolate pixels from the distorted image
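The backward-mapping loop can be sketched for the simplest case, a pure translation with nearest-neighbour sampling (warp_translate is an illustrative helper, not skimage's warp):

```python
import numpy as np

# Backward mapping sketch: for every output pixel x', use the inverse
# transform to find its source position x, then sample the input image.
# The transform here is a pure translation by (tx, ty); "interpolation"
# is nearest-neighbour, and out-of-bounds pixels get a fill value.
def warp_translate(I, tx, ty, fill=-1):
    h, w = I.shape
    J = np.full((h, w), fill, dtype=I.dtype)
    for yp in range(h):
        for xp in range(w):
            x, y = xp - tx, yp - ty  # inverse mapping: T^-1(x') = x' - t
            if 0 <= x < w and 0 <= y < h:
                J[yp, xp] = I[y, x]
    return J

I = np.arange(9).reshape(3, 3)
J = warp_translate(I, 1, 0)  # shift the image one pixel to the right
```

Because every output pixel is visited exactly once, the result has no gaps, exactly as argued above.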
Affine transformation
• Affine transformations are transformations that preserve (among others):
◦ Collinearity (i.e. aligned points will remain aligned)
◦ Parallelism (i.e. parallel lines will remain parallel)
• An affine transformation can be fully expressed through a matrix product:
𝒙′ = 𝐴𝒙

[x′]   [a11 a12 a13]   [x]   [a11·x + a12·y + a13]
[y′] = [a21 a22 a23] · [y] = [a21·x + a22·y + a23]
[1 ]   [ 0   0   1 ]   [1]   [         1         ]
• In skimage, you can define an affine transformation using
[Link], either:
◦ Passing the matrix parameters
A = [Link]([[a11, a12, a13], [a21, a22, a23], [0, 0, 1]])
tform = [Link](A)
◦ Passing the explicit transformation parameters (see next slides)
• To apply the transformation to an image, you can use [Link]
Special cases: translation
• Translation
[x′]   [1 0 tx]   [x]   [x + tx]
[y′] = [0 1 ty] · [y] = [y + ty]
[1 ]   [0 0 1 ]   [1]   [  1   ]
tform = [Link](translation=(tx, ty))
from [Link] import imread
from skimage import transform
import [Link] as plt
img = imread('[Link]')
# Define affine transformation
tform = [Link](translation=(100, 0))
# Apply to image (note: inverse transformation!)
img_transformed = [Link](img, [Link])
fig, ax = [Link](2, 1)
ax[0].imshow(img)
ax[1].imshow(img_transformed)
[Link]()
Special cases: rotation
• Rotation
[x′]   [cos 𝜃  −sin 𝜃  0]   [x]   [x·cos 𝜃 − y·sin 𝜃]
[y′] = [sin 𝜃   cos 𝜃  0] · [y] = [x·sin 𝜃 + y·cos 𝜃]
[1 ]   [  0       0    1]   [1]   [        1        ]
tform = [Link](rotation=theta)
from [Link] import imread
from skimage import transform
import [Link] as plt
from math import radians
img = imread('[Link]')
# Define affine transformation
tform = [Link](rotation=radians(30))
# Apply to image (note: inverse transformation!)
img_transformed = [Link](img, [Link])
fig, ax = [Link](2, 1)
ax[0].imshow(img)
ax[1].imshow(img_transformed)
[Link]()
Special cases: scaling
• Scaling
[x′]   [sx  0  0]   [x]   [sx·x]
[y′] = [ 0 sy  0] · [y] = [sy·y]
[1 ]   [ 0  0  1]   [1]   [  1 ]
tform = [Link](scale=(sx, sy))
from [Link] import imread
from skimage import transform
import [Link] as plt
img = imread('[Link]')
# Define affine transformation
tform = [Link](scale=(0.8, 0.5))
# Apply to image (note: inverse transformation!)
img_transformed = [Link](img, [Link])
fig, ax = [Link](2, 1)
ax[0].imshow(img)
ax[1].imshow(img_transformed)
[Link]()
Special cases: shear
• Shear (or skew)
[x′]   [1  −sin 𝜃  0]   [x]   [x − y·sin 𝜃]
[y′] = [0   cos 𝜃  0] · [y] = [  y·cos 𝜃  ]
[1 ]   [0     0    1]   [1]   [     1     ]
tform = [Link](shear=theta)
from [Link] import imread
from skimage import transform
import [Link] as plt
from math import radians
img = imread('[Link]')
# Define affine transformation
tform = [Link](shear=radians(30))
# Apply to image (note: inverse transformation!)
img_transformed = [Link](img, [Link])
fig, ax = [Link](2, 1)
ax[0].imshow(img)
ax[1].imshow(img_transformed)
[Link]()
Additional notes
Regarding transformations:
• You can always access the affine matrix of an AffineTransform object by
using [Link]
• Different tform objects can be combined sequentially: see
[Link]/docs/stable/auto_examples/transform/plot_transform_types.html
• For rotation around the image centre and simple rescaling and resizing
operations, there are dedicated functions in the transform submodule called
rotate, rescale, resize. More details at
[Link]/docs/stable/auto_examples/transform/plot_rescale.html
Regarding [Link]:
• Interpolation is usually needed, the type of which can be specified
• How to deal with points outside the boundaries can also be specified
• In line with what was said in the previous slides, in transformations the source
image is considered the output, and the destination the input. As a
consequence, tform must be inverted before warping
Exercise
Test: manually transform image in order to have well-aligned digits
from [Link] import imread
from skimage import transform
import [Link] as plt
from math import radians
img = imread('[Link]')
# Rotate around centre
img_tran = [Link](img, -15)
# Define affine transformation (translation + shear)
tform = [Link](translation=(-30, 0),
shear=radians(-17))
# Apply to image
img_tran2 = [Link](img_tran, [Link])
fig, ax = [Link](3, 1)
ax[0].imshow(img)
ax[1].imshow(img_tran)
ax[2].imshow(img_tran2)
[Link]()
Common CV tasks
Image matching/alignment/registration
• Identify the transformation which aligns two or more images
• Used for multi-modal comparison, image stitching, etc.
M. Brown. Automatic panoramic image stitching using invariant features, IJCV 2007
Image matching
Test: find the affine transformation that aligns two images
• The goal is to determine the six coefficients in the 𝐴 matrix
• This can be achieved by finding at least 3 correspondences (e.g. plate
corners) in the two images
[Figure: correspondence points 𝒑1–𝒑3 and 𝒒1–𝒒3 marked on the two images]
Let’s assume that we have manually selected the following correspondences:
𝒑1 = (18, 47)ᵀ    𝒒1 = (48, 50)ᵀ
𝒑2 = (15, 100)ᵀ   𝒒2 = (48, 100)ᵀ
𝒑3 = (178, 6)ᵀ    𝒒3 = (212, 50)ᵀ
Image matching
For each pair of corresponding points 𝒑–𝒒 we must have

[x′]   [a11 a12 a13]   [x]   [a11·x + a12·y + a13]
[y′] = [a21 a22 a23] · [y] = [a21·x + a22·y + a23]
[1 ]   [ 0   0   1 ]   [1]   [         1         ]
Then for three correspondences we can write
[x1′]   [x1 y1 1  0  0  0]   [a11]
[y1′]   [ 0  0 0 x1 y1  1]   [a12]
[x2′] = [x2 y2 1  0  0  0] · [a13]
[y2′]   [ 0  0 0 x2 y2  1]   [a21]
[x3′]   [x3 y3 1  0  0  0]   [a22]
[y3′]   [ 0  0 0 x3 y3  1]   [a23]
Or in matrix form
𝒒 = 𝑀𝒂
• The 6 coefficients of the affine transformation can be found with 𝒂 = 𝑀⁻¹𝒒
• If we had more than 3 correspondences, we could have solved this using
least squares
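Solving the 6×6 system for the correspondences given above can be sketched with numpy (with more than 3 correspondences, np.linalg.lstsq would replace solve):

```python
import numpy as np

# Correspondences from the slides: p_i in the source image, q_i in the target
P = [(18, 47), (15, 100), (178, 6)]
Q = [(48, 50), (48, 100), (212, 50)]

# Build the 6x6 system q = M a, two rows per correspondence
M = np.zeros((6, 6))
q = np.zeros(6)
for i, ((x, y), (xp, yp)) in enumerate(zip(P, Q)):
    M[2 * i] = [x, y, 1, 0, 0, 0]
    M[2 * i + 1] = [0, 0, 0, x, y, 1]
    q[2 * i], q[2 * i + 1] = xp, yp

a = np.linalg.solve(M, q)  # a = M^-1 q; use np.linalg.lstsq for > 3 pairs

# Assemble the full 3x3 affine matrix
A = np.array([[a[0], a[1], a[2]],
              [a[3], a[4], a[5]],
              [0.0, 0.0, 1.0]])
```

As a sanity check, applying A to each 𝒑 in homogeneous coordinates reproduces the corresponding 𝒒 exactly.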
Image matching
skimage method: [Link]
[imports]
img = imread('[Link]')
# Define arrays of point coordinates
P_points = [Link]([[18, 47], [15, 100], [178, 6]])
Q_points = [Link]([[48, 50], [48, 100], [212, 50]])
# Create empty tform object
tform = [Link]()
# Estimate tform parameters using point correspondences
[Link](Q_points, P_points)
# Apply to image
img_aligned = [Link](img, tform)
Projective transformation
• Images normally acquired by photographic cameras are formed by
perspective projection
• A rectangular surface not parallel to the image plane will be projected into a
trapezoid
• If we want to transform it into a rectangle, an affine transformation will not be
enough
• Instead, we must use a projective transformation:
[x′]   [p11 p12 p13]   [x]
[y′] = [p21 p22 p23] · [y]
[w′]   [p31 p32  1 ]   [1]
where the final pixel coordinates are obtained by dividing by 𝑤′: (𝑥′/𝑤′, 𝑦′/𝑤′)
• To perform image matching with a projective transformation, at least 4
correspondences are needed
• This is due to the eight unknowns 𝑝_ij, which correspond to the degrees of
freedom (DOF) of this type of projective transformation
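Estimating the eight coefficients can be sketched the same way as in the affine case, after expanding the division by 𝑤′ into linear equations. The four correspondences below are made up for illustration (a trapezoid mapped onto the unit square):

```python
import numpy as np

# Hypothetical correspondences: trapezoid corners (src) to unit square (dst)
src = [(0.1, 0.0), (0.9, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Each pair (x, y) -> (x', y') gives two linear equations in the eight p_ij:
# p11 x + p12 y + p13 - x'(p31 x + p32 y) = x'   (and similarly for y')
M = np.zeros((8, 8))
b = np.zeros(8)
for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
    M[2 * i] = [x, y, 1, 0, 0, 0, -xp * x, -xp * y]
    M[2 * i + 1] = [0, 0, 0, x, y, 1, -yp * x, -yp * y]
    b[2 * i], b[2 * i + 1] = xp, yp

p = np.linalg.solve(M, b)
H = np.array([[p[0], p[1], p[2]],
              [p[3], p[4], p[5]],
              [p[6], p[7], 1.0]])

# Applying H: the homogeneous result must be divided by w'
v = H @ np.array([0.1, 0.0, 1.0])
mapped = v[:2] / v[2]  # first trapezoid corner lands on (0, 0)
```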
Projective transformation
Test: transform the pinball surface into a square
Original Affine Projective
Hierarchy of 2D transformations
Transformation (DOF) | Matrix | Properties preserved
Projective (8 DOF) | [p11 p12 p13; p21 p22 p23; p31 p32 1] | Collinearity
Affine (6 DOF) | [a11 a12 a13; a21 a22 a23; 0 0 1] | + Parallelism of lines
Rigid (3 DOF) | [cos 𝜃 −sin 𝜃 t_x; sin 𝜃 cos 𝜃 t_y; 0 0 1] | + Lengths, angles, areas
with DOF meaning “degrees of freedom”
Credits: Marc Pollefeys
Non-linear transformations
• Non-linear transformations cannot be represented as a matrix multiplication
• For them, we need to use the general function that transforms pixel locations
independently:
𝒙′ = 𝑇(𝒙)
• Examples of common non-linear transformations include:
Radial lens distortion Non-rigid transformation
Overview of next week’s lecture
• Image filtering
• Linear filtering:
◦ Convolution
◦ Common filters: moving average, Gaussian, sharpening
• Non-linear filtering: median filter
• Edge detection
◦ Gradient-based
◦ Laplacian-based
◦ Non-maximum suppression