VARIOUS RECENT EDGE
DETECTION METHODS
HOLISTICALLY-NESTED EDGE DETECTION WITH
OPENCV AND DEEP LEARNING
• Holistically-Nested Edge Detection (HED) is a Convolutional Neural Network (CNN)-based
technique designed to detect edges in images.
• Unlike classical methods (Sobel, Canny, etc.), which use filters and
gradients, HED learns edge features directly from data.
KEY IDEAS IN HED
1. Holistic Learning
⚬ Traditional edge detectors look at local pixel changes.
⚬ HED takes the whole image (holistic) into account using deep CNNs to
understand both local and global context.
2. Multi-Scale Supervision
⚬ Edges can appear at different scales (thin hairline edges vs. object boundaries).
⚬ HED uses side outputs from multiple CNN layers (shallow → fine edges, deep
→ coarse edges).
⚬ Each side output is supervised (trained) with the ground truth, so the network
learns multi-scale edge features.
3. Fused Outputs
⚬ The outputs from different scales are combined (fused) to get the final edge
map, which is sharper and more complete.
ADVANTAGES OF HED
1. Produces cleaner and more accurate edges than Sobel, Canny,
or even structured forests.
2. Multi-scale detection → captures both fine details and large
object boundaries.
3. Works well in complex natural images.
PHASE STRETCH TRANSFORM (PST)
• Introduced by UCLA researchers (2016).
• It is a physics-inspired algorithm for detecting edges and
features in images.
• Inspired by the way light spreads out (disperses) when
passing through a medium.
HOW IT WORKS
• Phase kernel: PST applies a phase function (like a lens
spreading light) to the image in the frequency domain.
• Transformation: This phase spreading enhances subtle
changes in intensity (edges, corners, textures).
• Edge detection: The phase output highlights edges and
features that might be missed by traditional gradient-based
methods.
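The frequency-domain phase-kernel idea can be illustrated with a simplified NumPy sketch. This is not the reference UCLA implementation: the arctan-shaped kernel, the parameter names (strength, warp), and the default values are simplified assumptions for illustration only.

```python
import numpy as np

def pst_edges(image, strength=0.5, warp=12.0, threshold=0.3):
    """Simplified Phase Stretch Transform sketch (illustrative only).

    Applies a phase kernel in the frequency domain, then reads
    edges off the phase of the inverse transform. Assumes a
    strictly positive image (add an offset for images with zeros).
    """
    img = image.astype(np.float64)
    h, w = img.shape
    # Frequency grid.
    u = np.fft.fftfreq(h)[:, None]
    v = np.fft.fftfreq(w)[None, :]
    rw = np.sqrt(u**2 + v**2) * warp
    # Arctan-shaped kernel: phase grows with frequency, which is
    # what "stretches" the phase of high-frequency content (edges).
    kernel = rw * np.arctan(rw) - 0.5 * np.log1p(rw**2)
    kernel = strength * kernel / max(kernel.max(), 1e-12)
    # Apply the phase kernel and take the output phase.
    out = np.fft.ifft2(np.fft.fft2(img) * np.exp(-1j * kernel))
    phase = np.angle(out)
    # Edges are where the output phase magnitude is large.
    return np.abs(phase) > threshold * np.abs(phase).max()
```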
ADVANTAGES
• Robust to noise → works well even in noisy/low-quality
images.
• Detects fine features → picks up very faint edges.
• Useful in biomedical imaging → e.g., detecting features in
medical scans.
LIMITATIONS
• Computationally heavy compared to simple operators (like
Sobel, Canny).
• Less common in general-purpose CV tasks (more research-
oriented).
BLOB DETECTION
• A blob is a connected region of pixels that share similar characteristics
and are significantly different from the background or neighboring
regions.
• The primary goal of blob detection is to isolate these distinct regions to
facilitate further analysis, such as object recognition, tracking, image
segmentation, or quality control.
BLOB DETECTION: METHODS AND TECHNIQUES
• Laplacian of Gaussian (LoG): This method involves convolving the image with
a Laplacian of Gaussian filter at different scales to identify regions that exhibit a
peak in the filter response, indicating a blob. Although the LoG approach can
identify blobs of different sizes well, it can return several results for a single blob.
• Difference of Gaussians (DoG): Similar to LoG, DoG approximates the LoG and
is computationally more efficient. It involves subtracting two Gaussian-smoothed
versions of the image at different scales.
• Determinant of Hessian (DoH): This method uses the determinant of the
Hessian matrix to identify regions with high curvature, which often correspond to
blobs. The local maxima in the Hessian's determinant can be used to locate blobs
of various sizes and forms.
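The multi-scale LoG approach can be sketched in a few lines with SciPy. The sigma values, the relative threshold, and the radius ≈ σ√2 rule below are illustrative choices, not fixed parts of the method:

```python
import numpy as np
from scipy import ndimage

def log_blobs(image, sigmas=(2, 4, 8), rel_threshold=0.5):
    """Multi-scale Laplacian-of-Gaussian blob sketch.

    Convolves with a scale-normalised LoG at each sigma and keeps
    strong local maxima of the response across space and scale.
    """
    img = image.astype(np.float64)
    # Scale-normalised, sign-flipped LoG: bright blobs -> positive peaks.
    stack = np.stack([-(s**2) * ndimage.gaussian_laplace(img, s)
                      for s in sigmas])
    peak = stack.max()
    # A blob centre is a maximum in its 3x3x3 space-scale
    # neighbourhood that is also strong enough overall.
    local_max = stack == ndimage.maximum_filter(stack, size=3)
    blobs = []
    for k, y, x in zip(*np.nonzero(local_max & (stack > rel_threshold * peak))):
        blobs.append((y, x, sigmas[k] * np.sqrt(2)))  # radius ~ sigma*sqrt(2)
    return blobs
```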
APPLICATIONS AND USE CASES OF BLOB DETECTION
• Object Detection: Finds objects in an image by detecting blob-like regions.
Helps separate objects from the background.
• Feature Extraction: Blobs give information about shape, size, and position.
These features can be used to classify objects or match them across images.
• Tracking: By detecting blobs in each frame of a video, we can track moving
objects. Helps estimate direction and speed, which is useful in applications
such as autonomous driving or robotics.
• Segmentation: Blob detection is used to segment an image into different
regions based on their texture or color. This segmentation is useful for
identifying regions of interest in an image and for separating them from the
background.
LIMITATIONS OF BLOB DETECTION
• Overlapping Objects: Difficulty in accurately segmenting and
identifying individual blobs when objects overlap or touch each other.
• Uneven Lighting or Complex Backgrounds: Performance can be
affected by inconsistent lighting conditions or complex backgrounds
that obscure clear boundaries between blobs and the background.
• Shape Assumption: Many traditional blob detection techniques
inherently assume circular or elliptical shapes for blobs, which may not
hold true for objects with irregular forms.
WHAT IS A CORNER?
• A corner is a point in an image where two edges meet. It is characterized
by intensity variations in multiple directions.
• In a flat region, pixel intensities do not change much.
• Along an edge, intensity changes strongly in one direction, but remains
constant in the perpendicular direction.
• At a corner, intensity changes are significant in both directions.
• This makes corners unique, stable, and easy to track across different
images.
The various corner detection methods are:
MORAVEC CORNER DETECTOR (1977)
One of the earliest corner detectors.
Idea:
• A small square window is placed on a pixel and shifted in different
directions (up, down, left, right, diagonals).
• If the intensity change is large in all directions → the pixel is likely a corner.
Limitations:
• Noise sensitive → false corners may appear.
• Not rotation invariant → the response depends on the chosen directions, so
rotated images give inconsistent results.
• Contribution: Established the basic principle of using intensity change in
multiple directions to detect corners.
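Moravec's shifted-window test can be written directly in NumPy. This is a deliberately unoptimized sketch; the window size and threshold are illustrative values:

```python
import numpy as np

def moravec(image, window=3, threshold=100.0):
    """Moravec corner score: for each pixel, the minimum
    sum-of-squared-differences (SSD) over the 8 window shifts."""
    img = image.astype(np.float64)
    h, w = img.shape
    r = window // 2
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    score = np.zeros((h, w))
    pad = r + 1
    for y in range(pad, h - pad):
        for x in range(pad, w - pad):
            win = img[y - r:y + r + 1, x - r:x + r + 1]
            # Worst-case (minimum) SSD over all shift directions:
            # large only if the patch changes in *every* direction.
            ssd = [((img[y + dy - r:y + dy + r + 1,
                         x + dx - r:x + dx + r + 1] - win) ** 2).sum()
                   for dy, dx in shifts]
            score[y, x] = min(ssd)
    return score > threshold
```

Note how the `min` over shifts encodes the key idea: an edge scores low (shifting along the edge changes nothing), while a corner scores high in all directions.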
SHI–TOMASI DETECTOR (1994)
Improvement over Harris: Instead of computing the Harris "corner
response function" R, Shi–Tomasi looks directly at the eigenvalues
(λ1, λ2) of the second-moment matrix M.
Key idea:
• For a point to be a corner, both eigenvalues should be large.
• Shi-Tomasi defines the corner quality as min(λ1,λ2).
• If the smaller eigenvalue is above a threshold → it’s a good corner.
Advantages:
• More mathematically stable than Harris.
• Corners are better for tracking, so Shi-Tomasi is widely used in optical flow
methods.
FAST (FEATURES FROM ACCELERATED SEGMENT TEST, 2006)
Goal: Very fast corner detection for real-time applications (e.g., robotics, AR, SLAM,
mobile).
How it works:
• Around each candidate pixel, examine a circle of 16 pixels (a Bresenham circle of radius 3).
• If n consecutive pixels on the circle (typically n = 9 or 12) are all significantly brighter or all
darker than the center → the point is a corner.
Advantages:
• Extremely fast (does not require eigenvalue computation).
• Widely used in ORB (Oriented FAST and Rotated BRIEF) for computer vision tasks.
Limitations:
• Susceptible to false positives if not combined with non-maximum suppression.
MODERN CNN-BASED CORNER DETECTORS (2017 → NOW)
• Instead of manually defining rules (like eigenvalues or intensity changes),
deep learning models learn to detect keypoints (corners + other features).
• Examples:
• SuperPoint (2018): Self-supervised deep network that learns both
corner/keypoint detection and descriptors.
• D2-Net (2019): Learns dense feature detection + description using CNNs.
• R2D2, LF-Net and others.
• Advantages:
• Much more robust to scale, rotation, illumination, and viewpoint changes.
• Work well in real-world, large-scale applications (autonomous driving,
mapping, AR).
• Drawbacks:
• Require training data and GPU computation.
• Slower compared to traditional methods like FAST/Harris.
APPLICATIONS OF CORNER DETECTION
1. Object Recognition
• Corners are unique points, so they can be matched between different images.
• Example: Recognizing the same building from different angles using its window corners.
2. Feature Matching in Images
• Used in algorithms like SIFT and SURF, which rely on corners as keypoints.
• Example: Matching two overlapping photos to create a panorama.
3. Tracking in Videos
• Corners are easy to track across frames since they remain stable.
• Example: Tracking a football or player’s movement in a sports video.
4. 3D Reconstruction
• By finding and matching corners from different views, computers can estimate depth and 3D structure.
• Example: Reconstructing a 3D model of a room using photos.
5. Robotics and Navigation
• Robots and drones use corner detection to understand their surroundings.
• Example: Detecting room corners for path planning and localization.
HARRIS CORNER DETECTION
• A small window is slid across the image. The Harris Corner
Detector finds the window positions that produce a large difference
in intensities when the window is shifted in both the x and y
directions. For each window position, a score R is calculated; a
threshold is applied to this score, and the positions that pass are
kept as strong corners.
STEP 1: MEASURING INTENSITY CHANGE
We want to know how much the image intensity changes if
we shift a small window by (u, v) around a pixel.
• E(u,v): change in intensity when the window is shifted.
• w(x,y): weighting function (e.g., a Gaussian window).
• I(x,y): intensity at pixel (x,y).
Corners show a large change in both the x and y directions.
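With these definitions, the intensity-change measure in the standard formulation is:

```latex
E(u,v) = \sum_{x,y} w(x,y)\,\bigl[\,I(x+u,\; y+v) - I(x,y)\,\bigr]^2
```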
STEP 2: TAYLOR EXPANSION
STEP 3: MATRIX FORM
STEP 4: EIGENVALUE ANALYSIS
STEP 5: HARRIS RESPONSE FUNCTION
STEP 6: DECISION RULE
STEP 7: FINAL STEPS
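The formulas behind Steps 2–7, reconstructed from the standard Harris–Stephens derivation (the constant k is the conventional range):

```latex
% Step 2 (Taylor expansion): I(x+u, y+v) \approx I + u I_x + v I_y, so
E(u,v) \approx \sum_{x,y} w(x,y)\,(u I_x + v I_y)^2

% Step 3 (matrix form):
E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix},
\qquad
M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

% Step 4 (eigenvalue analysis): \lambda_1, \lambda_2 of M
%   both small -> flat region;  one large -> edge;  both large -> corner

% Step 5 (Harris response):
R = \det(M) - k\,(\operatorname{tr} M)^2
  = \lambda_1 \lambda_2 - k\,(\lambda_1 + \lambda_2)^2,
\qquad k \approx 0.04\text{--}0.06

% Step 6 (decision rule): R \gg 0 corner;  R < 0 edge;  |R| \approx 0 flat
% Step 7 (final steps): threshold R, then apply non-maximum suppression
```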
SUMMARY ALGORITHM
ADVANTAGES OF HARRIS CORNER DETECTION
• Accuracy: It detects corners with high precision making it reliable for feature
extraction.
• Noise Resilience: The method works well even in images with noise, as it is
based on intensity gradients.
• Rotation Invariance: Corners detected using this method remain
consistent across rotated versions of the image (Harris is not,
however, scale invariant).
• Foundation for Advanced Algorithms: It is used as the base for more
advanced feature detection methods like SIFT and SURF.
CHALLENGES OF HARRIS CORNER DETECTION
• Sensitivity to Parameters: The algorithm’s performance depends on choosing
the correct values for parameters like neighborhood size and Sobel kernel size.
• Computationally Expensive: It can be slow on large images or videos due to its
computational complexity.
• Difficulty with Flat Surfaces: It struggles to detect corners in images with flat
or uniform surfaces that lack significant intensity changes.
• Not Ideal for Real-Time Applications: Given its computational cost, it's less
suitable for real-time systems where speed is critical.
THANK YOU