POTHOLE DETECTION USING CNN
AND ALEXNET
INTRODUCTION
Existing methods for detection and estimation of potholes
usually use sophisticated equipment and impose
computationally intensive tasks.
In this project, we present a new unsupervised vision-based
method, which does not require expensive equipment,
additional filtering, or a training phase.
Our method uses image processing and deep learning
techniques for the identification and rough
estimation of potholes.
AIM
To detect potholes using an image segmentation method
(the Spectral Clustering (SC) approach) and deep learning techniques.
OBJECTIVES
To study existing pothole segmentation methods in detail.
To implement the Spectral Clustering (SC) approach for
pothole segmentation.
To analyze the performance of pothole detection using the SC
approach.
To construct CNN and AlexNet networks (deep learning
techniques).
To compare the results of the SC approach and the CNN approach.
FLOW CHART FOR POTHOLE DETECTION BY
SPECTRAL CLUSTERING (SC) APPROACH
Original Pothole Image
-> Spectral Clustering
-> Binarization
-> Erosion
-> Region growing
-> Decision: % of black pixels < Threshold?
   Y: Pothole Identified
   N: Non-Pothole Identified
OTSU’S THRESHOLDING ALGORITHM
In our case, Otsu's thresholding algorithm is used to determine
the threshold value T, and this T value is used to form a binary
image.
The binary image is formed from the threshold value T and the
mean value of the pixels, using the equation below:
• $O = \left| T - \sum \sum \frac{p_{ij}}{xy} \right| \times 2$
A second binary image g(x, y) is formed with the threshold T'
using the equations below:
• $g(x, y) = \begin{cases} 1 & \text{if } ci(x, y) > \frac{T'}{4} \\ 0 & \text{if } ci(x, y) < \frac{T'}{4} \end{cases}$
  where $T' = \frac{T + 255}{2}$
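The thresholding steps above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: it assumes the image is a nested list of 8-bit grey levels, that ci(x, y) refers to that input image (the slides do not define it), that x and y in equation O are the image dimensions, and that pixels exactly equal to T'/4 map to 0. The function names are hypothetical.

```python
def otsu_threshold(histogram):
    """Return the grey level T that maximises between-class variance.

    `histogram` is a list of 256 pixel counts, one per grey level.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t, count in enumerate(histogram):
        weight_bg += count              # pixels with level <= t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg   # pixels with level > t
        if weight_fg == 0:
            break
        sum_bg += t * count
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def equation_o(image, t):
    """O = |T - mean pixel value| * 2, following the equation for O."""
    pixels = [p for row in image for p in row]
    return abs(t - sum(pixels) / len(pixels)) * 2


def binarize(image, t):
    """g(x, y) = 1 where the pixel exceeds T'/4, with T' = (T + 255) / 2."""
    cutoff = ((t + 255) / 2) / 4
    # Assumption: pixels equal to the cutoff are set to 0.
    return [[1 if p > cutoff else 0 for p in row] for row in image]


image = [[10, 10], [200, 220]]
histogram = [0] * 256
for row in image:
    for p in row:
        histogram[p] += 1
T = otsu_threshold(histogram)
binary = binarize(image, T)
```

Computing the threshold from a histogram keeps the search over T independent of image size, since only 256 candidate levels are examined.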
Removing Linear Shapes: After the segmentation process, small
shapes are removed with the following steps:
• Determine the connected components surrounding the
segmented pothole image.
• Compute the area of each component.
• Remove the objects surrounding the segmented image.
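The three steps above can be sketched as a connected-component filter. This is an illustrative version under stated assumptions: the image is a nested list of 0/1 values, components are 4-connected, and the removal criterion is a hypothetical `min_area` pixel count, which the slides do not specify.

```python
from collections import deque


def remove_small_components(binary, min_area):
    """Zero out 4-connected foreground components smaller than `min_area`."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [row[:] for row in binary]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The area of a component is its pixel count.
            if len(component) < min_area:
                for y, x in component:
                    cleaned[y][x] = 0
    return cleaned


mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
filtered = remove_small_components(mask, min_area=2)
```

Here the isolated pixel in the second row is removed, while the 2 x 2 block is kept.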
SPECTRAL CLUSTERING ALGORITHM
EROSION
Erosion is a nonlinear operation related to the shape of features in an
image.
Erosion removes pixels on object boundaries. It generally decreases
the size of objects and removes small shapes with the help of a
structuring element.
In this process, the binary image is probed with the structuring
element, which is itself a small shaped image.
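A minimal sketch of binary erosion follows. It assumes a square structuring element of odd size and treats positions outside the image as background; the slides do not state which structuring element was used, so both choices are assumptions.

```python
def erode(binary, size=3):
    """Binary erosion with a size x size square structuring element.

    A pixel stays 1 only if every pixel under the element is 1;
    positions outside the image are treated as 0.
    """
    rows, cols = len(binary), len(binary[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols
                and binary[r + dr][c + dc] == 1
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ))
    return out


# A 3 x 3 block of ones inside a 5 x 5 image.
square = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
eroded = erode(square)
```

Only the centre pixel of the block survives, which shows how erosion shrinks objects from their boundaries.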
REGION GROWING ALGORITHM
An initial set of small areas is iteratively merged according to
similarity constraints.
Start by choosing an arbitrary seed pixel and compare it with
neighboring pixels.
The region is grown from the seed pixel by adding neighboring
pixels that are similar, increasing the size of the region.
When the growth of one region stops, choose another seed
pixel that does not belong to any region and start again.
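The procedure above can be sketched as follows. The similarity constraint is assumed here to be an absolute intensity difference from the seed of at most `tolerance`, and seeds are taken in raster order rather than arbitrarily; both are illustrative choices that the slides do not specify.

```python
from collections import deque


def region_growing(image, tolerance):
    """Label each pixel with the index of the region it was grown into."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]   # 0 = not yet assigned
    current = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            # New seed: the first pixel that belongs to no region yet.
            current += 1
            seed_value = image[r][c]
            labels[r][c] = current
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == 0
                            and abs(image[ny][nx] - seed_value) <= tolerance):
                        labels[ny][nx] = current
                        queue.append((ny, nx))
    return labels


image = [[10, 12, 200],
         [11, 13, 210],
         [90, 205, 208]]
labels = region_growing(image, tolerance=10)
```

Comparing against the seed value, rather than against the most recently added pixel, keeps a region from drifting across a gradual intensity ramp.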
RESULTS BY SPECTRAL CLUSTERING
Figure panels, in order: Original image; Otsu's Thresholding
Algorithm; Binary image formed by equation O; Binary image formed
by equation g; Image formed by removing small shapes; Spectral
clustered images; Erosion images; Region growing images.
POTHOLE AND NON-POTHOLE
DATASET
In this project, we considered a dataset of 300 images
(150 pothole and 150 non-pothole images).
Pothole images were collected from various websites through
Google, whereas the non-pothole images were captured using a
mobile device.
Four non-pothole images were chosen randomly from the dataset
and their histograms were plotted.
Parameters           Plot-1     Plot-2     Plot-3     Plot-4
Mean                 192.2373   146.4286   140.4707   164.1689
Standard Deviation   14.5544    30.9306    43.4142    25.2304
PERFORMANCE ANALYSIS
Accuracy: To measure the accuracy, we calculate the True
Positive (TP), True Negative (TN), False Positive (FP) and False
Negative (FN) counts by comparing the detections with the ground truth values.
• TP: correctly detected as a pothole.
• TN: correctly detected as a non-pothole.
• FP: wrongly detected as a pothole.
• FN: wrongly detected as a non-pothole.
$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100$
Table 1.1: TP,TN,FP,FN values comparison at different thresholds.
Threshold TP TN FP FN
40% 107 150 0 43
50% 119 150 0 31
60% 130 140 10 20
Precision: It is the ratio of correctly detected potholes to
the total number of detected potholes.
$\text{Precision} = \frac{TP}{TP + FP} \times 100$
Recall or Sensitivity: It is the ratio of correctly detected
potholes to actual potholes.
$\text{Recall} = \frac{TP}{TP + FN} \times 100$
Specificity: It is the ratio of correctly detected
non-potholes to actual non-potholes.
$\text{Specificity} = \frac{TN}{TN + FP} \times 100$
True Negative Accuracy (Negative predictive value):
the ratio of correctly detected non-potholes
to the total number of detected non-potholes.
$\text{True Negative Accuracy} = \frac{TN}{TN + FN} \times 100$
Error Rate: the share of misclassified samples across both
classes, i.e., the positive class (potholes) and the negative
class (non-potholes).
$\text{Error Rate} = 100 - \text{Accuracy} = \frac{FP + FN}{TP + TN + FP + FN} \times 100$
False Alarm Rate (False Positive rate): the ratio of
non-potholes wrongly detected as potholes to the total number of
non-potholes.
$\text{False Alarm Rate} = \frac{FP}{TN + FP} \times 100$
Miss Rate (False Negative rate): the ratio of potholes wrongly
detected as non-potholes to the total number of potholes.
$\text{Miss Rate} = \frac{FN}{FN + TP} \times 100$
Youden's Index (YI): combines the specificity (TNR) and
sensitivity (TPR) parameters. For a classifier that performs at
least as well as chance, its value ranges from 0 to 1.
• 0 indicates poor classification and 1 indicates perfect
classification.
$YI = TPR + TNR - 1$
Matthews correlation coefficient (MCC): measures the
relationship between the predicted and observed classifications.
$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$
• +1 indicates perfect prediction; −1 indicates total disagreement
between the predicted and observed values.
F-measure: harmonic mean of recall and precision.
$\text{F-measure} = \frac{2TP}{2TP + FP + FN}$
Table 1.2: Evaluation parameters at different values of threshold.
Parameters                  40%     50%     60%
Accuracy(%)                 85      89      90
Precision(%)                100     100     92
True Negative Accuracy(%)   77      82      87
Recall(%)                   71      79      86
Specificity(%)              100     100     93
Error Rate(%)               15      11      10
False alarm rate(%)         0       0       6.6
Miss rate(%)                28      20      13
YI                          0.71    0.79    0.8
MCC                         0.74    0.81    0.82
F-measure                   0.83    0.88    0.89
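The evaluation parameters defined above can be recomputed from the counts in Table 1.1 with a small helper. This is a sketch for checking the arithmetic, not the project's code; the function and key names are hypothetical, and it assumes no denominator is zero.

```python
import math


def evaluate(tp, tn, fp, fn):
    """Compute the evaluation parameters from the four confusion counts."""
    total = tp + tn + fp + fn
    tpr = tp / (tp + fn)   # sensitivity / recall
    tnr = tn / (tn + fp)   # specificity
    return {
        "accuracy_pct": 100 * (tp + tn) / total,
        "precision_pct": 100 * tp / (tp + fp),
        "true_negative_accuracy_pct": 100 * tn / (tn + fn),
        "recall_pct": 100 * tpr,
        "specificity_pct": 100 * tnr,
        "error_rate_pct": 100 * (fp + fn) / total,
        "false_alarm_rate_pct": 100 * fp / (tn + fp),
        "miss_rate_pct": 100 * fn / (fn + tp),
        "youden_index": tpr + tnr - 1,
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "f_measure": 2 * tp / (2 * tp + fp + fn),
    }


# Rows of Table 1.1: threshold -> (TP, TN, FP, FN)
table_1_1 = {
    "40%": (107, 150, 0, 43),
    "50%": (119, 150, 0, 31),
    "60%": (130, 140, 10, 20),
}
results = {thr: evaluate(*counts) for thr, counts in table_1_1.items()}
```

Recomputed values may not match Table 1.2 digit for digit, for example because of how the reported figures were rounded.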
ROC CURVE
A Receiver Operating Characteristic (ROC) curve is a graphical plot
that illustrates the performance of a classification model at all
classification thresholds.
The ROC curve is created by plotting the True Positive Rate (TPR)
against the False Positive Rate (FPR) at various threshold
settings.
TPR defines how many correct positive results occur among all
positive samples available during the test.
$TPR = \frac{TP}{TP + FN}$
FPR defines how many incorrect positive results occur among
all negative samples available during the test.
$FPR = \frac{FP}{FP + TN}$
Table 1.3: TPR and FPR values comparison at different thresholds.
Threshold TPR FPR
40% 0.71333333333 0.00000000000
50% 0.79333333333 0.00000000000
60% 0.86666666667 0.06666666667
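The ROC points in Table 1.3 follow directly from the counts in Table 1.1. A short sketch, with hypothetical function and variable names, that derives one (FPR, TPR) point per threshold:

```python
def roc_point(tp, tn, fp, fn):
    """Return the (FPR, TPR) pair for one threshold setting."""
    return fp / (fp + tn), tp / (tp + fn)


# Counts from Table 1.1: threshold -> (TP, TN, FP, FN)
counts = {
    "40%": (107, 150, 0, 43),
    "50%": (119, 150, 0, 31),
    "60%": (130, 140, 10, 20),
}
roc_points = {thr: roc_point(*c) for thr, c in counts.items()}

for thr, (fpr, tpr) in roc_points.items():
    print(f"{thr}  TPR={tpr:.11f}  FPR={fpr:.11f}")
```

Plotting TPR against FPR for these three points gives the sampled ROC curve; with only three thresholds the curve is coarse.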
Figure 2: ROC curve plot at different values of threshold.
PRECISION RECALL CURVE
Figure 3: Precision Recall curve plot at different values of
threshold.
CONVOLUTIONAL NEURAL NETWORK
A typical CNN is a five-layer architecture built from
convolutional, max-pool, and fully connected layers.
• Convolutional layer: extracts the unique features from the
input image.
• Max pool layer: reduces the dimensionality.
• Fully Connected layer: classifies the images.
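To make the first two layer types concrete, here is a minimal plain-Python sketch of a single convolution followed by max pooling. It is illustrative only: the kernel values are hand-picked rather than learned, there is no padding, bias, or activation, and it is not the network used in the project.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation (what CNN libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Non-overlapping size x size max pooling; ragged edges are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# An image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1]]          # responds to dark-to-bright transitions
features = conv2d(image, edge_kernel)
pooled = max_pool(features)
```

The convolution responds only at the edge, and pooling halves each spatial dimension while keeping that response.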
Data augmentation: a technique for enlarging the dataset by
applying transformations such as rotation, translation, flipping and
cropping. Instead of collecting a new dataset of images,
the existing images are augmented.
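The four transformations named above can be sketched on a nested-list image. The function names, the 90-degree rotation angle, and the zero fill value are assumptions made for illustration; the slides do not give the augmentation settings that were used.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def translate_right(image, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left with `fill`."""
    return [[fill] * shift + row[:len(row) - shift] for row in image]


def crop(image, top, left, height, width):
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image):
    """Return the original image plus one variant per transformation."""
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        translate_right(image, 1),
        crop(image, 0, 0, len(image) - 1, len(image[0]) - 1),
    ]


sample = [[1, 2], [3, 4]]
variants = augment(sample)
```

Each source image yields five training samples here, which is how augmentation enlarges a dataset without new photographs.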
Figure 4: CNN architecture.
Figure 5: Layer 2 and Layer 6 in CNN.
Table 1.4: Confusion matrix of CNN.
Confusion matrix   CNN without Data augmentation   CNN with Data augmentation
                   Pothole     Non-Pothole         Pothole     Non-Pothole
Pothole            TN = 41     FP = 0              TN = 42     FP = 0
Non-Pothole        FN = 4      TP = 45             FN = 3      TP = 45
Table 1.5: Evaluation parameters of CNN.
Parameters                  Without data augmentation   With data augmentation
Accuracy(%)                 95                          96
Precision(%)                100                         100
True Negative Accuracy(%)   91                          93
Recall(%)                   91                          93
Error Rate(%)               4                           3
Specificity(%)              100                         100
False Alarm Rate(%)         0                           0
Miss Rate(%)                8                           6
YI                          0.9                         0.9
MCC                         0.97                        0.9
F-measure                   0.95                        0.93
ALEXNET
AlexNet is a deep neural network with eight learned layers
(five convolutional and three fully connected).
It is a pre-trained network, i.e., it has been trained on more than
a million images across a thousand categories.
Figure 6: Alexnet architecture.
Figure 7: Layer 2 and Layer 6 in Alexnet.
Table 1.6: Confusion matrix of Alexnet.
CONFUSION MATRIX   Pothole     Non-Pothole
Pothole            TN = 45     FP = 0
Non-Pothole        FN = 0      TP = 45
Table 1.7: Evaluation parameters of Alexnet.
Parameters                  Values
Accuracy(%)                 100
Precision(%)                100
True Negative Accuracy(%)   100
Recall(%)                   100
Error Rate(%)               0
Specificity(%)              100
False Alarm Rate(%)         0
Miss Rate(%)                0
YI                          1
MCC                         1
F-measure                   1
CONCLUSION
We presented approaches to detect potholes using the SC
approach, CNN and Alexnet. The evaluation parameters were
computed and compared.
Table 1.8: Analysis of Project.
Approaches Accuracy(%)
SC 84
CNN 96
Alexnet 100
WORK PLAN
PHASE 1:
S.No   Objectives                                                                Time Allocated
1      To study in detail the different pothole detection methods.               August
2      To apply the Spectral Clustering method for pothole detection.            September to October
3      To do the performance analysis in pothole detection using RGB images.     November to December
4      To implement the ROC Curve and Precision Recall Curve.                    January
PHASE 2:
S.No   Objectives                                                                Time Allocated
5      To construct CNN and Alexnet networks.                                    January to February
6      To compare the results using the SC approach and the CNN approach.        March to April
Article Published in ICCIP (International
Conference on Communication and Image
Processing)
• Srinidhi Gorityala, and Renuka Devi SM. "Pothole
Detection using CNN and AlexNet." 3648822 (2020).