Problem 02: Classification and Accuracy Assessment
A. Preprocessing:
• Select an optical image with zero cloud cover from the Google Earth Engine (GEE) Data Catalog.
• Clip it to a district or watershed boundary of your choice.
• Select the relevant optical bands and apply scaling if required.
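As an illustration of the scaling step, the sketch below converts raw digital numbers to surface reflectance with a linear scale and offset. The values shown (0.0000275 and -0.2) are the ones published for Landsat Collection 2 Level-2 surface reflectance and are used here only as an assumed example; check the catalog page of whichever dataset you choose, since other products use different factors or none at all. The function name is hypothetical.

```python
# Assumed example values: Landsat Collection 2 Level-2 surface reflectance.
# Other datasets use different factors; confirm them on the catalog page.
EXAMPLE_SCALE = 0.0000275
EXAMPLE_OFFSET = -0.2


def apply_scaling(dn, scale=EXAMPLE_SCALE, offset=EXAMPLE_OFFSET):
    """Convert a raw digital number to a physical value: dn * scale + offset."""
    return dn * scale + offset


def scale_pixel(pixel, scale=EXAMPLE_SCALE, offset=EXAMPLE_OFFSET):
    """Apply the same linear scaling to every band of one pixel (a dict)."""
    return {band: apply_scaling(dn, scale, offset) for band, dn in pixel.items()}
```

The same arithmetic applies per band to a whole image; only the scale and offset change between datasets.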
B. Unsupervised Classification:
• Generate an appropriate number of random sample points and extract the
corresponding pixel values from the selected bands.
• Train a K-means clustering model with at least 20 clusters.
• Assign the clusters to meaningful Land Use/Land Cover (LULC) classes, ensuring
a minimum of five classes.
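The cluster-to-class assignment amounts to a lookup table from cluster ids to class codes. The sketch below is illustrative only: the five class names and the particular mapping of 20 cluster ids are hypothetical, and in practice you decide them by inspecting each cluster against the image. In GEE the equivalent operation on a clustered image is typically done with `ee.Image.remap`.

```python
# Hypothetical LULC legend; choose classes that suit your study area.
CLASS_NAMES = {0: "Water", 1: "Built-up", 2: "Cropland", 3: "Forest", 4: "Barren"}

# Hypothetical lookup from 20 K-means cluster ids (0-19) to class codes.
CLUSTER_TO_CLASS = {
    0: 0, 1: 0,
    2: 1, 3: 1, 4: 1,
    5: 2, 6: 2, 7: 2, 8: 2, 9: 2,
    10: 3, 11: 3, 12: 3, 13: 3, 14: 3,
    15: 4, 16: 4, 17: 4,
    18: 2, 19: 3,
}


def remap_clusters(cluster_ids, lookup):
    """Replace each cluster id with its LULC class code."""
    missing = sorted(set(cluster_ids) - set(lookup))
    if missing:
        # Every cluster must be assigned before the map can be assessed.
        raise KeyError(f"clusters without a class: {missing}")
    return [lookup[c] for c in cluster_ids]
```

Raising an error on unassigned clusters avoids leaving pixels silently unlabeled in the final map.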
C. Create Training Samples:
• Manually digitize point geometries, making sure they correspond to the same
LULC classes identified from the unsupervised classification, with a minimum of
50 points per class.
• Assign labels to the digitized geometries and merge all classes into a single Feature
Collection.
• Split the dataset into training and testing subsets using either a 7:3 or 8:2 ratio.
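One way to perform the split is per class, so that every LULC class keeps the same train/test proportion. The sketch below assumes each sample is a dictionary with a `class` key; that key name, the function name, and the seed are hypothetical choices. In GEE a comparable result is often obtained by adding a random column to the Feature Collection and filtering on it.

```python
import random
from collections import defaultdict


def stratified_split(samples, label_key="class", train_fraction=0.7, seed=42):
    """Split samples into train/test lists, keeping the ratio within each class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for sample in samples:
        by_class[sample[label_key]].append(sample)

    train, test = [], []
    for label in sorted(by_class):
        group = list(by_class[label])
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test
```

With 50 points in each of five classes, a 7:3 split yields 35 training and 15 testing points per class; passing `train_fraction=0.8` gives the 8:2 alternative.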
D. Supervised Classification:
• Build a machine learning model using one of the following algorithms: Random
Forest (RF), Classification and Regression Tree (CART), Gradient Boosting (GB), or
Support Vector Machine (SVM). Set appropriate hyperparameters for the chosen algorithm.
• Train the selected model using the prepared training samples and then classify the
image with the trained model.
E. Accuracy Assessment:
• Extract classified values for the testing dataset from both the unsupervised and
the supervised classified images.
• Construct a confusion matrix for each classification and calculate the overall
accuracy as well as the Kappa coefficient.
• Compare the performance of the two classification techniques and provide an
interpretation of the results.
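The accuracy metrics above can be sketched as follows, assuming the reference and classified values for the test points are integer class codes from 0 to n_classes - 1 (rows are reference classes, columns are classified classes). Overall accuracy is the sum of the diagonal divided by the total number of test points. Kappa is (p_o - p_e) / (1 - p_e), where p_o is the overall accuracy and p_e is the agreement expected by chance from the row and column totals. The function names are hypothetical.

```python
def confusion_matrix(reference, predicted, n_classes):
    """Rows are reference classes, columns are classified classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for ref, pred in zip(reference, predicted):
        matrix[ref][pred] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of test points on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's Kappa: agreement beyond what row/column totals give by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(matrix[i]) for i in range(n)]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_o = overall_accuracy(matrix)
    p_e = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total ** 2)
    # Undefined when p_e == 1 (all points in a single class on both sides).
    return (p_o - p_e) / (1 - p_e)
```

Running the same three functions on the unsupervised and the supervised results for the same test points gives directly comparable figures for the interpretation.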
Note: Prepare a report that includes this problem along with the previously discussed
Problem 01. The report should incorporate tables and figures (such as screenshots or
layouts) and clearly outline the steps followed, the results obtained, and their interpretation.