Image Processing x Machine Learning: Classifying Leaves

Visualizing an example

Let’s try preparing this image for segmentation. We also crop the borders of the image, as there seem to be extra pixels left over from the original picture:
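A minimal sketch of this preparation step, assuming NumPy and an RGB image where the leaves are darker than the background. The function name `prepare_image`, the `border` width, and the `threshold` value are all hypothetical, and the synthetic array stands in for the actual photo:

```python
import numpy as np

def prepare_image(rgb, border=10, threshold=0.5):
    """Convert an RGB image to a cropped binary mask of dark (leaf) pixels."""
    # Luminance-weighted grayscale conversion, scaled to roughly [0, 1].
    weights = np.array([0.2125, 0.7154, 0.0721])
    gray = rgb[..., :3].astype(float) @ weights / 255.0
    # Drop `border` pixels from every side to discard stray edge pixels.
    height, width = gray.shape
    cropped = gray[border:height - border, border:width - border]
    # Assumption: leaves are darker than the background.
    return cropped < threshold

# Synthetic stand-in for a photo: white background with one dark square "leaf".
demo = np.full((60, 80, 3), 255, dtype=np.uint8)
demo[20:40, 30:50] = 30
mask = prepare_image(demo, border=10, threshold=0.5)
```

The result is a boolean mask where `True` marks leaf pixels, which is the form the later steps work with.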

Let’s now try to segment this image:
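One common way to segment a binary mask is connected-component labeling. The sketch below assumes SciPy's `ndimage.label`, which may not be what the original code used; the toy mask is made up for illustration:

```python
import numpy as np
from scipy import ndimage

def segment(mask):
    """Label each connected blob of True pixels with its own integer id."""
    labeled, count = ndimage.label(mask)
    return labeled, count

# Toy mask with two separate blobs.
demo_mask = np.zeros((10, 10), dtype=bool)
demo_mask[1:4, 1:4] = True
demo_mask[6:9, 5:9] = True
labeled, count = segment(demo_mask)
```

Each blob gets a distinct positive integer in `labeled`, and background pixels stay at 0.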

Reading the Images

Let’s get the labels from the filenames:
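A sketch of how that could look, assuming each filename is a class name followed by a running number (for example `plantA_1.jpg`). The naming scheme, the example paths, and the helper `label_from_filename` are hypothetical:

```python
import re
from pathlib import Path

def label_from_filename(path):
    """Return the class label encoded in a filename such as 'plantA_1.jpg'."""
    stem = Path(path).stem
    # Strip the trailing numeric index and any separator before it.
    return re.sub(r"[\s_\-]*\d+$", "", stem)

# Hypothetical file list for illustration.
filenames = ["leaves/plantA_1.jpg", "leaves/plantA_2.jpg", "leaves/plantB_1.jpg"]
labels = [label_from_filename(f) for f in filenames]
```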

Cleaning the Images

We define the following per-label thresholds, obtained through trial and error:
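As an illustration of the idea, per-label thresholds can live in a dictionary that is looked up when binarizing each grayscale image. The label names and threshold values below are placeholders, not the ones found by trial and error:

```python
import numpy as np

# Placeholder values for illustration only; not the article's thresholds.
THRESHOLDS = {"plantA": 0.45, "plantB": 0.60}
DEFAULT_THRESHOLD = 0.5

def binarize(gray, label):
    """Mark pixels darker than the label's threshold as leaf pixels."""
    threshold = THRESHOLDS.get(label, DEFAULT_THRESHOLD)
    return gray < threshold

gray = np.array([[0.40, 0.50], [0.55, 0.90]])
mask_a = binarize(gray, "plantA")  # only the 0.40 pixel passes
mask_b = binarize(gray, "plantB")  # 0.40, 0.50 and 0.55 pass
```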

We then define our cleaning step, which uses erosion and dilation:
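A minimal sketch of such a cleaning step, assuming SciPy's `ndimage.binary_erosion` and `ndimage.binary_dilation` with a 3x3 structuring element. The function name `clean` and the iteration counts are assumptions; the right counts depend on the images:

```python
import numpy as np
from scipy import ndimage

def clean(mask, n_erode=1, n_dilate=1):
    """Erode to remove small specks, then dilate to restore the leaf size."""
    structure = np.ones((3, 3), dtype=bool)
    eroded = ndimage.binary_erosion(mask, structure=structure, iterations=n_erode)
    return ndimage.binary_dilation(eroded, structure=structure, iterations=n_dilate)

# Toy mask: one 6x6 "leaf" plus an isolated noise pixel.
noisy = np.zeros((12, 12), dtype=bool)
noisy[3:9, 3:9] = True
noisy[0, 11] = True
cleaned = clean(noisy)
```

Erosion wipes out the isolated pixel, and the following dilation grows the leaf back to its original footprint.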

Then we apply this to all the files and store the resulting images for further processing:

We then segment each image into its individual leaves:
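One way to pull out the individual leaves is to label the cleaned mask and crop each labeled region to its bounding box. This sketch assumes SciPy's `ndimage.label` and `ndimage.find_objects`; the `min_area` cutoff for discarding leftover specks is a hypothetical parameter:

```python
import numpy as np
from scipy import ndimage

def extract_leaves(mask, min_area=20):
    """Return one cropped boolean mask per connected region of at least min_area pixels."""
    labeled, _ = ndimage.label(mask)
    leaves = []
    for index, box in enumerate(ndimage.find_objects(labeled), start=1):
        # Keep only the pixels of this region inside its bounding box.
        leaf = labeled[box] == index
        if leaf.sum() >= min_area:
            leaves.append(leaf)
    return leaves

# Toy mask: two leaf-sized blobs and one tiny speck.
demo_mask = np.zeros((20, 20), dtype=bool)
demo_mask[1:7, 1:7] = True
demo_mask[10:16, 8:16] = True
demo_mask[18:20, 0:2] = True
leaves = extract_leaves(demo_mask)
```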

From here we can see that we’ve ended up with 260 leaves.

Extracting Features

Let us now extract features from the resulting individual leaves. We will get the following features:

  • Length
  • Width
  • Perimeter
  • Area
  • Centroid Location

Let us first define the functions to get these features:
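A sketch of what these functions might compute, given a boolean mask of a single leaf. The definitions here are assumptions: length and width are taken as the longer and shorter sides of the axis-aligned bounding box, and the perimeter is the count of boundary pixels. Libraries such as scikit-image define some of these differently:

```python
import numpy as np
from scipy import ndimage

def leaf_features(leaf):
    """Compute simple shape features from a boolean mask of one leaf."""
    leaf = np.asarray(leaf, dtype=bool)
    rows, cols = np.nonzero(leaf)
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    # Boundary pixels are leaf pixels that disappear after one erosion.
    interior = ndimage.binary_erosion(leaf)
    perimeter = int((leaf & ~interior).sum())
    return {
        "length": int(max(height, width)),
        "width": int(min(height, width)),
        "perimeter": perimeter,
        "area": int(leaf.sum()),
        "centroid_row": float(rows.mean()),
        "centroid_col": float(cols.mean()),
    }

# Toy leaf: a 4x8 rectangle.
demo_leaf = np.zeros((8, 10), dtype=bool)
demo_leaf[2:6, 1:9] = True
features = leaf_features(demo_leaf)
```

Calling `leaf_features` on every extracted leaf and collecting the dictionaries gives one row of features per leaf.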

Then we apply the above functions to each leaf:

Let’s visualize these features through a pairplot:

Machine Learning Models

We then pass the resulting dataframe through the following machine learning models:

  • KNN
  • Logistic Regression (L1 & L2 Regularizations)
  • SVC (L1 & L2 Regularizations)

But first, let us compute the proportional chance criterion (PCC):
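The PCC is the sum of the squared class proportions, and a model is commonly expected to beat 1.25 times that value. A minimal sketch follows; the class counts are made up for illustration and are not the leaf dataset's:

```python
from collections import Counter

def proportional_chance_criterion(labels):
    """Sum of squared class proportions: the accuracy expected from chance alone."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

# Made-up class distribution, not the article's data.
demo_labels = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
pcc = proportional_chance_criterion(demo_labels)
target = 1.25 * pcc  # common rule-of-thumb benchmark
```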

We end up with a PCC of

Running our input through the aforementioned Machine Learning models, we acquire the following accuracies:
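A sketch of how such a comparison could be run with scikit-learn. Everything here is an assumption: the data is synthetic (`make_classification`) rather than the leaf features, the hyperparameters are library defaults or arbitrary picks, and `LinearSVC` stands in for the SVC variants because it supports both L1 and L2 penalties:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the leaf feature table.
X, y = make_classification(n_samples=260, n_features=6, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression (L1)": LogisticRegression(penalty="l1", solver="saga", max_iter=5000),
    "Logistic Regression (L2)": LogisticRegression(penalty="l2", solver="saga", max_iter=5000),
    "SVC (L1)": LinearSVC(penalty="l1", dual=False, max_iter=5000),
    "SVC (L2)": LinearSVC(penalty="l2", dual=False, max_iter=5000),
}

accuracies = {}
for name, model in models.items():
    # Scale features first so the regularized linear models are comparable.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    accuracies[name] = pipeline.score(X_test, y_test)
```

Each test accuracy in `accuracies` can then be compared against 1.25 times the PCC.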

We conclude that Logistic Regression with L1 regularization is the best model for our dataset, with length as the top predictor across the board.

