
Introduction

Our Machine Learning system is used to tag the images obtained in the SRPScrape process.

We try to extract two features from the images:

Image type:
- Dealer, Stock, Placeholder
Color:
- Black, Blue, Brown, Burgundy, Green, Grey, Orange, Pink, Purple, Red, White, Yellow

Machine Learning pipes and models

To extract these features from the images, we use two different pipes, each composed of the same three models (Image Type, Crop, and Color):

Pipe 1: Type → Crop → Color
1. Apply the Type model and extract the image Type
  1. If the image Type is “Placeholder” we skip the next steps
2. Extract a crop containing the vehicle so that useless information, such as the image background, is removed
3. Apply the Color model
  1. If the 2nd step was able to extract the vehicle the color model is applied to the cropped image
  2. Else it is applied to the original image
Pipe 2: Crop → Type → Color
1. Apply the crop model to the original image
  1. If the crop model is able to extract a vehicle from the original image, the following steps are applied to the cropped image
  2. Else they are applied to the original image
2. Apply the Image Type model
  1. If the image Type is “Placeholder” we skip the next steps
3. Apply the Color model
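
The two pipes above can be sketched as follows. This is only an illustration of the control flow, not the production code: the three models are assumed to be plain callables, and the names `run_pipe_1`, `run_pipe_2`, and the type aliases are hypothetical.

```python
from typing import Callable, Optional, Tuple

# Hypothetical model interfaces (assumptions, not the real API):
Image = object                                   # whatever image type the real models accept
TypeModel = Callable[[Image], str]               # returns "Dealer", "Stock" or "Placeholder"
CropModel = Callable[[Image], Optional[Image]]   # returns the vehicle crop, or None if no vehicle found
ColorModel = Callable[[Image], str]              # returns one of the color labels

Result = Tuple[str, Optional[str]]               # (image type, color or None)


def run_pipe_1(image: Image, type_model: TypeModel,
               crop_model: CropModel, color_model: ColorModel) -> Result:
    """Pipe 1: Type -> Crop -> Color."""
    image_type = type_model(image)
    if image_type == "Placeholder":
        return image_type, None                  # placeholders skip crop and color
    crop = crop_model(image)
    target = crop if crop is not None else image  # fall back to the original image
    return image_type, color_model(target)


def run_pipe_2(image: Image, type_model: TypeModel,
               crop_model: CropModel, color_model: ColorModel) -> Result:
    """Pipe 2: Crop -> Type -> Color."""
    crop = crop_model(image)
    target = crop if crop is not None else image  # fall back to the original image
    image_type = type_model(target)               # the Type model sees the crop when available
    if image_type == "Placeholder":
        return image_type, None
    return image_type, color_model(target)
```

The only difference between the two functions is whether the Image Type model sees the original image or the crop.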

Notes:

The Image Type model has to be trained with cropped images when it is used in Pipe 2.

Machine Learning Pipe and Model History

First Iteration (End of 2018)

The first implementation was not handled by Hoot and we have little information about it. Our current Color detection model was trained for this first iteration and has not been retrained or changed since.


Second Iteration (Feb 2020)

The second iteration was handled by Hoot and introduced the idea of using the crop model. For this version, we finally decided to use Pipe 1 with a freshly trained Image Type model.

To build this new model, a Transfer Learning approach was used: we took the ResNet50 architecture pretrained on the ImageNet data set, froze its inner layers, and modified the output layers to fit a 3-class classification problem.
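
A minimal sketch of this kind of transfer-learning setup, assuming TensorFlow/Keras is the framework (the source does not say which one was used). The function name `build_type_model`, the input size, the hidden layer width, the dropout rate, and the optimizer are all illustrative assumptions, not the values from the actual experiments.

```python
CLASS_NAMES = ["Dealer", "Stock", "Placeholder"]


def build_type_model(dropout_rate: float = 0.5, hidden_units: int = 256):
    """Build a 3-class classifier on top of a frozen, ImageNet-pretrained ResNet50.

    dropout_rate and hidden_units are illustrative defaults (assumptions).
    """
    # Imported lazily so that importing this module does not require TensorFlow.
    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        weights="imagenet",         # pretrained on ImageNet
        include_top=False,          # drop the original 1000-class output layer
        input_shape=(224, 224, 3),  # assumed input size
        pooling="avg",              # global average pooling: one feature vector per image
    )
    base.trainable = False          # freeze the pretrained layers

    # New output layers, the part that is actually trained.
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(len(CLASS_NAMES), activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

With this setup, input images would normally be passed through `tf.keras.applications.resnet50.preprocess_input` before training or inference.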

To obtain the best model possible, we experimented with different output layer architectures and dropout values. Because placeholder images must never be used in videos, we want the recall over placeholders to be as high as possible, so that they are not incorrectly tagged as Dealer or Stock. To evaluate the models, we created a scoring system that takes into account both the overall model accuracy and the recall over the placeholder data.
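
The exact scoring formula is not recorded here. As a hedged illustration only, a score that combines overall accuracy with placeholder recall could look like the sketch below; the weighted average and the `recall_weight` parameter are assumptions, as are all function names.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of images whose predicted label matches the true label."""
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def placeholder_recall(y_true: Sequence[str], y_pred: Sequence[str],
                       label: str = "Placeholder") -> float:
    """Fraction of true placeholders that the model tagged as placeholders."""
    predictions_for_placeholders = [p for t, p in zip(y_true, y_pred) if t == label]
    if not predictions_for_placeholders:
        return 0.0
    hits = sum(p == label for p in predictions_for_placeholders)
    return hits / len(predictions_for_placeholders)


def model_score(y_true: Sequence[str], y_pred: Sequence[str],
                recall_weight: float = 0.5) -> float:
    """Weighted average of accuracy and placeholder recall (assumed formula)."""
    acc = accuracy(y_true, y_pred)
    rec = placeholder_recall(y_true, y_pred)
    return (1.0 - recall_weight) * acc + recall_weight * rec
```

Raising `recall_weight` favors models that miss fewer placeholders, even at some cost in overall accuracy.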

The data set was also extended to contain example images of other vehicle types such as boats, bikes, quads, or RVs. As there was an imbalance between the different types of images, some image augmentation techniques were applied.
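
The specific augmentation techniques are not listed here. As a rough sketch of the balancing idea only, one could compute how many augmented images each class needs in order to match the largest class; the function name and the "match the largest class" target are assumptions.

```python
from typing import Dict


def augmentation_plan(class_counts: Dict[str, int]) -> Dict[str, int]:
    """Return how many augmented images to generate per class.

    Assumption: every class is topped up to the size of the largest class.
    """
    if not class_counts:
        return {}
    target = max(class_counts.values())
    return {label: target - count for label, count in class_counts.items()}
```

For example, with hypothetical counts of 1000 Dealer, 400 Stock, and 250 Placeholder images, the plan would ask for 600 extra Stock and 750 extra Placeholder images.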

Third and Fourth Iterations (August 2020)

These iterations were triggered by the model producing very poor results for some dealers. To improve the results, examples from all the affected dealers were added to the data set and a new model was trained following the same approach as the previous one.

During the testing phase of the Third iteration, we noticed that the images from one particular advertiser impacted the results for other advertisers, so we removed them from the data set for the Fourth iteration.

Fifth Iteration (October 2021)

This iteration comes with a change in approach: testing whether Pipe 2 can improve the Dealer vs Stock accuracy/recall.

To improve the Stock vs Dealer classification, we are creating data sets with cropped images that remove as much background information as possible, in order to test whether the vehicles themselves contain enough information to tell the two types apart.

The new dataset was created as follows:

1. Extend the previous data set with ML manual fixes from backend/frontend
2. Manually review the data sets to remove incorrect manual fixes
3. Apply the data_set filter to remove images smaller than 200×200
4. Apply the Crop model over all the Stock and Dealer data set images
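
The size filter and the selection of images to crop might look like the sketch below. The `ImageRecord` type and function names are hypothetical, and "smaller than 200×200" is interpreted here as either side being under 200 pixels, which is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_SIDE = 200  # minimum width and height in pixels


@dataclass
class ImageRecord:
    """Hypothetical data set entry; the real schema may differ."""
    path: str
    width: int
    height: int
    label: str  # "Dealer", "Stock" or "Placeholder"


def is_large_enough(record: ImageRecord, min_side: int = MIN_SIDE) -> bool:
    """Assumed rule: both sides must be at least min_side pixels."""
    return record.width >= min_side and record.height >= min_side


def filter_small_images(records: Iterable[ImageRecord]) -> List[ImageRecord]:
    """Drop images that are smaller than the minimum size."""
    return [r for r in records if is_large_enough(r)]


def needs_crop(record: ImageRecord) -> bool:
    """Only Stock and Dealer images are passed through the Crop model."""
    return record.label in {"Stock", "Dealer"}
```

Placeholder images are kept in the data set but are not cropped, since the Crop model is only applied to Stock and Dealer images.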

Sixth Iteration (November 2021)

In this iteration, we maintain the same approach as in the previous one. It was triggered by two situations:

A client notified us of several issues with classification:
- Digitally created images that look “real” were manually classified by us as “Stock”.
- Even though they are digitally created, some clients consider them “Dealer” images and others “Stock”
- The decided path of action was:
  - Delete conflicting images from the data set
  - Let the ML system tag the images normally and let the clients decide what those images are by manually changing the values
A second client raised concerns regarding some classifications:
- In this case, there were issues with some images that needed correction so we included examples of those in the dataset
- We also found that some of the manual changes that had been made were inaccurate. However, since manual changes are ignored, the system never re-classified those images
  - To fix this, we had to clear the system cache so that the reclassification process could work correctly with the new data sets.