...

When you look at the sitemap image and color classification, you will see something like this:

...

  1. Image Type received from ML system

    1. This value defaults to “Placeholder” if the ML system could not return a value

  2. Color received from ML System

    1. This value defaults to null if the ML system could not return a value

    2. If the color value is configured in the scrape, the ML classification for the model is not applied

  3. Image Type Manual fix

    1. Image Type manual fix list selects the value returned from the ML system or defaults to Placeholder if None is received.

  4. Color Manual fix

    1. Color manual fix list selects the value returned from the ML system or defaults to “Black” if None is received

  5. Example of Placeholder tagged from ML system: {Image Type = Placeholder, Color = None}

    1. As you can see, because the color is not tagged, item 5 shows nothing

    2. Previously tagged Placeholders were {Image Type = Placeholder, Color = 'N/A'}. If you find any of these, they are cached values.
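
The defaulting rules in the list above can be sketched in Python. This is only an illustration under stated assumptions: the function name `displayed_values`, the dictionary keys, and the `scrape_color` parameter are hypothetical and not taken from the actual codebase. The sketch also assumes that a color configured in the scrape simply takes the place of the ML color everywhere, which the list does not spell out.

```python
from typing import Optional

# Defaults as described in the list above.
DEFAULT_IMAGE_TYPE = "Placeholder"   # used when the ML system returns no image type
DEFAULT_MANUAL_COLOR = "Black"       # preselected in the color manual fix list


def displayed_values(
    ml_image_type: Optional[str],
    ml_color: Optional[str],
    scrape_color: Optional[str] = None,  # hypothetical: color configured in the scrape
) -> dict:
    """Return the four values shown for one image (hypothetical structure)."""
    # 1. Image type falls back to "Placeholder" when the ML system returns nothing.
    image_type = ml_image_type if ml_image_type is not None else DEFAULT_IMAGE_TYPE

    # 2. Color stays None when the ML system returns nothing.
    #    Assumption: a scrape-configured color replaces the ML color.
    color = scrape_color if scrape_color is not None else ml_color

    return {
        "image_type": image_type,
        "color": color,
        # 3. The manual fix list preselects the image type above.
        "image_type_manual_fix": image_type,
        # 4. The manual fix list preselects the color, or "Black" if there is none.
        "color_manual_fix": color if color is not None else DEFAULT_MANUAL_COLOR,
    }
```

For example, `displayed_values(None, None)` yields an image type of "Placeholder", a color of `None`, and a color manual fix preselection of "Black", which matches the Placeholder example in item 5.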


...

  • How to identify:

    • Many images are classified as {Image Type = 'Placeholder', Color = 'None'} even though they are not obvious placeholders

      • Example: 

        [image]
  • Solution:

    • Relaunch the scrape and, if the problem persists, contact the dev team.
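
A quick way to spot this symptom is to measure how many images in a scrape came back as untagged placeholders. The sketch below is only an assumption about how such a check might look: the record layout (`image_type`, `color` keys), the function names, and the 50% threshold are hypothetical and should be tuned against real scrapes.

```python
def untagged_placeholder_share(records):
    """Fraction of records classified as Placeholder with no color."""
    if not records:
        return 0.0
    untagged = sum(
        1
        for r in records
        if r.get("image_type") == "Placeholder"
        # The color may be stored as a real None or as the string 'None'.
        and r.get("color") in (None, "None")
    )
    return untagged / len(records)


def too_many_untagged(records, threshold=0.5):
    """True when the share of untagged placeholders reaches the (assumed) threshold."""
    return untagged_placeholder_share(records) >= threshold
```

A high share does not prove anything by itself, because a site can legitimately contain many placeholders, so a human should still check a few of the flagged images before relaunching the scrape.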

...

  • How to identify:

    • The image is classified as {Image Type = 'Placeholder', Color != 'None'}

    • Example: 

      [image]
  • Solution:

    • Manually retag it or just ignore it. After all, the advertisers have the last word on the classification.
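
Records matching this pattern can be listed with a small filter. This is a minimal sketch: the record layout and the function name are hypothetical, and it assumes that the legacy cached value 'N/A' should be treated as "no color" rather than as a misclassification.

```python
# Values that mean "no color was tagged". Treating 'N/A' this way is an
# assumption based on the note about previously cached Placeholders.
NO_COLOR_VALUES = (None, "None", "N/A")


def placeholders_with_color(records):
    """Return records tagged as Placeholder that nevertheless carry a color."""
    return [
        r
        for r in records
        if r.get("image_type") == "Placeholder"
        and r.get("color") not in NO_COLOR_VALUES
    ]
```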

Different criteria for image types between the advertisers and us

This case happened in 2021-11 and is documented in Jira.

There were two problematic situations happening simultaneously:

  1. The advertiser used digitally created/enhanced images that were “Stock” images under our criteria, so we manually retagged them, but the advertiser considered them Dealer images

    1. We concluded that we would only manually retag OBVIOUS misclassifications

    2. We concluded that we would only add representative examples of each class to the ML training sets, leaving out conflicting images.

  2. The CDN that distributed the images changed the URL and image headers after a few days

    1. As a result, the manual retag does not “stay”: because of the image_key calculation, the image that arrives after a few days looks like a different image to us
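
The effect of the changing URL and headers can be illustrated with a small sketch. It is purely hypothetical: the real image_key calculation is not documented here, so the code assumes a key derived from the URL and the response headers, and the URLs, header values, and `manual_retags` store are invented for the example.

```python
import hashlib


def image_key(url, headers):
    """Hypothetical key: SHA-256 over the URL and the sorted response headers."""
    parts = [url] + [f"{name.lower()}={headers[name]}" for name in sorted(headers)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Manual retags are stored per image_key (assumed storage model).
manual_retags = {}

# Day 0: the image is scraped and manually retagged.
first_key = image_key("https://cdn.example.com/v1/car.jpg", {"ETag": "abc"})
manual_retags[first_key] = "Dealer"

# A few days later the CDN serves the same picture under a new URL and headers.
later_key = image_key("https://cdn.example.com/v2/car.jpg", {"ETag": "xyz"})

# The key no longer matches, so the manual retag is not found.
retag_after_change = manual_retags.get(later_key)
```

Under these assumptions `retag_after_change` is `None` even though the picture itself is unchanged, which is why the manual retag appears not to "stay".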

This is a hard case to identify and solve, so it probably has to be addressed on an advertiser-by-advertiser basis when it happens.

Examples of conflicting images:

...