docs: add doc about image prediction (#764)
alexgarel authored May 25, 2022
1 parent b5ab0e7 commit 0882878
Showing 2 changed files with 28 additions and 4 deletions.
14 changes: 11 additions & 3 deletions doc/introduction/architecture.md
@@ -19,9 +19,15 @@ Robotoff allows predicting many kinds of information (also called _insights_), mostly from the product images or OCR.
Each time a contributor uploads a new image to Open Food Facts, the text on this image is extracted using Google Cloud Vision, an OCR (Optical Character Recognition) service. Robotoff receives a new event through a webhook each time this occurs, with the URLs of the image and the resulting OCR (as a JSON file).
We use simple string matching algorithms to find patterns in the OCR text to generate new predictions [^predictions].
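As an illustration, the matching step can be as simple as applying a regular expression to the OCR text. The sketch below uses a hypothetical pattern and helper name; the real matchers and OCR JSON handling live in Robotoff's OCR modules.

```python
# Minimal sketch, not Robotoff's actual matcher: extract the full text from a
# Google Cloud Vision OCR result and run a simple regex over it.
import re

# Hypothetical pattern: detect an "organic" label mention in the OCR text.
ORGANIC_REGEX = re.compile(r"\borganic\b", re.IGNORECASE)

def find_label_predictions(ocr_json: dict) -> list[dict]:
    """Return label predictions found by simple string matching."""
    text = ocr_json["responses"][0]["fullTextAnnotation"]["text"]
    predictions = []
    if ORGANIC_REGEX.search(text):
        # The shape of this dict is illustrative only
        predictions.append({"type": "label", "value_tag": "en:organic"})
    return predictions
```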

We also use an ML model to extract logos from images. These logos are then embedded in a vector space using a pre-trained model. In this space we use a k-nearest-neighbor approach to try to classify the logo, predicting a brand or a label [^logos].
We also use an ML model to extract objects from images. [^image_predictions]

We use the image to detect the grade of the Nutri-Score (A to E) with a computer vision model (object detection).
One model tries to detect any logo [^logos].
Detected logos are then embedded in a vector space using a pre-trained model.
In this space we use a k-nearest-neighbor approach to try to classify the logo, predicting a brand or a label.
Hunger Games also collects user annotations to provide ground truth ([logo game](https://hunger.openfoodfacts.org/logos)).
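For intuition, here is a minimal sketch of the k-nearest-neighbor step, assuming logo embeddings are plain NumPy vectors and annotated labels are available (the helper name is hypothetical; the real pipeline lives in `robotoff.logos`):

```python
# Minimal k-nearest-neighbors sketch, not Robotoff's actual implementation.
import numpy as np

def classify_logo(
    embedding: np.ndarray,          # embedding of the logo to classify, shape (d,)
    known_embeddings: np.ndarray,   # embeddings of annotated logos, shape (n, d)
    known_labels: list[str],        # labels of the annotated logos, length n
    k: int = 5,
) -> str:
    """Predict a label (e.g. a brand) for a logo via k-nearest neighbors."""
    # Euclidean distance to every annotated logo embedding
    distances = np.linalg.norm(known_embeddings - embedding, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = [known_labels[i] for i in nearest]
    # Majority vote among the k nearest annotated logos
    return max(set(votes), key=votes.count)
```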

Another computer vision model (object detection) tries to detect the grade of the Nutri-Score (A to E).
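To illustrate, a post-processing step for such a detector might look like the following sketch (the detection dict format and label names are assumptions, not Robotoff's actual output):

```python
# Minimal sketch: turn raw object-detection output into a Nutri-Score grade.
from typing import Optional

def nutriscore_grade(detections: list[dict], threshold: float = 0.5) -> Optional[str]:
    """Return the grade of the most confident detection, or None."""
    best = max(detections, key=lambda d: d["score"], default=None)
    if best is None or best["score"] < threshold:
        return None
    # Labels are assumed to look like "nutriscore-a" ... "nutriscore-e"
    return best["label"].split("-")[-1].upper()
```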

The above detections generate predictions, which in turn generate many types of insights [^insights]:

@@ -38,9 +44,11 @@ Predictions, as well as insights, are stored in the PostgreSQL database.

[^predictions]: see `robotoff.models.Prediction`

[^image_predictions]: see `robotoff.models.ImagePrediction` and `robotoff.workers.tasks.import_image.run_import_image_job`

[^insights]: see `robotoff.models.ProductInsight`

[^logos]: see `robotoff.logos`
[^logos]: see `robotoff.models.LogoAnnotation` and `robotoff.logos`

These new insights are then accessible to all annotation tools (Hunger Games, mobile apps, ...), which can validate or reject them.

18 changes: 17 additions & 1 deletion robotoff/models.py
@@ -210,7 +210,14 @@ class Meta:

class ImagePrediction(BaseModel):
"""Table to store computer vision predictions (object detection,
image segmentation,...) made by custom models."""
image segmentation,...) made by custom models.
They are created by the `ImagePredictorResource` and `ImagePredictionImporterResource`
API resources, or by the `import_logos` CLI command.
Predictions come from a model listed in the `OBJECT_DETECTION_TF_SERVING_MODELS`
setting; this can be a nutriscore detector, a logo detector, etc.
"""

type = peewee.CharField(max_length=256)
model_name = peewee.CharField(max_length=100, null=False, index=True)
@@ -227,6 +234,15 @@ class ImagePrediction(BaseModel):


class LogoAnnotation(BaseModel):
"""Annotation(s) for an image prediction
(an image prediction might lead to several annotations)
At the moment, this is mostly for logo (see run_object_detection),
when we have a logo prediction above a certain threshold we create an entry,
to ask user for annotation on the logo (https://hunger.openfoodfacts.org/logos)
and eventual annotation will land there.
"""

image_prediction = peewee.ForeignKeyField(
ImagePrediction, null=False, backref="logo_detections"
)
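As a usage sketch (assuming an initialized database and the models above; the `model_name` value is illustrative), logo annotations can be joined back to the image prediction that produced them:

```python
# Minimal peewee usage sketch: list logo annotations together with the
# image prediction that produced them (model_name value is an example).
from robotoff.models import ImagePrediction, LogoAnnotation

query = (
    LogoAnnotation.select(LogoAnnotation, ImagePrediction)
    .join(ImagePrediction)
    .where(ImagePrediction.model_name == "universal-logo-detector")
)
for annotation in query:
    print(annotation.id, annotation.image_prediction.model_name)
```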
