Dataset Collection System
The Dataset Collection System enables developers to capture images, model predictions, and user-generated annotations directly from their mobile app.
Data is a critical component of every machine learning workflow, and the best data comes from the real world, where ML models actually make their predictions. For mobile machine learning, that means the device itself. The Dataset Collection System creates a data feedback loop in which developers can work with users to see what their models predict, what those users expect the models to predict, and the difference between the two.
Below are some core concepts to help you get started.
User-generated images are captured during real-world use of your application. Because they are collected from the end user, they are the most representative of how your model is used in production, which makes them valuable for monitoring model accuracy and retraining models over time.
Model annotations are annotations (keypoints, bounding boxes, etc.) generated from predictions made with user-generated images. By pairing images with model predictions, developers and product managers gain full visibility into what users experience when using an app in production.
User annotations are annotations (keypoints, bounding boxes, etc.) submitted by a user. They represent the model output a user expected. User annotations complement model annotations by allowing app makers to measure the gap between actual and expected behavior. User annotations also function as ground-truth data for model retraining.
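One common way to quantify the gap between a model annotation and a user annotation for bounding boxes is intersection over union (IoU). The sketch below is illustrative only: the box layout `(x_min, y_min, width, height)` and the names `iou`, `model_box`, and `user_box` are assumptions made for this example, not part of the Fritz SDK.

```python
from typing import Tuple

# Assumed layout: (x_min, y_min, width, height), the same layout COCO uses.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region, clamped at zero.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Hypothetical pair: what the model predicted vs. what the user expected.
model_box: Box = (10.0, 10.0, 100.0, 100.0)
user_box: Box = (20.0, 20.0, 100.0, 100.0)

# A gap of 0.0 means the model matched the user's expectation exactly.
gap = 1.0 - iou(model_box, user_box)
```

For the pair above the IoU is 8100 / 11900, roughly 0.68, so the gap is about 0.32. Keypoint annotations would need a different distance measure, such as mean per-keypoint pixel error.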
Model-based Image Collection
A model-based image collection stores all images, model annotations, and user annotations associated with a single model.
Privacy is important, and it is one of the main benefits of on-device machine learning. When collecting data from users, always make sure you have their explicit permission.
1. Register your model with the Fritz SDK
Once you have created a Fritz AI account and been granted access, you’ll need to upload a model and register it with the SDK. Instructions for implementing a custom pose estimation model on iOS can be found in iOS Custom Pose Estimation.
2. Create a Model-based Image Collection
From the Datasets tab in the webapp, select ADD IMAGE COLLECTION. Select the “Model-based” radio button and click Next. You will be asked to provide a name and description and to select a model to collect data from. To ensure that all annotations match, each collection can be associated with only one model. The annotation configuration (e.g. the number of keypoints in pose estimation) will be inferred automatically when data is first collected.
If an annotation with a different configuration is sent to a model-based image collection, any missing objects are added.
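The text above does not spell out what counts as an "object" or how the merge is performed. As one plausible reading, the sketch below treats the configuration as an ordered list of keypoint names and appends any incoming name the collection does not yet have. The function name `merge_keypoint_names` and the list-of-names representation are assumptions for illustration, not the webapp's actual behavior.

```python
from typing import List


def merge_keypoint_names(collection: List[str], incoming: List[str]) -> List[str]:
    """Return the collection's names followed by any incoming names it lacks."""
    merged = list(collection)  # copy so the caller's list is not mutated
    for name in incoming:
        if name not in merged:
            merged.append(name)
    return merged


# Hypothetical configurations: the collection was inferred from earlier data,
# and a later annotation arrives with one extra keypoint.
collection_config = ["nose", "left_eye", "right_eye"]
incoming_config = ["nose", "left_eye", "right_eye", "left_ear"]

merged_config = merge_keypoint_names(collection_config, incoming_config)
```

Under this reading, existing entries keep their order and nothing is ever removed, so annotations collected earlier remain valid against the merged configuration.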
3. Use the record method on the predictor to collect data
Each FritzVision predictor has a record method that allows you to send the model’s input image, output predictions, and any user-modified annotations back to the Fritz webapp.
With images and model predictions, developers can assess model performance and identify common errors. User annotations provide an opportunity to crowdsource annotations from real-world use.
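To make "identify common errors" concrete, here is a minimal offline sketch that counts which labels users most often add when the model missed them. The `CollectedRecord` type and its fields are hypothetical stand-ins for collected data after you have exported it; they are not the SDK's record API or the webapp's storage format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CollectedRecord:
    """Hypothetical view of one collected item: an image plus its annotations."""
    image_id: str
    model_labels: List[str]
    # None means the user did not submit a correction for this image.
    user_labels: Optional[List[str]] = None


def missed_labels(records: List[CollectedRecord]) -> Counter:
    """Count labels that users supplied but the model did not predict."""
    counts: Counter = Counter()
    for record in records:
        if record.user_labels is None:
            continue  # no user annotation, so nothing to compare against
        counts.update(set(record.user_labels) - set(record.model_labels))
    return counts


records = [
    CollectedRecord("img-1", ["cat"], ["cat", "dog"]),
    CollectedRecord("img-2", ["dog"]),
    CollectedRecord("img-3", ["cat"], ["dog"]),
]

most_missed = missed_labels(records).most_common(1)
```

Here `most_missed` identifies `dog` as the label the model most often failed to predict, which suggests where additional training data would help most.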
Implementation details for specific predictors and platforms can be found in the documentation for each individual model type.
4. Inspect collected images and annotations in the browser
Images and annotations can be viewed in the browser, in the model-based image collection you created. Select a given image to see additional details and switch between model and user annotations. Click the CREATE DATASET button to create a COCO-formatted export of your collection that can be used for measuring model accuracy or retraining.
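Once you have an export, a quick sanity check is to count annotations per category. The sketch below assumes the export follows the standard COCO layout with top-level `images`, `annotations`, and `categories` keys; the inline JSON is a made-up stand-in for an exported file, and the exact fields in a Fritz export may differ.

```python
import json
from collections import Counter
from typing import Dict

# Made-up miniature export; a real file would be loaded with json.load(open(path)).
coco_json = """
{
  "images": [
    {"id": 1, "file_name": "a.jpg", "width": 640, "height": 480},
    {"id": 2, "file_name": "b.jpg", "width": 640, "height": 480}
  ],
  "categories": [
    {"id": 1, "name": "person"},
    {"id": 2, "name": "dog"}
  ],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 10, 50, 80]},
    {"id": 11, "image_id": 1, "category_id": 2, "bbox": [100, 40, 60, 40]},
    {"id": 12, "image_id": 2, "category_id": 1, "bbox": [30, 20, 40, 90]}
  ]
}
"""


def annotations_per_category(coco: dict) -> Dict[str, int]:
    """Map each category name to the number of annotations that reference it."""
    names = {category["id"]: category["name"] for category in coco["categories"]}
    counts = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


coco = json.loads(coco_json)
per_category = annotations_per_category(coco)
```

A heavily skewed count can be an early sign that the collection needs more examples of the under-represented categories before retraining.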