Dataset Checkpoints

Use checkpoints to manage versions of your dataset over time and measure the impact of improving data quality or acquiring new data.

A checkpoint is a combination of frames, labels, and associated metadata at a specific point in time. It is effectively a version of the dataset.

We built checkpoints so that you can isolate and measure the impact of changes to your dataset.

  • Creating a checkpoint freezes the state of the dataset's frames and labels as of a point in time.

  • Aquarium allows you to swap between your checkpoints and assess how your model performance, dataset quality, and dataset distribution have changed over time.

Checkpoints should help you answer questions like:

  • How much did my relabeling effort improve my models' performance?

  • How much did my newly curated data improve my models' performance?

  • How has my dataset changed over the last 3 months?

Creating and Managing Checkpoints

Create checkpoints from the Dataset Information page, accessible for each dataset from the Project Overview.

Use the Checkpoints tab on the Dataset Information page to create new checkpoints or manage existing ones. Select any checkpoint in the table to archive the checkpoint, edit the metadata, or explore the data.

From the Add Checkpoint menu, select a time to create the checkpoint. The eligible times match distinct upload and processing events for the dataset.

If you make frequent simultaneous updates to the dataset, the events may be batched into ~1 hour intervals to avoid presenting too many options. See the Notes and Limitations section for more details, or contact Aquarium to discuss upload best practices.
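
To see why closely spaced uploads can collapse into a single checkpoint option, here is a minimal Python sketch of one possible batching rule. It is an illustration only: the function `batch_events`, the rule that a batch spans less than one hour from its first event, and the choice of the last event as the eligible time are all assumptions, not Aquarium's documented behavior.

```python
from datetime import datetime, timedelta


def batch_events(event_times, window=timedelta(hours=1)):
    """Group timestamps so each batch spans less than `window` from its first event.

    Hypothetical rule for illustration; the real batching logic may differ.
    """
    batches = []
    for t in sorted(event_times):
        if batches and t - batches[-1][0] < window:
            batches[-1].append(t)  # close enough to the batch start: same batch
        else:
            batches.append([t])  # start a new batch
    return batches


uploads = [
    datetime(2022, 6, 1, 9, 0),
    datetime(2022, 6, 1, 9, 20),
    datetime(2022, 6, 1, 9, 50),
    datetime(2022, 6, 1, 11, 0),
]

batches = batch_events(uploads)
# Assumption: each batch offers one eligible checkpoint time, its last event.
eligible_times = [batch[-1] for batch in batches]
```

Under these assumptions the three morning uploads share one eligible time and only the 11:00 upload gets its own, which is why spacing updates apart matters if each one needs a distinct checkpoint.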

Accessing Checkpoints in the UI

Once you've created a checkpoint, use the dropdown in the upper right of the Grid, Analysis, Embedding, or Metrics views to switch checkpoints.

By default the latest version of the dataset (regardless of whether you've tagged it as a checkpoint) is selected.

Once you select a checkpoint, Aquarium will update to show the frames and labels as they were at that point in time.

Common Use Cases for Checkpoints

Measure the Impact of Fixing Data Quality Issues

  1. Create a checkpoint on the initial version of the dataset (v0).

  2. Use Aquarium to identify any label quality issues and export them to your labeling provider to be fixed.

  3. Ingest the corrected data into Aquarium as an update to the dataset.

  4. Create a checkpoint on the updated version of the dataset (v1).

  5. Aquarium will automatically recalculate model performance metrics on the new checkpoint for up to the 5 most recently uploaded inference sets.

  6. Compare the model performance across checkpoints v0 and v1 - any performance differences are directly attributable to the improved data quality.

Take this a step further by training a new model on the corrected dataset and uploading the new inferences. Aquarium will automatically calculate performance across both checkpoints, allowing you to assess the effect of relabeling, the effect of retraining, and the combined effect of relabeling and retraining.
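
The three effects described above are differences between metric values for particular (model, checkpoint) pairs. The Python sketch below shows the arithmetic with made-up numbers; the metric values, the dictionary layout, and the names `old_model` and `new_model` are hypothetical, and in practice you would read the values from the Metrics view.

```python
# Hypothetical metric values (for example, mean average precision)
# keyed by (model, checkpoint). These numbers are invented for illustration.
metrics = {
    ("old_model", "v0"): 0.70,  # original model, original labels
    ("old_model", "v1"): 0.75,  # original model, corrected labels
    ("new_model", "v1"): 0.82,  # retrained model, corrected labels
}

# Same model, different checkpoint: the change comes from the labels alone.
relabeling_effect = metrics[("old_model", "v1")] - metrics[("old_model", "v0")]

# Same checkpoint, different model: the change comes from retraining alone.
retraining_effect = metrics[("new_model", "v1")] - metrics[("old_model", "v1")]

# Different model and checkpoint: the combined change.
combined_effect = metrics[("new_model", "v1")] - metrics[("old_model", "v0")]

print(f"relabeling: {relabeling_effect:+.2f}")
print(f"retraining: {retraining_effect:+.2f}")
print(f"combined:   {combined_effect:+.2f}")
```

Holding one factor fixed in each comparison is what lets you attribute the difference to the other factor. The combined effect is the sum of the two individual effects.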

Measure the Impact of Curating New Data

  1. Create a checkpoint on the initial version of the dataset (v0).

  2. Use Aquarium to identify gaps in your model performance or dataset distribution and curate new unlabeled data to send to your labeling provider.

  3. Ingest the new data into Aquarium as an update to the dataset.

  4. Create a checkpoint on the updated version of the dataset (v1).

  5. Retrain your model on the updated dataset and submit a new inference set to Aquarium.

  6. Aquarium will automatically calculate model performance across the overlapping frames in checkpoints v0 and v1, allowing you to isolate the effect of your curation and retraining effort.

Take this a step further by uploading inferences from your old model on the newly curated data. Comparing the new and old model performance on checkpoint v1 will then allow you to validate that the retraining and additional data curation performed as expected without any material regressions.
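
Restricting a comparison to overlapping frames means scoring both models only on frames that exist in both checkpoints. The Python sketch below illustrates the idea with a toy classification accuracy. The frame IDs, labels, predictions, and helper names are all hypothetical, and the metric is a stand-in rather than the one Aquarium computes.

```python
def overlapping_frame_ids(checkpoint_a, checkpoint_b):
    """Frame IDs present in both checkpoints."""
    return set(checkpoint_a) & set(checkpoint_b)


def accuracy_on(frame_ids, labels, predictions):
    """Fraction of the given frames where the prediction matches the label."""
    scored = [f for f in frame_ids if f in predictions]
    if not scored:
        return None
    correct = sum(labels[f] == predictions[f] for f in scored)
    return correct / len(scored)


# Hypothetical checkpoints: v1 adds two newly curated frames to v0.
v0_labels = {"f1": "car", "f2": "truck", "f3": "car"}
v1_labels = {**v0_labels, "f4": "bus", "f5": "bus"}

# Hypothetical predictions from the model before and after retraining.
old_model = {"f1": "car", "f2": "car", "f3": "car"}
new_model = {"f1": "car", "f2": "truck", "f3": "car", "f4": "bus", "f5": "car"}

overlap = overlapping_frame_ids(v0_labels, v1_labels)
old_accuracy = accuracy_on(overlap, v1_labels, old_model)
new_accuracy = accuracy_on(overlap, v1_labels, new_model)
```

Scoring on the overlap keeps the newly added frames from shifting the baseline, so the difference between `old_accuracy` and `new_accuracy` reflects the model change rather than a change in the evaluation set.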

Important Notes and Limitations

Working with Checkpoints in the UI

  • By default the current (latest) state of the dataset is available in the UI, regardless of whether it's been manually tagged as a checkpoint.

  • Model performance metrics are only automatically recalculated on the 5 most recently updated inference sets and the 5 most recently created checkpoints.

  • Occasionally a particular combination of checkpoint and inference set will not be selectable. This happens when either the checkpoint or the inference set falls outside its 5-most-recent limit.

  • Segments do not currently support time travel via checkpoints. Each segment contains its elements as they were when they were added to the segment.
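
The 5-most-recent limits above can be read as a simple rule: a combination is selectable only if both the checkpoint and the inference set are among the five newest of their kind. The Python sketch below encodes that reading. The data layout (name and timestamp pairs) and the function names are assumptions for illustration, not part of any Aquarium API.

```python
def most_recent_names(items, limit=5):
    """Names of the `limit` newest items, where each item is (name, timestamp)."""
    newest = sorted(items, key=lambda item: item[1], reverse=True)[:limit]
    return {name for name, _ in newest}


def is_selectable(checkpoint, inference_set, checkpoints, inference_sets, limit=5):
    """True if both sides of the combination are within their recency limit."""
    return (
        checkpoint in most_recent_names(checkpoints, limit)
        and inference_set in most_recent_names(inference_sets, limit)
    )


# Hypothetical history: six checkpoints and six inference sets, oldest first.
checkpoints = [(f"v{i}", i) for i in range(6)]
inference_sets = [(f"model_{i}", i) for i in range(6)]

print(is_selectable("v3", "model_2", checkpoints, inference_sets))
print(is_selectable("v0", "model_5", checkpoints, inference_sets))
```

With six of each, the oldest checkpoint (`v0`) and the oldest inference set (`model_0`) fall outside the limit, so any combination that includes either one would not be selectable under this reading.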

Managing Checkpoints and Uploads

  • Because checkpoints are created from upload events, simultaneous uploads may be batched together into the same eligible-for-checkpointing processing window. If you have multiple updates to the dataset to make and want to ensure each update is eligible for a distinct checkpoint, separate your update submissions by more than 1 hour.

  • Currently, checkpoints must be created post-upload and processing in the UI. Please submit a feature request if you'd like to create checkpoints directly at upload time via the Python client.

Inference Set Reprocessing Eligibility

  • Inference sets uploaded prior to May 31, 2022 are not eligible for automatic metric recalculation using checkpoints. This is due to a change we made to the processing pipeline to support continuous IoU and confidence thresholds.

  • If you'd like an inference set to become eligible for checkpoints, simply reupload it. Contact your Aquarium representative if you have questions.
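
If you track upload dates for your inference sets yourself, a short script can list the ones that would need a reupload. This Python sketch assumes you keep a mapping of inference set name to upload date. The names and dates are hypothetical, and treating uploads made on May 31, 2022 itself as eligible is an assumption based on the wording "prior to".

```python
from datetime import date

# Cutoff taken from the note above; uploads before this date are not eligible.
CUTOFF = date(2022, 5, 31)


def needs_reupload(uploaded_on):
    """True if the inference set predates the processing pipeline change."""
    return uploaded_on < CUTOFF


# Hypothetical records you might keep alongside your training runs.
inference_sets = {
    "baseline_model": date(2022, 3, 14),
    "retrained_model": date(2022, 7, 2),
}

to_reupload = sorted(
    name for name, uploaded_on in inference_sets.items() if needs_reupload(uploaded_on)
)
print(to_reupload)
```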
