Tracking And Fixing Issues
While analyzing your datasets and model inferences, you are likely to find problems. These range from bad data or labels to genuine deficiencies in model performance.
Aquarium lets you track these issues and take corrective action to fix them through a feature called Segments.

Creating and Adding To Segments

A segment is a grouping of datapoints, sometimes with specific labels and inferences associated with them. You can create or add to a segment from most views in Aquarium.
In the Grid View and Model Analysis View, you can select datapoints by clicking the circle at the top-left of each card. Hold shift and click to select multiple datapoints. Once you've selected the datapoints, use the "Add To Segment" button to create a new segment, or pick an existing segment from the dropdown to add to it.
In the Embedding View, you can select individual datapoints by clicking on them, or select a group of datapoints with a shift-click lasso. Then use the "Add To Segment" button as before.

Filtering Based on Issues

You can use the "in_segment" and "not_in_segment" filters in the query bar to include or exclude datapoints based on segment membership. This is particularly useful for reviewing the queue of failure datapoints that you have not yet triaged into a segment.
The following filter selects datapoints that are not in any segment:
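Conceptually, both filters are set-membership tests. The sketch below is plain Python, not Aquarium query-bar syntax, and is only meant to illustrate the semantics. The datapoint IDs, segment names, and helper functions are all hypothetical.

```python
# Hypothetical data, invented for illustration only.
datapoints = ["img_001", "img_002", "img_003", "img_004"]
segments = {
    "blurry_images": {"img_001"},
    "missing_labels": {"img_001", "img_003"},
}


def in_segment(datapoint_id, segment_name):
    """Return True if the datapoint belongs to the named segment."""
    return datapoint_id in segments.get(segment_name, set())


def in_any_segment(datapoint_id):
    """Return True if the datapoint belongs to at least one segment."""
    return any(datapoint_id in members for members in segments.values())


# Datapoints not yet triaged into any segment.
untriaged = [d for d in datapoints if not in_any_segment(d)]
print(untriaged)
```

With this sample data, `untriaged` holds the two datapoints that appear in neither segment, which corresponds to the queue of failures still waiting for triage.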

Viewing and Editing Segments

The Segments page, accessible from the top bar, allows you to access all segments you've created so far.
You can click into a segment to view what datapoints you've added to it so far, remove datapoints from the segment, export datapoints as a JSON file, or take corrective action to fix it.
Protip: we recommend keeping the Segments page open in a separate tab. This way, you can keep adding to a segment in one tab and then review what you've added to that segment in another tab.

Fixing Issues

Typically, an issue is caused by either a problem in the data or a problem with your model.
If the issue is caused by a problem with your data (such as a missing or inconsistent label), Aquarium can integrate with your labeling provider. This way, you can click a button to send the affected data to your labeling provider for correction, and then retrain and re-evaluate your model on clean data.
If the issue is caused by a deficiency in your model (for example, the model does badly on a particular type of difficult example), Aquarium can help automate the process of collecting more data of that type so that your model performs better the next time you retrain. You can upload large unlabeled datasets. We then use our neural network embedding search to find examples in those datasets that are similar to the ones in the current issue, and send them to your labeling provider.
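To give a rough intuition for how an embedding similarity search can surface more examples like the ones in an issue, here is a minimal sketch. It is not Aquarium's implementation: it assumes each datapoint already has an embedding vector, uses cosine similarity as the similarity measure (an assumption), and all names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(segment_embeddings, unlabeled_embeddings, top_k=2):
    """Rank unlabeled datapoints by their best similarity to any segment example."""
    scored = []
    for datapoint_id, embedding in unlabeled_embeddings.items():
        best = max(cosine_similarity(embedding, s) for s in segment_embeddings)
        scored.append((best, datapoint_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [datapoint_id for _, datapoint_id in scored[:top_k]]


# Hypothetical 2-D embeddings; real embeddings have far more dimensions.
segment_embeddings = [[1.0, 0.0], [0.9, 0.1]]
unlabeled_embeddings = {
    "new_001": [0.95, 0.05],
    "new_002": [0.0, 1.0],
    "new_003": [0.7, 0.3],
}

candidates = find_similar(segment_embeddings, unlabeled_embeddings, top_k=2)
print(candidates)
```

Here the two unlabeled datapoints whose embeddings point in nearly the same direction as the segment examples rank highest, so they would be the candidates to send for labeling. A production system would typically use an approximate nearest-neighbor index rather than this brute-force loop.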
If these features are of interest to you, contact your Aquarium representative or email us at [email protected] for more details.