Over the last few months, we've built features that let an Aquarium user complete every step of a data curation workflow without writing any code: finding mistakes, searching through unlabeled datasets to find the best data to label, dispatching data to labeling, and updating the dataset with new labels.
The following video highlights this end-to-end workflow:
First, upload a mutable dataset if you do not already have one (see docs).
You can tell that you have a mutable dataset available if you have successful "Streaming Uploads" listed on your Project Details page.
If you don't already have labeling integrations set up for your org (you won't have any if you're trying this feature for the first time!), you will need to add a labeling integration by going to your Organization Settings:
You should see the option to add a new labeling integration:
Once clicked, you should see the following modal:
You will need to enter your provider-specific API key. In the case of Labelbox, you should be able to find or generate your API keys here: https://app.labelbox.com/account/api-keys.
We'll check whether the integration key appears valid and display an error if it does not.
The data curation workflow also incorporates a new Create Issue UX. To create an issue that supports collections and labeling, you will need to use the + Create Issue button rather than the + Add to Issue button:
You will then see the following modal:
To be able to use the collection and labeling flows, you will need to select "Rare Scenario". If you simply want functionality similar to "legacy" issues, choose the "Generic" option.
You can then select your desired Labelbox project and dataset as follows:
Note that we do some validation to ensure that:
Your Aquarium project's primary_task is either Object Detection (the default) or Classification.
The Labelbox ontology schema matches your project type (e.g. if your Aquarium project has primary_task CLASSIFICATION, your corresponding Labelbox project needs to support a classification labeling flow).
Your Labelbox labels have the same names as the classmap for your Aquarium project.
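The three checks above can be pictured with a minimal Python sketch. This is only an illustration, not Aquarium's actual validation code: the class names, field names, and the `validate_pairing` function are hypothetical, the `OBJECT_DETECTION` spelling is assumed, and treating "same names" as an exact set match is our reading of the rule.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical task identifiers; only CLASSIFICATION appears in the text above.
SUPPORTED_TASKS = {"OBJECT_DETECTION", "CLASSIFICATION"}


@dataclass
class AquariumProject:  # hypothetical stand-in, not an Aquarium API type
    primary_task: str
    classmap: List[str]


@dataclass
class LabelboxProject:  # hypothetical stand-in, not a Labelbox API type
    labeling_flow: str  # the task type the ontology supports
    label_names: List[str]


def validate_pairing(aq: AquariumProject, lb: LabelboxProject) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    # Check 1: the Aquarium project uses a supported primary_task.
    if aq.primary_task not in SUPPORTED_TASKS:
        errors.append(f"Unsupported primary_task: {aq.primary_task}")
    # Check 2: the Labelbox ontology supports the same kind of task.
    if aq.primary_task != lb.labeling_flow:
        errors.append("Labelbox ontology does not match the project type")
    # Check 3: label names match the classmap (assumed: exact set match).
    if set(aq.classmap) != set(lb.label_names):
        errors.append("Labelbox label names differ from the classmap")
    return errors


if __name__ == "__main__":
    aq = AquariumProject("CLASSIFICATION", ["cat", "dog"])
    lb = LabelboxProject("CLASSIFICATION", ["dog", "cat"])
    print(validate_pairing(aq, lb))
```

Returning a list of problems rather than stopping at the first one mirrors how a UI would typically show every mismatch at once.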
Your Issue Detail View will also look slightly different:
Sometimes you may only have a few examples of your rare scenario, and you may want to find more to provide a good "seed" for collection campaigns.
You can use Find Similar Dataset Elements to find and add more elements from your existing dataset:
To run an Unlabeled Indexed Collection via the UI, you will first need to upload an unlabeled indexed dataset. (See docs)
You can see your unlabeled datasets in your Project Details page:
Then, in the Collected tab of your Issue Detail View, you can (1) select an unlabeled dataset to search through from the dropdown and (2) export your desired results to labeling:
Once you've exported new collection frames to Labelbox, the frames that are pending labeling will be visible in the Labeling tab:
Aquarium monitors the status of these frames and will update them once a labeler has completed them.
You can view completed labeled frames in the Done tab. Select the frames you want added back into your original dataset, and click the Add All Frames to Dataset button.
After a refresh, you will see a loading icon in the lower right corner of the frame, indicating that it is being processed:
Frames that have been successfully added will be marked with a check icon:
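To summarize the path a frame takes through the Labeling and Done tabs, here is a small Python sketch of the lifecycle as described above. The status names and the `advance` helper are hypothetical and chosen only for illustration; they are not identifiers from Aquarium or Labelbox.

```python
from enum import Enum


class FrameStatus(Enum):  # hypothetical names for the stages described above
    PENDING_LABELING = "pending_labeling"  # visible in the Labeling tab
    DONE = "done"                          # labeled, visible in the Done tab
    PROCESSING = "processing"              # shown with the loading icon
    ADDED = "added"                        # shown with the check icon


# Each status has at most one successor; ADDED is the final state.
NEXT_STATUS = {
    FrameStatus.PENDING_LABELING: FrameStatus.DONE,
    FrameStatus.DONE: FrameStatus.PROCESSING,
    FrameStatus.PROCESSING: FrameStatus.ADDED,
}


def advance(status: FrameStatus) -> FrameStatus:
    """Move a frame to its next stage, or raise if it is already final."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.name} is a final status")
    return NEXT_STATUS[status]


if __name__ == "__main__":
    status = FrameStatus.PENDING_LABELING
    while status in NEXT_STATUS:
        print(status.name)
        status = advance(status)
    print(status.name)
```

In the UI, the first transition happens when a labeler finishes a frame, the second when you click Add All Frames to Dataset, and the third when processing completes.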