Collection Campaign Classifier

The Collection Classifier Workflow is a beta feature that is off by default. To opt in, please reach out to the Aquarium team and we'll enable it for your organization.

With "within dataset similarity search" and "unlabeled collection campaigns", you can accept or discard specific results (for example, from the right-click menu shown in the screenshot below).

With the classifier feature, Aquarium will incorporate your accept/discard decisions to improve the results of subsequent similarity searches.

(1) Activating the Classifier

To give the classifier enough training data, you need to accept and discard a minimum number of elements. These thresholds can be adjusted via the Collection Settings button.
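As a rough mental model, the activation check in this section can be sketched in Python. This is not Aquarium's implementation: the function names and the threshold value of 10 are hypothetical, and the `override` flags model the per-category override covered in this section, which still requires at least one example.

```python
def category_ready(count, suggested_minimum, override=False):
    """Return True if one category (accepted or discarded) has enough examples."""
    if override:
        # Assumption: overriding still needs at least one example in the category.
        return count >= 1
    return count >= suggested_minimum


def can_activate(n_accepted, n_discarded, min_accepted=10, min_discarded=10,
                 override_accepted=False, override_discarded=False):
    """Both categories must be ready before the classifier can be trained.

    The default minimum of 10 is a placeholder, not a documented value.
    """
    return (category_ready(n_accepted, min_accepted, override_accepted)
            and category_ready(n_discarded, min_discarded, override_discarded))


# 12 accepted results but only 1 discarded: blocked until the discard
# threshold is overridden.
blocked = can_activate(12, 1)
unblocked = can_activate(12, 1, override_discarded=True)
# With zero discarded results, even an override is not enough.
still_blocked = can_activate(12, 0, override_discarded=True)
```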

You can override the suggested threshold by clicking a button, as long as you have at least one example for the specified category. In particular, your initial set of results must include at least one bad (discarded) example for the classifier to work. See below:

(2) Running the Classifier

Once the classifier is activated, the next similarity search should run using a classifier trained on your accepted and discarded results. In Collection Settings, a toggle determines whether your results are sorted by (a) the classifier score or (b) the original similarity score.
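To illustrate the difference between the two sort orders, here is a minimal Python sketch. It is an assumption-laden toy, not Aquarium's model: it uses cosine similarity to a query embedding as the "original similarity score" and a small logistic regression trained on accepted/discarded embeddings as the "classifier score". All names, embeddings, and hyperparameters are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def classifier_score(w, b, x):
    """Logistic-regression probability that x would be accepted."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))


def train_logistic(examples, labels, epochs=500, lr=0.5):
    """Fit weights and bias with batch gradient descent on log loss."""
    dim, n = len(examples[0]), len(examples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(examples, labels):
            err = classifier_score(w, b, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank(candidates, query, w, b, sort_by_classifier=True):
    """Return candidate names, best first, under the chosen sort order."""
    if sort_by_classifier:
        key = lambda item: classifier_score(w, b, item[1])
    else:
        key = lambda item: cosine(query, item[1])
    return [name for name, _ in sorted(candidates.items(), key=key, reverse=True)]


# Toy 2-D embeddings: accepted results have a positive second coordinate.
query = [1.0, 0.0]
accepted = [[1.0, 0.9], [0.9, 1.0]]
discarded = [[1.0, -0.9], [0.9, -1.0]]
w, b = train_logistic(accepted + discarded, [1, 1, 0, 0])

candidates = {"a": [1.0, 0.1], "b": [0.8, 0.8], "c": [1.0, -0.5]}
by_classifier = rank(candidates, query, w, b, sort_by_classifier=True)
by_similarity = rank(candidates, query, w, b, sort_by_classifier=False)
```

In this toy data, the similarity order favors whatever is closest to the query, while the classifier order favors candidates that resemble the accepted examples and demotes those that resemble the discarded ones.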

When the classifier does not have enough examples, its results may be noisier than those of the original similarity search. In this case, we recommend finding more "discard" examples to help refine it.

Once the classifier is activated, you can run loops of (1) accept/discard results and (2) rerun the search. Generally, we've seen that results improve with more iterations, but please let us know if that's not the case!
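The accept/discard and rerun loop can be pictured with the following Python sketch. It is a simplified simulation under stated assumptions, not how Aquarium works internally: the scorer is a stand-in (projection onto the difference between the mean accepted and mean discarded embedding), the `review` callback plays the role of a person accepting or discarding results, and all names and data are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def score(x, accepted, discarded):
    """Stand-in classifier score: higher means more like the accepted set."""
    direction = [a - d for a, d in zip(mean(accepted), mean(discarded))]
    return sum(xi * di for xi, di in zip(x, direction))


def run_rounds(pool, accepted, discarded, review, per_round=2):
    """Repeat: rerun the search, review the top results, update the examples."""
    accepted, discarded = list(accepted), list(discarded)
    remaining = dict(pool)
    history = []
    while remaining:
        # (2) Rerun the search with the current accepted/discarded examples.
        ranked = sorted(remaining,
                        key=lambda k: score(remaining[k], accepted, discarded),
                        reverse=True)
        batch = ranked[:per_round]
        # (1) Accept or discard the top results shown in this round.
        for item_id in batch:
            vec = remaining.pop(item_id)
            (accepted if review(item_id) else discarded).append(vec)
        history.append(batch)
    return history


# Toy pool of unreviewed items with 2-D embeddings.
pool = {
    "p1": [0.9, 0.8], "p2": [0.2, 0.9], "p3": [1.0, -0.2],
    "p4": [0.8, 0.3], "p5": [0.1, -0.9], "p6": [0.9, 0.1],
}
relevant = {"p1", "p2", "p4"}  # what the simulated reviewer would accept

history = run_rounds(pool, accepted=[[1.0, 1.0]], discarded=[[1.0, -1.0]],
                     review=lambda item_id: item_id in relevant)
```

Each entry in `history` lists the items reviewed in one round, so you can see how the ranking shifts as more decisions are folded in.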
