Adding Custom Embeddings

Add custom embeddings to your uploads

Overview

Aquarium uses neural network embeddings to enable features like clustering and similarity search. They can be thought of as lists of numbers that represent the essential visual qualities of an image. By default, Aquarium will try to compute embeddings for you by using our standard neural network, as long as you are not working in Anonymous mode.

If you are a customer working in Anonymous mode or you wish to provide your own embeddings, our Python library makes it easy to attach your own values.

If you chose a data sharing scheme that requires you to use your own embeddings, check out our full page on embeddings for some sample code, along with guidance for using your own models.

Assumptions

Your embeddings need to be:

  • A vector of at most length 2048

    • If your embedding vector is longer, contact Aquarium so we can accommodate it
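A quick client-side check can catch oversized vectors before you start an upload. This is a minimal sketch, not part of the Aquarium client library; the helper name `check_embedding` is illustrative, and the 2048 cap comes from the limit stated above:

```python
def check_embedding(embedding, max_len=2048):
    """Validate an embedding vector before attaching it to a frame or crop.

    The 2048 cap matches Aquarium's documented limit; failing early here is
    easier to debug than failing partway through an upload.
    """
    if len(embedding) == 0:
        raise ValueError("Embedding must not be empty")
    if len(embedding) > max_len:
        raise ValueError(
            f"Embedding length {len(embedding)} exceeds the {max_len} limit; "
            "contact Aquarium to accommodate longer vectors"
        )
    return embedding

checked = check_embedding([1.0, 2.0, 3.0])
```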

How to Add Custom Embeddings

If Working In Anonymous Mode

If you are adding custom embeddings because you are in Anonymous mode, pass the flag is_anon_mode=True when you call create_dataset().

Example:

al_client.create_dataset(
    PROJECT_NAME,
    DATASET_NAME,
    dataset=dataset,
    embedding_distance_metric='cosine',
    is_anon_mode=True
)
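The embedding_distance_metric argument above controls how embedding vectors are compared. Cosine distance measures the angle between two vectors rather than their magnitude, so two embeddings pointing in the same direction are "close" even if one is scaled. A minimal pure-NumPy sketch for intuition only (this is not Aquarium's implementation):

```python
import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity: ~0 for vectors pointing the same way,
    # ~2 for vectors pointing in opposite directions.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Parallel vectors have distance ~0 regardless of scale
d = cosine_distance([1.0, 2.0], [2.0, 4.0])
```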

Adding Your Embedding Vectors

Aquarium uses the terms frame and crop throughout its documentation. With respect to embeddings, a frame is the entire image, and a crop is an object or region of interest within that image.

Important!

When you upload custom embeddings, you must provide them at both the frame level and the crop level: one embedding per frame via add_frame_embedding(), and one embedding per crop via add_crop_embedding().

If you are working with a Classification or Semantic Segmentation project, you will most likely have only image-level embeddings (whereas object detection usually produces both image- and object-level embeddings). For these projects, to satisfy the frame- and crop-level requirement, add your image-level embedding to each crop using add_crop_embedding().
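For such image-only projects, one way to prepare the crop embeddings is to map every label ID to the same image-level vector. A minimal sketch, where the names image_embedding and label_ids are illustrative placeholders for your own data:

```python
# Hypothetical names: `image_embedding` comes from your own model,
# `label_ids` are the IDs of the labels you attached to the frame.
image_embedding = [0.12, 0.98, 0.33]
label_ids = ["label_0"]

# Reuse the single image-level vector for each crop so both the
# frame-level and crop-level embedding requirements are satisfied.
label_embeddings = {label_id: image_embedding for label_id in label_ids}
```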

Example code snippet:

# Example of adding label-level embeddings
frame.add_frame_embedding(embedding=[1.0, 2.0, 3.0, ...])
for label_id, label_embedding in label_embeddings.items():
    frame.add_crop_embedding(label_id=label_id, embedding=label_embedding)

# Once you have added all the labels, metadata, and embeddings to the
# frame object, add the frame to the dataset
dataset.add_frame(frame)

# Example of adding inference-level embeddings
inference_frame.add_frame_embedding(embedding=[1.0, 2.0, 3.0, ...])
for inf_id, inf_embedding in inference_embeddings.items():
    inference_frame.add_crop_embedding(label_id=inf_id, embedding=inf_embedding)

# Once you have added all the inferences, metadata, and embeddings to the
# inference frame object, add the inference frame to the inference set
inference_set.add_frame(inference_frame)
