Creating Projects in Aquarium

Creating and defining projects within Aquarium

Overview

Projects are the highest-level grouping in Aquarium, and each project is expected to contain data for the same core task. Examples include point cloud data for an autonomous vehicle, or images of flora and fauna for classification.

When defining a project you'll need to have already decided:

  • The type of ML task your project will represent (classification, 3D object detection, semantic segmentation, etc.)

  • Ontology for the classes you will be working with

Projects are the starting point in Aquarium and need to exist before uploading any kind of dataset.

Prerequisites

Before you create a project, make sure that you have:

  • Decided which data sharing methodology you prefer

    • Hosted your data in a location that is accessible to your team

  • Installed the latest Aquarium client

    • pip install aquariumlearning

  • Set up a development environment running Python 3.7 or later
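
The prerequisites that can be checked from code (the Python version and the installed client) can be verified before you go further. The following is a minimal sketch using only the standard library; check_prerequisites() is a hypothetical helper written for this page, not part of the Aquarium client.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the prerequisites


def check_prerequisites(package="aquariumlearning", min_python=MIN_PYTHON):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(
            "Python {}.{}+ is required, found {}.{}".format(
                min_python[0], min_python[1], *sys.version_info[:2]
            )
        )
    # find_spec returns None when a top-level package cannot be imported
    if importlib.util.find_spec(package) is None:
        problems.append(
            "{0} is not installed; run: pip install {0}".format(package)
        )
    return problems


for problem in check_prerequisites():
    print(problem)
```

If nothing is printed, both checks passed. This sketch does not confirm that the installed client is the latest release.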

Creating a Project

To create a project, use the create_project() function, which requires the following arguments:

  • Project name

  • Specific core task - classification, object detection, semantic segmentation, etc.

  • Label class map/ontology - define the full set of valid classes

Each project in your organization must have a unique name. If you try to create a project with a duplicate name, you will receive an error. If you plan to create projects with similar names, we recommend adding a random string, a version, or a date or timestamp to each name to keep it unique.
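
One way to follow this recommendation is to build the name from a base string, a timestamp, and a short random suffix. The sketch below uses only the standard library; unique_project_name() is a hypothetical helper, and the underscore-separated format is our assumption, chosen to match the "My_Project_Name" style used on this page.

```python
import random
import string
from datetime import datetime, timezone


def unique_project_name(base, suffix_length=6):
    """Append a UTC timestamp and a random suffix to a base project name."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choices(alphabet, k=suffix_length))
    return f"{base}_{timestamp}_{suffix}"


PROJECT_NAME = unique_project_name("My_Project_Name")
print(PROJECT_NAME)
```

The timestamp keeps names sortable by creation time, and the random suffix guards against two projects being created in the same second.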

We need to create a project in order to give our labeled, inference, and unlabeled datasets a place to go.

Here is an example of how to create a project:

import json

import aquariumlearning as al

# establishing a connection to the client
# (replace YOUR_API_KEY with your own API key)
al_client = al.Client()
al_client.set_credentials(api_key=YOUR_API_KEY)

# Project names must be unique, similar to a bucket name
# in S3 or GCS. Replace this with a name that is not
# already in use.
PROJECT_NAME = "My_Project_Name"

# the classnames file is a literal list of strings in a JSON file
# ex: ["red", "blue", "yellow"]
with open('./classnames.json') as f:
    classnames = json.load(f)

# now we actually use the Aquarium client to create the project
# from_classnames() converts our string list into valid class objects
# primary_task tells us what kind of ML task this project will be used for
al_client.create_project(
    PROJECT_NAME,
    al.LabelClassMap.from_classnames(classnames),
    primary_task="CLASSIFICATION"
)

This example is deliberately simple, but defining projects with the client is generally this straightforward. We elaborate on the custom options below.
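
Because the project's ontology comes straight from classnames.json, it can help to validate that file before calling create_project(). The following is a minimal sketch using only the standard library. load_classnames() is a hypothetical helper, and the checks it performs (non-empty list, all non-empty strings, no duplicates) are our assumptions, not documented Aquarium requirements.

```python
import json
import tempfile
from pathlib import Path


def load_classnames(path):
    """Load a JSON list of class names and check that it is well-formed."""
    with open(path) as f:
        classnames = json.load(f)
    if not isinstance(classnames, list) or not classnames:
        raise ValueError("expected a non-empty JSON list of class names")
    if not all(isinstance(name, str) and name for name in classnames):
        raise ValueError("every class name must be a non-empty string")
    if len(set(classnames)) != len(classnames):
        raise ValueError("class names must be unique")
    return classnames


# Usage: write a small example file, then load it back.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "classnames.json"
    example.write_text(json.dumps(["red", "blue", "yellow"]))
    print(load_classnames(example))  # prints ['red', 'blue', 'yellow']
```

Failing early with a clear message is usually easier to debug than an error raised partway through project creation.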

Primary Task Options

ML tasks vary quite a bit, and the ideal presentation depends on the specific task. Each project can specify a "primary task", which Aquarium uses to tailor the UI to that type of task.

When you create a project, you set the primary_task parameter. For example:

al_client.create_project(
    PROJECT_NAME,
    class_map,
    primary_task="2D_OBJECT_DETECTION"
)

The values you can set primary_task to are:

  • 2D_OBJECT_DETECTION

    • Used for object detection, keypoint tracking, and skeletal tracking projects

  • CLASSIFICATION

    • Used when your task is fundamentally a classification task, where each data point has exactly one ground truth label and one predicted label

  • MULTI_LABEL_CLASSIFICATION

    • Used when your task is fundamentally a classification task, where each data point has zero or more ground truth labels and can be predicted as zero or more classes

  • 2D_SEMSEG

    • Used for a 2D semantic segmentation task, where you have per-pixel labels

  • 2D_INSTANCE_SEGMENTATION

    • Used for a 2D instance segmentation task, where you have per-pixel labels
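
Since primary_task is passed as a plain string, a typo is easy to miss. The sketch below checks a value against the five options listed above before it reaches create_project(). normalize_primary_task() is a hypothetical helper, and the set contains only the values documented on this page; the client may accept others.

```python
# The primary_task values listed on this page.
PRIMARY_TASKS = frozenset(
    {
        "2D_OBJECT_DETECTION",
        "CLASSIFICATION",
        "MULTI_LABEL_CLASSIFICATION",
        "2D_SEMSEG",
        "2D_INSTANCE_SEGMENTATION",
    }
)


def normalize_primary_task(task):
    """Upper-case a task name, unify separators, and reject unknown values."""
    normalized = task.strip().upper().replace(" ", "_").replace("-", "_")
    if normalized not in PRIMARY_TASKS:
        raise ValueError(
            "unknown primary_task {!r}; expected one of {}".format(
                task, sorted(PRIMARY_TASKS)
            )
        )
    return normalized


print(normalize_primary_task("2d semseg"))  # prints 2D_SEMSEG
```

The returned string can then be passed as the primary_task argument.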

Label Class Maps

Your project predicts labels for things. At its most basic, a label class map defines how to map between integer ids, display name strings, and colors for rendering. At its most complex, it can also contain information about how to map between ground truth classes and inferred classes, and how each class should be interpreted when computing metrics.

In Aquarium, a LabelClassMap is a collection of ClassMapEntry objects. Aquarium provides the from_classnames() utility to convert a list of strings into ClassMapEntry objects.

A ClassMapEntry object can be defined like:

my_class_map_entry = al.ClassMapEntry(name='class_name', class_id=0, color=(255,0,0))

The project creation example earlier on this page builds its class map from a list of strings. If you just have a list of classes and no strong opinions on presentation, from_classnames() is all you need:

CLASSNAMES = ['dog', 'cat', 'horse', 'hamster']
# Attempts to assign presentational colors intelligently
label_class_map = al.LabelClassMap.from_classnames(CLASSNAMES)

We also have an additional section diving into more complex Label Class Map creation if you can't find what you're looking for here.
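
If you would rather choose colors yourself than rely on the automatic assignment in from_classnames(), one simple approach is to space hues evenly around the color wheel. The sketch below uses only the standard library and produces one dictionary per class with the same three fields as the ClassMapEntry example above (name, class_id, color). class_entry_fields() is a hypothetical helper, and this is not the algorithm Aquarium itself uses.

```python
import colorsys


def class_entry_fields(classnames):
    """Build name/class_id/color fields, one dict per class name."""
    count = len(classnames)
    fields = []
    for class_id, name in enumerate(classnames):
        # Evenly spaced hues at full saturation and brightness
        hue = class_id / count
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        color = (round(r * 255), round(g * 255), round(b * 255))
        fields.append({"name": name, "class_id": class_id, "color": color})
    return fields


for entry in class_entry_fields(["dog", "cat", "horse", "hamster"]):
    # Assumption: each dict could be passed on as
    # al.ClassMapEntry(**entry), matching the keyword arguments shown above.
    print(entry)
```

Evenly spaced hues stay distinguishable for a small number of classes; with many classes, neighboring colors become hard to tell apart and a hand-picked palette may work better.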

Confirming Your Project Creation

To show what a project looks like once it is created, here is the complete script we use:

import json

import aquariumlearning as al

# establishing a connection to the client
# (replace YOUR_API_KEY with your own API key)
al_client = al.Client()
al_client.set_credentials(api_key=YOUR_API_KEY)

# Project names must be unique, similar to a bucket name
# in S3 or GCS. Replace this with a name that is not
# already in use.
PROJECT_NAME = "My_Project_Name"

# the classnames file is a literal list of strings in a JSON file
# ex: ["red", "blue", "yellow"]
with open('./classnames.json') as f:
    classnames = json.load(f)

# now we actually use the Aquarium client to create the project
# from_classnames() converts our string list into valid class objects
# primary_task tells us what kind of ML task this project will be used for
al_client.create_project(
    PROJECT_NAME,
    al.LabelClassMap.from_classnames(classnames),
    primary_task="CLASSIFICATION"
)

Project creation using the client is quick. Once you've run the above script, you can log in and see your newly created project in Aquarium.

Adding Project Preview Image (Optional)

After you create a project, you can add an image that is displayed on the project card, which makes it easier to identify the task for each project.

Your uploaded image must be smaller than 1 MB.

To add an image, click your project card and navigate to your project overview page. There you will find a teal "Upload" button where you can add an image.

Select the image you would like to upload and click the Save button. You should now see a preview image on the project overview page for your specific project.
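
Because of the size limit, it can save a failed upload to check the file size first. The following is a minimal sketch; is_valid_preview_size() is a hypothetical helper, and we assume "1MB" means 1,000,000 bytes, which is the stricter reading if the limit is actually 1,048,576 bytes.

```python
import os

# Assumption: the 1 MB limit is read as 10**6 bytes.
MAX_PREVIEW_BYTES = 1_000_000


def is_valid_preview_size(path, limit=MAX_PREVIEW_BYTES):
    """Return True when the file at path is strictly smaller than limit."""
    return os.path.getsize(path) < limit
```

For example, is_valid_preview_size("preview.png") returns False for a file of exactly 1,000,000 bytes, since the image has to be smaller than the limit.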
