Create a Bounding Box Project#

When your project depends on bounding boxes, the uploaded dataset needs to contain the required bounding box information alongside each image as part of a single datapoint.

First we will create a dataset, and later on, a dummy model that returns bounding box information.

Create dataset#

A convenient way to create a bounding box dataset is to make sure your local data follows a COCO-compatible format.

import efemarai as ef

# Create a project
project = ef.Session().create_project(
    name="Example Bounding Box Project (COCO)",
    description="Example project using the COCO dataset format.",
    exists_ok=True,
)

dataset = project.create_dataset(
    name="Bounding Box dataset",
    data_url="./data/coco/test",
    annotations_url="./data/coco/annotations/test_instances.json",
    stage=ef.DatasetStage.Test,
    format=ef.DatasetFormat.COCO,
)
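
With the COCO format, data_url points to the directory containing the images, while annotations_url points to a standard COCO annotations file. As a quick reference, a minimal annotations file has the following structure (shown here as a Python dict; COCO boxes are [x, y, width, height], and the file and image names are only illustrative):

# Rough sketch of the contents of test_instances.json (standard COCO layout)
coco_annotations = {
    "images": [{"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 150, 150, 200]},
    ],
    "categories": [{"id": 1, "name": "class_1_name"}],
}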

If your dataset is remote or part of an existing database with custom formats, you can easily upload it to the system by (1) iterating over the dataset and (2) creating datapoints containing the images and required targets.

In upload_dataset.py, add the following script to perform these steps.

import numpy as np
import efemarai as ef


def upload_dataset():
    images = [
        np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8) for _ in range(10)
    ]

    project = ef.Session().create_project(
        name="Example Bounding Box Project",
        description="Example project with custom datapoint upload.",
        exists_ok=True,
    )

    dataset = project.create_dataset(
        name="Bounding Box dataset",
        stage=ef.DatasetStage.Test,
        format=ef.DatasetFormat.Custom,
    )

    class_count = 3
    for class_id in range(class_count):
        dataset.add_annotation_class(id=class_id, name=f"class_{class_id}_name")

    for image_data in images:
        # Create the inputs to the model
        # image = ef.Image(file_path="path/to/data") # if image is stored as a file
        image = ef.Image(data=image_data)  # if we have direct access to the image

        # Let's create random boxes with random classes and store the boxes
        boxes = []
        image_height, image_width = 480, 640
        for i in range(3):
            box_height, box_width = np.random.randint(40, 200, size=2)
            box_y1 = np.random.randint(0, image_height - box_height)
            box_x1 = np.random.randint(0, image_width - box_width)
            box_y2, box_x2 = box_y1 + box_height, box_x1 + box_width

            # Get the label of the box from the dataset.
            # You can do it from either the id or the name
            # label = dataset.get_annotation_class(name=f"class_{np.random.randint(0, class_count)}_name")
            label = dataset.get_annotation_class(id=np.random.randint(0, class_count))

            bbox = ef.BoundingBox(
                xyxy=[box_x1, box_y1, box_x2, box_y2],  # top-left, bottom-right of the box
                label=label,  # the class label of the box
                ref_field=image,  # the image that this bounding box corresponds to (can also be a list of images)
                instance_id=i,  # Optional to say that this is a different instance in the same image
            )
            boxes.append(bbox)

        # A sample, or a datapoint, contains reference to the dataset,
        # the inputs and target outputs
        datapoint = ef.Datapoint(
            dataset=dataset,
            inputs={"image": image},  # inputs and targets can be dicts or lists
            targets=boxes,
        )

        # Upload the datapoint
        datapoint.upload()

    # Let us know that these were all the datapoints for this dataset, so we
    # can calculate additional metadata.
    dataset.finalize()


if __name__ == "__main__":
    upload_dataset()

Run python upload_dataset.py and wait for it to complete.

Once processing has finished, you can confirm the dataset status in the UI and explore the inputs and annotations.
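
Note that if your images are stored as files and your classes are easier to reference by name, the upload loop changes only slightly. Here is a minimal sketch, assuming hypothetical file paths and annotations, and reusing the file-based constructors mentioned in the comments above:

import efemarai as ef

# Hypothetical samples; `dataset` is the dataset created in upload_dataset().
samples = [
    {"path": "data/images/000001.jpg", "boxes": [([10, 20, 110, 220], "class_0_name")]},
]

for sample in samples:
    image = ef.Image(file_path=sample["path"])  # image stored as a file on disk

    boxes = []
    for i, (xyxy, class_name) in enumerate(sample["boxes"]):
        # Look up the annotation class by name instead of by id
        label = dataset.get_annotation_class(name=class_name)
        boxes.append(
            ef.BoundingBox(xyxy=xyxy, label=label, ref_field=image, instance_id=i)
        )

    ef.Datapoint(dataset=dataset, inputs={"image": image}, targets=boxes).upload()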

Create a model#

A model that works with a bounding box dataset needs to return a list of ef.BoundingBox objects, which will be matched against the ones stored in the dataset. Save the following code in a file called dummy_model.py:

import efemarai as ef
import numpy as np

class DummyModel:
    """A DummyModel returning a random bbox"""

    def __init__(self, device):
        self.device = device  # A real model would be moved to this device

    def __call__(self, image):
        return {
            "class_id": np.random.randint(0, 3),
            "bbox": [100, 150, 250, 350],
            "score": np.random.random(),
        }


def predict_images(datapoints, model, device):
    outputs = []
    for datapoint in datapoints:
        image = datapoint.get_input("image") # This corresponds to the key from the datapoint input creation dict

        image_preprocessed = image.data / 255 - 0.5  # perform any pre-processing here

        output = model(image_preprocessed)

        # Here again the label can be referenced by name or by id
        # label = ef.AnnotationClass(name=output["class_name"])
        label = ef.AnnotationClass(id=output["class_id"])

        outputs.append(
            [
                ef.BoundingBox(
                    xyxy=output["bbox"],
                    confidence=output["score"], # Confidence of detection
                    ref_field=image, # Say which image this output refers to
                    label=label,     # And what label it has
                ),
            ]
        )
    return outputs


def load_model(device):
    model = DummyModel(device)
    return model


def test():
    # Load the model
    device = "cpu"
    model = load_model(device=device)

    image_RGB = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)

    datapoint = ef.Datapoint(
        dataset=None,
        inputs={
            "image": ef.Image(data=image_RGB),
        },
    )

    # Pass a list of datapoints to the predictor function
    output = predict_images([datapoint], model, device)
    print(output)

    assert isinstance(output, list)
    assert isinstance(output[0][0], ef.BoundingBox)

if __name__ == "__main__":
    test()

If you run it with python dummy_model.py, you’ll be able to confirm that the model’s output is a list of detections per input datapoint.
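
When you swap in a real detector, only load_model and the output mapping in predict_images need to change. As a rough sketch (assuming torchvision is available, which is not part of this example), a pretrained Faster R-CNN could be wrapped like this:

import torch
import torchvision

import efemarai as ef


def load_model(device):
    # Pretrained COCO detector; for each image it returns a dict with
    # "boxes" (xyxy), "labels" and "scores".
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()
    return model.to(device)


def predict_images(datapoints, model, device):
    outputs = []
    for datapoint in datapoints:
        image = datapoint.get_input("image")
        tensor = torch.from_numpy(image.data).permute(2, 0, 1).float() / 255

        with torch.no_grad():
            prediction = model([tensor.to(device)])[0]

        # NOTE: the predicted label ids must correspond to the annotation
        # classes registered in your project.
        outputs.append(
            [
                ef.BoundingBox(
                    xyxy=box.tolist(),
                    confidence=float(score),
                    ref_field=image,
                    label=ef.AnnotationClass(id=int(class_id)),
                )
                for box, score, class_id in zip(
                    prediction["boxes"], prediction["scores"], prediction["labels"]
                )
            ]
        )
    return outputs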

efemarai.yaml file#

To run the model, you need to define how it is loaded and how it performs inference in the efemarai.yaml file.

Here’s the one corresponding to the dummy model.

project:
  name: "Example Bounding Box Project"

models:
  - name: Dummy Model
    description: This is a dummy model to show consuming inputs and outputs

    runtime:
      image: python:3.10-slim-buster
      device: "gpu"
      batch:
        max_size: 10
      load:
        entrypoint: dummy_model:load_model
        inputs:
          - name: device
            value: ${model.runtime.device}
        output:
          name: model

      predict:
        entrypoint: dummy_model:predict_images
        inputs:
          - name: datapoints
            value: ${datapoints}
          - name: model
            value: ${model.runtime.load.output.model}
          - name: device
            value: ${model.runtime.device}
        output:
          name: predictions
          keys:
            - bbox

Register the model#

To register the model, go to the root of the project directory (the one containing efemarai.yaml) and upload it with the CLI:

ef model create .

Now you should be able to see the model uploaded and active in this project.