Plumerai People Detection Python API

This document describes the Python API for the Plumerai People Detection software for videos.

The API

The Python API consists of a single class with a constructor that needs to be run once, and a process_frame function that needs to be executed on each input frame. Additionally, there is a single_image function that can be used to process a single image independently of a video sequence.

PeopleDetection

plumerai_people_detection.PeopleDetection(video_height: int = 0, video_width: int = 0)

Initializes a new people detection object. This needs to be called only once at the start of the application.

Arguments

  • video_height int: The height of the input image in pixels. Leave at the default of 0 if only the single_image API is used.
  • video_width int: The width of the input image in pixels. Leave at the default of 0 if only the single_image API is used.

Returns

PeopleDetection: Instance of the PeopleDetection class.

process_frame

PeopleDetection.process_frame(image, delta_t: float) -> tuple[BoxPrediction, ...]

Processes a single frame from a video sequence with RGB input. The image must have the height and width that were specified when the PeopleDetection object was created.

Arguments

  • image ArrayLike: A tensor of shape (video_height, video_width, 3) with RGB image data. It can be a NumPy, TensorFlow, PyTorch or JAX tensor.
  • delta_t float: The time in seconds between this frame and the previous one (1/fps for a fixed frame rate).

Returns

tuple[BoxPrediction, ...]: The resulting box predictions with confidence scores, class ids and tracking ids.
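When frames come from a live camera rather than a fixed-rate source, delta_t can be obtained by timing the capture loop. The sketch below shows one way to do this with time.monotonic(); the helper name and the sleep standing in for camera capture are illustrative, not part of the API.

```python
import time

# Sketch: deriving delta_t for process_frame from a live capture loop.
# A fixed-rate source can simply pass 1 / fps instead.
def frame_intervals(num_frames: int) -> list:
    """Measure the elapsed time between successive loop iterations."""
    intervals = []
    previous = time.monotonic()
    for _ in range(num_frames):
        time.sleep(0.01)  # stand-in for camera capture and inference work
        now = time.monotonic()
        intervals.append(now - previous)  # this is the delta_t to pass on
        previous = now
    return intervals

deltas = frame_intervals(3)
print(deltas)
```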

single_image

PeopleDetection.single_image(image, confidence_threshold: float) -> tuple[BoxPrediction, ...]

Processes a single image that is not part of a video sequence. This should not be used for video data, but only for single-image evaluation and debugging. The video_height and video_width parameters from the constructor are ignored.

Arguments

  • image ArrayLike: A tensor of shape (*, *, 3) with RGB image data. It can be a NumPy, TensorFlow, PyTorch or JAX tensor.
  • confidence_threshold float: Any box with a confidence value below this threshold will be filtered out. Range between 0 and 1. For mAP computation this can be set to 0.

Returns

tuple[BoxPrediction, ...]: The resulting box predictions with confidence scores and class ids.
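The effect of confidence_threshold on the returned boxes is equivalent to the post-filter sketched below. BoxPrediction is reproduced here from the definition in this document so the sketch runs without the library installed; whether the boundary comparison is strict or inclusive is an assumption.

```python
from typing import NamedTuple, Optional

# Reproduction of the documented BoxPrediction structure, so this sketch
# is self-contained; in real use it comes from the library.
class BoxPrediction(NamedTuple):
    y_min: float
    x_min: float
    y_max: float
    x_max: float
    confidence: float
    class_id: int
    tracker_id: Optional[int]

def filter_by_confidence(boxes, confidence_threshold: float):
    """Drop boxes below the threshold, as single_image does internally."""
    return tuple(b for b in boxes if b.confidence >= confidence_threshold)

boxes = (
    BoxPrediction(0.1, 0.1, 0.5, 0.4, 0.92, 0, None),
    BoxPrediction(0.2, 0.6, 0.7, 0.9, 0.15, 0, None),
)
kept = filter_by_confidence(boxes, confidence_threshold=0.5)
print(len(kept))  # the low-confidence box is filtered out
```

Setting the threshold to 0 keeps every box, which is why the documentation suggests 0 for mAP computation.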

BoxPrediction

class BoxPrediction(NamedTuple):
    y_min: float  # top coordinate between 0 and 1 in height dimension
    x_min: float  # left coordinate between 0 and 1 in width dimension
    y_max: float  # bottom coordinate between 0 and 1 in height dimension
    x_max: float  # right coordinate between 0 and 1 in width dimension
    confidence: float  # between 0 and 1, higher means more confident
    class_id: int  # the class of the detected object
    tracker_id: int | None  # the tracked identifier of this box

A structure representing a single resulting bounding box. Coordinates are normalized to the range 0 to 1, with the origin at the top-left of the image.
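Because the coordinates are normalized, drawing or cropping requires scaling them by the image dimensions. A minimal sketch (the helper name and example values are illustrative; BoxPrediction is reproduced from the definition above so the sketch is self-contained):

```python
from typing import NamedTuple, Optional

# Reproduction of the documented BoxPrediction structure; in real use
# instances come from process_frame or single_image.
class BoxPrediction(NamedTuple):
    y_min: float
    x_min: float
    y_max: float
    x_max: float
    confidence: float
    class_id: int
    tracker_id: Optional[int]

def to_pixels(box: BoxPrediction, height: int, width: int):
    """Convert normalized corners to integer pixel coordinates (x0, y0, x1, y1)."""
    return (
        int(box.x_min * width),
        int(box.y_min * height),
        int(box.x_max * width),
        int(box.y_max * height),
    )

box = BoxPrediction(0.25, 0.5, 0.75, 1.0, 0.9, 0, 1)
print(to_pixels(box, height=1200, width=1600))  # (800, 300, 1600, 900)
```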

Example usage

Below is an example of using the Python API shown above.

import numpy as np
from plumerai_people_detection import PeopleDetection

# Settings, to be changed as needed
width = 1600  # camera image width in pixels
height = 1200  # camera image height in pixels

# Initialize the people detection algorithm
ppd = PeopleDetection(height, width)

# Loop over frames in a video stream
while True:
    # Some example input here, normally this is where camera data is acquired
    image = np.zeros((height, width, 3), dtype=np.uint8)

    # Process the frame; delta_t is the time since the previous frame
    # (here assumed to be a fixed 30 fps)
    predictions = ppd.process_frame(image, delta_t=1 / 30.0)

    # Display the results to stdout
    for p in predictions:
        print(
            f"Box #{p.tracker_id} of class {p.class_id} with confidence {p.confidence:.2f} "
            f"@ (x,y) -> ({p.x_min:.2f},{p.y_min:.2f}) till ({p.x_max:.2f},{p.y_max:.2f})"
        )