Plumerai People Detection Python API

This document describes the Python API for the Plumerai People Detection software for videos.

The API

The Python API consists of a single class with a constructor that needs to be run once, and a process_frame function that is executed on each input frame. Additionally, there is a single_image function that can be used to process a single image independently of a video sequence.

PeopleDetection

plumerai_people_detection.PeopleDetection(height: int, width: int)

Initializes a new people detection object. This needs to be called only once at the start of the application.

Arguments

  • height int: The height of the input image in pixels.
  • width int: The width of the input image in pixels.

Returns

PeopleDetection: Instance of the PeopleDetection class.

process_frame

PeopleDetection.process_frame(image, delta_t: float = 0.0) -> tuple[BoxPrediction, ...]

Processes a single frame from a video sequence with RGB input. The image must have the height and width that were specified when the PeopleDetection object was created.

Arguments

  • image ArrayLike: A tensor of shape (video_height, video_width, 3) with RGB image data. It can be a Numpy, TensorFlow, PyTorch or Jax tensor.
  • delta_t float: The time in seconds between this frame and the previous one (1/fps). If left at the default of 0, the system clock is used to compute this value.

Returns

tuple[BoxPrediction, ...]: The resulting bounding-boxes found in the frame.
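When accurate frame timestamps are available (e.g. from the camera driver), delta_t can be computed from consecutive timestamps instead of relying on the system clock. A minimal sketch; the process_frame call is shown as a comment because it requires a live PeopleDetection instance:

```python
import time

prev_time = time.monotonic()  # monotonic clock is safe for interval measurement
deltas = []
for _ in range(3):  # stand-in for the per-frame camera loop
    now = time.monotonic()
    delta_t = now - prev_time  # elapsed seconds since the previous frame (1/fps)
    prev_time = now
    deltas.append(delta_t)
    # predictions = ppd.process_frame(image, delta_t=delta_t)
```

Passing an explicit delta_t makes tracking behavior reproducible when frames are processed slower or faster than they were captured.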

single_image

PeopleDetection.single_image(image, confidence_threshold: float) -> tuple[BoxPrediction, ...]

Processes a single image that is not part of a video sequence. This should not be used for video data, but only for single-image evaluation and debugging. The returned box id values are not related to those returned by process_frame or by other calls to single_image.

Arguments

  • image ArrayLike: A tensor of shape (*, *, 3) with RGB image data. It can be a Numpy, TensorFlow, PyTorch or Jax tensor.
  • confidence_threshold float: Any box with a confidence value below this threshold will be filtered out. Range between 0 and 1. A value of 0.63 is recommended for regular evaluation, but for mAP computation this can be set to 0.

Returns

tuple[BoxPrediction, ...]: The resulting bounding-boxes found in the image.

BoxPrediction

class BoxPrediction(NamedTuple):
    y_min: float  # top coordinate between 0 and 1 in height dimension
    x_min: float  # left coordinate between 0 and 1 in width dimension
    y_max: float  # bottom coordinate between 0 and 1 in height dimension
    x_max: float  # right coordinate between 0 and 1 in width dimension
    confidence: float  # between 0 and 1, higher means more confident
    id: int = 0  # the tracked identifier of this box
    class_id: int = DetectionClass.CLASS_UNKNOWN  # the class of the detected object

A structure representing a single resulting bounding box. Coordinates are between 0 and 1, the origin is at the top-left.
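Because coordinates are normalized, converting a box to pixel coordinates only requires multiplying by the image dimensions. A minimal sketch; the to_pixels helper is illustrative and not part of the API, and the BoxPrediction definition below mirrors the structure above (class_id omitted for brevity):

```python
from typing import NamedTuple

class BoxPrediction(NamedTuple):  # mirrors the structure documented above
    y_min: float  # top, between 0 and 1
    x_min: float  # left, between 0 and 1
    y_max: float  # bottom, between 0 and 1
    x_max: float  # right, between 0 and 1
    confidence: float
    id: int = 0

def to_pixels(box: BoxPrediction, height: int, width: int) -> tuple[int, int, int, int]:
    """Convert normalized box coordinates to (x_min, y_min, x_max, y_max) pixels."""
    return (
        round(box.x_min * width),
        round(box.y_min * height),
        round(box.x_max * width),
        round(box.y_max * height),
    )

# Example: a box covering the central quarter of a 1200x1600 image
box = BoxPrediction(y_min=0.25, x_min=0.25, y_max=0.75, x_max=0.75, confidence=0.9)
print(to_pixels(box, height=1200, width=1600))  # (400, 300, 1200, 900)
```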

Example usage

Below is an example of using the Python API shown above.

import numpy as np
import plumerai_people_detection as ppd_api

# Settings, to be changed as needed
width = 1600  # camera image width in pixels
height = 1200  # camera image height in pixels

# Initialize the people detection algorithm
ppd = ppd_api.PeopleDetection(height, width)

# Loop over frames in a video stream
while True:
    # Some example input here, normally this is where camera data is acquired
    image = np.zeros((height, width, 3), dtype=np.uint8)

    # Process the frame
    predictions = ppd.process_frame(image)

    # Display the results to stdout
    for p in predictions:
        print(
            f"Box #{p.id} of class {p.class_id} with confidence {p.confidence:.2f} "
            f"@ (x,y) -> ({p.x_min:.2f},{p.y_min:.2f}) till ({p.x_max:.2f},{p.y_max:.2f})"
        )