Plumerai People Detection Python API¶
This document describes the Python API for the Plumerai People Detection software for videos.
The API¶
The Python API consists of a single class with a constructor that needs to be run once, and a process_frame
function that needs to be executed on each input frame. Additionally, there is a single_image
function that can be used to process a single image independently of a video sequence.
PeopleDetection¶
__init__¶
Initializes a new people detection object.
This needs to be called only once at the start of the application.
Arguments:
- height: The height of the input image in pixels.
- width: The width of the input image in pixels.
Returns:
- Nothing.
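For illustration, a minimal construction sketch (the 1280×720 resolution is an example value only):
import plumerai_people_detection as ppd_api

# Resolution of the incoming frames; example values only
height = 720   # pixels
width = 1280   # pixels

# Create the detector once at application start-up
ppd = ppd_api.PeopleDetection(height, width)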
process_frame¶
PeopleDetection.process_frame(
    image, delta_t: float = 0.0
) -> tuple[ErrorCode, tuple[BoxPrediction, ...]]
Process a single frame from a video sequence with RGB input.
The image must have the height and width that were specified when the PeopleDetection object was created.
Arguments:
- image: A tensor of shape (video_height, video_width, 3) with RGB image data. It can be a NumPy array or a TensorFlow, PyTorch, or JAX tensor.
- delta_t: The time in seconds between this and the previous video frame (1/fps). If left at the default of 0, the system clock will be used to compute this value.
Returns:
- An error code of type ErrorCode and the resulting bounding boxes found in the frame.
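As a sketch of the delta_t default: when delta_t is left at 0, the time between calls is measured with the system clock, which suits live camera input where the frame interval is not known exactly (read_frame below is a hypothetical capture function, not part of this API):
# 'ppd' is the PeopleDetection instance created earlier; 'read_frame' is a
# hypothetical function that returns an RGB frame of the configured size
while True:
    frame = read_frame()
    # delta_t is omitted (default 0.0), so the elapsed time since the
    # previous call is taken from the system clock
    error_code, boxes = ppd.process_frame(frame)
    if error_code != ppd_api.ErrorCode.SUCCESS:
        raise RuntimeError(f"Error in 'process_frame': {error_code}")
    print(f"{len(boxes)} people detected in this frame")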
single_image¶
Process a single image that is not part of a video sequence.
This should not be used for video data. The returned box id values are not related to those returned by process_frame or by other calls to single_image.
Arguments:
- image: A tensor of shape (*, *, 3) with RGB image data. It can be a NumPy array or a TensorFlow, PyTorch, or JAX tensor.
Returns:
- An error code of type ErrorCode and the resulting bounding boxes found in the image.
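A minimal sketch of processing a standalone image, assuming single_image takes only the image argument and returns the same (error_code, boxes) pair as process_frame:
import numpy as np
import plumerai_people_detection as ppd_api

ppd = ppd_api.PeopleDetection(1200, 1600)  # height and width of the video stream

# A standalone RGB photo; the (*, *, 3) shape above suggests its size does not
# have to match the configured video resolution (assumption)
photo = np.zeros((720, 1280, 3), dtype=np.uint8)

error_code, boxes = ppd.single_image(photo)
if error_code != ppd_api.ErrorCode.SUCCESS:
    raise RuntimeError(f"Error in 'single_image': {error_code}")
print(f"Found {len(boxes)} people in the image")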
BoxPrediction¶
A structure representing a single resulting bounding box. Coordinates are between 0 and 1, and the origin is at the top-left.
class BoxPrediction(NamedTuple):
    y_min: float  # top coordinate between 0 and 1 in height dimension
    x_min: float  # left coordinate between 0 and 1 in width dimension
    y_max: float  # bottom coordinate between 0 and 1 in height dimension
    x_max: float  # right coordinate between 0 and 1 in width dimension
    confidence: float  # between 0 and 1, higher means more confident
    id: int = 0  # the tracked identifier of this box
    class_id: int = DetectionClass.CLASS_UNKNOWN  # the class of the object
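Because the coordinates are normalized to [0, 1], they must be scaled by the image dimensions before drawing or cropping. A small sketch (box_to_pixels is a helper written for this example, not part of the API):
def box_to_pixels(box, image_height: int, image_width: int):
    # Scale normalized [0, 1] coordinates to integer pixel coordinates,
    # with the origin at the top-left corner of the image
    x_min = int(box.x_min * image_width)
    y_min = int(box.y_min * image_height)
    x_max = int(box.x_max * image_width)
    y_max = int(box.y_max * image_height)
    return x_min, y_min, x_max, y_max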
DetectionClass¶
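The class of a detected object, as stored in the class_id field of a BoxPrediction. Only the CLASS_UNKNOWN member is referenced in this document; the actual members are defined by the library. A minimal sketch, assuming an additional CLASS_PERSON member for detected people and integer values compatible with the class_id: int field:
class DetectionClass(IntEnum):
    # Assumed members for illustration; consult the library for the real definition
    CLASS_UNKNOWN = 0
    CLASS_PERSON = 1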
ErrorCode¶
Error codes that can be returned by the API.
class ErrorCode(Enum):
    # All went well
    SUCCESS = 0

    # Should not occur, contact Plumerai if this happens
    INTERNAL_ERROR = -1

    # The `delta_t` parameter should be >= 0
    INVALID_DELTA_T = -2
Example usage¶
Below is an example of using the Python API shown above.
import numpy as np

import plumerai_people_detection as ppd_api

# Settings, to be changed as needed
width = 1600   # camera image width in pixels
height = 1200  # camera image height in pixels

# Initialize the people detection algorithm
ppd = ppd_api.PeopleDetection(height, width)

# Loop over frames in a video stream (example: 10 frames)
for t in range(10):
    # Some example input here, normally this is where camera data is acquired
    image = np.zeros((height, width, 3), dtype=np.uint8)

    # The time between two video frames in seconds. In this example we assume
    # a constant frame rate of 30 fps, but variable rates are supported.
    delta_t = 1. / 30.

    # Process the frame
    error_code, predictions = ppd.process_frame(image, delta_t)
    if error_code != ppd_api.ErrorCode.SUCCESS:
        raise RuntimeError(f"Error in 'process_frame': {error_code}")

    # Display the results
    for p in predictions:
        print(
            f"Box #{p.id} of class {p.class_id} with confidence {p.confidence} "
            f"@(x,y)->({p.x_min:.2f},{p.y_min:.2f})-({p.x_max:.2f},{p.y_max:.2f})"
        )
    if len(predictions) == 0:
        print("No bounding boxes found in this frame")