Plumerai People Detection Python API
This document describes the Python API for the Plumerai People Detection software for videos.
The API
The Python API consists of a single class with a constructor that needs to be run once, and a process_frame function that needs to be executed on each input frame. Additionally, there is a single_image function that can be used to process a single image independently of a video sequence.
PeopleDetection
Initializes a new people detection object. This needs to be called only once at the start of the application.
Arguments
- height (int): The height of the input image in pixels.
- width (int): The width of the input image in pixels.
Returns
PeopleDetection: Instance of the PeopleDetection class.
process_frame
Process a single frame from a video sequence with RGB input. The image must have the height and width that were specified when the PeopleDetection object was created.
Arguments
- image (ArrayLike): A tensor of shape (video_height, video_width, 3) with RGB image data. It can be a NumPy, TensorFlow, PyTorch or JAX tensor.
- delta_t (float): The time in seconds between this frame and the previous video frame (1/fps). If left at the default of 0, the system clock is used to compute this value.
Returns
tuple[BoxPrediction, ...]: The resulting bounding boxes found in the frame.
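When frames come from a recording rather than a live camera, the system clock no longer reflects the real time between frames, so delta_t may need to be supplied explicitly. The following is a minimal sketch of deriving it from frame timestamps. The helper name process_timestamped_frames is hypothetical, and passing delta_t by keyword is an assumption based on the argument name documented above.

```python
from typing import Any, Iterable, Iterator, Tuple


def process_timestamped_frames(
    detector: Any, frames: Iterable[Tuple[float, Any]]
) -> Iterator[Tuple[float, Any]]:
    """Yield (timestamp, predictions) for each (timestamp_seconds, image) pair.

    `detector` is expected to expose process_frame(image, delta_t=...),
    as described for PeopleDetection above (keyword use is an assumption).
    """
    previous = None
    for timestamp, image in frames:
        if previous is None:
            # No earlier frame to compare with: pass 0, the documented
            # default, which lets the library fall back to the system clock.
            delta_t = 0.0
        else:
            # Elapsed time in seconds since the previous frame.
            delta_t = timestamp - previous
        previous = timestamp
        yield timestamp, detector.process_frame(image, delta_t=delta_t)
```

For a fixed-rate recording, the timestamps can simply be frame_index / fps, which makes every delta_t after the first frame equal to 1/fps.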
single_image
Process a single image that is not part of a video sequence. This should not be used for video data, but only for single-image evaluation and debugging. The returned box id values are not related to those returned by process_frame or by other calls to single_image.
Arguments
- image (ArrayLike): A tensor of shape (*, *, 3) with RGB image data. It can be a NumPy, TensorFlow, PyTorch or JAX tensor.
- confidence_threshold (float): Any box with a confidence value below this threshold will be filtered out. Must be between 0 and 1. A value of 0.63 is recommended for regular evaluation, but for mAP computation this can be set to 0.
Returns
tuple[BoxPrediction, ...]: The resulting bounding boxes found in the image.
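The thresholding described for confidence_threshold can also be applied after the fact, for example when boxes were collected with a threshold of 0 for mAP computation and a filtered view is wanted for display. The following is a minimal sketch. ScoredBox and filter_by_confidence are hypothetical names, and keeping boxes whose confidence equals the threshold is an assumption drawn from the wording "below this threshold will be filtered out".

```python
from typing import Iterable, NamedTuple, Tuple


class ScoredBox(NamedTuple):
    """Hypothetical stand-in: any object with a `confidence` field works."""

    confidence: float


def filter_by_confidence(boxes: Iterable, threshold: float = 0.63) -> Tuple:
    """Keep only boxes whose confidence is at least `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    # Boxes strictly below the threshold are dropped; order is preserved.
    return tuple(box for box in boxes if box.confidence >= threshold)
```

Because the function only reads the confidence attribute, it should work unchanged on the BoxPrediction tuples returned by single_image.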
BoxPrediction
class BoxPrediction(NamedTuple):
    y_min: float  # top coordinate between 0 and 1 in height dimension
    x_min: float  # left coordinate between 0 and 1 in width dimension
    y_max: float  # bottom coordinate between 0 and 1 in height dimension
    x_max: float  # right coordinate between 0 and 1 in width dimension
    confidence: float  # between 0 and 1, higher means more confident
    id: int = 0  # the tracked identifier of this box
    class_id: int = DetectionClass.CLASS_UNKNOWN  # the class of the detected object
A structure representing a single resulting bounding box. Coordinates are between 0 and 1, with the origin at the top-left of the image.
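Since the coordinates are normalized, drawing a box on the original image requires scaling by the image size. The following is a minimal sketch of that conversion. The Box class is a hypothetical stand-in that carries only the coordinate fields documented above, and to_pixels is an illustrative helper, not part of the Plumerai API.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical stand-in with the coordinate fields of BoxPrediction."""

    y_min: float
    x_min: float
    y_max: float
    x_max: float


def to_pixels(box, height: int, width: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels, clamped to the image."""

    def scale(value: float, size: int) -> int:
        # Scale the normalized value, round to the nearest pixel, and
        # clamp to the range [0, size].
        return min(max(int(round(value * size)), 0), size)

    return (
        scale(box.x_min, width),
        scale(box.y_min, height),
        scale(box.x_max, width),
        scale(box.y_max, height),
    )
```

Note that x values scale with the image width and y values with the image height, which matches the field comments in the class definition above.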
Example usage
Below is an example of using the Python API shown above.
import numpy as np
import plumerai_people_detection as ppd_api

# Settings, to be changed as needed
width = 1600  # camera image width in pixels
height = 1200  # camera image height in pixels

# Initialize the people detection algorithm
ppd = ppd_api.PeopleDetection(height, width)

# Loop over frames in a video stream
while True:
    # Some example input here, normally this is where camera data is acquired
    image = np.zeros((height, width, 3), dtype=np.uint8)

    # Process the frame
    predictions = ppd.process_frame(image)

    # Display the results to stdout
    for p in predictions:
        print(
            f"Box #{p.id} of class {p.class_id} with confidence {p.confidence:.2f} "
            f"@ (x,y) -> ({p.x_min:.2f},{p.y_min:.2f}) till ({p.x_max:.2f},{p.y_max:.2f})"
        )