Plumerai People Detection C++ API¶
This document describes the C++ API for the Plumerai People Detection software for videos on Arm Cortex-A and x86.
The C++ API consists of a single, self-documented header file. It is simple: there is a constructor that needs to be run once, and a process_frame function that needs to be executed on each input frame. Additionally, there is a single_image function that can be used to process a single image independent of a video sequence.
Please refer to the C++ API example for example code to get started.
The API is re-entrant, i.e. you can instantiate several PeopleDetection objects in different threads and use them independently. However, using the same instance from different threads at the same time is not supported.
API¶
PeopleDetection¶
PeopleDetection¶
Initializes a new people detection object.
This needs to be called only once at the start of the application.
Arguments:
- height: The height of the input image in pixels.
- width: The width of the input image in pixels.
Returns:
- Nothing.
process_frame¶
template <ImageFormat image_format = ImageFormat::PACKED_RGB888>
ErrorCodeType process_frame(const std::uint8_t *image_data,
                            std::vector<BoxPrediction> &results,
                            float delta_t = 0.f);
template <ImageFormat image_format = ImageFormat::PLANAR_YUV420>
ErrorCodeType process_frame(const std::uint8_t *image_y,
                            const std::uint8_t *image_u,
                            const std::uint8_t *image_v,
                            std::vector<BoxPrediction> &results,
                            float delta_t = 0.f);
template <ImageFormat image_format = ImageFormat::PLANAR_NV12>
ErrorCodeType process_frame(const std::uint8_t *image_y,
                            const std::uint8_t *image_uv,
                            std::vector<BoxPrediction> &results,
                            float delta_t = 0.f);
Process a single frame from a video sequence.
The first version supports RGB, RGBA, BGRA or YUYV input. The second version supports planar YUV input with 420 chroma subsampling. The third version supports NV12 input.
Make sure the image is right side up. When it is upside down it can still work but accuracy is significantly degraded.
Note that the algorithm comes with a built-in threshold (e.g. 0.6 - this differs per model): boxes with confidences lower than that value won't be produced at all by this function.
Arguments:
- image_format: A template parameter which can be ImageFormat::PACKED_RGB888, ImageFormat::PACKED_RGBA8888, ImageFormat::PACKED_BGRA8888, ImageFormat::PACKED_YUYV, ImageFormat::PLANAR_YUV420 or ImageFormat::PLANAR_NV12.
- image_data: A pointer to RGB image data (1st byte red, 3rd blue) of size height * width * 3, or RGBA or BGRA image data of size height * width * 4, or YUYV image data of size height * width * 2.
- image_y: A pointer to the Y channel, of size height * width.
- image_u: A pointer to the U channel, of size height * width / 4.
- image_v: A pointer to the V channel, of size height * width / 4.
- image_uv: A pointer to the interleaved UV channels, of size height * width / 2.
- results: The resulting bounding-boxes found in the frame.
- delta_t: The time in seconds it took between this and the previous video frame (1/fps). If left to the default of 0, then the system clock will be used to compute this value.
Returns:
- An error code of type ErrorCode. See that enum for more details.
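The buffer sizes listed in the arguments above can be checked before calling process_frame. The following is a minimal sketch, not part of the Plumerai API: the SketchFormat enum and the expected_frame_bytes helper are hypothetical names introduced here so that the example compiles without the library header. It assumes even image dimensions for the subsampled formats.

```cpp
#include <cstddef>

// Hypothetical stand-in for the library's ImageFormat enum, so that this
// sketch compiles without the Plumerai header.
enum class SketchFormat {
  PACKED_RGB888,
  PACKED_RGBA8888,
  PACKED_BGRA8888,
  PACKED_YUYV,
  PLANAR_YUV420,
  PLANAR_NV12,
};

// Total number of bytes expected for one frame, summed over all planes,
// following the sizes given in the argument list above.
inline std::size_t expected_frame_bytes(SketchFormat format, int height,
                                        int width) {
  const std::size_t pixels =
      static_cast<std::size_t>(height) * static_cast<std::size_t>(width);
  switch (format) {
    case SketchFormat::PACKED_RGB888:
      return pixels * 3;  // R, G, B
    case SketchFormat::PACKED_RGBA8888:
    case SketchFormat::PACKED_BGRA8888:
      return pixels * 4;  // three colour bytes plus alpha
    case SketchFormat::PACKED_YUYV:
      return pixels * 2;  // two bytes per pixel
    case SketchFormat::PLANAR_YUV420:
      return pixels + 2 * (pixels / 4);  // Y plane + U plane + V plane
    case SketchFormat::PLANAR_NV12:
      return pixels + pixels / 2;  // Y plane + interleaved UV plane
  }
  return 0;  // unreachable for valid enum values
}
```

Comparing this value with the size of the buffer received from the camera is a cheap way to catch a mismatch between the configured resolution and the actual input.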
single_image¶
template <ImageFormat image_format = ImageFormat::PACKED_RGB888>
ErrorCodeType single_image(const std::uint8_t *image_data,
                           std::vector<BoxPrediction> &results,
                           int height = 0, int width = 0);
template <ImageFormat image_format = ImageFormat::PLANAR_YUV420>
ErrorCodeType single_image(const std::uint8_t *image_y,
                           const std::uint8_t *image_u,
                           const std::uint8_t *image_v,
                           std::vector<BoxPrediction> &results,
                           int height = 0, int width = 0);
template <ImageFormat image_format = ImageFormat::PLANAR_NV12>
ErrorCodeType single_image(const std::uint8_t *image_y,
                           const std::uint8_t *image_uv,
                           std::vector<BoxPrediction> &results,
                           int height = 0, int width = 0);
Process a single image not part of a video sequence.
The first version supports RGB, RGBA, BGRA or YUYV input. The second version supports planar YUV input with 420 chroma subsampling. The third version supports NV12 input.
This should not be used for video data. It can be used for face enrollments from a set of images. The returned box id values are not related to those returned by process_frame or other calls to single_image.
Arguments:
- image_format: A template parameter which can be ImageFormat::PACKED_RGB888, ImageFormat::PACKED_RGBA8888, ImageFormat::PACKED_BGRA8888, ImageFormat::PACKED_YUYV, ImageFormat::PLANAR_YUV420 or ImageFormat::PLANAR_NV12.
- image_data: A pointer to RGB image data (1st byte red, 3rd blue) of size height * width * 3, or RGBA or BGRA image data of size height * width * 4, or YUYV image data of size height * width * 2.
- image_y: A pointer to the Y channel, of size height * width.
- image_u: A pointer to the U channel, of size height * width / 4.
- image_v: A pointer to the V channel, of size height * width / 4.
- image_uv: A pointer to the interleaved UV channels, of size height * width / 2.
- results: The resulting bounding-boxes found in the image.
- height: The height of the input image in pixels. If height = 0, the height set in the constructor will be used.
- width: The width of the input image in pixels. If width = 0, the width set in the constructor will be used.
Returns:
- An error code of type ErrorCode. See that enum for more details.
has_box_moved¶
Check if a box has moved significantly since its initial detection.
This function should only be used for boxes returned by the most recent call to process_frame. This function only accepts boxes of the Person, Vehicle, Animal and Package classes.
The has_moved parameter will be set to:
- false if the box has not moved since its initial detection
- true if the box has moved
If a box has moved but then stopped moving for long enough, the has_moved value will be reset to false.
This function should not be called directly after restoring from a previous state.
This function should not be called with results of single_image calls.
Arguments:
- box: A box returned by process_frame.
- has_moved: A boolean that will be set to true if the box has moved.
Returns:
- Returns ErrorCode::SUCCESS on success, or ErrorCode::INVALID_BOX on error.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
for (int t = 0; t < num_frames; ++t) {
  std::vector<BoxPrediction> boxes;
  ppd.process_frame(image, boxes, delta_t);
  for (auto& box : boxes) {
    bool has_moved = false;
    auto error_code = ppd.has_box_moved(box, has_moved);
    if (error_code == plumerai::ErrorCode::SUCCESS) {
      printf("Box has moved: %s\n", has_moved ? "yes" : "no");
    }
  }
}
reset_tracker¶
This function is only available if the library was built with tracking support.
This resets the internal tracker state and resets all tracker ids and face identifications if applicable. It is recommended to call this whenever two consecutive frames are too different from each other, such as when switching to a different camera input or when the camera abruptly moved.
Arguments:
- None.
Returns:
- Nothing.
store_state¶
Store the current state of the algorithm to a byte array.
This function can be used when processing a video in chunks, doing different chunks at different times or on different machines. The state can be restored by calling restore_state with the data returned by store_state. When the library is built with support for familiar face identification, the state includes the face library.
Constraints:
- The delta_t parameter of process_frame cannot be left at zero after restoring a previous state.
- If familiar face identification is enabled, the state can only be stored and restored when not enrolling.
Arguments:
- state: A vector to store the serialized state in.
Returns:
- Returns ErrorCode::SUCCESS on success.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
std::vector<std::uint8_t> state;
auto error_code = ppd.store_state(state);
if (error_code != plumerai::ErrorCode::SUCCESS) {
  printf("ERROR: store_state returned %s\n",
         plumerai::error_code_string(error_code));
}
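Because delta_t has to be supplied explicitly once a state has been restored, the caller needs its own source of frame timing. The following is a minimal sketch that derives delta_t from two frame timestamps in milliseconds. The helper name delta_t_seconds and the use of millisecond timestamps (for example taken from the video container) are assumptions made for illustration, not part of the Plumerai API.

```cpp
#include <cstdint>

// Converts the timestamps of two consecutive frames, given in milliseconds,
// into a time difference in seconds suitable for the delta_t argument.
// The result is negative when the timestamps are out of order, so the caller
// can detect that case before passing the value on.
inline float delta_t_seconds(std::int64_t previous_ms,
                             std::int64_t current_ms) {
  return static_cast<float>(current_ms - previous_ms) / 1000.0f;
}
```

For the first frame after restoring a state there is no previous frame in the current chunk, so the timestamp of the last frame of the previous chunk would have to be kept alongside the serialized state.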
restore_state¶
Restore the state of the algorithm from a byte array.
See store_state for more information. The user must ensure that the height and width of the current object match the height and width of the state that is being restored.
Arguments:
- state: A vector containing the serialized state.
Returns:
- Returns ErrorCode::SUCCESS on success. Returns ErrorCode::STATE_CORRUPT or ErrorCode::STATE_SETTINGS_MISMATCH on error.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
// The state as obtained by the store-state API, e.g. loaded from memory
std::vector<std::uint8_t> state = ...;
auto error_code = ppd.restore_state(state);
if (error_code != plumerai::ErrorCode::SUCCESS) {
  printf("ERROR: restore_state returned %s\n",
         plumerai::error_code_string(error_code));
}
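When chunks are processed at different times or on different machines, the serialized state has to be carried between them somehow. One option is a plain binary file, sketched below with the standard library only. The helper names write_state_file and read_state_file are hypothetical; the byte vector is treated as opaque, and nothing here depends on the internal layout of the state.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Writes the opaque state bytes to a binary file, replacing any existing
// file. Returns false if the file could not be opened or written.
inline bool write_state_file(const std::string &path,
                             const std::vector<std::uint8_t> &state) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out) return false;
  out.write(reinterpret_cast<const char *>(state.data()),
            static_cast<std::streamsize>(state.size()));
  out.flush();
  return static_cast<bool>(out);
}

// Reads a whole binary file into `state`. Returns false and leaves `state`
// untouched if the file could not be opened.
inline bool read_state_file(const std::string &path,
                            std::vector<std::uint8_t> &state) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  state.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
  return true;
}
```

The vector filled by read_state_file can then be passed to restore_state, which reports a corrupt or mismatching state through its error code.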
get_motion_detection_grid_height¶
Retrieves the height of the motion detection grid.
For more information, see the docs under get_motion_detection_grid.
Arguments:
- None.
Returns:
- The height of the motion detection grid.
get_motion_detection_grid_width¶
Retrieves the width of the motion detection grid.
For more information, see the docs under get_motion_detection_grid.
Arguments:
- None.
Returns:
- The width of the motion detection grid.
get_motion_detection_grid¶
Retrieves the amount of motion found in each grid cell of the frame.
As a by-product of people detection, motion detection is performed. For specific use-cases, it might be useful to access this raw motion detection information as well. This function provides access to it. However, it can only be called while Plumerai People Detection is also running, and the grid is not available during the first few frames.
The result of this function is a 2D grid of type float, with dimensions that can be retrieved with get_motion_detection_grid_height and get_motion_detection_grid_width. The values in each grid cell are floats between 0.0 and 1.0, and denote how much motion was detected in that grid cell. A higher value indicates more motion.
Arguments:
- motion_detection_grid: A float vector in which the result will be stored. It will be resized by this function to the correct size.
Returns:
- An error code of type ErrorCode. It will return SUCCESS if all went fine, or MOTION_GRID_NOT_YET_READY if this function is called too soon after initialization of the people detection object. It needs to process at least a few frames before the motion grid is valid.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
for (int t = 0; t < num_frames; ++t) {
  ppd.process_frame(...);
  std::vector<float> motion_grid(0);
  auto error_code = ppd.get_motion_detection_grid(motion_grid);
  if (error_code == plumerai::ErrorCode::SUCCESS) {
    // Motion-detection grid `motion_grid` is now available
  }
}
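Once the grid has been retrieved, a typical next step is to look up individual cells. The sketch below finds the cell with the most motion. It assumes that the float vector stores the grid row by row (row-major), which the text above does not state explicitly, so this should be verified against the header. The GridCell struct and strongest_motion_cell helper are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Position and value of one grid cell (hypothetical helper type).
struct GridCell {
  int row;
  int col;
  float motion;
};

// Returns the cell with the highest motion value, assuming a row-major
// layout of size grid_height * grid_width. Returns row = col = -1 if the
// dimensions do not match the vector size.
inline GridCell strongest_motion_cell(const std::vector<float> &grid,
                                      int grid_height, int grid_width) {
  GridCell best{-1, -1, 0.0f};
  if (grid_height <= 0 || grid_width <= 0) return best;
  const std::size_t expected = static_cast<std::size_t>(grid_height) *
                               static_cast<std::size_t>(grid_width);
  if (grid.size() != expected) return best;
  for (int row = 0; row < grid_height; ++row) {
    for (int col = 0; col < grid_width; ++col) {
      const std::size_t index =
          static_cast<std::size_t>(row) * grid_width + col;
      if (best.row < 0 || grid[index] > best.motion) {
        best = GridCell{row, col, grid[index]};
      }
    }
  }
  return best;
}
```

The height and width arguments would come from get_motion_detection_grid_height and get_motion_detection_grid_width.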
add_detection_zone¶
ErrorCodeType add_detection_zone(
    const std::vector<std::tuple<float, float>> &coordinates,
    const std::vector<DetectionClass> &classes, int &zone_id);
Specify a detection zone polygon for a group of classes.
The zone can be used to verify whether a bounding-box is inside or outside of it using the is_box_in_detection_zone function. It can also be used internally by the Plumerai library to improve detection quality. A single detection zone can be used for one or more classes.
Arguments:
- coordinates: A vector of (x, y) coordinates specifying the polygon of the detection zone in normalized coordinates, between 0 and 1. The code assumes that the first coordinate is also the final coordinate of the polygon, so the closing point should not be repeated by the user. The polygon must be simple (not complex): it can't have holes or self-intersections. It may be either concave or convex.
- classes: A vector of classes for which this detection zone is valid.
- zone_id: The created detection zone's unique ID is returned by this function for use in is_box_in_detection_zone.
Returns:
- An error code of type ErrorCode. It will return SUCCESS if all went fine, or otherwise INVALID_ZONE_CLASS or INVALID_ZONE_GEOMETRY.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
std::vector<std::tuple<float, float>> coordinates = {
    {0.0f, 0.0f}, {0.5f, 0.0f}, {0.5f, 1.0f}, {0.0f, 1.0f}};
int zone_id = -1;
auto error_code = ppd.add_detection_zone(
    coordinates, {DetectionClass::CLASS_PERSON}, zone_id);
if (error_code != plumerai::ErrorCode::SUCCESS) { return; }  // ERROR
// the 'zone_id' is valid and can be used in 'is_box_in_detection_zone'
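Zones are often drawn in pixel coordinates, for example in a configuration tool, while add_detection_zone expects normalized coordinates without a repeated closing point. The sketch below converts a pixel polygon accordingly. The helper name normalize_polygon and the (x, y) pixel pair input are assumptions for illustration; the function does not clamp values and does not check that the polygon is simple.

```cpp
#include <tuple>
#include <utility>
#include <vector>

// Converts polygon corners given as (x, y) pixel positions into normalized
// (x, y) coordinates. If the last point repeats the first one, it is dropped,
// since the closing point is implied. Returns an empty vector for invalid
// image dimensions.
inline std::vector<std::tuple<float, float>> normalize_polygon(
    const std::vector<std::pair<int, int>> &pixel_points, int image_width,
    int image_height) {
  std::vector<std::tuple<float, float>> normalized;
  if (image_width <= 0 || image_height <= 0) return normalized;
  normalized.reserve(pixel_points.size());
  for (const auto &point : pixel_points) {
    normalized.emplace_back(
        static_cast<float>(point.first) / static_cast<float>(image_width),
        static_cast<float>(point.second) / static_cast<float>(image_height));
  }
  if (normalized.size() > 1 && normalized.front() == normalized.back()) {
    normalized.pop_back();  // remove the explicit closing point
  }
  return normalized;
}
```

The returned vector has the same element type as the coordinates argument of add_detection_zone shown above.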
is_box_in_detection_zone¶
ErrorCodeType is_box_in_detection_zone(int zone_id, const BoxPrediction &box,
                                       bool &is_in_zone) const;
Determines whether a box prediction is within a detection zone.
Arguments:
- zone_id: The detection zone ID as returned by 'add_detection_zone'.
- box: A bounding-box prediction as returned by process_frame.
- is_in_zone: The resulting boolean that indicates whether the given box is completely or partially inside the given detection zone in the current frame.
Returns:
- An error code of type ErrorCode. It will return SUCCESS if all went fine, or otherwise INVALID_ZONE_ID, INVALID_ZONE_CLASS or INVALID_ZONE_GEOMETRY.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
int zone_id = -1;
ppd.add_detection_zone(coordinates, classes, zone_id);
(...)
auto predictions = std::vector<BoxPrediction>();
ppd.process_frame(image_data, predictions, delta_t);
for (const auto &p : predictions) {
  bool is_in_zone = false;
  auto error_code = ppd.is_box_in_detection_zone(zone_id, p, is_in_zone);
  if (error_code != plumerai::ErrorCode::SUCCESS) { return; }  // ERROR
  if (is_in_zone) {
    printf("Box with ID %u is in zone %d\n", p.id, zone_id);
  }
}
debug_next_frame¶
Enable debug mode for the next frame.
The next time process_frame is called, this will dump the input image as well as internal data and final results to a file. This file can then be shared with Plumerai support for further analysis. The file will be overwritten if it already exists, so to debug multiple frames, distinct filenames have to be used in successive calls to this function. Warning: these files contain uncompressed image data and can become large.
Arguments:
- output_file_name: A filename to write the data to.
Returns:
- Returns ErrorCode::SUCCESS if all went well. It might return ErrorCode::INTERNAL_ERROR if this method is called twice without calling process_frame, or if the file could not be opened for writing.
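Since an existing file is overwritten, debugging several frames requires a different filename per call. One way to get distinct names is to embed the frame index, as in the sketch below. The helper name debug_file_name and the .bin extension are assumptions; the text above does not prescribe a naming scheme or file extension.

```cpp
#include <cstdio>
#include <string>

// Builds a filename such as "ppd_debug_000042.bin" from a prefix and a frame
// index, so that successive debug dumps do not overwrite each other.
inline std::string debug_file_name(const std::string &prefix,
                                   unsigned int frame_index) {
  char suffix[32];
  std::snprintf(suffix, sizeof(suffix), "_%06u.bin", frame_index);
  return prefix + suffix;
}
```

The resulting string would be passed as the output_file_name argument before the process_frame call for the frame of interest.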
BoxPrediction¶
typedef enum {
  CLASS_UNKNOWN = 0,
  CLASS_PERSON = 1,
  CLASS_HEAD = 2,
  CLASS_FACE = 3,
  CLASS_MAX_ENUM = 3,
} DetectionClass;
typedef struct BoxPrediction {
  float y_min;  // top coordinate between 0 and 1 in height dimension
  float x_min;  // left coordinate between 0 and 1 in width dimension
  float y_max;  // bottom coordinate between 0 and 1 in height dimension
  float x_max;  // right coordinate between 0 and 1 in width dimension
  float confidence;  // between 0 and 1, higher means more confident
  unsigned int id;  // the tracked identifier of this box
  DetectionClass class_id;  // the class of the detected object
} BoxPrediction;
A structure representing a single resulting bounding box. Coordinates are between 0 and 1, the origin is at the top-left. Confidence values lie between 0 and 1. Note that the algorithm comes with a built-in threshold (e.g. 0.6 - this differs per model): boxes with confidences lower than that value won't be produced at all by the Plumerai People Detection functions.
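Because the coordinates are normalized with the origin at the top-left, drawing or cropping requires scaling them back to pixels. The following is a minimal sketch of that conversion. NormalizedBox is a hypothetical stand-in that mirrors only the four coordinate fields of BoxPrediction, so that the example compiles without the library header, and PixelRect and to_pixel_rect are likewise names introduced here.

```cpp
#include <algorithm>
#include <cmath>

// Stand-in for the coordinate fields of BoxPrediction (same field order).
struct NormalizedBox {
  float y_min;
  float x_min;
  float y_max;
  float x_max;
};

// Box edges in pixels; right and bottom are the scaled maximum coordinates.
struct PixelRect {
  int left;
  int top;
  int right;
  int bottom;
};

// Scales normalized coordinates to pixel positions, clamping to the image
// and rounding to the nearest pixel.
inline PixelRect to_pixel_rect(const NormalizedBox &box, int image_height,
                               int image_width) {
  auto scale = [](float value, int extent) -> int {
    const float clamped = std::min(1.0f, std::max(0.0f, value));
    return static_cast<int>(std::lround(clamped * static_cast<float>(extent)));
  };
  return PixelRect{scale(box.x_min, image_width),
                   scale(box.y_min, image_height),
                   scale(box.x_max, image_width),
                   scale(box.y_max, image_height)};
}
```

Note that x values scale with the image width and y values with the image height, which is easy to mix up given that the struct lists y before x.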
ErrorCode¶
typedef enum {
  SUCCESS = 0,
  // Should not occur, contact Plumerai if this happens
  INTERNAL_ERROR = -1,
  // The `delta_t` parameter should be >= 0
  INVALID_DELTA_T = -2,
  // The `STATE_` error codes are only returned by `store_state` and
  // `restore_state`. See those functions for more details.
  // The state can not be (re)stored while enrolling
  STATE_WHILE_ENROLLING = -3,
  // The state could not be restored
  STATE_CORRUPT = -4,
  // The state was serialized with a different height/width than the current
  // object, or uses different familiar face identification settings.
  STATE_SETTINGS_MISMATCH = -5,
  // This error code is only returned by `get_motion_detection_grid`.
  // See that function for more details.
  MOTION_GRID_NOT_YET_READY = -6,
  // This is returned by `has_box_moved` if the argument is not a valid box
  // returned by `process_frame`.
  INVALID_BOX = -7,
  // Returned by `add_detection_zone` or `is_box_in_detection_zone` when the
  // provided zone coordinates or zone ID have an invalid geometry, such as too
  // few or too many points.
  INVALID_ZONE_GEOMETRY = -8,
  // Returned by `is_box_in_detection_zone` if the given zone ID is not a valid
  // zone ID as created by `add_detection_zone`.
  INVALID_ZONE_ID = -9,
  // Returned by `is_box_in_detection_zone` if the given box does not belong to
  // one of the configured classes for the given detection zone, or returned by
  // `add_detection_zone` if none or invalid classes are specified.
  INVALID_ZONE_CLASS = -10,
} ErrorCode;
error_code_string¶
Returns a string representation of an error code.
Arguments:
- error_code: An error code of type ErrorCodeType.
Returns:
- A string representation of the error code, of type const char*.
Example:
auto ppd = plumerai::PeopleDetection(height, width);
auto error_code = ppd.process_frame(...);
if (error_code != plumerai::ErrorCode::SUCCESS) {
  printf("ERROR: %s\n", plumerai::error_code_string(error_code));
}
Example usage¶
Below is an example of using the C++ API shown above.
#include <cstdint>
#include <cstdio>
#include <vector>
#include "plumerai/people_detection.h"
int main() {
  // Settings, to be changed as needed
  constexpr int width = 1600;   // camera image width in pixels
  constexpr int height = 1200;  // camera image height in pixels
  constexpr auto image_format = plumerai::ImageFormat::PACKED_RGB888;
  // Initialize the people detection algorithm
  auto ppd = plumerai::PeopleDetection(height, width);
  // Loop over frames in a video stream (example: 10 frames)
  for (int t = 0; t < 10; ++t) {
    // Some example input here, normally this is where camera data is acquired
    auto image = std::vector<std::uint8_t>(height * width * 3);  // 3 for RGB
    // The time between two video frames in seconds
    // In this example we assume a constant frame rate of 30 fps, but variable
    // rates are supported.
    const float delta_t = 1.f / 30.f;
    // Process the frame
    std::vector<BoxPrediction> predictions(0);
    const auto error_code =
        ppd.process_frame<image_format>(image.data(), predictions, delta_t);
    if (error_code != plumerai::ErrorCode::SUCCESS) {
      printf("Error: %s\n", plumerai::error_code_string(error_code));
      return 1;
    }
    // Display the results to stdout
    for (const auto &p : predictions) {
      printf(
          "Box #%u of class %d with confidence %.2f @ (x,y) -> (%.2f,%.2f) "
          "till (%.2f,%.2f)\n",
          p.id, static_cast<int>(p.class_id), p.confidence, p.x_min, p.y_min,
          p.x_max, p.y_max);
    }
  }
  return 0;
}
Upgrade guide¶
From version 1.14 to 1.15¶
The confidence_threshold argument has been removed from the single_image function. The return type of debug_next_frame changed from bool to ErrorCodeType.
From version 1.13 to 1.14¶
In version 1.14 the API of process_frame changed compared to earlier versions: the return type is now an error code, and the resulting boxes are now returned via a reference argument.
If your code looked like this before:
Then it should be updated as follows: