Plumerai People Detection for Cortex-A C++ API¶
This document describes the C++ API for Plumerai’s people detection software for videos on Arm Cortex-A.
The API¶
The C++ API consists of a single, self-documenting header file. It is simple: there is a constructor that needs to be run once, and a process_frame function that needs to be called on each input frame. Additionally, there is a single_image function that can be used to process a single image independently of a video sequence.
The API is re-entrant, i.e. you can instantiate several PeopleDetection objects in different threads and use them independently. However, using the same instance from different threads at the same time is not supported.
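One way to follow this threading rule is to give every worker thread its own detector object. The sketch below uses a hypothetical stand-in class, `StubDetector`, so that it compiles without the Plumerai library; in a real application each thread would construct its own `plumerai::PeopleDetection` instead. The names `StubDetector`, `run_camera_thread`, and `run_all_cameras` are illustrative assumptions and not part of the API.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for plumerai::PeopleDetection, used only to keep this
// sketch self-contained. It counts frames instead of detecting people.
class StubDetector {
 public:
  StubDetector(int height, int width) {
    (void)height;  // unused in the stub
    (void)width;   // unused in the stub
  }
  int process_frame(const std::uint8_t *image_data) {
    (void)image_data;
    return ++frames_seen_;
  }

 private:
  int frames_seen_ = 0;
};

// Each thread constructs and owns its detector, so no instance is shared.
inline void run_camera_thread(int height, int width, int num_frames,
                              int *frames_processed) {
  StubDetector detector(height, width);  // one instance per thread
  std::vector<std::uint8_t> frame(static_cast<std::size_t>(height) *
                                  static_cast<std::size_t>(width) * 3);
  int count = 0;
  for (int i = 0; i < num_frames; ++i) {
    count = detector.process_frame(frame.data());
  }
  *frames_processed = count;
}

// Starts one thread per camera and returns how many frames each one processed.
inline std::vector<int> run_all_cameras(int num_cameras, int height, int width,
                                        int num_frames) {
  std::vector<int> results(static_cast<std::size_t>(num_cameras), 0);
  std::vector<std::thread> threads;
  for (int c = 0; c < num_cameras; ++c) {
    threads.emplace_back(run_camera_thread, height, width, num_frames,
                         &results[static_cast<std::size_t>(c)]);
  }
  for (auto &t : threads) t.join();
  return results;
}
```

Sharing one detector between threads would instead require external locking around every call, which this sketch avoids by design.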
PeopleDetection::PeopleDetection¶
PeopleDetection(int height, int width);
Initializes a new people detection object. This needs to be called only once at the start of the application.
Arguments
- height (int): The height of the input image in pixels.
- width (int): The width of the input image in pixels.
Returns
Nothing.
PeopleDetection::process_frame (RGB, YUYV)¶
template <ImageFormat image_format>
std::vector<BoxPrediction> process_frame(const std::uint8_t *image_data,
                                         float delta_t = 0.f)
Arguments
- image_format (ImageFormat): A template parameter which can be ImageFormat::PACKED_RGB888 or ImageFormat::PACKED_YUYV. For ImageFormat::PLANAR_YUV420 see the function below.
- image_data (const std::uint8_t *): A pointer to RGB image data (1st byte red, 3rd blue) of size height * width * 3 or YUYV image data of size height * width * 2.
- delta_t (float): The time in seconds between this and the previous video frame (1/fps). If set to 0, the system clock is used to compute this value.
Returns
- std::vector<BoxPrediction>: The resulting bounding boxes found in the frame.
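Because process_frame receives only a raw pointer, it cannot verify the buffer length itself, so a size check before the call can catch mismatched formats early. The following is a minimal sketch of the size rules stated above (3 bytes per pixel for RGB888, 2 bytes per pixel for YUYV). The enum `PackedFormat` and the helpers `bytes_per_pixel`, `packed_frame_bytes`, and `has_expected_size` are hypothetical names introduced for illustration; real code would use `plumerai::ImageFormat` from the header.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local mirror of the two packed formats so the sketch compiles on its own.
// Hypothetical; real code would use plumerai::ImageFormat.
enum class PackedFormat { RGB888, YUYV };

// RGB888 stores 3 bytes per pixel. YUYV stores 4 bytes per 2 pixels
// (Y0 U Y1 V), which averages to 2 bytes per pixel.
constexpr std::size_t bytes_per_pixel(PackedFormat format) {
  return format == PackedFormat::RGB888 ? 3 : 2;
}

// Expected buffer size in bytes for one frame of the given packed format.
constexpr std::size_t packed_frame_bytes(PackedFormat format, int height,
                                         int width) {
  return static_cast<std::size_t>(height) * static_cast<std::size_t>(width) *
         bytes_per_pixel(format);
}

// Returns true if the buffer holds exactly one frame of the given format.
inline bool has_expected_size(const std::vector<std::uint8_t> &buffer,
                              PackedFormat format, int height, int width) {
  return buffer.size() == packed_frame_bytes(format, height, width);
}
```

For a 1600 x 1200 camera this gives 5,760,000 bytes for RGB888 and 3,840,000 bytes for YUYV.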
PeopleDetection::process_frame (YUV420)¶
template <ImageFormat image_format>
std::vector<BoxPrediction> process_frame(const std::uint8_t *image_y,
                                         const std::uint8_t *image_u,
                                         const std::uint8_t *image_v,
                                         float delta_t = 0.f)
Arguments
- image_format (ImageFormat): A template parameter which has to be set to ImageFormat::PLANAR_YUV420. See the function above for the other formats.
- image_y (const std::uint8_t *): A pointer to the Y channel, of size height * width.
- image_u (const std::uint8_t *): A pointer to the U channel, of size height * width / 4.
- image_v (const std::uint8_t *): A pointer to the V channel, of size height * width / 4.
- delta_t (float): The time in seconds between this and the previous video frame (1/fps). If set to 0, the system clock is used to compute this value.
Returns
- std::vector<BoxPrediction>: The resulting bounding boxes found in the frame.
PeopleDetection::single_image¶
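std::vector<BoxPrediction> single_image(const std::uint8_t *image_data,
                                        float confidence_threshold,
                                        int height = 0, int width = 0);
Processes a single image that is not part of a video sequence. This should not be used for video data, but only for single-image evaluation and debugging. The returned box id values are not related to those returned by process_frame or by other calls to single_image.
Arguments
- image_data (const std::uint8_t *): A pointer to RGB image data (1st byte red, 3rd blue) of size height * width * 3.
- confidence_threshold (float): Any box with a confidence value below this threshold will be filtered out. Range between 0 and 1. A value of 0.63 is recommended for regular evaluation, but for mAP computation this can be set to 0.
- height (int): The height of the input image in pixels. If height = 0, the height set in the constructor is used.
- width (int): The width of the input image in pixels. If width = 0, the width set in the constructor is used.
Returns
- std::vector<BoxPrediction>: The resulting bounding boxes found in the image.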
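To compare several thresholds on the same image, one option is to call single_image once with a threshold of 0 and apply stricter thresholds afterwards. The sketch below mirrors the documented rule that boxes with a confidence below the threshold are removed. It is an illustration only: whether filtering after the call matches the library's internal filtering exactly is an assumption. `ScoredBox` is a hypothetical stand-in type; the template works with any type that has a `confidence` member, such as BoxPrediction.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for any box type with a `confidence` member.
struct ScoredBox {
  float confidence;
};

// Returns the boxes whose confidence is at least `confidence_threshold`,
// preserving their original order.
template <typename Box>
std::vector<Box> keep_confident(std::vector<Box> boxes,
                                float confidence_threshold) {
  boxes.erase(std::remove_if(boxes.begin(), boxes.end(),
                             [confidence_threshold](const Box &box) {
                               return box.confidence < confidence_threshold;
                             }),
              boxes.end());
  return boxes;
}
```

A box whose confidence equals the threshold is kept, since only values strictly below it are dropped.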
BoxPrediction¶
struct BoxPrediction {
  float y_min;            // top coordinate between 0 and 1 in height dimension
  float x_min;            // left coordinate between 0 and 1 in width dimension
  float y_max;            // bottom coordinate between 0 and 1 in height dimension
  float x_max;            // right coordinate between 0 and 1 in width dimension
  float confidence;       // between 0 and 1, higher means more confident
  unsigned int id;        // the tracked identifier of this box
  unsigned int class_id;  // the class of the detected object
};
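Since the box coordinates are normalized to the range 0 to 1, drawing a box on the image requires scaling by the image size. The sketch below shows one way to do that. `PixelBox`, `to_pixel`, and `to_pixel_box` are hypothetical names, the struct is copied from the definition above so the example is self-contained, and rounding to the nearest pixel edge is an assumed convention rather than something the API prescribes.

```cpp
#include <algorithm>
#include <cmath>

// Copy of the struct documented above, repeated to keep the sketch standalone.
struct BoxPrediction {
  float y_min;
  float x_min;
  float y_max;
  float x_max;
  float confidence;
  unsigned int id;
  unsigned int class_id;
};

// Box corners in pixel units (hypothetical helper type).
struct PixelBox {
  int left;
  int top;
  int right;
  int bottom;
};

// Clamps a normalized coordinate to [0, 1] and scales it to pixels.
inline int to_pixel(float normalized, int extent) {
  const float clamped = std::min(1.0f, std::max(0.0f, normalized));
  return static_cast<int>(std::lround(clamped * static_cast<float>(extent)));
}

// x coordinates scale with the width, y coordinates with the height.
inline PixelBox to_pixel_box(const BoxPrediction &box, int height, int width) {
  return PixelBox{to_pixel(box.x_min, width), to_pixel(box.y_min, height),
                  to_pixel(box.x_max, width), to_pixel(box.y_max, height)};
}
```

Note that the struct lists y before x, so positional initialization follows the order y_min, x_min, y_max, x_max.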
Full header¶
#pragma once

#include <string>
#include <vector>

#include "box_prediction.h"

namespace plumerai {

// Supported input formats
// - Packed/interleaved:
//   - RGB888
//   - YUYV, also known as YUY2, which has 4:2:2 subsampling
// - Planar:
//   - YUV420
enum class ImageFormat { PACKED_RGB888, PACKED_YUYV, PLANAR_YUV420 };

class PeopleDetection {
 public:
  // Initializes a new people detection object. This needs to be called only
  // once at the start of the application.
  //
  // @param height The height of the input image in pixels.
  // @param width The width of the input image in pixels.
  PeopleDetection(int height, int width);

  // Destructor, called automatically when the object goes out of scope
  ~PeopleDetection();

  // Process a single frame from a video sequence.
  // This version supports RGB or YUYV input. See below for the YUV420 version.
  //
  // @param image_format Can be either ImageFormat::PACKED_RGB888 or
  //  ImageFormat::PACKED_YUYV. See below for PLANAR_YUV420.
  // @param image_data A pointer to RGB image data (1st byte red, 3rd blue)
  //  of size height * width * 3 or YUYV image data of size height * width * 2.
  // @param delta_t The time in seconds it took between this and the previous
  //  video frame (1/fps). If left to the default of 0, then the system clock
  //  will be used to compute this value.
  // @return the resulting bounding-boxes found in the frame.
  template <ImageFormat image_format = ImageFormat::PACKED_RGB888>
  std::vector<BoxPrediction> process_frame(const std::uint8_t *image_data,
                                           float delta_t = 0.f);

  // Process a single frame from a video sequence with planar YUV input and 420
  // chroma subsampling. See above for the RGB and YUYV versions.
  //
  // @param image_format Has to be ImageFormat::PLANAR_YUV420. See above for
  //  other formats.
  // @param image_y A pointer to the Y channel, of size height * width.
  // @param image_u A pointer to the U channel, of size height * width / 4.
  // @param image_v A pointer to the V channel, of size height * width / 4.
  // @param delta_t The time in seconds it took between this and the previous
  //  video frame (1/fps). If left to the default of 0, then the system clock
  //  will be used to compute this value.
  // @return the resulting bounding-boxes found in the frame.
  template <ImageFormat image_format = ImageFormat::PLANAR_YUV420>
  std::vector<BoxPrediction> process_frame(const std::uint8_t *image_y,
                                           const std::uint8_t *image_u,
                                           const std::uint8_t *image_v,
                                           float delta_t = 0.f);

  // Process a single image not part of a video sequence. This should not be
  // used for video data, but only for single image evaluation and debugging.
  // The returned box id values are not related to those returned by
  // `process_frame` or other calls to `single_image`.
  //
  // @param image_data A pointer to RGB image data (1st byte red, 3rd blue)
  //  of size height * width * 3.
  // @param confidence_threshold Any box with a confidence value below this
  //  threshold will be filtered out. Range between 0 and 1. A value of 0.63
  //  is recommended for regular evaluation, but for mAP computation this can
  //  be set to 0.
  // @param height The height of the input image in pixels. If `height = 0` the
  //  height set in the constructor will be used.
  // @param width The width of the input image in pixels. If `width = 0` the
  //  width set in the constructor will be used.
  // @return the resulting bounding-boxes found in the image.
  std::vector<BoxPrediction> single_image(const std::uint8_t *image_data,
                                          float confidence_threshold,
                                          int height = 0, int width = 0);
};

}  // namespace plumerai
Example usage¶
Below is an example of using the C++ API shown above.
#include <cstdint>  // std::uint8_t
#include <cstdio>   // printf
#include <vector>

#include "plumerai/people_detection.h"

int main() {
  // Settings, to be changed as needed
  constexpr int width = 1600;   // camera image width in pixels
  constexpr int height = 1200;  // camera image height in pixels
  constexpr auto image_format = plumerai::ImageFormat::PACKED_RGB888;

  // Initialize the people detection algorithm
  auto ppd = plumerai::PeopleDetection(height, width);

  // Loop over frames in a video stream
  while (true) {
    // Some example input here, normally this is where camera data is acquired
    auto image = std::vector<std::uint8_t>(height * width * 3);  // 3 for RGB

    // Process the frame
    auto predictions = ppd.process_frame<image_format>(image.data());

    // Display the results to stdout (id and class_id are unsigned, hence %u)
    for (const auto &p : predictions) {
      printf(
          "Box #%u of class %u with confidence %.2f @ (x,y) -> (%.2f,%.2f) "
          "till (%.2f,%.2f)\n",
          p.id, p.class_id, p.confidence, p.x_min, p.y_min, p.x_max, p.y_max);
    }
  }
  return 0;
}
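The example above leaves delta_t at its default of 0, so the system clock is used. When frames are processed later than they were captured, for instance when reading from a file, it may be preferable to pass the elapsed time explicitly. The following sketch computes delta_t from per-frame timestamps. `FrameTimer` is a hypothetical helper, and the use of `std::chrono::steady_clock` as the timestamp source is an assumption.

```cpp
#include <chrono>

// Tracks the time between consecutive frames (hypothetical helper).
class FrameTimer {
 public:
  using Clock = std::chrono::steady_clock;

  // Returns the seconds elapsed since the previous call. The first call
  // returns 0, which per the documentation above selects the system clock.
  float tick(Clock::time_point now) {
    float delta_t = 0.f;
    if (has_previous_) {
      delta_t = std::chrono::duration<float>(now - previous_).count();
    }
    previous_ = now;
    has_previous_ = true;
    return delta_t;
  }

 private:
  Clock::time_point previous_{};
  bool has_previous_ = false;
};
```

Inside the frame loop, the value returned by `timer.tick(FrameTimer::Clock::now())`, or by `tick` applied to a capture timestamp, would be passed as the second argument of process_frame.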