Plumerai Video Intelligence C++ API¶
This document describes the C++ API of the Plumerai Video Intelligence library.
The C++ API header files are self-documented. The main entry point is the plumerai::VideoIntelligence class, which provides access to all functionality of the Plumerai software. It has a process_frame function that needs to be executed on each input frame. The various intelligence features, such as object detection and familiar face identification, are available through the different interfaces listed here.
Please refer to the minimal examples for example code to get started.
The API is re-entrant, i.e. you can instantiate several VideoIntelligence objects in different threads and use them independently. However, using the same instance from different threads at the same time is not supported.
VideoIntelligence¶
VideoIntelligence¶
Initializes a new Video Intelligence object.
This version of the constructor uses dynamic memory allocation. This is the default and recommended way to use the VideoIntelligence object.
This needs to be called only once at the start of the application.
Arguments:
- height: The height of the input image in pixels.
- width: The width of the input image in pixels.
VideoIntelligence¶
Initializes a new Video Intelligence object.
This version of the constructor uses a user-provided memory buffer and is meant for advanced use-cases where dynamic memory allocation is not supported or not desired.
This needs to be called only once at the start of the application.
The VideoIntelligence object will not take ownership of the memory buffer. The memory buffer must be at least as big as the size returned by get_required_memory_size and must stay valid for the lifetime of the VideoIntelligence object.
If the size of the memory buffer is not big enough, the resulting object will be invalid, and all functions will return NOT_ENOUGH_MEMORY.
Arguments:
- height: The height of the input image in pixels.
- width: The width of the input image in pixels.
- buffer: A pointer to the memory buffer, must stay valid for the lifetime of the VideoIntelligence object.
- buffer_size: The size of the memory buffer. The required size can be obtained through get_required_memory_size.
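Putting these pieces together, construction with a caller-owned buffer could look like the following sketch. It assumes get_required_memory_size can be called before construction with the same height and width; check the header for the exact signature.

```cpp
#include <cstdlib>

#include "plumerai/video_intelligence.h"

int main() {
  const int height = 720;
  const int width = 1280;

  // Query the required size, then provide a buffer of at least that size.
  // The buffer is owned by the caller and must outlive the object.
  const auto buffer_size =
      plumerai::VideoIntelligence::get_required_memory_size(height, width);
  void* buffer = std::malloc(buffer_size);

  {
    auto pvi = plumerai::VideoIntelligence(height, width, buffer, buffer_size);
    // ... process frames here ...
  }  // `pvi` is destroyed before its buffer is released

  std::free(buffer);
  return 0;
}
```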
process_frame¶
template <ImageFormat image_format>
ErrorCode process_frame(const ImagePointer<image_format> image_data,
float delta_t = 0.f);
Process a single frame from a video sequence.
Make sure the image is right side up: an upside-down image may still work, but accuracy is significantly degraded.
Supported input formats:
- Packed/interleaved data-types:
- PACKED_RGB888: 8-bit red, green, and blue. Red first.
- Example in memory: R0 G0 B0 R1 G1 B1 ...
- PACKED_RGBA8888: 8-bit red, green, blue, and alpha. Red first. Also known as ABGR32.
- Example in memory: R0 G0 B0 A0 R1 G1 B1 A1 ...
- PACKED_BGRA8888: 8-bit blue, green, red, and alpha. Blue first. Also known as ARGB32.
- Example in memory: B0 G0 R0 A0 B1 G1 R1 A1 ...
- PACKED_RGB565: 5-bit red, 6-bit green, and 5-bit blue. Blue and the low 3 bits of green in the first byte, the high 3 bits of green and red in the second byte.
- Example in memory: BG0 GR0 BG1 GR1 ...
- PACKED_YUYV: Also known as YUY2. 8-bit luma (Y) and chroma (U, V) with 4:2:2 subsampling.
- Example in memory: Y0 U01 Y1 V01 Y2 U23 Y3 V23 ...
- Planar formats:
- PLANAR_RGB888: 8-bit red, green, and blue. Red first. We assume the R, G, and B data is consecutive in memory without padding in between the three channels.
- Example in memory: R0 R1 R2 ... G0 G1 G2 ... B0 B1 B2 ...
- PLANAR_YUV420: 8-bit luma (Y) and chroma (U, V) with 4:2:0 subsampling. The Y, U, and V planes can be in 3 different memory locations.
- Example in memory: Y0 Y1 Y2 Y3 ..., U01 U23 ..., V01 V23 ...
- PLANAR_NV12: 8-bit luma (Y) and interleaved chroma (UV) with 4:2:0 subsampling. The Y and UV planes can be in 2 different memory locations.
- Example in memory: Y0 Y1 Y2 Y3 ..., U01 V01 U23 V23 ...
Note that not all formats are available on every platform.
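To make the layouts above concrete, here is a small self-contained sketch, independent of the Plumerai library, that packs a PACKED_RGB565 pixel and computes the buffer size of a PLANAR_YUV420 frame. The function names are illustrative, not part of the API.

```cpp
#include <cstddef>
#include <cstdint>

// PACKED_RGB565: blue and the low 3 bits of green in the first byte,
// the high 3 bits of green and red in the second byte (little-endian).
void pack_rgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b,
                 std::uint8_t out[2]) {
  const std::uint16_t pixel = static_cast<std::uint16_t>(
      ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
  out[0] = pixel & 0xFF;         // bit layout: GGGBBBBB
  out[1] = (pixel >> 8) & 0xFF;  // bit layout: RRRRRGGG
}

// PLANAR_YUV420: a full-resolution Y plane plus U and V planes that are
// subsampled 2x both horizontally and vertically (4:2:0).
std::size_t yuv420_frame_size(std::size_t width, std::size_t height) {
  const std::size_t y_plane = width * height;
  const std::size_t chroma_plane = (width / 2) * (height / 2);
  return y_plane + 2 * chroma_plane;  // Y + U + V
}
```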
Arguments:
- image_format: A template parameter which must be one of the ImageFormat enum values.
- image_data: A pointer to the image data in the form of an ImagePointer helper struct.
- delta_t: The time in seconds between this and the previous video frame (1/fps). If set to 0, the system clock will be used to compute this value.
Returns:
- An error code of type ErrorCode. See that enum for more details.
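The call sequence can be sketched as follows, in the style of the examples elsewhere in this document. Here image_format, image_data, and has_next_frame are placeholders for your capture pipeline.

```cpp
auto pvi = plumerai::VideoIntelligence(height, width);
while (has_next_frame()) {  // placeholder for your frame source
  const auto error_code = pvi.process_frame(
      plumerai::ImagePointer<image_format>(image_data),
      0.f);  // delta_t = 0: frame spacing is derived from the system clock
  if (error_code != plumerai::ErrorCode::SUCCESS) {
    printf("ERROR: process_frame returned %s\n",
           plumerai::error_code_string(error_code));
    break;
  }
}
```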
single_image¶
template <ImageFormat image_format>
ErrorCode single_image(const ImagePointer<image_format> image_data,
int height = 0, int width = 0);
Process a single image not part of a video sequence.
This should not be used for video data. It can be used for face enrollments from a set of images. The object detection box id values obtained after calling single_image are not related to those generated through process_frame or through other calls to single_image.
See the documentation under VideoIntelligence::process_frame for details about the formats.
Arguments:
- image_format: A template parameter which must be one of the ImageFormat enum values.
- image_data: A pointer to the image data in the form of an ImagePointer helper struct.
- height: The height of the input image in pixels. If height = 0, the height set in the constructor will be used.
- width: The width of the input image in pixels. If width = 0, the width set in the constructor will be used.
Returns:
- An error code of type ErrorCode. See that enum for more details.
set_night_mode¶
Configure the video intelligence algorithm for either day mode color videos (default) or night mode IR videos.
This configures the algorithm for optimal performance on the corresponding type of input data.
After switching from day to night mode or back, the motion detection algorithm will need a couple of video frames to stabilize, so the motion-grid will not be immediately available.
This function does not have to be called before every frame, only when switching from RGB to IR video data or back.
Arguments:
- night_mode: Set to true for night mode or false for day mode.
Returns:
- Returns ErrorCode::SUCCESS on success.
camera_is_unstable¶
Signal that the camera/ISP is unstable.
In some situations, the camera or ISP may be adjusting its settings, resulting in unstable video frames. This can happen, for example, during auto-exposure or switching to and from IR mode. After calling this function, the algorithm will not run motion detection when frames are processed. When camera_is_no_longer_unstable is called, the algorithm will reset its internal motion state and continue processing frames as usual.
Returns:
- Returns ErrorCode::SUCCESS on success.
camera_is_no_longer_unstable¶
Signal that the camera/ISP is stable again.
Needs to be called some time after camera_is_unstable to re-enable the Plumerai motion detection. See camera_is_unstable for more information.
Returns:
- Returns ErrorCode::SUCCESS on success.
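The intended call pattern around an ISP adjustment can be sketched as follows; isp_is_adjusting is a placeholder for your own status check, not part of the Plumerai API.

```cpp
pvi.camera_is_unstable();            // motion detection paused from here on
while (isp_is_adjusting()) {         // placeholder, e.g. an auto-exposure sweep
  pvi.process_frame(...);            // frames can still be processed
}
pvi.camera_is_no_longer_unstable();  // motion state reset; back to normal
```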
store_state¶
Store the current state of the algorithm to a byte array.
This function can be used when processing a video in chunks, doing different chunks at different times or on different machines. The state can be restored by calling restore_state with the data returned by store_state. When the library is built with support for familiar face identification, the state includes the face library.
Constraints:
- The delta_t parameter of process_frame can not be left at zero after restoring a previous state.
- If familiar face identification is enabled, the state can only be stored and restored when not enrolling.
Arguments:
- state: A vector to store the serialized state in.
Returns:
- Returns ErrorCode::SUCCESS on success, or ErrorCode::STATE_WHILE_ENROLLING if the state is stored while enrolling.
Example:
auto pvi = plumerai::VideoIntelligence(height, width);
std::vector<std::uint8_t> state;
auto error_code = pvi.store_state(state);
if (error_code != plumerai::ErrorCode::SUCCESS) {
printf("ERROR: store_state returned %s\n",
plumerai::error_code_string(error_code));
}
restore_state¶
Restore the state of the algorithm from a byte array.
See store_state for more information. The user must ensure that the height and width of the current object match the height and width of the state that is being restored.
Arguments:
- state: A vector containing the serialized state.
Returns:
- Returns ErrorCode::SUCCESS on success. Returns ErrorCode::STATE_CORRUPT, ErrorCode::STATE_SETTINGS_MISMATCH, or ErrorCode::STATE_WHILE_ENROLLING on error.
Example:
auto pvi = plumerai::VideoIntelligence(height, width);
// The state as obtained by the store-state API, e.g. loaded from memory
std::vector<std::uint8_t> state = ...;
auto error_code = pvi.restore_state(state);
if (error_code != plumerai::ErrorCode::SUCCESS) {
printf("ERROR: restore_state returned %s\n",
plumerai::error_code_string(error_code));
}
debug_mode_start¶
Enable debug mode.
When process_frame is called while debug mode is active, this will store debug information. This data is meant to be stored to a file to be shared with Plumerai support for further analysis. These files contain uncompressed input image data and can become large (several megabytes per frame). If exclude_images is set to true, no image data will be included.
The resulting debug data can be retrieved using debug_mode_end.
Calling debug_mode_start invalidates the data pointer obtained from any previous calls to debug_mode_end.
Arguments:
- exclude_images: Set to true to exclude input images in the debug data.
Returns:
- Returns ErrorCode::SUCCESS if all went well. Returns ErrorCode::NOT_AVAILABLE on platforms where this functionality is not available. It can return ErrorCode::INTERNAL_ERROR if this method is called twice without calling debug_mode_end.
debug_mode_end¶
Stop debug mode and retrieve the debug data.
The user will receive a pointer to the gathered data. The data will be invalidated after another call to debug_mode_start.
This function is only available on platforms that support dynamic memory allocation.
Arguments:
- debug_data_buffer: An output parameter that receives a pointer to the debug data.
- debug_data_size: An output parameter that receives the size of the debug data.
Returns:
- Returns ErrorCode::SUCCESS if all went well. Returns ErrorCode::NOT_AVAILABLE on platforms where this functionality is not available. It can return ErrorCode::INTERNAL_ERROR if this method is called twice without calling debug_mode_start, or ErrorCode::INVALID_ARGUMENT when debug_data_buffer is null.
Example:
auto pvi = plumerai::VideoIntelligence(height, width);
auto error_code = pvi.debug_mode_start(true);
if (error_code != plumerai::ErrorCode::SUCCESS) {
printf("ERROR: debug_mode_start returned %s\n",
plumerai::error_code_string(error_code));
}
for (...) {
pvi.process_frame(...);
}
const std::uint8_t* debug_data_buffer = nullptr;
std::size_t debug_data_size = 0;
error_code = pvi.debug_mode_end(&debug_data_buffer,
                                &debug_data_size);
if (error_code != plumerai::ErrorCode::SUCCESS) {
printf("ERROR: debug_mode_end returned %s\n",
plumerai::error_code_string(error_code));
}
// Write the data to a file for Plumerai support
const char* debug_file_name = "/tmp/plumerai_debug_data.bin";
FILE* debug_file = fopen(debug_file_name, "wb");
fwrite(debug_data_buffer, 1, debug_data_size, debug_file);
fclose(debug_file);
get_required_memory_size¶
Get the required memory size for the VideoIntelligence object.
This optional function can be used to avoid any dynamic memory allocation within the VideoIntelligence object. See the advanced VideoIntelligence constructor for details.
Arguments:
- height: The height of the input image in pixels.
- width: The width of the input image in pixels.
code_version¶
Returns the version of the video intelligence code as a date and time.
For other version numbers, see also ObjectDetection::detector_version for the object detector, and FaceIdentification::embedding_version for the face embedder.
Returns:
- The version of the code as YYYY.MM.DD.HH.MM date and time string.
object_detection¶
Get the interface for the ObjectDetection video intelligence features.
motion_detection¶
Get the interface for the MotionDetection video intelligence features.
detection_zones¶
Get the interface for the DetectionZones video intelligence features.
face_identification¶
Get the interface for the FaceIdentification video intelligence features.
face_enrollment_automatic¶
Get the interface for the FaceEnrollmentAutomatic video intelligence features.
face_enrollment_manual¶
Get the interface for the FaceEnrollmentManual video intelligence features.
vlm_video_collection¶
Get the interface for the VLMVideoCollection video intelligence features.
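As a sketch of how these interfaces are used together, assuming a constructed pvi object and a frame already processed with process_frame:

```cpp
// Feature interfaces are accessed from the main VideoIntelligence object.
std::vector<BoxPrediction> predictions;
auto error_code = pvi.object_detection().get_detections(predictions);
if (error_code != plumerai::ErrorCode::SUCCESS) {
  printf("ERROR: get_detections returned %s\n",
         plumerai::error_code_string(error_code));
}
```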
Upgrade guide¶
From version 2.2 to 2.3¶
The meaning of the confidence field in the BoxPrediction struct has changed: the confidence values are now tuned so that, for each code/model version and for each object class, sensible values range from 0.5 to 1. If you used the confidence field before, e.g. for thresholding, you must re-tune your thresholds accordingly, or consider using the new Plumerai API functions for this instead.
By default, the Plumerai software now sets a threshold (e.g. 0.6) to filter out values below a certain confidence level. The default value can be queried with a new API function ObjectDetection::get_confidence_threshold. To change the threshold, use the newly added API function ObjectDetection::set_confidence_threshold.
As a side effect of some internal changes, the track IDs produced by the algorithm are no longer guaranteed to be consecutive values, instead they will sometimes skip values.
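As an illustration of thresholding in the new 0.5 to 1 range, a client-side filter can be sketched in plain C++. The Prediction struct and filter_by_confidence below are illustrative stand-ins, not part of the Plumerai API; in practice, prefer the built-in ObjectDetection::set_confidence_threshold.

```cpp
#include <vector>

// Illustrative stand-in for the confidence field of BoxPrediction.
struct Prediction {
  float confidence;  // tuned so sensible values range from 0.5 to 1
};

// Keep only predictions at or above the threshold (e.g. the 0.6 default).
std::vector<Prediction> filter_by_confidence(
    const std::vector<Prediction>& predictions, float threshold) {
  std::vector<Prediction> kept;
  for (const auto& p : predictions) {
    if (p.confidence >= threshold) kept.push_back(p);
  }
  return kept;
}
```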
From version 2.1 to 2.2¶
The track ID in the BoxPrediction struct has been updated, so code that uses track IDs needs to be updated accordingly:
- It has been renamed from id to track_id.
- It is now a signed 64-bit value (int64_t) instead of an unsigned 32-bit value.
- The track ID now starts counting from a unique random number instead of 0. This should likely not affect any code, but if it does, it is possible to re-create the old behaviour by saving the first track ID as a base value and subtracting it from subsequent track IDs.
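The base-value trick can be sketched as follows; TrackIdRebaser is an illustrative helper, not part of the Plumerai API.

```cpp
#include <cstdint>

// Re-create the pre-2.2 behaviour where track IDs started at 0: remember
// the first track ID seen and subtract it from all subsequent ones.
class TrackIdRebaser {
 public:
  std::int64_t rebase(std::int64_t track_id) {
    if (!has_base_) {
      base_ = track_id;
      has_base_ = true;
    }
    return track_id - base_;
  }

 private:
  std::int64_t base_ = 0;
  bool has_base_ = false;
};
```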
The FaceEnrollmentAutomatic class has been updated:
- The following functions have been renamed; code can be updated with a find-replace:
  - remove_embedding is now remove_face_id
  - remove_all_embeddings is now remove_all_face_ids
  - get_embedding_data is now get_enrollment_data
  - restore_embedding_data is now restore_enrollment_data
  - merge_embeddings is now merge_face_ids
- The get_face_snapshot function now takes an additional track_id argument to retrieve snapshots for different tracks within an enrollment. Refer to the function's documentation for further details.
From version 2.0 to 2.1¶
Apart from adding new functionality, two minor changes were made to the existing API in 2.1:
- The detection zones API used to accept an array of std::tuple items, but this is now a flat array of coordinates. See DetectionZones::add_zone for more information.
- Because there is now also a static memory version of the API, the VideoIntelligence constructor and FaceEnrollmentAutomatic::configure_face_snapshots now take additional optional arguments. Leaving them at their defaults will result in the same behaviour as in 2.0.
From version 1.x to 2.0¶
- The name 'People Detection' was changed to 'Video Intelligence' to reflect support for other detection classes and features such as advanced motion detection.
  - The library filename changed from libplumerai{peopledetection,faceidentification} to libplumeraivideointelligence. This should be updated in the relevant build scripts.
  - The main header file is now plumerai/video_intelligence.h instead of plumerai/people_detection.h, plumerai/face_identification.h or plumerai/face_identification_automatic.h.
  - The main class to use is now always plumerai::VideoIntelligence instead of plumerai::PeopleDetection, plumerai::FaceIdentification or plumerai::FaceIdentificationAutomatic.
- Different features have been moved to separate 'feature classes', accessible from the main class VideoIntelligence. For example, VideoIntelligence::motion_detection() provides access to all functionality related to motion detection, and VideoIntelligence::face_enrollment_automatic() provides access to automatic face enrollment functionality. Most functions that were in PeopleDetection, FaceIdentification or FaceIdentificationAutomatic have been moved to one of those feature classes. For example:
  - PeopleDetection::add_detection_zone(...) is now VideoIntelligence::detection_zones().add_zone(...)
  - FaceIdentification::add_face_embedding(...) is now VideoIntelligence::face_enrollment_manual().add_embedding(...)
  - FaceIdentification::get_face_id(...) is now VideoIntelligence::face_identification().get_face_id(...)
- The function process_frame no longer returns bounding boxes as output. Instead, the boxes are now accessible through VideoIntelligence::object_detection().get_detections(...) after the frame has been processed. Furthermore, image data pointers are now wrapped in a plumerai::ImagePointer struct, as shown in the example below.
Code that looked like this before:
#include "plumerai/people_detection.h"
auto ppd = plumerai::PeopleDetection(height, width);
std::vector<BoxPrediction> predictions(0);
const auto error_code =
ppd.process_frame<image_format>(image_data, predictions, delta_t);
Should be updated to this:
#include "plumerai/video_intelligence.h" // (1)
// Initialize the Video Intelligence object (2)
auto pvi = plumerai::VideoIntelligence(height, width);
// Process a video frame (3)
auto error_code = pvi.process_frame(
plumerai::ImagePointer<image_format>(image_data), delta_t);
// Get bounding box results (4)
std::vector<BoxPrediction> predictions;
error_code = pvi.object_detection().get_detections(predictions);
1. Replace plumerai/people_detection.h, plumerai/face_identification.h or plumerai/face_identification_automatic.h by plumerai/video_intelligence.h.
2. Replace plumerai::PeopleDetection by plumerai::VideoIntelligence.
3. Wrap the raw pointer argument in plumerai::ImagePointer and remove the predictions output argument.
4. Get bounding box results through the new ObjectDetection::get_detections function.
From version 1.14 to 1.15¶
The confidence_threshold argument has been removed from the single_image function. The return type of debug_next_frame changed from bool to ErrorCodeType.
In version 1.15 the API of start_face_enrollment and finish_face_enrollment changed compared to earlier versions:
- The function get_cumulative_enrollment_score was removed.
- The previous_embedding argument of start_face_enrollment was removed.
- The resulting embedding of finish_face_enrollment is now returned via a reference argument.
- The return type of both functions is now an error code, and finish_face_enrollment can indicate a low quality enrollment through this error code. There is no longer a need to check the enrollment score for the quality of the embedding.
If your code looked like this before:
Then it should be updated as follows:
std::vector<int8_t> embedding;
const auto error_code = ffid.finish_face_enrollment(embedding);
if (error_code != plumerai::ErrorCode::SUCCESS) {
printf("Error code: %d\n", error_code);
}
From version 1.13 to 1.14¶
In version 1.14 the API of process_frame changed compared to earlier versions: the return type is now an error code, and the resulting boxes are now returned via a reference argument.
If your code looked like this before:
Then it should be updated as follows: