Plumerai Object and Motion Detection API

This document describes the API for the object detection and motion detection functionality.

BoxPrediction

typedef enum {
  CLASS_UNKNOWN = 0,
  CLASS_PERSON = 1,
  CLASS_HEAD = 2,
  CLASS_FACE = 3,
  CLASS_VEHICLE = 4,
  CLASS_ANIMAL = 5,
  CLASS_PACKAGE = 6,
  CLASS_MAX_ENUM = 6,
} DetectionClass;

typedef struct BoxPrediction {
  float y_min;             // top coordinate between 0 and 1 in height dimension
  float x_min;             // left coordinate between 0 and 1 in width dimension
  float y_max;             // bottom coordinate between 0 and 1 in height dimension
  float x_max;             // right coordinate between 0 and 1 in width dimension
  float confidence;        // between 0 and 1, higher means more confident
  unsigned int id;         // the tracked identifier of this box
  DetectionClass class_id; // the class of the detected object
} BoxPrediction;

A structure representing a single resulting bounding box. Coordinates are normalized to the range 0 to 1, with the origin at the top-left of the frame. Confidence values also lie between 0 and 1. The algorithm applies a built-in confidence threshold (e.g. 0.6; the value differs per model and per class): boxes with a confidence below that threshold are not produced at all by the Plumerai software.
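Because the coordinates are normalized, they usually need to be scaled to the frame size before drawing or cropping. The following is a minimal sketch of that conversion, not part of the Plumerai API: `PixelRect` and `to_pixels` are hypothetical names, and the type definitions are local copies of the ones documented above so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Local copies of the documented types so this sketch compiles without the
// Plumerai headers; real code would include the library header instead.
typedef enum {
  CLASS_UNKNOWN = 0,
  CLASS_PERSON = 1,
  CLASS_HEAD = 2,
  CLASS_FACE = 3,
  CLASS_VEHICLE = 4,
  CLASS_ANIMAL = 5,
  CLASS_PACKAGE = 6,
  CLASS_MAX_ENUM = 6,
} DetectionClass;

typedef struct BoxPrediction {
  float y_min;
  float x_min;
  float y_max;
  float x_max;
  float confidence;
  unsigned int id;
  DetectionClass class_id;
} BoxPrediction;

// Hypothetical helper type: a rectangle in pixel coordinates.
struct PixelRect {
  int left;
  int top;
  int right;
  int bottom;
};

// Hypothetical helper: scales a normalized box to pixel coordinates.
// x values are scaled by the frame width, y values by the frame height.
PixelRect to_pixels(const BoxPrediction& box, int frame_width,
                    int frame_height) {
  assert(frame_width > 0 && frame_height > 0);
  auto scale = [](float value, int size) {
    // Clamp defensively in case a value falls marginally outside [0, 1].
    const float clamped = std::min(1.0f, std::max(0.0f, value));
    return static_cast<int>(std::lround(clamped * static_cast<float>(size)));
  };
  return PixelRect{scale(box.x_min, frame_width),
                   scale(box.y_min, frame_height),
                   scale(box.x_max, frame_width),
                   scale(box.y_max, frame_height)};
}
```

Note that the struct stores y before x, while most drawing APIs expect x first; converting through a small helper like this keeps that ordering in one place.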

ObjectDetection

get_detections

ErrorCode get_detections(const BoxPrediction** results,
                         std::size_t* results_size);
ErrorCode get_detections(std::vector<BoxPrediction>& results);

Obtain the object detections from the most recently processed frame.

The algorithm applies a built-in confidence threshold (e.g. 0.6; the value differs per model): boxes with a confidence below that threshold are not returned by this function.

The pointer returned by this function will be invalidated by any subsequent call to VideoIntelligence::process_frame or VideoIntelligence::single_image.

On some platforms, a version that uses std::vector is available.

Arguments:

results: An output parameter that receives a pointer to the resulting bounding boxes.
results_size: An output parameter that receives the number of resulting bounding boxes.

Returns:

An error code of type ErrorCode. See that enum for more details.

Example:

auto pvi = plumerai::VideoIntelligence(height, width);

pvi.process_frame(...);

const BoxPrediction* boxes = nullptr;
std::size_t num_boxes = 0;
pvi.object_detection().get_detections(&boxes, &num_boxes);

for (std::size_t i = 0; i < num_boxes; ++i) {
  const BoxPrediction& p = boxes[i];  // the detection to print
  printf(
      "Box #%u of class %d @ (x,y) -> (%.2f,%.2f) till (%.2f,%.2f)\n",
      p.id, static_cast<int>(p.class_id), p.x_min, p.y_min, p.x_max,
      p.y_max);
}
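Since the pointer from get_detections is invalidated by the next call to process_frame or single_image, boxes that must outlive the current frame need to be copied. The sketch below shows one way to do that while also applying a stricter, application-level confidence threshold and a class filter. `copy_detections` is a hypothetical helper, not a library function, and the type definitions are local copies of the documented ones so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local copies of the documented types so this sketch compiles without the
// Plumerai headers; real code would include the library header instead.
typedef enum {
  CLASS_UNKNOWN = 0,
  CLASS_PERSON = 1,
  CLASS_HEAD = 2,
  CLASS_FACE = 3,
  CLASS_VEHICLE = 4,
  CLASS_ANIMAL = 5,
  CLASS_PACKAGE = 6,
  CLASS_MAX_ENUM = 6,
} DetectionClass;

typedef struct BoxPrediction {
  float y_min;
  float x_min;
  float y_max;
  float x_max;
  float confidence;
  unsigned int id;
  DetectionClass class_id;
} BoxPrediction;

// Hypothetical helper: copies the boxes of one class whose confidence is at
// least `min_confidence` into a vector owned by the caller.
std::vector<BoxPrediction> copy_detections(const BoxPrediction* boxes,
                                           std::size_t num_boxes,
                                           DetectionClass wanted_class,
                                           float min_confidence) {
  assert(boxes != nullptr || num_boxes == 0);
  std::vector<BoxPrediction> kept;
  for (std::size_t i = 0; i < num_boxes; ++i) {
    if (boxes[i].class_id == wanted_class &&
        boxes[i].confidence >= min_confidence) {
      kept.push_back(boxes[i]);  // copy by value; safe after the next frame
    }
  }
  return kept;
}
```

A threshold passed here can only narrow the results further: boxes below the built-in threshold never reach the caller in the first place.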

has_box_moved

ErrorCode has_box_moved(const BoxPrediction& box, bool& has_moved,
                        float timeout_seconds = 20.0f) const;

Check if a box has moved significantly since its initial detection.

This function should only be used for boxes from the most recent video frame. This function only accepts boxes of the Person, Vehicle, Animal and Package classes.

The has_moved parameter will be set to:

false if the box has not moved since its initial detection
true if the box has moved

If a box has moved but then stopped moving for at least 20 seconds, the has_moved value will be reset to false. This timeout can be changed by specifying the optional timeout_seconds parameter.

This function should not be called directly after restoring from a previous state.

This function should not be called with boxes obtained directly after VideoIntelligence::single_image calls.

Arguments:

box: A box from the most recent video frame.
has_moved: A boolean that will be set to true if the box has moved.
timeout_seconds: The number of seconds after which the has_moved flag will be reset. The default value is 20 seconds; values larger than the default are not allowed.

Returns:

Returns ErrorCode::SUCCESS on success, or ErrorCode::INVALID_BOX or ErrorCode::INVALID_HAS_MOVED_TIMEOUT on error.

Example:

auto pvi = plumerai::VideoIntelligence(height, width);
for (int t = 0; t < num_frames; ++t) {
  pvi.process_frame(...);
  const BoxPrediction* boxes = nullptr;
  std::size_t num_boxes = 0;
  pvi.object_detection().get_detections(&boxes, &num_boxes);
  for (std::size_t i = 0; i < num_boxes; ++i) {
    bool has_moved = false;
    auto error_code =
        pvi.object_detection().has_box_moved(boxes[i], has_moved);
    if (error_code == plumerai::ErrorCode::SUCCESS) {
      printf("Box has moved: %s\n", has_moved ? "yes" : "no");
    }
  }
}

reset_tracker

void reset_tracker();

This function is only available if the library was built with tracking support.

This resets the internal tracker state and resets all tracker ids and face identifications if applicable. It is recommended to call this whenever two consecutive frames are too different from each other, such as when switching to a different camera input or when the camera abruptly moved.
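The library does not define what "too different" means, so the application has to decide when to reset. As one possible heuristic, the sketch below compares two grayscale frames by their mean absolute pixel difference. `frames_differ_too_much` and its threshold parameter are assumptions made for illustration only; they are not part of the Plumerai API, and a suitable threshold depends on the camera and scene.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns true when the mean absolute difference
// between two 8-bit grayscale frames of equal size exceeds the threshold.
bool frames_differ_too_much(const std::uint8_t* previous,
                            const std::uint8_t* current,
                            std::size_t num_pixels,
                            double max_mean_abs_diff) {
  assert(previous != nullptr && current != nullptr && num_pixels > 0);
  double sum = 0.0;
  for (std::size_t i = 0; i < num_pixels; ++i) {
    // Widen to int before subtracting to avoid unsigned wrap-around.
    sum += std::abs(static_cast<int>(previous[i]) -
                    static_cast<int>(current[i]));
  }
  return (sum / static_cast<double>(num_pixels)) > max_mean_abs_diff;
}
```

When such a check fires, for example right after switching camera inputs, the application could call reset_tracker before processing the next frame.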

detector_version

static std::uint32_t detector_version();

Returns a version number of the object detector neural network.

Returns:

The version number of the object detector.

MotionDetection

get_grid_height

int get_grid_height() const;

Retrieves the height of the motion detection grid.

For more information, see the docs under get_grid.

Returns:

The height of the motion detection grid.

get_grid_width

int get_grid_width() const;

Retrieves the width of the motion detection grid.

For more information, see the docs under get_grid.

Returns:

The width of the motion detection grid.

set_grid_size

ErrorCode set_grid_size(int height, int width);

Set the size of the motion detection grid.

There is a default grid size which scales with the input resolution, so it is not necessary to call this function. If a custom grid size is desired, this function can be called at the start of the application, before processing any frames.

Calling this function re-initializes the motion-detection algorithm, so it should only be called at the start of the application.

For more information, see the docs under get_grid.

Arguments:

height: The height of the motion detection grid.
width: The width of the motion detection grid.

Returns:

An error code of type ErrorCode. It will return SUCCESS if all went well, or INVALID_GRID_SIZE if the supplied grid size is invalid.

get_grid

ErrorCode get_grid(const float** motion_detection_grid) const;
ErrorCode get_grid(std::vector<float>& motion_detection_grid) const;

Retrieves the amount of motion found in each grid cell of the frame.

As a by-product of object detection, motion detection is performed. For specific use-cases, it might be useful to access this raw motion detection information as well. This function provides access to it, but it won't be available in the first few frames.

The result of this function is a 2D grid of type float, with dimensions that can be retrieved with get_grid_height and get_grid_width. The values in each grid cell are floats between 0.0 and 1.0, and denote how much motion was detected in that grid cell. A higher value indicates more motion. The height is the outer dimension and the width is the inner dimension, i.e. the grid is stored row by row.
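With the height as the outer dimension, the cell at a given row and column sits at index `row * grid_width + col` of the flat array. The sketch below illustrates that indexing and uses it to find the cell with the most motion. `GridCell`, `motion_at` and `strongest_motion` are hypothetical helpers, not library functions; the grid pointer and dimensions are assumed to come from get_grid, get_grid_height and get_grid_width.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: one grid cell and its motion value.
struct GridCell {
  int row;
  int col;
  float value;
};

// Hypothetical helper: reads one cell of the row-major motion grid.
float motion_at(const float* grid, int grid_height, int grid_width, int row,
                int col) {
  assert(grid != nullptr);
  assert(row >= 0 && row < grid_height);
  assert(col >= 0 && col < grid_width);
  return grid[static_cast<std::size_t>(row) *
                  static_cast<std::size_t>(grid_width) +
              static_cast<std::size_t>(col)];
}

// Hypothetical helper: returns the cell with the highest motion value.
// On ties, the first such cell in row-major order is returned.
GridCell strongest_motion(const float* grid, int grid_height,
                          int grid_width) {
  assert(grid != nullptr && grid_height > 0 && grid_width > 0);
  GridCell best{0, 0, grid[0]};
  for (int row = 0; row < grid_height; ++row) {
    for (int col = 0; col < grid_width; ++col) {
      const float value = motion_at(grid, grid_height, grid_width, row, col);
      if (value > best.value) {
        best = GridCell{row, col, value};
      }
    }
  }
  return best;
}
```

These helpers only read the array, which matches the requirement that the caller must not modify the grid.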

The motion detection grid array is managed by this class, should not be modified by the caller, and is invalidated by each call to VideoIntelligence::process_frame.

On some platforms, a version that uses std::vector is available.

Arguments:

motion_detection_grid: An output parameter that receives a pointer to the float array in which the result will be stored.

Returns:

An error code of type ErrorCode. It will return SUCCESS if all went fine, or MOTION_GRID_NOT_YET_READY if this function is called too soon after initialization of the VideoIntelligence object: at least a few frames must be processed before the motion grid is valid.

Example:

auto pvi = plumerai::VideoIntelligence(height, width);
for (int t = 0; t < num_frames; ++t) {
  pvi.process_frame(...);
  const float* motion_grid = nullptr;
  auto error_code = pvi.motion_detection().get_grid(&motion_grid);
  if (error_code == plumerai::ErrorCode::SUCCESS) {
    // Motion-detection grid `motion_grid` is now available
  }
}

DetectionZones

add_zone

ErrorCode add_zone(const std::tuple<float, float>* coordinates,
                   const std::size_t num_coordinate_pairs,
                   const DetectionClass* classes,
                   const std::size_t num_classes, int& zone_id);
ErrorCode add_zone(const std::vector<std::tuple<float, float>>& coordinates,
                   const std::vector<DetectionClass>& classes, int& zone_id);

Specify a detection zone polygon for a group of classes.

The zone can be used to verify whether a bounding-box is inside or outside of it using the is_box_in_zone function. It can also be used internally by the Plumerai library to improve detection quality. A single detection zone can be used for one or more classes.

On some platforms, a version that uses std::vector is available.

Arguments:

coordinates: An array of (x, y) coordinate pairs specifying the polygon of the detection zone in normalized coordinates, between 0 and 1. The polygon is closed implicitly: the first coordinate is also treated as the final coordinate, so it should not be repeated at the end. The polygon must be simple: it can't have holes or self-intersections. It may be either concave or convex.
num_coordinate_pairs: The number of coordinate pairs in the coordinates array.
classes: An array of classes for which this detection zone is valid.
num_classes: The number of classes in the classes array.
zone_id: The created detection zone's unique ID is returned by this function for use in is_box_in_zone.

Returns:

An error code of type ErrorCode. It will return SUCCESS if all went fine, or otherwise INVALID_ZONE_CLASS or INVALID_ZONE_GEOMETRY.

Example:

auto pvi = plumerai::VideoIntelligence(height, width);
const std::size_t num_coordinates = 4;
std::tuple<float, float> coordinates[num_coordinates] = {
  {0.0f, 0.0f}, {0.5f, 0.0f}, {0.5f, 1.0f}, {0.0f, 1.0f}};
const std::size_t num_classes = 1;
DetectionClass classes[num_classes] = {DetectionClass::CLASS_PERSON};
int zone_id = -1;
auto error_code = pvi.detection_zones().add_zone(
    coordinates, num_coordinates, classes, num_classes, zone_id);
if (error_code != plumerai::ErrorCode::SUCCESS) {
  printf("Error: %s\n", plumerai::error_code_string(error_code));
}
// the 'zone_id' is now valid and can be used in 'is_box_in_zone'
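
Some of the constraints on the coordinates argument are cheap to check before calling add_zone. The sketch below is a conservative pre-check under stated assumptions: `looks_like_valid_zone` is a hypothetical helper, and the minimum of three vertices is an assumption rather than a documented rule. It does not detect self-intersections, so add_zone can still return INVALID_ZONE_GEOMETRY for input that passes this check.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical helper: basic sanity checks on a detection zone polygon.
bool looks_like_valid_zone(
    const std::vector<std::tuple<float, float>>& coordinates) {
  // Assumption: a polygon needs at least three vertices.
  if (coordinates.size() < 3) {
    return false;
  }
  // All coordinates must be normalized, i.e. within [0, 1].
  for (const auto& point : coordinates) {
    const float x = std::get<0>(point);
    const float y = std::get<1>(point);
    if (!(x >= 0.0f && x <= 1.0f && y >= 0.0f && y <= 1.0f)) {
      return false;
    }
  }
  // The polygon is closed implicitly, so the first vertex should not be
  // repeated as the last one.
  if (coordinates.front() == coordinates.back()) {
    return false;
  }
  return true;
}
```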

remove_zone

ErrorCode remove_zone(const int zone_id);

Remove an existing detection zone.

Arguments:

zone_id: The detection zone ID as returned by add_zone.

Returns:

An error code of type ErrorCode. It will return SUCCESS if all went fine, or otherwise INVALID_ZONE_ID.

is_box_in_zone

ErrorCode is_box_in_zone(int zone_id, const BoxPrediction& box,
                         bool& is_in_zone) const;

Determines whether a box prediction is within a detection zone.

Arguments:

zone_id: The detection zone ID as returned by add_zone.
box: A bounding-box prediction.
is_in_zone: The resulting boolean that indicates whether the given box is completely or partially inside the given detection zone in the current frame.

Returns:

An error code of type ErrorCode. It will return SUCCESS if all went fine, or otherwise INVALID_ZONE_ID, INVALID_ZONE_CLASS or INVALID_ZONE_GEOMETRY.

Example:

auto pvi = plumerai::VideoIntelligence(height, width);
int zone_id = -1;
pvi.detection_zones().add_zone(coordinates, num_coordinates, classes,
                               num_classes, zone_id);
(...)
const BoxPrediction* boxes = nullptr;
std::size_t num_boxes = 0;
pvi.object_detection().get_detections(&boxes, &num_boxes);
for (std::size_t i = 0; i < num_boxes; ++i) {
  const BoxPrediction& box = boxes[i];
  bool is_in_zone = false;
  auto error_code = pvi.detection_zones().is_box_in_zone(
      zone_id, box, is_in_zone);

  if (error_code != plumerai::ErrorCode::SUCCESS) { return; } // ERROR
  if (is_in_zone) {
    printf("Box with ID %u is in zone %d\n", box.id, zone_id);
  }
}