
Plumerai People Detection for Cortex-M API

This document describes the C API for Plumerai’s people detection software for videos on Arm Cortex-M microcontrollers.

Below are all the functions that make up the API. Following that is the full header as well as a simple example of how to use the API.

PeopleDetectionMicroInit

StatusCode PeopleDetectionMicroInit(unsigned char* tensor_arena);
Initializes the people detection algorithm. This needs to be called only once at the start of the application.

Arguments

  • tensor_arena unsigned char*: A pointer to a user-allocated contiguous memory region to store persistent, input, output, and intermediate tensors. The size should be equal to or larger than the value given by TENSOR_ARENA_SIZE in model_defines.h. This memory region should not be overwritten after the call to this function is made. See below for a version that splits the tensor arena into persistent and non-persistent storage.

Returns

StatusCode: A status-code indicating whether there was an error. It can be either Success or AllocationError (see the full header below for a description of these codes).

PeopleDetectionMicroInitSplit

StatusCode PeopleDetectionMicroInitSplit(
    unsigned char* persistent_tensor_arena,
    unsigned char* non_persistent_tensor_arena);
Initializes the people detection algorithm. As above, but splits the tensor arena in two such that one buffer can be re-used as scratch space.

Arguments

  • persistent_tensor_arena unsigned char*: A pointer to a user-allocated contiguous memory region to store persistent data. The size should be equal to or larger than TENSOR_ARENA_SIZE_PERSISTENT from model_defines.h. This memory region should not be overwritten after the call to this function is made.

  • non_persistent_tensor_arena unsigned char*: A pointer to a user-allocated contiguous memory region to store input, output, and intermediate tensors. The size should be equal to or larger than TENSOR_ARENA_SIZE_NON_PERSISTENT from model_defines.h. This memory region can be overwritten by the user between (but not during) calls to PeopleDetectionMicroProcessFrame.

Returns

StatusCode: A status-code indicating whether there was an error. It can be either Success or AllocationError (see the full header below for a description of these codes).
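As an illustration, the split variant can be combined with user-chosen memory placement. The following sketch allocates both arenas statically; the function name init_detector and the placement choices are illustrative, not prescribed by the API:

```cpp
#include "plumerai/model_defines.h"
#include "plumerai/people_detection_micro.h"

// Both arenas are user-allocated. The non-persistent one may live in a
// memory region that other code reuses between (but not during) frames.
__attribute__((aligned(16)))
static unsigned char persistent_arena[TENSOR_ARENA_SIZE_PERSISTENT];
__attribute__((aligned(16)))
static unsigned char non_persistent_arena[TENSOR_ARENA_SIZE_NON_PERSISTENT];

// Illustrative helper: initialize with the split arenas, returning true on
// success.
bool init_detector() {
  StatusCode status =
      PeopleDetectionMicroInitSplit(persistent_arena, non_persistent_arena);
  return status == Success;
}
```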

PeopleDetectionMicroCleanUp

StatusCode PeopleDetectionMicroCleanUp();
Cleans up the people detection algorithm. This needs to be called only once at the end of the application.

Returns

StatusCode: A status-code indicating whether there was an error. At the moment this always returns Success.

PeopleDetectionGetInputPointer

void* PeopleDetectionGetInputPointer();
Retrieves the address where the input data needs to be stored before making a call to PeopleDetectionMicroProcessFrame. The camera output should be stored here in the image format described before. Any data residing at this location can be overwritten by a call to PeopleDetectionMicroProcessFrame. The user should not write beyond PLUMERAI_IMAGE_SIZE bytes (see model_defines.h) from this address.

Returns

void*: A pointer to the location where the input data should be placed.

PeopleDetectionReadDataCallback

void PeopleDetectionReadDataCallback(void (*user_callback)(void* input_ptr));

As an alternative to using PeopleDetectionGetInputPointer and setting the new input data between calls to PeopleDetectionMicroProcessFrame, there is the option to provide a user-defined callback to retrieve input data (e.g. from a camera) in parallel to running people detection. To do so, provide a function pointer that takes as its single argument a pointer to the input data (similar to what is returned by PeopleDetectionGetInputPointer) and returns nothing. In this function a user can, for example, start an asynchronous DMA transfer to grab a new camera frame and place it in the input buffer. Waiting for completion of any such parallel operation needs to be done before making the next call to PeopleDetectionMicroProcessFrame.

Arguments

  • user_callback void func(void* input_ptr): The callback function as described above. It will be executed roughly halfway through the execution of each call to PeopleDetectionMicroProcessFrame and takes the address where the input data needs to be stored. The user should not write beyond PLUMERAI_IMAGE_SIZE bytes (see model_defines.h) from this address.

Returns

Nothing.

PeopleDetectionMicroProcessFrame

StatusCode PeopleDetectionMicroProcessFrame(BoxPrediction* results,
                                            int results_length,
                                            int* num_results_returned);
Processes a single frame from a video sequence with RGB input. This processes RGB image data (1st byte red, 3rd byte blue) of size PLUMERAI_IMAGE_HEIGHT * PLUMERAI_IMAGE_WIDTH * BYTES_PER_PIXEL (see model_defines.h) in unsigned RGB888 or RGB565 format, found at the location returned by PeopleDetectionGetInputPointer.

Arguments

  • results BoxPrediction*: A pointer to an array to store the resulting boxes in. The user needs to allocate space for this array. The recommended size is 20 elements; if the user allocates fewer, then fewer boxes can be returned. If fewer boxes are detected, then also fewer are returned. The actual number returned is given by num_results_returned, see below.
  • results_length int: The number of BoxPrediction elements allocated by the user in the results parameter above. See that parameter for more info.
  • num_results_returned int*: The minimum of the number of bounding boxes found in the image and results_length. The results array will be filled with zeros beyond this amount. If this value is equal to results_length, it might be an indication that more boxes were found than could be output.

Returns

StatusCode: A status-code indicating whether there was an error. It can be any of Success, InvokeError, or OutputDimsError (see the full header below for a description of these codes).

BoxPrediction

typedef struct {
  float y_min;       // top coordinate between 0 and 1 in height dimension
  float x_min;       // left coordinate between 0 and 1 in width dimension
  float y_max;       // bottom coordinate between 0 and 1 in height dimension
  float x_max;       // right coordinate between 0 and 1 in width dimension
  float confidence;  // between 0 and 1, higher means more confident
} BoxPrediction;
An output structure representing a single resulting bounding box. Coordinates are between 0 and 1, the origin is at the top-left.

Full people_detection_micro.h header

// Most functions in this API return a status-code to indicate whether
// everything went well. If not, see below for more information about the error
// code.
typedef enum StatusCode_ {
  Success = 0,             // Everything went all right
  AllocationError = -1,    // Memory allocation failure, check arena size
  InvokeError = -2,        // Unexpected error, contact Plumerai
  OutputDimsError = -3,    // Unexpected error, contact Plumerai
  RegistrationError = -4,  // Unexpected error, contact Plumerai
} StatusCode;

// An output structure representing a single resulting bounding box.
// Coordinates are between 0 and 1, the origin is at the top-left.
typedef struct {
  float y_min;       // top coordinate between 0 and 1 in height dimension
  float x_min;       // left coordinate between 0 and 1 in width dimension
  float y_max;       // bottom coordinate between 0 and 1 in height dimension
  float x_max;       // right coordinate between 0 and 1 in width dimension
  float confidence;  // between 0 and 1, higher means more confident
} BoxPrediction;

// Initializes the people detection algorithm. This needs to be called only
// once at the start of the application.
//
// @param tensor_arena A pointer to a user-allocated contiguous memory region
//  to store persistent, input, output, and intermediate tensors. The size
//  should be equal to or larger than the value given by `TENSOR_ARENA_SIZE` in
//  `model_defines.h`. This memory region should not be overwritten after the
//  call to this function is made. See below for a version that splits the
//  tensor arena into persistent and non-persistent storage.
// @return a status-code indicating whether there was an error; it can be
//  either `Success` or `AllocationError` (see above for a description).
StatusCode PeopleDetectionMicroInit(unsigned char* tensor_arena);

// Initializes the people detection algorithm. As above, but splits the tensor
// arena in two such that one buffer can be re-used as scratch space.
//
// @param persistent_tensor_arena A pointer to a user-allocated contiguous
//  memory region to store persistent data. The size should be equal to or
//  larger than `TENSOR_ARENA_SIZE_PERSISTENT` from `model_defines.h`. This
//  memory region should not be overwritten after the call to this function is
//  made.
// @param non_persistent_tensor_arena A pointer to a user-allocated contiguous
//  memory region to store input, output, and intermediate tensors. The size
//  should be equal to or larger than `TENSOR_ARENA_SIZE_NON_PERSISTENT` from
//  `model_defines.h`. This memory region can be overwritten by the user between
//  (but not during) calls to `PeopleDetectionMicroProcessFrame`.
// @return a status-code indicating whether there was an error; it can be
//  either `Success` or `AllocationError` (see above for a description).
StatusCode PeopleDetectionMicroInitSplit(
    unsigned char* persistent_tensor_arena,
    unsigned char* non_persistent_tensor_arena);

// Cleans up the people detection algorithm. This needs to be called only once
// at the end of the application.
//
// @return a status-code indicating whether there was an error. At the moment
// always returns `Success`.
StatusCode PeopleDetectionMicroCleanUp();

// Retrieves the address where the input data needs to be stored before
// making a call to `PeopleDetectionMicroProcessFrame`. The camera output
// should be stored here in the image format described before. Any data residing
// at this location can be overwritten by a call to
// `PeopleDetectionMicroProcessFrame`. The user should not write beyond
// `PLUMERAI_IMAGE_SIZE` bytes (see `model_defines.h`) from the start of this
// value.
//
// @return a pointer to the location the input data should be placed.
void* PeopleDetectionGetInputPointer();

// As an alternative to using `PeopleDetectionGetInputPointer` and setting the
// new input data between calls to `PeopleDetectionMicroProcessFrame`, there
// is the option to provide a user-defined callback to retrieve input data (e.g.
// from a camera) in parallel to running people detection. To do so, provide a
// function pointer that takes as its single argument a pointer to the input
// data (similar to what is returned by `PeopleDetectionGetInputPointer`) and
// returns nothing. In this function a user can, for example, start an
// asynchronous DMA transfer to grab a new camera frame and place it in the
// input buffer. Waiting for completion of any such parallel operation needs to
// be done before making the next call to `PeopleDetectionMicroProcessFrame`.
//
// @param user_callback The callback function as described above. It will be
//  executed roughly halfway through the execution of each call to
//  `PeopleDetectionMicroProcessFrame` and takes the address where the input
//  data needs to be stored. The user should not write beyond
//  `PLUMERAI_IMAGE_SIZE` bytes (see `model_defines.h`) from this address.
void PeopleDetectionReadDataCallback(void (*user_callback)(void* input_ptr));

// Processes a single frame from a video sequence with RGB input. This
// processes RGB image data (1st byte red, 3rd byte blue) of size
// PLUMERAI_IMAGE_HEIGHT * PLUMERAI_IMAGE_WIDTH * BYTES_PER_PIXEL (see
// `model_defines.h`) in unsigned RGB888 or RGB565 format, found at the
// location returned by `PeopleDetectionGetInputPointer`.
//
// @param results A pointer to an array to store the resulting boxes in. The
//  user needs to allocate space for this array. The recommended size is 20
//  elements; if the user allocates fewer, then fewer boxes can be returned.
//  If fewer boxes are detected, then also fewer are returned. The actual
//  number returned is given by `num_results_returned`, see below.
// @param results_length The number of `BoxPrediction` elements allocated by
//  the user in the `results` parameter above. See that parameter for more
//  info.
// @param num_results_returned The minimum of the number of bounding boxes
//  found in the image and `results_length`. The `results` array will be
//  filled with zeros beyond this amount. If this value is equal to
//  `results_length`, it might be an indication that more boxes were found
//  than could be output.
// @return a status-code indicating whether there was an error; it can be any
//  of `Success`, `InvokeError`, or `OutputDimsError` (see above for a
//  description).
StatusCode PeopleDetectionMicroProcessFrame(BoxPrediction* results,
                                            int results_length,
                                            int* num_results_returned);

Example model_defines.h header

Following is an example of the model_defines.h header. This header might be different depending on the input resolution, the software version, and the target platform.

// The total required tensor arena size, the sum of the two components below.
#define TENSOR_ARENA_SIZE 378448

// The required persistent tensor arena size. Anything in this region should
// not be overwritten.
#define TENSOR_ARENA_SIZE_PERSISTENT 34304

// The required non-persistent tensor arena size. This can be freely accessed
// in between model invocations.
#define TENSOR_ARENA_SIZE_NON_PERSISTENT 344128

// The Plumerai people detection algorithm requires a fixed input resolution.
// It uses the RGB888 data-format with 3 bytes per pixel.
#define PLUMERAI_IMAGE_WIDTH 320   // The width of the input image in pixels.
#define PLUMERAI_IMAGE_HEIGHT 240  // The height of the input image in pixels.
#define PLUMERAI_IMAGE_SIZE (PLUMERAI_IMAGE_WIDTH * PLUMERAI_IMAGE_HEIGHT * 3)

Example usage

Below is an example of using the C API shown above.

#include "plumerai/model_defines.h"
#include "plumerai/people_detection_micro.h"

// Here cstdio is used for `printf`, but this can be replaced with any other
// method of printing results to screen, if needed.
#include <cstdio>

// Example tensor arena. Can be allocated on a specific memory region if
// desired. It needs to be of size TENSOR_ARENA_SIZE.
__attribute__((aligned(16))) unsigned char tensor_arena[TENSOR_ARENA_SIZE];

void mainloop() {
  // This initializes the people detector
  int error_code = PeopleDetectionMicroInit(tensor_arena);
  if (error_code != 0) return;
  auto input_ptr =
      reinterpret_cast<unsigned char*>(PeopleDetectionGetInputPointer());

  // We pre-allocate memory for the results; at most 10 boxes will be returned
  BoxPrediction predictions[10];

  // Loop over frames in a video stream. In this example we run only 10 times.
  for (int t = 0; t < 10; ++t) {
    // Some example input here, normally this is where image data is acquired.
    // We can pass the `input_ptr` value directly to our camera API.
    // In this example here we simply assign some values for fake input data.
    for (int i = 0; i < PLUMERAI_IMAGE_SIZE; ++i) {
      input_ptr[i] = static_cast<unsigned char>(i % 255);
    }

    // Process a single input image frame from e.g. a camera video stream. The
    // result will be stored in `predictions`, and `num_results` will tell us
    // how many results there are.
    int num_results = 0;
    error_code =
        PeopleDetectionMicroProcessFrame(predictions, 10, &num_results);
    if (error_code != 0) return;

    // Write the results to a terminal. Replace this with your own printing
    // function if needed.
    printf("Detected %d people\n", num_results);
    for (int i = 0; i < num_results; ++i) {
      BoxPrediction p = predictions[i];
      printf(
          "Detected person with confidence %.2f @ (x,y) -> (%.2f,%.2f) till "
          "(%.2f,%.2f)\n",
          p.confidence, p.x_min, p.y_min, p.x_max, p.y_max);
    }
  }

  // Clean-up
  error_code = PeopleDetectionMicroCleanUp();
  if (error_code != 0) return;
}

For advanced use with a callback, e.g. for parallel camera data capture, the above example can be adjusted slightly. The first part of the above mainloop function then becomes:

void mainloop() {
  // This initializes the people detector
  int error_code = PeopleDetectionMicroInit(tensor_arena);
  if (error_code != 0) return;

  // This sets the read-data callback which can be used to read in camera data
  // in parallel to running inference, e.g. through a DMA. This callback will be
  // executed roughly halfway execution of 'PeopleDetectionMicroProcessFrame'.
  PeopleDetectionReadDataCallback(start_camera_capture);

  // We pre-allocate memory for the results, at most 10 boxes can be found
  BoxPrediction predictions[10];

  // Since the callback above is only executed during inference (so the frame
  // it captures is only available for the next iteration), we need to acquire
  // the first camera input manually before entering the loop.
  auto input_ptr =
      reinterpret_cast<unsigned char *>(PeopleDetectionGetInputPointer());
  start_camera_capture(input_ptr);

  // Loop over frames in a video stream. In this example we run only 10 times.
  for (int t = 0; t < 10; ++t) {
    // Before we start 'PeopleDetectionMicroProcessFrame' we want to make sure
    // the camera data is fully captured.
    finalize_camera_capture();

It then continues as above: setting int num_results = 0, making the call to PeopleDetectionMicroProcessFrame, and everything that follows. The two new functions should then be filled in depending on the camera set-up, e.g.:

void start_camera_capture(void *input_ptr) {
  // Here we would normally ask the camera to start recording data and store it
  // into the 'input_ptr'. However, since this differs per camera, we leave this
  // unimplemented here. Ideally this function finishes almost instantly, but
  // leaves some parallel process working in the background (e.g. a DMA). We
  // should not write beyond 'PLUMERAI_IMAGE_SIZE' bytes from 'input_ptr'.
}

void finalize_camera_capture() {
  // If the 'start_camera_capture' function above started some parallel process
  // then it can be synchronized here to make sure the camera capture process
  // is completed. If it isn't completed yet, we can wait here in this function.
}