Plumerai Inference Engine C++ API for microcontrollers

This document describes the C++ API for the Plumerai Inference Engine for microcontrollers.

The API

InferenceEngine::InferenceEngine

InferenceEngine::InferenceEngine(std::uint8_t* tensor_arena_ptr, int tensor_arena_size, ::tflite::MicroProfiler* profiler = nullptr);

Creates the inference engine object. The tensor arena has to be provided by the user and should be large enough to hold the model's activation tensors. Ideally the tensor arena is 16-byte aligned. The class does not take ownership of the tensor arena or the optional profiler. The contents of the tensor arena should not be overwritten during the lifetime of the object, except by setting input tensor data through the corresponding functions.
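For example, the engine could be constructed as follows (a minimal sketch; the 32 KB arena size is a placeholder and has to be chosen large enough for your model):

#include <cstdint>

#include "plumerai_inference.h"

// Placeholder size: choose this large enough for the model's activation tensors.
constexpr int kArenaSize = 32 * 1024;
alignas(16) std::uint8_t tensor_arena[kArenaSize];  // ideally 16-byte aligned

plumerai::InferenceEngine engine(tensor_arena, kArenaSize);

An optional ::tflite::MicroProfiler pointer can be passed as a third argument; like the tensor arena, it is never deallocated by the engine.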

InferenceEngine::AddCustomOp

TfLiteStatus InferenceEngine::AddCustomOp(const char* name, TfLiteRegistration* registration);

Optional: only needed in case you have ops that are not supported by TensorFlow Lite for Microcontrollers. If used, it has to be called before AllocateTensors. The call is forwarded to the TensorFlow op resolver function MicroOpResolver::AddCustom and accepts the same arguments.
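A sketch of how this could look (the op name and the Register_MY_CUSTOM_OP function below are hypothetical and would be provided by your own op implementation):

// Hypothetical: returns the TfLiteRegistration* for an application-defined op.
extern TfLiteRegistration* Register_MY_CUSTOM_OP();

TfLiteStatus SetupModel(plumerai::InferenceEngine& engine) {
  // Custom ops have to be registered before AllocateTensors.
  TfLiteStatus status = engine.AddCustomOp("MY_CUSTOM_OP", Register_MY_CUSTOM_OP());
  if (status != kTfLiteOk) {
    return status;
  }
  return engine.AllocateTensors();
}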

InferenceEngine::AllocateTensors

TfLiteStatus InferenceEngine::AllocateTensors();

Allocates input, output and intermediate tensors in the tensor arena. This needs to be called before running inference with Invoke. Returns kTfLiteError when the tensor arena is not large enough to hold all tensors. When custom ops have been registered, this will call the Init and Prepare functions of those ops.

InferenceEngine::Invoke

TfLiteStatus InferenceEngine::Invoke();

Runs inference, assuming the input data has already been set; see the functions below for setting input data. Returns kTfLiteOk if everything went right.

Input and output tensors

TfLiteTensor* InferenceEngine::input(int input_id);
TfLiteTensor* InferenceEngine::output(int output_id);
size_t InferenceEngine::inputs_size() const;
size_t InferenceEngine::outputs_size() const;

Get access to the input and output tensors; inputs_size and outputs_size return the number of input and output tensors of the model. The TfLiteTensor object is the same as the one in TensorFlow (tensorflow/lite/c/common.h). Relevant functionality includes getting a pointer to the data, the data type and the shape of the tensor:

TfLiteTensor* input_tensor = engine.input(0);

// The tensor's data type, e.g. kTfLiteInt8
TfLiteType input_data_type = input_tensor->type;

// Pointer to the tensor data (here assuming an int8 tensor)
std::int8_t* input_data = tflite::GetTensorData<std::int8_t>(input_tensor);

// The tensor shape (number of dimensions and their sizes)
TfLiteIntArray* input_shape = input_tensor->dims;

InferenceEngine::arena_used_bytes

size_t InferenceEngine::arena_used_bytes() const;

For debugging only. This method gives the optimal arena size, i.e. the number of bytes of the tensor arena that are actually used by the model. It is only available after AllocateTensors has been called.
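For example, after a successful AllocateTensors call the value can be logged to tune the arena size (a minimal sketch, using MicroPrintf for output as in the example below):

if (engine.AllocateTensors() == kTfLiteOk) {
  // Reports how many bytes of the tensor arena the model actually needs,
  // so that the arena size can be reduced accordingly.
  MicroPrintf("Arena used: %d bytes", static_cast<int>(engine.arena_used_bytes()));
}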

The full class declaration is as follows:

#include <cstdint>

#include "plumerai_tf_compat.h"

namespace plumerai {

class InferenceEngine {
 public:
  // The lifetime of the tensor arena and optional profiler must be at least
  // as long as that of the interpreter object, since the interpreter may need
  // to access them at any time. The interpreter doesn't do any deallocation of
  // any of the pointed-to objects, ownership remains with the caller.
  InferenceEngine(std::uint8_t* tensor_arena_ptr, int tensor_arena_size,
                  ::tflite::MicroProfiler* profiler = nullptr);

  // `AddCustomOp` is only needed if there are custom ops not supported by
  // TF Micro. Has to be called before `AllocateTensors`.
  TfLiteStatus AddCustomOp(const char* name, TfLiteRegistration* registration);

  // Runs through the model and allocates all necessary input, output and
  // intermediate tensors in the tensor arena.
  TfLiteStatus AllocateTensors();

  // Run inference, assumes input data is already set
  TfLiteStatus Invoke();

  TfLiteTensor* input(int input_id);
  TfLiteTensor* output(int output_id);
  size_t inputs_size() const;
  size_t outputs_size() const;

  // For debugging only.
  // This method gives the optimal arena size. It's only available after
  // `AllocateTensors` has been called.
  size_t arena_used_bytes() const;

 private:
  struct impl;
  impl* impl_;
};

}  // namespace plumerai

Example

The Plumerai Inference Engine consists of just a single class, plumerai::InferenceEngine. It can be used as follows:

#include "plumerai_inference.h"

// Example tensor arena of 8 KB, to be changed depending on the model.
// Ideally the tensor arena is 16-byte aligned.
constexpr size_t arena_size = 8 * 1024;
alignas(16) uint8_t tensor_arena[arena_size];

// TODO: Implement this to define how debug printing is done
void DebugLog(const char *s) {
  // printf("%s", s);
}

int main() {
  plumerai::InferenceEngine inference(tensor_arena, arena_size);

  // Allocate memory from the tensor_arena for the model's tensors.
  auto allocate_status = inference.AllocateTensors();
  if (allocate_status != kTfLiteOk) {
    MicroPrintf("AllocateTensors() failed");
    return 1;
  }

  // Obtain pointers to the model's input and output tensors.
  // TODO: Assumes the model has one input and one output, modify this if there
  // are more.
  TfLiteTensor* input = inference.input(0);
  TfLiteTensor* output = inference.output(0);

  // Example: print the input shape
  MicroPrintf("Input shape:");
  for (int i = 0; i < input->dims->size; ++i) {
    MicroPrintf(" %d", input->dims->data[i]);
  }

  // Example: run inference in an infinite loop.
  while (true) {
    // Set input data example. TODO: Get data from sensor.
    int8_t* input_data = tflite::GetTensorData<int8_t>(input);
    input_data[0] = 17;  // example, setting first element to '17'

    // Run inference on a single input.
    auto invoke_status = inference.Invoke();
    if (invoke_status != kTfLiteOk) {
      MicroPrintf("Invoke failed");
      return 1;
    }

    // Read results and print first output to screen.
    int8_t* output_data = tflite::GetTensorData<int8_t>(output);
    MicroPrintf("Result: %d", int(output_data[0]));
  }

  return 0;
}