# Building an application with the inference engine
## Building
The generated inference engine library consists of four header files and a pre-compiled static library:

```
include/plumerai/inference_engine.h          # for the C++ API only
include/plumerai/inference_engine_c.h        # for the C API only
include/plumerai/tensorflow_compatibility.h
include/plumerai/model_defines.h
libplumerai.a
```
To build, make sure the header files can be found on the compiler include path, and link with `libplumerai.a`. Use `-Wl,--gc-sections` when linking to garbage-collect unused code from the binary; the library is compiled with `-fdata-sections` and `-ffunction-sections` to support this.
The exact details on how to compile and link depend on the target platform and compiler. Please refer to their respective documentation for detailed instructions.
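As an illustration, a link step on a GCC-based cross-toolchain might look like the sketch below. The compiler name, paths, and file names are placeholders; adapt them to your own platform and project.

```shell
# Hypothetical toolchain, paths, and file names; adapt to your platform.
# -I makes the plumerai/*.h headers findable, -L/-lplumerai links the
# pre-compiled static library, and -Wl,--gc-sections lets the linker drop
# unused code and data sections from the final binary.
arm-none-eabi-g++ -Os -I path/to/include main.cpp \
    -L path/to/lib -lplumerai -Wl,--gc-sections -o app.elf
```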
## Usage
The inference engine is built on top of TensorFlow Lite for Microcontrollers (TFLM), and usage is very similar. Instructions on how to use the API, along with an example, can be found here for the C++ API or here for the C API.
### Log messages
Log messages are the same as in TFLM: one has to provide a C function called `DebugLog` to output strings, for example over UART.
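A minimal `DebugLog` sketch could look as follows. On an embedded target the body would call into a UART driver (the `uart_putc` mentioned in the comment is a hypothetical name); `putchar` is used here so the example is self-contained and runs on a host machine. Check `tensorflow_compatibility.h` and your TFLM version for the exact expected signature.

```c
#include <stdio.h>

// Minimal DebugLog sketch: forward each character of the message to the
// output channel. putchar() stands in for a board-specific UART driver
// call (e.g. a hypothetical uart_putc()).
void DebugLog(const char* s) {
  while (*s != '\0') {
    putchar(*s++);  // replace with uart_putc(*s++) on your board
  }
}
```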
### Tensor arena
The tensor arena is a chunk of memory that stores the tensor data during model inference. The user has to provide this and make sure it is large enough. All tensors, including the model input and output, will point to a location within the tensor arena, overlapping each other when possible. For best performance, the tensor arena should be 16-byte aligned.
For convenience, the inference engine generator provides a `TENSOR_ARENA_SIZE` define (and `TENSOR_ARENA_SIZE_REPORT_MODE` for report mode) in the generated `include/plumerai/model_defines.h` file. This define can be used directly in the application after adding `#include "plumerai/model_defines.h"`, and should in most cases be sufficient. In rare cases it is only a lower bound and might need to be increased slightly; the user is informed about these cases when the `Arena size estimation might be inaccurate` message is printed in the offline report.
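Putting this together, declaring the arena could look like the sketch below. Because this example must compile standalone, a fallback size is defined where a real application would simply include the generated `plumerai/model_defines.h`.

```c
#include <stdalign.h>
#include <stdint.h>

// In a real application:  #include "plumerai/model_defines.h"
// Fallback value so this sketch compiles standalone:
#ifndef TENSOR_ARENA_SIZE
#define TENSOR_ARENA_SIZE (128 * 1024)
#endif

// Statically allocated, 16-byte aligned tensor arena. All tensor data,
// including the model inputs and outputs, lives inside this buffer.
alignas(16) static uint8_t tensor_arena[TENSOR_ARENA_SIZE];
```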
When supplying multiple models to the Inference Engine Generator, the user has to provide a separate tensor arena for each model. In this case, `plumerai/model_defines.h` adds the defines `TENSOR_ARENA_SIZE_MODEL_X` (and `TENSOR_ARENA_SIZE_MODEL_X_REPORT_MODE` for report mode), where `X` is `0`, `1`, or higher depending on the number of models. The original defines `TENSOR_ARENA_SIZE` and `TENSOR_ARENA_SIZE_REPORT_MODE` also still exist: they are the sum of all model-specific defines.
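For example, with two models the application would declare one arena per model. As before, fallback sizes are defined so this sketch compiles without the generated header; a real application would include `plumerai/model_defines.h` instead.

```c
#include <stdalign.h>
#include <stdint.h>

// In a real application:  #include "plumerai/model_defines.h"
// Fallback values so this sketch compiles standalone:
#ifndef TENSOR_ARENA_SIZE_MODEL_0
#define TENSOR_ARENA_SIZE_MODEL_0 (96 * 1024)
#define TENSOR_ARENA_SIZE_MODEL_1 (32 * 1024)
#endif

// One 16-byte aligned tensor arena per model.
alignas(16) static uint8_t tensor_arena_model_0[TENSOR_ARENA_SIZE_MODEL_0];
alignas(16) static uint8_t tensor_arena_model_1[TENSOR_ARENA_SIZE_MODEL_1];
```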
## Examples
Example applications can be found here for C++ and here for C.