Plumerai Video Intelligence platform support
Supported platforms
The Plumerai Video Intelligence library is optimized for fast, memory-efficient inference across a wide range of processors, from embedded devices to high-performance systems. It leverages SIMD and AI accelerators where available to maximize efficiency and is extensively tested for reliability and compatibility across the following platforms:
- Arm Cortex-A with/without Neon:
    - Armv7 and AArch64/ARM64.
    - Cortex-A5, A7, A9, A15, A35, A53, A72, etc.
    - NXP i.MX, Qualcomm Snapdragon, Renesas RZ, TI Sitara.
    - Raspberry Pi 3/4/5 and similar devices.
- Arm Cortex-M with/without Helium:
    - Arm Cortex-M4, M7, M33, M55, M85.
- Neural Processing Units (NPUs) and AI accelerators:
    - Ambarella CV series 'Vector Processor' VP (e.g. CV28).
    - Anyka NNE NPU.
    - Arm Ethos-U.
    - Realtek Ameba Pro2 'Intelligent Engine' NPU.
    - SigmaStar IPU.
- x86 and x86_64 with SSE2, SSE4, AVX, AVX2, AVX-512:
    - Intel Celeron, i3, i5, i7, Xeon, etc.
    - AMD Opteron, Ryzen, EPYC, etc.
    - Desktops, laptops, cloud.
- Microcontrollers (MCUs):
    - Arm Cortex-M (see above).
    - Espressif ESP32-S3 with Tensilica Xtensa LX7.
    - Synopsys ARC EM4.
    - RISC-V.
The Plumerai Video Intelligence library can run on:
- Bare-metal systems.
- RTOS-based systems.
- Full operating systems: Linux, Windows, macOS.
Additional features:
- Supports static memory allocation, making it compatible with systems that lack dynamic memory allocation.
- Can operate without requiring the C++ standard library, enabling use in highly constrained environments.
Framerate and memory requirements
This section lists expected framerates and memory requirements for several common Plumerai Video Intelligence configurations on common target platforms. Some notes on the results:
- The software runs on other platforms as well, and can run in other configurations than those presented, including any input resolution.
- Depending on the exact configuration, the framerates can vary.
- All benchmarks use a single CPU core only, but the algorithm can also run across 2 cores (~1.8x speed-up) or 4 cores (~2.6x speed-up) if desired.
- The listed framerates assume that the entire CPU core or NPU is available to Plumerai. The software can also run at a lower framerate to leave room for other tasks on the same core, or to save battery.
- The reported RAM upper bounds were observed on AArch64 and x86-64 platforms; on microcontrollers and other resource-constrained devices the algorithms can run with a much lower memory footprint.
- The memory requirement can grow or be reduced depending on the exact use-case, such as the input resolution or the maximum size of the familiar face library. Furthermore, Plumerai can design a custom solution that fits specific requirements.
Please contact us for more information about your specific platform and use-case.
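The multi-core and shared-core notes above lend themselves to a quick back-of-the-envelope estimate. The sketch below is illustrative only: the function names are hypothetical, the speed-up factors are the approximate ones quoted above, and the utilization estimate assumes that the cost per frame is constant, which is a simplification rather than a measured property of the library.

```python
# Approximate multi-core speed-up factors quoted in the notes above.
MULTI_CORE_SPEEDUP = {1: 1.0, 2: 1.8, 4: 2.6}


def estimate_fps(single_core_fps: float, cores: int) -> float:
    """Estimate the framerate on `cores` cores from a single-core figure."""
    return single_core_fps * MULTI_CORE_SPEEDUP[cores]


def parallel_efficiency(cores: int) -> float:
    """Fraction of ideal linear scaling reached on `cores` cores."""
    return MULTI_CORE_SPEEDUP[cores] / cores


def core_utilization(target_fps: float, max_fps: float) -> float:
    """Rough share of one core needed to sustain `target_fps`.

    Assumption: the cost per frame is constant, so load grows linearly
    with framerate. `max_fps` is the framerate with the core fully available.
    """
    if target_fps > max_fps:
        raise ValueError("target framerate exceeds the measured maximum")
    return target_fps / max_fps


if __name__ == "__main__":
    single_core_fps = 33.0  # example single-core measurement
    for cores in (1, 2, 4):
        fps = estimate_fps(single_core_fps, cores)
        eff = parallel_efficiency(cores)
        print(f"{cores} core(s): ~{fps:.1f} FPS, efficiency {eff:.0%}")
    share = core_utilization(10.0, single_core_fps)
    print(f"Running at 10 FPS would use roughly {share:.0%} of one core")
```

Under these assumptions, the quoted factors correspond to about 90% of linear scaling on 2 cores and about 65% on 4 cores. Actual numbers depend on the platform and configuration.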
People detection only
In this configuration, we run the Plumerai People Detection algorithm, including multi-frame object tracking and motion detection. This is the most lightweight configuration and can run on tiny microcontrollers.
Platform | Processor (single CPU) | Framerate @ 640x480 |
---|---|---|
Espressif ESP32-S3 | Tensilica Xtensa LX7 @ 240MHz | 3.2 FPS |
STMicro STM32H7B3 | Arm Cortex-M7 @ 240MHz | 2.3 FPS |
Renesas RA8D1 | Arm Cortex-M85 @ 480MHz | 13.5 FPS |
Alif E1 MCU | Arm Ethos-U @ 400MHz | 83.0 FPS |
Raspberry Pi 3 | Arm Cortex-A53 @ 1.2GHz | 27.2 FPS |
Depending on the platform and the exact configuration, the memory requirements are as follows:
Memory type | Minimum | Maximum |
---|---|---|
ROM (e.g. flash storage) | 1.5MB | 2MB |
RAM | 350KB | 20MB |
People detection with familiar face identification
In this configuration, we add face detection and familiar face identification.
Platform | Processor (single CPU) | Framerate @ 1280x720 | Framerate @ 1920x1080 |
---|---|---|---|
Realtek Ameba Pro2 | Realtek NPU | (not measured) | 12 FPS |
Infineon Edge E84 | Arm Ethos-U @ 400MHz | 25 FPS | (not measured) |
Alif E1 MCU | Arm Ethos-U @ 400MHz | 33 FPS | (not measured) |
Renesas RZ/V2L | Arm Cortex-A55 @ 1.2GHz | (not measured) | 18 FPS |
Raspberry Pi 3 | Arm Cortex-A53 @ 1.2GHz | 14 FPS | 13 FPS |
Raspberry Pi 4 | Arm Cortex-A72 @ 1.8GHz | 33 FPS | 31 FPS |
Raspberry Pi 5 | Arm Cortex-A76 @ 2.4GHz | 147 FPS | 135 FPS |
The above results are independent of the number of objects in view, but they assume that no faces are in view for familiar face identification. Each face in view lowers the framerate slightly, as shown below for an input resolution of 1280x720:
Platform | Processor (single CPU) | With 0 faces | With 1 face | With 2 faces |
---|---|---|---|---|
Infineon Edge E84 | Arm Ethos-U @ 400MHz | 25 FPS | 19 FPS | 16 FPS |
Raspberry Pi 3 | Arm Cortex-A53 @ 1.2GHz | 14 FPS | 11 FPS | 9 FPS |
Raspberry Pi 4 | Arm Cortex-A72 @ 1.8GHz | 33 FPS | 32 FPS | 31 FPS |
Raspberry Pi 5 | Arm Cortex-A76 @ 2.4GHz | 147 FPS | 138 FPS | 130 FPS |
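To make the per-face overhead easier to compare across platforms, the framerates in the table above can be converted to per-frame processing times. The sketch below does only that arithmetic on the published (rounded) figures; the helper names are hypothetical and the results are approximations, not measurements.

```python
# Framerates at 1280x720 from the table above: (0 faces, 1 face, 2 faces).
FPS_BY_FACES = {
    "Infineon Edge E84": (25, 19, 16),
    "Raspberry Pi 3": (14, 11, 9),
    "Raspberry Pi 4": (33, 32, 31),
    "Raspberry Pi 5": (147, 138, 130),
}


def frame_time_ms(fps: float) -> float:
    """Convert a framerate into the time spent per frame, in milliseconds."""
    return 1000.0 / fps


def per_face_cost_ms(fps_values):
    """Extra milliseconds per frame added by each additional face in view."""
    times = [frame_time_ms(fps) for fps in fps_values]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for platform, fps_values in FPS_BY_FACES.items():
        costs = ", ".join(f"{c:.1f} ms" for c in per_face_cost_ms(fps_values))
        print(f"{platform}: extra time per added face: {costs}")
```

On the Raspberry Pi 3, for example, this works out to roughly 20 ms of extra processing per face, whereas on the Raspberry Pi 5 it is well under 1 ms. Because the table values are rounded, these derived costs are indicative only.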
Depending on the platform and the exact configuration, the memory requirements are as follows:
Memory type | Minimum | Maximum |
---|---|---|
ROM (e.g. flash storage) | 6MB | 9MB |
RAM (with manual enrollment) | 5MB | 30MB |
RAM (with automatic enrollment) | 9MB | 34MB |
People, vehicle, animal and package detection with familiar face identification
In this configuration, we run the Plumerai Video Intelligence software as above, but with vehicle, animal, and package detection added. This is the heaviest configuration, but it still achieves very good framerates.
Platform | Processor (single CPU) | Framerate @ 1280x720 | Framerate @ 1920x1080 |
---|---|---|---|
Realtek Ameba Pro2 | Realtek NPU | (not measured) | 10 FPS |
SigmaStar SAV524D | SigmaStar IPU | (not measured) | 28 FPS |
Ambarella CV28M | CVFlow VP @ 1.0GHz | (not measured) | 29.6 FPS |
Raspberry Pi 3 | Arm Cortex-A53 @ 1.2GHz | 6.4 FPS | 6.3 FPS |
Raspberry Pi 4 | Arm Cortex-A72 @ 1.8GHz | 16.0 FPS | 15.6 FPS |
Raspberry Pi 5 | Arm Cortex-A76 @ 2.4GHz | 77 FPS | 74 FPS |
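The CPU rows in the table above can be used to gauge how much input resolution affects framerate in this configuration. The following sketch compares the increase in pixel count with the fraction of the framerate that is retained; it uses only the figures from the table, and the variable names are illustrative.

```python
# (FPS at 1280x720, FPS at 1920x1080) from the CPU rows in the table above.
FPS_BY_RESOLUTION = {
    "Raspberry Pi 3": (6.4, 6.3),
    "Raspberry Pi 4": (16.0, 15.6),
    "Raspberry Pi 5": (77.0, 74.0),
}

# 1920x1080 has 2.25 times as many pixels as 1280x720.
PIXEL_RATIO = (1920 * 1080) / (1280 * 720)

# Fraction of the 720p framerate that remains at 1080p.
FPS_RETAINED = {
    platform: fps_1080 / fps_720
    for platform, (fps_720, fps_1080) in FPS_BY_RESOLUTION.items()
}

if __name__ == "__main__":
    print(f"Pixel count ratio (1080p / 720p): {PIXEL_RATIO:.2f}x")
    for platform, retained in FPS_RETAINED.items():
        print(f"{platform}: {retained:.1%} of the 720p framerate at 1080p")
```

For these three platforms the framerate at 1920x1080 stays within a few percent of the 1280x720 figure even though the frame contains 2.25 times as many pixels. This is an observation about the listed numbers only; other platforms and configurations may behave differently.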
The above results are independent of the number of objects in view, but they assume that no faces are in view for familiar face identification. Each face in view lowers the framerate slightly, as shown below for an input resolution of 1280x720:
Platform | Processor (single CPU) | With 0 faces | With 1 face | With 2 faces |
---|---|---|---|---|
Raspberry Pi 3 | Arm Cortex-A53 @ 1.2GHz | 6.4 FPS | 5.8 FPS | 5.2 FPS |
Raspberry Pi 4 | Arm Cortex-A72 @ 1.8GHz | 16.0 FPS | 14.4 FPS | 12.8 FPS |
Raspberry Pi 5 | Arm Cortex-A76 @ 2.4GHz | 77 FPS | 74 FPS | 72 FPS |
Depending on the platform and the exact configuration, the memory requirements are as follows:
Memory type | Minimum | Maximum |
---|---|---|
ROM (e.g. flash storage) | 7MB | 10MB |
RAM (with manual enrollment) | 5MB | 30MB |
RAM (with automatic enrollment) | 9MB | 34MB |
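As a rough first screening, the minimum figures from the memory tables on this page can be compared against a device's flash and RAM. The sketch below is a hypothetical helper, not part of the Plumerai library. It assumes binary units (1MB = 1024KB), which the tables do not specify, and meeting a minimum does not guarantee a fit, since the actual requirement depends on the platform and the exact configuration.

```python
from dataclasses import dataclass

KB = 1024       # assumption: binary units; the tables do not specify this
MB = 1024 * KB


@dataclass(frozen=True)
class MinimumFootprint:
    """Minimum ROM and RAM for one configuration, taken from the tables."""
    name: str
    rom_bytes: int
    ram_bytes: int


MINIMUMS = [
    MinimumFootprint("people detection only", int(1.5 * MB), 350 * KB),
    MinimumFootprint("people + faces (manual enrollment)", 6 * MB, 5 * MB),
    MinimumFootprint("people + faces (automatic enrollment)", 6 * MB, 9 * MB),
    MinimumFootprint("all detectors + faces (manual enrollment)", 7 * MB, 5 * MB),
    MinimumFootprint("all detectors + faces (automatic enrollment)", 7 * MB, 9 * MB),
]


def candidate_configurations(flash_bytes: int, ram_bytes: int) -> list[str]:
    """Return configurations whose minimum ROM and RAM fit the given device."""
    return [
        m.name
        for m in MINIMUMS
        if flash_bytes >= m.rom_bytes and ram_bytes >= m.ram_bytes
    ]


if __name__ == "__main__":
    # Hypothetical device with 8MB of flash and 8MB of RAM.
    for name in candidate_configurations(8 * MB, 8 * MB):
        print("candidate:", name)
```

A device that passes this check is only a candidate. The maximum columns in the tables show that the real footprint can be considerably higher, for example at larger input resolutions or with a larger familiar face library.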