People/Vehicle/Animal/Package Detection¶
At Plumerai we've built the world’s most accurate people, vehicle, animal, and package detection for camera-based video analytics. We deliver a complete software solution for object detection that can run at the edge, on-premise, or in the cloud. Our models are tiny, fast, and detect objects with high accuracy under any circumstance.
You can try our AI live in the browser with your webcam.
Technical documentation¶
We have split the technical documentation for our object detection products into the following sections:
- The Video Intelligence API reference.
- Platform support.
- Demo on Arm/x86.
- Demo on ESP32-S3.
- Demo on Realtek Ameba Pro2.
Additionally, we provide example usage of the Plumerai Video Intelligence library in the form of a simple OpenCV webcam-to-display pipeline. Documentation and source code are available in a public GitHub repository.
Object detection features¶
Class definitions¶
Below is an overview of what each object class includes and excludes.
People¶
The library accurately detects people in a wide range of situations, including different poses (walking, sitting, crouching), occlusion (partially blocked by objects), diversity (all skin tones, ethnicities, genders, and ages), and clothing (including hats, masks, or a hijab).
- Included: Entire or partial humans of any age, including people in reflections, on screens, or on vehicles such as bicycles or motorbikes.
- Excluded: Non-realistic depictions (e.g. cartoons, paintings), and people inside closed vehicles (e.g. cars) when not clearly visible.
Vehicles¶
Both moving and non-moving (e.g. parked) vehicles are detected. Non-moving vehicles can optionally be filtered out with functionality provided by the Plumerai Video Intelligence API.
- Included: Cars (sedans, coupes, SUVs, minivans, etc.), trucks (pickups, semis, delivery and construction vehicles, etc.), buses, and motorcycles. Including partial vehicles (e.g. occlusion, partly out of view).
- Excluded: Bicycles and tricycles, toy vehicles, rail vehicles, aircraft, and non-realistic depictions of vehicles (e.g. cartoons).
Note that miscellaneous road vehicles such as golf carts, rickshaws, and tractors aren't explicitly included nor excluded: they might be detected depending on the specific object and situation.
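The API provides the non-moving-vehicle filtering mentioned above out of the box. Purely as an illustration of the underlying idea, here is a sketch that drops vehicle tracks whose center barely moves over a window of frames; all names and thresholds are hypothetical, not the Plumerai API.

```python
# Illustrative sketch (not the Plumerai API): keep only vehicle tracks
# whose center displacement over a window of frames exceeds a threshold.
from collections import defaultdict, deque

class StationaryFilter:
    def __init__(self, window=30, min_displacement=10.0):
        self.min_displacement = min_displacement  # pixels over the window
        self.history = defaultdict(lambda: deque(maxlen=window))

    def update(self, detections):
        """detections: list of dicts with 'id' and 'center' (x, y).
        Returns only the detections considered to be moving."""
        moving = []
        for det in detections:
            hist = self.history[det["id"]]
            hist.append(det["center"])
            if len(hist) < hist.maxlen:
                moving.append(det)  # not enough history yet: keep it
                continue
            (x0, y0), (x1, y1) = hist[0], hist[-1]
            if ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 >= self.min_displacement:
                moving.append(det)
        return moving
```

A displacement check over a window, rather than frame-to-frame motion, avoids misclassifying vehicles that jitter slightly due to detection noise.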
Animals¶
- Included: All large and small animals not listed under 'excluded' below. This includes domestic animals, wild animals, and farm animals. Some examples: cats, dogs, rodents, sheep, cows, horses, raccoons, coyotes, deer, bears, and elephants.
- Excluded: Birds, chickens, reptiles, insects, spiders, and non-realistic depictions of animals.
Packages¶
- Included: Boxes, padded envelopes, and polybags (mailbags). This includes packages wrapped in plastic, packages made of colored cardboard, open-top packages, and packages not being carried by a person.
- Excluded: Regular mail, garbage bags, food wrapping, and food transport bags (e.g. delivery backpacks).
Tracking¶
The library allows for tracking each object by providing a unique ID for each detected bounding box. This enables the application to understand the movement of an object from frame to frame. This information can then be used to detect when an object crosses a line or enters a specific area, or to determine the direction in which it is moving.
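The line-crossing use case above can be sketched in a few lines: compare each track's center position against a virtual line between consecutive frames. The detection format here is a hypothetical simplification, not the library's actual output type.

```python
# Sketch: count when tracked objects cross a horizontal line, using the
# per-box track IDs described above. Detection format is hypothetical.
def crossed_line(prev_y, curr_y, line_y):
    """True if the center moved to the other side of the line."""
    return (prev_y - line_y) * (curr_y - line_y) < 0

class LineCrossingCounter:
    def __init__(self, line_y):
        self.line_y = line_y
        self.last_y = {}   # track ID -> last seen center y
        self.crossings = 0

    def update(self, detections):
        # detections: list of (track_id, center_y) for the current frame
        for track_id, y in detections:
            if track_id in self.last_y and crossed_line(
                    self.last_y[track_id], y, self.line_y):
                self.crossings += 1
            self.last_y[track_id] = y
        return self.crossings
```

Because IDs are stable across frames, the same comparison generalizes to zone entry (point-in-polygon per track) or direction estimation (sign of the displacement).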
Camera¶
The object detection library supports a wide variety of cameras:
- Lens: supports a variety of lenses, including wide-angle lenses with a 180-degree field of view.
- Resolution: from low resolutions such as QVGA (320x240) up to and beyond 4K (3840x2160).
- Color: supports both color (RGB) and black-and-white media.
- Aspect ratio: from landscape to portrait, for example 16:9, 1:1, or 9:16.

Lighting¶
The library correctly detects objects under a wide variety of lighting conditions:
- Low light: the detector remains accurate even with noisy or blurry media.
- Night vision: cameras that use NIR LEDs to light up the scene in the dark are supported.
Distance¶
Detects objects up close or at a distance:
- Close: detects objects that are up close and only partially in view.
- Distant: detects objects at more than 20 meters or 65 feet.
Mounting¶
A variety of mounting heights and angles are supported:
- Standard: mounted at a height of 1-2 meters or 3-6 feet.
- High: mounted at a height of 2-5 meters or 6-15 feet.
- Low: mounted lower than 1 meter or 3 feet.
- Angles: from level to 45 degrees downward.
Locations¶
Whatever the location, the object detection performs well:
- Indoor: a variety of settings such as homes, offices, stores, hospitals, and malls.
- Outdoor: a variety of locations around offices and homes, such as streets, yards, parking lots, and sidewalks.
Fixed or moving camera¶
Both fixed and moving cameras are supported:
- Fixed: cameras mounted on the wall or embedded into a device such as a display or laptop.
- Moving: cameras mounted on a robot or vehicle, or pan-tilt-zoom cameras.
Dataset¶
The software has been trained with over 32 million images and videos to ensure robust performance in a large variety of situations:
- Curated: to ensure that all the important use-cases are included, and in the right balance.
- Unbiased: we use proprietary data analysis methods to identify specific issues stemming from sampling biases in public datasets.
- Varied: our dataset consists of a very large variety of images of all kinds of objects, settings, and conditions.
- Labeled: we have an extensive methodology to label our datasets to ensure labels are accurate and precise.
Validation¶
- Validation: The object detection models have been extensively validated on a diverse range of people, animals and objects, in different poses, settings, and situations.
- Unit tests: Data unit tests guarantee that the model works reliably for specific subsets of our dataset. For example, for people standing far away or for people who are only showing their back, as well as challenging cases involving vehicles, animals, and packages.
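As an illustration of what a data unit test in this style can look like, the sketch below asserts a minimum recall on one labeled subset. The helper names, subset, and threshold are illustrative placeholders, not Plumerai's actual test suite or numbers.

```python
# Illustrative data unit test: assert a minimum average recall on a
# labeled subset (e.g. people standing far away). All names and the
# 0.9 threshold are hypothetical placeholders.
def recall(predictions, ground_truth):
    """Fraction of ground-truth objects matched by a prediction."""
    if not ground_truth:
        return 1.0
    matched = sum(1 for gt in ground_truth if gt in predictions)
    return matched / len(ground_truth)

def check_subset(detector, subset, threshold=0.9):
    # subset: list of (image, ground_truth_labels) pairs
    scores = [recall(detector(image), labels) for image, labels in subset]
    avg = sum(scores) / len(scores)
    assert avg >= threshold, f"recall regression on subset: {avg:.2f}"
    return avg
```

Running such checks per subset, rather than only reporting one aggregate metric, is what lets a regression on a narrow slice (e.g. people showing only their back) fail loudly instead of disappearing into an average.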
Privacy¶
Our object detection software is designed to support privacy-conscious deployments across edge, on-premise, and cloud environments:
- Flexible deployment: our object detection model is small and lightweight, enabling it to run directly on the camera, on local on-premise devices, or in the cloud, depending on the product architecture and privacy requirements.
- Edge and on-premise options: when deployed on the device or within on-premise infrastructure, image data can remain under local control rather than being transmitted over the internet for further processing.
- No unnecessary image transmission: deployments can be designed so that image data does not need to leave the device or local network, reducing exposure during transmission.
- Controlled data handling: the system architecture can be configured to align with the customer’s requirements for storage, access control, and data processing.