YOLOv8 on RTSP Stream
This project demonstrates how to use the SiMa PePPi API to run real-time object detection using the YOLOv8 model on a live RTSP stream. The pipeline is optimized for edge inference on SiMa’s MLSoC, and streams annotated video frames via UDP.
Purpose
This pipeline is designed to:
Read video from an RTSP stream using rtspsrc.
Run detection using the YOLOv8 model on SiMa MLSoC.
Annotate frames with bounding boxes and class labels.
Stream the output frames over UDP for real-time visualization.
This setup is suited to evaluating real-time object detection performance in edge AI deployments.
Configuration Overview
The application is driven by project.yaml. The parameters below describe its structure.
Input/Output Configuration
| Parameter | Description | Example |
|---|---|---|
| | Input source type | |
| | RTSP stream URL | |
| | Destination IP for UDP stream | |
| | Destination port for UDP stream | |
| | Processing pipeline name | |
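Before launching the pipeline it can help to sanity-check the stream source and the UDP destination. The following is a minimal sketch using only the Python standard library. The function name, its arguments, and the sample addresses are hypothetical and do not correspond to keys in project.yaml or to anything in the PePPi API.

```python
import ipaddress
from urllib.parse import urlparse


def validate_io_settings(rtsp_url: str, udp_host: str, udp_port: int) -> None:
    """Raise ValueError if the stream source or UDP destination looks malformed."""
    parsed = urlparse(rtsp_url)
    if parsed.scheme not in ("rtsp", "rtsps") or not parsed.hostname:
        raise ValueError(f"not an RTSP URL: {rtsp_url!r}")
    # ip_address() raises ValueError for anything that is not a literal
    # IPv4 or IPv6 address.
    ipaddress.ip_address(udp_host)
    if not 1 <= udp_port <= 65535:
        raise ValueError(f"UDP port out of range: {udp_port}")


# Placeholder values for illustration only (documentation address range).
validate_io_settings("rtsp://192.0.2.10:8554/stream", "192.0.2.20", 5000)
```

Failing early on a malformed URL or port is usually easier to debug than a pipeline that starts and then silently receives no frames.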
Model Configuration (Models[0])
| Parameter | Description | Value |
|---|---|---|
| | Model identifier | |
| | Compressed model archive path | |
| | Path to label file | |
| | Apply input normalization | |
| | Input channel mean values | |
| | Input channel stddev values | |
| | Padding type during preprocessing | |
| | Maintain input aspect ratio | |
| | Max number of detections per frame | |
| | Score threshold for valid detections | |
| | IOU threshold for non-max suppression | |
| | Detection decoding strategy | |
| | Number of classes the model can detect | |
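To make the preprocessing parameters more concrete, the sketch below shows what aspect-ratio-preserving resizing with padding and per-channel normalization typically compute. It is an illustration in plain Python, not the preprocessing code that runs on the MLSoC. The 640x640 model input size, the centered padding, and the function names are assumptions made for the example.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving resize.

    The image is scaled by a single factor so it fits inside the target,
    and the leftover space is split evenly (centered padding is an assumption).
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def normalize_pixel(pixel, mean, stddev):
    """Apply (value - mean) / stddev to each channel of one pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, stddev)]


# A 1920x1080 frame fitted into an assumed 640x640 model input.
print(letterbox_geometry(1920, 1080, 640, 640))  # (640, 360, 0, 140)

# With mean 0.0 and stddev 1.0 the normalization leaves values unchanged.
print(normalize_pixel([0.5, 0.25, 1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
```

Keeping the aspect ratio avoids stretching objects, at the cost of padded borders that the detector has to ignore.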
Main Python Script
The Python script executes the following steps:
Loads project.yaml.
Initializes a VideoReader for RTSP input and a VideoWriter for UDP output.
Sets up the YOLOv8 model using an MLSoCSession configured with SiMa MLSoC.
In a loop:
  Reads an input frame.
  Optionally dumps it to disk for debugging (/tmp/nv12.out).
  Runs the model.
  Renders detection results onto the frame.
  Streams the annotated frame via UDP.
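The control flow of the steps above can be sketched as follows. This is not the project's script and does not call the PePPi API: the reader, model, renderer, and writer are replaced by hypothetical callables (frames, detect, render, send) so that the loop runs without hardware or a live stream. In the real script those roles are played by VideoReader, MLSoCSession, and VideoWriter, whose exact method signatures are not shown here.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical detection record: x, y, width, height, score, label.
Detection = Tuple[int, int, int, int, float, str]


def run_pipeline(
    frames: Iterable[bytes],
    detect: Callable[[bytes], List[Detection]],
    render: Callable[[bytes, List[Detection]], bytes],
    send: Callable[[bytes], None],
    dump_path: Optional[str] = None,
) -> int:
    """Read, optionally dump, detect, render, and send each frame.

    Returns the number of frames processed.
    """
    count = 0
    for frame in frames:
        if dump_path is not None:
            # Overwrite the dump with the latest raw frame for offline inspection.
            with open(dump_path, "wb") as f:
                f.write(frame)
        detections = detect(frame)
        annotated = render(frame, detections)
        send(annotated)
        count += 1
    return count


# Stand-ins for the stream, the model, the renderer, and the UDP sink.
fake_frames = [b"\x00" * 6, b"\x01" * 6]
sent: List[bytes] = []
processed = run_pipeline(
    fake_frames,
    detect=lambda frame: [(0, 0, 1, 1, 0.9, "object")],
    render=lambda frame, dets: frame,
    send=sent.append,
)
print(processed, len(sent))  # 2 2
```

Passing the stages in as callables keeps the loop testable on a workstation before it is wired to the device.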
Model Details
Download from here.
Model: YOLOv8
Input Format: NV12
Normalization: Yes (mean = [0.0, 0.0, 0.0], stddev = [1.0, 1.0, 1.0])
Thresholds:
Detection: 0.7
NMS IOU: 0.3
Output: Up to 10 detections per frame
Classes: 87 object categories
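To illustrate how the two thresholds and the detection cap interact, here is a minimal sketch of score filtering followed by greedy non-max suppression, using the values listed above (0.7, 0.3, 10) as defaults. It is a generic reference implementation in plain Python, not the decoding that runs on the MLSoC. The box layout, the class-agnostic suppression, and the inclusive score comparison are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed box layout: x1, y1, x2, y2, score.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    boxes: List[Box],
    score_thr: float = 0.7,
    iou_thr: float = 0.3,
    max_dets: int = 10,
) -> List[Box]:
    """Drop low-score boxes, suppress overlaps greedily, keep at most max_dets."""
    candidates = sorted(
        (b for b in boxes if b[4] >= score_thr),
        key=lambda b: b[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in candidates:
        # Keep a box only if it does not overlap any higher-scoring kept box.
        if all(iou(box, k) <= iou_thr for k in kept):
            kept.append(box)
            if len(kept) == max_dets:
                break
    return kept


boxes = [
    (0, 0, 10, 10, 0.9),    # kept: highest score
    (1, 1, 11, 11, 0.8),    # suppressed: IOU with the first box is about 0.68
    (20, 20, 30, 30, 0.75), # kept: no overlap
    (0, 0, 5, 5, 0.5),      # dropped: below the score threshold
]
print(filter_detections(boxes))
# [(0, 0, 10, 10, 0.9), (20, 20, 30, 30, 0.75)]
```

A low IOU threshold such as 0.3 suppresses aggressively, which reduces duplicate boxes but can merge detections of objects that sit close together.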