YOLOv7 on RTSP Stream

This project demonstrates the use of SiMa’s PePPi API to run real-time object detection on a live RTSP video stream using the YOLOv7 model. The application uses SiMa’s MLSoC hardware for accelerated inference and streams the annotated output via UDP.

Purpose

This pipeline is designed to:

- Capture live video from an RTSP source.
- Perform object detection using the YOLOv7 model.
- Annotate frames with bounding boxes and labels.
- Stream the results to a specified host via UDP.

This setup suits edge inference applications that need low-latency visual processing.

Configuration Overview

The runtime configuration is managed through `project.yaml`. The following tables explain the input/output and model configuration.

Input/Output Configuration

| Parameter | Description | Example |
| --- | --- | --- |
| `source.name` | Type of input source | `"rtspsrc"` |
| `source.value` | RTSP video stream URL | `"<RTSP_URL>"` |
| `udp_host` | Destination host IP for UDP output | `"<HOST_IP>"` |
| `port` | UDP port number | `"<PORT_NUM>"` |
| `pipeline` | Pipeline name for inference | `"yoloV7Pipeline"` |
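As a rough illustration of how these keys fit together, the sketch below mirrors the I/O portion of `project.yaml` as a Python dictionary and checks that each documented key is present. The nesting (`source` as a mapping holding `name` and `value`) is inferred from the dotted names in the table, and `missing_keys` is a hypothetical helper, not part of PePPi.

```python
# Hypothetical helper for illustration; not part of the PePPi API.
REQUIRED_IO_KEYS = ["source.name", "source.value", "udp_host", "port", "pipeline"]


def missing_keys(config, required=REQUIRED_IO_KEYS):
    """Return the dotted keys from `required` that are absent from `config`."""
    missing = []
    for dotted in required:
        node = config
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


# Placeholder values copied from the table; replace them for a real deployment.
example_config = {
    "source": {"name": "rtspsrc", "value": "<RTSP_URL>"},
    "udp_host": "<HOST_IP>",
    "port": "<PORT_NUM>",
    "pipeline": "yoloV7Pipeline",
}

print(missing_keys(example_config))  # []
print(missing_keys({"source": {"name": "rtspsrc"}, "port": "<PORT_NUM>"}))
# ['source.value', 'udp_host', 'pipeline']
```

Running a check like this before starting the pipeline reports a missing key up front instead of a `KeyError` partway through start-up.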

Model Configuration (`Models[0]`)

| Parameter | Description | Value |
| --- | --- | --- |
| `name` | Model identifier | `"yolov7"` |
| `targz` | Path to the YOLOv7 model archive | `"<targz_path>"` |
| `label_file` | Class label file path | `"labels.txt"` |
| `normalize` | Whether to apply normalization | `true` |
| `channel_mean` | Input channel mean values | `[0.0, 0.0, 0.0]` |
| `channel_stddev` | Input channel stddev values | `[1.0, 1.0, 1.0]` |
| `padding_type` | Input padding type | `"CENTER"` |
| `aspect_ratio` | Maintain original aspect ratio during preprocessing | `true` |
| `topk` | Max number of detections returned per frame | `10` |
| `detection_threshold` | Confidence threshold for detections | `0.7` |
| `decode_type` | Decode method used during postprocessing | `"yolo"` |
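To make the preprocessing parameters more concrete, here is a minimal sketch of one common reading of `aspect_ratio: true` with `padding_type: "CENTER"` (letterboxing), followed by per-channel normalization. The 640x640 model input size, the division by 255, and both helper names are assumptions made for illustration; the actual preprocessing runs inside the SiMa pipeline and may differ in detail.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Scale to fit inside the target while keeping aspect ratio, then center-pad."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_left = (dst_w - new_w) // 2   # equal padding on both sides (CENTER)
    pad_top = (dst_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def normalize_pixel(rgb, mean=(0.0, 0.0, 0.0), stddev=(1.0, 1.0, 1.0)):
    """Scale 8-bit channels to [0, 1], then apply per-channel mean and stddev."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, stddev))


# A 1920x1080 frame fitted into an assumed 640x640 model input.
print(letterbox_geometry(1920, 1080, 640, 640))  # (640, 360, 0, 140)
print(normalize_pixel((255, 0, 51)))             # (1.0, 0.0, 0.2)
```

With a mean of 0.0 and a stddev of 1.0, as in the table, the normalization step in this sketch reduces to scaling pixel values into the range 0 to 1.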

Main Python Script

The Python script does the following:

- Loads `project.yaml` to read configuration.
- Initializes a `VideoReader` for RTSP input and a `VideoWriter` for UDP output.
- Sets up a YOLOv7 inference session using `MLSoCSession`.
- In a loop:
  - Captures a video frame.
  - Runs inference.
  - Renders bounding boxes and class labels.
  - Sends the annotated frame to the UDP endpoint.
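The control flow above can be sketched with stand-in objects. The function below accepts any reader, session, renderer, and writer that expose the hypothetical methods `read()`, `run()`, `render()`, and `write()`. These method names and the fake classes are placeholders chosen for illustration, not confirmed PePPi signatures; consult the PePPi reference for the real `VideoReader`, `VideoWriter`, and `MLSoCSession` calls.

```python
def run_pipeline(reader, session, renderer, writer, max_frames=None):
    """Capture, infer, annotate, and send frames until the source ends."""
    sent = 0
    while max_frames is None or sent < max_frames:
        ok, frame = reader.read()                       # capture a frame
        if not ok:
            break                                       # stream ended or dropped
        detections = session.run(frame)                 # run inference
        annotated = renderer.render(frame, detections)  # draw boxes and labels
        writer.write(annotated)                         # send to the UDP endpoint
        sent += 1
    return sent


# Minimal fakes so the sketch runs without hardware or a camera.
class FakeReader:
    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


class FakeSession:
    def run(self, frame):
        return [{"label": "person", "score": 0.9, "box": (0, 0, 10, 10)}]


class FakeRenderer:
    def render(self, frame, detections):
        return (frame, len(detections))


class FakeWriter:
    def __init__(self):
        self.sent = []

    def write(self, frame):
        self.sent.append(frame)


writer = FakeWriter()
count = run_pipeline(FakeReader(["f0", "f1", "f2"]), FakeSession(), FakeRenderer(), writer)
print(count, writer.sent)  # 3 [('f0', 1), ('f1', 1), ('f2', 1)]
```

Passing the four components in as arguments keeps the loop testable on a development machine; on the device, the real PePPi objects would take the place of the fakes.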

The application is packaged using `mpk create` and deployed to the target device using SiMa’s deployment tools.

Model Details

Download from here.
- Model: YOLOv7
- Normalize Input: Yes (mean: `[0.0, 0.0, 0.0]`, stddev: `[1.0, 1.0, 1.0]`)
- Detection Threshold: 0.7
- Output: Top 10 detections per frame
- Bounding Box Rendering: `SimaBoxRender`
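As a hedged illustration of how the detection threshold and the top-10 limit interact, the sketch below filters a list of `(label, score)` pairs. The tuple layout, the inclusive `>=` comparison, and the `filter_detections` helper are assumptions for this example; on the device, this filtering happens inside the pipeline's postprocessing.

```python
def filter_detections(detections, threshold=0.7, topk=10):
    """Keep detections scoring at least `threshold`, highest first, at most `topk`."""
    kept = [d for d in detections if d[1] >= threshold]
    kept.sort(key=lambda d: d[1], reverse=True)
    return kept[:topk]


raw = [("car", 0.95), ("dog", 0.40), ("person", 0.72), ("bike", 0.70)]
print(filter_detections(raw))
# [('car', 0.95), ('person', 0.72), ('bike', 0.7)]
```

Raising the threshold trades recall for fewer false positives, while `topk` only matters when more than ten detections clear the threshold in a single frame.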