YOLOv7 on RTSP Stream
=====================

This project demonstrates the use of SiMa’s PePPi API to run real-time object
detection on a live RTSP video stream using the YOLOv7 model. The application
uses SiMa’s MLSoC hardware for accelerated inference and streams the annotated
output via UDP.

Purpose
-------

This pipeline is designed to:

- Capture live video from an RTSP source.
- Perform object detection using the YOLOv7 model.
- Annotate frames with bounding boxes and labels.
- Stream the results to a specified host via UDP.

This setup is ideal for edge inference applications requiring high-speed,
low-latency visual processing.

Configuration Overview
----------------------

The runtime configuration is managed through ``project.yaml``. The following
tables explain the input/output and model configuration.

Input/Output Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

================ ================================== ====================
Parameter        Description                        Example
================ ================================== ====================
``source.name``  Type of input source               ``"rtspsrc"``
``source.value`` RTSP video stream URL              ``""``
``udp_host``     Destination host IP for UDP output ``""``
``port``         UDP port number                    ``""``
``pipeline``     Pipeline name for inference        ``"yoloV7Pipeline"``
================ ================================== ====================

Model Configuration (``Models[0]``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

======================= =================================================== ===================
Parameter               Description                                         Value
======================= =================================================== ===================
``name``                Model identifier                                    ``"yolov7"``
``targz``               Path to the YOLOv7 model archive                    ``""``
``label_file``          Class label file path                               ``"labels.txt"``
``normalize``           Whether to apply normalization                      ``true``
``channel_mean``        Input channel mean values                           ``[0.0, 0.0, 0.0]``
``channel_stddev``      Input channel stddev values                         ``[1.0, 1.0, 1.0]``
``padding_type``        Input padding type                                  ``"CENTER"``
``aspect_ratio``        Maintain original aspect ratio during preprocessing ``true``
``topk``                Max number of detections returned per frame         ``10``
``detection_threshold`` Confidence threshold for detections                 ``0.7``
``decode_type``         Decode method used during postprocessing            ``"yolo"``
======================= =================================================== ===================

Main Python Script
------------------

The Python script does the following:

1. Loads ``project.yaml`` to read the configuration.
2. Initializes a ``VideoReader`` for RTSP input and a ``VideoWriter`` for UDP output.
3. Sets up a YOLOv7 inference session using ``MLSoCSession``.
4. In a loop:

   - Captures a video frame.
   - Runs inference.
   - Renders bounding boxes and class labels.
   - Sends the annotated frame to the UDP endpoint.

The application is packaged using ``mpk create`` and deployed to the target
device using SiMa’s deployment tools.

Model Details
-------------

- Download from `here `__.
- Model: YOLOv7
- Normalize Input: Yes (mean: ``[0.0, 0.0, 0.0]``, stddev: ``[1.0, 1.0, 1.0]``)
- Detection Threshold: 0.7
- Output: Top 10 detections per frame
- Bounding Box Rendering: ``SimaBoxRender``
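The configuration tables above can be combined into a ``project.yaml`` along these lines. Parameter names and values come from the tables; the exact nesting of the schema is an assumption, and values shown as ``""`` are intentionally left for you to fill in::

    # Illustrative project.yaml sketch; the exact schema may differ.
    source:
      name: "rtspsrc"
      value: ""            # RTSP stream URL, e.g. your camera's endpoint
    udp_host: ""           # destination host IP for the annotated UDP stream
    port: ""               # UDP port number
    pipeline: "yoloV7Pipeline"

    Models:
      - name: "yolov7"
        targz: ""          # path to the YOLOv7 model archive
        label_file: "labels.txt"
        normalize: true
        channel_mean: [0.0, 0.0, 0.0]
        channel_stddev: [1.0, 1.0, 1.0]
        padding_type: "CENTER"
        aspect_ratio: true
        topk: 10
        detection_threshold: 0.7
        decode_type: "yolo"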
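The preprocessing parameters (``normalize``, ``channel_mean``, ``channel_stddev``, ``padding_type: "CENTER"``, ``aspect_ratio``) imply the arithmetic sketched below. The PePPi runtime performs these steps internally; the helper names here are illustrative, not part of the API:

```python
def normalize_pixel(rgb, mean=(0.0, 0.0, 0.0), stddev=(1.0, 1.0, 1.0)):
    """Per-channel (x - mean) / stddev normalization.

    With the defaults from project.yaml (mean 0, stddev 1),
    this is the identity transform.
    """
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, stddev))


def center_offsets(src_w, src_h, dst_w, dst_h):
    """Scale and top-left padding for CENTER (letterbox) placement.

    Preserves the source aspect ratio inside a dst_w x dst_h canvas,
    matching padding_type "CENTER" with aspect_ratio true.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = int(src_w * scale), int(src_h * scale)
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return scale, pad_x, pad_y
```

For example, fitting a 1920x1080 frame into a 640x640 model input scales by 1/3 and pads 140 pixels on the top and bottom.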
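The ``detection_threshold`` and ``topk`` parameters describe a standard post-filtering step. A minimal sketch, with detections modeled as ``(score, label, box)`` tuples (the real pipeline's ``"yolo"`` decode step produces these from raw model output):

```python
def filter_detections(dets, threshold=0.7, topk=10):
    """Keep detections at or above the confidence threshold,
    highest scores first, capped at topk per frame."""
    kept = [d for d in dets if d[0] >= threshold]
    kept.sort(key=lambda d: d[0], reverse=True)
    return kept[:topk]
```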
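The capture/infer/render/stream loop described in the "Main Python Script" section can be sketched structurally as follows. ``VideoReader``, ``VideoWriter``, and ``MLSoCSession`` are the PePPi class names used by this project, but their exact signatures are not documented here; the stub classes below only mimic the control flow so the loop logic is self-contained and runnable:

```python
class StubReader:
    """Stand-in for PePPi's VideoReader: yields a fixed number of frames."""
    def __init__(self, n):
        self.n = n

    def read(self):
        if self.n == 0:
            return None  # end of stream
        self.n -= 1
        return "frame"


class StubSession:
    """Stand-in for MLSoCSession: returns dummy detections per frame."""
    def run(self, frame):
        return [(0.9, "person"), (0.8, "car")]


class StubWriter:
    """Stand-in for VideoWriter: records what would be streamed over UDP."""
    def __init__(self):
        self.sent = []

    def write(self, frame):
        self.sent.append(frame)


def run_pipeline(reader, session, writer, render):
    """Capture -> infer -> render -> stream, until the reader is exhausted."""
    while True:
        frame = reader.read()
        if frame is None:
            break
        detections = session.run(frame)
        writer.write(render(frame, detections))


writer = StubWriter()
# render stands in for SimaBoxRender-style bounding-box annotation.
run_pipeline(StubReader(3), StubSession(), writer,
             render=lambda frame, dets: (frame, len(dets)))
```

On the device, the stubs would be replaced by the real PePPi objects configured from ``project.yaml``.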