YOLOv7 on RTSP Stream
=====================

This project demonstrates the use of SiMa’s PePPi API to run real-time object
detection on a live RTSP video stream using the YOLOv7 model. The application
uses SiMa’s MLSoC hardware for accelerated inference and streams the annotated
output via UDP.

Purpose
-------

This pipeline is designed to:

- Capture live video from an RTSP source.
- Perform object detection using the YOLOv7 model.
- Annotate frames with bounding boxes and labels.
- Stream the results to a specified host via UDP.

This setup is ideal for edge inference applications requiring high-speed,
low-latency visual processing.

Configuration Overview
----------------------

The runtime configuration is managed through ``project.yaml``. The following
tables explain the input/output and model configuration.

Input/Output Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

================ ================================== ====================
Parameter        Description                        Example
================ ================================== ====================
``source.name``  Type of input source               ``"rtspsrc"``
``source.value`` RTSP video stream URL              ``""``
``udp_host``     Destination host IP for UDP output ``""``
``port``         UDP port number                    ``""``
``pipeline``     Pipeline name for inference        ``"yoloV7Pipeline"``
================ ================================== ====================

Model Configuration (``Models[0]``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

======================= =================================================== ===================
Parameter               Description                                         Value
======================= =================================================== ===================
``name``                Model identifier                                    ``"yolov7"``
``targz``               Path to the YOLOv7 model archive                    ``""``
``label_file``          Class label file path                               ``"labels.txt"``
``normalize``           Whether to apply normalization                      ``true``
``channel_mean``        Input channel mean values                           ``[0.0, 0.0, 0.0]``
``channel_stddev``      Input channel stddev values                         ``[1.0, 1.0, 1.0]``
``padding_type``        Input padding type                                  ``"CENTER"``
``aspect_ratio``        Maintain original aspect ratio during preprocessing ``true``
``topk``                Max number of detections returned per frame         ``10``
``detection_threshold`` Confidence threshold for detections                 ``0.7``
``decode_type``         Decode method used during postprocessing            ``"yolo"``
======================= =================================================== ===================

Main Python Script
------------------

The Python script does the following:

1. Loads ``project.yaml`` to read the configuration.
2. Initializes a ``VideoReader`` for RTSP input and a ``VideoWriter`` for UDP output.
3. Sets up a YOLOv7 inference session using ``MLSoCSession``.
4. In a loop:

   - Captures a video frame.
   - Runs inference.
   - Renders bounding boxes and class labels.
   - Sends the annotated frame to the UDP endpoint.

The application is packaged using ``mpk create`` and deployed to the target
device using SiMa’s deployment tools.

Model Details
-------------

- Download from `here `__.
- Model: YOLOv7
- Normalize Input: Yes (mean: ``[0.0, 0.0, 0.0]``, stddev: ``[1.0, 1.0, 1.0]``)
- Detection Threshold: 0.7
- Output: Top 10 detections per frame
- Bounding Box Rendering: ``SimaBoxRender``
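The configuration tables above can be combined into a ``project.yaml`` along these lines. Parameter names and values come from the tables; the exact nesting of the schema is an assumption, and values shown as ``""`` are intentionally left for you to fill in::

    # Illustrative project.yaml sketch; the exact schema may differ.
    source:
      name: "rtspsrc"
      value: ""            # RTSP stream URL, e.g. your camera's endpoint
    udp_host: ""           # destination host IP for the annotated UDP stream
    port: ""               # UDP port number
    pipeline: "yoloV7Pipeline"

    Models:
      - name: "yolov7"
        targz: ""          # path to the YOLOv7 model archive
        label_file: "labels.txt"
        normalize: true
        channel_mean: [0.0, 0.0, 0.0]
        channel_stddev: [1.0, 1.0, 1.0]
        padding_type: "CENTER"
        aspect_ratio: true
        topk: 10
        detection_threshold: 0.7
        decode_type: "yolo"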
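The preprocessing parameters (``normalize``, ``channel_mean``, ``channel_stddev``, ``padding_type: "CENTER"``, ``aspect_ratio``) imply the arithmetic sketched below. The PePPi runtime performs these steps internally; the helper names here are illustrative, not part of the API:

```python
def normalize_pixel(rgb, mean=(0.0, 0.0, 0.0), stddev=(1.0, 1.0, 1.0)):
    """Per-channel (x - mean) / stddev normalization.

    With the defaults from project.yaml (mean 0, stddev 1),
    this is the identity transform.
    """
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, stddev))


def center_offsets(src_w, src_h, dst_w, dst_h):
    """Scale and top-left padding for CENTER (letterbox) placement.

    Preserves the source aspect ratio inside a dst_w x dst_h canvas,
    matching padding_type "CENTER" with aspect_ratio true.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = int(src_w * scale), int(src_h * scale)
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return scale, pad_x, pad_y
```

For example, fitting a 1920x1080 frame into a 640x640 model input scales by 1/3 and pads 140 pixels on the top and bottom.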
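The ``detection_threshold`` and ``topk`` parameters describe a standard post-filtering step. A minimal sketch, with detections modeled as ``(score, label, box)`` tuples (the real pipeline's ``"yolo"`` decode step produces these from raw model output):

```python
def filter_detections(dets, threshold=0.7, topk=10):
    """Keep detections at or above the confidence threshold,
    highest scores first, capped at topk per frame."""
    kept = [d for d in dets if d[0] >= threshold]
    kept.sort(key=lambda d: d[0], reverse=True)
    return kept[:topk]
```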
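The capture/infer/render/stream loop described in the "Main Python Script" section can be sketched structurally as follows. ``VideoReader``, ``VideoWriter``, and ``MLSoCSession`` are the PePPi class names used by this project, but their exact signatures are not documented here; the stub classes below only mimic the control flow so the loop logic is self-contained and runnable:

```python
class StubReader:
    """Stand-in for PePPi's VideoReader: yields a fixed number of frames."""
    def __init__(self, n):
        self.n = n

    def read(self):
        if self.n == 0:
            return None  # end of stream
        self.n -= 1
        return "frame"


class StubSession:
    """Stand-in for MLSoCSession: returns dummy detections per frame."""
    def run(self, frame):
        return [(0.9, "person"), (0.8, "car")]


class StubWriter:
    """Stand-in for VideoWriter: records what would be streamed over UDP."""
    def __init__(self):
        self.sent = []

    def write(self, frame):
        self.sent.append(frame)


def run_pipeline(reader, session, writer, render):
    """Capture -> infer -> render -> stream, until the reader is exhausted."""
    while True:
        frame = reader.read()
        if frame is None:
            break
        detections = session.run(frame)
        writer.write(render(frame, detections))


writer = StubWriter()
# render stands in for SimaBoxRender-style bounding-box annotation.
run_pipeline(StubReader(3), StubSession(), writer,
             render=lambda frame, dets: (frame, len(dets)))
```

On the device, the stubs would be replaced by the real PePPi objects configured from ``project.yaml``.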