YOLOv8 on RTSP Stream
=====================

This project demonstrates how to use the SiMa PePPi API to run real-time object detection using the YOLOv8 model on a live RTSP stream. The pipeline is optimized for edge inference on SiMa’s MLSoC, and streams annotated video frames via UDP.

Purpose
-------

This pipeline is designed to:

- Read video from an RTSP stream using ``rtspsrc``.
- Run detection using the YOLOv8 model on the SiMa MLSoC.
- Annotate frames with bounding boxes and class labels.
- Stream the output frames over UDP for real-time visualization.

This setup is ideal for evaluating high-performance object detection in edge AI deployments.

Configuration Overview
----------------------

The application is driven by ``project.yaml``. The parameters below describe its structure.

Input/Output Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

================ =============================== ================
Parameter        Description                     Example
================ =============================== ================
``source.name``  Input source type               ``"rtspsrc"``
``source.value`` RTSP stream URL                 ``""``
``udp_host``     Destination IP for UDP stream   ``""``
``port``         Destination port for UDP stream ``""``
``pipeline``     Processing pipeline name        ``"YoloV8"``
================ =============================== ================

Model Configuration (``Models[0]``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

======================= ====================================== ===================
Parameter               Description                            Value
======================= ====================================== ===================
``name``                Model identifier                       ``"YOLO"``
``targz``               Compressed model archive path          ``""``
``label_file``          Path to label file                     ``"labels.txt"``
``normalize``           Apply input normalization              ``true``
``channel_mean``        Input channel mean values              ``[0.0, 0.0, 0.0]``
``channel_stddev``      Input channel stddev values            ``[1.0, 1.0, 1.0]``
``padding_type``        Padding type during preprocessing      ``"CENTER"``
``aspect_ratio``        Maintain input aspect ratio            ``true``
``topk``                Max number of detections per frame     ``10``
``detection_threshold`` Score threshold for valid detections   ``0.7``
``nms_iou_threshold``   IOU threshold for non-max suppression  ``0.3``
``decode_type``         Detection decoding strategy            ``"yolo"``
``num_classes``         Number of classes the model can detect ``87``
======================= ====================================== ===================

Main Python Script
------------------

The Python script executes the following steps:

1. Loads ``project.yaml``.
2. Initializes a ``VideoReader`` for RTSP input and a ``VideoWriter`` for UDP output.
3. Sets up the YOLOv8 model using an ``MLSoCSession`` configured for the SiMa MLSoC.
4. In a loop:

   - Reads an input frame.
   - Optionally dumps it to disk for debugging (``/tmp/nv12.out``).
   - Runs the model.
   - Renders detection results onto the frame.
   - Streams the annotated frame via UDP.

Model Details
-------------

- Download from `here `__.
- Model: YOLOv8
- Input Format: NV12
- Normalization: Yes (mean = ``[0.0, 0.0, 0.0]``, stddev = ``[1.0, 1.0, 1.0]``)
- Thresholds:

  - Detection: 0.7
  - NMS IOU: 0.3

- Output: Up to 10 detections per frame
- Classes: 87 object categories
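
Example ``project.yaml``
------------------------

Taken together, the configuration tables above correspond to a ``project.yaml`` roughly like the sketch below. The nesting shown here is an assumption based on the parameter names, not the canonical SiMa schema, and the empty strings (stream URL, UDP host/port, model archive path) are placeholders that must be filled in for a real deployment:

.. code-block:: yaml

   source:
     name: "rtspsrc"
     value: ""                  # RTSP stream URL (fill in for your camera)
   udp_host: ""                 # destination IP for the annotated UDP stream
   port: ""                     # destination port for the UDP stream
   pipeline: "YoloV8"

   Models:
     - name: "YOLO"
       targz: ""                # path to the compressed model archive
       label_file: "labels.txt"
       normalize: true
       channel_mean: [0.0, 0.0, 0.0]
       channel_stddev: [1.0, 1.0, 1.0]
       padding_type: "CENTER"
       aspect_ratio: true
       topk: 10
       detection_threshold: 0.7
       nms_iou_threshold: 0.3
       decode_type: "yolo"
       num_classes: 87

With ``channel_mean`` of all zeros and ``channel_stddev`` of all ones, normalization is effectively an identity transform; adjust these values if your compiled model expects different preprocessing.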