YOLOv8 on RTSP Stream
=====================

This project demonstrates how to use the SiMa PePPi API to run real-time object detection using the YOLOv8 model on a live RTSP stream. The pipeline is optimized for edge inference on SiMa’s MLSoC, and streams annotated video frames via UDP.

Purpose
-------

This pipeline is designed to:

- Read video from an RTSP stream using ``rtspsrc``.
- Run detection using the YOLOv8 model on the SiMa MLSoC.
- Annotate frames with bounding boxes and class labels.
- Stream the output frames over UDP for real-time visualization.

This setup is ideal for evaluating high-performance object detection in edge AI deployments.

Configuration Overview
----------------------

The application is driven by ``project.yaml``. The parameters below describe its structure.

Input/Output Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

================ =============================== ================
Parameter        Description                     Example
================ =============================== ================
``source.name``  Input source type               ``"rtspsrc"``
``source.value`` RTSP stream URL                 ``""``
``udp_host``     Destination IP for UDP stream   ``""``
``port``         Destination port for UDP stream ``""``
``pipeline``     Processing pipeline name        ``"YoloV8"``
================ =============================== ================

Model Configuration (``Models[0]``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

======================= ====================================== ===================
Parameter               Description                            Value
======================= ====================================== ===================
``name``                Model identifier                       ``"YOLO"``
``targz``               Compressed model archive path          ``""``
``label_file``          Path to label file                     ``"labels.txt"``
``normalize``           Apply input normalization              ``true``
``channel_mean``        Input channel mean values              ``[0.0, 0.0, 0.0]``
``channel_stddev``      Input channel stddev values            ``[1.0, 1.0, 1.0]``
``padding_type``        Padding type during preprocessing      ``"CENTER"``
``aspect_ratio``        Maintain input aspect ratio            ``true``
``topk``                Max number of detections per frame     ``10``
``detection_threshold`` Score threshold for valid detections   ``0.7``
``nms_iou_threshold``   IOU threshold for non-max suppression  ``0.3``
``decode_type``         Detection decoding strategy            ``"yolo"``
``num_classes``         Number of classes the model can detect ``87``
======================= ====================================== ===================

Main Python Script
------------------

The Python script executes the following steps:

1. Loads ``project.yaml``.
2. Initializes a ``VideoReader`` for RTSP input and a ``VideoWriter`` for UDP output.
3. Sets up the YOLOv8 model using an ``MLSoCSession`` configured for the SiMa MLSoC.
4. In a loop:

   - Reads an input frame.
   - Optionally dumps it to disk for debugging (``/tmp/nv12.out``).
   - Runs the model.
   - Renders detection results onto the frame.
   - Streams the annotated frame via UDP.

Model Details
-------------

- Download from `here `__.
- Model: YOLOv8
- Input Format: NV12
- Normalization: Yes (mean = ``[0.0, 0.0, 0.0]``, stddev = ``[1.0, 1.0, 1.0]``)
- Thresholds:

  - Detection: 0.7
  - NMS IOU: 0.3

- Output: Up to 10 detections per frame
- Classes: 87 object categories
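
Example ``project.yaml``
------------------------

Taken together, the configuration tables above correspond to a ``project.yaml`` roughly like the sketch below. The nesting shown here is an assumption based on the parameter names, not the canonical SiMa schema, and the empty strings (stream URL, UDP host/port, model archive path) are placeholders that must be filled in for a real deployment:

.. code-block:: yaml

   source:
     name: "rtspsrc"
     value: ""                  # RTSP stream URL (fill in for your camera)
   udp_host: ""                 # destination IP for the annotated UDP stream
   port: ""                     # destination port for the UDP stream
   pipeline: "YoloV8"

   Models:
     - name: "YOLO"
       targz: ""                # path to the compressed model archive
       label_file: "labels.txt"
       normalize: true
       channel_mean: [0.0, 0.0, 0.0]
       channel_stddev: [1.0, 1.0, 1.0]
       padding_type: "CENTER"
       aspect_ratio: true
       topk: 10
       detection_threshold: 0.7
       nms_iou_threshold: 0.3
       decode_type: "yolo"
       num_classes: 87

With ``channel_mean`` of all zeros and ``channel_stddev`` of all ones, normalization is effectively an identity transform; adjust these values if your compiled model expects different preprocessing.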