PeopleDetector on RTSP Stream
=============================

This project uses the SiMa PePPi API to run a real-time people detection pipeline on a live RTSP video stream. It leverages a CenterNet-based model optimized for detecting human figures and streams the annotated results via UDP.

Purpose
-------

This pipeline showcases how to:

- Ingest live RTSP video using SiMa’s PePPi API.
- Run people detection using a CenterNet-based model on SiMa’s MLSoC.
- Annotate frames with bounding boxes and class labels.
- Stream the output to a specified host and port via UDP.

All inference is hardware-accelerated through SiMa’s MLSoC for efficient edge deployment.

Configuration Overview
----------------------

Settings are defined in ``project.yaml``. The following tables outline the input/output configuration and model-specific parameters.

Input/Output Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

================ ============================= ====================
Parameter        Description                   Example
================ ============================= ====================
``source.name``  Type of input source          ``"rtspsrc"``
``source.value`` RTSP video stream URL         ``""``
``udp_host``     Host IP for UDP output        ``""``
``port``         Port number for UDP stream    ``""``
``pipeline``     Inference pipeline to be used ``"PeopleDetector"``
================ ============================= ====================

Model Configuration (``Models[0]``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

======================= ================================================== =========================
Parameter               Description                                        Value
======================= ================================================== =========================
``name``                Model identifier                                   ``"pd"``
``targz``               Compressed model archive path                      ``""``
``label_file``          Path to class label file                           ``"labels.txt"``
``normalize``           Enable input normalization                         ``true``
``channel_mean``        Per-channel mean for input normalization           ``[0.408, 0.447, 0.470]``
``channel_stddev``      Per-channel stddev for input normalization         ``[0.289, 0.274, 0.278]``
``padding_type``        Padding strategy for input preprocessing           ``"BOTTOM_LEFT"``
``aspect_ratio``        Whether to maintain original image aspect ratio    ``true``
``topk``                Maximum number of detections returned per frame    ``10``
``detection_threshold`` Minimum confidence to qualify as a valid detection ``0.7``
``decode_type``         Postprocessing decode method used                  ``"centernet"``
======================= ================================================== =========================

Main Python Script
------------------

The script performs the following operations:

1. Loads configuration from ``project.yaml``.
2. Initializes a ``VideoReader`` for the RTSP stream and a ``VideoWriter`` for UDP output.
3. Loads the detection model with a SiMa ``MLSoCSession``.
4. Continuously:

   - Reads a frame
   - Runs inference
   - Annotates detected people
   - Streams the annotated frame over UDP

The application is packaged using ``mpk create`` and deployed to the target device using SiMa’s standard flow.

Model Details
-------------

- Download from `here `__.
- Model Type: CenterNet-based
- Target: People detection
- Normalization:

  - Mean: ``[0.408, 0.447, 0.470]``
  - Stddev: ``[0.289, 0.274, 0.278]``

- Detection Confidence Threshold: 0.7
- Output: Top 10 people detections per frame
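Putting the two configuration tables together, a ``project.yaml`` might look like the sketch below. The field nesting is an assumption inferred from the parameter names above, and the quoted placeholder values are hypothetical; consult the PePPi documentation for the exact schema.

.. code-block:: yaml

   # Hypothetical project.yaml sketch -- nesting inferred from the tables
   # above; placeholder values are illustrative, not real endpoints.
   source:
     name: "rtspsrc"
     value: "rtsp://<camera-ip>:554/stream"   # RTSP stream URL (placeholder)
   udp_host: "<receiver-ip>"                  # host for UDP output (placeholder)
   port: "<port>"                             # UDP port (placeholder)
   pipeline: "PeopleDetector"

   Models:
     - name: "pd"
       targz: ""                              # path to compressed model archive
       label_file: "labels.txt"
       normalize: true
       channel_mean: [0.408, 0.447, 0.470]
       channel_stddev: [0.289, 0.274, 0.278]
       padding_type: "BOTTOM_LEFT"
       aspect_ratio: true
       topk: 10
       detection_threshold: 0.7
       decode_type: "centernet"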
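The ``channel_mean`` and ``channel_stddev`` values are applied as a standard per-channel normalization, ``(x - mean) / stddev``. A minimal sketch of that arithmetic, assuming pixel values have already been scaled to [0, 1] (whether the PePPi preprocessor performs that scaling itself is an assumption, not stated in this document):

.. code-block:: python

   # Per-channel normalization sketch: (x - mean) / stddev.
   # Assumes RGB values already scaled to [0, 1]; whether PePPi divides
   # by 255 first is an assumption, not documented here.
   CHANNEL_MEAN = (0.408, 0.447, 0.470)
   CHANNEL_STDDEV = (0.289, 0.274, 0.278)

   def normalize_pixel(rgb):
       """Normalize one (r, g, b) pixel with components in [0, 1]."""
       return tuple(
           (value - mean) / std
           for value, mean, std in zip(rgb, CHANNEL_MEAN, CHANNEL_STDDEV)
       )

For example, a mid-gray pixel ``(0.5, 0.5, 0.5)`` normalizes to roughly ``(0.318, 0.193, 0.108)``.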
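The ``detection_threshold`` and ``topk`` parameters combine into a simple post-filter: discard detections below the confidence threshold, then keep at most the top-k by score. The real filtering happens inside the PePPi pipeline; this helper is only an illustration of the equivalent logic on ``(score, box)`` pairs:

.. code-block:: python

   # Illustrative post-filter matching detection_threshold=0.7 and topk=10.
   # The actual filtering is done by the PePPi pipeline itself.
   def filter_detections(detections, threshold=0.7, topk=10):
       """Keep detections scoring >= threshold, highest scores first."""
       kept = [d for d in detections if d[0] >= threshold]
       kept.sort(key=lambda d: d[0], reverse=True)
       return kept[:topk]

With ``detections = [(0.65, box_a), (0.95, box_b), (0.80, box_c)]``, the 0.65 hit is dropped and the survivors come back ordered 0.95, 0.80.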
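The steps of the main script can be summarized as the following Python-style pseudocode. ``VideoReader``, ``VideoWriter``, and ``MLSoCSession`` are named in this document, but every method name and argument below is an assumption; consult the PePPi API reference for the real signatures.

.. code-block:: python

   # Pseudocode sketch of the main loop -- method names are assumptions,
   # not the actual PePPi API.
   config = load_yaml("project.yaml")
   reader = VideoReader(config["source"])                    # RTSP input
   writer = VideoWriter(config["udp_host"], config["port"])  # UDP output
   session = MLSoCSession(config["Models"][0])               # loads the "pd" model

   while True:
       frame = reader.read()           # 1. read a frame from the stream
       people = session.run(frame)     # 2. hardware-accelerated inference
       annotated = draw_boxes(frame, people)  # 3. boxes + class labels
       writer.write(annotated)         # 4. stream the result over UDP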