Effdet on RTSP Stream
=====================

This project demonstrates how to use the SiMa PePPi API to build a Python application that performs accelerated object detection inference on a live RTSP video stream using the EfficientDet (Effdet) model.

Purpose
-------

The primary goal of this pipeline is to showcase how to:

- Read live video data from an RTSP source.
- Perform real-time object detection inference using the SiMa MLSoC and the EfficientDet model.
- Render bounding boxes annotated with class labels.
- Stream the output video over UDP for further visualization or processing.

All inference runs on SiMa’s MLSoC hardware, which provides the high throughput and low latency that edge AI applications need.

Configuration Overview
----------------------

The application is configured via ``project.yaml``. Below is a breakdown of its parameters.

Input/Output Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

================ ==================================== ====================
Parameter        Description                          Example
================ ==================================== ====================
``source.name``  Input type for the video stream      ``"rtspsrc"``
``source.value`` RTSP stream URL                      ``"<RTSP_URL>"``
``udp_host``     Host IP to stream the output via UDP ``"<HOST_IP>"``
``port``         UDP port number for output stream    ``"<PORT_NUM>"``
``pipeline``     Pipeline type used for inference     ``"EffDetPipeline"``
================ ==================================== ====================
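
Before deploying, it can help to confirm that every placeholder in the table above has been replaced with a real value. The following is a minimal sketch, not part of the PePPi API: it assumes the dotted names ``source.name`` and ``source.value`` correspond to a nested ``source`` mapping, and the helper names ``lookup`` and ``check_io_config`` are hypothetical.

```python
import re

# Keys come from the Input/Output table; values are the table's own placeholders.
# Assumption: "source.name" / "source.value" map to a nested "source" mapping.
io_config = {
    "source": {"name": "rtspsrc", "value": "<RTSP_URL>"},
    "udp_host": "<HOST_IP>",
    "port": "<PORT_NUM>",
    "pipeline": "EffDetPipeline",
}

REQUIRED_KEYS = ("source.name", "source.value", "udp_host", "port", "pipeline")
PLACEHOLDER = re.compile(r"^<[A-Za-z_]+>$")  # matches values such as "<HOST_IP>"


def lookup(config, dotted_key):
    """Follow a dotted key such as 'source.name' through nested dicts."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_io_config(config):
    """Return two lists of dotted keys: (missing, still set to a placeholder)."""
    missing, unfilled = [], []
    for key in REQUIRED_KEYS:
        value = lookup(config, key)
        if value is None:
            missing.append(key)
        elif isinstance(value, str) and PLACEHOLDER.match(value):
            unfilled.append(key)
    return missing, unfilled


missing, unfilled = check_io_config(io_config)
print("missing:", missing)
print("unfilled:", unfilled)
```

With the placeholder values shown, the check reports no missing keys and flags ``source.value``, ``udp_host``, and ``port`` as still unfilled.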

Model Configuration (``Models[0]``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

======================= ============================================================== =========================
Parameter               Description                                                    Example
======================= ============================================================== =========================
``name``                Name of the model                                              ``"Effdet"``
``targz``               Path to the compressed model archive                           ``"<targz_path>"``
``normalize``           Whether to normalize image input                               ``true``
``aspect_ratio``        Whether to maintain original aspect ratio during preprocessing ``true``
``channel_mean``        Per-channel mean for input normalization                       ``[0.485, 0.456, 0.406]``
``channel_stddev``      Per-channel stddev for input normalization                     ``[0.229, 0.224, 0.225]``
``decode_type``         Postprocessing decode type used with Effdet                    ``"effdet"``
``detection_threshold`` Minimum confidence score to retain a detection                 ``0.3``
``scaled_width``        Width to scale image before inference                          ``512``
``scaled_height``       Height to scale image before inference                         ``288``
``label_file``          Path to the label file with class names                        ``"labels.txt"``
``padding_type``        Type of padding used before inference                          ``"CENTER"``
``topk``                Maximum number of detections returned per frame                ``10``
``num_classes``         Number of object classes supported by the model                ``90``
======================= ============================================================== =========================
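
To illustrate how ``aspect_ratio``, ``scaled_width``, ``scaled_height``, and ``padding_type`` interact, the sketch below computes the geometry of an aspect-preserving resize with centered padding. It is an assumption about what these settings imply, not the MLSoC preprocessing code itself, and the function name ``letterbox_geometry`` is hypothetical.

```python
def letterbox_geometry(src_w, src_h, dst_w=512, dst_h=288):
    """Compute an aspect-preserving resize plus centered padding.

    Returns the resized image size and the (left, top, right, bottom)
    padding needed to fill a dst_w x dst_h canvas.
    """
    # Use the smaller scale factor so the whole image fits in the canvas.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)

    # Split the leftover space evenly; any odd pixel goes to the right/bottom.
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return {
        "scale": scale,
        "size": (new_w, new_h),
        "pad": (left, top, pad_x - left, pad_y - top),
    }


# A 16:9 frame fills the 512x288 canvas without padding.
print(letterbox_geometry(1920, 1080))
# A 4:3 frame is scaled to 384x288 and padded 64 pixels on the left and right.
print(letterbox_geometry(1280, 960))
```

Under this interpretation, detections on a padded frame would need the padding offset subtracted and the scale inverted to map boxes back to source coordinates.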

Main Python Script
------------------

The Python script performs the following steps:

1. Loads configuration from ``project.yaml``.
2. Initializes a video reader and writer using the PePPi API.
3. Loads the Effdet model via ``MLSoCSession`` and configures it.
4. Continuously reads frames, performs inference, renders detection results, and streams annotated video via UDP.
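
The control flow of step 4 can be sketched as a generic read, infer, render, write loop. The four callables below are hypothetical stand-ins so the example runs anywhere; in the actual script their roles are played by the PePPi video reader, ``MLSoCSession``, ``SimaBoxRender``, and the video writer, whose exact signatures are not reproduced here.

```python
def run_pipeline(read_frame, infer, render, write, max_frames=None):
    """Read -> infer -> render -> write until the source stops delivering frames."""
    processed = 0
    while max_frames is None or processed < max_frames:
        ok, frame = read_frame()
        if not ok:  # stream ended or a read failed
            break
        detections = infer(frame)
        write(render(frame, detections))
        processed += 1
    return processed


# Stand-ins (not PePPi calls): strings take the place of image frames.
frames = iter(["frame-0", "frame-1", "frame-2"])


def fake_read():
    frame = next(frames, None)
    return frame is not None, frame


def fake_infer(frame):
    return [{"label": "person", "score": 0.9}]


def fake_render(frame, detections):
    return f"{frame} + {len(detections)} box(es)"


sent = []  # collects what would be streamed over UDP
count = run_pipeline(fake_read, fake_infer, fake_render, sent.append)
print(count, sent)
```

Passing the stages in as callables keeps the loop independent of any particular device API, which also makes it easy to exercise off-target.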

The application is packaged using ``mpk create`` and deployed to the target device through SiMa’s deployment workflow.

Model Details
-------------

- Download: `effdet.tar.gz <https://docs.sima.ai/pkg_downloads/SDK1.6.0/appzoo/peppi/effdet.tar.gz>`__
- Model: EfficientDet (Effdet)
- Input Normalization:

  - Mean: ``[0.485, 0.456, 0.406]``
  - Stddev: ``[0.229, 0.224, 0.225]``

- Detection Threshold: ``0.3``
- Max Output Per Frame: Top ``10`` detections
- Input Resolution: 512×288 (width × height, after scaling)
- Bounding Boxes: Rendered using ``SimaBoxRender``
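
The combined effect of the detection threshold and the top-10 limit can be sketched as a simple filter. This is an illustration only: the detection format (a dict with ``label`` and ``score``), the function name ``filter_detections``, the ordering of the two steps, and the use of ``>=`` are all assumptions, since the decoder's internal behavior is not described here.

```python
def filter_detections(detections, threshold=0.3, topk=10):
    """Keep detections scoring at least `threshold`, best first, at most `topk`."""
    kept = [d for d in detections if d["score"] >= threshold]
    kept = sorted(kept, key=lambda d: d["score"], reverse=True)
    return kept[:topk]


# Hypothetical detections for a single frame.
sample = [
    {"label": "person", "score": 0.92},
    {"label": "dog", "score": 0.25},
    {"label": "car", "score": 0.61},
]

kept = filter_detections(sample)
print([d["label"] for d in kept])
```

Here the ``dog`` detection is dropped because its score falls below ``0.3``, and the remaining two are returned in descending score order.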