Multimodel Demo

This project demonstrates how to use the SiMa PePPi API to create a multi-model pipeline combining segmentation, detection, and anomaly analysis. It uses a combination of YOLO, Teacher-Student distillation, and AutoEncoder networks to identify and evaluate regions of interest from a local folder of input samples.

Purpose

The pipeline performs the following:

  • Loads video frames from a local directory using filesrc.

  • Uses a YOLO model to perform segmentation and detect objects of interest.

  • Processes the detected regions using Teacher, Student, and AutoEncoder models.

  • Computes anomaly maps as the mean squared error between the Teacher and Student outputs and between the AutoEncoder and Student outputs.

  • Combines these maps to generate predictions.

  • Streams annotated results over UDP for visualization.
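The anomaly-map step can be sketched in plain NumPy. The function names and the normalize-then-average combination scheme below are illustrative assumptions, not the actual PePPi API:

```python
import numpy as np

def anomaly_map(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Per-pixel mean squared error across the channel axis."""
    return np.mean((a - b) ** 2, axis=-1)

def combined_anomaly_map(teacher: np.ndarray,
                         student: np.ndarray,
                         autoencoder: np.ndarray) -> np.ndarray:
    """Combine two anomaly maps into one score map in [0, 1]."""
    # Two maps: Teacher vs. Student and AutoEncoder vs. Student.
    m_ts = anomaly_map(teacher, student)
    m_ae = anomaly_map(autoencoder, student)

    def norm(m: np.ndarray) -> np.ndarray:
        # Min-max normalize to [0, 1]; flat maps become all zeros.
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    return 0.5 * (norm(m_ts) + norm(m_ae))
```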

Configuration Overview

All runtime settings are managed through project.yaml. Below are the parameters grouped by function.

Input/Output Configuration

| Parameter | Description | Example |
| --- | --- | --- |
| source.name | Input type; here, local image files | "filesrc" |
| source.value | Folder path containing input samples | "<Folder containing input samples>" |
| udp_host | Host IP to stream annotated output to | "<HOST_IP>" |
| port | UDP port for the output stream | "<PORT_NUM>" |
| pipeline | Inference pipeline to use | "AutoEncoderPipeline" |
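A matching project.yaml fragment might look like the following. The key names come from the table above; the nesting of name and value under source is an assumption based on the source.name / source.value notation:

```yaml
source:
  name: "filesrc"
  value: "<Folder containing input samples>"
udp_host: "<HOST_IP>"
port: "<PORT_NUM>"
pipeline: "AutoEncoderPipeline"
```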

Model Configuration (Models[])

YOLO Model (Index 0)

| Parameter | Description | Value |
| --- | --- | --- |
| name | Model name | "YOLO" |
| targz | Path to the model archive | "yolo_seg_cls.tar.gz" |
| normalize | Whether to normalize input | true |
| channel_mean | Per-channel input mean values | [0.0, 0.0, 0.0] |
| channel_stddev | Per-channel input stddev values | [1.0, 1.0, 1.0] |
| padding_type | Padding method | "CENTER" |
| aspect_ratio | Whether to maintain the aspect ratio | false |
| topk | Maximum number of detections | 10 |
| detection_threshold | Detection score threshold | 0.7 |
| label_file | Label map file path | "labels.txt" |
| input_img_type | Image color format | "BGR" |
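Put together, the YOLO entry in the Models list would look roughly like this (the YAML structure is assumed from the parameter names above, not taken from the project):

```yaml
Models:
  - name: "YOLO"
    targz: "yolo_seg_cls.tar.gz"
    normalize: true
    channel_mean: [0.0, 0.0, 0.0]
    channel_stddev: [1.0, 1.0, 1.0]
    padding_type: "CENTER"
    aspect_ratio: false
    topk: 10
    detection_threshold: 0.7
    label_file: "labels.txt"
    input_img_type: "BGR"
```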

Teacher / Student / AutoEncoder Models (Index 1–3)

| Parameter | Description | Value (same across all) |
| --- | --- | --- |
| name | Model name | "Teacher", "Student", "AutoEncoder" |
| targz | Model archive | teacher_int8_mpk.tar.gz, etc. |
| normalize | Enable input normalization | true |
| channel_mean | Per-channel input mean | [0.407, 0.446, 0.469] |
| channel_stddev | Per-channel input stddev | [0.289, 0.273, 0.277] |
| input_img_type | Input image color format | "BGR" |
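With these values, normalization amounts to the standard per-channel standardization transform. The NumPy sketch below illustrates the math only; whether PePPi scales pixels by 255 before applying the mean and stddev is an assumption:

```python
import numpy as np

# Per-channel statistics from the Teacher / Student / AutoEncoder config.
CHANNEL_MEAN = np.array([0.407, 0.446, 0.469])
CHANNEL_STDDEV = np.array([0.289, 0.273, 0.277])

def normalize_bgr(frame: np.ndarray) -> np.ndarray:
    """Scale a uint8 BGR frame to [0, 1], then standardize per channel."""
    scaled = frame.astype(np.float32) / 255.0
    return (scaled - CHANNEL_MEAN) / CHANNEL_STDDEV
```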

Main Python Script

The script performs the following:

  1. Initializes four SiMa MLSoC sessions for the YOLO, Teacher, Student, and AutoEncoder models.

  2. Loads and loops through input frames using filesrc.

  3. Runs the YOLO segmentation model to isolate regions of interest.

  4. Passes the segmented region through the Teacher, Student, and AutoEncoder networks.

  5. Computes two anomaly maps (Teacher vs. Student, AutoEncoder vs. Student).

  6. Normalizes and combines these maps to highlight regions of potential anomaly.

  7. Post-processes the combined map and generates a final prediction.

  8. Converts and streams the result via UDP using VideoWriter.
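Step 8 typically maps to an OpenCV VideoWriter backed by a GStreamer pipeline. The helper below only builds the launch string; the exact element chain (H.264 over RTP via udpsink) is an assumption, not taken from the project:

```python
def udp_sink_pipeline(host: str, port: int) -> str:
    """Build a GStreamer launch string that encodes frames as H.264
    and streams them over RTP/UDP to the given host and port."""
    return (
        "appsrc ! videoconvert ! x264enc tune=zerolatency "
        "! rtph264pay ! udpsink "
        f"host={host} port={port}"
    )

# Usage with a GStreamer-enabled OpenCV build (shown for illustration only):
# writer = cv2.VideoWriter(udp_sink_pipeline("<HOST_IP>", port),
#                          cv2.CAP_GSTREAMER, 0, 30.0, (width, height))
```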

Model Details

  • Download bundled models from here.

  • Extract the archive by running tar -xvf STAnomalyDet.tar.gz.