simaaiboxdecode

The simaaiboxdecode GStreamer plugin processes tensors into valid object detections. It detessellates and dequantizes tensors on the fly, identifies cells of interest, classifies objects, converts box regressions to model-frame coordinates, applies NMS and configurable top-K filtering, and rescales boxes to the original frame size.

Properties

The plugin supports the following properties:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| config | string | "/mnt/host/simaaiboxdecode.json" | Path to the JSON configuration file for the simaaiboxdecode. |
| emit-signals | boolean | FALSE | Send signals. |
| latency | int64 | 0 | Additional latency in live mode to allow upstream to take longer to produce buffers for the current position (in nanoseconds). |
| min-upstream-latency | int64 | 0 | Overrides initial minimum latency if dynamically added sources have higher latency (in nanoseconds). |
| name | string | "simaaisimaaiboxdecode0" | The name of the object. |
| parent | GstObject | | The parent of the object. |
| silent | boolean | TRUE | Produce verbose output. |
| sima-allocator-type | int | 0 | 1 - no segment API, 2 - segment API support. |
| start-time | int64 | 18446744073709551615 | Start time to use if start-time-selection=set. |
| start-time-selection | Enum | 0, "zero" | Decides which start time is output. |
| transmit | boolean | FALSE | Transmit KPI Messages. |

Usage

The plugin can be integrated into a GStreamer pipeline as follows:

gst-launch-1.0 simaaisrc location=input.bin ! simaaiboxdecode config=/path/to/config.json ! fakesink

Replace /path/to/config.json with the actual path to your JSON configuration file, and input.bin with the path to the input file containing the encoded data. The fakesink element can be replaced with any element suitable for processing the decoded data.
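
The same pipeline can also be driven programmatically. Below is a minimal sketch using the GStreamer Python bindings (PyGObject); it assumes the SiMa.ai plugins are installed on the target, and the file paths are placeholders to be replaced as described above.

# Minimal sketch: run the example pipeline from Python (assumes PyGObject
# and the SiMa.ai GStreamer plugins are available on the target).
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Placeholder paths: replace with your input file and JSON configuration.
pipeline = Gst.parse_launch(
    "simaaisrc location=input.bin ! "
    "simaaiboxdecode config=/path/to/config.json ! "
    "fakesink"
)

pipeline.set_state(Gst.State.PLAYING)

# Block until EOS or an error is posted on the bus.
bus = pipeline.get_bus()
msg = bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
)
if msg and msg.type == Gst.MessageType.ERROR:
    err, debug = msg.parse_error()
    print(f"Pipeline error: {err.message}")

pipeline.set_state(Gst.State.NULL)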

Configuration

The configuration is divided into three blocks:

  1. GenerixBoxDecode library parameters
    For the full list of parameters, refer to the library README.
  2. Caps block
    For a detailed description of the Caps block, refer to the Caps Negotiation Library README.
  3. Plugin parameters
    These are common across all plugins based on the aggregator template (a minimal sketch of this block follows the list):
    • node_name – the name of the current node (used as the output buffer name).

    • memory – defines output memory options:

      • cpu – CPU where this plugin will run (affects only memory allocation).

      • next_cpu – CPU where the next plugin will run (affects only memory allocation).

    • system – defines plugin system settings:

      • out_buf_queue – size of the BufferPool.

      • dump_data – dumps output buffers to /tmp/{name-of-the-object}-{frame_id}.out.

    • buffers – defines input/output buffers:

      • input – an array of objects specifying the input buffer name and size.

      • output – defines the output buffer size.
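
As a reference, the fragment below sketches only this common plugin-parameters block, written as a Python dictionary and serialized to JSON. The values are placeholders mirroring the example further below; the GenerixBoxDecode and Caps parameters (blocks 1 and 2) still have to be merged into the same file.

# Sketch of the common plugin-parameters block only (placeholder values);
# merge with the GenerixBoxDecode and Caps parameters before use.
import json

plugin_params = {
    "node_name": "simaai_boxdecode",   # used as the output buffer name
    "memory": {
        "cpu": 0,        # CPU where this plugin runs (memory allocation only)
        "next_cpu": 1,   # CPU where the next plugin runs (memory allocation only)
    },
    "system": {
        "out_buf_queue": 1,  # size of the BufferPool
        "dump_data": 0,      # 1 dumps output buffers to /tmp/{name-of-the-object}-{frame_id}.out
    },
    "buffers": {
        "input": [{"name": "simaai_process_mla", "size": 16000}],
        "output": {"size": 580},
    },
}

with open("/path/to/config.json", "w") as f:   # placeholder path
    json.dump(plugin_params, f, indent=2)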

Configuration Description

decode_type:

The type of decoding algorithm to apply, usually named after the model type:

    • yolo – decoding for cell-based single-stage YOLO detector models where coordinate decoding is done on the MLA (uncut model); "boxdecode" only retrieves the data.

    • centernet – decoding of boxes for CenterNet-like models, where box centers are defined by per-class peaks in a heatmap.

    • detr – decoding for DETR-type models, where full decoding and NMS are done as part of MLA model execution.

    • effdet – decoding similar to full YOLOvX decoding, but the model is EfficientDet.

    • yolov7, yolov8, yolov9, yolov10 – decoding for specific YOLO model types where coordinate decoding is done by "boxdecode" itself (cut model: the final layers are removed). This value selects the coordinate-decoding algorithm, which may differ between YOLO versions.

    • yolov7-seg, yolov8-seg – decoding for the corresponding YOLO type, with instance segmentation added on top.

    • yolov7-pose, yolov8-pose – decoding for the corresponding YOLO type, with pose inference added on top.

topk:

The maximum number of boxes to keep as the output of the kernel.

num_classes:

The number of detection classes; must be coordinated with the number of channels in the input tensors.

detection_threshold:

The threshold against which a cell's confidence score is compared to decide whether the cell yields a positive detection.

nms_iou_threshold:

The IoU (Intersection over Union) threshold used when comparing two boxes to decide whether one of them should be eliminated.

sigmoid_on_probabilities:

An optional flag that forces the sigmoid to be used or not in models where it is parameterized. If not present, the default value is used: false when decode_type is yolo or detr, true otherwise.
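
To make the roles of detection_threshold, nms_iou_threshold, topk, and the model/original frame dimensions concrete, here is an illustrative Python sketch of a generic decode flow. It is not the plugin's implementation; the candidate/box format and function names are assumptions for illustration only.

# Illustrative sketch of how the thresholds are typically used; this is NOT
# the plugin's implementation, only a reference for the parameter semantics.

def iou(a, b):
    # Boxes as (x1, y1, x2, y2) in model-frame coordinates (assumed format).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def decode(candidates, cfg):
    # candidates: list of (score, class_id, box) already in model-frame coords.
    # 1. Keep only cells whose confidence passes detection_threshold.
    kept = [c for c in candidates if c[0] >= cfg["detection_threshold"]]
    kept.sort(key=lambda c: c[0], reverse=True)

    # 2. Greedy NMS: drop boxes overlapping a higher-scoring box too much.
    selected = []
    for cand in kept:
        if all(iou(cand[2], s[2]) <= cfg["nms_iou_threshold"] for s in selected):
            selected.append(cand)

    # 3. top-K filtering.
    selected = selected[: cfg["topk"]]

    # 4. Rescale boxes from model frame to the original frame size.
    sx = cfg["original_width"] / cfg["model_width"]
    sy = cfg["original_height"] / cfg["model_height"]
    return [
        (score, cls, (x1 * sx, y1 * sy, x2 * sx, y2 * sy))
        for score, cls, (x1, y1, x2, y2) in selected
    ]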

Configuration file example

{
  "version": 0.1,
  "node_name": "simaai_boxdecode",
  "memory": {
    "cpu": 0,
    "next_cpu": 1
  },
  "system": {
    "out_buf_queue": 1,
    "debug": 0,
    "dump_data": 0
  },
  "buffers": {
    "input": [
      {
        "name": "simaai_process_mla",
        "size": 16000
      }
    ],
    "output": {
      "size": 580
    }
  },
  "decode_type" : "detr",
  "topk" : 24,
  "original_width": 1280,
  "original_height": 720,
  "model_width" : 640,
  "model_height" : 480,
  "num_classes" : 92,
  "detection_threshold" : 0.9,
  "nms_iou_threshold" : 0,
  "num_in_tensor": 2,
  "input_width": [
    100,
    100
  ],
  "input_height": [
    1,
    1
  ],
  "input_depth": [
    92,
    4
  ],
  "slice_width": [
    50,
    100
  ],
  "slice_height": [
    1,
    1
  ],
  "slice_depth": [
    92,
    4
  ],
  "dq_scale": [
    6.950103398907683,
    512.0
  ],
  "dq_zp": [
    37,
    -127357
  ],
  "data_type": [
    "INT8",
    "INT32"
  ],
  "caps": {
    "sink_pads": [
      {
        "media_type": "application/vnd.simaai.tensor",
        "params": [
          {
            "name": "format",
            "type": "string",
            "values": "MLA",
            "json_field": null
          },
          {
            "name": "data_type",
            "type": "string",
            "values": "(INT8, INT16, INT32), (INT8, INT16, INT32)",
            "json_field": "data_type"
          },
          {
            "name": "width",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "input_width"
          },
          {
            "name": "height",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "input_height"
          },
          {
            "name": "depth",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "input_depth"
          },
          {
            "name": "slice_width",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "slice_width"
          },
          {
            "name": "slice_height",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "slice_height"
          },
          {
            "name": "slice_depth",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "slice_depth"
          }
        ]
      }
    ],
    "src_pads": [
      {
        "media_type": "application/vnd.simaai.tensor",
        "params": [
          {
            "name": "format",
            "type": "string",
            "values": "BBOX",
            "json_field": null
          }
        ]
      }
    ]
  }
}

Buffer Size Calculation:

For BBOX-only models:

Buffer Size = 4 + (topK × 24), where 4 is the BBOX header in bytes, topK = 24 (the default), and 24 is the size of one BBOX in bytes, for a total of 580 bytes.

For segmentation models:

Buffer Size = 4 + topK × (24 + 160 × 160), where 4 is the BBOX header in bytes, topK = 20, 24 is the size of one BBOX in bytes, and 160 × 160 is the mask size, for a total of 512,484 bytes.
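
As a quick sanity check, the two formulas can be evaluated directly; the snippet below is illustrative only and not part of any plugin API.

# Sanity check for the output buffer size formulas above (illustrative only).
HEADER_BYTES = 4        # BBOX header
BBOX_BYTES = 24         # size of one BBOX record
MASK_BYTES = 160 * 160  # per-box mask for segmentation models

def bbox_only_size(topk):
    return HEADER_BYTES + topk * BBOX_BYTES

def segmentation_size(topk):
    return HEADER_BYTES + topk * (BBOX_BYTES + MASK_BYTES)

print(bbox_only_size(24))     # 580  -> matches "output": {"size": 580} in the example
print(segmentation_size(20))  # 512484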

Installation

To install the simaaiboxdecode plugin, copy the plugin library file from /usr/local/simaai/plugin_zoo/gst-simaai-plugins-base/gst/ into your project’s plugins/ directory.