simaaiboxdecode

The simaaiboxdecode GStreamer plugin processes tensors into valid object detections. It detessellates and dequantizes tensors on the fly, identifies cells of interest, classifies objects, converts box regressions to model-frame coordinates, applies NMS and configurable top-K filtering, and rescales boxes to the original frame size.

Properties

The plugin supports the following properties:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| config | string | "/mnt/host/simaaiboxdecode.json" | Path to the JSON configuration file for the simaaiboxdecode. |
| emit-signals | boolean | FALSE | Send signals. |
| latency | int64 | 0 | Additional latency in live mode to allow upstream to take longer to produce buffers for the current position (in nanoseconds). |
| min-upstream-latency | int64 | 0 | Overrides initial minimum latency if dynamically added sources have higher latency (in nanoseconds). |
| name | string | "simaaisimaaiboxdecode0" | The name of the object. |
| parent | GstObject | | The parent of the object. |
| silent | boolean | TRUE | Produce verbose output. |
| sima-allocator-type | int | 0 | 1 - no segment API, 2 - segment API support. |
| start-time | int64 | 18446744073709551615 | Start time to use if start-time-selection=set. |
| start-time-selection | Enum | 0, "zero" | Decides which start time is output. |
| transmit | boolean | FALSE | Transmit KPI Messages. |

Usage

The plugin can be integrated into a GStreamer pipeline as follows:

gst-launch-1.0 simaaisrc location=input.bin ! simaaiboxdecode config=/path/to/config.json ! fakesink

Replace /path/to/config.json with the actual path to your JSON configuration file, and input.bin with the path to the input file containing the encoded data. The fakesink element can be replaced with any element suitable for processing the decoded data.
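
The same pipeline can also be driven programmatically. Below is a minimal sketch using the GStreamer Python bindings (PyGObject); it assumes the SiMa.ai plugins are installed on the target, and the file paths are placeholders to be replaced as described above.

# Minimal sketch: run the example pipeline from Python (assumes PyGObject
# and the SiMa.ai GStreamer plugins are available on the target).
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Placeholder paths: replace with your input file and JSON configuration.
pipeline = Gst.parse_launch(
    "simaaisrc location=input.bin ! "
    "simaaiboxdecode config=/path/to/config.json ! "
    "fakesink"
)

pipeline.set_state(Gst.State.PLAYING)

# Block until EOS or an error is posted on the bus.
bus = pipeline.get_bus()
msg = bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
)
if msg and msg.type == Gst.MessageType.ERROR:
    err, debug = msg.parse_error()
    print(f"Pipeline error: {err.message}")

pipeline.set_state(Gst.State.NULL)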

Configuration

The configuration is divided into three blocks:

  1. GenerixBoxDecode library parameters
    For the full list of parameters, refer to the library README.
  2. Caps block
    For a detailed description of the Caps block, refer to the Caps Negotiation Library README.
  3. Plugin parameters
    These are common across all plugins based on the aggregator template (a minimal sketch of this block follows the list):
    • node_name – the name of the current node (used as the output buffer name).

    • memory – defines output memory options:

      • cpu – CPU where this plugin will run (affects only memory allocation).

      • next_cpu – CPU where the next plugin will run (affects only memory allocation).

    • system – defines plugin system settings:

      • out_buf_queue – size of the BufferPool.

      • dump_data – dumps output buffers to /tmp/{name-of-the-object}-{frame_id}.out.

    • buffers – defines input/output buffers:

      • input – an array of objects specifying the input buffer name and size.

      • output – defines the output buffer size.
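
As a reference, the fragment below sketches only this common plugin-parameters block, written as a Python dictionary and serialized to JSON. The values are placeholders mirroring the example further below; the GenerixBoxDecode and Caps parameters (blocks 1 and 2) still have to be merged into the same file.

# Sketch of the common plugin-parameters block only (placeholder values);
# merge with the GenerixBoxDecode and Caps parameters before use.
import json

plugin_params = {
    "node_name": "simaai_boxdecode",   # used as the output buffer name
    "memory": {
        "cpu": 0,        # CPU where this plugin runs (memory allocation only)
        "next_cpu": 1,   # CPU where the next plugin runs (memory allocation only)
    },
    "system": {
        "out_buf_queue": 1,  # size of the BufferPool
        "dump_data": 0,      # 1 dumps output buffers to /tmp/{name-of-the-object}-{frame_id}.out
    },
    "buffers": {
        "input": [{"name": "simaai_process_mla", "size": 16000}],
        "output": {"size": 580},
    },
}

with open("/path/to/config.json", "w") as f:   # placeholder path
    json.dump(plugin_params, f, indent=2)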

Configuration Description

decode_type:

The type of decoding algorithm to apply, usually named after the model type:

    • yolo – decoding for cell-based single-stage YOLO detector models where coordinate decoding is done on the MLA (uncut model); "boxdecode" only retrieves the data.

    • centernet – decoding of boxes for CenterNet-like models, where box centers are defined by per-class peaks in a heatmap.

    • detr – decoding for DETR-type models, where full decoding and NMS are done as part of MLA model execution.

    • effdet – decoding similar to full YOLOvX decoding, but the model is EfficientDet.

    • yolov7, yolov8, yolov9, yolov10 – decoding for specific YOLO model types where coordinate decoding is done by "boxdecode" itself (cut model: the final layers are removed). This value selects the coordinate-decoding algorithm, which may differ between YOLO versions.

    • yolov7-seg, yolov8-seg – decoding for the corresponding YOLO type, with instance segmentation added on top.

    • yolov7-pose, yolov8-pose – decoding for the corresponding YOLO type, with pose inference added on top.

topk:

The maximum number of boxes to keep as the output of the kernel.

num_classes:

The number of detection classes; must be coordinated with the number of channels in the input tensors.

detection_threshold:

The threshold against which a cell's confidence score is compared to decide whether the cell yields a positive detection.

nms_iou_threshold:

The IoU (Intersection over Union) threshold used when comparing two boxes to decide whether one of them should be eliminated.

sigmoid_on_probabilities:

An optional flag that forces the sigmoid to be used or not in models where it is parameterized. If not present, the default value is used: false when decode_type is yolo or detr, true otherwise.
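
To make the roles of detection_threshold, nms_iou_threshold, topk, and the model/original frame dimensions concrete, here is an illustrative Python sketch of a generic decode flow. It is not the plugin's implementation; the candidate/box format and function names are assumptions for illustration only.

# Illustrative sketch of how the thresholds are typically used; this is NOT
# the plugin's implementation, only a reference for the parameter semantics.

def iou(a, b):
    # Boxes as (x1, y1, x2, y2) in model-frame coordinates (assumed format).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def decode(candidates, cfg):
    # candidates: list of (score, class_id, box) already in model-frame coords.
    # 1. Keep only cells whose confidence passes detection_threshold.
    kept = [c for c in candidates if c[0] >= cfg["detection_threshold"]]
    kept.sort(key=lambda c: c[0], reverse=True)

    # 2. Greedy NMS: drop boxes overlapping a higher-scoring box too much.
    selected = []
    for cand in kept:
        if all(iou(cand[2], s[2]) <= cfg["nms_iou_threshold"] for s in selected):
            selected.append(cand)

    # 3. top-K filtering.
    selected = selected[: cfg["topk"]]

    # 4. Rescale boxes from model frame to the original frame size.
    sx = cfg["original_width"] / cfg["model_width"]
    sy = cfg["original_height"] / cfg["model_height"]
    return [
        (score, cls, (x1 * sx, y1 * sy, x2 * sx, y2 * sy))
        for score, cls, (x1, y1, x2, y2) in selected
    ]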

Configuration file example

{
  "version": 0.1,
  "node_name": "simaai_boxdecode",
  "memory": {
    "cpu": 0,
    "next_cpu": 1
  },
  "system": {
    "out_buf_queue": 1,
    "debug": 0,
    "dump_data": 0
  },
  "buffers": {
    "input": [
      {
        "name": "simaai_process_mla",
        "size": 16000
      }
    ],
    "output": {
      "size": 580
    }
  },
  "decode_type" : "detr",
  "topk" : 24,
  "original_width": 1280,
  "original_height": 720,
  "model_width" : 640,
  "model_height" : 480,
  "num_classes" : 92,
  "detection_threshold" : 0.9,
  "nms_iou_threshold" : 0,
  "num_in_tensor": 2,
  "input_width": [
    100,
    100
  ],
  "input_height": [
    1,
    1
  ],
  "input_depth": [
    92,
    4
  ],
  "slice_width": [
    50,
    100
  ],
  "slice_height": [
    1,
    1
  ],
  "slice_depth": [
    92,
    4
  ],
  "dq_scale": [
    6.950103398907683,
    512.0
  ],
  "dq_zp": [
    37,
    -127357
  ],
  "data_type": [
    "INT8",
    "INT32"
  ],
  "caps": {
    "sink_pads": [
      {
        "media_type": "application/vnd.simaai.tensor",
        "params": [
          {
            "name": "format",
            "type": "string",
            "values": "MLA",
            "json_field": null
          },
          {
            "name": "data_type",
            "type": "string",
            "values": "(INT8, INT16, INT32), (INT8, INT16, INT32)",
            "json_field": "data_type"
          },
          {
            "name": "width",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "input_width"
          },
          {
            "name": "height",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "input_height"
          },
          {
            "name": "depth",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "input_depth"
          },
          {
            "name": "slice_width",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "slice_width"
          },
          {
            "name": "slice_height",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "slice_height"
          },
          {
            "name": "slice_depth",
            "type": "int",
            "values": "(1 - 4096), (1 - 4096)",
            "json_field": "slice_depth"
          }
        ]
      }
    ],
    "src_pads": [
      {
        "media_type": "application/vnd.simaai.tensor",
        "params": [
          {
            "name": "format",
            "type": "string",
            "values": "BBOX",
            "json_field": null
          }
        ]
      }
    ]
  }
}

Buffer Size Calculation:

For BBOX-only models:

Buffer Size = 4 + (topK × 24), where 4 is the BBOX header in bytes, topK = 24 (the default), and 24 is the size of one BBOX in bytes, for a total of 580 bytes.

For segmentation models:

Buffer Size = 4 + topK × (24 + 160 × 160), where 4 is the BBOX header in bytes, topK = 20, 24 is the size of one BBOX in bytes, and 160 × 160 is the mask size, for a total of 512,484 bytes.
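
As a quick sanity check, the two formulas can be evaluated directly; the snippet below is illustrative only and not part of any plugin API.

# Sanity check for the output buffer size formulas above (illustrative only).
HEADER_BYTES = 4        # BBOX header
BBOX_BYTES = 24         # size of one BBOX record
MASK_BYTES = 160 * 160  # per-box mask for segmentation models

def bbox_only_size(topk):
    return HEADER_BYTES + topk * BBOX_BYTES

def segmentation_size(topk):
    return HEADER_BYTES + topk * (BBOX_BYTES + MASK_BYTES)

print(bbox_only_size(24))     # 580  -> matches "output": {"size": 580} in the example
print(segmentation_size(20))  # 512484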

Installation

To install the simaaiboxdecode plugin, copy the plugin library file from /usr/local/simaai/plugin_zoo/gst-simaai-plugins-base/gst/ into your project’s plugins/ directory.