simaaiprocessmla

Runs a workload on the Machine Learning Accelerator (MLA). Users must not change the plugin source code in any way; only configuration changes are needed to start using this plugin.

Usage

! simaaiprocessmla config="process_mla.json"

Configuration

Here is an example of the config file used by the simaaiprocessmla plugin:

{
  "version" : 0.1,
  "node_name" : "mla-resnet",
  "simaai__params" : {
    "params" : 15, <========================== NOT USED, NO NEED TO CHANGE
    "index" : 1, <============================ NOT USED, NO NEED TO CHANGE
    "cpu" : 4, <============================== CPU type used for memory allocation. CONSTANT AND MUST BE SET TO 4
    "next_cpu" : 0, <========================= NOT USED, NO NEED TO CHANGE
    "out_sz" : 96, <========================== Single output buffer size. Depends on the particular model's memory layout.
    "no_of_outbuf" : 5, <===================== How many output buffers the plugin will create. Must be greater than 0.
    "model_id" : 1, <========================= DEFAULT ONE, NO NEED TO CHANGE
    "batch_size" : 1, <======================= Number of frames to process per run
    "batch_sz_model" : 1, <=================== Number of frames the particular model supports
    "in_tensor_sz": 0, <====================== Input tensor size, calculated as `N * H * W * C`. Reflects the input tensor that the particular model accepts. Used only when `batch_size > batch_sz_model`
    "out_tensor_sz": 0, <===================== Output tensor size, calculated as `N * H * W * C`. Reflects the output tensor that the particular model returns. Used only when `batch_size > batch_sz_model`
    "out_type" : 2,   <======================= DEFAULT ONE, NO NEED TO CHANGE
    "ibufname" : "ev-resnet-preproc", <======= Input buffer name
    "inpath" : "/root/ev74_preproc.raw", <==== DEFAULT ONE, NO NEED TO CHANGE
    "model_path" : "/path/to/the/model.lm", <= Path to model file
    "debug" : 0, <============================ Not used
    "dump_data" : 0 <========================= Dump output buffer to file at /tmp
  }
}
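The structural rules stated in the annotations above (constant `cpu`, positive `no_of_outbuf`, tensor sizes required only when batching beyond the model) can be expressed as a small sanity check. This is an illustrative sketch, not part of the plugin; the key names come from the example config, and the checks mirror the notes in this document.

```python
import json

# Sketch of a sanity check for a simaaiprocessmla config (illustrative,
# not part of the plugin; key names come from the example above).
def check_mla_params(cfg: dict) -> None:
    p = cfg["simaai__params"]
    if p["cpu"] != 4:
        raise ValueError("cpu is a constant and must be set to 4")
    if p["no_of_outbuf"] <= 0:
        raise ValueError("no_of_outbuf must be greater than 0")
    if p["batch_size"] > p["batch_sz_model"]:
        # When batching beyond the model's native batch size, the
        # per-frame tensor sizes become mandatory.
        if p["in_tensor_sz"] <= 0 or p["out_tensor_sz"] <= 0:
            raise ValueError("in_tensor_sz and out_tensor_sz must be set")

# Example: validate a config shaped like the one in this document.
example = json.loads("""{
  "version": 0.1,
  "node_name": "mla-resnet",
  "simaai__params": {
    "cpu": 4, "out_sz": 96, "no_of_outbuf": 5,
    "batch_size": 1, "batch_sz_model": 1,
    "in_tensor_sz": 0, "out_tensor_sz": 0
  }
}""")
check_mla_params(example)  # a valid config raises nothing
```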

To start using this plugin, the user has to specify:

  • Path to the model file (.lm) that needs to be run.

  • Output buffer size. The output buffer size depends on the particular model's memory layout. Please refer to the model specification to get the actual output buffer size.

Note

In some rare cases, you can calculate it yourself, knowing the output tensor shape. For example, if the model outputs a 1x4x4x16 tensor, then the output buffer size will be 1*4*4*16 = 256 bytes. Please note that in some cases C16 alignment may apply: for a 1x4x4x12 tensor the output size will also be 256, because the channel dimension (12) will be automatically aligned to 16.

  • no_of_outbuf: Set to a value greater than 0. The plugin will allocate no_of_outbuf * out_sz bytes in total.

  • "batch_size" : N : How many frames the plugin will process per run.

  • "batch_sz_model" : M : How many frames the particular model can process per run.

  • "in_tensor_sz" : K : The input tensor size that the particular model accepts, calculated as N * H * W * C. This parameter is required in the config file but only used if batch_size > batch_sz_model.

  • "out_tensor_sz" : P : The output tensor size that the particular model returns, calculated as N * H * W * C. This parameter is required in the config file but only used if batch_size > batch_sz_model.
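The buffer-size arithmetic from the note above can be sketched as a small helper: the byte size of an N x H x W x C output tensor, with the channel dimension optionally padded up to a multiple of 16 (C16 alignment). Whether alignment applies to a given model is an assumption to verify against the model specification.

```python
# Sketch of the out_sz calculation described in the note above.
def aligned_out_sz(n, h, w, c, align_c16=True):
    if align_c16:
        c = ((c + 15) // 16) * 16  # round channels up to a multiple of 16
    return n * h * w * c

print(aligned_out_sz(1, 4, 4, 16))  # 1*4*4*16 = 256 bytes
print(aligned_out_sz(1, 4, 4, 12))  # channels padded 12 -> 16, also 256
```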

Batching

The following configuration file parameters control batching:

"batch_size" : 1,
"batch_sz_model" : 1,
"in_tensor_sz": 0,
"out_tensor_sz": 0

If the model is trained to process only one frame, the user must set batch_size: 1, batch_sz_model: 1, in_tensor_sz: 0, and out_tensor_sz: 0.

If the model supports batching and the user wants to process no more frames than the model supports, then set batch_size: N, batch_sz_model: M, in_tensor_sz: 0, out_tensor_sz: 0, where N <= M.

If the user wants to process more frames than the model supports, set batch_size: N, batch_sz_model: M, in_tensor_sz: K, out_tensor_sz: P, where N > M, K = 1 * H * W * C for the input tensor, and P = 1 * H * W * C for the output tensor.
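The three cases above can be sketched as a helper that, given the desired batch size N, the model's batch size M, and per-frame (H, W, C) shapes, derives the four batching parameters. This is a sketch: the dictionary keys match the config keys, but the tensor shapes in the example call are made up for illustration.

```python
# Sketch of the batching rules above: choose the four config values
# given desired batch N, model batch M, and per-frame (H, W, C) shapes.
# The shapes in the example calls are hypothetical.
def batching_params(n, m, in_hwc, out_hwc):
    params = {"batch_size": n, "batch_sz_model": m}
    if n > m:
        # Processing more frames than the model supports: the plugin
        # needs the per-frame tensor sizes, 1 * H * W * C.
        h, w, c = in_hwc
        params["in_tensor_sz"] = 1 * h * w * c
        h, w, c = out_hwc
        params["out_tensor_sz"] = 1 * h * w * c
    else:
        # N <= M (including the single-frame case): sizes stay 0.
        params["in_tensor_sz"] = 0
        params["out_tensor_sz"] = 0
    return params

print(batching_params(1, 1, (224, 224, 3), (4, 4, 16)))  # single frame
print(batching_params(8, 4, (224, 224, 3), (4, 4, 16)))  # N > M case
```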

Dependencies

This plugin uses an external dependency called dispatcher-lite to communicate with the MLA. It is a shared library that is built automatically along with the plugin; it must also be uploaded to the Da Vinci (generation 1) board, and the LD_LIBRARY_PATH environment variable must be set before running the pipeline:

LD_LIBRARY_PATH=/path/to/folder/with/dispatcher-lite gst-launch-1.0 --gst-plugin-path=/path/to/simaaiprocessmla/plugin/folder simaaisrc location=mla_in.out node-name="ev-preproc" mem-target=4 ! simaaiprocessmla config=/absolute/path/to/mla_config.json ! fakesink

For more information about dispatcher-lite, please see the core/dispatcher-lite folder in this repository.