Step 3: Run and verify output of simaaiprocesscvu
(CVU postprocess)

In this section we will perform dequantization and detessellation with the help of the simaaiprocesscvu plugin.
We will then compare the result against our expected fp32 output tensor.
Because we are again using the EV74 CVU, we will have to follow the same steps as we did before:
Choose the CVU graph we want to run (graph SIMA_DETESS_DEQUANT in this case)
Create the JSON configuration file with the right parameters for this application and the target EV74 CVU graph
Develop and compile a configuration application for it
Run the configuration application with the JSON configuration file before executing your GStreamer pipeline
Run your GStreamer pipeline by specifying the simaaiprocesscvu plugin with the JSON configuration file
Choosing the CVU function
For our example, we will choose the SIMA_DETESS_DEQUANT function from the list of available CVU kernels under CVU Graphs.
This function is useful because it performs dequantization and detessellation of any MLA output.
Note
Every output tensor from the MLA is tessellated and quantized. Before proceeding with any downstream operations, all values out of the MLA must be dequantized and detessellated.
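The arithmetic behind dequantization is not spelled out in this section, so the following is only a minimal NumPy sketch under an assumption: that the CVU applies the affine mapping real = (q - zero_point) / scale. The function name dequantize is hypothetical. The assumed convention does reproduce the numbers that appear later in this section: with the scale 265.849884 and zero point -128 shown in the EV74 log excerpt, the int8 values 127 and -127 map to roughly 0.959188 and 0.003762, which are the probabilities printed by the verification script. Detessellation (undoing the MLA memory layout) is not modelled here.

```python
import numpy as np

def dequantize(q, scale, zero_point):
    """Map quantized integers back to float32.

    ASSUMPTION: real = (q - zero_point) / scale. This convention is inferred
    from the values printed in this tutorial, not taken from the CVU sources.
    """
    # Widen to int32 first so that q - zero_point cannot overflow int8.
    q = np.asarray(q, dtype=np.int32)
    return ((q - zero_point) / scale).astype(np.float32)

# Scale and zero point as reported in the EV74 log excerpt of this section.
example = dequantize(np.array([127, -127, -128], dtype=np.int8), 265.849884, -128)
```

Under this assumption, an int8 output with zero point -128 can only represent values between 0 and 255 / scale, which is why the top probability saturates just below 1.0.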
Creating the JSON configuration file
A JSON configuration file is used in two steps of the runtime:
The JSON is used by the CVU configuration application to set the graph parameters.
The same JSON file can be used to configure the simaaiprocesscvu plugin at runtime when the application is launched.
To create the JSON file, refer to the SIMA_DETESS_DEQUANT Parameters and Example Configuration sections.
Developing and compiling the configuration application
The CVU needs to be configured with the graph that it will run, along with the corresponding parameters for that graph. Currently, this needs to be done explicitly by the developer via a C++ application. Here, we present an example configuration application for the SIMA_DETESS_DEQUANT graph that works for the ResNet50 example.
Go to the SIMA_DETESS_DEQUANT section and either download the pre-written and pre-compiled configuration application, or follow the instructions to re-write or edit the source.
To compile the application on Palette, please refer to the EV74 CVU How to compile CVU Configuration Application? section.
Copying the configuration application and the JSON configuration to the board
Once the application has been compiled or downloaded, we need to copy it to the board.
Let’s first create the JSON file with the configuration parameters we need for the CVU, as found in the previous section. On the MLSoC:
davinci:~/resnet50_example_app$ cd app_configs
davinci:~/resnet50_example_app/app_configs$
Run the following command:
echo '{ "version": 0.1, "node_name": "detess-dequant", "simaai__params": { "params": 15, "cpu": 1, "next_cpu": 0, "no_of_outbuf": 1, "ibufname": "", "graph_id": 201, "img_width": 1280, "img_height": 720, "num_tensors": 1, "input_width": [1], "input_height": [1], "input_depth": [1000], "slice_width": [1], "slice_height": [1], "slice_depth": [1000], "dq_scale": [255.02200010497842], "dq_zp": [-128], "data_type": [0], "fp16_out_en": [0], "output_format": [0], "debug": 0, "out_sz": 4000, "dump_data": 1 } }' > detessdequant_201_cvu_cfg_params.json
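As an alternative to the echo one-liner, the same file can be generated from a short script, which makes the relationship between the fields easier to see. This is a sketch only: the helper names build_detess_dequant_config and write_config are hypothetical, and the rule used for out_sz (number of output values times 4 bytes for fp32, or 2 bytes for fp16) is an assumption inferred from the 1000-value tensor and the out_sz of 4000 used above. All other values are copied from the command above.

```python
import json

def build_detess_dequant_config(depth=1000, dq_scale=255.02200010497842,
                                dq_zp=-128, fp16_out=False):
    """Return the detess-dequant parameters from this section as a dict."""
    # ASSUMPTION: out_sz is the output size in bytes, i.e. the number of
    # values times 4 (fp32) or 2 (fp16). This matches 1000 * 4 = 4000.
    bytes_per_value = 2 if fp16_out else 4
    return {
        "version": 0.1,
        "node_name": "detess-dequant",
        "simaai__params": {
            "params": 15,
            "cpu": 1,
            "next_cpu": 0,
            "no_of_outbuf": 1,
            "ibufname": "",
            "graph_id": 201,
            "img_width": 1280,
            "img_height": 720,
            "num_tensors": 1,
            "input_width": [1],
            "input_height": [1],
            "input_depth": [depth],
            "slice_width": [1],
            "slice_height": [1],
            "slice_depth": [depth],
            "dq_scale": [dq_scale],
            "dq_zp": [dq_zp],
            "data_type": [0],
            "fp16_out_en": [1 if fp16_out else 0],
            "output_format": [0],
            "debug": 0,
            "out_sz": depth * bytes_per_value,
            "dump_data": 1,
        },
    }

def write_config(path, config):
    """Write the configuration dict to a JSON file."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
```

Calling write_config("detessdequant_201_cvu_cfg_params.json", build_detess_dequant_config()) from the app_configs directory would produce a file with the same contents as the echo command.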
From the Palette Docker container on the development host machine, let’s scp the configuration application binary (detessdequant_201_cvu_cfg_app) to the same folder.
sima-user@docker-image-id:/home/docker/sima-cli/ev74_cgfs/sima_detess_dequant/build$ scp detessdequant_201_cvu_cfg_app sima@<IP address of MLSoC>:/home/sima/resnet50_example_app/app_configs
build/detessdequant_201_cvu_cfg_app 100% 65KB 9.7MB/s 00:00
The directory should now look like this:
davinci:~/resnet50_example_app$ ls build/detessdequant_201_cvu_cfg_app detessdequant_201_cvu_cfg_params.json
We now have the parameters with the right values, and the application necessary to configure the CVU for our postprocessing step.
Running the configuration application
To run the configuration application, execute it on the MLSoC with the JSON configuration file as its argument. From the directory that contains the binary, run:
davinci:~/resnet50_example_app$ sudo ./detessdequant_201_cvu_cfg_app detessdequant_201_cvu_cfg_params.json
Password:
Completed SIMA_DETESS_DEQUANT graph configure
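To verify that the configuration was set correctly, look at the EV74 log at /var/log/simaai_EV74.log. The output should look something like the excerpt that follows; values such as the dequant scale reflect the JSON parameters passed to the configuration application.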
Note
Sometimes it can take a few seconds to a minute for the log to update.
davinci:/home/sima/resnet50_example_app/app_configs$ tail -f /var/log/simaai_EV74.log
... function="dump_detessdequant_params"]-------sima_detessdequant_inst_0_tensor_0------
... function="dump_detessdequant_params"]Input width: 1
... function="dump_detessdequant_params"]Input height: 1
... function="dump_detessdequant_params"]Input depth/channels: 1000
... function="dump_detessdequant_params"]Slice width: 1
... function="dump_detessdequant_params"]Slice height: 1
... function="dump_detessdequant_params"]Slice depth/channels: 1000
... function="dump_detessdequant_params"]Dequant Scale : 265.849884
... function="dump_detessdequant_params"]Dequant ZeroPoint: -128
... function="dump_detessdequant_params"]Input data type: int8
... function="dump_detessdequant_params"]Fp16 Output Enabled: False
... function="dump_detessdequant_params"]Output Tensor Format: NHWC
... function="dump_detessdequant_params"]Input Tensor Size: 1008
... function="dump_detessdequant_params"]Output Tensor Size: 4000
... function="dump_detessdequant_params"]Debug level : 0
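Checking the log by eye is error-prone, so the comparison against the JSON can be scripted. The sketch below assumes the log lines have the shape shown above (the text after function="dump_detessdequant_params"] is a "name: value" pair); the helper names and the mapping from log labels to JSON keys are hypothetical and cover only a few fields.

```python
import math
import re

# Matches the "name: value" part that follows the function tag in each line.
LINE_RE = re.compile(
    r'function="dump_detessdequant_params"\](?P<key>[^:]+?)\s*:\s*(?P<value>.+?)\s*$'
)

# ASSUMPTION: correspondence between log labels and JSON keys.
LOG_TO_JSON = {
    "Input width": "input_width",
    "Input height": "input_height",
    "Input depth/channels": "input_depth",
    "Dequant Scale": "dq_scale",
    "Dequant ZeroPoint": "dq_zp",
}

def parse_dump_lines(lines):
    """Collect the name/value pairs printed by dump_detessdequant_params."""
    logged = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # separator lines without a colon are skipped
            logged[match.group("key")] = match.group("value")
    return logged

def find_mismatches(logged, sima_params, rel_tol=1e-6):
    """Return the JSON keys whose first element differs from the logged value."""
    bad = []
    for log_key, json_key in LOG_TO_JSON.items():
        if log_key not in logged:
            bad.append(json_key)
            continue
        if not math.isclose(float(logged[log_key]),
                            float(sima_params[json_key][0]), rel_tol=rel_tol):
            bad.append(json_key)
    return bad
```

Applied to the excerpt above and the JSON created earlier in this section, find_mismatches reports dq_scale (265.849884 in the log versus 255.02200010497842 in the JSON), which suggests the excerpt was captured with a different scale value. A check like this makes such differences visible before the pipeline is run.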
Updating the GStreamer string
Let us update the previous run_pipeline.sh script to include our new plugin.
#!/bin/bash
# Constants #
APP_DIR=/home/sima/resnet50_example_app
DATA_DIR="${APP_DIR}/data"
SIMA_PLUGINS_DIR="${APP_DIR}/../gst-plugins"
SAMPLE_IMAGE_SRC="${DATA_DIR}/golden_retriever_207_rgb.bin"
CONFIGS_DIR="${APP_DIR}/app_configs"
PREPROC_CVU_CONFIG_BIN="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_app"
PREPROC_CVU_CONFIG_JSON="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_params.json"
INFERENCE_MLA_CONFIG_JSON="${CONFIGS_DIR}/simaaiprocessmla_cfg_params.json"
DETESSDEQUANT_CVU_CONFIG_BIN="${CONFIGS_DIR}/detessdequant_201_cvu_cfg_app"
DETESSDEQUANT_CVU_CONFIG_JSON="${CONFIGS_DIR}/detessdequant_201_cvu_cfg_params.json"
# Remove any existing temporary files before running
rm /tmp/generic_preproc*.out /tmp/mla-*.out /tmp/detess-dequant*.out
# Run the configuration apps for generic_preproc and detessdequant
$PREPROC_CVU_CONFIG_BIN $PREPROC_CVU_CONFIG_JSON
$DETESSDEQUANT_CVU_CONFIG_BIN $DETESSDEQUANT_CVU_CONFIG_JSON
# Run the application
export LD_LIBRARY_PATH="${SIMA_PLUGINS_DIR}"
gst-launch-1.0 -v --gst-plugin-path="${SIMA_PLUGINS_DIR}" \
simaaisrc mem-target=1 node-name="my_image_src" location="${SAMPLE_IMAGE_SRC}" num-buffers=1 ! \
simaaiprocesscvu source-node-name="my_image_src" buffers-list="my_image_src" config="${PREPROC_CVU_CONFIG_JSON}" name="generic_preproc" ! \
simaaiprocessmla config="${INFERENCE_MLA_CONFIG_JSON}" ! \
simaaiprocesscvu source-node-name="mla-resnet" buffers-list="mla-resnet" config="${DETESSDEQUANT_CVU_CONFIG_JSON}" name="detessdequant" ! \
fakesink
To run the application:
davinci:~/resnet50_example_app$ sudo sh run_pipeline.sh
rm: cannot remove '/tmp/detess-dequant*.out': No such file or directory
Completed SIMA_GENERIC_PREPROC graph configure
Completed Graph Configure
** Message: 23:09:28.934: Num of chunks 1
** Message: 23:09:28.934: Buffer_name: my_image_src, num_of_chunks:1
(gst-launch-1.0:5856): GLib-GObject-CRITICAL **: 23:09:28.944: g_pointer_type_register_static: assertion 'g_type_from_name (name) == 0' failed
(gst-launch-1.0:5856): GLib-GObject-CRITICAL **: 23:09:28.944: g_type_set_qdata: assertion 'node != NULL' failed
** Message: 23:09:28.945: Num of chunks 1
** Message: 23:09:28.946: Buffer_name: mla-resnet, num_of_chunks:1
Setting pipeline to PAUSED ...
** Message: 23:09:28.952: Initialize dispatcher
** Message: 23:09:28.954: handle: 0x7f9195b0, 0xffff7f9195b0
** Message: 23:09:29.629: Loaded model from location /home/sima/resnet50_example_app/models/quantized_resnet50_stage1_mla.lm, model:hdl: 0xaaaacca728f0
** Message: 23:09:29.634: Filename memalloc = /home/sima/resnet50_example_app/data/golden_retriever_207_rgb.bin
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
Redistribute latency...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 0:00:00.001388186
Setting pipeline to NULL ...
Freeing pipeline ...
The output dump of the simaaiprocesscvu plugin is located at /tmp/detess-dequant-001.out and will be used for verification in the next step.
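Before inspecting the values, a quick sanity check is to confirm that the dump has the size the configuration promises. The sketch below assumes that the dump file contains exactly out_sz bytes (here 4000, that is 1000 fp32 values); this is an inference from the parameters in this section rather than a documented guarantee, and the helper names are hypothetical.

```python
import json
import os

def expected_dump_size(config_path):
    """Read out_sz (assumed to be the dump size in bytes) from the JSON file."""
    with open(config_path) as f:
        return json.load(f)["simaai__params"]["out_sz"]

def dump_matches_config(dump_path, config_path):
    """True if the dump file size equals out_sz from the configuration."""
    return os.path.getsize(dump_path) == expected_dump_size(config_path)
```

For this example the call would be dump_matches_config("/tmp/detess-dequant-001.out", "app_configs/detessdequant_201_cvu_cfg_params.json"). A size mismatch would typically point to a wrong out_sz or to the pipeline not having produced a complete buffer.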
Tip
Notice that we set the dump_data parameter in the detessdequant_201_cvu_cfg_params.json file to 1 in order to dump the output of the simaaiprocesscvu plugin.
Verifying the output
To verify the output against a known reference, we will print the top 3 probabilities from our dump and check them against the original fp32 ResNet50 output probabilities.
On the MLSoC:
davinci:~/resnet50_example_app$ vi print_mla_probabilities.py
Write the following inside the script:
import numpy as np
# Step 1: Read the binary dump as float32 values (the plugin output is already dequantized)
probabilities_data = np.fromfile('/tmp/detess-dequant-001.out', dtype=np.float32)
# Step 2: Find the indices of the top 3 largest values
top_3_indices = np.argpartition(probabilities_data, -3)[-3:]
# Step 3: Sort the top 3 indices by the actual values (descending)
top_3_indices = top_3_indices[np.argsort(-probabilities_data[top_3_indices])]
# Print the results
print("Top 3 largest values and their indices:")
for idx in top_3_indices:
    print(f"Index: {idx}, Value: {probabilities_data[idx]}")
Run the Python script:
davinci:~/resnet50_example_app$ python3 print_mla_probabilities.py
Top 3 largest values and their indices:
Index: 207, Value: 0.959187924861908
Index: 208, Value: 0.0037615213077515364
Index: 216, Value: 0.0037615213077515364
Our top class is index 207, the one we expect for this image, and it is predicted with high probability, consistent with the ModelSDK quantization tests.
Note
If you wish, you could write a similar program to compare against the golden_retriever_207_inference_output_probabilities.bin file that was created for debugging purposes.
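One possible shape for such a comparison program is sketched below. It assumes that the reference file is a raw float32 array with the same number of entries as the dump, which this section does not confirm; the function names and the reported metrics are illustrative choices, and the location of the reference file is left to the reader.

```python
import numpy as np

def compare_probabilities(output, reference, k=3):
    """Compare two probability vectors by top-k agreement and absolute error."""
    output = np.asarray(output, dtype=np.float32)
    reference = np.asarray(reference, dtype=np.float32)
    if output.shape != reference.shape:
        raise ValueError(f"shape mismatch: {output.shape} vs {reference.shape}")
    # Stable sort keeps the ordering of tied values deterministic.
    top_out = np.argsort(-output, kind="stable")[:k]
    top_ref = np.argsort(-reference, kind="stable")[:k]
    return {
        "top1_match": bool(top_out[0] == top_ref[0]),
        "topk_overlap": len(set(top_out.tolist()) & set(top_ref.tolist())),
        "max_abs_diff": float(np.max(np.abs(output - reference))),
    }

def compare_files(output_path, reference_path, k=3):
    """ASSUMPTION: both files hold raw float32 values of equal length."""
    return compare_probabilities(
        np.fromfile(output_path, dtype=np.float32),
        np.fromfile(reference_path, dtype=np.float32),
        k,
    )
```

Because the MLA output is quantized to int8, a small max_abs_diff is expected even when the top class matches; what counts as acceptable depends on the application.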
Conclusion and next steps
In this section, we:
Went through the steps necessary to choose the CVU graph we want to run, create its configuration application, and copy it to the MLSoC
Went through how to set the JSON configuration file for the CVU graph we are running, along with a description of each parameter value
Ran the pipeline and verified the output, using the dump of the plugin as configured in the JSON and the output from our Python reference application
Next we will complete our ResNet50 application by writing our own custom plugin for the final step of the pipeline.