Step 3: Run and verify output of simaaiprocesscvu (CVU postprocess)

[Figure: resnet50_application_simaaisrc_detessdequant.jpg]

In this section we perform dequantization and detessellation with the simaaiprocesscvu plugin, then compare the result against our expected fp32 output tensor.

Because we are again using the EV74 CVU, we will have to follow the same steps as we did before:

  1. Choose the CVU graph we want to run (graph SIMA_DETESS_DEQUANT in this case)

  2. Create the JSON configuration file with the right parameters for this application and the target EV74 CVU graph

  3. Develop and compile a configuration application for it

  4. Run the configuration application with the JSON configuration file before executing your GStreamer pipeline

  5. Run your GStreamer pipeline by specifying the simaaiprocesscvu plugin with the JSON configuration file

Choosing the CVU function

For our example, we choose the SIMA_DETESS_DEQUANT function from the list of available CVU kernels in CVU Graphs. This function is useful because it dequantizes and detessellates any MLA output.

Note

Every output tensor from the MLA is tessellated and quantized. Before any downstream operation, all values coming out of the MLA must be dequantized and detessellated.
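To make the dequantization half of this concrete, here is a minimal sketch of the arithmetic in NumPy. It assumes the convention real = (q - zero_point) / scale. That convention is inferred from the numbers in this section, not taken from the SIMA_DETESS_DEQUANT documentation, so treat it as an illustration only. The zero point is the dq_zp value used in this example, and the scale is the one the EV74 log reports later in this section. The dequantize function name is hypothetical.

```python
import numpy as np

# Values taken from this section: dq_zp from the JSON configuration,
# scale as printed in the EV74 log ("Dequant Scale : 265.849884").
DQ_SCALE = 265.849884
DQ_ZP = -128


def dequantize(q, scale=DQ_SCALE, zero_point=DQ_ZP):
    """Map int8 codes to float32.

    ASSUMED convention: real = (q - zero_point) / scale.
    """
    # Widen to float32 first so the subtraction cannot overflow int8.
    return (q.astype(np.float32) - np.float32(zero_point)) / np.float32(scale)


codes = np.array([-128, -127, 127], dtype=np.int8)
values = dequantize(codes)
print(values)
```

Under this assumption, the codes 127 and -127 map to roughly 0.9592 and 0.00376, which agree with the top values printed in the verification step at the end of this section.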

Creating the JSON configuration file

The JSON configuration file is used at 2 points in the runtime:

  1. The CVU configuration application reads the JSON file to set the graph parameters on the CVU.

  2. The same JSON file can be used to configure the simaaiprocesscvu plugin at runtime when the application is launched.

To create the JSON file, refer to the SIMA_DETESS_DEQUANT Parameters and Example Configuration section.

Developing and compiling the configuration application

The CVU needs to be configured with the graph that it will run, along with the corresponding parameters for that graph. Currently, the developer must do this explicitly via a C++ application. Here, we present an example configuration application for the SIMA_DETESS_DEQUANT graph that works for the ResNet50 example.

  1. Go to the SIMA_DETESS_DEQUANT section and either download the pre-written and pre-compiled configuration application, or follow the instructions to re-write or edit the source.

  2. To compile the application on Palette, refer to the EV74 CVU "How to compile CVU Configuration Application?" section.

Copying the configuration application and the JSON configuration to the board

  1. Once the application has been compiled or downloaded, we need to copy it to the board.

  2. First, create the JSON file containing the CVU configuration parameters described in the previous section. From the MLSoC:

    davinci:~/resnet50_example_app$ cd app_configs
    davinci:~/resnet50_example_app/app_configs$
    
  3. Run the following command:

    echo '{
        "version": 0.1,
        "node_name": "detess-dequant",
        "simaai__params": {
            "params": 15,
            "cpu": 1,
            "next_cpu": 0,
            "no_of_outbuf": 1,
            "ibufname": "",
            "graph_id": 201,
            "img_width": 1280,
            "img_height": 720,
            "num_tensors": 1,
            "input_width": [1],
            "input_height": [1],
            "input_depth": [1000],
            "slice_width": [1],
            "slice_height": [1],
            "slice_depth": [1000],
            "dq_scale": [255.02200010497842],
            "dq_zp": [-128],
            "data_type": [0],
            "fp16_out_en": [0],
            "output_format": [0],
            "debug": 0,
            "out_sz": 4000,
            "dump_data": 1
        }
    }' > detessdequant_201_cvu_cfg_params.json
    
  4. From the Palette Docker container on the development host machine, use scp to copy the configuration application binary (detessdequant_201_cvu_cfg_app) to the same folder.

    sima-user@docker-image-id:/home/docker/sima-cli/ev74_cgfs/sima_detess_dequant/build$ scp detessdequant_201_cvu_cfg_app sima@<IP address of MLSoC>:/home/sima/resnet50_example_app/app_configs
        build/detessdequant_201_cvu_cfg_app                                                                               100%   65KB   9.7MB/s   00:00
    
  5. The directory should now look like this:

    davinci:~/resnet50_example_app$ ls
        build/detessdequant_201_cvu_cfg_app  detessdequant_201_cvu_cfg_params.json
    

We now have the parameters with the right values, and the application necessary to configure the CVU for our postprocessing step.
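Before copying the file around, it can be worth checking that out_sz agrees with the tensor dimensions. The sketch below assumes that out_sz is a byte count equal to width x height x depth x element size, with 4 bytes per element for fp32 output and 2 bytes when fp16_out_en is set. That relationship is inferred from the values in this example (1 x 1 x 1000 x 4 = 4000) and is not a documented rule. The expected_out_bytes helper is hypothetical.

```python
import json

# Subset of detessdequant_201_cvu_cfg_params.json, copied from the step above.
CONFIG_TEXT = """
{
    "simaai__params": {
        "num_tensors": 1,
        "input_width": [1],
        "input_height": [1],
        "input_depth": [1000],
        "fp16_out_en": [0],
        "out_sz": 4000
    }
}
"""


def expected_out_bytes(params):
    """Sum the per-tensor output sizes in bytes.

    ASSUMPTION: fp32 output uses 4 bytes per element, fp16 output uses 2.
    """
    total = 0
    for i in range(params["num_tensors"]):
        elements = (
            params["input_width"][i]
            * params["input_height"][i]
            * params["input_depth"][i]
        )
        bytes_per_element = 2 if params["fp16_out_en"][i] else 4
        total += elements * bytes_per_element
    return total


params = json.loads(CONFIG_TEXT)["simaai__params"]
print(expected_out_bytes(params), params["out_sz"])
```

To check the real file instead of the embedded subset, load it with json.load and pass its simaai__params object to the same function.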

Running the configuration application

To run the configuration application, run it on the MLSoC with the JSON configuration file as its argument. From the directory containing the binary, run:

davinci:~/resnet50_example_app$ sudo ./detessdequant_201_cvu_cfg_app detessdequant_201_cvu_cfg_params.json
    Password:
    Completed SIMA_DETESS_DEQUANT graph configure

To verify that the configuration was set correctly, check the EV74 log at /var/log/simaai_EV74.log.

Note

Sometimes it can take a few seconds to a minute for the log to update.

The output should look something like:

davinci:/home/sima/resnet50_example_app/app_configs$ tail -f /var/log/simaai_EV74.log
    ... function="dump_detessdequant_params"]-------sima_detessdequant_inst_0_tensor_0------
    ... function="dump_detessdequant_params"]Input width: 1
    ... function="dump_detessdequant_params"]Input height: 1
    ... function="dump_detessdequant_params"]Input depth/channels: 1000
    ... function="dump_detessdequant_params"]Slice width: 1
    ... function="dump_detessdequant_params"]Slice height: 1
    ... function="dump_detessdequant_params"]Slice depth/channels: 1000
    ... function="dump_detessdequant_params"]Dequant Scale : 265.849884
    ... function="dump_detessdequant_params"]Dequant ZeroPoint: -128
    ... function="dump_detessdequant_params"]Input data type: int8
    ... function="dump_detessdequant_params"]Fp16 Output Enabled: False
    ... function="dump_detessdequant_params"]Output Tensor Format: NHWC
    ... function="dump_detessdequant_params"]Input Tensor Size: 1008
    ... function="dump_detessdequant_params"]Output Tensor Size: 4000
    ... function="dump_detessdequant_params"]Debug level : 0
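If you want to compare these log values with the JSON file programmatically, a small parser is enough. The following is a sketch only: the log line prefix is elided in the listing above, so the regular expression matches just the visible part of each line, and the parse_params helper is hypothetical. The sample text is copied from the listing.

```python
import re

# A few lines copied from the log listing above (prefix elided there as "...").
SAMPLE_LOG = '''\
... function="dump_detessdequant_params"]-------sima_detessdequant_inst_0_tensor_0------
... function="dump_detessdequant_params"]Input depth/channels: 1000
... function="dump_detessdequant_params"]Dequant Scale : 265.849884
... function="dump_detessdequant_params"]Dequant ZeroPoint: -128
... function="dump_detessdequant_params"]Output Tensor Size: 4000
'''

# Capture "<key> : <value>" after the function tag; lines without a colon are skipped.
LINE_RE = re.compile(
    r'function="dump_detessdequant_params"\](?P<key>[^:]+?)\s*:\s*(?P<value>.+)$'
)


def parse_params(log_text):
    """Return a dict of parameter name to string value from the log text."""
    params = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line.strip())
        if match:
            params[match.group("key").strip()] = match.group("value").strip()
    return params


logged = parse_params(SAMPLE_LOG)
print(logged)
```

Comparing logged["Dequant Scale"] and logged["Dequant ZeroPoint"] with dq_scale and dq_zp in your JSON file confirms that the CVU received the values you intended.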

Updating the GStreamer pipeline string

Let us update the previous run_pipeline.sh script to include our new plugin.

#!/bin/bash

# Constants #
APP_DIR=/home/sima/resnet50_example_app
DATA_DIR="${APP_DIR}/data"
SIMA_PLUGINS_DIR="${APP_DIR}/../gst-plugins"
SAMPLE_IMAGE_SRC="${DATA_DIR}/golden_retriever_207_rgb.bin"
CONFIGS_DIR="${APP_DIR}/app_configs"
PREPROC_CVU_CONFIG_BIN="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_app"
PREPROC_CVU_CONFIG_JSON="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_params.json"
INFERENCE_MLA_CONFIG_JSON="${CONFIGS_DIR}/simaaiprocessmla_cfg_params.json"
DETESSDEQUANT_CVU_CONFIG_BIN="${CONFIGS_DIR}/detessdequant_201_cvu_cfg_app"
DETESSDEQUANT_CVU_CONFIG_JSON="${CONFIGS_DIR}/detessdequant_201_cvu_cfg_params.json"

# Remove any existing temporary files before running
rm /tmp/generic_preproc*.out /tmp/mla-*.out /tmp/detess-dequant*.out

# Run the configuration apps for generic_preproc and detessdequant
$PREPROC_CVU_CONFIG_BIN $PREPROC_CVU_CONFIG_JSON
$DETESSDEQUANT_CVU_CONFIG_BIN $DETESSDEQUANT_CVU_CONFIG_JSON

# Run the application
export LD_LIBRARY_PATH="${SIMA_PLUGINS_DIR}"
gst-launch-1.0 -v --gst-plugin-path="${SIMA_PLUGINS_DIR}" \
simaaisrc mem-target=1 node-name="my_image_src" location="${SAMPLE_IMAGE_SRC}" num-buffers=1 ! \
simaaiprocesscvu source-node-name="my_image_src" buffers-list="my_image_src" config="${PREPROC_CVU_CONFIG_JSON}" name="generic_preproc" ! \
simaaiprocessmla config="${INFERENCE_MLA_CONFIG_JSON}" ! \
simaaiprocesscvu source-node-name="mla-resnet" buffers-list="mla-resnet" config="${DETESSDEQUANT_CVU_CONFIG_JSON}" name="detessdequant" ! \
fakesink

To run the application:

davinci:~/resnet50_example_app$ sudo sh run_pipeline.sh
    rm: cannot remove '/tmp/detess-dequant*.out': No such file or directory
    Completed SIMA_GENERIC_PREPROC graph configure

    Completed Graph Configure
    ** Message: 23:09:28.934: Num of chunks 1
    ** Message: 23:09:28.934: Buffer_name: my_image_src, num_of_chunks:1

    (gst-launch-1.0:5856): GLib-GObject-CRITICAL **: 23:09:28.944: g_pointer_type_register_static: assertion 'g_type_from_name (name) == 0' failed

    (gst-launch-1.0:5856): GLib-GObject-CRITICAL **: 23:09:28.944: g_type_set_qdata: assertion 'node != NULL' failed
    ** Message: 23:09:28.945: Num of chunks 1
    ** Message: 23:09:28.946: Buffer_name: mla-resnet, num_of_chunks:1
    Setting pipeline to PAUSED ...
    ** Message: 23:09:28.952: Initialize dispatcher
    ** Message: 23:09:28.954: handle: 0x7f9195b0, 0xffff7f9195b0
    ** Message: 23:09:29.629: Loaded model from location /home/sima/resnet50_example_app/models/quantized_resnet50_stage1_mla.lm, model:hdl: 0xaaaacca728f0
    ** Message: 23:09:29.634: Filename memalloc = /home/sima/resnet50_example_app/data/golden_retriever_207_rgb.bin
    Pipeline is PREROLLING ...
    Pipeline is PREROLLED ...
    Setting pipeline to PLAYING ...
    Redistribute latency...
    New clock: GstSystemClock
    Got EOS from element "pipeline0".
    Execution ended after 0:00:00.001388186
    Setting pipeline to NULL ...
    Freeing pipeline ...

The simaaiprocesscvu output dump is written to /tmp/detess-dequant-001.out, which we will verify in the next step.

Tip

Notice that we set the dump_data parameter in the detessdequant_201_cvu_cfg_params.json file to 1 in order to dump the output of the simaaiprocesscvu plugin.
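A quick way to confirm that the dump is complete is to check its size before interpreting it. The sketch below assumes the dump holds exactly out_sz bytes (4000 in this example) of float32 data, so 1000 values. The load_dump helper is hypothetical, and the demo runs on a stand-in file of zeros, not on a real dump.

```python
import os
import tempfile

import numpy as np


def load_dump(path, expected_bytes=4000):
    """Load a dump as float32 after checking its size.

    ASSUMPTION: the file holds exactly out_sz bytes of float32 values.
    """
    actual = os.path.getsize(path)
    if actual != expected_bytes:
        raise ValueError(f"{path}: expected {expected_bytes} bytes, found {actual}")
    return np.fromfile(path, dtype=np.float32)


# Self-contained demo on a stand-in file (zeros, not real pipeline output).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.out")
    np.zeros(1000, dtype=np.float32).tofile(demo_path)
    demo = load_dump(demo_path)

print(demo.shape)
```

On the board, you would call load_dump('/tmp/detess-dequant-001.out') instead of using the stand-in file.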

Verifying the output

To verify the output against a known reference, we print the top classes from our dequantized output and compare them with the output probabilities of the original fp32 ResNet50 model.

On the MLSoC:

davinci:~/resnet50_example_app$ vi print_mla_probabilities.py

Write the following inside the script:

import numpy as np

# Step 1: Read the binary file as float32 values
probabilities_data = np.fromfile('/tmp/detess-dequant-001.out', dtype=np.float32)

# Step 2: Find the indices of the top 3 largest values
top_3_indices = np.argpartition(probabilities_data, -3)[-3:]

# Step 3: Sort the top 3 indices by the actual values (descending)
top_3_indices = top_3_indices[np.argsort(-probabilities_data[top_3_indices])]

# Print the results
print("Top 3 largest values and their indices:")
for idx in top_3_indices:
    print(f"Index: {idx}, Value: {probabilities_data[idx]}")

Run the python script:

davinci:~/resnet50_example_app$ python3 print_mla_probabilities.py
    Top 3 largest values and their indices:
    Index: 207, Value: 0.959187924861908
    Index: 208, Value: 0.0037615213077515364
    Index: 216, Value: 0.0037615213077515364

Our top class, index 207, is the one we expect, and it has a high probability, as seen in the ModelSDK quantization tests.

Note

If you wish, you can write a similar program to compare the output against the golden_retriever_207_inference_output_probabilities.bin file, which is created for debugging purposes.
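Such a comparison could look like the sketch below. It makes several assumptions: both files hold 1000 float32 values, the tolerance atol=0.05 is an arbitrary illustrative choice, and the compare_probabilities helper is hypothetical. The demo uses synthetic stand-in arrays, not real results.

```python
import numpy as np


def compare_probabilities(output, reference, atol=0.05):
    """Compare two probability vectors element by element.

    atol is a hypothetical tolerance; pick one that suits your model.
    """
    if output.shape != reference.shape:
        raise ValueError(f"shape mismatch: {output.shape} vs {reference.shape}")
    return {
        "top1_output": int(np.argmax(output)),
        "top1_reference": int(np.argmax(reference)),
        "max_abs_diff": float(np.max(np.abs(output - reference))),
        "within_tolerance": bool(np.allclose(output, reference, atol=atol, rtol=0.0)),
    }


# Synthetic stand-in data for a self-contained demo (not real results).
reference = np.zeros(1000, dtype=np.float32)
reference[207] = 0.97
output = reference.copy()
output[207] = 0.959188

report = compare_probabilities(output, reference)
print(report)
```

On the board, you would replace the stand-in arrays with np.fromfile('/tmp/detess-dequant-001.out', dtype=np.float32) and the same call on the reference file. The reference file's location and its float32 layout are assumptions to confirm for your setup.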

Conclusion and next steps

In this section, we:

  • Went through the steps necessary to choose the CVU graph we want to run, create its configuration application, and copy it to the MLSoC

  • Went through how to set the JSON configuration file for the CVU graph we are running, along with a description of each parameter value

  • Ran the pipeline and verified the plugin's output dump, enabled in the JSON, using our Python script

Next we will complete our ResNet50 application by writing our own custom plugin for the final step of the pipeline.