.. _developing_gstreamer_app_gstreamer_postproc_cvu:


Step 3: Run and verify output of ``simaaiprocesscvu`` (CVU postprocess)
#######################################################################

.. image:: media/resnet50_application_simaaisrc_detessdequant.jpg 
    :align: center
    :scale: 30%

|

In this section, we perform ``dequantization`` and ``detesselation`` with the ``simaaiprocesscvu`` plugin.
We then compare the result against our expected fp32 output tensor to confirm that it matches.

Because we are again using the EV74 CVU, we will have to follow the same steps as we did before:

#. Choose the CVU graph we want to run (graph :ref:`ev74_graph_201_sima_detess_dequant` in this case)
#. Create the JSON configuration file with the right parameters for this application and the target EV74 CVU graph
#. Develop and compile a configuration application for it
#. Run the configuration application with the JSON configuration file before executing your GStreamer pipeline
#. Run your GStreamer pipeline, passing the JSON configuration file to the ``simaaiprocesscvu`` plugin

Choosing the CVU function
=========================

For our example, we choose the :ref:`ev74_graph_201_sima_detess_dequant` function from the list of available CVU kernels in :ref:`cvu_graphs`.
This function performs ``dequantization`` and ``detesselation`` of any MLA output.

.. note:: 
    Every output tensor from the MLA is ``tesselated`` and ``quantized``. Before proceeding with any downstream operations, all values out of the MLA
    must be ``dequantized`` and ``detesselated``.
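
As a rough illustration of the dequantization half of this step, the sketch below maps int8 values back to floating point. It assumes the affine convention ``real = (q - zero_point) / scale``, which is an assumption about the graph's arithmetic and is not stated in this section; the ``dequantize`` helper is hypothetical. The scale and zero point are the values reported in the EV74 log shown later on this page. Detesselation is not sketched because the tile layout is not described here.

.. code-block:: python

    import numpy as np

    def dequantize(q, scale, zero_point):
        """Hypothetical helper: assumes real = (q - zero_point) / scale."""
        return (q.astype(np.float32) - np.float32(zero_point)) / np.float32(scale)

    # Values reported in the EV74 log later in this section.
    scale = 265.849884
    zero_point = -128

    # The smallest, second-smallest, and largest int8 values.
    q = np.array([-128, -127, 127], dtype=np.int8)
    print(dequantize(q, scale, zero_point))

Under this assumption, the largest int8 value (127) maps to roughly 0.959 and -127 maps to roughly 0.0038, which agrees with the probabilities printed in the verification step below.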

Creating the JSON configuration file
====================================

A JSON configuration file is used in two steps of the runtime:

#. The CVU configuration application reads the JSON to set the parameters of the graph.
#. The same JSON file configures the :code:`simaaiprocesscvu` plugin at runtime when the application is launched.

To create the JSON file, refer to the ``Parameters`` and ``Example Configuration`` sections of :ref:`ev74_graph_201_sima_detess_dequant`.

Developing and compiling the configuration application
======================================================

The CVU needs to be configured with the graph that it will run, along with the corresponding parameters for that graph.
Currently, the developer must do this explicitly through a C++ application. Here, we present an example configuration application
for the :ref:`ev74_graph_201_sima_detess_dequant` graph that works for the ResNet50 example.

#. Go to the :ref:`ev74_graph_201_sima_detess_dequant` section and either download the pre-written and pre-compiled configuration application, or follow the instructions to re-write or edit the source.
#. To compile the application on Palette, please refer to the EV74 CVU :ref:`dependent_app_ev74_app` section.

Copying the configuration application and the JSON configuration to the board
=============================================================================

#. Once the application has been compiled or downloaded, we need to copy it to the board.
#. First, create the JSON file with the CVU configuration parameters described in the previous section. On the MLSoC:

    .. code-block:: console

        davinci:~/resnet50_example_app$ cd app_configs
        davinci:~/resnet50_example_app/app_configs$ 

#. Run the following command:        

    .. code-block:: bash
        
        echo '{
            "version": 0.1,
            "node_name": "detess-dequant",
            "simaai__params": {
                "params": 15,
                "cpu": 1,
                "next_cpu": 0,
                "no_of_outbuf": 1,
                "ibufname": "",
                "graph_id": 201,
                "img_width": 1280,
                "img_height": 720,
                "num_tensors": 1,
                "input_width": [1],
                "input_height": [1],
                "input_depth": [1000],
                "slice_width": [1],
                "slice_height": [1],
                "slice_depth": [1000],
                "dq_scale": [255.02200010497842],
                "dq_zp": [-128],
                "data_type": [0],
                "fp16_out_en": [0],
                "output_format": [0],
                "debug": 0,
                "out_sz": 4000,
                "dump_data": 1
            }
        }' > detessdequant_201_cvu_cfg_params.json
    
#. From the Palette Docker container on the development host machine, use ``scp`` to copy the configuration application binary (``detessdequant_201_cvu_cfg_app``) to the same folder on the MLSoC.

    .. code-block:: console

        sima-user@docker-image-id:/home/docker/sima-cli/ev74_cgfs/sima_detess_dequant/build$ scp detessdequant_201_cvu_cfg_app sima@<IP address of MLSoC>:/home/sima/resnet50_example_app/app_configs
            build/detessdequant_201_cvu_cfg_app                                                                               100%   65KB   9.7MB/s   00:00    

#. The directory should now look like this:

    .. code-block:: console

        davinci:~/resnet50_example_app$ ls
            build/detessdequant_201_cvu_cfg_app  detessdequant_201_cvu_cfg_params.json

We now have the parameters with the right values, and the application necessary to configure the CVU for our postprocessing step.
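
Before copying the JSON around, it can help to sanity-check it. The sketch below assumes that ``out_sz`` is the output size in bytes, that is, the number of tensor elements multiplied by 4 bytes for fp32 output (or 2 bytes when ``fp16_out_en`` is set). That relationship is an assumption inferred from this example (1 x 1 x 1000 elements and ``out_sz`` of 4000), not a documented rule, and ``expected_out_sz`` is a hypothetical helper.

.. code-block:: python

    import json

    def expected_out_sz(params):
        """Hypothetical helper: sum of per-tensor output sizes in bytes."""
        total = 0
        for i in range(params["num_tensors"]):
            elements = (params["input_width"][i]
                        * params["input_height"][i]
                        * params["input_depth"][i])
            # Assumption: 2 bytes per element for fp16 output, 4 for fp32.
            bytes_per_element = 2 if params["fp16_out_en"][i] else 4
            total += elements * bytes_per_element
        return total

    # Subset of detessdequant_201_cvu_cfg_params.json, inlined so the
    # sketch runs without the file.
    config_text = """
    {
        "simaai__params": {
            "num_tensors": 1,
            "input_width": [1],
            "input_height": [1],
            "input_depth": [1000],
            "fp16_out_en": [0],
            "out_sz": 4000
        }
    }
    """

    params = json.loads(config_text)["simaai__params"]
    if expected_out_sz(params) == params["out_sz"]:
        print(f"out_sz OK: {params['out_sz']} bytes")
    else:
        print(f"out_sz mismatch: expected {expected_out_sz(params)}")

On the board, the same check could read the real file with ``json.load`` instead of the inlined string.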

Running the configuration application
=====================================

To run the configuration application, run it on the MLSoC with the JSON configuration file as its argument. From the directory that contains the binary, run:

.. code-block:: console

    davinci:~/resnet50_example_app$ sudo ./detessdequant_201_cvu_cfg_app detessdequant_201_cvu_cfg_params.json 
        Password: 
        Completed SIMA_DETESS_DEQUANT graph configure

To verify that the configuration was set correctly, check the EV74 log at ``/var/log/simaai_EV74.log``.

.. note:: 

    It can take from a few seconds to a minute for the log to update.

The output should look something like:


.. code-block:: console
    
    davinci:/home/sima/resnet50_example_app/app_configs$ tail -f /var/log/simaai_EV74.log
        ... function="dump_detessdequant_params"]-------sima_detessdequant_inst_0_tensor_0------
        ... function="dump_detessdequant_params"]Input width: 1
        ... function="dump_detessdequant_params"]Input height: 1
        ... function="dump_detessdequant_params"]Input depth/channels: 1000
        ... function="dump_detessdequant_params"]Slice width: 1
        ... function="dump_detessdequant_params"]Slice height: 1
        ... function="dump_detessdequant_params"]Slice depth/channels: 1000
        ... function="dump_detessdequant_params"]Dequant Scale : 265.849884
        ... function="dump_detessdequant_params"]Dequant ZeroPoint: -128
        ... function="dump_detessdequant_params"]Input data type: int8
        ... function="dump_detessdequant_params"]Fp16 Output Enabled: False
        ... function="dump_detessdequant_params"]Output Tensor Format: NHWC
        ... function="dump_detessdequant_params"]Input Tensor Size: 1008
        ... function="dump_detessdequant_params"]Output Tensor Size: 4000
        ... function="dump_detessdequant_params"]Debug level : 0


The GStreamer string update
===========================

Let us update the previous ``run_pipeline.sh`` script to include our new plugin.

.. code-block:: bash

    #!/bin/bash

    # Constants #
    APP_DIR=/home/sima/resnet50_example_app
    DATA_DIR="${APP_DIR}/data"
    SIMA_PLUGINS_DIR="${APP_DIR}/../gst-plugins"
    SAMPLE_IMAGE_SRC="${DATA_DIR}/golden_retriever_207_rgb.bin"
    CONFIGS_DIR="${APP_DIR}/app_configs"
    PREPROC_CVU_CONFIG_BIN="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_app"
    PREPROC_CVU_CONFIG_JSON="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_params.json"
    INFERENCE_MLA_CONFIG_JSON="${CONFIGS_DIR}/simaaiprocessmla_cfg_params.json"
    DETESSDEQUANT_CVU_CONFIG_BIN="${CONFIGS_DIR}/detessdequant_201_cvu_cfg_app"
    DETESSDEQUANT_CVU_CONFIG_JSON="${CONFIGS_DIR}/detessdequant_201_cvu_cfg_params.json"

    # Remove any existing temporary files before running
    rm /tmp/generic_preproc*.out /tmp/mla-*.out /tmp/detess-dequant*.out

    # Run the configuration apps for generic_preproc and detessdequant
    $PREPROC_CVU_CONFIG_BIN $PREPROC_CVU_CONFIG_JSON
    $DETESSDEQUANT_CVU_CONFIG_BIN $DETESSDEQUANT_CVU_CONFIG_JSON

    # Run the application
    export LD_LIBRARY_PATH="${SIMA_PLUGINS_DIR}"
    gst-launch-1.0 -v --gst-plugin-path="${SIMA_PLUGINS_DIR}" \
    simaaisrc mem-target=1 node-name="my_image_src" location="${SAMPLE_IMAGE_SRC}" num-buffers=1 ! \
    simaaiprocesscvu source-node-name="my_image_src" buffers-list="my_image_src" config="${PREPROC_CVU_CONFIG_JSON}" name="generic_preproc" ! \
    simaaiprocessmla config="${INFERENCE_MLA_CONFIG_JSON}" ! \
    simaaiprocesscvu source-node-name="mla-resnet" buffers-list="mla-resnet" config="${DETESSDEQUANT_CVU_CONFIG_JSON}" name="detessdequant" ! \
    fakesink

To run the application:

.. code-block:: console

    davinci:~/resnet50_example_app$ sudo sh run_pipeline.sh 
        rm: cannot remove '/tmp/detess-dequant*.out': No such file or directory
        Completed SIMA_GENERIC_PREPROC graph configure 

        Completed Graph Configure 
        ** Message: 23:09:28.934: Num of chunks 1
        ** Message: 23:09:28.934: Buffer_name: my_image_src, num_of_chunks:1

        (gst-launch-1.0:5856): GLib-GObject-CRITICAL **: 23:09:28.944: g_pointer_type_register_static: assertion 'g_type_from_name (name) == 0' failed

        (gst-launch-1.0:5856): GLib-GObject-CRITICAL **: 23:09:28.944: g_type_set_qdata: assertion 'node != NULL' failed
        ** Message: 23:09:28.945: Num of chunks 1
        ** Message: 23:09:28.946: Buffer_name: mla-resnet, num_of_chunks:1
        Setting pipeline to PAUSED ...
        ** Message: 23:09:28.952: Initialize dispatcher
        ** Message: 23:09:28.954: handle: 0x7f9195b0, 0xffff7f9195b0
        ** Message: 23:09:29.629: Loaded model from location /home/sima/resnet50_example_app/models/quantized_resnet50_stage1_mla.lm, model:hdl: 0xaaaacca728f0
        ** Message: 23:09:29.634: Filename memalloc = /home/sima/resnet50_example_app/data/golden_retriever_207_rgb.bin
        Pipeline is PREROLLING ...
        Pipeline is PREROLLED ...
        Setting pipeline to PLAYING ...
        Redistribute latency...
        New clock: GstSystemClock
        Got EOS from element "pipeline0".
        Execution ended after 0:00:00.001388186
        Setting pipeline to NULL ...
        Freeing pipeline ...

The output dump of the ``simaaiprocesscvu`` plugin is written to ``/tmp/detess-dequant-001.out``. We use this file for verification in the next step.

.. tip:: 

    Notice that we set the ``dump_data`` parameter in the ``detessdequant_201_cvu_cfg_params.json`` file to ``1`` in order to dump the output of the ``simaaiprocesscvu`` plugin.

Verifying the output
====================

To verify the output against a known reference, we compare our output with the original fp32 ResNet50 output probabilities.

On the MLSoC:

.. code-block:: console

    davinci:~/resnet50_example_app$ vi print_mla_probabilities.py

Write the following inside the script:

.. code-block:: python

    import numpy as np

    # Step 1: Read the binary file as float32 values
    probabilities_data = np.fromfile('/tmp/detess-dequant-001.out', dtype=np.float32)

    # Step 2: Find the indices of the top 3 largest values
    top_3_indices = np.argpartition(probabilities_data, -3)[-3:]

    # Step 3: Sort the top 3 indices by the actual values (descending)
    top_3_indices = top_3_indices[np.argsort(-probabilities_data[top_3_indices])]

    # Print the results
    print("Top 3 largest values and their indices:")
    for idx in top_3_indices:
        print(f"Index: {idx}, Value: {probabilities_data[idx]}")

Run the Python script:

.. code-block:: console

    davinci:~/resnet50_example_app$ python3 print_mla_probabilities.py 
        Top 3 largest values and their indices:
        Index: 207, Value: 0.959187924861908
        Index: 208, Value: 0.0037615213077515364
        Index: 216, Value: 0.0037615213077515364

The top class (index 207) is the one we expect, with high probability, as seen in the ModelSDK quantization tests.

.. note::

    If you wish, you could write a similar program to compare the output against the ``golden_retriever_207_inference_output_probabilities.bin`` file, which is created for debugging purposes.
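
The comparison suggested in the note above could look like the following minimal sketch. It assumes that the reference file holds raw float32 values in the same order as the plugin dump, and the reference path used here is hypothetical, so adjust it to wherever the file lives on your board. ``compare_outputs`` is an illustrative helper, not part of any SiMa.ai tool.

.. code-block:: python

    from pathlib import Path

    import numpy as np

    def compare_outputs(actual, reference):
        """Return (top-1 classes match, largest absolute difference)."""
        if actual.shape != reference.shape:
            raise ValueError(f"shape mismatch: {actual.shape} vs {reference.shape}")
        top1_match = int(np.argmax(actual)) == int(np.argmax(reference))
        max_abs_diff = float(np.max(np.abs(actual - reference)))
        return top1_match, max_abs_diff

    # Dump written by the simaaiprocesscvu plugin in this section.
    actual_path = Path("/tmp/detess-dequant-001.out")
    # Hypothetical location of the fp32 reference file.
    reference_path = Path(
        "/home/sima/resnet50_example_app/data/"
        "golden_retriever_207_inference_output_probabilities.bin"
    )

    if actual_path.exists() and reference_path.exists():
        # Assumption: both files contain raw float32 values.
        actual = np.fromfile(actual_path, dtype=np.float32)
        reference = np.fromfile(reference_path, dtype=np.float32)
        match, diff = compare_outputs(actual, reference)
        print(f"Top-1 match: {match}, max abs difference: {diff}")
    else:
        print("Input files not found; nothing to compare.")

Because the pipeline output comes from a quantized model, a small nonzero difference is expected. What counts as acceptable depends on your application.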
 

Conclusion and next steps
=========================

In this section, we: 

    * Went through the steps necessary to choose the CVU graph we want to run, create its configuration application, and copy it to the MLSoC
    * Went through how to set the JSON configuration file for the CVU graph we are running, along with a description of each parameter value
    * Ran the pipeline and verified the output using the plugin dump, as configured in the JSON, and the output from our Python reference application

Next we will complete our ResNet50 application by writing our own custom plugin for the final step of the pipeline.