.. _developing_gstreamer_app_gstreamer_inference_mla:

Step 2: Run and verify the output of ``simaaiprocessmla`` MLA process
#####################################################################

.. image:: media/resnet50_application_simaaisrc_simaaiprocessmla.jpg
    :align: center
    :scale: 30%

|

In this section, we will use the MLA plugin to run the ML model. We have two options going forward:

#. We can run only the MLA plugin to check that it produces the right output. In that case, we use ``simaaisrc`` with the output ``/tmp/generic_preproc-001.out`` from the previous step as input and feed it directly into the ``simaaiprocessmla`` plugin.
#. We can expand the existing pipeline to include the MLA step. In this guide, we will go with this option.

Before running the ``simaaiprocessmla`` plugin to perform inference on the MLA, we need to configure the JSON file for the plugin and ensure that we have saved the model locally on the board.

Copy the model to the MLSoC
===========================

Copy the quantized and compiled model from the Palette docker on the host machine to the MLSoC:

.. code-block:: console

    sima-user@docker-image-id$ scp models/compiled_resnet50/quantized_resnet50_mpk.tar.gz sima@:/home/sima/resnet50_example_app/models/

From the MLSoC shell prompt, let's extract the contents:

.. code-block:: console

    davinci:~/resnet50_example_app/models$ tar xvf quantized_resnet50_mpk.tar.gz
    quantized_resnet50_stage1_mla.lm
    quantized_resnet50_mpk.json
    quantized_resnet50_stage1_mla_stats.yaml

Creating the JSON configuration file
====================================

On the MLSoC, create the JSON configuration in ``/home/sima/resnet50_example_app/app_configs``.

.. code-block:: console

    davinci:~/resnet50_example_app/app_configs$ ls

Run the following command:

.. code-block:: bash

    echo '{
        "version" : 0.1,
        "node_name" : "mla-resnet",
        "simaai__params" : {
            "params" : 15,
            "index" : 1,
            "cpu" : 4,
            "next_cpu" : 1,
            "out_sz" : 1008,
            "no_of_outbuf" : 1,
            "batch_size" : 1,
            "batch_sz_model" : 1,
            "in_tensor_sz": 0,
            "out_tensor_sz": 0,
            "ibufname" : "generic_preproc",
            "model_path" : "/home/sima/resnet50_example_app/models/quantized_resnet50_stage1_mla.lm",
            "debug" : 0,
            "dump_data" : 1
        }
    }' > simaaiprocessmla_cfg_params.json

The GStreamer string update
===========================

Let's update the previous ``run_pipeline.sh`` script to include our new plugin.

.. code-block:: bash

    #!/bin/bash

    # Constants
    APP_DIR=/home/sima/resnet50_example_app
    DATA_DIR="${APP_DIR}/data"
    SIMA_PLUGINS_DIR="${APP_DIR}/../gst-plugins"
    SAMPLE_IMAGE_SRC="${DATA_DIR}/golden_retriever_207_rgb.bin"
    CONFIGS_DIR="${APP_DIR}/app_configs"
    PREPROC_CVU_CONFIG_BIN="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_app"
    PREPROC_CVU_CONFIG_JSON="${CONFIGS_DIR}/genpreproc_200_cvu_cfg_params.json"
    INFERENCE_MLA_CONFIG_JSON="${CONFIGS_DIR}/simaaiprocessmla_cfg_params.json"

    # Remove any existing temporary files before running
    # (-f avoids an error when no such files exist yet)
    rm -f /tmp/generic_preproc*.out

    # Run the configuration app for generic_preproc
    "$PREPROC_CVU_CONFIG_BIN" "$PREPROC_CVU_CONFIG_JSON"

    # Run the application
    export LD_LIBRARY_PATH="${SIMA_PLUGINS_DIR}"
    gst-launch-1.0 -v --gst-plugin-path="${SIMA_PLUGINS_DIR}" \
    simaaisrc mem-target=1 node-name="my_image_src" location="${SAMPLE_IMAGE_SRC}" num-buffers=1 ! \
    simaaiprocesscvu source-node-name="my_image_src" buffers-list="my_image_src" config="$PREPROC_CVU_CONFIG_JSON" name="generic_preproc" ! \
    simaaiprocessmla config="${INFERENCE_MLA_CONFIG_JSON}" name="mla_inference" ! \
    fakesink

To run the application:

.. code:: console

    davinci:~/resnet50_example_app$ sudo sh run_pipeline.sh
    Password:
    Completed SIMA_GENERIC_PREPROC graph configure
    ** Message: 04:37:40.073: Num of chunks 1
    ** Message: 04:37:40.073: Buffer_name: my_image_src, num_of_chunks:1
    (gst-launch-1.0:2398): GLib-GObject-CRITICAL **: 04:37:40.084: g_pointer_type_register_static: assertion 'g_type_from_name (name) == 0' failed
    (gst-launch-1.0:2398): GLib-GObject-CRITICAL **: 04:37:40.085: g_type_set_qdata: assertion 'node != NULL' failed
    (gst-launch-1.0:2398): GLib-GObject-CRITICAL **: 04:37:40.085: g_pointer_type_register_static: assertion 'g_type_from_name (name) == 0' failed
    (gst-launch-1.0:2398): GLib-GObject-CRITICAL **: 04:37:40.086: g_type_set_qdata: assertion 'node != NULL' failed
    Setting pipeline to PAUSED ...
    ** Message: 04:37:40.093: Initialize dispatcher
    ** Message: 04:37:40.094: handle: 0xa3b295b0, 0xffffa3b295b0
    ** Message: 04:37:41.238: Loaded model from location /data/simaai/building_apps_palette/gstreamer/resnet50_example_app/models/quantized_resnet50_stage1_mla.lm, model:hdl: 0xaaaae079eaa0
    ** Message: 04:37:41.242: Filename memalloc = /data/simaai/building_apps_palette/gstreamer/resnet50_example_app/data/golden_retriever_207_rgb.bin
    Pipeline is PREROLLING ...
    Pipeline is PREROLLED ...
    Setting pipeline to PLAYING ...
    Redistribute latency...
    New clock: GstSystemClock
    Got EOS from element "pipeline0".
    Execution ended after 0:00:00.001474163
    Setting pipeline to NULL ...
    Freeing pipeline ...

You will see the output of the CVU preprocess and the MLA inference in the ``/tmp/`` folder:

.. code-block:: console

    davinci:~/resnet50_example_app$ ls /tmp/*.out
    generic_preproc-001.out mla-resnet-1.out

Verifying the output
====================

Just as the input of the MLA needs to be ``quantized`` and ``tesselated``, the output of the MLA is also ``quantized`` and ``tesselated``.
Thus, any reference we compare against must be in the same form, or we must ``dequantize`` and ``detesselate`` the output before verifying it.

#. Let's first take a look at the output from the plugin:

   .. code-block:: console

      davinci:~/resnet50_example_app$ hexdump -C /tmp/mla-resnet-1.out
      00000000  80 80 80 80 80 80 80 80  80 80 80 80 80 80 80 80  |................|
      *
      000000c0  80 80 80 80 80 80 80 80  80 80 80 80 80 80 80 7f  |................|
      000000d0  81 80 80 80 80 80 80 80  81 80 80 80 80 80 80 80  |................|
      000000e0  80 80 80 80 80 80 80 80  80 80 80 80 80 80 80 80  |................|
      *
      000003e0  80 80 80 80 80 80 80 80  00 00 00 00 00 00 00 00  |................|
      000003f0

As noted earlier, this output should be interpreted as ``1008`` values of type ``int8`` (two's complement) that represent the quantized output of the network's ``softmax`` function.

.. note::

    Why ``1008`` values and not ``1000`` as expected for the ResNet50 output? This is due to memory alignment requirements (``tesselation``) for the MLA. Once the output is ``detesselated``, it will again have ``1000`` values.

.. note::

    A ``*`` line in the ``hexdump`` output means that one or more lines identical to the preceding line have been omitted. To view the entire output, use the ``-v`` flag: ``hexdump -C -v /tmp/mla-resnet-1.out``

Using Python on the MLSoC, let's manually ``dequantize`` the output to see whether the top 3 results match our expectations:

.. code-block:: console

    davinci:~/resnet50_example_app$ vi print_mla_top_output_indices.py

Copy the following inside the script:

.. code-block:: python

    import numpy as np

    NUM_CLASSES = 1000  # ResNet50 classes; the remaining 8 values are padding

    # Step 1: Read the binary file as int8 values and drop the trailing padding
    mla_data = np.fromfile('/tmp/mla-resnet-1.out', dtype=np.int8)[:NUM_CLASSES]

    # Step 2: Dequantize (values from *_mpk.json for the output dequantization node)
    # Cast to float32 before subtracting the zero point to avoid int8 overflow
    dequantize_scale, dequantized_zero_point = 255.02200010497842, -128
    dequantized_data = (mla_data.astype(np.float32) - dequantized_zero_point) / dequantize_scale

    # Step 3: Sort all class indices by value, largest first, and keep the top 3
    # (a stable sort keeps tied classes in index order)
    top_3_indices = np.argsort(-dequantized_data, kind="stable")[:3]

    # Print the results
    print("Top 3 largest values and their indices:")
    for idx in top_3_indices:
        print(f"Index: {idx}")

.. note::

    The values in the script can be extracted from the ``*_mpk.json`` file inside the compiled ``.tar.gz``:

    * dequantize_scale = ``plugins[5] → config_params → params → channel_params[0][0]``
    * dequantized_zero_point = ``plugins[5] → config_params → params → channel_params[0][1]``

    ``mla_data`` keeps only the first ``1000`` values (``[:1000]``) in order to drop the 8 trailing zero bytes that are a result of ``tesselation``, visible in the last line of the ``hexdump`` output above.

Run the script to list the highest-valued classes and check whether the expected ``207`` class is the top class:

.. code-block:: console

    davinci:~/resnet50_example_app$ python3 print_mla_top_output_indices.py
    Top 3 largest values and their indices:
    Index: 207
    Index: 208
    Index: 216

Excellent, class ``207`` is the top result, which is what we expected.

Conclusion and next steps
=========================

In this section, we:

* Went through the steps of setting up the ``simaaiprocessmla`` plugin to run inference using a model that was compiled using the ModelSDK.
* Ran the pipeline and verified the plugin's dumped output (enabled with ``dump_data`` in the JSON configuration) using a Python reference script.

Next, we will add another CVU graph in order to ``detesselate`` and ``dequantize`` the output from the MLA.
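As an optional follow-up to the verification script in this section, the dequantization parameters do not have to be copied by hand. The sketch below reads them from the extracted ``*_mpk.json``, assuming the layout described in the note above (``plugins[5] → config_params → params → channel_params[0]``). The helper name ``load_dequant_params`` and its ``plugin_index`` argument are hypothetical, and the plugin index may well differ for other models.

.. code-block:: python

    import json


    def load_dequant_params(mpk_json_path, plugin_index=5):
        """Return (scale, zero_point) for the output dequantization node.

        Assumption: the JSON follows the layout given in the note above.
        plugin_index=5 is specific to this ResNet50 example.
        """
        with open(mpk_json_path, "r", encoding="utf-8") as f:
            mpk = json.load(f)

        # Walk plugins[plugin_index] -> config_params -> params -> channel_params[0]
        params = mpk["plugins"][plugin_index]["config_params"]["params"]
        scale, zero_point = params["channel_params"][0][:2]
        return float(scale), int(zero_point)

Called with the path of the extracted ``quantized_resnet50_mpk.json``, this should return the same scale and zero point that are hard-coded in ``print_mla_top_output_indices.py``.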