.. _developing_gstreamer_app_modelsdk_compile:

ModelSDK - Compiling ML Models
##############################

The first step in developing any pipeline is to compile the ML models so that they can run on the MLA.
To compile the ResNet50 model used in our example application, we will run the script below in the Palette Docker container, using the :ref:`modelsdk` APIs.

.. code-block:: python
    :caption: resnet50_quant.py
    :linenos:

    #**************************************************************************
    #||                        SiMa.ai CONFIDENTIAL                          ||
    #||   Unpublished Copyright (c) 2023-2024 SiMa.ai, All Rights Reserved.  ||
    #**************************************************************************
    # NOTICE:  All information contained herein is, and remains the property of
    # SiMa.ai. The intellectual and technical concepts contained herein are
    # proprietary to SiMa and may be covered by U.S. and Foreign Patents,
    # patents in process, and are protected by trade secret or copyright law.
    #
    # Dissemination of this information or reproduction of this material is
    # strictly forbidden unless prior written permission is obtained from
    # SiMa.ai.  Access to the source code contained herein is hereby forbidden
    # to anyone except current SiMa.ai employees, managers or contractors who
    # have executed Confidentiality and Non-disclosure agreements explicitly
    # covering such access.
    #
    # The copyright notice above does not evidence any actual or intended
    # publication or disclosure  of  this source code, which includes information
    # that is confidential and/or proprietary, and is a trade secret, of SiMa.ai.
    #
    # ANY REPRODUCTION, MODIFICATION, DISTRIBUTION, PUBLIC PERFORMANCE, OR PUBLIC
    # DISPLAY OF OR THROUGH USE OF THIS SOURCE CODE WITHOUT THE EXPRESS WRITTEN
    # CONSENT OF SiMa.ai IS STRICTLY PROHIBITED, AND IN VIOLATION OF APPLICABLE
    # LAWS AND INTERNATIONAL TREATIES. THE RECEIPT OR POSSESSION OF THIS SOURCE
    # CODE AND/OR RELATED INFORMATION DOES NOT CONVEY OR IMPLY ANY RIGHTS TO
    # REPRODUCE, DISCLOSE OR DISTRIBUTE ITS CONTENTS, OR TO MANUFACTURE, USE, OR
    # SELL ANYTHING THAT IT  MAY DESCRIBE, IN WHOLE OR IN PART.
    #
    #**************************************************************************

    import cv2
    import numpy as np
    import pickle as pkl

    from afe.apis.defines import QuantizationParams, quantization_scheme, CalibrationMethod
    from afe.apis.loaded_net import load_model
    from afe.apis.release_v1 import get_model_sdk_version
    from afe.ir.tensor_type import ScalarType
    from afe.load.importers.general_importer import onnx_source
    from afe.core.utils import convert_data_generator_to_iterable

    from typing import Dict
    from pathlib import Path
    from afe import DataGenerator

    np.random.seed(9)

    # Constants
    ROOT_PATH = Path(__file__).parent.resolve()
    MODEL_INPUT_NAME = "input"
    MAX_DATA_SAMPLES = 50
    MODELS_PATH = ROOT_PATH/"../../models"
    DATA_PATH = ROOT_PATH/"../../data/"
    MODEL_PATH = MODELS_PATH/"resnet50_model.onnx"
    LABELS_PATH = DATA_PATH/"imagenet_labels.txt"
    CALIBRATION_SET_PATH = DATA_PATH/"openimages_v7_images_and_labels.pkl"

    # Dataset and preprocessing #
    def create_imagenet_dataset(num_samples: int = 1) -> Dict[str, DataGenerator]:
        """
        Loads the pickled calibration set and returns its first num_samples entries as
        { 'images': image arrays,
          'labels': matching labels }
        The images are wrapped in a DataGenerator later in the script.
        """
        dataset_path = CALIBRATION_SET_PATH

        with open(dataset_path, 'rb') as f:
            dataset = pkl.load(f)

        images_and_labels = {'images': dataset['data'][:num_samples], 
                            'labels': dataset['target'][:num_samples]}
        
        return images_and_labels

    def preprocess(image, skip_transpose=True, input_shape: tuple = (224, 224), scale_factor: float = 255.0):
        mean = [0.485, 0.456, 0.406]
        stddv = [0.229, 0.224, 0.225]
        
        # val224 images come in CHW format, need to transpose to HWC format
        if not skip_transpose:
            image = image.transpose(1, 2, 0)
        
        # Resize, scale to [0, 1], then normalize with the ImageNet mean and standard deviation
        image = cv2.resize(image, input_shape)
        image = image / scale_factor
        image = (image - mean) / stddv
        
        return image

    # Function to post-process the output
    def postprocess_output(output: np.ndarray):
        probabilities = output[0][0]
        max_idx = np.argmax(probabilities)
        return max_idx, probabilities[max_idx]

    # Get Model SDK version
    sdk_version = get_model_sdk_version()
    print(f"Model SDK version: {sdk_version}")

    # Model information
    input_name, input_shape, input_type = ("input", (1, 3, 224, 224), ScalarType.float32)
    input_shapes_dict = {input_name: input_shape}
    input_types_dict = {input_name: input_type}

    # Load the ONNX model
    importer_params = onnx_source(str(MODEL_PATH), input_shapes_dict, input_types_dict)
    loaded_net = load_model(importer_params)

    # Create the calibration dataset
    images_and_labels = create_imagenet_dataset(num_samples=MAX_DATA_SAMPLES)

    # Create a datagenerator from it and map the preprocessing function
    images_generator = DataGenerator({MODEL_INPUT_NAME: images_and_labels["images"]})
    images_generator.map({MODEL_INPUT_NAME: preprocess})

    # Set up the quantization parameters: min-max calibration with INT8 activations and weights
    quant_configs: QuantizationParams = QuantizationParams(calibration_method=CalibrationMethod.from_str('min_max'),
                                                        activation_quantization_scheme=quantization_scheme(asymmetric=True, per_channel=False, bits=8),
                                                        weight_quantization_scheme=quantization_scheme(asymmetric=False, per_channel=True, bits=8))

    sdk_net = loaded_net.quantize(convert_data_generator_to_iterable(images_generator),
                                quant_configs,
                                model_name="quantized_resnet50",
                                arm_only=False)

    # Execute the quantized net with ImageNet samples
    with open(LABELS_PATH, "r") as f:
        imagenet_labels = [line.strip() for line in f.readlines()]

    for idx in range(6):
        sdk_net_output = sdk_net.execute(inputs={"input": images_generator[idx]["input"]})
        inference_label, inference_result = postprocess_output(sdk_net_output)
        reference_label = images_and_labels["labels"][idx]
        
        print(f"[{idx}] --> {imagenet_labels[inference_label]} / {reference_label} -> {inference_result:.2%}")
        
    # Load image -> preprocess -> inference -> postprocess -> print ; 207 is the expected label
    print("Inference on a happy golden retriever (class 207)  ..")
    dog_image = cv2.imread(str(DATA_PATH/"golden_retriever_207.jpg"))
    dog_image = cv2.cvtColor(dog_image, cv2.COLOR_BGR2RGB)
    pp_dog_image = np.expand_dims(preprocess(dog_image), axis=0).astype(np.float32)
    sdk_net_output = sdk_net.execute(inputs={"input": pp_dog_image})
    inference_label, inference_result = postprocess_output(sdk_net_output)
    print(f"[{idx}] --> {imagenet_labels[inference_label]} / 207  -> {inference_result:.2%}")

    # Save model
    sdk_net.save(model_name="quantized_resnet50", output_directory=str(MODELS_PATH))

    # Compile the quantized net and generate LM file and MPK JSON file
    print("Compiling the model ..")
    sdk_net.compile(output_path=str(MODELS_PATH/"compiled_resnet50"))
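
The quantization settings used in the script (min-max calibration, asymmetric per-tensor 8-bit activations, and symmetric per-channel 8-bit weights) can be illustrated with plain NumPy. The following is a minimal sketch of the general technique, not the ModelSDK implementation. The helper names (``asymmetric_params``, ``quantize_asymmetric``, ``dequantize_asymmetric``, ``quantize_symmetric_per_channel``) are hypothetical, the data is random, and the rounding and clipping details used by the SDK may differ.

.. code-block:: python

    import numpy as np


    def asymmetric_params(calib, bits=8):
        """Min-max calibration: map the observed [min, max] range onto the integer grid."""
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        # Extend the range to include 0.0 so that zero is exactly representable.
        lo = min(float(np.min(calib)), 0.0)
        hi = max(float(np.max(calib)), 0.0)
        scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero data
        zero_point = int(round(qmin - lo / scale))
        return scale, zero_point


    def quantize_asymmetric(x, scale, zero_point, bits=8):
        """Per-tensor asymmetric quantization (the int8 cast assumes bits <= 8)."""
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        q = np.round(x / scale) + zero_point
        return np.clip(q, qmin, qmax).astype(np.int8)


    def dequantize_asymmetric(q, scale, zero_point):
        """Map quantized integers back to approximate float values."""
        return (q.astype(np.float32) - zero_point) * scale


    def quantize_symmetric_per_channel(weights, bits=8):
        """Symmetric quantization with one scale per output channel (axis 0)."""
        qmax = 2 ** (bits - 1) - 1
        max_abs = np.abs(weights).reshape(weights.shape[0], -1).max(axis=1)
        scales = np.where(max_abs > 0, max_abs / qmax, 1.0)
        # Reshape the scales so they broadcast over the remaining axes.
        shaped = scales.reshape(-1, *([1] * (weights.ndim - 1)))
        q = np.clip(np.round(weights / shaped), -qmax, qmax).astype(np.int8)
        return q, scales


    rng = np.random.default_rng(9)

    # Activations: one scale and one zero point for the whole tensor.
    activations = rng.uniform(-1.0, 3.0, size=1000).astype(np.float32)
    act_scale, act_zero_point = asymmetric_params(activations)
    restored = dequantize_asymmetric(
        quantize_asymmetric(activations, act_scale, act_zero_point),
        act_scale, act_zero_point)
    max_error = float(np.abs(restored - activations).max())

    # Weights: one scale per output channel, zero point fixed at 0.
    weights = rng.normal(size=(4, 3, 3, 3)).astype(np.float32)
    q_weights, weight_scales = quantize_symmetric_per_channel(weights)

    print(f"activation scale={act_scale:.5f}, zero point={act_zero_point}")
    print(f"max round-trip error={max_error:.5f}")
    print(f"per-channel weight scales={weight_scales}")

Per-channel scales let each output channel of a weight tensor use its own range, at the cost of storing one scale per channel. The script enables this for weights and keeps a single scale and zero point for activations.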

To compile the model, you will need the following directory structure:

.. code-block:: console

    sima-user@docker-image-id$ tree -L 3
    .
    ├── data
    │   ├── golden_retriever_207.jpg
    │   ├── imagenet_labels.txt
    │   └── openimages_v7_images_and_labels.pkl
    ├── models
    │   ├── download_resnet50.py
    │   └── resnet50_model.onnx
    ├── README.md
    ├── requirements.txt
    └── src
        ├── modelsdk_quantize_model
        │   └── resnet50_quant.py
        └── x86_reference_app
            └── resnet50_reference_classification_app.py
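
Before running the script, it can help to confirm that the input files are in place. The following is a small optional sketch and is not part of the ModelSDK. ``missing_inputs`` and ``REQUIRED_INPUTS`` are hypothetical names, and the relative paths are taken from the tree above. It assumes you run it from the project root.

.. code-block:: python

    from pathlib import Path

    # Input files that resnet50_quant.py reads, relative to the project root.
    REQUIRED_INPUTS = (
        "data/golden_retriever_207.jpg",
        "data/imagenet_labels.txt",
        "data/openimages_v7_images_and_labels.pkl",
        "models/resnet50_model.onnx",
    )


    def missing_inputs(project_root: Path) -> list[str]:
        """Return the required input files that are absent under project_root."""
        return [rel for rel in REQUIRED_INPUTS if not (project_root / rel).is_file()]


    if __name__ == "__main__":
        missing = missing_inputs(Path.cwd())
        if missing:
            print("Missing input files:")
            for rel in missing:
                print(f"  {rel}")
        else:
            print("All input files found.")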
    
Within the Palette Docker container, run the script:

.. code-block:: console

    sima-user@docker-image-id$ python src/modelsdk_quantize_model/resnet50_quant.py 
        /usr/local/lib/python3.10/site-packages/tensorflow/python/keras/engine/training_arrays_v1.py:37: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.23.2)
        from scipy.sparse import issparse  # pylint: disable=g-import-not-at-top
        Model SDK version: 1.4.0
        Running calibration ...DONE
        ...
        Running quantization ...DONE
        [0] --> 817: 'sports car, sport car', / ['Clothing', 'Person', 'Car', 'Wheel'] -> 47.84%
        [1] --> 248: 'Eskimo dog, husky', / ['Dog'] -> 51.76%
        [2] --> 668: 'mosque', / ['Person'] -> 93.72%
        [3] --> 515: 'cowboy hat, ten-gallon hat', / ['Sun hat', 'Cowboy hat', 'Fedora', 'Clothing'] -> 98.82%
        [4] --> 113: 'snail', / ['Animal', 'Snail'] -> 98.82%
        [5] --> 517: 'crane', / ['Land vehicle'] -> 98.82%
        Inference on a happy golden retriever (class 207)  ..
        [5] --> 207: 'golden retriever', / 207  -> 98.82%
        Compiling the model ..
    
    sima-user@docker-image-id$ tree -L 3
        .
        ├── data
        │   ├── golden_retriever_207.jpg
        │   ├── imagenet_labels.txt
        │   └── openimages_v7_images_and_labels.pkl
        ├── models
        │   ├── compiled_resnet50
        │   │   └── quantized_resnet50_mpk.tar.gz
        │   ├── download_resnet50.py
        │   ├── quantized_resnet50.sima
        │   ├── quantized_resnet50.sima.json
        │   └── resnet50_model.onnx
        ├── README.md
        ├── requirements.txt
        └── src
            ├── modelsdk_quantize_model
            │   └── resnet50_quant.py
            └── x86_reference_app
                └── resnet50_reference_classification_app.py

The output of compiling a model with the ModelSDK is a ``tar.gz`` archive that contains the compiled model (``.lm``), its metadata in the form of a ``*_mpk.json`` file, and a stats file.
Both the ``.lm`` compiled model and the ``*_mpk.json`` file will be used throughout this guide as you build the pipeline. For more information, please refer to the :ref:`modelsdk` section.

.. image:: media/modelsdk_output_files.jpg
    :align: center
    :scale: 55%

|
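
To check what the compiler produced, you can list the contents of the archive with Python's standard ``tarfile`` module. This is a minimal sketch: ``list_archive_members`` is a hypothetical helper, the archive path is taken from the directory listing above, and the exact member names inside the archive depend on the model and the SDK version.

.. code-block:: python

    import tarfile
    from pathlib import Path


    def list_archive_members(archive_path: Path) -> list[str]:
        """Return the names of the regular files stored in a .tar.gz archive."""
        with tarfile.open(archive_path, "r:gz") as archive:
            return [member.name for member in archive.getmembers() if member.isfile()]


    if __name__ == "__main__":
        # Path relative to the project root, as shown in the tree output.
        archive = Path("models/compiled_resnet50/quantized_resnet50_mpk.tar.gz")
        if archive.exists():
            for name in list_archive_members(archive):
                print(name)
        else:
            print(f"Archive not found: {archive}")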

Conclusion and next steps
=========================

In this section, we:

* Reviewed how to load, quantize, test, and compile an existing ResNet50 ONNX model using the :ref:`modelsdk`.
* Introduced the output ``*.tar.gz`` archive, whose contents we will use to develop an application and enable the runtime to run the model.

In the next section, we will review how to program the various hardware (HW) IPs found on the MLSoC before diving into our development example.