ModelSDK - Compiling ML Models

The first step in developing any pipeline is to compile the ML models so that they can run on the MLA. To compile the ResNet50 model used in our example application, we will run the script below in the Palette Docker container, using the ModelSDK APIs.
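At its core, the flow is load, quantize, compile. The condensed sketch below shows only the ModelSDK calls involved; `calibration_iterable` is a placeholder for the preprocessed calibration data that the full script builds from a DataGenerator:

    from afe.apis.defines import QuantizationParams, quantization_scheme, CalibrationMethod
    from afe.apis.loaded_net import load_model
    from afe.ir.tensor_type import ScalarType
    from afe.load.importers.general_importer import onnx_source

    # Describe the model input, then load the ONNX model
    importer_params = onnx_source("resnet50_model.onnx",
                                  {"input": (1, 3, 224, 224)},
                                  {"input": ScalarType.float32})
    loaded_net = load_model(importer_params)

    # Quantize to INT8 using a calibration iterable (placeholder here)
    quant_configs = QuantizationParams(
        calibration_method=CalibrationMethod.from_str('min_max'),
        activation_quantization_scheme=quantization_scheme(asymmetric=True, per_channel=False, bits=8),
        weight_quantization_scheme=quantization_scheme(asymmetric=False, per_channel=True, bits=8))
    sdk_net = loaded_net.quantize(calibration_iterable, quant_configs,
                                  model_name="quantized_resnet50", arm_only=False)

    # Compile; the result is a tar.gz containing the LM file and MPK JSON
    sdk_net.compile(output_path="compiled_resnet50")

The full script adds the calibration-set preprocessing and a quick accuracy check on a few samples before compiling: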

resnet50_quant.py
#**************************************************************************
#||                        SiMa.ai CONFIDENTIAL                          ||
#||   Unpublished Copyright (c) 2023-2024 SiMa.ai, All Rights Reserved.  ||
#**************************************************************************
# NOTICE:  All information contained herein is, and remains the property of
# SiMa.ai. The intellectual and technical concepts contained herein are
# proprietary to SiMa and may be covered by U.S. and Foreign Patents,
# patents in process, and are protected by trade secret or copyright law.
#
# Dissemination of this information or reproduction of this material is
# strictly forbidden unless prior written permission is obtained from
# SiMa.ai.  Access to the source code contained herein is hereby forbidden
# to anyone except current SiMa.ai employees, managers or contractors who
# have executed Confidentiality and Non-disclosure agreements explicitly
# covering such access.
#
# The copyright notice above does not evidence any actual or intended
# publication or disclosure of this source code, which includes information
# that is confidential and/or proprietary, and is a trade secret, of SiMa.ai.
#
# ANY REPRODUCTION, MODIFICATION, DISTRIBUTION, PUBLIC PERFORMANCE, OR PUBLIC
# DISPLAY OF OR THROUGH USE OF THIS SOURCE CODE WITHOUT THE EXPRESS WRITTEN
# CONSENT OF SiMa.ai IS STRICTLY PROHIBITED, AND IN VIOLATION OF APPLICABLE
# LAWS AND INTERNATIONAL TREATIES. THE RECEIPT OR POSSESSION OF THIS SOURCE
# CODE AND/OR RELATED INFORMATION DOES NOT CONVEY OR IMPLY ANY RIGHTS TO
# REPRODUCE, DISCLOSE OR DISTRIBUTE ITS CONTENTS, OR TO MANUFACTURE, USE, OR
# SELL ANYTHING THAT IT MAY DESCRIBE, IN WHOLE OR IN PART.
#
#**************************************************************************

import cv2
import numpy as np
import pickle as pkl

from afe.apis.defines import QuantizationParams, quantization_scheme, CalibrationMethod
from afe.apis.loaded_net import load_model
from afe.apis.release_v1 import get_model_sdk_version
from afe.ir.tensor_type import ScalarType
from afe.load.importers.general_importer import onnx_source
from afe.core.utils import convert_data_generator_to_iterable

from typing import Dict
from pathlib import Path
from afe import DataGenerator

np.random.seed(9)

# Constants
ROOT_PATH = Path(__file__).parent.resolve()
MODEL_INPUT_NAME = "input"
MAX_DATA_SAMPLES = 50
MODELS_PATH = ROOT_PATH/"../../models"
DATA_PATH = ROOT_PATH/"../../data/"
MODEL_PATH = MODELS_PATH/"resnet50_model.onnx"
LABELS_PATH = DATA_PATH/"imagenet_labels.txt"
CALIBRATION_SET_PATH = DATA_PATH/"openimages_v7_images_and_labels.pkl"

# Dataset and preprocessing #
def create_imagenet_dataset(num_samples: int = 1) -> Dict[str, np.ndarray]:
    """
    Loads the calibration pickle and returns a dictionary with the structure
    { 'images': array of image arrays,
      'labels': array of labels }
    """
    with open(CALIBRATION_SET_PATH, 'rb') as f:
        dataset = pkl.load(f)

    images_and_labels = {'images': dataset['data'][:num_samples],
                         'labels': dataset['target'][:num_samples]}

    return images_and_labels

def preprocess(image, skip_transpose=True, input_shape: tuple = (224, 224), scale_factor: float = 255.0):
    mean = [0.485, 0.456, 0.406]
    stddev = [0.229, 0.224, 0.225]

    # val224 images come in CHW format, need to transpose to HWC format
    if not skip_transpose:
        image = image.transpose(1, 2, 0)

    # Resize, scale, normalize
    image = cv2.resize(image, input_shape)
    image = image / scale_factor
    image = (image - mean) / stddev

    return image

# Function to post-process the output
def postprocess_output(output: np.ndarray):
    probabilities = output[0][0]
    max_idx = np.argmax(probabilities)
    return max_idx, probabilities[max_idx]

# Get Model SDK version
sdk_version = get_model_sdk_version()
print(f"Model SDK version: {sdk_version}")

# Model information
input_name, input_shape, input_type = ("input", (1, 3, 224, 224), ScalarType.float32)
input_shapes_dict = {input_name: input_shape}
input_types_dict = {input_name: input_type}

# Load the ONNX model
importer_params = onnx_source(str(MODEL_PATH), input_shapes_dict, input_types_dict)
loaded_net = load_model(importer_params)

# Create the calibration dataset
images_and_labels = create_imagenet_dataset(num_samples=MAX_DATA_SAMPLES)

# Create a DataGenerator from it and map the preprocessing function
images_generator = DataGenerator({MODEL_INPUT_NAME: images_and_labels["images"]})
images_generator.map({MODEL_INPUT_NAME: preprocess})

# Set up the quantization parameters and quantize using min-max calibration and INT8
quant_configs: QuantizationParams = QuantizationParams(
    calibration_method=CalibrationMethod.from_str('min_max'),
    activation_quantization_scheme=quantization_scheme(asymmetric=True, per_channel=False, bits=8),
    weight_quantization_scheme=quantization_scheme(asymmetric=False, per_channel=True, bits=8))

sdk_net = loaded_net.quantize(convert_data_generator_to_iterable(images_generator),
                              quant_configs,
                              model_name="quantized_resnet50",
                              arm_only=False)

# Execute the quantized net with ImageNet samples
with open(LABELS_PATH, "r") as f:
    imagenet_labels = [line.strip() for line in f.readlines()]

for idx in range(6):
    sdk_net_output = sdk_net.execute(inputs={"input": images_generator[idx]["input"]})
    inference_label, inference_result = postprocess_output(sdk_net_output)
    reference_label = images_and_labels["labels"][idx]

    print(f"[{idx}] --> {imagenet_labels[inference_label]} / {reference_label} -> {inference_result:.2%}")

# Load image -> preprocess -> inference -> postprocess -> print; 207 is the expected label
print("Inference on a happy golden retriever (class 207) ..")
dog_image = cv2.imread(str(DATA_PATH/"golden_retriever_207.jpg"))
dog_image = cv2.cvtColor(dog_image, cv2.COLOR_BGR2RGB)
pp_dog_image = np.expand_dims(preprocess(dog_image), axis=0).astype(np.float32)
sdk_net_output = sdk_net.execute(inputs={"input": pp_dog_image})
inference_label, inference_result = postprocess_output(sdk_net_output)
print(f"[dog] --> {imagenet_labels[inference_label]} / 207 -> {inference_result:.2%}")

# Save the quantized model
sdk_net.save(model_name="quantized_resnet50", output_directory=str(MODELS_PATH))

# Compile the quantized net and generate the LM file and MPK JSON file
print("Compiling the model ..")
sdk_net.compile(output_path=str(MODELS_PATH/"compiled_resnet50"))

To compile the model, you will need the following directory structure:

sima-user@docker-image-id$ tree -L 3
.
├── data
│   ├── golden_retriever_207.jpg
│   ├── imagenet_labels.txt
│   └── openimages_v7_images_and_labels.pkl
├── models
│   ├── download_resnet50.py
│   └── resnet50_model.onnx
├── README.md
├── requirements.txt
└── src
    ├── modelsdk_quantize_model
    │   └── resnet50_quant.py
    └── x86_reference_app
        └── resnet50_reference_classification_app.py

Within the Palette Docker container, run the script:

sima-user@docker-image-id$ python src/modelsdk_quantize_model/resnet50_quant.py
    /usr/local/lib/python3.10/site-packages/tensorflow/python/keras/engine/training_arrays_v1.py:37: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.23.2)
    from scipy.sparse import issparse  # pylint: disable=g-import-not-at-top
    Model SDK version: 1.4.0
    Running calibration ...DONE
    ...
    Running quantization ...DONE
    [0] --> 817: 'sports car, sport car', / ['Clothing', 'Person', 'Car', 'Wheel'] -> 47.84%
    [1] --> 248: 'Eskimo dog, husky', / ['Dog'] -> 51.76%
    [2] --> 668: 'mosque', / ['Person'] -> 93.72%
    [3] --> 515: 'cowboy hat, ten-gallon hat', / ['Sun hat', 'Cowboy hat', 'Fedora', 'Clothing'] -> 98.82%
    [4] --> 113: 'snail', / ['Animal', 'Snail'] -> 98.82%
    [5] --> 517: 'crane', / ['Land vehicle'] -> 98.82%
    Inference on a happy golden retriever (class 207) ..
    [dog] --> 207: 'golden retriever', / 207 -> 98.82%
    Compiling the model ..

sima-user@docker-image-id$ tree -L 3
    .
    ├── data
    │   ├── golden_retriever_207.jpg
    │   ├── imagenet_labels.txt
    │   └── openimages_v7_images_and_labels.pkl
    ├── models
    │   ├── compiled_resnet50
    │   │   └── quantized_resnet50_mpk.tar.gz
    │   ├── download_resnet50.py
    │   ├── quantized_resnet50.sima
    │   ├── quantized_resnet50.sima.json
    │   └── resnet50_model.onnx
    ├── README.md
    ├── requirements.txt
    └── src
        ├── modelsdk_quantize_model
        │   └── resnet50_quant.py
        └── x86_reference_app
            └── resnet50_reference_classification_app.py

The output of a compiled model in the ModelSDK is a .tar.gz archive that contains the compiled .lm model, metadata in the form of an *_mpk.json file, and a stats file. Both the compiled .lm model and the *_mpk.json file will be used throughout this guide as you build the pipeline. For more information, please refer to the ModelSDK section.
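To inspect the archive's contents from Python, a minimal sketch using the standard tarfile module (the path assumes the output_path used in the script above):

    import tarfile

    # Path produced by sdk_net.compile() in the script above
    archive = "models/compiled_resnet50/quantized_resnet50_mpk.tar.gz"

    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            print(member.name)  # expect the .lm model, the *_mpk.json metadata and a stats file
        tar.extractall("models/compiled_resnet50/extracted")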

[Figure: ModelSDK output files]

Conclusion and next steps

In this section, we:

  • Reviewed how to take an existing ResNet50 ONNX model and load, quantize, test, and compile it using the ModelSDK.

  • Previewed how we will use the output *.tar.gz and its contents to develop an application and enable the runtime to run the model.

In the next section, we will review how to program the various HW IPs found on the MLSoC before diving into our development example.