Get Model FPS

The Palette software lets users measure the frames-per-second (FPS) KPIs of a machine learning (ML) model using a mode called Accelerator Mode. Specifically, a Python script, network_eval.py, is provided to generate KPIs for a given model.

To use Accelerator Mode, first compile the ML model to produce a single .elf file, which is packaged inside the compiled tar.gz file.

Note

This example uses the ResNet50 classification model, created by Microsoft. The model is released under the Apache 2.0 License; follow the same licensing terms when using this example.

Setup Tool

Download the Example

Unzip the example to a local directory and move the unzipped get_fps folder into your workspace directory:

sima-user@sima-user-machine:~$ cd ~/Downloads
sima-user@sima-user-machine:~/Downloads$ unzip get_fps.zip
sima-user@sima-user-machine:~/Downloads$ mv get_fps ~/workspace/

Access the Palette ModelSDK container:

sima-user@sima-user-machine:~$ sima-cli sdk model
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli$

Go to the directory /home/docker/sima-cli/get_fps/ within the SDK container.

sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli$ cd get_fps
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps$ ls
    apis  models  network_eval  utils

Download Model

Download the original .onnx file using wget:

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# wget -O models/resnet50-v1-7_fp32_224_224.onnx https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
    --2024-03-15 11:33:09--  https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
    Resolving github.com (github.com)... 140.82.121.3, ::ffff:140.82.121.3
    Connecting to github.com (github.com)|140.82.121.3|:443... connected.
    HTTP request sent, awaiting response... 302 Found
    Location: https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx [following]
    --2024-03-15 11:33:10--  https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
    Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
    Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.110.133|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 102583340 (98M) [application/octet-stream]
    Saving to: ‘models/resnet50-v1-7_fp32_224_224.onnx’

    models/resnet50-v1-7_fp32_224_22 100%[==========================================================>]  97.83M  8.35MB/s    in 12s

    2024-03-15 11:33:25 (8.39 MB/s) - ‘models/resnet50-v1-7_fp32_224_224.onnx’ saved [102583340/102583340]
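
Before compiling, it can be worth confirming that the download is complete. The following is a minimal sketch, not part of the Palette tooling: the helper name download_is_complete is hypothetical, and the expected byte count is simply the value wget reported in the log above, which may differ if the upstream file changes.

```python
from pathlib import Path

# Byte count reported by wget in the log above (assumption: the upstream
# file has not changed since that log was captured).
EXPECTED_BYTES = 102583340


def download_is_complete(path, expected_bytes=EXPECTED_BYTES):
    """Return True if the file exists and has exactly the expected size."""
    p = Path(path)
    return p.is_file() and p.stat().st_size == expected_bytes


if __name__ == "__main__":
    # Path used by the wget command in this example.
    model = "models/resnet50-v1-7_fp32_224_224.onnx"
    print("complete" if download_is_complete(model) else "missing or truncated")
```

A size match does not prove the contents are correct; it only catches a missing or truncated file.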

Compile Model

Compile the downloaded model using the script models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py:

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py --target gen2
    Model SDK version: 2.0.0
    {'model_path': 'models/resnet50-v1-7_fp32_224_224.onnx', 'shape_dict': {'data': [1, 3, 224, 224]}, 'dtype_dict': {'data': <ScalarType.float32: 6>}}
    2024-03-15 11:23:13,725 - autotvm - WARNING - One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
    Running calibration ...DONE
    2024-03-15 11:23:56,736 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_relu_0, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the bias term.
    .. .. ..
    2024-03-15 11:23:56,874 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_35, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the zero point.
    Running quantization ...DONE
    Max absolute error between outputs of loaded net and quantized net = 0.6192820072174072

Note

Target Type:

gen2: compiles for the Modalix target. This is the default if --target is not specified.
gen1: compiles for the MLSoC target. Pass --target gen1 to select it.

Example: python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py --target gen2

After compilation finishes, the artifacts are saved in the result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True directory.

Execute

Untar the result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz file:

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# tar zxvf result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz
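
To see what the archive contains before or after extracting it, the members can be listed with Python's standard tarfile module. This is a sketch under assumptions: the helper name list_artifacts is hypothetical, and the suffixes it looks for (.elf, _mpk.json, _stats.yaml) are taken from the file names that appear later in this example, not from a documented archive layout.

```python
import tarfile


def list_artifacts(archive_path):
    """Group archive member names by the suffixes used later in this example."""
    found = {".elf": [], "_mpk.json": [], "_stats.yaml": []}
    with tarfile.open(archive_path, "r:gz") as tar:
        for name in tar.getnames():
            for suffix in found:
                if name.endswith(suffix):
                    found[suffix].append(name)
    return found


if __name__ == "__main__":
    archive = ("result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/"
               "mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz")
    for suffix, names in list_artifacts(archive).items():
        print(suffix, names)
```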

See the various command-line arguments to network_eval.py using the -h option:

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# python3 network_eval/network_eval.py -h
    usage: network_eval.py [-h] --model_file_path MODEL_FILE_PATH --mpk_json_path MPK_JSON_PATH --dv_host DV_HOST [--dv_port DV_PORT]
               [--dv_user DV_USER] --image_size IMAGE_SIZE [IMAGE_SIZE ...] [-v] [--bypass_tunnel]
               [--layer_stats_path LAYER_STATS_PATH] [--max_frames MAX_FRAMES] [--batch_size BATCH_SIZE]

    Emit FPS KPIs for networks that run on the MLA and/or A65

    options:
    -h, --help            show this help message and exit
    --model_file_path MODEL_FILE_PATH
                            Path to .elf or .tar.gz file
    --mpk_json_path MPK_JSON_PATH
                            Path to MPK JSON file
    --dv_host DV_HOST     DevKit IP Address / FQDN
    --dv_port DV_PORT     DevKit port on which the mla_rt_service is running
    --dv_user DV_USER     DevKit ssh username
    --image_size IMAGE_SIZE [IMAGE_SIZE ...]
                            RGB image size specified as: H W C
    -v, --verbose         increase output verbosity
    --bypass_tunnel       set to bypass ssh tunnel
    --layer_stats_path LAYER_STATS_PATH
                            Path to layer stats YAML file
    --max_frames MAX_FRAMES
                            Max number of frames to run
    --batch_size BATCH_SIZE
                            Batch size - default 1
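
When the script is driven from another Python program, the argument list can be assembled from the options shown in the help output above. The sketch below is illustrative only: build_network_eval_cmd is a hypothetical helper, it uses only flags that appear in that help text, and it assumes the command is run from the get_fps directory.

```python
def build_network_eval_cmd(model_file, mpk_json, dv_host,
                           image_size=(224, 224, 3), dv_port=None,
                           layer_stats=None, max_frames=None, verbose=False):
    """Build an argv list for network_eval.py from the documented options."""
    cmd = [
        "python3", "network_eval/network_eval.py",
        "--model_file_path", model_file,
        "--mpk_json_path", mpk_json,
        "--dv_host", dv_host,
        # --image_size takes H W C as separate values.
        "--image_size", *[str(v) for v in image_size],
    ]
    # Optional flags are appended only when the caller sets them.
    if dv_port is not None:
        cmd += ["--dv_port", str(dv_port)]
    if layer_stats is not None:
        cmd += ["--layer_stats_path", layer_stats]
    if max_frames is not None:
        cmd += ["--max_frames", str(max_frames)]
    if verbose:
        cmd.append("-v")
    return cmd


if __name__ == "__main__":
    # "<DevKitIP>" is a placeholder; substitute the address of your DevKit.
    print(" ".join(build_network_eval_cmd(
        "resnet50-v1-7_fp32_224_224_stage1_mla.elf",
        "resnet50-v1-7_fp32_224_224_mpk.json",
        "<DevKitIP>", dv_port=8000, verbose=True)))
```

The resulting list can be passed to subprocess.run; building a list rather than a single string avoids shell quoting issues with paths.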

Run the network_eval.py script, providing the paths of the .elf and .json files:

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# python3 network_eval/network_eval.py \
--model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
--mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
--dv_host <DevKitIP> --dv_port 8000 --image_size 224 224 3 -v
    Running model in MLA-only mode
    Creating the Forwarding from host
    The authenticity of host '{DevKitIP} ({DevKitIP})' can't be established.
    ECDSA key fingerprint is SHA256:zfEgZ7NPK5uE3WkrPjx9VsoVnsGvIHoav/prFVMLuSQ.
    Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
    sima@{DevKitIP}'s password:
    Copying the model files to DevKit
    sima@{DevKitIP}'s password:
    FPS = 1174
    FPS = 1187
    FPS = 1197
    FPS = 1205
    FPS = 1210
    .. .. ..

Once started, the network_eval.py script keeps running and printing FPS values until it is interrupted. Press Ctrl+C to stop it.
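
Because the script prints a stream of FPS values rather than a single figure, it can help to capture the output and summarize it. The following is a minimal sketch, assuming the lines have the "FPS = <number>" form shown above; parse_fps and summarize_fps are hypothetical helper names, and the optional warmup parameter (for skipping early readings) is an illustrative choice rather than a Palette recommendation.

```python
import re
from statistics import mean

# Matches lines such as "    FPS = 1174" (format assumed from the log above).
FPS_LINE = re.compile(r"^\s*FPS\s*=\s*(\d+(?:\.\d+)?)\s*$")


def parse_fps(lines):
    """Extract FPS readings from captured network_eval.py output lines."""
    values = []
    for line in lines:
        match = FPS_LINE.match(line)
        if match:
            values.append(float(match.group(1)))
    return values


def summarize_fps(values, warmup=0):
    """Return count, mean, min, and max after skipping `warmup` readings."""
    steady = values[warmup:]
    if not steady:
        raise ValueError("no FPS readings left after warmup")
    return {"count": len(steady), "mean": mean(steady),
            "min": min(steady), "max": max(steady)}


if __name__ == "__main__":
    log = ["Running model in MLA-only mode", "    FPS = 1174", "    FPS = 1187",
           "    FPS = 1197", "    FPS = 1205", "    FPS = 1210"]
    print(summarize_fps(parse_fps(log)))
```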

Verify

By specifying the --layer_stats_path option and passing the *_stats.yaml file that corresponds to the .elf file, you can have the network_eval.py script instead produce a new *_output.yaml file that records the total execution time of each layer.

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# python3 network_eval/network_eval.py \
--model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
--mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
--dv_host <DevKitIP> --image_size 224 224 3 \
--layer_stats_path  resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml

After running this command, a new YAML file with the suffix _output.yaml is written to the current directory. In this example, the file is named resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml, and the record for layer MLA_0/conv2d_add_relu_7 now contains a single value representing the total execution time of that layer, as shown below:

sima-user@vdp-cli-modelsdk-2 :/home/docker/sima-cli/get_fps# ls
    apis
    models
    network_eval
    network_eval.log
    resnet50-v1-7_fp32_224_224_mpk.json
    resnet50-v1-7_fp32_224_224_stage1_mla.elf
    resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml
    resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
    result
    utils

...
8:
"name: ": MLA_0/conv2d_add_relu_7
"run_time: ": 9.27us
...

The total execution time of the MLA_0/conv2d_add_relu_7 layer is 9.27 microseconds. The run_time field is the amount of time the layer took to execute on the MLA, in microseconds. This time accounts for the compute and memory cycles of this layer only.
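
To compare layers, the per-layer run_time values can be pulled out of the output file and sorted. The sketch below is an assumption-laden example: it relies only on the name and run_time lines looking like the excerpt above (including the quoted keys and the "us" suffix), it does not use a YAML parser, and layer_run_times is a hypothetical helper name. The real file may contain other fields or units that this does not handle.

```python
import re

# Patterns are applied after double quotes are stripped from each line, so
# both '"name: ": X' (as in the excerpt) and 'name: X' are accepted.
NAME_RE = re.compile(r"^\s*name\s*:[\s:]*(\S+)\s*$")
RUN_RE = re.compile(r"^\s*run_time\s*:[\s:]*([0-9]*\.?[0-9]+)\s*us\s*$")


def layer_run_times(text):
    """Return (layer_name, microseconds) pairs found in the stats output text."""
    results = []
    current = None
    for raw in text.splitlines():
        line = raw.replace('"', "")
        match = NAME_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = RUN_RE.match(line)
        if match and current is not None:
            results.append((current, float(match.group(1))))
            current = None
    return results


if __name__ == "__main__":
    path = "resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml"
    with open(path) as fh:
        layers = layer_run_times(fh.read())
    # Print the five slowest layers first.
    for name, us in sorted(layers, key=lambda item: item[1], reverse=True)[:5]:
        print(f"{name}: {us} us")
```

Sorting by run_time is a quick way to spot the layers that dominate MLA execution time.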