Get Model FPS
The Palette software enables users to determine the frames-per-second (FPS) KPIs of a machine learning (ML) model using a mode called the Accelerator Mode.
Specifically, a Python script, network_eval.py, is provided to generate KPIs for a given model.
To use the Accelerator Mode, first compile the ML model to obtain a single .elf file, which is packaged inside the compiled tar.gz file.
Setup Tool
Unzip get_fps.zip to a local directory and move the unzipped get_fps folder under your workspace directory:
sima-user@sima-user-machine:~$ cd ~/Downloads
sima-user@sima-user-machine:~/Downloads$ unzip get_fps.zip
sima-user@sima-user-machine:~/Downloads$ mv get_fps ~/workspace/
Go to the directory /home/docker/sima-cli/get_fps/ within the SDK container:
sima-user@docker-image-id:/home# cd /home/docker/sima-cli/get_fps
sima-user@docker-image-id:/home/docker/sima-cli/get_fps$ ls
apis network_eval utils
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# chown <YOUR_USERNAME> ../get_fps
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo apt-get update
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo apt-get install sshpass
Download Model
Download the original .onnx file using wget:
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# wget -O models/resnet50-v1-7_fp32_224_224.onnx https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
--2024-03-15 11:33:09-- https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
Resolving github.com (github.com)... 140.82.121.3, ::ffff:140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx [following]
--2024-03-15 11:33:10-- https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 102583340 (98M) [application/octet-stream]
Saving to: ‘models/resnet50-v1-7_fp32_224_224.onnx’
models/resnet50-v1-7_fp32_224_22 100%[==========================================================>] 97.83M 8.35MB/s in 12s
2024-03-15 11:33:25 (8.39 MB/s) - ‘models/resnet50-v1-7_fp32_224_224.onnx’ saved [102583340/102583340]
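Before compiling, it can be worth confirming that the download finished. The following is a minimal sketch, not part of the get_fps tool: it compares the size of the file on disk with the byte count that wget reported in the log above (102583340). The helper name download_looks_complete is hypothetical, and the expected size is only valid as long as the upstream file is unchanged.

```python
from pathlib import Path

# Byte count taken from the wget log in this section ("Length: 102583340").
# Assumption: the upstream file has not changed since that log was captured.
EXPECTED_BYTES = 102_583_340


def download_looks_complete(path, expected_bytes=EXPECTED_BYTES):
    """Return True if the file exists and has exactly the expected size."""
    p = Path(path)
    return p.is_file() and p.stat().st_size == expected_bytes


if __name__ == "__main__":
    model = "models/resnet50-v1-7_fp32_224_224.onnx"
    print(model, "complete:", download_looks_complete(model))
```

A size match does not prove the content is intact; it only catches truncated or failed downloads.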
Compile Model
Compile the downloaded model using the script models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py:
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py
Model SDK version: 1.7.0
{'model_path': 'models/resnet50-v1-7_fp32_224_224.onnx', 'shape_dict': {'data': [1, 3, 224, 224]}, 'dtype_dict': {'data': <ScalarType.float32: 6>}}
2024-03-15 11:23:13,725 - autotvm - WARNING - One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
Running calibration ...DONE
2024-03-15 11:23:56,736 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_relu_0, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the bias term.
...
2024-03-15 11:23:56,874 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_35, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the zero point.
Running quantization ...DONE
Max absolute error between outputs of loaded net and quantized net = 0.6192820072174072
Note
Target Type:
gen1: The default option. If you do not specify --target, the model is compiled for the MLSoC target.
gen2: Pass --target gen2 to compile for the Modalix target.
Example:
python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py --target gen2
Execute
Untar the result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz file:
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# tar zxvf result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz
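To see which files the archive will produce without extracting it, the member names can be listed from Python. This is a minimal sketch using only the standard tarfile module; the helper name find_members is hypothetical, and the assumption that the archive holds the .elf and .json files used later in this section comes from the commands in this guide, not from inspecting the archive format itself.

```python
import os
import tarfile


def find_members(archive_path, suffixes=(".elf", ".json")):
    """Return the names of archive members ending with one of the suffixes."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return [name for name in tar.getnames() if name.endswith(tuple(suffixes))]


if __name__ == "__main__":
    archive = ("result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/"
               "mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz")
    if os.path.exists(archive):
        for name in find_members(archive):
            print(name)
    else:
        print("archive not found:", archive)
```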
See the various command-line arguments to network_eval.py using the -h option:
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py -h
usage: network_eval.py [-h] --model_file_path MODEL_FILE_PATH --mpk_json_path MPK_JSON_PATH --dv_host DV_HOST [--dv_port DV_PORT] [--dv_user DV_USER] --image_size IMAGE_SIZE [IMAGE_SIZE ...] [-v] [--bypass_tunnel] [--layer_stats_path LAYER_STATS_PATH] [--max_frames MAX_FRAMES] [--batch_size BATCH_SIZE]

Emit FPS KPIs for networks that run on the MLA and/or A65

options:
  -h, --help            show this help message and exit
  --model_file_path MODEL_FILE_PATH
                        Path to .elf or .tar.gz file
  --mpk_json_path MPK_JSON_PATH
                        Path to MPK JSON file
  --dv_host DV_HOST     DevKit IP Address / FQDN
  --dv_port DV_PORT     DevKit port on which the mla_rt_service is running
  --dv_user DV_USER     DevKit ssh username
  --image_size IMAGE_SIZE [IMAGE_SIZE ...]
                        RGB image size specified as: H W C
  -v, --verbose         increase output verbosity
  --bypass_tunnel       set to bypass ssh tunnel
  --layer_stats_path LAYER_STATS_PATH
                        Path to layer stats YAML file
  --max_frames MAX_FRAMES
                        Max number of frames to run
  --batch_size BATCH_SIZE
                        Batch size - default 1
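When the script is launched from automation rather than typed by hand, it can help to assemble the argument list in one place. The sketch below only uses flags that appear in the help output above. The function name build_network_eval_cmd is hypothetical, and it builds the command without running it.

```python
import shlex


def build_network_eval_cmd(model_file_path, mpk_json_path, dv_host, image_size,
                           dv_port=None, layer_stats_path=None, verbose=False):
    """Assemble an argv list for network_eval.py from flags in its -h output."""
    if len(image_size) != 3:
        raise ValueError("image_size must be given as (H, W, C)")
    cmd = [
        "python", "network_eval/network_eval.py",
        "--model_file_path", model_file_path,
        "--mpk_json_path", mpk_json_path,
        "--dv_host", dv_host,
        "--image_size", *[str(v) for v in image_size],
    ]
    # Optional flags are appended only when the caller supplies them.
    if dv_port is not None:
        cmd += ["--dv_port", str(dv_port)]
    if layer_stats_path is not None:
        cmd += ["--layer_stats_path", layer_stats_path]
    if verbose:
        cmd.append("-v")
    return cmd


if __name__ == "__main__":
    cmd = build_network_eval_cmd(
        "resnet50-v1-7_fp32_224_224_stage1_mla.elf",
        "resnet50-v1-7_fp32_224_224_mpk.json",
        "10.42.0.240",  # example DevKit address; replace with your own
        (224, 224, 3),
        dv_port=8000,
        verbose=True,
    )
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting issues with paths.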
Run the network_eval.py script, providing the paths of the .elf and .json files:
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <devkit IP address> --dv_port 8000 --image_size 224 224 3 -v
Running model in MLA-only mode
Creating the Forwarding from host
The authenticity of host '10.42.0.240 (10.42.0.240)' can't be established.
ECDSA key fingerprint is SHA256:zfEgZ7NPK5uE3WkrPjx9VsoVnsGvIHoav/prFVMLuSQ.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
sima@10.42.0.240's password:
Copying the model files to DevKit
sima@10.42.0.240's password:
FPS = 1036
FPS = 1036
FPS = 1036
FPS = 1036
FPS = 1036
FPS = 1037
Once started, the network_eval.py script runs until it is interrupted. Press Ctrl+C to stop the execution.
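Because the script keeps printing FPS readings until it is stopped, it can be convenient to save the console output and summarize it afterwards. The following is a minimal sketch that assumes each reading has the form "FPS = <number>", as in the run shown in this section. The helper name parse_fps is hypothetical and not part of network_eval.py.

```python
import re
from statistics import mean

# Matches readings such as "FPS = 1036"; decimal values are also accepted.
FPS_LINE = re.compile(r"FPS\s*=\s*(\d+(?:\.\d+)?)")


def parse_fps(text):
    """Return every FPS reading found in the captured console output."""
    return [float(value) for value in FPS_LINE.findall(text)]


if __name__ == "__main__":
    # Readings copied from the example run in this section.
    captured = "\n".join(["FPS = 1036"] * 5 + ["FPS = 1037"])
    readings = parse_fps(captured)
    print("samples:", len(readings))
    print("min:", min(readings), "max:", max(readings))
    print("mean:", round(mean(readings), 2))
```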
Verify
By specifying the --layer_stats_path option and passing the *_stats.yaml file corresponding to the .elf file, you can request that the network_eval.py script instead produce a new output.yaml file in which the total execution time of each layer is recorded.
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <devkit IP address> --image_size 224 224 3 \
    --layer_stats_path resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
After running this command, the new YAML file is created in the current directory with the suffix *_output.yaml. In this example, the file is named resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml, and the record for layer MLA_0/conv2d_add_relu_7 now contains a single value that represents the total execution time of the layer, as shown below:
sima-user@docker-image-id:/home/docker/sima-cli/get_fps# ls
apis models network_eval network_eval.log resnet50-v1-7_fp32_224_224_mpk.json resnet50-v1-7_fp32_224_224_stage1_mla.elf resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml result utils
...
8:
  "name: ": MLA_0/conv2d_add_relu_7
  "run_time: ": 9.27us
...
The total execution time of the MLA_0/conv2d_add_relu_7 layer is 9.27 microseconds (us). The run_time value represents the amount of time the layer took to execute on the MLA, in microseconds. This time accounts for compute and memory cycles for this layer only.
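To compare layers or add up their times, the run_time values can be pulled out of the output file as numbers. The sketch below works on the raw text with a regular expression rather than a YAML parser, because the exact structure of the file beyond the excerpt above is not documented here. It assumes every value is written with a "us" suffix, as in the excerpt, and the helper name run_times_us is hypothetical.

```python
import os
import re

# Finds a number followed by "us" after each "run_time" key, for example
# the value 9.27 in the excerpt from this section.
RUN_TIME = re.compile(r"run_time[^0-9]*([0-9]+(?:\.[0-9]+)?)\s*us")


def run_times_us(text):
    """Return all run_time values found in the text, in microseconds."""
    return [float(value) for value in RUN_TIME.findall(text)]


if __name__ == "__main__":
    path = "resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml"
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            times = run_times_us(handle.read())
        print("layers with a run_time:", len(times))
        print("sum of run_time values (us):", round(sum(times), 2))
    else:
        print("file not found:", path)
```

The sum covers only the per-layer MLA time recorded in the file, so it should not be read as the end-to-end frame time.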