Get Model FPS
The Palette software lets users measure the frames-per-second (FPS) KPIs of a machine learning (ML) model using a mode called Accelerator Mode.
Specifically, a Python script, network_eval.py, is provided to generate KPIs for a given model.
To use Accelerator Mode, first compile the ML model to obtain a single .elf file inside the compiled tar.gz file.
Note
This example uses the ResNet50 classification model, created by Microsoft. The model is distributed under the Apache 2.0 License. Follow the same licensing guidelines when using this example.
Setup Tool
Unzip get_fps.zip in a local directory and move the unzipped folder get_fps under your workspace directory:
sima-user@sima-user-machine:~$ cd ~/Downloads
sima-user@sima-user-machine:~/Downloads$ unzip get_fps.zip
sima-user@sima-user-machine:~/Downloads$ mv get_fps ~/workspace/
Access the Palette ModelSDK container:
sima-user@sima-user-machine:~$ sima-cli sdk model
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli$
Go to the directory /home/docker/sima-cli/get_fps/ within the SDK container:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli$ cd get_fps
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps$ ls
apis models network_eval utils
Download Model
Download the original .onnx file using wget:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# wget -O models/resnet50-v1-7_fp32_224_224.onnx https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
--2024-03-15 11:33:09-- https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
Resolving github.com (github.com)... 140.82.121.3, ::ffff:140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx [following]
--2024-03-15 11:33:10-- https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 102583340 (98M) [application/octet-stream]
Saving to: ‘models/resnet50-v1-7_fp32_224_224.onnx’
models/resnet50-v1-7_fp32_224_22 100%[==========================================================>] 97.83M 8.35MB/s in 12s
2024-03-15 11:33:25 (8.39 MB/s) - ‘models/resnet50-v1-7_fp32_224_224.onnx’ saved [102583340/102583340]
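Before compiling, it can be worth confirming that the download finished. The sketch below compares the size of the saved file with the byte count that wget reported in the log above (Length: 102583340). The function name download_is_complete is hypothetical and not part of the Palette tools, and the expected size is only valid as long as the upstream file is unchanged.

```python
from pathlib import Path

# Byte count taken from the wget log in this example ("Length: 102583340 (98M)").
# Assumption: the upstream file has not changed since that log was captured.
EXPECTED_SIZE = 102583340


def download_is_complete(path, expected_size=EXPECTED_SIZE):
    """Return True if the file exists and has exactly the expected size in bytes."""
    p = Path(path)
    return p.is_file() and p.stat().st_size == expected_size


if __name__ == "__main__":
    model = "models/resnet50-v1-7_fp32_224_224.onnx"
    status = "OK" if download_is_complete(model) else "missing or incomplete"
    print(f"{model}: {status}")
```

A size check only catches truncated downloads; it does not verify the file contents.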
Compile Model
Compile the downloaded model using the script models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py --target gen2
Model SDK version: 2.0.0
{'model_path': 'models/resnet50-v1-7_fp32_224_224.onnx', 'shape_dict': {'data': [1, 3, 224, 224]}, 'dtype_dict': {'data': <ScalarType.float32: 6>}}
2024-03-15 11:23:13,725 - autotvm - WARNING - One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
Running calibration ...DONE
2024-03-15 11:23:56,736 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_relu_0, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the bias term.
..
..
..
2024-03-15 11:23:56,874 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_35, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the zero point.
Running quantization ...DONE
Max absolute error between outputs of loaded net and quantized net = 0.6192820072174072
Note
Target Type:
gen2: The default option. Compiles for the Modalix target when --target is not specified.
gen1: Pass --target gen1 to compile for the MLSoC target.
Example:
python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py --target gen2
Execute
Untar the result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz file:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# tar zxvf result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz
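If you want to see which .elf and .json files the archive contains before or after extracting it, a short script using Python's standard tarfile module can list them. This is a minimal sketch: find_model_files is a hypothetical helper, not part of the Palette tools, and it only inspects member names.

```python
import tarfile
from pathlib import Path


def find_model_files(archive_path):
    """Return (elf_names, json_names) for the members of a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
    elfs = [n for n in names if n.endswith(".elf")]
    jsons = [n for n in names if n.endswith(".json")]
    return elfs, jsons


if __name__ == "__main__":
    # Archive path from this example; the listing is skipped if it is absent.
    archive = Path(
        "result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/"
        "resnet50-v1-7_fp32_224_224_mpk.tar.gz"
    )
    if archive.exists():
        elfs, jsons = find_model_files(archive)
        print("ELF files: ", elfs)
        print("JSON files:", jsons)
```

The names it prints are the values to pass to --model_file_path and --mpk_json_path in the steps that follow.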
See the various command-line arguments to network_eval.py using the -h option:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# python3 network_eval/network_eval.py -h
usage: network_eval.py [-h] --model_file_path MODEL_FILE_PATH --mpk_json_path MPK_JSON_PATH --dv_host DV_HOST [--dv_port DV_PORT] [--dv_user DV_USER] --image_size IMAGE_SIZE [IMAGE_SIZE ...] [-v] [--bypass_tunnel] [--layer_stats_path LAYER_STATS_PATH] [--max_frames MAX_FRAMES] [--batch_size BATCH_SIZE]

Emit FPS KPIs for networks that run on the MLA and/or A65

options:
  -h, --help            show this help message and exit
  --model_file_path MODEL_FILE_PATH
                        Path to .elf or .tar.gz file
  --mpk_json_path MPK_JSON_PATH
                        Path to MPK JSON file
  --dv_host DV_HOST     DevKit IP Address / FQDN
  --dv_port DV_PORT     DevKit port on which the mla_rt_service is running
  --dv_user DV_USER     DevKit ssh username
  --image_size IMAGE_SIZE [IMAGE_SIZE ...]
                        RGB image size specified as: H W C
  -v, --verbose         increase output verbosity
  --bypass_tunnel       set to bypass ssh tunnel
  --layer_stats_path LAYER_STATS_PATH
                        Path to layer stats YAML file
  --max_frames MAX_FRAMES
                        Max number of frames to run
  --batch_size BATCH_SIZE
                        Batch size - default 1
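When running the evaluation repeatedly, it can help to assemble the command line in a script. The sketch below builds an argument list using only flags that appear in the help output above. The helper build_eval_command is hypothetical, the address 192.0.2.10 is a placeholder for your DevKit IP, and the exact effect of --max_frames is assumed from its help text ("Max number of frames to run") rather than demonstrated in this guide.

```python
import shlex


def build_eval_command(model_file, mpk_json, dv_host, image_size,
                       dv_port=None, max_frames=None, verbose=False):
    """Build the argument list for network_eval.py from the documented flags."""
    cmd = [
        "python3", "network_eval/network_eval.py",
        "--model_file_path", model_file,
        "--mpk_json_path", mpk_json,
        "--dv_host", dv_host,
        # --image_size takes H W C as separate arguments.
        "--image_size", *[str(v) for v in image_size],
    ]
    if dv_port is not None:
        cmd += ["--dv_port", str(dv_port)]
    if max_frames is not None:
        cmd += ["--max_frames", str(max_frames)]
    if verbose:
        cmd.append("-v")
    return cmd


if __name__ == "__main__":
    cmd = build_eval_command(
        "resnet50-v1-7_fp32_224_224_stage1_mla.elf",
        "resnet50-v1-7_fp32_224_224_mpk.json",
        "192.0.2.10",          # placeholder DevKit address
        (224, 224, 3),
        dv_port=8000,
        verbose=True,
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues.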
Run the network_eval.py script, providing the paths of the .elf and .json files:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# python3 network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <DevKitIP> --dv_port 8000 --image_size 224 224 3 -v
Running model in MLA-only mode
Creating the Forwarding from host
The authenticity of host '{DevKitIP} ({DevKitIP})' can't be established.
ECDSA key fingerprint is SHA256:zfEgZ7NPK5uE3WkrPjx9VsoVnsGvIHoav/prFVMLuSQ.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
sima@{DevKitIP}'s password:
Copying the model files to DevKit
sima@{DevKitIP}'s password:
FPS = 1174
FPS = 1187
FPS = 1197
FPS = 1205
FPS = 1210
..
..
..
Once started, the network_eval.py script runs continuously. Press Ctrl+C to interrupt the execution.
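Because the script keeps printing FPS values until it is interrupted, it is often convenient to capture the output and summarize it afterwards. The following sketch parses lines of the form "FPS = <number>" as they appear in the run above and reports the mean and maximum. The helper names are hypothetical, and the per-frame time is simply the reciprocal of throughput, which is not necessarily the end-to-end latency of a single frame.

```python
import re
from statistics import mean

# Matches lines such as "FPS = 1174" in captured network_eval.py output.
FPS_LINE = re.compile(r"FPS\s*=\s*(\d+(?:\.\d+)?)")


def parse_fps(text):
    """Extract every FPS reading from the captured output, in order."""
    return [float(m.group(1)) for m in FPS_LINE.finditer(text)]


def summarize(samples):
    """Return basic statistics for a non-empty list of FPS readings."""
    avg = mean(samples)
    return {
        "samples": len(samples),
        "mean_fps": avg,
        "max_fps": max(samples),
        # Reciprocal of throughput, in milliseconds per frame.
        "ms_per_frame": 1000.0 / avg,
    }


if __name__ == "__main__":
    # The five readings shown in the example run.
    captured = "FPS = 1174\nFPS = 1187\nFPS = 1197\nFPS = 1205\nFPS = 1210\n"
    print(summarize(parse_fps(captured)))
```

For the five readings shown in this example, the mean is 1194.6 FPS. Early readings are lower than later ones, so you may prefer to discard the first few samples before averaging.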
Verify
By specifying the --layer_stats_path option and passing the *_stats.yaml file that corresponds to the .elf file, you can request that the network_eval.py script instead write a new *_output.yaml file in which the total execution time of each layer is computed.
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# python3 network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <DevKitIP> --image_size 224 224 3 \
    --layer_stats_path resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
After running this command, the new YAML file appears in the current directory with the suffix *_output.yaml. In this example, the file is named resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml:
sima-user@vdp-cli-modelsdk-2:/home/docker/sima-cli/get_fps# ls
apis models network_eval network_eval.log
resnet50-v1-7_fp32_224_224_mpk.json
resnet50-v1-7_fp32_224_224_stage1_mla.elf
resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml
resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
result utils
The record for layer MLA_0/conv2d_add_relu_7 now contains a single value that represents the total execution time in the layer:
...
8:
  "name: ": MLA_0/conv2d_add_relu_7
  "run_time: ": 9.27us
...
The total execution time of the MLA_0/conv2d_add_relu_7 layer is 9.27 microseconds. The run_time field represents the time the layer took to execute on the MLA, in microseconds. This time accounts for the compute and memory cycles of this layer only.
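To rank layers or total their run times, the output file can be post-processed with a short script. The sketch below is a regex-based reader that assumes every record looks like the excerpt above: a "name: " key immediately followed by a "run_time: " key whose value ends in "us". The real file may contain additional fields, in which case the pattern needs adjusting. The helper layer_run_times is hypothetical, and the second record in the sample is invented purely for illustration.

```python
import re

# Assumed record layout, based only on the excerpt shown in this guide:
#   "name: ": <layer name>
#   "run_time: ": <float>us
RECORD = re.compile(r'"name: ":\s*(\S+)\s+"run_time: ":\s*([0-9.]+)\s*us')


def layer_run_times(text):
    """Map each layer name to its run time in microseconds."""
    return {name: float(us) for name, us in RECORD.findall(text)}


if __name__ == "__main__":
    sample = '''
8:
  "name: ": MLA_0/conv2d_add_relu_7
  "run_time: ": 9.27us
9:
  "name: ": MLA_0/hypothetical_layer
  "run_time: ": 10.0us
'''
    times = layer_run_times(sample)
    # Print the slowest layers first, then the total.
    for name, us in sorted(times.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {us} us")
    print(f"total: {sum(times.values()):.2f} us")
```

To use it on a real run, read the contents of the *_stats_output.yaml file and pass the text to layer_run_times.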