Get Model FPS

The Palette software lets users measure the frames-per-second (FPS) KPIs of a machine learning (ML) model using a mode called Accelerator Mode. Specifically, a Python script, network_eval.py, is provided to generate KPIs for a given model.

To use Accelerator Mode, first compile the ML model to produce a single .lm file, which is packaged inside the compiled tar.gz file. To obtain this .lm file, use one of the compilation scripts from our GitHub page.
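As a minimal sketch of how to confirm that a compiled archive contains the expected .lm file, the following Python snippet lists the .lm members of a tar.gz archive using the standard tarfile module. The helper name find_lm_members is an illustrative assumption and is not part of Palette.

```python
import tarfile


def find_lm_members(archive_path):
    """Return the names of all .lm files inside a .tar.gz archive.

    Hypothetical helper for illustration; not part of the Palette tooling.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".lm")
        ]
```

For example, calling find_lm_members("resnet50-v1-7_fp32_224_224_mpk.tar.gz") on the archive produced later in this guide would be expected to return a list with one entry, the .lm file that is passed to network_eval.py.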

Executing Model Files using Network Evaluation

Download the Example

  1. Unzip the downloaded get_fps.zip archive in a local directory and move the unzipped folder get_fps into your workspace directory:

    sima-user@sima-user-machine:~$ cd ~/Downloads
    sima-user@sima-user-machine:~/Downloads$ unzip get_fps.zip
    sima-user@sima-user-machine:~/Downloads$ mv get_fps ~/workspace/
    
  2. Go to the Network Eval directory /home/docker/sima-cli/get_fps/ within the SDK container, take ownership of it, and install sshpass:

    sima-user@docker-image-id:/home# cd /home/docker/sima-cli/get_fps
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps$ ls
        apis  network_eval  utils
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# chown <YOUR_USERNAME> ../get_fps
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo apt-get update
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo apt-get install sshpass
    
  3. Install packages defined in the requirements.txt file:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# pip3 install -r apis/requirements.txt
    
  4. Clone SiMa’s models repository from GitHub. The README.md of the repo lists the links to each original model file, including resnet50-v1-7.onnx, which is downloaded in the next step:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo chmod 777 ../get_fps/
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# git clone https://github.com/SiMa-ai/models.git
        Cloning into 'models'...
        remote: Enumerating objects: 988, done.
        remote: Counting objects: 100% (988/988), done.
        remote: Compressing objects: 100% (292/292), done.
        remote: Total 988 (delta 488), reused 971 (delta 474), pack-reused 0
        Receiving objects: 100% (988/988), 23.58 MiB | 4.53 MiB/s, done.
        Resolving deltas: 100% (488/488), done.
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# ls
        apis  models  network_eval  utils
    
  5. Download the original .onnx file using wget:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# wget -O models/resnet50-v1-7_fp32_224_224.onnx https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
        --2024-03-15 11:33:09--  https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
        Resolving github.com (github.com)... 140.82.121.3, ::ffff:140.82.121.3
        Connecting to github.com (github.com)|140.82.121.3|:443... connected.
        HTTP request sent, awaiting response... 302 Found
        Location: https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx [following]
        --2024-03-15 11:33:10--  https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
        Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
        Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.110.133|:443... connected.
        HTTP request sent, awaiting response... 200 OK
        Length: 102583340 (98M) [application/octet-stream]
        Saving to: ‘models/resnet50-v1-7_fp32_224_224.onnx’
    
        models/resnet50-v1-7_fp32_224_22 100%[==========================================================>]  97.83M  8.35MB/s    in 12s
    
        2024-03-15 11:33:25 (8.39 MB/s) - ‘models/resnet50-v1-7_fp32_224_224.onnx’ saved [102583340/102583340]
    
  6. Compile the downloaded model using the script models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py
        Model SDK version: 1.4.0
        {'model_path': 'models/resnet50-v1-7_fp32_224_224.onnx', 'shape_dict': {'data': [1, 3, 224, 224]}, 'dtype_dict': {'data': <ScalarType.float32: 6>}}
        2024-03-15 11:23:13,725 - autotvm - WARNING - One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
        Running calibration ...DONE
        2024-03-15 11:23:56,736 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_relu_0, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the bias term.
        ...
        2024-03-15 11:23:56,874 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_35, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the zero point.
        Running quantization ...DONE
        Max absolute error between outputs of loaded net and quantized net = 0.6192820072174072
    
  7. Untar the result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz file:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# tar zxvf result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz
    
  8. See the various command-line arguments to network_eval.py using the -h option:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py -h
        usage: network_eval.py [-h] --model_file_path MODEL_FILE_PATH --mpk_json_path MPK_JSON_PATH --dv_host DV_HOST [--dv_port DV_PORT]
                   [--dv_user DV_USER] --image_size IMAGE_SIZE [IMAGE_SIZE ...] [-v] [--bypass_tunnel]
                   [--layer_stats_path LAYER_STATS_PATH] [--max_frames MAX_FRAMES] [--batch_size BATCH_SIZE]
    
        Emit FPS KPIs for networks that run on the MLA and/or A65
    
        options:
        -h, --help            show this help message and exit
        --model_file_path MODEL_FILE_PATH
                                Path to .lm or .tar.gz file
        --mpk_json_path MPK_JSON_PATH
                                Path to MPK JSON file
        --dv_host DV_HOST     DevKit IP Address / FQDN
        --dv_port DV_PORT     DevKit port on which the mla_rt_service is running
        --dv_user DV_USER     DevKit ssh username
        --image_size IMAGE_SIZE [IMAGE_SIZE ...]
                                RGB image size specified as: H W C
        -v, --verbose         increase output verbosity
        --bypass_tunnel       set to bypass ssh tunnel
        --layer_stats_path LAYER_STATS_PATH
                                Path to layer stats YAML file
        --max_frames MAX_FRAMES
                                Max number of frames to run
        --batch_size BATCH_SIZE
                                Batch size - default 1
    
  9. Run the network_eval.py script, providing the paths of the .lm and .json files:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.lm \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <devkit IP address> --dv_port 8000 --image_size 224 224 3 -v
        Running model in MLA-only mode
        Creating the Forwarding from host
        The authenticity of host '10.42.0.240 (10.42.0.240)' can't be established.
        ECDSA key fingerprint is SHA256:zfEgZ7NPK5uE3WkrPjx9VsoVnsGvIHoav/prFVMLuSQ.
        Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
        sima@10.42.0.240's password:
        Copying the model files to DevKit
        sima@10.42.0.240's password:
        FPS = 855
        FPS = 874
        FPS = 883
        FPS = 880
    

Once started, the network_eval.py script keeps running and printing FPS readings until it is interrupted. Press Ctrl+C to stop the execution.
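Because the script prints one FPS reading after another, it can help to summarize a captured log rather than read the values by eye. The following is a minimal Python sketch that assumes the output has been saved as text lines shaped like "FPS = 855", as in the transcript above. The function names parse_fps and summarize_fps, and the choice to drop the first reading as warm-up, are illustrative assumptions rather than part of network_eval.py.

```python
import re
import statistics

# Matches lines such as "        FPS = 855" from the transcript above.
FPS_LINE = re.compile(r"^\s*FPS\s*=\s*(\d+(?:\.\d+)?)\s*$")


def parse_fps(lines):
    """Extract FPS readings from an iterable of log lines."""
    readings = []
    for line in lines:
        match = FPS_LINE.match(line)
        if match:
            readings.append(float(match.group(1)))
    return readings


def summarize_fps(readings, warmup=1):
    """Return (mean, min, max), skipping the first `warmup` readings.

    If there are not enough readings to skip, all of them are used.
    """
    if not readings:
        raise ValueError("no FPS readings found")
    steady = readings[warmup:] if len(readings) > warmup else readings
    return statistics.mean(steady), min(steady), max(steady)


if __name__ == "__main__":
    sample = [
        "Running model in MLA-only mode",
        "FPS = 855",
        "FPS = 874",
        "FPS = 883",
        "FPS = 880",
    ]
    mean_fps, low, high = summarize_fps(parse_fps(sample))
    print(f"mean={mean_fps:.1f} min={low:.0f} max={high:.0f}")
```

With the four readings from the example run and the first one treated as warm-up, the sketch reports a mean of 879.0 FPS over the remaining three readings.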

Verifying Runtime Statistics

  1. Specify the --layer_stats_path option and pass the *_stats.yaml file corresponding to the .lm file to have the network_eval.py script instead write a new *_output.yaml file that records the total execution time of each layer:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.lm \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <devkit IP address> --image_size 224 224 3 \
    --layer_stats_path  resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
    
  2. After running this command, the new YAML file appears in the current directory with the suffix *_output.yaml. In this example, the file is named resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml, and the record for layer MLA_0/conv2d_add_relu_7 now contains a single value that represents the total execution time of the layer, as shown below:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# ls
        apis
        models
        network_eval
        network_eval.log
        resnet50-v1-7_fp32_224_224_mpk.json
        resnet50-v1-7_fp32_224_224_stage1_mla.lm
        resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml
        resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
        result
        utils
    
    ...
    8:
    "name: ": MLA_0/conv2d_add_relu_7
    "run_time: ": 5.01us
    ...
    
  3. The total execution time of the MLA_0/conv2d_add_relu_7 layer is 5.01 microseconds. The ‘name’ of the layer corresponds to the name that is viewable in Netron when opening the .sima.json file that was generated using the Model.save() API. The ‘run_time’ represents the amount of time the layer took to execute on the MLA in microseconds. This time accounts for compute and memory cycles for this layer only.

  4. You can now examine the Netron graph (the *_sima.json file generated using the Model.save() API) and the *_output.yaml file side by side to view the execution stats of each layer.
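To find the layers that dominate execution time without scanning the file by hand, the per-layer records can be collected and sorted. The following is a minimal Python sketch that reads the file as plain text and assumes every record follows the shape of the excerpt above: a "name: " line followed by a "run_time: " line with a value in microseconds (us). The function names and the regular expressions are illustrative assumptions; the real file may contain other fields or units that this sketch ignores.

```python
import re

# Patterns assume the record layout shown in the excerpt above.
NAME_RE = re.compile(r'^\s*"name: ":\s*(\S+)\s*$')
TIME_RE = re.compile(r'^\s*"run_time: ":\s*([0-9]+(?:\.[0-9]+)?)us\s*$')


def parse_layer_times(text):
    """Return a list of (layer_name, run_time_us) pairs in file order."""
    layers = []
    current = None
    for line in text.splitlines():
        name = NAME_RE.match(line)
        if name:
            current = name.group(1)
            continue
        run_time = TIME_RE.match(line)
        if run_time and current is not None:
            layers.append((current, float(run_time.group(1))))
            current = None
    return layers


def slowest_layers(layers, count=5):
    """Return the `count` layers with the largest run time, slowest first."""
    return sorted(layers, key=lambda item: item[1], reverse=True)[:count]


if __name__ == "__main__":
    excerpt = '8:\n"name: ": MLA_0/conv2d_add_relu_7\n"run_time: ": 5.01us\n'
    for layer, run_time_us in slowest_layers(parse_layer_times(excerpt)):
        print(f"{layer}: {run_time_us} us")
```

In practice the text would come from reading resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml, and the layer names returned can be looked up directly in the Netron graph.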