Get Model FPS

The Palette software lets you measure the frames-per-second (FPS) KPIs of a machine learning (ML) model using a mode called Accelerator Mode. A Python script, network_eval.py, is provided to generate these KPIs for a given model.

To use Accelerator Mode, first compile the ML model. The compiled tar.gz archive contains a single .elf file for the model.

Setup Tool

Download the Example

  1. Unzip the downloaded get_fps.zip to a local directory and move the unzipped get_fps folder into your workspace directory:

    sima-user@sima-user-machine:~$ cd ~/Downloads
    sima-user@sima-user-machine:~/Downloads$ unzip get_fps.zip
    sima-user@sima-user-machine:~/Downloads$ mv get_fps ~/workspace/
    
  2. In the SDK container, go to the directory /home/docker/sima-cli/get_fps/, take ownership of it, and install sshpass:

    sima-user@docker-image-id:/home# cd /home/docker/sima-cli/get_fps
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps$ ls
        apis  network_eval  utils
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# chown <YOUR_USERNAME> ../get_fps
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo apt-get update
    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# sudo apt-get install sshpass
    

Download Model

  1. Download the original .onnx file using wget:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# wget -O models/resnet50-v1-7_fp32_224_224.onnx https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
        --2024-03-15 11:33:09--  https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
        Resolving github.com (github.com)... 140.82.121.3, ::ffff:140.82.121.3
        Connecting to github.com (github.com)|140.82.121.3|:443... connected.
        HTTP request sent, awaiting response... 302 Found
        Location: https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx [following]
        --2024-03-15 11:33:10--  https://media.githubusercontent.com/media/onnx/models/main/validated/vision/classification/resnet/model/resnet50-v1-7.onnx
        Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
        Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.110.133|:443... connected.
        HTTP request sent, awaiting response... 200 OK
        Length: 102583340 (98M) [application/octet-stream]
        Saving to: ‘models/resnet50-v1-7_fp32_224_224.onnx’
    
        models/resnet50-v1-7_fp32_224_22 100%[==========================================================>]  97.83M  8.35MB/s    in 12s
    
        2024-03-15 11:33:25 (8.39 MB/s) - ‘models/resnet50-v1-7_fp32_224_224.onnx’ saved [102583340/102583340]
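
A quick way to confirm that the download completed is to compare the file size with the length wget reported above (102583340 bytes). The following is a minimal sketch; the helper name download_looks_complete is hypothetical, and the expected size only holds as long as the upstream file is unchanged.

```python
from pathlib import Path

# Size reported by wget in the transcript above ("Length: 102583340").
# Assumption: the upstream file has not changed since that run.
EXPECTED_BYTES = 102583340


def download_looks_complete(path, expected_bytes=EXPECTED_BYTES):
    """Return True if the file exists and has exactly the expected size."""
    p = Path(path)
    return p.is_file() and p.stat().st_size == expected_bytes


if __name__ == "__main__":
    model = "models/resnet50-v1-7_fp32_224_224.onnx"
    status = "OK" if download_looks_complete(model) else "missing or incomplete"
    print(model, status)
```

A size match does not prove the content is intact, but it catches truncated downloads before the compile step.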
    

Compile Model

  1. Compile the downloaded model using the script models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py
        Model SDK version: 1.7.0
        {'model_path': 'models/resnet50-v1-7_fp32_224_224.onnx', 'shape_dict': {'data': [1, 3, 224, 224]}, 'dtype_dict': {'data': <ScalarType.float32: 6>}}
        2024-03-15 11:23:13,725 - autotvm - WARNING - One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
        Running calibration ...DONE
        2024-03-15 11:23:56,736 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_relu_0, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the bias term.
        ...
        2024-03-15 11:23:56,874 - afe.ir.defines - WARNING - In node MLA_0/conv2d_add_35, Precision of weights was reduced to avoid numeric saturation. Saturation was detected in the zero point.
        Running quantization ...DONE
        Max absolute error between outputs of loaded net and quantized net = 0.6192820072174072
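
The last line of the output reports the maximum absolute error between the outputs of the loaded (floating-point) net and the quantized net. As a rough illustration of what such a metric measures, the sketch below computes it for two flat lists of output values. It is not the SDK's implementation, and the function name and the sample values are made up for the example.

```python
def max_abs_error(reference, quantized):
    """Largest element-wise absolute difference between two output vectors."""
    if len(reference) != len(quantized):
        raise ValueError("outputs must have the same length")
    if not reference:
        raise ValueError("outputs must not be empty")
    return max(abs(r - q) for r, q in zip(reference, quantized))


if __name__ == "__main__":
    # Hypothetical output values, for illustration only.
    float_out = [1.0, 2.0, 3.0]
    quant_out = [1.5, 2.0, 2.75]
    print(max_abs_error(float_out, quant_out))
```

A smaller value means the quantized model tracks the original more closely on the inputs that were compared.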
    

    Note

    Target Type:

    • gen1: the default. If you do not specify --target, the model is compiled for the MLSoC target.

    • gen2: pass --target gen2 to compile for the Modalix target.

    Example: python3 models/scripts/resnet50-v1-7_fp32_224_224/resnet50-v1-7_fp32_224_224.py --target gen2
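
If you drive the compile step from a script, the target choice described in the note can be captured in a small helper. This is a sketch only: compile_command is a hypothetical name, and it assumes that gen1 is selected by omitting --target, which is the only gen1 usage the note describes.

```python
import shlex

COMPILE_SCRIPT = (
    "models/scripts/resnet50-v1-7_fp32_224_224/"
    "resnet50-v1-7_fp32_224_224.py"
)


def compile_command(target="gen1", script=COMPILE_SCRIPT):
    """Build the argv list for the compile script for a given target."""
    if target not in ("gen1", "gen2"):
        raise ValueError(f"unknown target: {target!r}")
    cmd = ["python3", script]
    # gen1 is the default, so the flag is only added for other targets.
    if target != "gen1":
        cmd += ["--target", target]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(..., check=True) to actually compile.
    print(shlex.join(compile_command("gen2")))
```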

Execute

  1. Untar the result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz file:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# tar zxvf result/resnet50-v1-7_fp32_224_224_asym_True_per_ch_True/mpk/resnet50-v1-7_fp32_224_224_mpk.tar.gz
    
  2. List the command-line arguments of network_eval.py using the -h option:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py -h
        usage: network_eval.py [-h] --model_file_path MODEL_FILE_PATH --mpk_json_path MPK_JSON_PATH --dv_host DV_HOST [--dv_port DV_PORT]
                   [--dv_user DV_USER] --image_size IMAGE_SIZE [IMAGE_SIZE ...] [-v] [--bypass_tunnel]
                   [--layer_stats_path LAYER_STATS_PATH] [--max_frames MAX_FRAMES] [--batch_size BATCH_SIZE]
    
        Emit FPS KPIs for networks that run on the MLA and/or A65
    
        options:
        -h, --help            show this help message and exit
        --model_file_path MODEL_FILE_PATH
                                Path to .elf or .tar.gz file
        --mpk_json_path MPK_JSON_PATH
                                Path to MPK JSON file
        --dv_host DV_HOST     DevKit IP Address / FQDN
        --dv_port DV_PORT     DevKit port on which the mla_rt_service is running
        --dv_user DV_USER     DevKit ssh username
        --image_size IMAGE_SIZE [IMAGE_SIZE ...]
                                RGB image size specified as: H W C
        -v, --verbose         increase output verbosity
        --bypass_tunnel       set to bypass ssh tunnel
        --layer_stats_path LAYER_STATS_PATH
                                Path to layer stats YAML file
        --max_frames MAX_FRAMES
                                Max number of frames to run
        --batch_size BATCH_SIZE
                                Batch size - default 1
    
  3. Run the network_eval.py script, providing the paths of the .elf and .json files:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <devkit IP address> --dv_port 8000 --image_size 224 224 3 -v
        Running model in MLA-only mode
        Creating the Forwarding from host
        The authenticity of host '10.42.0.240 (10.42.0.240)' can't be established.
        ECDSA key fingerprint is SHA256:zfEgZ7NPK5uE3WkrPjx9VsoVnsGvIHoav/prFVMLuSQ.
        Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
        sima@10.42.0.240's password:
        Copying the model files to DevKit
        sima@10.42.0.240's password:
        FPS = 1036
        FPS = 1036
        FPS = 1036
        FPS = 1036
        FPS = 1036
        FPS = 1037
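
The FPS lines printed above can be summarized with a few lines of Python. The sketch below assumes the output has been saved as text and that every reading has the form "FPS = <number>", as in the transcript; parse_fps and summarize are hypothetical helper names.

```python
import re
from statistics import mean

# Matches lines such as "        FPS = 1036".
FPS_LINE = re.compile(r"^\s*FPS\s*=\s*(\d+(?:\.\d+)?)\s*$")


def parse_fps(log_text):
    """Extract every FPS reading from captured network_eval.py output."""
    readings = []
    for line in log_text.splitlines():
        match = FPS_LINE.match(line)
        if match:
            readings.append(float(match.group(1)))
    return readings


def summarize(readings):
    """Return basic statistics for a list of FPS readings."""
    if not readings:
        raise ValueError("no FPS readings found")
    avg = mean(readings)
    return {
        "samples": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": avg,
        # Average frame period implied by the throughput, in microseconds.
        "us_per_frame": 1_000_000 / avg,
    }


# The readings shown in the transcript above.
SAMPLE = """\
Copying the model files to DevKit
FPS = 1036
FPS = 1036
FPS = 1036
FPS = 1036
FPS = 1036
FPS = 1037
"""

if __name__ == "__main__":
    print(summarize(parse_fps(SAMPLE)))
```

For the six readings above, the mean is roughly 1036.2 FPS, which corresponds to about 965 microseconds per frame. That figure is a throughput-derived average, not necessarily the latency of a single inference.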
    

Once started, the network_eval.py script runs until it is interrupted. Press Ctrl+C to stop it. The -h output also lists a --max_frames option ("Max number of frames to run").
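
When the script is launched from automation rather than from a terminal, the Ctrl+C step can be approximated by sending SIGINT after enough readings have been collected. The sketch below is one way to do that on a POSIX system. The helper collect_fps is hypothetical, and it assumes that the FPS lines are written to standard output without buffering and that the script reacts to SIGINT as it does to Ctrl+C. It does not answer the interactive host-key and password prompts shown in the transcript.

```python
import re
import signal
import subprocess

FPS_LINE = re.compile(r"FPS\s*=\s*(\d+(?:\.\d+)?)")


def collect_fps(cmd, samples=5, timeout=10.0):
    """Run cmd, gather `samples` FPS readings, then interrupt the process."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered reads on our side
    )
    readings = []
    try:
        for line in proc.stdout:
            match = FPS_LINE.search(line)
            if match:
                readings.append(float(match.group(1)))
                if len(readings) >= samples:
                    break
    finally:
        if proc.poll() is None:
            # Equivalent of pressing Ctrl+C in the terminal.
            proc.send_signal(signal.SIGINT)
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
        proc.stdout.close()
    return readings
```

A call would pass the same argument list as the command in step 3, for example ["python", "-u", "network_eval/network_eval.py", ...]. The -u flag is a standard Python interpreter option that disables output buffering, so readings arrive as they are printed.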

Verify

  1. Specify the --layer_stats_path option and pass the *_stats.yaml file that corresponds to the .elf file. The network_eval.py script then instead produces a new *_output.yaml file in which the total execution time of each layer is computed.

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# python network_eval/network_eval.py \
    --model_file_path resnet50-v1-7_fp32_224_224_stage1_mla.elf \
    --mpk_json_path resnet50-v1-7_fp32_224_224_mpk.json \
    --dv_host <devkit IP address> --image_size 224 224 3 \
    --layer_stats_path  resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
    
  2. After running this command, the new YAML file appears in the current directory with the suffix _output.yaml. In this example, the file is named resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml, and the record for layer MLA_0/conv2d_add_relu_7 now contains a single value that represents the total execution time of that layer, as shown below:

    sima-user@docker-image-id:/home/docker/sima-cli/get_fps# ls
        apis
        models
        network_eval
        network_eval.log
        resnet50-v1-7_fp32_224_224_mpk.json
        resnet50-v1-7_fp32_224_224_stage1_mla.elf
        resnet50-v1-7_fp32_224_224_stage1_mla_stats_output.yaml
        resnet50-v1-7_fp32_224_224_stage1_mla_stats.yaml
        result
        utils
    
    ...
    8:
    "name: ": MLA_0/conv2d_add_relu_7
    "run_time: ": 9.27us
    ...
    
  3. The total execution time of the MLA_0/conv2d_add_relu_7 layer is 9.27 microseconds (9.27us). The run_time value is the amount of time the layer took to execute on the MLA, in microseconds. It accounts for the compute and memory cycles of this layer only.
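
To rank layers by execution time, the run_time values can be pulled out of the *_output.yaml file. The sketch below uses plain regular expressions and assumes that every record follows the shape of the excerpt above: a "name: " entry followed by a "run_time: " entry with a value in us. The rest of the file format is not shown in this guide, so treat layer_times as a hypothetical starting point.

```python
import re

# Patterns modelled on the excerpt above, e.g.
#   "name: ": MLA_0/conv2d_add_relu_7
#   "run_time: ": 9.27us
NAME = re.compile(r'"name:\s*"\s*:\s*(\S+)')
RUN_TIME = re.compile(r'"run_time:\s*"\s*:\s*(\d+(?:\.\d+)?)\s*us')


def layer_times(text):
    """Return (layer_name, run_time_in_microseconds) pairs in file order."""
    results = []
    current = None
    for line in text.splitlines():
        name = NAME.search(line)
        if name:
            current = name.group(1)
            continue
        run = RUN_TIME.search(line)
        if run and current is not None:
            results.append((current, float(run.group(1))))
            current = None
    return results


# The single record shown in the excerpt above.
SAMPLE = '''\
8:
"name: ": MLA_0/conv2d_add_relu_7
"run_time: ": 9.27us
'''

if __name__ == "__main__":
    times = layer_times(SAMPLE)
    for name, us in sorted(times, key=lambda t: t[1], reverse=True):
        print(f"{name}: {us} us")
    print("sum of listed layers:", sum(us for _, us in times), "us")
```

Sorting by run_time shows which layers dominate execution on the MLA. The sum covers only the layers listed in the file and is not presented here as the end-to-end time per frame.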