MLPerf Benchmark

This section describes how to run the MLPerf benchmark tests using the Palette software. The official MLPerf benchmark is run on a board that differs from the one in the Development Kit. However, following the steps below on the Development Kit board lets you run the MLPerf benchmark and replicate the published performance results.

Note

You can measure FPS (frames per second), but you cannot obtain accuracy or power figures. Accuracy requires a different, larger dataset than the one shipped with SiMa’s Software Development Kit.

Before you begin running the MLPerf tests, follow the steps below:

  1. Confirm that your board has been flashed with the latest tRoot and Yocto build. Please refer to Firmware and Board Software Update for more details.

    • To verify, run cat /etc/build and look for the build number.

    • If the version needs to be upgraded, follow the instructions to flash the board or contact support@sima.ai.

  2. Connect your Developer Board to your laptop and make sure you can SSH to the board.

  3. Download the following files within the .zip file using the download button:
    • mlperf_resnet50_dataset.dat

    • check_accuracy.sh

    • imagenet_accuracy.py

    • val_map.txt

Accessing the MLPerf Files

Download Now

  1. Unzip the file.

    sima-user@sima-user-machine:~$ cd ~/Downloads
    sima-user@sima-user-machine:~/Downloads$ unzip ml_perf.zip
    
  2. SSH into the board.

    sima-user@sima-user-machine:~$ ssh sima@10.42.0.241
        The authenticity of host '10.42.0.241 (10.42.0.241)' can't be established.
        ED25519 key fingerprint is SHA256:FbMdheLl0xLWy33YLEWUAcddRvjavYqg83rgnkFYcos.
        This key is not known by any other names
        Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
        Warning: Permanently added '10.42.0.241' (ED25519) to the list of known hosts.
        sima@10.42.0.241's password:
    davinci:~$
    
  3. Change to the /data directory and use secure copy (scp) to copy the four files listed above to the /data directory on the Development Kit (device). The command below copies one file; repeat it for each of the remaining files.

    davinci:/data# sudo scp <host_user_name>@<host_ip_address>:/path/to/datafile/mlperf_resnet50_dataset.dat .
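
Step 3 shows the copy command for only one of the four files. The sketch below prints one scp command per file so that none is missed. It is a minimal illustration in Python: the helper name build_scp_commands is hypothetical and not part of the SDK, and the user, address, and source path are the same placeholders used in the command above.

```python
# Hypothetical helper (not part of the SDK): prints one scp command
# for each of the four files listed in the prerequisites.
FILES = [
    "mlperf_resnet50_dataset.dat",
    "check_accuracy.sh",
    "imagenet_accuracy.py",
    "val_map.txt",
]


def build_scp_commands(host_user, host_ip, src_dir, files=FILES):
    """Return one scp command per file, copying into the current directory."""
    src_dir = src_dir.rstrip("/")
    return [f"sudo scp {host_user}@{host_ip}:{src_dir}/{name} ." for name in files]


if __name__ == "__main__":
    # Placeholder values; substitute your own host user, address, and path.
    for cmd in build_scp_commands("<host_user_name>", "<host_ip_address>", "/path/to/datafile"):
        print(cmd)
```

Run the printed commands from the /data directory on the device, as in step 3.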
    

Running MLPerf Tests

To run the Batch1, Batch8, and Batch14 performance and accuracy mode tests, follow the steps described below.

Batch1 Performance Mode Test

  1. Go to the MLPerf directory in the Docker container.

    sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf
    
  2. Verify the dependencies section in the application.json file in the SDK.

    sima-user@docker-image-id:/home# vi /usr/local/simaai/app_zoo/Gstreamer/MLPerf/application.json
    
  3. Make sure the gst section in /usr/local/simaai/app_zoo/Gstreamer/MLPerf/application.json within the Palette Docker container is updated as shown below:

    "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=1:3:224:224 out-dims=1:64 mlperf-run-type=0 mlperf-scenario=0 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\"  inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config\" silent=true ! fakesink"
    

    Update the parameter values as shown below:

    “in-dims” to 1:3:224:224
    “out-dims” to 1:64
    “mlperf-run-type” to 0
    “mlperf-scenario” to 0
    “inpath” to “/data/mlperf_resnet50_dataset.dat”
    “config” to “/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config”

    These values set the configuration for the test being run.

  4. After modifying the application.json file, create an mpk.

    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
        ℹ Step a65-apps COMPILE completed successfully.
        ℹ Step COMPILE completed successfully.
        ℹ Step COPY RESOURCE completed successfully
        ℹ Step RPM BUILD completed successfully.
        ✔ Successfully created MPK at '/usr/local/simaai/app_zoo/Gstreamer/MLPerf/project.mpk'
    

    By default, the mpk file is created with the name “project.mpk”.

  5. Connect to the device using the IP address.

    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk device connect -t sima@<your_device_ip>
        ℹ Please enter the password for 10.42.0.241 🔐 :
        ℹ Connecting to sima@10.42.0.241...
        ✔ Connection established to 10.42.0.241 .
    
  6. Enter the password when prompted. The default password is edgeai, unless you have already changed it.

  7. Deploy the mpk on the device.

    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t <your_device_ip>
        🚀 Sending MPK to 10.42.0.241...
        Transfer Progress for project.mpk:  100.00%
        🏁 MPK sent successfully!
        ✔ MPK Deployed! ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
        ✔ MPK Deployment is successful for project.mpk.
    
  8. After the mpk is deployed successfully, memory is allocated on the device (log messages appear on the device console) and the gst process starts on the device. Check the list of running processes on the device using the top or ps command:

    davinci:/data$ top
        Mem: 4291844K used, 89864K free, 199220K shrd, 7504K buff, 3604120K cached
        CPU:   9% usr  10% sys   0% nic  79% idle   0% io   0% irq   0% sirq
        Load average: 1.60 1.33 0.72 4/162 606
        PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
        546   305 root     S    8476m 195%  14% gst-launch-1.0 --gst-plugin-path=/data/simaai/applications/MLPerf/lib fakesrc
    
  9. The performance mode test runs for approximately 14 to 15 minutes. Wait until the test runs are complete.

  10. Upon completion of the test, memory gets deallocated (log prints can be seen in the log file /var/log/simaai.log) and the gst process ends.

  11. Verify the results on the device as the root user in the /data/simaai/applications/MLPerf directory. The test summary is in the file mlperf_log_summary.txt.

    Run the following command to view the test summary:

    davinci:~$ sudo cat /data/simaai/applications/MLPerf/mlperf_log_summary.txt
        ================================================
        MLPerf Results Summary
        ================================================
        SUT name :
        Scenario : SingleStream
        Mode     : PerformanceOnly
        90th percentile latency (ns) : 892472
        Result is : VALID
        Min duration satisfied : Yes
        Min queries satisfied : Yes
        Early stopping satisfied: Yes
        Early Stopping Result:
        * Processed at least 64 queries (688100).
        * Would discard 68230 highest latency queries.
        * Early stopping 90th percentile estimate: 892664
        * Early stopping 99th percentile estimate: 940977
    
        ================================================
        Additional Stats
        ================================================
        QPS w/ loadgen overhead         : 1146.83
        QPS w/o loadgen overhead        : 1175.86
    
        Min latency (ns)                : 797311
        Max latency (ns)                : 7575694
        Mean latency (ns)               : 850443
        50.00 percentile latency (ns)   : 840563
        90.00 percentile latency (ns)   : 892472
        95.00 percentile latency (ns)   : 907110
        97.00 percentile latency (ns)   : 917172
        99.00 percentile latency (ns)   : 940319
        99.90 percentile latency (ns)   : 1001692
    
        ================================================
        Test Parameters Used
        ================================================
        samples_per_query : 1
        target_qps : 1113.27
        target_latency (ns): 0
        max_async_queries : 1
        min_duration (ms): 600000
        max_duration (ms): 0
        min_query_count : 50000
        max_query_count : 0
        qsl_rng_seed : 148687905518835231
        sample_index_rng_seed : 520418551913322573
        schedule_rng_seed : 811580660758947900
        accuracy_log_rng_seed : 0
        accuracy_log_probability : 0
        accuracy_log_sampling_target : 0
        print_timestamps : 0
        performance_issue_unique : 0
        performance_issue_same : 0
        performance_issue_same_index : 0
        performance_sample_count : 2048
    
        No warnings encountered during test.
    
        No errors encountered during test.
    

    In the “Additional Stats” section, compare the QPS values with and without loadgen overhead against the reference values below.

    QPS w/ loadgen overhead : 1030.15
    QPS w/o loadgen overhead : 1052.92
    
  12. Use the mpk remove command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on MPK Tool.
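
The comparison in step 11 can also be scripted. The sketch below is a minimal illustration rather than an SDK tool: it parses the "key : value" lines of mlperf_log_summary.txt and reports how far the measured QPS values are from the reference values quoted above. The function names are hypothetical, and the script assumes it is run where Python is available and the summary file is readable (for example, as root on the device or on a copy of the file).

```python
# Reference values quoted in step 11 of the Batch1 performance mode test.
REFERENCE_QPS = {
    "QPS w/ loadgen overhead": 1030.15,
    "QPS w/o loadgen overhead": 1052.92,
}


def parse_summary(text):
    """Collect 'key : value' lines into a dict, skipping lines with no value."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            stats[key.strip()] = value.strip()
    return stats


def percent_diff(measured, reference):
    """Signed difference of measured relative to reference, in percent."""
    return (measured - reference) / reference * 100.0


if __name__ == "__main__":
    # Path taken from step 11; adjust it if you copied the file elsewhere.
    with open("/data/simaai/applications/MLPerf/mlperf_log_summary.txt") as fh:
        stats = parse_summary(fh.read())
    for key, ref in REFERENCE_QPS.items():
        measured = float(stats[key])
        print(f"{key}: measured {measured}, reference {ref}, "
              f"diff {percent_diff(measured, ref):+.2f}%")
```

A positive difference means the measured throughput is higher than the reference value.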

Batch1 Accuracy Mode Test

  1. Make sure that the Batch1 performance mode test is run before running the accuracy mode test.

  2. Update the gst command in the application.json file as shown below (with “mlperf-run-type” set to 1), then save and close the file.

    "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=1:3:224:224 out-dims=1:64 mlperf-run-type=1 mlperf-scenario=0 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\"  inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config\" silent=true ! fakesink"
    
    Parameter value to change:
    “mlperf-run-type” to 1
  3. Create the mpk, deploy it, and wait until the test ends. The test run takes approximately 4 to 5 minutes.

  4. After the test run is complete, memory deallocation messages can be seen in the log file /var/log/simaai.log.

  5. For verifying results, use the validation script stored in the /data directory.

    davinci:~$ cd /data
    
  6. Change the file permissions, if required, by running the following command.

    davinci:/data$ sudo chmod 777 ./check_accuracy.sh
    
  7. Run the validation script.

    davinci:/data$ ./check_accuracy.sh /data/simaai/applications/MLPerf/mlperf_log_accuracy.json
    
  8. The output numbers should match the values shown below.

    accuracy=75.698%, good=37849, total=50000
    
  9. Use the mpk remove command to free up disk space on the device before running a new test.
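
The result line in step 8 can be checked mechanically. The sketch below is an illustration only: it assumes the validation script prints a line in exactly the format shown in step 8, and the function names are hypothetical. It extracts the three numbers and confirms that the reported percentage agrees with good / total.

```python
import re

# Matches a line such as: accuracy=75.698%, good=37849, total=50000
ACCURACY_RE = re.compile(r"accuracy=([\d.]+)%, good=(\d+), total=(\d+)")


def parse_accuracy(line):
    """Return (accuracy_percent, good, total) from a result line."""
    match = ACCURACY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized accuracy line: {line!r}")
    return float(match.group(1)), int(match.group(2)), int(match.group(3))


def is_consistent(accuracy, good, total, tol=0.001):
    """Check that the reported percentage equals good / total * 100."""
    return abs(good / total * 100.0 - accuracy) <= tol


if __name__ == "__main__":
    acc, good, total = parse_accuracy("accuracy=75.698%, good=37849, total=50000")
    print(acc, good, total, is_consistent(acc, good, total))
```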

Batch8 Performance Mode Test

  1. Go to the MLPerf directory in the Docker container.

    sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf
    
  2. Make sure the gst command has been updated in the application.json, as shown below.

    "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 mlperf-run-type=0 mlperf-scenario=1 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\"  inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config\" silent=true ! fakesink"
    
    Parameter values to change:
    “in-dims” to 8:3:224:224
    “out-dims” to 8:64
    “mlperf-run-type” to 0
    “mlperf-scenario” to 1
    “inpath” to “/data/mlperf_resnet50_dataset.dat”
    “config” to “/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config”
  3. Create the mpk in the SDK and deploy it.

    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t <your_device_ip>
    
  4. Check for memory allocation messages in the device console and cross-check the gst command in the device process tree.

    davinci:~$ top
    
  5. Wait until the test run is complete (memory deallocation happens), then check for the MLPerf summary log file on the device as root in the /data/simaai/applications/MLPerf/ directory.

  6. The log summary should appear as shown below. The result files are stored in the same /data/simaai/applications/MLPerf/ directory.

    davinci:/data$ sudo cat ./simaai/applications/MLPerf/mlperf_log_summary.txt
        ================================================
        MLPerf Results Summary
        ================================================
        SUT name :
        Scenario : MultiStream
        Mode     : PerformanceOnly
        99th percentile latency (ns) : 2927200
        Result is : VALID
        Min duration satisfied : Yes
        Min queries satisfied : Yes
        Early stopping satisfied: Yes
        Early Stopping Result:
        * Processed at least 662 queries (207154).
        * Would discard 1965 highest latency queries.
        * Early stopping 99th percentile estimate: 2928160
        ================================================
        Additional Stats
        ================================================
        Per-query latency:
        Min latency (ns)                : 2815320
        Max latency (ns)                : 5802920
        Mean latency (ns)               : 2873437
        50.00 percentile latency (ns)   : 2872120
        90.00 percentile latency (ns)   : 2895280
        95.00 percentile latency (ns)   : 2903880
        97.00 percentile latency (ns)   : 2910600
        99.00 percentile latency (ns)   : 2927200
        99.90 percentile latency (ns)   : 2998560
        ================================================
        Test Parameters Used
        ================================================
        samples_per_query : 8
        target_qps : 348.432
        target_latency (ns): 0
        max_async_queries : 1
        min_duration (ms): 600000
        max_duration (ms): 0
        min_query_count : 50000
        max_query_count : 0
        qsl_rng_seed : 148687905518835231
        sample_index_rng_seed : 520418551913322573
        schedule_rng_seed : 811580660758947900
        accuracy_log_rng_seed : 0
        accuracy_log_probability : 0
        accuracy_log_sampling_target : 0
        print_timestamps : 0
        performance_issue_unique : 0
        performance_issue_same : 0
        performance_issue_same_index : 0
        performance_sample_count : 1024
        1 warning encountered. See detailed log.
        No errors encountered during test.
    
  7. Check the Additional Stats section in the mlperf_log_summary file for Mean latency (ns) and compare against the reference value shown below.

    Mean latency (ns) : 2873437
    
  8. Use the mpk remove command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on MPK Tool.
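
The MultiStream summary reports latency per query rather than frames per second. As a rough, derived figure (not a metric reported by MLPerf), the mean latency can be converted to samples per second. The sketch below assumes queries run back to back with one query in flight, which matches max_async_queries : 1 in the log above; the function name is hypothetical.

```python
def samples_per_second(mean_latency_ns, samples_per_query):
    """Approximate throughput when one query is in flight at a time."""
    return samples_per_query / (mean_latency_ns * 1e-9)


if __name__ == "__main__":
    # Values taken from the Batch8 summary log: mean latency and samples_per_query.
    print(f"{samples_per_second(2873437, 8):.2f} samples/s (approximate)")
```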

Batch8 Accuracy Mode Test

  1. Make sure the Batch8 performance mode test is run before running the Batch8 accuracy mode test.

  2. Update the gst command in the application.json file, as shown below. That is, set mlperf-run-type=1 and mlperf-scenario=1.

    "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 mlperf-run-type=1 mlperf-scenario=1 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\"  inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config\" silent=true ! fakesink"
    
    Parameter values to change:
    “in-dims” to 8:3:224:224
    “out-dims” to 8:64
    “mlperf-run-type” to 1
    “mlperf-scenario” to 1
    “inpath” to “/data/mlperf_resnet50_dataset.dat”
    “config” to “/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config”
  3. Follow steps 3 through 8 of the “Batch1 Accuracy Mode Test”.

    Run the validation script.

    davinci:/data$ ./check_accuracy.sh ./simaai/applications/MLPerf/mlperf_log_accuracy.json
    

    The output numbers should match the values shown here: accuracy=75.990%, good=37995, total=50000

  4. Use the mpk remove command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on MPK Tool.
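
Switching a test from performance mode to accuracy mode changes only the mlperf-run-type value in the gst string. Instead of editing the string by hand, a single property can be replaced programmatically. The sketch below is an illustration under that assumption; set_gst_property is a hypothetical helper, not an SDK function, and it refuses to proceed unless the property appears exactly once.

```python
import re


def set_gst_property(gst, name, value):
    """Replace the value of one 'name=value' token in a gst-launch string."""
    # The lookbehind keeps 'dims' from matching inside 'in-dims', for example.
    pattern = rf"(?<![\w-]){re.escape(name)}=\S+"
    new_gst, count = re.subn(pattern, lambda m: f"{name}={value}", gst)
    if count != 1:
        raise ValueError(f"expected exactly one '{name}=' token, found {count}")
    return new_gst


if __name__ == "__main__":
    # Shortened gst string for illustration; the real one has more properties.
    gst = ("fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 "
           "mlperf-run-type=0 mlperf-scenario=1 silent=true ! fakesink")
    print(set_gst_property(gst, "mlperf-run-type", 1))
```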

Batch14 Performance Mode Test

  1. Go to the MLPerf directory in the Docker container.

    sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf
    
  2. Modify the application.json file to update the parameters, as shown below.

    "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=14:3:224:224 out-dims=14:64 mlperf-run-type=0 mlperf-scenario=2 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\"  inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config\" silent=true ! fakesink"
    
    Parameter values to change:
    “in-dims” to 14:3:224:224
    “out-dims” to 14:64
    “mlperf-run-type” to 0
    “mlperf-scenario” to 2
    “inpath” to “/data/mlperf_resnet50_dataset.dat”
    “config” to “/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config”
  3. Create the mpk in the SDK and deploy it.

    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
    sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t <your_device_ip>
    
  4. Check for memory allocation messages in the device console and cross-check the gst command in the device process tree by running the following command:

    davinci:~$ top
    
  5. Wait until the test is complete (memory deallocation happens) and check for the logs in the device as root in the /data/simaai/applications/MLPerf/ directory. Generally the test takes 20 to 22 minutes. The summary log should appear as shown below.

    davinci:~$ sudo cat /data/simaai/applications/MLPerf/mlperf_log_summary.txt
        ================================================
        MLPerf Results Summary
        ================================================
        SUT name :
        Scenario : Offline
        Mode     : PerformanceOnly
        Samples per second: 3397.96
        Result is : VALID
        Min duration satisfied : Yes
        Min queries satisfied : Yes
        Early stopping satisfied: Yes
    
        ================================================
        Additional Stats
        ================================================
        Min latency (ns)                : 1529062681
        Max latency (ns)                : 922611782310
        Mean latency (ns)               : 462219336486
        50.00 percentile latency (ns)   : 462310204729
        90.00 percentile latency (ns)   : 830534473673
        95.00 percentile latency (ns)   : 876460296204
        97.00 percentile latency (ns)   : 894974228836
        99.00 percentile latency (ns)   : 913361873441
        99.90 percentile latency (ns)   : 921692414874
    
        ================================================
        Test Parameters Used
        ================================================
        samples_per_query : 3135000
        target_qps : 4750
        target_latency (ns): 0
        max_async_queries : 1
        min_duration (ms): 600000
        max_duration (ms): 0
        min_query_count : 1
        max_query_count : 0
        qsl_rng_seed : 148687905518835231
        sample_index_rng_seed : 520418551913322573
        schedule_rng_seed : 811580660758947900
        accuracy_log_rng_seed : 0
        accuracy_log_probability : 0
        accuracy_log_sampling_target : 0
        print_timestamps : 0
        performance_issue_unique : 0
        performance_issue_same : 0
        performance_issue_same_index : 0
        performance_sample_count : 50000
    
        1 warning encountered. See detailed log.
    
        No errors encountered during test.
    
  6. Verify the samples per second against the reference value shown here: Samples per second: 3397.96.
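
Before comparing the throughput figure, it is worth confirming that the run itself is valid. The sketch below is a minimal illustration that checks the four status fields shown at the top of the summary log. The function name is hypothetical, and the expected values are taken from the sample log in this section.

```python
# Status fields and values as they appear in the sample summary logs.
REQUIRED = {
    "Result is": "VALID",
    "Min duration satisfied": "Yes",
    "Min queries satisfied": "Yes",
    "Early stopping satisfied": "Yes",
}


def failed_checks(summary_text):
    """Return the required fields that are missing or have an unexpected value."""
    found = {}
    for line in summary_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            found[key.strip()] = value.strip()
    return [name for name, want in REQUIRED.items() if found.get(name) != want]


if __name__ == "__main__":
    # Path taken from step 5; the file may require root access to read.
    with open("/data/simaai/applications/MLPerf/mlperf_log_summary.txt") as fh:
        problems = failed_checks(fh.read())
    print("run looks valid" if not problems else f"check these fields: {problems}")
```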

Batch14 Accuracy Mode Test

  1. Make sure the Batch14 performance mode test has been run before running the Batch14 accuracy mode test.

  2. Update the gst command in the application.json file as shown below.

    "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=14:3:224:224 out-dims=14:64 mlperf-run-type=1 mlperf-scenario=2 toy-mode=false inpath=\"/data/mlperf_resnet50_dataset.dat\" output-path=\"/data/simaai/applications/MLPerf\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config\" silent=true ! fakesink"
    
    Parameter values to change:
    “in-dims” to 14:3:224:224
    “out-dims” to 14:64
    “mlperf-run-type” to 1
    “mlperf-scenario” to 2
    “inpath” to “/data/mlperf_resnet50_dataset.dat”
    “config” to “/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config”
  3. Follow steps 3 through 8 of the “Batch1 Accuracy Mode Test”.
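
The six gst strings in this section differ only in batch size, mlperf-run-type, mlperf-scenario, and the config file name. The sketch below assembles the string from those inputs to reduce copy-and-paste mistakes. It is an illustration, not an SDK tool: build_gst is a hypothetical helper, the batch-to-scenario mapping is inferred from the examples above, and the result uses single spaces between properties.

```python
import json

APP_DIR = "/data/simaai/applications/MLPerf"
# Batch size -> mlperf-scenario value, as used in the examples in this section.
SCENARIO = {1: 0, 8: 1, 14: 2}


def build_gst(batch, accuracy=False):
    """Assemble the gst-launch string for one batch size and run type."""
    return (
        f'MLA_OCM=max LD_LIBRARY_PATH="{APP_DIR}/lib" '
        f"gst-launch-1.0 --gst-plugin-path='{APP_DIR}/lib' fakesrc ! "
        f"ml_filter in-dims={batch}:3:224:224 out-dims={batch}:64 "
        f"mlperf-run-type={1 if accuracy else 0} mlperf-scenario={SCENARIO[batch]} "
        f'toy-mode=false output-path="{APP_DIR}" '
        f'inpath="/data/mlperf_resnet50_dataset.dat" '
        f'config="{APP_DIR}/etc/bad_sparse_resnet50_v1_b{batch}.config" '
        "silent=true ! fakesink"
    )


if __name__ == "__main__":
    # json.dumps adds the \" escaping that the value needs inside application.json.
    print(json.dumps({"gst": build_gst(14, accuracy=True)}, indent=2))
```

Compare the printed value with the gst entry in application.json before creating the mpk.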