MLPerf Benchmark
This section provides an overview of how to run the MLPerf benchmark tests using the Palette Software. The official MLPerf benchmark is run on a board that is different from the one in the Development Kit. However, following the steps below on the Development Kit board lets you run the MLPerf benchmark and replicate the published performance results.
Note
You can get the FPS (Frames Per Second), but you cannot get accuracy or power figures. Accuracy requires a different and larger dataset than the one shipped with SiMa’s Software Development Kit.
Before you begin running the MLPerf tests, follow the steps below:
Confirm that your board has been flashed with the latest tRoot and Yocto build. Please refer to Firmware and Board Software Update for more details.
To verify, run
cat /etc/build
and look for the build number. If the version needs to be upgraded, follow the instructions to flash the board or contact support@sima.ai.
Connect your Developer Board to your laptop and make sure you can SSH to the board.
Download the following files, packaged in the .zip file, using the download button:
- mlperf_resnet50_dataset.dat
- check_accuracy.sh
- imagenet_accuracy.py
- val_map.txt
Accessing the MLPerf Files
Unzip the file.
sima-user@sima-user-machine:~$ cd ~/Downloads
sima-user@sima-user-machine:~/Downloads$ unzip ml_perf.zip
SSH into the board.
sima-user@sima-user-machine:~$ ssh sima@10.42.0.241
The authenticity of host '10.42.0.241 (10.42.0.241)' can't be established.
ED25519 key fingerprint is SHA256:FbMdheLl0xLWy33YLEWUAcddRvjavYqg83rgnkFYcos.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '10.42.0.241' (ED25519) to the list of known hosts.
sima@10.42.0.241's password:
davinci:~$
Change the directory to /data and secure copy (scp) the above four files to the /data directory on the Development Kit (device). The command below copies the dataset file; repeat it for each of the remaining files.
davinci:/data# sudo scp <host_user_name>@<host_ip_address>:/path/to/datafile/mlperf_resnet50_dataset.dat .
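The copy step can be scripted when all four files sit in one directory on the host. The following is a minimal sketch under that assumption: the helper name build_scp_commands is hypothetical, and the host user, host IP address, and directory in the example are placeholders to replace with your own. It only builds the command strings that would be run on the device from /data; it does not execute them.

```python
import shlex

# The four files listed in the download step.
MLPERF_FILES = [
    "mlperf_resnet50_dataset.dat",
    "check_accuracy.sh",
    "imagenet_accuracy.py",
    "val_map.txt",
]


def build_scp_commands(host_user, host_ip, host_dir):
    """Return one scp command per file, to be run on the device from /data."""
    commands = []
    for name in MLPERF_FILES:
        source = f"{host_user}@{host_ip}:{host_dir.rstrip('/')}/{name}"
        # shlex.quote protects the source if the path contains spaces.
        commands.append(f"sudo scp {shlex.quote(source)} .")
    return commands


if __name__ == "__main__":
    # Placeholder host details (assumptions, not values from this guide).
    for command in build_scp_commands("host_user", "192.0.2.10", "/path/to/datafile"):
        print(command)
```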
Running MLPerf Tests
To run the Batch1, Batch8, and Batch14 performance and accuracy mode tests, follow the steps described below.
Batch1 Performance Mode Test
Go to the MLPerf directory in the Docker container.
sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf
Verify the dependencies section in the application.json file in the SDK.
sima-user@docker-image-id:/home# vi /usr/local/simaai/app_zoo/Gstreamer/MLPerf/application.json
Make sure the gst section in /usr/local/simaai/app_zoo/Gstreamer/MLPerf/application.json within the Palette Docker container is updated as shown below:
"gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=1:3:224:224 out-dims=1:64 mlperf-run-type=0 mlperf-scenario=0 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config\" silent=true ! fakesink"
Update the parameter values as shown below:
- in-dims to 1:3:224:224
- out-dims to 1:64
- mlperf-run-type to 0
- mlperf-scenario to 0
- inpath to "/data/mlperf_resnet50_dataset.dat"
- config to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config"
This changes the configuration to the set of values to be evaluated.
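Editing the gst string by hand is error-prone because the same parameters change for every test. The sketch below shows one way to rewrite the name=value tokens of a gst command string. It is an illustration only: the helper name set_gst_params is hypothetical, it works on the command string alone, and it makes no assumption about the rest of the application.json layout.

```python
import re


def set_gst_params(gst_command, params):
    """Return gst_command with each `name=value` token set to the new value."""
    updated = gst_command
    for name, value in params.items():
        # The lookbehind stops a short name from matching inside a longer
        # one (for example `inpath` inside a name ending in `-inpath`).
        pattern = re.compile(r"(?<![\w-])" + re.escape(name) + r"=\S+")
        if not pattern.search(updated):
            raise KeyError(f"parameter {name!r} not found in gst command")
        # A function replacement avoids backslash handling in the new value.
        updated = pattern.sub(lambda match: f"{name}={value}", updated)
    return updated


if __name__ == "__main__":
    # Shortened command for illustration; the real string has more tokens.
    command = (
        "fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 "
        "mlperf-run-type=1 mlperf-scenario=1 silent=true ! fakesink"
    )
    batch1_performance = {
        "in-dims": "1:3:224:224",
        "out-dims": "1:64",
        "mlperf-run-type": "0",
        "mlperf-scenario": "0",
    }
    print(set_gst_params(command, batch1_performance))
```

Raising KeyError on a missing parameter is a deliberate choice in this sketch: a misspelled name fails loudly instead of leaving the old value in place.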
After modifying the application.json file, create an mpk.
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
ℹ Step a65-apps COMPILE completed successfully.
ℹ Step COMPILE completed successfully.
ℹ Step COPY RESOURCE completed successfully
ℹ Step RPM BUILD completed successfully.
✔ Successfully created MPK at '/usr/local/simaai/app_zoo/Gstreamer/MLPerf/project.mpk'
By default, the mpk file is created with the name “project.mpk”.
Connect to the device using the IP address.
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk device connect -t sima@<your-device_ip>
ℹ Please enter the password for 10.42.0.241 🔐 :
ℹ Connecting to sima@10.42.0.241...
✔ Connection established to 10.42.0.241 .
Enter the password for the connection. By default the password is edgeai unless you already changed the default password to your own password.
Deploy the mpk on the device.
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t <your_device_ip>
🚀 Sending MPK to 10.42.0.241...
Transfer Progress for project.mpk: 100.00%
🏁 MPK sent successfully!
✔ MPK Deployed!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
✔ MPK Deployment is successful for project.mpk.
Upon successful deployment of the mpk, memory gets allocated on the device (log prints can be seen on the device console) and the gst process starts on the device (check the running process list on the device using the top or ps commands):
davinci:/data$ top
Mem: 4291844K used, 89864K free, 199220K shrd, 7504K buff, 3604120K cached
CPU: 9% usr 10% sys 0% nic 79% idle 0% io 0% irq 0% sirq
Load average: 1.60 1.33 0.72 4/162 606
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
546 305 root S 8476m 195% 14% gst-launch-1.0 --gst-plugin-path=/data/simaai/applications/MLPerf/lib fakesrc
The performance mode test runs for approximately 14 to 15 minutes. Wait until the test runs are complete.
Upon completion of the test, memory gets deallocated (log prints can be seen in the log file /var/log/simaai.log) and the gst process ends.
Verify the results on the device as the root user in the /data/simaai/applications/MLPerf directory. View the test summary in the file mlperf_log_summary.txt.
Follow the steps below to verify test results:
davinci:~$ sudo cat /data/simaai/applications/MLPerf/mlperf_log_summary.txt
================================================
MLPerf Results Summary
================================================
SUT name :
Scenario : SingleStream
Mode : PerformanceOnly
90th percentile latency (ns) : 892472
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
Early stopping satisfied: Yes
Early Stopping Result:
* Processed at least 64 queries (688100).
* Would discard 68230 highest latency queries.
* Early stopping 90th percentile estimate: 892664
* Early stopping 99th percentile estimate: 940977
================================================
Additional Stats
================================================
QPS w/ loadgen overhead : 1146.83
QPS w/o loadgen overhead : 1175.86
Min latency (ns) : 797311
Max latency (ns) : 7575694
Mean latency (ns) : 850443
50.00 percentile latency (ns) : 840563
90.00 percentile latency (ns) : 892472
95.00 percentile latency (ns) : 907110
97.00 percentile latency (ns) : 917172
99.00 percentile latency (ns) : 940319
99.90 percentile latency (ns) : 1001692
================================================
Test Parameters Used
================================================
samples_per_query : 1
target_qps : 1113.27
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 600000
max_duration (ms): 0
min_query_count : 50000
max_query_count : 0
qsl_rng_seed : 148687905518835231
sample_index_rng_seed : 520418551913322573
schedule_rng_seed : 811580660758947900
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
accuracy_log_sampling_target : 0
print_timestamps : 0
performance_issue_unique : 0
performance_issue_same : 0
performance_issue_same_index : 0
performance_sample_count : 2048
No warnings encountered during test.
No errors encountered during test.
Under the “Additional Stats” section, compare the QPS w/ and w/o loadgen overhead values with the reference values below.
QPS w/ loadgen overhead : 1030.15
QPS w/o loadgen overhead : 1052.92
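The comparison against the reference values can also be done programmatically. The sketch below parses the key : value lines of a summary and reports the relative difference from the reference QPS figures quoted above. The helper names parse_summary and relative_difference are hypothetical, the sample text is a shortened excerpt of the example summary, and no pass/fail threshold is implied because this guide does not define one.

```python
def parse_summary(text):
    """Collect `key : value` lines from mlperf_log_summary.txt content."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        # Skip banner lines and keys that have no value (e.g. "SUT name :").
        if separator and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def relative_difference(measured, reference):
    """Signed difference from the reference, as a fraction of the reference."""
    return (measured - reference) / reference


# Shortened excerpt of the example summary shown in this section.
SAMPLE = """\
Scenario : SingleStream
Mode : PerformanceOnly
Result is : VALID
QPS w/ loadgen overhead : 1146.83
QPS w/o loadgen overhead : 1175.86
"""

# Reference values quoted in this section.
REFERENCE_QPS = {
    "QPS w/ loadgen overhead": 1030.15,
    "QPS w/o loadgen overhead": 1052.92,
}

if __name__ == "__main__":
    fields = parse_summary(SAMPLE)
    print("Result is:", fields["Result is"])
    for name, reference in REFERENCE_QPS.items():
        difference = relative_difference(float(fields[name]), reference)
        print(f"{name}: {fields[name]} ({difference:+.1%} vs reference)")
```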
Use the mpk remove command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on MPK Tool.
Batch1 Accuracy Mode Test
Make sure that the Batch1 performance mode test is run before running the accuracy mode test.
Make sure to update the gst command in the application.json as shown in the code below (with mlperf-run-type=1) and then save and close the file.
"gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=1:3:224:224 out-dims=1:64 mlperf-run-type=1 mlperf-scenario=0 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config\" silent=true ! fakesink"
Parameter values to change:
- mlperf-run-type to 1
Create the mpk, deploy it, and wait until the test ends. The test run takes approximately 4 to 5 minutes.
After the test run is complete, memory deallocation messages can be seen in the log file /var/log/simaai.log.
To verify the results, use the validation script stored in the /data directory.
davinci:~$ cd /data
Change the file permissions, if required, by running the following command.
davinci:/data$ sudo chmod 777 ./check_accuracy.sh
Run the validation script.
davinci:/data$ ./check_accuracy.sh /data/simaai/applications/MLPerf/mlperf_log_accuracy.json
The output numbers should match the values shown below.
accuracy=75.698%, good=37849, total=50000
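The reported percentage is the ratio of correct predictions to total samples: 37849 / 50000 = 75.698%. The sketch below parses an output line in the format shown above and checks that the three numbers agree. The helper names are hypothetical, and the line format is assumed to match the example exactly.

```python
import re

ACCURACY_LINE = re.compile(r"accuracy=([\d.]+)%, good=(\d+), total=(\d+)")


def parse_accuracy_line(line):
    """Return (accuracy_percent, good, total) from a check_accuracy.sh result line."""
    match = ACCURACY_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unexpected accuracy line: {line!r}")
    return float(match.group(1)), int(match.group(2)), int(match.group(3))


def is_consistent(line, tolerance=0.0005):
    """Check that the printed percentage equals 100 * good / total."""
    accuracy, good, total = parse_accuracy_line(line)
    return abs(100.0 * good / total - accuracy) < tolerance


if __name__ == "__main__":
    line = "accuracy=75.698%, good=37849, total=50000"
    print(parse_accuracy_line(line), is_consistent(line))
```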
Use the mpk remove command to free up disk space on the device before running a new test.
Batch8 Performance Mode Test
Go to the MLPerf directory in the Docker container.
sima-user@docker-image-id:# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf
Make sure the gst command has been updated in the application.json, as shown below.
"gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 mlperf-run-type=0 mlperf-scenario=1 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config\" silent=true ! fakesink"
Parameter values to change:
- in-dims to 8:3:224:224
- out-dims to 8:64
- mlperf-run-type to 0
- mlperf-scenario to 1
- inpath to "/data/mlperf_resnet50_dataset.dat"
- config to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config"
Create the mpk in the SDK and deploy it.
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t <your_device_ip>
Check for memory allocation messages in the device console and cross-check the gst command in the device process tree.
davinci:~$ top
Wait until the test run is complete (memory deallocation happens) and check for the mlperf summary log file on the device, as root, in the /data/simaai/applications/MLPerf/ directory.
The log summary should appear as shown below. The result files are stored in the same /data/simaai/applications/MLPerf/ directory.
davinci:/data$ sudo cat ./simaai/applications/MLPerf/mlperf_log_summary.txt
================================================
MLPerf Results Summary
================================================
SUT name :
Scenario : MultiStream
Mode : PerformanceOnly
99th percentile latency (ns) : 2927200
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
Early stopping satisfied: Yes
Early Stopping Result:
* Processed at least 662 queries (207154).
* Would discard 1965 highest latency queries.
* Early stopping 99th percentile estimate: 2928160
================================================
Additional Stats
================================================
Per-query latency:
Min latency (ns) : 2815320
Max latency (ns) : 5802920
Mean latency (ns) : 2873437
50.00 percentile latency (ns) : 2872120
90.00 percentile latency (ns) : 2895280
95.00 percentile latency (ns) : 2903880
97.00 percentile latency (ns) : 2910600
99.00 percentile latency (ns) : 2927200
99.90 percentile latency (ns) : 2998560
================================================
Test Parameters Used
================================================
samples_per_query : 8
target_qps : 348.432
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 600000
max_duration (ms): 0
min_query_count : 50000
max_query_count : 0
qsl_rng_seed : 148687905518835231
sample_index_rng_seed : 520418551913322573
schedule_rng_seed : 811580660758947900
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
accuracy_log_sampling_target : 0
print_timestamps : 0
performance_issue_unique : 0
performance_issue_same : 0
performance_issue_same_index : 0
performance_sample_count : 1024
1 warning encountered. See detailed log.
No errors encountered during test.
Check the Additional Stats section in the mlperf_log_summary file for Mean latency (ns) and compare against the reference value shown below.
Mean latency (ns) : 2873437
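The MultiStream summary reports latency rather than throughput. To relate the mean latency to an images-per-second figure, a rough estimate is samples_per_query divided by the mean per-query latency. The sketch below assumes queries are processed back to back with one query in flight (the example log shows max_async_queries : 1). The result is an approximation for orientation only, not a metric that the summary reports, and the helper name is hypothetical.

```python
def estimated_samples_per_second(samples_per_query, mean_latency_ns):
    """Approximate throughput, assuming back-to-back queries, one in flight."""
    if mean_latency_ns <= 0:
        raise ValueError("mean_latency_ns must be positive")
    # 1e9 converts nanoseconds to seconds.
    return samples_per_query * 1e9 / mean_latency_ns


if __name__ == "__main__":
    # Values taken from the example Batch8 summary in this section.
    estimate = estimated_samples_per_second(8, 2873437)
    print(f"about {estimate:.0f} samples per second")
```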
Use the mpk remove command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on MPK Tool.
Batch8 Accuracy Mode Test
Make sure the Batch8 performance mode test is run before running the Batch8 accuracy mode test.
Update the gst command in the application.json file, as shown below. That is, set mlperf-run-type=1 and mlperf-scenario=1 in the application.json file.
"gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 mlperf-run-type=1 mlperf-scenario=1 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config\" silent=true ! fakesink"
Parameter values to change:
- in-dims to 8:3:224:224
- out-dims to 8:64
- mlperf-run-type to 1
- mlperf-scenario to 1
- inpath to "/data/mlperf_resnet50_dataset.dat"
- config to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config"
Follow steps 3 through 8 of the “Batch1 Accuracy Mode Test”.
Run the validation script.
davinci:/data$ ./check_accuracy.sh ./simaai/applications/MLPerf/mlperf_log_accuracy.json
The output numbers should match the values shown here:
accuracy=75.990%, good=37995, total=50000
Use the mpk remove command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on MPK Tool.
Batch14 Performance Mode Test
Go to the MLPerf directory in the Docker container.
sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf
Modify the application.json file to update the parameters, as shown below.
"gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=14:3:224:224 out-dims=14:64 mlperf-run-type=0 mlperf-scenario=2 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config\" silent=true ! fakesink"
Parameter values to change:
- in-dims to 14:3:224:224
- out-dims to 14:64
- mlperf-run-type to 0
- mlperf-scenario to 2
- inpath to "/data/mlperf_resnet50_dataset.dat"
- config to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config"
Create the mpk in the SDK and deploy it.
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t <device-ip>
Check for memory allocation messages in the device console and cross-check the gst command in the device process tree by running the following command:
davinci:~$ top
Wait until the test is complete (memory deallocation happens) and check for the logs on the device, as root, in the /data/simaai/applications/MLPerf/ directory. The test generally takes 20 to 22 minutes. The summary log should appear as shown below.
davinci:~$ sudo cat /data/simaai/applications/MLPerf/mlperf_log_summary.txt
================================================
MLPerf Results Summary
================================================
SUT name :
Scenario : Offline
Mode : PerformanceOnly
Samples per second: 3397.96
Result is : VALID
Min duration satisfied : Yes
Min queries satisfied : Yes
Early stopping satisfied: Yes
================================================
Additional Stats
================================================
Min latency (ns) : 1529062681
Max latency (ns) : 922611782310
Mean latency (ns) : 462219336486
50.00 percentile latency (ns) : 462310204729
90.00 percentile latency (ns) : 830534473673
95.00 percentile latency (ns) : 876460296204
97.00 percentile latency (ns) : 894974228836
99.00 percentile latency (ns) : 913361873441
99.90 percentile latency (ns) : 921692414874
================================================
Test Parameters Used
================================================
samples_per_query : 3135000
target_qps : 4750
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 600000
max_duration (ms): 0
min_query_count : 1
max_query_count : 0
qsl_rng_seed : 148687905518835231
sample_index_rng_seed : 520418551913322573
schedule_rng_seed : 811580660758947900
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
accuracy_log_sampling_target : 0
print_timestamps : 0
performance_issue_unique : 0
performance_issue_same : 0
performance_issue_same_index : 0
performance_sample_count : 50000
1 warning encountered. See detailed log.
No errors encountered during test.
Verify the samples per second against the reference value shown here.
Samples per second: 3397.96
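As a sanity check on the Offline summary, the reported samples per second is consistent with the number of samples in the single query (samples_per_query : 3135000) divided by the maximum latency (922611782310 ns, about 922.6 s). The sketch below reproduces that arithmetic. It is an observation about the figures in the example log under that assumption, not a statement of how the benchmark derives the value, and the helper name is hypothetical.

```python
def offline_samples_per_second(sample_count, max_latency_ns):
    """Throughput as total samples divided by total time to finish them."""
    if max_latency_ns <= 0:
        raise ValueError("max_latency_ns must be positive")
    # 1e9 converts nanoseconds to seconds.
    return sample_count * 1e9 / max_latency_ns


if __name__ == "__main__":
    # Values taken from the example Batch14 summary in this section.
    computed = offline_samples_per_second(3_135_000, 922_611_782_310)
    print(f"computed {computed:.2f}, reported 3397.96")
```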
Batch14 Accuracy Mode Test
Make sure the Batch14 performance mode test has been run before running the Batch14 accuracy mode test.
Make sure to update the gst command in the application.json as shown below.
"gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=14:3:224:224 out-dims=14:64 mlperf-run-type=1 mlperf-scenario=2 toy-mode=false inpath=\"/data/mlperf_resnet50_dataset.dat\" output-path=\"/data/simaai/applications/MLPerf\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config\" silent=true ! fakesink"
Parameter values to change:
- in-dims to 14:3:224:224
- out-dims to 14:64
- mlperf-run-type to 1
- mlperf-scenario to 2
- inpath to "/data/mlperf_resnet50_dataset.dat"
- config to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config"
Follow steps 3 through 8 of the “Batch1 Accuracy Mode Test”.
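The six tests differ only in a handful of ml_filter parameters. The sketch below collects them in one place. The mapping of mlperf-scenario values to scenario names is read off the example summaries in this section (0 gives SingleStream, 1 gives MultiStream, 2 gives Offline), and mlperf-run-type is 0 for performance and 1 for accuracy. The helper name ml_filter_params is hypothetical and is not part of the SDK.

```python
# Scenario names as they appear in the example summaries in this section.
SCENARIO_NAMES = {0: "SingleStream", 1: "MultiStream", 2: "Offline"}

# Each batch size in this guide is paired with one scenario value.
BATCH_TO_SCENARIO = {1: 0, 8: 1, 14: 2}


def ml_filter_params(batch, accuracy):
    """Return the ml_filter parameter values for one of the six tests."""
    if batch not in BATCH_TO_SCENARIO:
        raise ValueError(f"unsupported batch size: {batch}")
    return {
        "in-dims": f"{batch}:3:224:224",
        "out-dims": f"{batch}:64",
        "mlperf-run-type": "1" if accuracy else "0",
        "mlperf-scenario": str(BATCH_TO_SCENARIO[batch]),
        "inpath": "/data/mlperf_resnet50_dataset.dat",
        "config": (
            "/data/simaai/applications/MLPerf/etc/"
            f"bad_sparse_resnet50_v1_b{batch}.config"
        ),
    }


if __name__ == "__main__":
    for batch in (1, 8, 14):
        for accuracy in (False, True):
            params = ml_filter_params(batch, accuracy)
            scenario = SCENARIO_NAMES[int(params["mlperf-scenario"])]
            mode = "accuracy" if accuracy else "performance"
            print(f"Batch{batch} {mode} ({scenario}): {params}")
```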