.. _MLPerf Benchmark:

MLPerf Benchmark
################

This section provides an overview of how to run the MLPerf benchmark tests using the Palette Software. The official MLPerf benchmark is run on a board that is different from the one in the Development Kit. However, following the steps below using the board in the Development Kit will allow users to run the MLPerf benchmark and replicate the published performance results.

.. note::
   You can get the FPS (Frames Per Second), but you will not be able to get accuracy or power figures. Accuracy requires a different and bigger dataset than what is shipped with SiMa’s Software Development Kit.

Before you begin running the MLPerf tests, follow the steps below:

#. Confirm that your board has been flashed with the latest tRoot and Yocto build. Please refer to :ref:`Firmware and Board Software Update` for more details.

   - To verify, run ``cat /etc/build`` and look for the build number.
   - If the version needs to be upgraded, follow the instructions to flash the board or contact **support@sima.ai**.

#. Connect your Developer Board to your laptop and make sure you can SSH to the board.

#. Download the following files within the ``.zip`` file using the download button:

   + mlperf_resnet50_dataset.dat
   + check_accuracy.sh
   + imagenet_accuracy.py
   + val_map.txt

Accessing the MLPerf Files
==========================

.. button-link:: https://docs.sima.ai/pkg_downloads/SDK1.3.0/ml_perf.zip
   :color: primary
   :shadow:

   Download Now

#. Unzip the file.

   .. code-block:: console

      sima-user@sima-user-machine:~$ cd ~/Downloads
      sima-user@sima-user-machine:~/Downloads$ unzip ml_perf.zip

#. SSH into the board.

   .. code-block:: console

      sima-user@sima-user-machine:~$ ssh sima@10.42.0.241
      The authenticity of host '10.42.0.241 (10.42.0.241)' can't be established.
      ED25519 key fingerprint is SHA256:FbMdheLl0xLWy33YLEWUAcddRvjavYqg83rgnkFYcos.
      This key is not known by any other names
      Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
      Warning: Permanently added '10.42.0.241' (ED25519) to the list of known hosts.
      sima@10.42.0.241's password:
      davinci:~$

#. Change to the ``/data`` directory and use secure copy (``scp``) to copy the four files listed above to the ``/data`` directory on the Development Kit (device).

   .. code-block:: console

      davinci:/data# sudo scp @:/path/to/datafile/mlperf_resnet50_dataset.dat .

Running MLPerf Tests
====================

To run the Batch1, Batch8, and Batch14 performance and accuracy mode tests, follow the steps described below.

----------------------------
Batch1 Performance Mode Test
----------------------------

#. Go to the MLPerf directory in the Docker container.

   .. code-block:: console

      sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf

#. Verify the dependencies section in the ``application.json`` file in the SDK.

   .. code-block:: console

      sima-user@docker-image-id:/home# vi /usr/local/simaai/app_zoo/Gstreamer/MLPerf/application.json

#. Make sure the ``gst`` section in ``/usr/local/simaai/app_zoo/Gstreamer/MLPerf/application.json`` within the Palette Docker container is updated as shown below:

   .. code-block:: console

      "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=1:3:224:224 out-dims=1:64 mlperf-run-type=0 mlperf-scenario=0 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config\" silent=true ! fakesink"

   Update the parameter values as shown below:

   | **"in-dims" to 1:3:224:224**
   | **"out-dims" to 1:64**
   | **"mlperf-run-type" to 0**
   | **"mlperf-scenario" to 0**
   | **"inpath" to "/data/mlperf_resnet50_dataset.dat"**
   | **"config" to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config"**

   This sets the configuration to the values that we want to evaluate.

#. After modifying the ``application.json`` file, create an ``mpk``.

   .. code-block:: console

      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
      ℹ Step a65-apps COMPILE completed successfully.
      ℹ Step COMPILE completed successfully.
      ℹ Step COPY RESOURCE completed successfully
      ℹ Step RPM BUILD completed successfully.
      ✔ Successfully created MPK at '/usr/local/simaai/app_zoo/Gstreamer/MLPerf/project.mpk'

   By default, the ``mpk`` file is created with the name ``project.mpk``.

#. Connect to the device using the IP address.

   .. code-block:: console

      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk device connect -t sima@
      ℹ Please enter the password for 10.42.0.241
      🔐 :
      ℹ Connecting to sima@10.42.0.241...
      ✔ Connection established to 10.42.0.241 .

#. Enter the password for the connection. By default, the password is **edgeai** unless you have already changed it to your own password.

#. Deploy the ``mpk`` on the device.

   .. code-block:: console

      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t
      🚀 Sending MPK to 10.42.0.241...
      Transfer Progress for project.mpk: 100.00%
      🏁 MPK sent successfully!
      ✔ MPK Deployed! ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
      ✔ MPK Deployment is successful for project.mpk.

#. Upon successful deployment of the ``mpk``, memory is allocated on the device (log prints can be seen on the device console) and the ``gst`` process starts on the device. Check the list of running processes on the device using the ``top`` or ``ps`` commands:

   .. code-block:: console

      davinci:/data$ top
      Mem: 4291844K used, 89864K free, 199220K shrd, 7504K buff, 3604120K cached
      CPU: 9% usr 10% sys 0% nic 79% idle 0% io 0% irq 0% sirq
      Load average: 1.60 1.33 0.72 4/162 606
      PID PPID USER STAT VSZ %VSZ %CPU COMMAND
      546 305 root S 8476m 195% 14% gst-launch-1.0 --gst-plugin-path=/data/simaai/applications/MLPerf/lib fakesrc

#. The performance mode test runs for **approximately 14 to 15** minutes. Wait until the test run is complete.

#. Upon completion of the test, memory is deallocated (log prints can be seen in the log file ``/var/log/simaai.log``) and the ``gst`` process ends.

#. Verify the results on the device as **root user** in the ``/data/simaai/applications/MLPerf`` directory. The test summary is written to the file ``mlperf_log_summary.txt``. View it as shown below:

   .. code-block:: console

      davinci:~$ sudo cat /data/simaai/applications/MLPerf/mlperf_log_summary.txt
      ================================================
      MLPerf Results Summary
      ================================================
      SUT name :
      Scenario : SingleStream
      Mode : PerformanceOnly
      90th percentile latency (ns) : 892472
      Result is : VALID
      Min duration satisfied : Yes
      Min queries satisfied : Yes
      Early stopping satisfied: Yes
      Early Stopping Result:
      * Processed at least 64 queries (688100).
      * Would discard 68230 highest latency queries.
      * Early stopping 90th percentile estimate: 892664
      * Early stopping 99th percentile estimate: 940977

      ================================================
      Additional Stats
      ================================================
      QPS w/ loadgen overhead : 1146.83
      QPS w/o loadgen overhead : 1175.86

      Min latency (ns) : 797311
      Max latency (ns) : 7575694
      Mean latency (ns) : 850443
      50.00 percentile latency (ns) : 840563
      90.00 percentile latency (ns) : 892472
      95.00 percentile latency (ns) : 907110
      97.00 percentile latency (ns) : 917172
      99.00 percentile latency (ns) : 940319
      99.90 percentile latency (ns) : 1001692

      ================================================
      Test Parameters Used
      ================================================
      samples_per_query : 1
      target_qps : 1113.27
      target_latency (ns): 0
      max_async_queries : 1
      min_duration (ms): 600000
      max_duration (ms): 0
      min_query_count : 50000
      max_query_count : 0
      qsl_rng_seed : 148687905518835231
      sample_index_rng_seed : 520418551913322573
      schedule_rng_seed : 811580660758947900
      accuracy_log_rng_seed : 0
      accuracy_log_probability : 0
      accuracy_log_sampling_target : 0
      print_timestamps : 0
      performance_issue_unique : 0
      performance_issue_same : 0
      performance_issue_same_index : 0
      performance_sample_count : 2048

      No warnings encountered during test.

      No errors encountered during test.

   Under the **"Additional Stats"** section, compare the QPS values with and without loadgen overhead against the reference values.

   .. code-block::

      QPS w/ loadgen overhead : 1030.15
      QPS w/o loadgen overhead : 1052.92

#. Use the ``mpk remove`` command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on :ref:`MPK Tool`.

-------------------------
Batch1 Accuracy Mode Test
-------------------------

#. Make sure that the **Batch1** performance mode test has been run before running the accuracy mode test.

#. Make sure to **update** the ``gst`` command in the ``application.json`` as shown in the code below (with ``mlperf-run-type=1``), and then save and close the file.

   .. code-block:: console

      "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=1:3:224:224 out-dims=1:64 mlperf-run-type=1 mlperf-scenario=0 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b1.config\" silent=true ! fakesink"

   | **Parameter values to change:**
   | **"mlperf-run-type" to 1**

#. Create the ``mpk``, deploy it, and wait until the test ends. The test run takes approximately 4 to 5 minutes.

#. After the test run is complete, memory deallocation messages can be seen in the log file ``/var/log/simaai.log``.

#. To verify the results, use the validation script stored in the ``/data`` directory.

   .. code-block:: console

      davinci:~$ cd /data

#. Change the file permissions, if required, by running the following command.

   .. code-block:: console

      davinci:/data$ sudo chmod 777 ./check_accuracy.sh

#. Run the validation script.

   .. code-block:: console

      davinci:/data$ ./check_accuracy.sh /data/simaai/applications/MLPerf/mlperf_log_accuracy.json

#. The output numbers should match the values shown below.

   .. code-block::

      accuracy=75.698%, good=37849, total=50000

#. Use the ``mpk remove`` command to free up disk space on the device before running a new test.

----------------------------
Batch8 Performance Mode Test
----------------------------

#. Go to the ``MLPerf`` directory in the Docker container.

   .. code-block:: console

      sima-user@docker-image-id:# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf

#. Make sure the ``gst`` command has been updated in the ``application.json``, as shown below.

   .. code-block:: console

      "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 mlperf-run-type=0 mlperf-scenario=1 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config\" silent=true ! fakesink"

   | **Parameter values to change:**
   | **"in-dims" to 8:3:224:224**
   | **"out-dims" to 8:64**
   | **"mlperf-run-type" to 0**
   | **"mlperf-scenario" to 1**
   | **"inpath" to "/data/mlperf_resnet50_dataset.dat"**
   | **"config" to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config"**

#. Create the ``mpk`` in the SDK and deploy it.

   .. code-block:: console

      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t

#. Check for memory allocation messages in the device console and cross-check the ``gst`` command in the device process tree.

   .. code-block:: console

      davinci:~$ top

#. Wait until the test run is complete (memory deallocation happens) and check for the MLPerf summary log file in the ``/data/simaai/applications/MLPerf/`` directory on the device.

#. The log summary should appear as shown below. The result files are stored in the same ``/data/simaai/applications/MLPerf/`` directory.

   .. code-block:: console

      davinci:/data$ sudo cat ./simaai/applications/MLPerf/mlperf_log_summary.txt
      ================================================
      MLPerf Results Summary
      ================================================
      SUT name :
      Scenario : MultiStream
      Mode : PerformanceOnly
      99th percentile latency (ns) : 2927200
      Result is : VALID
      Min duration satisfied : Yes
      Min queries satisfied : Yes
      Early stopping satisfied: Yes
      Early Stopping Result:
      * Processed at least 662 queries (207154).
      * Would discard 1965 highest latency queries.
      * Early stopping 99th percentile estimate: 2928160

      ================================================
      Additional Stats
      ================================================
      Per-query latency:
      Min latency (ns) : 2815320
      Max latency (ns) : 5802920
      Mean latency (ns) : 2873437
      50.00 percentile latency (ns) : 2872120
      90.00 percentile latency (ns) : 2895280
      95.00 percentile latency (ns) : 2903880
      97.00 percentile latency (ns) : 2910600
      99.00 percentile latency (ns) : 2927200
      99.90 percentile latency (ns) : 2998560

      ================================================
      Test Parameters Used
      ================================================
      samples_per_query : 8
      target_qps : 348.432
      target_latency (ns): 0
      max_async_queries : 1
      min_duration (ms): 600000
      max_duration (ms): 0
      min_query_count : 50000
      max_query_count : 0
      qsl_rng_seed : 148687905518835231
      sample_index_rng_seed : 520418551913322573
      schedule_rng_seed : 811580660758947900
      accuracy_log_rng_seed : 0
      accuracy_log_probability : 0
      accuracy_log_sampling_target : 0
      print_timestamps : 0
      performance_issue_unique : 0
      performance_issue_same : 0
      performance_issue_same_index : 0
      performance_sample_count : 1024

      1 warning encountered. See detailed log.

      No errors encountered during test.

#. Check the **Additional Stats** section in the ``mlperf_log_summary.txt`` file for **Mean latency (ns)** and compare it against the reference value shown below.

   .. code-block::

      Mean latency (ns) : 2873437

#. Use the ``mpk remove`` command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on :ref:`MPK Tool`.

-------------------------
Batch8 Accuracy Mode Test
-------------------------

#. Make sure the **Batch8** performance mode test has been run before running the **Batch8** accuracy mode test.

#. Update the ``gst`` command in the ``application.json`` file, as shown below. That is, set ``mlperf-run-type=1`` and ``mlperf-scenario=1``.

   .. code-block:: console

      "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=8:3:224:224 out-dims=8:64 mlperf-run-type=1 mlperf-scenario=1 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config\" silent=true ! fakesink"

   | **Parameter values to change:**
   | **"in-dims" to 8:3:224:224**
   | **"out-dims" to 8:64**
   | **"mlperf-run-type" to 1**
   | **"mlperf-scenario" to 1**
   | **"inpath" to "/data/mlperf_resnet50_dataset.dat"**
   | **"config" to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b8.config"**

#. Follow steps **3** through **8** of the **"Batch1 Accuracy Mode Test"**. Run the validation script.

   .. code-block:: console

      davinci:/data$ ./check_accuracy.sh ./simaai/applications/MLPerf/mlperf_log_accuracy.json

   The output numbers should match the values shown here: ``accuracy=75.990%, good=37995, total=50000``

#. Use the ``mpk remove`` command to free up disk space on the device before running a new test. For details on how to use the mpk commands, see the section on :ref:`MPK Tool`.

-----------------------------
Batch14 Performance Mode Test
-----------------------------

#. Go to the ``MLPerf`` directory in the Docker container.

   .. code-block:: console

      sima-user@docker-image-id:/home# cd /usr/local/simaai/app_zoo/Gstreamer/MLPerf

#. Modify the ``application.json`` file to update the parameters, as shown below.

   .. code-block:: console

      "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=14:3:224:224 out-dims=14:64 mlperf-run-type=0 mlperf-scenario=2 toy-mode=false output-path=\"/data/simaai/applications/MLPerf\" inpath=\"/data/mlperf_resnet50_dataset.dat\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config\" silent=true ! fakesink"

   | **Parameter values to change:**
   | **"in-dims" to 14:3:224:224**
   | **"out-dims" to 14:64**
   | **"mlperf-run-type" to 0**
   | **"mlperf-scenario" to 2**
   | **"inpath" to "/data/mlperf_resnet50_dataset.dat"**
   | **"config" to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config"**

#. Create the ``mpk`` in the SDK and deploy it.

   .. code-block:: console

      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk create -s . -d .
      sima-user@docker-image-id:/usr/local/simaai/app_zoo/Gstreamer/MLPerf# mpk deploy -f project.mpk -t

#. Check for memory allocation messages in the device console and cross-check the ``gst`` command in the device process tree by running the following command:

   .. code-block:: console

      davinci:~$ top

#. Wait until the test is complete (memory deallocation happens) and check for the logs on the device as root in the ``/data/simaai/applications/MLPerf/`` directory. The test generally takes 20 to 22 minutes. The summary log should appear as shown below.

   .. code-block:: console

      davinci:~$ sudo cat /data/simaai/applications/MLPerf/mlperf_log_summary.txt
      ================================================
      MLPerf Results Summary
      ================================================
      SUT name :
      Scenario : Offline
      Mode : PerformanceOnly
      Samples per second: 3397.96
      Result is : VALID
      Min duration satisfied : Yes
      Min queries satisfied : Yes
      Early stopping satisfied: Yes

      ================================================
      Additional Stats
      ================================================
      Min latency (ns) : 1529062681
      Max latency (ns) : 922611782310
      Mean latency (ns) : 462219336486
      50.00 percentile latency (ns) : 462310204729
      90.00 percentile latency (ns) : 830534473673
      95.00 percentile latency (ns) : 876460296204
      97.00 percentile latency (ns) : 894974228836
      99.00 percentile latency (ns) : 913361873441
      99.90 percentile latency (ns) : 921692414874

      ================================================
      Test Parameters Used
      ================================================
      samples_per_query : 3135000
      target_qps : 4750
      target_latency (ns): 0
      max_async_queries : 1
      min_duration (ms): 600000
      max_duration (ms): 0
      min_query_count : 1
      max_query_count : 0
      qsl_rng_seed : 148687905518835231
      sample_index_rng_seed : 520418551913322573
      schedule_rng_seed : 811580660758947900
      accuracy_log_rng_seed : 0
      accuracy_log_probability : 0
      accuracy_log_sampling_target : 0
      print_timestamps : 0
      performance_issue_unique : 0
      performance_issue_same : 0
      performance_issue_same_index : 0
      performance_sample_count : 50000

      1 warning encountered. See detailed log.

      No errors encountered during test.

#. Verify the samples per second against the reference value shown here: ``Samples per second: 3397.96``.

--------------------------
Batch14 Accuracy Mode Test
--------------------------

#. Make sure the **Batch14** performance mode test has been run before running the **Batch14** accuracy mode test.

#. Make sure to update the ``gst`` command in the ``application.json`` as shown below.

   .. code-block:: console

      "gst": "MLA_OCM=max LD_LIBRARY_PATH=\"/data/simaai/applications/MLPerf/lib\" gst-launch-1.0 --gst-plugin-path='/data/simaai/applications/MLPerf/lib' fakesrc ! ml_filter in-dims=14:3:224:224 out-dims=14:64 mlperf-run-type=1 mlperf-scenario=2 toy-mode=false inpath=\"/data/mlperf_resnet50_dataset.dat\" output-path=\"/data/simaai/applications/MLPerf\" config=\"/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config\" silent=true ! fakesink"

   | **Parameter values to change:**
   | **"in-dims" to 14:3:224:224**
   | **"out-dims" to 14:64**
   | **"mlperf-run-type" to 1**
   | **"mlperf-scenario" to 2**
   | **"inpath" to "/data/mlperf_resnet50_dataset.dat"**
   | **"config" to "/data/simaai/applications/MLPerf/etc/bad_sparse_resnet50_v1_b14.config"**

#. Follow steps **3** through **8** of the **"Batch1 Accuracy Mode Test"**.
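The performance mode tests above each end with a manual comparison of a value in ``mlperf_log_summary.txt`` against a reference value. As a rough sketch, that comparison could be scripted on the host after copying the summary text off the device. The helper names ``parse_summary`` and ``relative_difference`` are hypothetical and are not part of the SDK, and this document does not define a pass/fail tolerance, so the script only reports the difference. The sample text is an excerpt of the Batch1 performance summary shown earlier.

```python
import re

# Excerpt of the Batch1 performance summary shown earlier in this section.
SAMPLE_SUMMARY = """\
Scenario : SingleStream
Mode : PerformanceOnly
Result is : VALID
Early Stopping Result:
QPS w/ loadgen overhead : 1146.83
QPS w/o loadgen overhead : 1175.86
Mean latency (ns) : 850443
"""


def parse_summary(text):
    """Collect 'key : value' lines from a summary into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Key is everything before the first colon; lines without a value
        # after the colon (banners, section labels) are skipped.
        match = re.match(r"^\s*(.+?)\s*:\s*(\S.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def relative_difference(measured, reference):
    """Return |measured - reference| / reference (hypothetical helper)."""
    return abs(measured - reference) / reference


fields = parse_summary(SAMPLE_SUMMARY)
measured_qps = float(fields["QPS w/ loadgen overhead"])
reference_qps = 1030.15  # reference value quoted in the Batch1 performance test

print("Result is:", fields["Result is"])
print("Measured QPS w/ loadgen overhead:", measured_qps)
print("Relative difference from reference:",
      round(relative_difference(measured_qps, reference_qps), 4))
```

To use this on a real run, replace ``SAMPLE_SUMMARY`` with the text of the summary file and look up whichever field the test compares, for example ``Mean latency (ns)`` for Batch8.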