.. _build_host_app_with_cpp:

Build Host App With C++ API
###########################

In PCIe mode, the SiMa.ai MLSoC is paired with a host system over PCIe so that the host CPU can offload portions of an ML application to the MLSoC. The C++ APIs are integrated into the host application, which then communicates with the MLSoC over PCIe.

.. note::

   In PCIe mode, you can currently use the Machine Learning Accelerator (MLA) to run inference tasks (``quant`` -> ``NN Model`` -> ``dequant``). Additionally, with our support, you can manually generate the MPK in the SDK to enable any valid GStreamer PCIe pipeline.

   In a future release, this mode will expand to include access to all hardware blocks, such as video codecs, enabling pre- and post-processing operations directly on the MLSoC. This enhancement is part of our ongoing roadmap.

Follow the instructions below to build a sample application that uses the ResNet50 model to classify images.

.. tabs::

   .. tab:: Prerequisites

      - Follow this :ref:`instruction ` to set up the development system in PCIe mode.
      - Download the `test image dataset `_. This will be used by the host side application.
      - Download the optimized `ResNet50 model `_ and make it available to the Palette container environment. To learn more about how to optimize a standard ResNet50 ONNX model, refer to this :ref:`link `.

      .. code-block:: console

         sima-user@sima-user-machine:~$ mkdir -p ~/workspace/resnet50
         sima-user@sima-user-machine:~$ mkdir -p ~/workspace/hostapp
         sima-user@sima-user-machine:~$ cp resnet50_mpk.tar.gz ~/workspace/resnet50/
         sima-user@sima-user-machine:~$ cp test_images.tar.gz ~/workspace/hostapp/

   .. tab:: Create an MPK

      .. note::
         The following commands are executed inside the Palette container environment.

      #. Optionally, depending on the input image resolution, edit the ``user_cfg.json`` file to modify the parameters shown below. The file is located at ``/usr/local/simaai/utils/mpk_parser/user_cfg.json``.

         .. code-block:: console

            sima-user@docker-image-id:/home/docker/sima-cli/resnet50$ vi /usr/local/simaai/utils/mpk_parser/user_cfg.json

            {
                "img_width": 1920,
                "img_height": 1080,
                "input_width": 1920,
                "input_height": 1080,
                "output_width": 224,
                "output_height": 224,
                "input_depth": 3,
                "keep_aspect": 0,
                "norm_channel_params": [[1.0, 0, 0.003921569], [1.0, 0, 0.003921569], [1.0, 0, 0.003921569]],
                "normalize": 1,
                "input_type": "RGB",
                "output_type": "BGR",
                "scaling_type": "INTER_LINEAR",
                "padding_type": "CENTER",
                "ibufname": "input"
            }

      #. Using the MPK parser tool, create an MPK file from the downloaded model ``resnet50_mpk.tar.gz``. Run the commands below inside the Palette container environment.

         .. code-block:: console

            sima-user@docker-image-id:/home/docker/sima-cli/resnet50$ mkdir -p /tmp/resnet50_mpk && tar -xzf resnet50_mpk.tar.gz -C /tmp/resnet50_mpk && rm /tmp/resnet50_mpk/preproc.json /tmp/resnet50_mpk/mla.json /tmp/resnet50_mpk/detess_dequant.json && tar -czf resnet50_mpk.tar.gz -C /tmp/resnet50_mpk . && rm -r /tmp/resnet50_mpk
            sima-user@docker-image-id:/home/docker/sima-cli/resnet50$ python3 /usr/local/simaai/utils/mpk_parser/m_parser.py -targz resnet50_mpk.tar.gz -project /usr/local/simaai/app_zoo/Gstreamer/CPP_API_TestPipeline -cfg /usr/local/simaai/utils/mpk_parser/user_cfg.json
            File preproc.json copied to /usr/local/simaai/app_zoo/Gstreamer/CPP_API_TestPipeline/plugins/pre_process/cfg
            File postproc.json copied to /usr/local/simaai/app_zoo/Gstreamer/CPP_API_TestPipeline/plugins/post_process/cfg
            File mla.json copied to /usr/local/simaai/app_zoo/Gstreamer/CPP_API_TestPipeline/plugins/process_mla/cfg
            File ./sima_temp/resnet50_stage1_mla.lm copied to /usr/local/simaai/app_zoo/Gstreamer/CPP_API_TestPipeline/plugins/process_mla/res/process.lm
            File ./sima_temp/resnet50_mpk.json copied to /usr/local/simaai/app_zoo/Gstreamer/CPP_API_TestPipeline/resources/mpk.json
            ℹ Compiling a65-apps...
            ✔ a65-apps compiled successfully.
            ℹ Compiling Plugins...
            ✔ Plugins Compiled successfully.
            ℹ Copying Resources...
            ✔ Resources Copied successfully.
            ℹ Building Rpm...
            ✔ Rpm built successfully.
            ℹ Creating mpk file...
            ✔ Mpk file created successfully at /home/docker/sima-cli/resnet50/project.mpk .
            MPK Created successfully.

      The generated ``project.mpk`` file will be used by the host side application later to deploy to the DevKit.

   .. tab:: Create a Host Side Application

      Once the ResNet50 MPK is created, the host application deploys it to the DevKit and runs the pipeline. The application performs the following tasks:

      #. Read the image → pre-process (resize, normalize) → MLSoC C++ synchronous inference → post-process (argmax) → overlay → save output.
      #. Save the output images in the output folder.

      Compile the example application on the host PC, outside the SDK Docker container. You can download the ResNet50 example :download:`resnet50_pcie_application.tar.xz `.

      .. note::
         The following commands are executed on the host side. To avoid permission conflicts with the Palette environment, create a separate project folder dedicated to the host application.

      .. code-block:: console

         sima-user@sima-user-machine:~/workspace/hostapp$ tar -xvf test_images.tar.xz
         sima-user@sima-user-machine:~/workspace/hostapp$ tar -xvf resnet50_pcie_application.tar.xz
         CMakeLists.txt
         imagenet1000_clsidx_to_labels.txt
         main.cpp
         resnet50_project.mpk
         sima-user@sima-user-machine:~/workspace/hostapp$ mkdir build && cd build
         sima-user@sima-user-machine:~/workspace/hostapp/build$ cmake ../
         -- The C compiler identification is GNU 11.4.0
         -- The CXX compiler identification is GNU 11.4.0
         -- Detecting C compiler ABI info
         -- Detecting C compiler ABI info - done
         -- Check for working C compiler: /usr/bin/cc - skipped
         -- Detecting C compile features
         -- Detecting C compile features - done
         -- Detecting CXX compiler ABI info
         -- Detecting CXX compiler ABI info - done
         -- Check for working CXX compiler: /usr/bin/c++ - skipped
         -- Detecting CXX compile features
         -- Detecting CXX compile features - done
         -- Found OpenCV: /usr/local (found version "4.6.0")
         -- Configuring done
         -- Generating done
         -- Build files have been written to: /home/sima-user/workspace/hostapp/build
         sima-user@sima-user-machine:~/workspace/hostapp/build$ make
         [ 50%] Building CXX object CMakeFiles/test_img.dir/main.cpp.o
         [100%] Linking CXX executable test_img
         [100%] Built target test_img

   .. tab:: Execute the Host Side Application

      .. code-block:: console

         sima-user@sima-user-machine:~/workspace/hostapp/build$ ./test_img ../../resnet50/project.mpk
         Directory created or already exists: ./../output
         SiMaDevicePtr for GUID : sima_mla_c0 is : 0x5627ab5bf9b0
         sima_send_mgmt_file: File Name: ../../resnet50/project.mpk
         sima_send_mgmt_file: File size 20184712
         sima_send_mgmt_file: Total Bytes sent 20184712 in 1 seconds
         Opening /dev/sima_mla_c0
         si_mla_create_data_queues: Data completion queue successfully created
         si_mla_create_data_queues: Data work queue successfully created
         si_mla_create_data_queues: Data receive queue successfully created
         loadModel is successful with modelPtr: 0x5627ab5c8b00
         runInferenceSynchronousloadModel_cppsdkpipeline
         modelPtr->inputShape.size:1
         modelPtr->inputShape: 224 224 3
         modelPtr->outputShape.size:1
         modelPtr->outputShape: 1 1 1000
         total Images:7
         ./../test_images/000000009448.jpg
         starting the run Synchronous Inference
         Time taken per iteration: 378 milliseconds
         Predicted label: 879: 'umbrella',
         Image with predicted label saved to: ./../output/000000009448.jpg
         ./../test_images/000000007784.jpg
         starting the run Synchronous Inference
         Time taken per iteration: 4 milliseconds
         Predicted label: 701: 'parachute, chute',
         Image with predicted label saved to: ./../output/000000007784.jpg
         ...
         ...
         ...

   .. tab:: Inside The Host Side App

      The application workflow includes the following steps:

      - Enumerating available SiMa devices
      - Loading and running inference models
      - Preprocessing images for inference
      - Handling the inference results and printing them

      .. dropdown:: Device Initialization
         :animate: fade-in
         :color: secondary
         :open:

         Before performing inference, the application initializes and `enumerates <../api_reference/pcie_host_apis/cpp_api_references.html#_CPPv4N6simaai12SimaMlsocApi20enumerateDeviceGuidsEv>`_ available SiMa MLSoC devices using the `SimaMlsocApi <../api_reference/pcie_host_apis/cpp_api_references.html#_CPPv4N6simaai12SimaMlsocApiE>`_ interface:

         .. code:: CPP

            shared_ptr<simaai::SimaMlsocApi> simaDeviceInst = simaai::SimaMlsocApi::getInstance();
            vector guids = simaDeviceInst->enumerateDeviceGuids();
            simaDeviceInst->setLogVerbosity(simaai::SiMaLogLevel::debug);

         For each device found, the application opens the device and prepares it for inference:

         .. code:: CPP

            shared_ptr SiMaDevicePtr = simaDeviceInst->openDevice(guids[i]);

      .. dropdown:: Model Loading
         :animate: fade-in
         :color: secondary
         :open:

         Once a device is initialized, the application loads a pre-trained model onto the MLSoC:

         .. code:: CPP

            simaai::SiMaModel model;
            std::vector in_shape{224,224,3};
            std::vector out_shape{1,1,1000};
            model.numInputTensors = 1;
            model.numOutputTensors = 1;
            model.outputBatchSize = 1;
            model.inputBatchSize = 1;
            model.inputShape.emplace_back(in_shape);
            model.outputShape.emplace_back(out_shape);
            shared_ptr<simaai::SiMaModel> modelPtr = simaDeviceInst->load(SiMaDevicePtr, model_path, model);

      .. dropdown:: Image Preprocessing
         :animate: fade-in
         :color: secondary
         :open:

         Because the ResNet50 model expects a 224x224 input, images are preprocessed before inference. Preprocessing includes resizing, normalization, and channel reordering:

         .. code:: CPP

            cv::Mat preprocess(const cv::Mat& input_image) {
                cv::Size target_size(224, 224);
                double width_ratio = static_cast<double>(target_size.width) / input_image.cols;
                double height_ratio = static_cast<double>(target_size.height) / input_image.rows;
                cv::Mat resized_image;
                // Scale by the smaller ratio to keep the aspect ratio, then pad to 224x224.
                if (width_ratio < height_ratio) {
                    int new_height = static_cast<int>(input_image.rows * width_ratio);
                    cv::resize(input_image, resized_image, cv::Size(target_size.width, new_height));
                    int top_padding = (target_size.height - new_height) / 2;
                    int bottom_padding = target_size.height - new_height - top_padding;
                    cv::copyMakeBorder(resized_image, resized_image, top_padding, bottom_padding, 0, 0, cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));
                } else {
                    int new_width = static_cast<int>(input_image.cols * height_ratio);
                    cv::resize(input_image, resized_image, cv::Size(new_width, target_size.height));
                    int left_padding = (target_size.width - new_width) / 2;
                    int right_padding = target_size.width - new_width - left_padding;
                    cv::copyMakeBorder(resized_image, resized_image, 0, 0, left_padding, right_padding, cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));
                }
                // Convert to float in [0, 1], then normalize per channel.
                resized_image.convertTo(resized_image, CV_32FC3, 1.0 / 255.0);
                cv::Scalar mean(0.485, 0.456, 0.406);
                cv::Scalar std_dev(0.229, 0.224, 0.225);
                cv::subtract(resized_image, mean, resized_image);
                cv::divide(resized_image, std_dev, resized_image);
                // Reorder channels from BGR to RGB.
                cv::cvtColor(resized_image, resized_image, cv::COLOR_BGR2RGB);
                return resized_image;
            }

      .. dropdown:: Inference Execution
         :animate: fade-in
         :color: secondary
         :open:

         The preprocessed image is loaded into an input tensor, and inference is executed `synchronously <../api_reference/pcie_host_apis/cpp_api_references.html#_CPPv4N6simaai12SimaMlsocApi14runSynchronousEK10shared_ptrI9SiMaModelERK14SiMaTensorListRK12SiMaMetaDataRK14SiMaTensorList>`_:

         .. code:: CPP

            memcpy(inputTensorsList[0].getPtr().get(), preprocessed_image.data, inputTensorsList[0].getSizeInBytes());
            simaai::SiMaErrorCode ret = simaDeviceInst->runSynchronous(modelPtr, inputTensorsList, metaData, outputTensorsList);
            if (ret != simaai::success) {
                cout << "runInference Failure" << endl;
            }

      .. dropdown:: Post-processing and Result Handling
         :animate: fade-in
         :color: secondary
         :open:

         The application extracts the classification result from the inference output tensor and saves the image with the predicted label:

         .. code:: CPP

            // Wrap the raw output buffer as a 1x1000 float matrix and take the argmax.
            cv::Mat output(1, 1000, CV_32FC1, (char*)outputTensorsList[0].getPtr().get());
            double max_val;
            cv::Point max_loc;
            cv::minMaxLoc(output, nullptr, &max_val, nullptr, &max_loc);
            int predicted_val = max_loc.x;
            if (predicted_val >= 0 && predicted_val < labels.size()) {
                std::string predicted_label = labels[predicted_val];
                writeTextToImage(image, image_path, predicted_label, outputFolder);
            }

      .. dropdown:: Model and Device Cleanup
         :animate: fade-in
         :color: secondary
         :open:

         After inference is complete, the model is unloaded and the device is closed:

         .. code:: CPP

            simaDeviceInst->unload(modelPtr);
            simaDeviceInst->closeDevice(SiMaDevicePtr);
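
The post-processing step described earlier picks the index of the largest of the 1000 class scores and maps it to a label. As a minimal sketch that depends on neither OpenCV nor the SiMa API, the same argmax can be written with the standard library alone. The helper names ``argmaxIndex`` and ``labelFor`` are hypothetical and are not part of the sample application; the raw ``float`` buffer stands in for the data held by the output tensor.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the largest score,
// or -1 if the buffer is null or empty. Ties resolve to the first maximum.
inline int argmaxIndex(const float* scores, std::size_t count) {
    if (scores == nullptr || count == 0) {
        return -1;
    }
    const float* best = std::max_element(scores, scores + count);
    return static_cast<int>(std::distance(scores, best));
}

// Hypothetical helper: bounds-checked label lookup.
// Returns "unknown" when the index does not map to a label.
inline std::string labelFor(int index, const std::vector<std::string>& labels) {
    if (index < 0 || static_cast<std::size_t>(index) >= labels.size()) {
        return "unknown";
    }
    return labels[static_cast<std::size_t>(index)];
}
```

In the sample application, the score pointer would presumably come from the output tensor buffer with ``count`` set to 1000, and ``labels`` would hold the lines of ``imagenet1000_clsidx_to_labels.txt``.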