ModelSDK

ModelSDK is an integral component of the Palette software. Developers can use the ModelSDK to prepare machine learning models for deployment on the MLSoC. The process of preparing a model includes converting it to use lower-precision data types on which the MLSoC can compute much more efficiently. Developers have several options to do this conversion, depending on the computational performance and numerical accuracy they want their model to attain. Developers can also use the ModelSDK to execute, inspect, and analyze models as part of their development process.

For optimal ML performance, the ModelSDK supports two types of quantization: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).

For most models, PTQ provides an efficient and straightforward way to reduce model size and improve inference speed with minimal accuracy loss. If PTQ does not provide the required accuracy, QAT allows a model to be trained with quantization constraints, fine-tuning the hyper-parameters for better accuracy on the SiMa MLSoC.
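To make the idea of converting to a lower-precision data type concrete, the following is a minimal, generic sketch of symmetric 8-bit quantization in plain Python. It does not use the ModelSDK API and is not how the ModelSDK implements quantization. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen only for illustration.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Generic illustration only; not a ModelSDK function.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid division by zero when every value is 0.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one tensor of a trained model.
weights = [0.82, -1.27, 0.05, 0.0, 0.64]

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("quantized:", q_weights)
print("scale:", scale)
print("max round-trip error:", max_error)
```

The round-trip error printed at the end is the kind of numerical accuracy cost that has to be weighed against the gain in computational performance. When that cost is too high after quantizing a trained model, training with quantization constraints is the alternative described above.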

The ModelSDK also supports deploying a model without quantization, using the arm_only option in the PTQ workflow. While this is not recommended for production, it can be used during development for examining how quantization affects a pipeline’s performance.

While the SiMa toolchain simplifies deployment, certain ONNX operators, custom layers, or large models may need additional adjustments. Graph Surgery tools help adapt such models for full compatibility with the SiMa MLSoC.

Note

For cases where additional customization is needed, SiMa.ai Professional Services can assist with advanced optimizations to ensure seamless deployment. Our team can help fine-tune complex models, resolve compatibility challenges, and maximize performance while keeping project timelines efficient.