GenAI Model Compilation

Introduction

The GenAI Model Compilation feature streamlines the process of compiling GenAI models. For a select set of Llama, Llava, Gemma, and PaliGemma models from Hugging Face, the SDK automatically generates all required .elf files along with the Python orchestration script, enabling direct execution on the SiMa.ai Modalix platform.

SiMa.ai has precompiled several popular LLMs and published them on Hugging Face. The developer can download these models using the following commands and explore them in the LLiMa demo application.

modalix:~$ cd /media/nvme && mkdir llima && cd llima
modalix:~$ sima-cli install -v 1.7.0 samples/llima -t select

Wait until the installation completes, then run:

modalix:~$ cd simaai-genai-demo && ./run.sh

This command prompts the developer to select and download a specific precompiled model for evaluating the SiMa.ai Modalix platform. To compile and deploy a custom model on the Modalix platform, continue reading.

Supported Models

The following table shows the supported model architectures and their capabilities:

Model Architecture    Type                              Supported Versions
------------------    ------------------------------    ------------------
LLAVA                 Multimodal (Vision + Language)    1, 2
PaliGEMMA             Multimodal (Vision + Language)    1, 2
LLAMA                 Language Only                     2, 3
GEMMA                 Language Only                     1, 2, 3
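
The architecture family of a Hugging Face checkpoint can be read from the model_type field in its config.json. The sketch below is illustrative only; the table above is authoritative, and the model_type strings are the identifiers Hugging Face checkpoints commonly use for these families (an assumed mapping, not an SDK API):

import json
from pathlib import Path

# model_type strings commonly used by Hugging Face checkpoints for the
# supported families (assumed mapping; the table above is authoritative).
SUPPORTED_TYPES = {"llava", "paligemma", "llama", "gemma", "gemma2", "gemma3", "gemma3_text"}

def check_architecture(model_dir: str) -> str:
    """Read config.json and verify the checkpoint's architecture family."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    model_type = config.get("model_type", "unknown")
    if model_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported architecture: {model_type}")
    return model_type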

Limitations

Limitation Type        Description
---------------        -----------
Model Configuration    Only default configurations are supported.
Model Parameters       Only models with fewer than 10B parameters are supported.
Model Files            Models must be downloaded from Hugging Face and contain config.json, tokenizer.model, and weights in safetensors format.
Gemma3 VLM             Supported as language-only models (vision capabilities disabled).
LLAMA 3.2 Vision       Vision models are not supported.
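
Before launching a long compilation, it can be worth confirming that a downloaded checkpoint satisfies the file requirements above. A minimal sketch in plain Python (no SDK calls; check_model_dir is a hypothetical helper name):

from pathlib import Path

def check_model_dir(model_dir: str) -> None:
    """Verify a downloaded checkpoint contains the files listed above."""
    root = Path(model_dir)
    missing = [name for name in ("config.json", "tokenizer.model")
               if not (root / name).is_file()]
    if not any(root.glob("*.safetensors")):
        missing.append("weights in safetensors format")
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")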

System Requirements

The Palette SDK must be installed on a machine that meets the following requirements.

Parameter           Description
---------           -----------
Operating System    Ubuntu 22.04 LTS
Memory              128GB or more is recommended
Storage             1TB available space

Note

On a 128GB machine, compilation can take several hours to complete, depending on the type of model. 64GB may be sufficient for models that do not have vision capabilities.
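
Since compilation is memory-bound, a quick pre-flight check of total RAM can avoid a failed multi-hour run. A minimal sketch (Linux only; the 128GB threshold comes from the table above):

import os

# Total physical memory in GB (Linux); compare against the recommendation above.
total_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
if total_gb < 128:
    print(f"Warning: {total_gb:.0f}GB RAM detected; 128GB or more is recommended.")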

Prerequisites

  • Ensure that the latest sima-cli version is installed in the Palette SDK.

  • Have a valid Developer Portal account to download assets from docs.sima.ai.

  • Have a valid Hugging Face account to download open-source models.

  • Some models, such as google/paligemma, require accepting a license agreement on Hugging Face. Make sure to review and accept the license before attempting to download these models.

  • Authorize the CLI to access Hugging Face using a user access token and huggingface-cli, as shown in the sketch after this list. Note that installing sima-cli automatically installs huggingface-cli.
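
Authorization can also be done programmatically through the huggingface_hub package that underlies huggingface-cli; a minimal sketch (the token value is a placeholder):

from huggingface_hub import login

# Paste a user access token created at https://huggingface.co/settings/tokens.
# Interactive alternative from the shell: huggingface-cli login
login(token="hf_xxxxxxxxxxxxxxxx")  # placeholder; use your own token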

Sample Code

Download the sample and the google/paligemma model with the following commands in the Palette SDK:

sima-user@docker-image-id:~$ cd /home/docker/sima-cli && mkdir genai && cd genai
sima-user@docker-image-id:~$ sima-cli install -v 1.7.0 samples/vlm-codegen
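
If the model is not already in the local Hugging Face cache, it can be fetched ahead of time with huggingface_hub; a minimal sketch (the repo id shown is one of the public PaliGemma checkpoints and is only an example):

from huggingface_hub import snapshot_download

# Downloads config.json, tokenizer.model, and the safetensors weights into the
# local Hugging Face cache; requires the PaliGemma license to be accepted first.
local_path = snapshot_download(repo_id="google/paligemma-3b-pt-224")
print(f"Model cached at: {local_path}")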

The sima_utils.transformer.model package provides everything needed to take open-source models from Hugging Face and run them efficiently on the SiMa.ai Modalix platform.

The sample script tut_auto_llm.py demonstrates a complete workflow for compiling and evaluating a Vision Language Model. It covers:

  • Loading the downloaded model from the local Hugging Face cache.

  • Calibration and quantization of the model.

  • Compilation into the required .elf files and Python orchestration script.

  • Evaluation of the compiled model to validate correctness.

This workflow allows developers to go from a cached Hugging Face model to SiMa-ready binaries and validate correctness before deploying to the DevKit.

After a successful compilation, the console will display output confirming that the model artifacts have been generated.

   sima-user@docker-image-id:/home/docker/sima-cli/genai$ python tut_auto_llm.py

   Calibration Progress: |██████████████████████████████| 100.0% 1|1 Complete.  1/1
   Running Calibration ...DONE
   Running quantization ...DONE