Introduction to LLiMa
Overview
The GenAI Model Compilation feature streamlines compiling GenAI models from the Hugging Face safetensors or GGUF model formats. For a wide range of model families such as Llama, Gemma, Phi, Qwen, and Mistral from Hugging Face, the SDK automatically generates all required binary/ELF files along with a Python orchestration script, enabling direct execution on the Sima.ai Modalix platform.
To get started quickly, Sima has precompiled several popular LLMs and published them on Hugging Face. You can download and run these models immediately using the following commands:
modalix:~$ cd /media/nvme && mkdir llima && cd llima
modalix:~$ sima-cli install -v 2.0.0 samples/llima -t select
Wait until the installation completes, then run:
modalix:~$ cd simaai-genai-demo && ./run.sh
This script prompts you to select and download a specific precompiled model for evaluating the Sima.ai Modalix platform. More information can be found in the LLiMa demo application.
Supported Models
The following table shows the supported model architectures and their capabilities:
| Model Architecture | Type | Supported Sizes |
|---|---|---|
|  | LLM |  |
|  | LLM |  |
|  | LLM | 1b, 3b |
|  | LLM | 2b, 7b |
|  | LLM | 2b, 9b |
|  | LLM |  |
|  | LLM |  |
|  | LLM |  |
|  | LLM |  |
|  | LLM |  |
|  | VLM |  |
|  | VLM |  |
|  | VLM |  |
Limitations
| Limitation Type | Description |
|---|---|
| Model Architecture | Only models based on the architectures listed above are supported. |
| Model Parameters | Only models with a parameter count of less than 10B are supported. |
| HF Models | Models must be downloaded from Hugging Face and contain: |
| GGUF Models | GGUF format is supported for LLMs only; VLMs must be compiled from the Hugging Face safetensors format. |
| Gemma3 VLM | Supported with a modified SigLip 448 vision encoder. |
| LLAMA 3.2 Vision | Vision models are not supported. |
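The format and parameter-count limitations above can be sanity-checked locally before attempting a compile. The sketch below is illustrative and not part of the SDK; it relies only on the published file layouts (a GGUF file begins with the ASCII magic `GGUF`, and a `.safetensors` file begins with an 8-byte little-endian header length followed by a JSON header recording each tensor's shape). The helper names are hypothetical.

```python
import json
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def is_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


def safetensors_param_count(path):
    """Sum tensor element counts from a .safetensors header.

    Layout: 8-byte little-endian header length, then a JSON header
    mapping tensor names to {"dtype", "shape", "data_offsets"}.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    total = 0
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, has no shape
            continue
        count = 1
        for dim in info["shape"]:
            count *= dim
        total += count
    return total


def within_modalix_limit(param_count, limit=10_000_000_000):
    """Check the <10B parameter limitation quoted above."""
    return param_count < limit
```

For a sharded checkpoint, you would sum `safetensors_param_count` over every `.safetensors` shard before applying the limit check.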