Introduction to LLiMa

Overview

The GenAI Model Compilation feature streamlines compiling GenAI models from the Hugging Face (HF) safetensors or GGUF model formats. For a wide range of models from Hugging Face, such as Llama, Gemma, Phi, Qwen, or Mistral, the SDK automatically generates all required binary/ELF files along with the Python orchestration script, enabling direct execution on the SiMa.ai Modalix platform.
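
For example, a checkpoint in safetensors format for one of the supported architectures can be fetched from Hugging Face with the huggingface_hub library before compilation. The snippet below is an illustrative sketch only: the repository ID and target directory are placeholders, and gated repositories additionally require a Hugging Face access token.

# Sketch only: download a supported model in safetensors format from Hugging Face.
# The repo_id and local_dir values are illustrative placeholders, not SDK defaults.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="meta-llama/Llama-3.2-1B-Instruct",  # any supported architecture listed below
    local_dir="./llama-3.2-1b",                  # directory later passed to the compiler
    allow_patterns=[
        "*.safetensors",
        "config.json",
        "tokenizer.json",
        "tokenizer_config.json",
        "generation_config.json",
    ],
)
print(f"Model files downloaded to: {local_path}")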

To get started quickly, SiMa.ai has precompiled several popular LLMs and published them on Hugging Face. You can download and run these models immediately using the following commands:

modalix:~$ cd /media/nvme && mkdir llima && cd llima
modalix:~$ sima-cli install -v 2.0.0 samples/llima -t select

Wait until the installation completes, then run:

modalix:~$ cd simaai-genai-demo && ./run.sh

This command prompts you to select and download a specific precompiled model for evaluating the SiMa.ai Modalix platform. More information can be found in the LLiMa demo application.

Supported Models

The following table shows the supported model architectures and their capabilities:

Model Architecture    Type    Supported Sizes
Llama 2               LLM     7b
Llama 3.1             LLM     8b
Llama 3.2             LLM     1b, 3b
Gemma 1               LLM     2b, 7b
Gemma 2               LLM     2b, 9b
Gemma 3               LLM     1b, 4b
Phi 3.5 mini          LLM     3.8b
Qwen 2.5              LLM     0.5b, 1.5b, 3b, 7b
Qwen 3                LLM     0.6b, 1.7b, 4b, 8b
Mistral 1             LLM     7b
Llava 1.5             VLM     7b
PaliGemma             VLM     3b
Gemma 3               VLM     4b

Limitations

Limitation Type      Description
Model Architecture   Only models based on the architectures listed above are supported.
Model Parameters     Only models with a parameter count of less than 10B are supported.
HF Models            Models must be downloaded from Hugging Face and contain config.json, tokenizer.json, tokenizer_config.json, generation_config.json, and weights in safetensors format.
GGUF Models          GGUF format is supported for LLMs only; VLMs must be compiled from the Hugging Face safetensors format.
Gemma 3 VLM          Supported with a modified SigLIP 448 vision encoder.
Llama 3.2 Vision     Vision models are not supported.
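
The HF Models and Model Parameters limitations can be checked before invoking the compiler. The sketch below is an illustrative pre-flight check, not part of the SDK: it assumes a locally downloaded Hugging Face checkpoint directory, verifies that the required configuration and tokenizer files are present, and sums the tensor shapes across the safetensors shards to confirm the parameter count stays under 10B.

# Sketch only: pre-flight check for the HF Models and Model Parameters limitations.
# model_dir is a placeholder path to a locally downloaded Hugging Face checkpoint.
import glob, math, os
from safetensors import safe_open

model_dir = "./llama-3.2-1b"
required = ["config.json", "tokenizer.json", "tokenizer_config.json", "generation_config.json"]

missing = [f for f in required if not os.path.isfile(os.path.join(model_dir, f))]
if missing:
    raise SystemExit(f"Missing required files: {missing}")

shards = glob.glob(os.path.join(model_dir, "*.safetensors"))
if not shards:
    raise SystemExit("No safetensors weight files found.")

# Sum parameter counts across all shards without loading the weights into memory.
total = 0
for shard in shards:
    with safe_open(shard, framework="pt") as f:
        for name in f.keys():
            total += math.prod(f.get_slice(name).get_shape())

print(f"Total parameters: {total / 1e9:.2f}B")
if total >= 10e9:
    raise SystemExit("Model exceeds the 10B parameter limit.")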