Model Deployment

Overview

After compilation, models need to be deployed to the Modalix device for execution. The ModelSDK container provides the llima-deploy utility to streamline this process:

sima-user@docker-image-id:/home/docker$ llima-deploy <source_directory> <destination_directory>

Where:

source_directory - Path to the compiled model directory (contains sima_files/ with devkit/ and mpk/ subdirectories)
destination_directory - Target directory: either a remote path on the Modalix device in rsync form (user@host:/path) or a local path for later manual transfer

When you run this command, the deployment tool performs three key steps:

Validates that the source directory contains required files (sima_files/devkit/ and sima_files/mpk/)
Extracts ELF files from MPK archives (*.tar.gz)
Syncs the following to the destination using rsync:
- devkit/ - Runtime orchestration files
- elf_files/ - Extracted binary files

The tool uses rsync internally for efficient file transfer and will skip files that are already up-to-date.
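
The extraction and sync steps can be approximated by hand, which may help when debugging a deployment. The following is a minimal sketch, not the tool's actual implementation: the function names, the staging directory, and the assumption that ELF members inside the archives end in .elf are all illustrative. Only the rsync -aP flags are taken from this page.

```shell
# Step 2 (sketch): copy ELF members out of every MPK archive in mpk_dir.
# Assumption: ELF binaries inside the archives have names ending in ".elf".
extract_elfs() {
    mpk_dir=$1
    elf_dir=$2
    mkdir -p "$elf_dir" || return 1
    for archive in "$mpk_dir"/*.tar.gz; do
        # Skip the unexpanded pattern when no archive matches
        [ -e "$archive" ] || continue
        # List the archive, keep only *.elf members, extract each one
        tar -tzf "$archive" | grep '\.elf$' | while IFS= read -r member; do
            tar -xzf "$archive" -C "$elf_dir" "$member"
        done
    done
}

# Step 3 (sketch): sync devkit/ and the extracted ELF directory to dest.
# Sources have no trailing slash, so rsync creates devkit/ and the ELF
# directory (by its own name) under dest.
sync_to_dest() {
    src=$1
    elf_dir=$2
    dest=$3
    rsync -aP "$src/sima_files/devkit" "$elf_dir" "$dest/"
}
```

For example, extract_elfs Llama-3.2-3B-Instruct_out/sima_files/mpk staging/elf_files followed by sync_to_dest Llama-3.2-3B-Instruct_out staging/elf_files llama3_2 would produce llama3_2/devkit/ and llama3_2/elf_files/.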

Requirements

Warning

rsync must be installed before deployment!

The deployment tool will not work without rsync installed in your ModelSDK container. Please complete the installation steps below before attempting to deploy any models.

The deployment tool depends on rsync for file synchronization. To install it inside the ModelSDK container, run:

$ sudo apt-get update
$ sudo apt-get install rsync

To verify rsync is installed correctly, run:

$ rsync --version
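
If deployment is driven from a script, the same check can be made before llima-deploy is called. This is a small sketch; require_tool is a hypothetical helper name, not part of the ModelSDK.

```shell
# Return 0 if the named command is on PATH, 1 otherwise.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' is not installed in this container" >&2
    return 1
}

# Typical use at the top of a deployment script:
#   require_tool rsync || exit 1
```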

Deployment Workflow

After compiling your model with llima-compile, you’ll have a directory structure like:

Llama-3.2-3B-Instruct_out/
├── onnx_files/
└── sima_files/
    ├── devkit/
    └── mpk/

To deploy this to the Modalix device, you have two options:

Option A: Direct deployment to Modalix device

If your host machine has network access to the Modalix device:

sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out sima@192.168.1.20:/media/nvme/llima/llama3_2

Option B: Deploy to local directory for manual transfer

If the device is not reachable from the container, deploy to a local directory first, then copy that directory to the device:

sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out llama3_2
sima-user@docker-image-id:/home/docker$ scp -r llama3_2 sima@192.168.1.20:/media/nvme/llima/

Note

Replace 192.168.1.20 with the actual IP address of your Modalix device if it has been changed from this address.
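
To avoid editing the address in every command, the destination can be assembled from variables. The sketch below is one possible approach; the DEVICE_USER, DEVICE_IP, and DEVICE_ROOT variables and the remote_destination helper are hypothetical names, with defaults copied from the examples in this section.

```shell
# Hypothetical settings; override them in the environment as needed.
DEVICE_USER=${DEVICE_USER:-sima}
DEVICE_IP=${DEVICE_IP:-192.168.1.20}
DEVICE_ROOT=${DEVICE_ROOT:-/media/nvme/llima}

# Print an rsync-style destination for the given model directory name.
remote_destination() {
    printf '%s@%s:%s/%s\n' "$DEVICE_USER" "$DEVICE_IP" "$DEVICE_ROOT" "$1"
}

# Example:
#   llima-deploy Llama-3.2-3B-Instruct_out "$(remote_destination llama3_2)"
```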

Once deployed, SSH into the Modalix device to run the model.

Troubleshooting

Error: “devkit directory cannot be found”

Ensure the source directory is the output directory from llima-compile. It should contain a sima_files/ subdirectory with devkit/ inside it.

Error: “mpk directory cannot be found”

Verify that compilation completed successfully. The sima_files/mpk/ directory should contain .tar.gz files.
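
Both errors can be caught before deployment with a quick layout check. The function below is a minimal sketch of such a check, based only on the directory layout described on this page; check_compiled_model and its return codes are hypothetical and do not reproduce the tool's own validation or messages.

```shell
# Check that a compiled model directory has the layout llima-deploy expects.
# Returns 0 if OK, 1 if devkit/ is missing, 2 if mpk/ is missing,
# 3 if mpk/ contains no .tar.gz archive.
check_compiled_model() {
    src=$1
    if [ ! -d "$src/sima_files/devkit" ]; then
        echo "missing: $src/sima_files/devkit" >&2
        return 1
    fi
    if [ ! -d "$src/sima_files/mpk" ]; then
        echo "missing: $src/sima_files/mpk" >&2
        return 2
    fi
    # Succeed as soon as one MPK archive is found
    for archive in "$src"/sima_files/mpk/*.tar.gz; do
        [ -e "$archive" ] && return 0
    done
    echo "no .tar.gz archives in $src/sima_files/mpk" >&2
    return 3
}
```

For example, check_compiled_model Llama-3.2-3B-Instruct_out && llima-deploy Llama-3.2-3B-Instruct_out llama3_2 runs the deployment only when the layout looks complete.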

Slow deployment

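The tool uses rsync -aP by default, which does not compress data in transit. On a slow network link, deploying to a local directory (Option B) and transferring it manually with rsync -azP may help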
Deploy to NVMe storage on Modalix for faster model loading
Consider using --resume during compilation so that only changed files need to be deployed