Model Deployment
Overview
After compilation, models need to be deployed to the Modalix device for execution. The ModelSDK container provides the llima-deploy utility to streamline this process:
sima-user@docker-image-id:/home/docker$ llima-deploy <source_directory> <destination_directory>
Where:
source_directory - Path to the compiled model directory (contains sima_files/ with devkit/, mpk/, and optionally npy_files/ subdirectories)
destination_directory - Target directory on the Modalix device (or local path for rsync deployment)
When you run this command, the deployment tool performs three key steps:
1. Validates that the source directory contains required files (sima_files/devkit/ and sima_files/mpk/)
2. Extracts ELF files from MPK archives (*.tar.gz)
3. Syncs the following to the destination using rsync:
   - devkit/ - Runtime orchestration files
   - elf_files/ - Extracted binary files
   - npy_files/ - LoRA adapter weights (automatically included if present)
The tool uses rsync internally for efficient file transfer and will skip files that are already up-to-date.
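The three steps above can be sketched roughly as the shell functions below. This is an illustration only, not the llima-deploy implementation: the function names (validate_source, extract_elfs, sync_to_dest) are hypothetical, and the sketch assumes that ELF binaries inside the MPK archives carry a .elf extension.

```shell
# Illustrative sketch only; NOT the actual llima-deploy implementation.

# Step 1: check that the compiled output has the required subdirectories.
validate_source() {
    src="$1"
    for sub in devkit mpk; do
        if [ ! -d "$src/sima_files/$sub" ]; then
            echo "$sub directory cannot be found" >&2
            return 1
        fi
    done
}

# Step 2: unpack each MPK archive and collect the ELF files it contains.
# Assumption: ELF binaries inside the archives end in ".elf".
extract_elfs() {
    src="$1"
    out="$2"
    mkdir -p "$out"
    for archive in "$src"/sima_files/mpk/*.tar.gz; do
        [ -e "$archive" ] || continue   # glob matched nothing
        tmp="$(mktemp -d)"
        tar -xzf "$archive" -C "$tmp"
        find "$tmp" -type f -name '*.elf' -exec cp {} "$out"/ \;
        rm -rf "$tmp"
    done
}

# Step 3: sync devkit/, elf_files/, and (if present) npy_files/.
# rsync -aP skips files that are already up to date at the destination.
sync_to_dest() {
    src="$1"
    elf_dir="$2"
    dest="$3"
    rsync -aP "$src/sima_files/devkit" "$elf_dir" "$dest"/
    if [ -d "$src/sima_files/npy_files" ]; then
        rsync -aP "$src/sima_files/npy_files" "$dest"/
    fi
}
```

In normal use, llima-deploy performs all of this for you; the sketch is only meant to make it clearer what ends up in the destination directory and why a missing devkit/ or mpk/ directory stops the deployment early.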
Deployment Workflow
After compiling your model with llima-compile, you’ll have a directory structure like:
Llama-3.2-3B-Instruct_out/
├── onnx_files/
└── sima_files/
    ├── devkit/
    └── mpk/
To deploy this to the Modalix device, you have two options:
Option A: Direct deployment to Modalix device
If your host machine has network access to the Modalix device:
sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out sima@192.168.1.20:/media/nvme/llima/llama3_2
Option B: Deploy to local directory for manual transfer
sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out llama3_2
sima-user@docker-image-id:/home/docker$ scp -r llama3_2 sima@192.168.1.20:/media/nvme/llima/
Note
Replace 192.168.1.20 with the actual IP address of your Modalix device if it has been changed.
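If the device address changes often, it may help to keep it in one place rather than editing each command. The helper below is a minimal sketch: deploy_target and the DEVICE_IP variable are hypothetical names, not part of the ModelSDK, and the user name and base path are simply the ones used in the examples above.

```shell
# Hypothetical helper: build the remote destination string for llima-deploy.
# DEVICE_IP is an assumed environment variable; it falls back to the
# address used in this guide when unset.
deploy_target() {
    model_dir="$1"
    echo "sima@${DEVICE_IP:-192.168.1.20}:/media/nvme/llima/${model_dir}"
}

# Usage (run inside the ModelSDK container):
#   llima-deploy Llama-3.2-3B-Instruct_out "$(deploy_target llama3_2)"
```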
Once deployed, SSH into the Modalix device:
modalix:~$ ssh sima@192.168.1.20
Then run the model using the llima CLI or the full GenAI demo script (see Runtime & Orchestration):
modalix:~$ llima run <model_name>
modalix:/media/nvme/llima$ ./run.sh
Troubleshooting
Error: “devkit directory cannot be found”
Ensure the source directory is the output directory from llima-compile, which should contain a sima_files/ subdirectory.
Error: “mpk directory cannot be found”
Verify that compilation completed successfully. The sima_files/mpk/ directory should contain .tar.gz files.
Slow deployment
- Use rsync with compression: the tool uses rsync -aP by default
- Deploy to NVMe storage on Modalix for faster model loading
- Consider deploying only changed files using --resume during compilation
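For the manual-transfer route (Option B), one possible way to apply the compression tip is to replace scp with rsync and add the -z flag, which compresses data in transit. The sketch below assumes rsync is available on both the host and the Modalix device; sync_compressed and DRY_RUN are hypothetical names introduced here for illustration and are not options of llima-deploy.

```shell
# Hypothetical helper: copy a locally staged deployment to the device
# with compression enabled.
#   -a  archive mode (preserve permissions, times, directory structure)
#   -z  compress file data during transfer
#   -P  show progress and keep partially transferred files
sync_compressed() {
    staged="$1"
    dest="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "rsync -azP $staged $dest"
    else
        rsync -azP "$staged" "$dest"
    fi
}

# Usage:
#   sync_compressed llama3_2 sima@192.168.1.20:/media/nvme/llima/
```

Compression tends to help most on slow links; on a fast local network the extra CPU cost may outweigh the savings, so it is worth timing both variants.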