.. _Model Deployment:

Model Deployment
================

Overview
--------

After compilation, models need to be deployed to the Modalix device for execution. The ModelSDK container provides the ``llima-deploy`` utility to streamline this process:

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ llima-deploy <source_directory> <destination_directory>

Where:

- ``source_directory`` - Path to the compiled model directory (contains ``sima_files/`` with ``devkit/`` and ``mpk/`` subdirectories)
- ``destination_directory`` - Target directory on the Modalix device (or a local path for later manual transfer)

When you run this command, the deployment tool performs three key steps:

1. **Validates** that the source directory contains the required directories (``sima_files/devkit/`` and ``sima_files/mpk/``)
2. **Extracts** ELF files from the MPK archives (``*.tar.gz``)
3. **Syncs** the following to the destination using ``rsync``:

   - ``devkit/`` - Runtime orchestration files
   - ``elf_files/`` - Extracted binary files

Because the tool uses ``rsync`` for the transfer, files that are already up to date at the destination are skipped.

Requirements
------------

.. warning::

   **rsync must be installed before deployment!**

   The deployment tool will not work without ``rsync`` installed in your ModelSDK container. Complete the installation steps below before attempting to deploy any models.

Inside the **ModelSDK container**, run:

.. code-block:: console

   $ sudo apt-get update
   $ sudo apt-get install rsync

To verify that ``rsync`` is installed correctly, run:

.. code-block:: console

   $ rsync --version

Deployment Workflow
-------------------

After compiling your model with ``llima-compile``, you'll have a directory structure like:

.. code-block:: text

   Llama-3.2-3B-Instruct_out/
   ├── onnx_files/
   └── sima_files/
       ├── devkit/
       └── mpk/

To deploy this to the Modalix device, you have two options:

**Option A: Direct deployment to Modalix device**

If your host machine has network access to the Modalix device:

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out sima@192.168.1.20:/media/nvme/llima/llama3_2

**Option B: Deploy to local directory for manual transfer**

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out llama3_2
   sima-user@docker-image-id:/home/docker$ scp -r llama3_2 sima@192.168.1.20:/media/nvme/llima/

.. note::

   Replace ``192.168.1.20`` with the actual IP address of your Modalix device if it has been changed.

Once deployed, SSH into the Modalix device and run the model:

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ ssh sima@192.168.1.20
   modalix:~$ cd /media/nvme/llima/simaai-genai-demo
   modalix:~$ ./run.sh

Troubleshooting
---------------

**Error: "devkit directory cannot be found"**

Ensure the source directory is the output directory from ``llima-compile``, which should contain a ``sima_files/`` subdirectory.

**Error: "mpk directory cannot be found"**

Verify that compilation completed successfully. The ``sima_files/mpk/`` directory should contain ``.tar.gz`` files.

**Slow deployment**

- The tool uses ``rsync -aP`` by default (archive mode with progress output and kept partial transfers; compression is not enabled by these flags)
- Deploy to NVMe storage on Modalix for faster model loading
- Consider deploying only changed files using ``--resume`` during compilation
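
**Checking the source directory before deploying**

Both "cannot be found" errors above come down to the layout of the compiled model directory, so it can help to check that layout before running ``llima-deploy``. The following is a minimal sketch, not part of the ModelSDK: ``check_compiled_dir`` is a hypothetical helper name, and it only tests for the directories and ``.tar.gz`` archives that this page says the tool validates.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not shipped with the ModelSDK).
# It checks the layout this page describes for llima-compile output:
#   <source>/sima_files/devkit/
#   <source>/sima_files/mpk/*.tar.gz

check_compiled_dir() {
    local src="$1"
    local status=0

    # Runtime orchestration files.
    if [ ! -d "$src/sima_files/devkit" ]; then
        echo "missing: $src/sima_files/devkit"
        status=1
    fi

    # MPK archives from which the ELF files are extracted.
    if [ ! -d "$src/sima_files/mpk" ]; then
        echo "missing: $src/sima_files/mpk"
        status=1
    elif ! ls "$src"/sima_files/mpk/*.tar.gz >/dev/null 2>&1; then
        echo "missing: no .tar.gz archives in $src/sima_files/mpk"
        status=1
    fi

    # 0 when the layout looks complete, 1 otherwise.
    return "$status"
}
```

After sourcing the snippet, ``check_compiled_dir Llama-3.2-3B-Instruct_out`` prints one line per missing item and returns a non-zero status if anything is absent, so it can gate the deploy command with ``&&``.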