.. _Model Deployment:

Model Deployment
================

Overview
--------

After compilation, models need to be deployed to the Modalix device for execution. The ModelSDK container provides the ``llima-deploy`` utility to streamline this process:

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ llima-deploy <source_directory> <destination_directory>

Where:

- ``source_directory`` - Path to the compiled model directory (contains ``sima_files/`` with ``devkit/`` and ``mpk/`` subdirectories)
- ``destination_directory`` - Target directory: either a remote path on the Modalix device (``user@host:/path``) or a local path for later manual transfer

When you run this command, the deployment tool performs three key steps:

1. **Validates** that the source directory contains required files (``sima_files/devkit/`` and ``sima_files/mpk/``)
2. **Extracts** ELF files from MPK archives (``*.tar.gz``)
3. **Syncs** the following to the destination using ``rsync``:

   - ``devkit/`` - Runtime orchestration files
   - ``elf_files/`` - Extracted binary files

Because the transfer is done with ``rsync``, files that are already up to date at the destination are skipped.
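
The three steps above can be approximated by hand with standard tools. The following is a minimal sketch, not the actual implementation of ``llima-deploy``: the function names, the ``*.elf`` file pattern, and the staging layout are assumptions made for illustration.

.. code-block:: bash

   # Hypothetical sketch of the validate / extract / sync steps.
   # This is NOT the llima-deploy source; names and the *.elf pattern are assumptions.

   # Step 1: check that the compiled output has the expected layout.
   validate_source() {
       local src="$1" sub
       for sub in devkit mpk; do
           if [ ! -d "$src/sima_files/$sub" ]; then
               echo "error: $src/sima_files/$sub not found" >&2
               return 1
           fi
       done
   }

   # Step 2: unpack every MPK archive and collect the ELF files it contains.
   extract_elfs() {
       local src="$1" staging="$2" archive tmp
       mkdir -p "$staging/elf_files"
       for archive in "$src"/sima_files/mpk/*.tar.gz; do
           [ -e "$archive" ] || continue   # the glob matched no archives
           tmp="$(mktemp -d)"
           tar -xzf "$archive" -C "$tmp"
           # Assumes ELF binaries carry an .elf extension.
           find "$tmp" -type f -name '*.elf' -exec cp {} "$staging/elf_files/" \;
           rm -rf "$tmp"
       done
   }

   # Step 3: sync devkit/ and elf_files/ to the destination
   # (a local path or a remote user@host:path).
   sync_to_destination() {
       local src="$1" staging="$2" dest="$3"
       rsync -aP "$src/sima_files/devkit" "$staging/elf_files" "$dest/"
   }

Calling the three functions in order on a compiled output directory should leave ``devkit/`` and ``elf_files/`` under the destination, which is the layout described above.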

Requirements
------------

.. warning::

   **rsync must be installed before deployment!**
   
   The deployment tool will not work without ``rsync`` installed in your ModelSDK container. Please complete the installation steps below before attempting to deploy any models.

To install ``rsync``, run the following inside the **ModelSDK container**:

.. code-block:: console

   $ sudo apt-get update
   $ sudo apt-get install rsync


To verify that ``rsync`` is installed correctly, run:

.. code-block:: console

   $ rsync --version
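
In automation scripts, the same check can be made without reading the version output. A minimal sketch, where ``require_tool`` is a hypothetical helper name and not part of the ModelSDK:

.. code-block:: bash

   # Hypothetical helper: fail early if a required command is missing.
   require_tool() {
       if ! command -v "$1" >/dev/null 2>&1; then
           echo "error: '$1' is not installed" >&2
           return 1
       fi
   }

   # Print an installation hint if rsync is not on the PATH.
   require_tool rsync || echo "install rsync with: sudo apt-get install rsync"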

Deployment Workflow
-------------------

After compiling your model with ``llima-compile``, you'll have a directory structure like:

.. code-block:: text

   Llama-3.2-3B-Instruct_out/
   ├── onnx_files/
   └── sima_files/
       ├── devkit/
       └── mpk/

To deploy this to the Modalix device, you have two options:

**Option A: Direct deployment to Modalix device**

If your host machine has network access to the Modalix device:

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out sima@192.168.1.20:/media/nvme/llima/llama3_2


**Option B: Deploy to local directory for manual transfer**

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ llima-deploy Llama-3.2-3B-Instruct_out llama3_2
   sima-user@docker-image-id:/home/docker$ scp -r llama3_2 sima@192.168.1.20:/media/nvme/llima/


.. note::

   ``192.168.1.20`` is the address used in these examples. If your Modalix device has a different IP address, substitute it in every command.
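
To avoid editing the address in several commands, the destination can be built from a variable. This is only a sketch: the ``DEVICE_IP`` variable and the ``remote_dest`` helper are illustrative names, not part of ``llima-deploy``.

.. code-block:: bash

   # DEVICE_IP and remote_dest are illustrative names, not part of llima-deploy.
   DEVICE_IP="${DEVICE_IP:-192.168.1.20}"

   # Print an rsync/scp-style destination for a model directory on the device.
   remote_dest() {
       printf 'sima@%s:/media/nvme/llima/%s\n' "$DEVICE_IP" "$1"
   }

   # Example: llima-deploy Llama-3.2-3B-Instruct_out "$(remote_dest llama3_2)"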

Once deployed, SSH into the Modalix device and run the model:

.. code-block:: console

   sima-user@docker-image-id:/home/docker$ ssh sima@192.168.1.20
   modalix:~$ cd /media/nvme/llima/simaai-genai-demo
   modalix:~$ ./run.sh


Troubleshooting
---------------

**Error: "devkit directory cannot be found"**

Ensure that the source directory is the output directory produced by ``llima-compile``. It should contain a ``sima_files/devkit/`` subdirectory.

**Error: "mpk directory cannot be found"**

Verify that compilation completed successfully. The ``sima_files/mpk/`` directory should contain ``.tar.gz`` files.
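
Both errors can be narrowed down by inspecting the compiled output before deploying. A small sketch, where ``check_compiled_output`` is a hypothetical helper and not part of the SDK:

.. code-block:: bash

   # Hypothetical diagnostic: report what is missing from a llima-compile output directory.
   check_compiled_output() {
       local src="$1" status=0
       if [ ! -d "$src/sima_files/devkit" ]; then
           echo "missing: sima_files/devkit/"
           status=1
       fi
       if [ ! -d "$src/sima_files/mpk" ]; then
           echo "missing: sima_files/mpk/"
           status=1
       elif ! find "$src/sima_files/mpk" -maxdepth 1 -name '*.tar.gz' | grep -q .; then
           # The directory exists but holds no MPK archives.
           echo "no .tar.gz archives in sima_files/mpk/"
           status=1
       fi
       return "$status"
   }

The function returns a non-zero status and prints one line per problem, so it can be run as a guard before ``llima-deploy``.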

**Slow deployment**

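- The tool runs ``rsync -aP`` by default (archive mode, progress output, and resumable partial transfers). Note that these flags do not enable compression (``-z``).
- Deploy to NVMe storage on Modalix for faster model loading
- Consider using ``--resume`` during compilation so that only changed files need to be deployed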