.. _model_executor_python_api:

Python API
##########

**Module:** ``simaai_model_executor``

.. code-block:: python

   import simaai_model_executor as me

   executor = me.ModelExecutor()

Constants
---------

.. list-table::
   :widths: 40 15 45
   :header-rows: 1

   * - **Name**
     - **Value**
     - **Description**
   * - ``DEFAULT_MODEL_EXECUTOR_DURATION_SECONDS``
     - ``30``
     - Default profiling duration in seconds
   * - ``DEFAULT_MODEL_EXECUTOR_OUTPUT_DIR``
     - ``"/var/tmp"``
     - Default output directory for profiling results

Enumerations
------------

.. list-table::
   :widths: 30 70
   :header-rows: 1

   * - **Enum**
     - **Values**
   * - ``me.KernelType``
     - ``EV74``, ``A65``
   * - ``me.ColorFormat``
     - ``COLOR_FORMAT_RGB``, ``COLOR_FORMAT_BGR``, ``COLOR_FORMAT_IYUV``, ``COLOR_FORMAT_NV12``, ``COLOR_FORMAT_GRAY``

Input Format
------------

The Python binding accepts inputs as either:

- A single ``numpy.ndarray`` — for single-input models.
- A ``dict[str, numpy.ndarray]`` — for multi-input models, where keys are input tensor names.

The **dtype** must match initialization:

- ``float32`` — when ``init()`` was called **without** ``mean``/``stddev``.
- ``uint8`` — when ``init()`` was called **with** ``mean`` and ``stddev``; normalization is applied internally.

Arrays must be **C-contiguous** and in **native byte order**.

Methods
-------

init()
~~~~~~

.. code-block:: python

   executor.init(
       tarGzFilePath,                      # str — path to .tar.gz model archive
       kernelType=me.KernelType.EV74,      # me.KernelType.EV74 or me.KernelType.A65
       mean=[],                            # list[float] — per-channel means (empty = skip)
       stddev=[],                          # list[float] — per-channel std devs (empty = skip)
       interpolationType=1,                # int — 1=BILINEAR, 2=BICUBIC, 3=NEAREST, 4=AREA
       resizePreservingAspectRatio=False,  # bool
       paddingPosition=0,                  # int — 0=CENTER, 1=TOP, 2=BOTTOM
   )
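Whether ``init()`` receives ``mean``/``stddev`` determines the dtype the executor expects, per the input-format rules above. As a minimal sketch, a pre-dispatch helper could coerce arrays to the required dtype and layout (the name ``prepare_input`` and its flag are illustrative, not part of the API; only NumPy is assumed):

.. code-block:: python

   import numpy as np

   def prepare_input(arr, normalized_on_device=False):
       # normalized_on_device=True corresponds to init() called with
       # mean/stddev, in which case the executor expects uint8 input;
       # otherwise it expects float32. asarray + ascontiguousarray also
       # force C layout and native byte order, as the binding requires.
       dtype = np.uint8 if normalized_on_device else np.float32
       return np.ascontiguousarray(np.asarray(arr, dtype=dtype))

For multi-input models, the same coercion would apply to each entry of the input dict.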
initBoxdecode()
~~~~~~~~~~~~~~~

Use for models with on-device NMS and top-k post-processing.

.. code-block:: python

   executor.initBoxdecode(
       tarGzFilePath,                      # str
       kernelType=me.KernelType.EV74,
       mean=[],
       stddev=[],
       interpolationType=1,
       resizePreservingAspectRatio=False,
       paddingPosition=0,
       decodeType="",                      # str — "yolov5", "ssd", "" = auto-detect
       topk=0,                             # int — max detections after NMS (0 = no limit)
       numClasses=0,                       # int
       detectionThreshold=-1.0,            # float — negative = use model default
       nmsIouThreshold=-1.0,               # float — negative = use model default
       originalWidth=0,                    # int — 0 = use tensor width
       originalHeight=0,                   # int — 0 = use tensor height
       sigmoidOnProbabilities=-1,          # int — 1=yes, 0=no, -1=auto
   )

runSynchronous()
~~~~~~~~~~~~~~~~

Blocks until inference completes. Returns the first output tensor.

.. code-block:: python

   output = executor.runSynchronous(inputs)
   # inputs:  numpy.ndarray (float32 or uint8) or dict[str, numpy.ndarray]
   # returns: numpy.ndarray (float32) — first output tensor only

.. note::

   Only the first output tensor is returned. Use ``runAsynchronous()`` or the
   C++ API to retrieve all outputs from multi-output models.

Example:

.. code-block:: python

   import numpy as np

   frame = np.random.rand(1, 224, 224, 3).astype(np.float32)
   output = executor.runSynchronous(frame)

   # Multi-input model
   output = executor.runSynchronous({"input_image": frame, "mask": mask_array})

runAsynchronous()
~~~~~~~~~~~~~~~~~

Non-blocking. Returns immediately after enqueuing. The callback is invoked on a
dedicated worker thread.

.. code-block:: python

   pushed = executor.runAsynchronous(
       inputs,    # numpy.ndarray or dict[str, numpy.ndarray]
       metaData,  # None, bool, int, float, str, list, or dict
       callback,  # callable(output, metaData, ok) -> None
   )
   # returns: bool — True if enqueued, False if executor is stopping

The callback receives:

- ``output`` — ``numpy.ndarray`` (single output) or ``list[numpy.ndarray]`` (multiple outputs)
- ``metaData`` — the value passed to ``runAsynchronous()``, converted back to Python
- ``ok`` — ``bool``, ``False`` on failure
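Because ``output`` may arrive as a single array or as a list, callbacks often normalize it before handing results downstream. A sketch (the ``make_callback`` helper and the ``"frame_id"`` metadata key are illustrative, not part of the API):

.. code-block:: python

   def make_callback(results):
       # Build a callback that files results under a frame id carried
       # in metaData. (Helper and key names are illustrative.)
       def callback(output, meta, ok):
           if not ok:
               results[meta["frame_id"]] = None  # inference failed
               return
           # Single-output models deliver one ndarray, multi-output
           # models a list of ndarrays; normalize to a list either way.
           results[meta["frame_id"]] = output if isinstance(output, list) else [output]
       return callback

The returned ``callback`` would be passed as the third argument to ``runAsynchronous()``; note that it runs on the executor's worker thread, so shared state such as ``results`` should be accessed accordingly.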
Example:

.. code-block:: python

   import threading

   done = threading.Event()
   result = {}

   def callback(output, meta, ok):
       if ok:
           result["output"] = output
           result["meta"] = meta
       done.set()

   executor.runAsynchronous(frame, {"frame_id": 1}, callback)
   done.wait(timeout=10)

profileModel()
~~~~~~~~~~~~~~

Runs synthetic inference for a fixed duration and returns JSON-encoded KPI
metrics.

.. code-block:: python

   kpi_json_str = executor.profileModel(
       duration_seconds=30,          # int (default: 30)
       output_directory="/var/tmp",  # str (default: "/var/tmp")
       run_synchronous=False,        # bool — False = async mode (default)
   )
   # returns: str — JSON-encoded results

   import json
   kpi = json.loads(kpi_json_str)
   # keys: total_frames, throughput_fps,
   #       latency_min_ms, latency_max_ms, latency_avg_ms,
   #       latency_p50_ms, latency_p95_ms, latency_p99_ms
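The profiler's JSON payload can be reduced to a readable summary. A sketch with a fabricated payload (the ``summarize_kpis`` helper and the sample numbers are illustrative; only the key names come from the list above):

.. code-block:: python

   import json

   def summarize_kpis(kpi_json_str):
       # Reduce the profiler's JSON payload to a one-line summary.
       # (Helper name is illustrative; key names are as documented.)
       kpi = json.loads(kpi_json_str)
       return (f"{kpi['total_frames']} frames, "
               f"{kpi['throughput_fps']:.1f} FPS, "
               f"p95 latency {kpi['latency_p95_ms']:.2f} ms")

   # Illustrative payload with the documented keys (made-up values):
   sample = json.dumps({
       "total_frames": 900, "throughput_fps": 30.0,
       "latency_min_ms": 8.1, "latency_max_ms": 40.2, "latency_avg_ms": 12.5,
       "latency_p50_ms": 11.9, "latency_p95_ms": 18.3, "latency_p99_ms": 25.0,
   })
   print(summarize_kpis(sample))  # → 900 frames, 30.0 FPS, p95 latency 18.30 ms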