sima_utils.transformer.model
============================

.. py:module:: sima_utils.transformer.model


Submodules
----------

.. toctree::
   :maxdepth: 1

   /pages/api_reference/python-autoapi/sima_utils/transformer/model/base/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_cache_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_part_base/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_post_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_pre_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/vision_language_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/vision_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_cache_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_init_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_post_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_pre_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_encoder_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_model/index


Classes
-------

.. autoapisummary::

   sima_utils.transformer.model.EvalMode
   sima_utils.transformer.model.FileGenPrecision
   sima_utils.transformer.model.FileGenMode
   sima_utils.transformer.model.VisionLanguageModel
   sima_utils.transformer.model.WhisperModel


Package Contents
----------------

.. py:class:: EvalMode


   Model evaluation mode.


   .. py:attribute:: HF
      :value: 'hf'


   .. py:attribute:: ONNX
      :value: 'onnx'


   .. py:attribute:: SDK
      :value: 'sdk'
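
The three documented values lend themselves to parsing a mode from user input such as a command-line flag. The sketch below uses a local stand-in enum that mirrors those values. That the real ``EvalMode`` is an ``enum.Enum`` is an assumption, and ``parse_eval_mode`` is a hypothetical helper, not part of ``sima_utils``.

```python
from enum import Enum


class EvalMode(Enum):
    """Local stand-in mirroring the documented values (not the real class)."""
    HF = "hf"
    ONNX = "onnx"
    SDK = "sdk"


def parse_eval_mode(text: str) -> EvalMode:
    """Map a user-supplied string such as ' ONNX ' to an EvalMode member."""
    try:
        return EvalMode(text.strip().lower())
    except ValueError:
        valid = ", ".join(member.value for member in EvalMode)
        raise ValueError(
            f"unknown eval mode {text!r}; expected one of: {valid}"
        ) from None


print(parse_eval_mode(" ONNX "))
```

Lookup by value keeps the accepted strings in one place, so an unsupported mode fails early with a message listing the valid choices.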


.. py:class:: FileGenPrecision


   Precision used when generating files.


   .. py:attribute:: BF16
      :value: 'bf16'


   .. py:attribute:: A_BF16_W_INT8
      :value: 'a_bf16_w_int8'


   .. py:attribute:: A_BF16_W_INT4
      :value: 'a_bf16_w_int4'
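
``gen_files`` accepts either a single ``FileGenPrecision`` or a ``dict[str, FileGenPrecision]``. The sketch below normalizes both forms into one mapping. It assumes the dictionary keys are part names, which this reference does not state. The enum is a local stand-in, and ``expand_precision`` and the part names are hypothetical.

```python
from enum import Enum


class FileGenPrecision(Enum):
    """Local stand-in mirroring the documented values (not the real class)."""
    BF16 = "bf16"
    A_BF16_W_INT8 = "a_bf16_w_int8"
    A_BF16_W_INT4 = "a_bf16_w_int4"


def expand_precision(precision, parts):
    """Return a {part: precision} mapping for the given part names.

    A single precision (or None) is applied to every part; a dict is
    checked for unknown keys, and parts it omits map to None.
    """
    if precision is None or isinstance(precision, FileGenPrecision):
        return {part: precision for part in parts}
    unknown = sorted(set(precision) - set(parts))
    if unknown:
        raise KeyError(f"precision given for unknown parts: {unknown}")
    return {part: precision.get(part) for part in parts}


parts = ["vision", "language"]  # hypothetical part names
print(expand_precision(FileGenPrecision.BF16, parts))
print(expand_precision({"language": FileGenPrecision.A_BF16_W_INT4}, parts))
```

Here ``None`` only marks "no explicit choice"; which precision the library applies in that case is not documented in this section.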


.. py:class:: FileGenMode


   File generation mode.


   .. py:attribute:: ONNX


   .. py:attribute:: MODEL_SDK_QUANTIZE


   .. py:attribute:: MODEL_SDK_COMPILE


   .. py:attribute:: DEVKIT


   .. py:attribute:: ALL
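
The members above are listed without values, so this reference does not document their type or how ``ALL`` relates to the other modes. The sketch below is purely illustrative. It models the modes with a local ``enum.Flag`` stand-in and assumes ``ALL`` is the union of the four individual stages. ``stages`` is a hypothetical helper.

```python
from enum import Flag, auto


class FileGenMode(Flag):
    """Illustrative stand-in; the real class may be defined differently."""
    ONNX = auto()
    MODEL_SDK_QUANTIZE = auto()
    MODEL_SDK_COMPILE = auto()
    DEVKIT = auto()
    # Assumption: ALL selects every stage.
    ALL = ONNX | MODEL_SDK_QUANTIZE | MODEL_SDK_COMPILE | DEVKIT


# Stage order assumed to follow the order of the members in this reference.
_ORDER = (
    FileGenMode.ONNX,
    FileGenMode.MODEL_SDK_QUANTIZE,
    FileGenMode.MODEL_SDK_COMPILE,
    FileGenMode.DEVKIT,
)


def stages(mode: FileGenMode) -> list[str]:
    """List the names of the individual stages selected by ``mode``."""
    return [stage.name for stage in _ORDER if stage in mode]


print(stages(FileGenMode.ALL))
print(stages(FileGenMode.ONNX | FileGenMode.DEVKIT))
```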


.. py:class:: VisionLanguageModel


   Vision-language model implementation.


   .. py:method:: from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, max_num_tokens: int, system_prompt: str | None = None, override_language_group_size: int | None = None, override_language_group_offsets: list[int] | None = None, override_language_future_token_mask_size: int = 1) -> VisionLanguageModel
      :staticmethod:


      Creates a VisionLanguageModel object from a cached Hugging Face model.

      :param model_name: Model name, used as the file name prefix for the generated ONNX
                         and Model SDK files.
      :param hf_cache_path: Path to the cached Hugging Face model.
      :param onnx_path: Path to the generated ONNX files.
      :param sima_path: Path to the generated SiMa files.
      :param max_num_tokens: Maximum number of tokens, including both input and output tokens.
      :param system_prompt: System prompt.

      :returns: A VisionLanguageModel object for file generation or evaluation.


   .. py:method:: gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode = FileGenMode.ALL, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

      Generates files based on the provided file generation mode.

      :param gen_mode: File generation mode.
      :param precision: The precision to be used for Model SDK quantization mode.
      :param log_level: Logging level.
      :param part: Name of the part to be generated.
      :param part_idx: Index of the specific part to be generated. For pre/post models, this is
                       the layer index; for cache models, the token index.
      :param resume: Generate a file only if it does not already exist.


   .. py:method:: evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, query: str, image: pathlib.Path | str | numpy.ndarray | None) -> str

      Evaluates the model with the input query and the image in the specified mode.

      :param eval_mode: Evaluation mode.
      :param query: User query.
      :param image: Path to the image, or a preprocessed image as a NumPy array.

      :returns: Generated output text.


   .. py:method:: run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) -> list[numpy.ndarray]

      Runs the model based on the evaluation mode.


   .. py:method:: get_language_embeddings_tensor() -> numpy.ndarray


.. py:class:: WhisperModel


   Whisper model implementation.


   .. py:attribute:: use_future_token_mask
      :type:  bool


   .. py:method:: from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, use_future_token_mask: bool) -> WhisperModel
      :staticmethod:


      Creates a WhisperModel object from a cached Hugging Face model.

      :param model_name: Model name, used as the file name prefix for the generated ONNX
                         and Model SDK files.
      :param hf_cache_path: Path to the cached Hugging Face model.
      :param onnx_path: Path to the generated ONNX files.
      :param sima_path: Path to the generated SiMa files.

      :returns: A WhisperModel object for file generation or evaluation.


   .. py:method:: gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

      Generates files based on the provided file generation mode.

      :param gen_mode: File generation mode.
      :param precision: The precision to be used for Model SDK quantization mode.
      :param log_level: Logging level.
      :param part: Name of the part to be generated.
      :param part_idx: Index of the specific part to be generated. For pre/post models, this is
                       the layer index; for cache models, the token index.
      :param resume: Generate a file only if it does not already exist.


   .. py:method:: evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, audio: pathlib.Path | str | numpy.ndarray, language: str | None = None) -> str

      Evaluates the model with the input audio in the specified mode.

      :param eval_mode: Evaluation mode.
      :param audio: Path to the audio, or preprocessed audio as a NumPy array.

      :returns: Generated output text.


   .. py:method:: run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) -> list[numpy.ndarray]


   .. py:method:: get_token_embeddings_tensor() -> numpy.ndarray


   .. py:method:: get_position_embeddings_tensor() -> numpy.ndarray


   .. py:method:: gen_devkit_files(resume: bool = False)

      Generates files for devkit.
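
Because ``evaluate`` takes the evaluation mode as its first argument, the same audio can be run through several modes and the transcripts compared. The sketch below relies only on the documented ``evaluate(eval_mode, audio, language=None)`` signature. ``compare_eval_modes``, ``all_agree``, the stub model, and the file name are hypothetical, and the stub's transcripts are made up for illustration.

```python
def compare_eval_modes(model, modes, audio, language=None):
    """Run the same audio through each mode and collect the transcripts."""
    return {
        str(mode): model.evaluate(mode, audio, language=language)
        for mode in modes
    }


def all_agree(transcripts):
    """True if all transcripts match after lowercasing and collapsing whitespace."""
    normalized = {" ".join(text.lower().split()) for text in transcripts.values()}
    return len(normalized) <= 1


class _StubWhisper:
    """Stand-in with the documented evaluate() signature, for illustration only."""

    def evaluate(self, eval_mode, audio, language=None):
        return "Hello world" if eval_mode == "hf" else "hello  world"


# With the real package, pass a WhisperModel and EvalMode members instead.
results = compare_eval_modes(
    _StubWhisper(), ["hf", "onnx", "sdk"], "sample.wav", language="en"
)
print(results)
print(all_agree(results))
```

Exact string equality is a strict check; quantized modes may differ slightly from the Hugging Face reference, so a looser metric may suit real comparisons better.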