sima_utils.transformer.model.whisper_model
==========================================

.. py:module:: sima_utils.transformer.model.whisper_model


Classes
-------

.. autoapisummary::

   sima_utils.transformer.model.whisper_model.WhisperModel


Module Contents
---------------

.. py:class:: WhisperModel


   Whisper model implementation.


   .. py:attribute:: use_future_token_mask
      :type:  bool


   .. py:method:: from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, use_future_token_mask: bool) -> WhisperModel
      :staticmethod:


      Creates a WhisperModel object from a cached Hugging Face model.

      :param model_name: Model name, used as a file name prefix for the generated ONNX
                         and Model SDK files.
      :param hf_cache_path: Path to the cached Hugging Face model.
      :param onnx_path: Path to the generated ONNX files.
      :param sima_path: Path to the generated SiMa files.
      :param use_future_token_mask: Whether to use the future token mask.

      :returns: A WhisperModel object for file generation or evaluation.
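
      A minimal usage sketch of the call above. The ``load_whisper`` helper, the
      sub-directory names under ``workdir``, and the ``False`` default for
      ``use_future_token_mask`` are illustrative assumptions, not part of this API;
      only the keyword names come from the signature documented here.

      .. code-block:: python

         from pathlib import Path


         def load_whisper(model_name: str, workdir: Path, use_future_token_mask: bool = False):
             """Build a WhisperModel from a cached Hugging Face model (sketch)."""
             # Imported lazily so the helper can be defined without sima_utils installed.
             from sima_utils.transformer.model.whisper_model import WhisperModel

             # from_hf_cache is a static method, so no instance is needed.
             return WhisperModel.from_hf_cache(
                 model_name=model_name,
                 hf_cache_path=workdir / "hf_cache",  # hypothetical directory layout
                 onnx_path=workdir / "onnx",          # hypothetical directory layout
                 sima_path=workdir / "sima",          # hypothetical directory layout
                 use_future_token_mask=use_future_token_mask,
             )

      Keeping the three paths under one working directory is only a convenience;
      the method accepts any ``pathlib.Path`` or ``str`` for each of them.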


   .. py:method:: gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

      Generates files based on the provided file generation mode.

      :param gen_mode: File generation mode.
      :param precision: The precision to use for the Model SDK quantization mode.
      :param log_level: Logging level.
      :param num_processes: Number of processes to use for file generation.
      :param part: Name of the part to be generated.
      :param part_idx: Specific index of the part to be generated. For pre/post model, the index is
                       the layer index; for cache model, the index is the token index.
      :param resume: If ``True``, generate only the files that are missing.
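
      A hedged sketch of generating a single part with ``resume`` enabled. The
      ``generate_part`` helper is hypothetical; the members of ``FileGenMode`` and
      ``FileGenPrecision`` and the valid part names are not listed on this page, so
      the caller supplies them.

      .. code-block:: python

         import logging


         def generate_part(model, gen_mode, part, part_idx=None, precision=None):
             """Generate one part of the model, filling in missing files (sketch)."""
             # Every argument after gen_mode is keyword-only in the signature above.
             model.gen_files(
                 gen_mode,
                 precision=precision,
                 log_level=logging.INFO,
                 num_processes=1,
                 part=part,
                 part_idx=part_idx,
                 resume=True,  # per the parameter description: generate files if missing
             )

      Per the ``part_idx`` description, the index is a layer index for pre/post
      models and a token index for cache models, so the same helper covers both
      cases.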


   .. py:method:: evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, audio: pathlib.Path | str | numpy.ndarray, language: str | None = None) -> str

      Evaluates the model with the input audio in the specified mode.

      :param eval_mode: Evaluation mode.
      :param audio: Path to the audio file, or preprocessed audio as a NumPy array.

      :returns: Generated output text.
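
      A small sketch of evaluating several audio files in one go. The
      ``transcribe_all`` helper and the ``*.wav`` file pattern are assumptions made
      for illustration; the members of ``EvalMode`` are not listed on this page, so
      ``eval_mode`` is passed in by the caller.

      .. code-block:: python

         from pathlib import Path


         def transcribe_all(model, eval_mode, audio_dir, language=None):
             """Return a mapping of file name to generated text (sketch)."""
             results = {}
             # Sort the paths so the output order is deterministic.
             for wav in sorted(Path(audio_dir).glob("*.wav")):
                 results[wav.name] = model.evaluate(eval_mode, wav, language=language)
             return results

      Because ``audio`` also accepts a ``numpy.ndarray``, the same loop could pass
      preprocessed audio instead of file paths.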


   .. py:method:: run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) -> list[numpy.ndarray]


   .. py:method:: get_token_embeddings_tensor() -> numpy.ndarray


   .. py:method:: get_position_embeddings_tensor() -> numpy.ndarray


   .. py:method:: gen_devkit_files(resume: bool = False)

      Generates files for the devkit.