sima_utils.transformer.model.whisper_model
==========================================

.. py:module:: sima_utils.transformer.model.whisper_model


Classes
-------

.. autoapisummary::

   sima_utils.transformer.model.whisper_model.WhisperModel


Module Contents
---------------

.. py:class:: WhisperModel

   Whisper model implementation.


   .. py:attribute:: use_future_token_mask
      :type:  bool


   .. py:method:: from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, use_future_token_mask: bool) -> WhisperModel
      :staticmethod:


      Creates a WhisperModel object from a cached Hugging Face model.

      :param model_name: Model name. This is used as a file name prefix for the generated ONNX and Model SDK files.
      :param hf_cache_path: Path to the cached Hugging Face model.
      :param onnx_path: Path to the generated ONNX files.
      :param sima_path: Path to the generated SiMa files.

      :returns: A WhisperModel object for file generation or evaluation.



   .. py:method:: gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

      Generates files based on the provided file generation mode.

      :param gen_mode: File generation mode.
      :param precision: The precision to be used for Model SDK quantization mode.
      :param log_level: Logging level.
      :param part: Name of the part to be generated.
      :param part_idx: Specific index of the part to be generated. For the pre/post model, the index is the layer index;
                       for the cache model, the index is the token index.
      :param resume: Generate the files if missing.



   .. py:method:: evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, audio: pathlib.Path | str | numpy.ndarray, language: str | None = None) -> str

      Evaluates the model with the input audio in the specified mode.

      :param eval_mode: Evaluation mode.
      :param audio: Path to the audio file, or preprocessed audio as a NumPy array.

      :returns: Generated output text.



   .. py:method:: run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) -> list[numpy.ndarray]


   .. py:method:: get_token_embeddings_tensor() -> numpy.ndarray


   .. py:method:: get_position_embeddings_tensor() -> numpy.ndarray


   .. py:method:: gen_devkit_files(resume: bool = False)

      Generates files for devkit.
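
The documented methods suggest a three-step workflow: build the model with ``from_hf_cache``, generate files with ``gen_files``, then transcribe with ``evaluate``. The following is a minimal sketch of that flow, not an official recipe. ``transcribe_with_whisper`` is a hypothetical helper name. The model class and the ``FileGenMode`` / ``EvalMode`` values are passed in by the caller because their members are not listed on this page, and calling ``gen_files`` with ``resume=True`` before evaluating is an assumption about typical usage.

```python
from pathlib import Path
from typing import Any, Optional, Union

PathLike = Union[Path, str]


def transcribe_with_whisper(
    model_cls: Any,
    gen_mode: Any,
    eval_mode: Any,
    audio: PathLike,
    *,
    model_name: str,
    hf_cache_path: PathLike,
    onnx_path: PathLike,
    sima_path: PathLike,
    use_future_token_mask: bool,
    language: Optional[str] = None,
) -> str:
    """Hypothetical helper: build, generate files, then transcribe.

    model_cls is expected to be WhisperModel; gen_mode and eval_mode are
    expected to be FileGenMode and EvalMode values chosen by the caller.
    """
    # Step 1: create the model object from the cached Hugging Face model.
    model = model_cls.from_hf_cache(
        model_name=model_name,
        hf_cache_path=Path(hf_cache_path),
        onnx_path=Path(onnx_path),
        sima_path=Path(sima_path),
        use_future_token_mask=use_future_token_mask,
    )

    # Step 2: generate files for the chosen mode. resume=True is documented
    # as "generate the files if missing"; using it here is an assumption.
    model.gen_files(gen_mode, resume=True)

    # Step 3: run the evaluation and return the generated text.
    return model.evaluate(eval_mode, audio, language=language)
```

In real use, the first argument would be ``WhisperModel`` imported from ``sima_utils.transformer.model.whisper_model``, with the two mode arguments taken from ``sima_utils.transformer.model.base``. Passing them in keeps the helper easy to exercise with a stand-in class when the SDK is not installed.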