sima_utils.transformer.model.whisper_model

Classes

WhisperModel

Whisper model implementation.

Module Contents

class sima_utils.transformer.model.whisper_model.WhisperModel

Whisper model implementation.

use_future_token_mask: bool
static from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, use_future_token_mask: bool) WhisperModel

Creates a WhisperModel object from a cached Hugging Face model.

Parameters:
  • model_name – Model name. This is used as the file name prefix for the generated ONNX and Model SDK files.

  • hf_cache_path – Path to the cached Hugging Face model.

  • onnx_path – Path to the generated ONNX files.

  • sima_path – Path to the generated SiMa files.

Returns:

A WhisperModel object for file generation or evaluation.
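
As a minimal sketch of how from_hf_cache might be called, the helper below derives all three paths from a single working directory. The helper name build_whisper_model, the directory names hf_cache, onnx, and sima, and the default of False for use_future_token_mask are assumptions made for illustration; none of them come from the library. The model class is passed in as an argument so the sketch does not depend on the package being importable.

```python
from pathlib import Path


def build_whisper_model(model_cls, model_name, work_dir, use_future_token_mask=False):
    """Call model_cls.from_hf_cache with paths derived from one working directory.

    model_cls is expected to be WhisperModel. The sub-directory names and the
    default for use_future_token_mask are hypothetical choices for this sketch.
    """
    work_dir = Path(work_dir)
    return model_cls.from_hf_cache(
        model_name=model_name,
        hf_cache_path=work_dir / "hf_cache",  # hypothetical layout
        onnx_path=work_dir / "onnx",          # hypothetical layout
        sima_path=work_dir / "sima",          # hypothetical layout
        use_future_token_mask=use_future_token_mask,
    )
```

With the package installed, a call would look something like build_whisper_model(WhisperModel, "my_whisper", "/path/to/workdir"), where the model name and directory are placeholders.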

gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

Generates files based on the provided file generation mode.

Parameters:
  • gen_mode – File generation mode.

  • precision – The precision to be used for Model SDK quantization mode.

  • log_level – Logging level.

  • part – Name of the part to be generated.

  • part_idx – Specific index of the part to be generated. For a pre/post model, this is the layer index; for a cache model, it is the token index.

  • resume – Generate the files only if they are missing.
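
The sketch below shows one way the part and part_idx parameters might be combined to generate a single part index by index. The helper name generate_part_by_index is hypothetical, and the values for gen_mode, part, and precision must be supplied by the caller, since the valid FileGenMode members and part names are not listed here. Passing resume=True assumes that files already present are left alone, which is how the parameter description reads.

```python
def generate_part_by_index(model, gen_mode, part, indices, precision=None):
    """Call model.gen_files once per index for a single named part.

    model is expected to be a WhisperModel instance. Everything after
    gen_mode is passed by keyword because the signature declares those
    parameters keyword-only.
    """
    generated = 0
    for idx in indices:
        model.gen_files(
            gen_mode,
            precision=precision,
            part=part,
            part_idx=idx,
            resume=True,  # assumption: existing files are not regenerated
        )
        generated += 1
    return generated  # number of gen_files calls issued
```

Issuing one call per index keeps each step small, so an interrupted run can be restarted from the same loop.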

evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, audio: pathlib.Path | str | numpy.ndarray, language: str | None = None) str

Evaluates the model with the input audio in the specified mode.

Parameters:
  • eval_mode – Evaluation mode.

  • audio – Path to the audio file, or preprocessed audio as a NumPy array.

Returns:

Generated output text.
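
As an illustration of the evaluate signature, the sketch below transcribes several inputs in one pass. The helper name transcribe_all is hypothetical, and eval_mode must be an EvalMode value chosen by the caller, since its members are not listed here. Each item may be a path or a NumPy array, matching the types the audio parameter accepts.

```python
def transcribe_all(model, eval_mode, audio_items, language=None):
    """Return one transcript per audio item, in input order.

    model is expected to be a WhisperModel instance; each item is passed
    unchanged to model.evaluate.
    """
    return [
        model.evaluate(eval_mode, audio, language=language)
        for audio in audio_items
    ]
```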

run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) list[numpy.ndarray]
get_token_embeddings_tensor() numpy.ndarray
get_position_embeddings_tensor() numpy.ndarray
gen_devkit_files(resume: bool = False)

Generates files for the devkit.