sima_utils.transformer.model
Submodules
- sima_utils.transformer.model.base
- sima_utils.transformer.model.language_cache_model
- sima_utils.transformer.model.language_model
- sima_utils.transformer.model.language_part_base
- sima_utils.transformer.model.language_post_model
- sima_utils.transformer.model.language_pre_model
- sima_utils.transformer.model.vision_language_model
- sima_utils.transformer.model.vision_model
- sima_utils.transformer.model.whisper_decoder_cache_model
- sima_utils.transformer.model.whisper_decoder_init_model
- sima_utils.transformer.model.whisper_decoder_post_model
- sima_utils.transformer.model.whisper_decoder_pre_model
- sima_utils.transformer.model.whisper_encoder_model
- sima_utils.transformer.model.whisper_model
Classes
- EvalMode — Model evaluation mode.
- FileGenPrecision — Precision used when generating files.
- FileGenMode — File generation mode.
- VisionLanguageModel — Vision-language model implementation.
- WhisperModel — Whisper model implementation.
Package Contents
- class sima_utils.transformer.model.EvalMode
Model evaluation mode.
- HF = 'hf'
- ONNX = 'onnx'
- SDK = 'sdk'
- class sima_utils.transformer.model.FileGenPrecision
Precision used when generating files.
- BF16 = 'bf16'
- A_BF16_W_INT8 = 'a_bf16_w_int8'
- A_BF16_W_INT4 = 'a_bf16_w_int4'
- class sima_utils.transformer.model.FileGenMode
File generation mode.
- ONNX
- MODEL_SDK_QUANTIZE
- MODEL_SDK_COMPILE
- DEVKIT
- ALL
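For illustration, the enumerations above behave like standard Python enums, so members can be looked up from their string values (useful when parsing a CLI flag or config entry). The sketch below mirrors the documented members of EvalMode and FileGenPrecision; the real classes live in sima_utils.transformer.model.

```python
import enum

# Stand-in mirrors of the documented enums; values are taken verbatim
# from this reference page.
class EvalMode(enum.Enum):
    HF = 'hf'
    ONNX = 'onnx'
    SDK = 'sdk'

class FileGenPrecision(enum.Enum):
    BF16 = 'bf16'
    A_BF16_W_INT8 = 'a_bf16_w_int8'
    A_BF16_W_INT4 = 'a_bf16_w_int4'

# Look up members from their string values, as you would when parsing
# a command-line flag such as --eval-mode onnx.
mode = EvalMode('onnx')
precision = FileGenPrecision('a_bf16_w_int8')
```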
- class sima_utils.transformer.model.VisionLanguageModel
Vision-language model implementation.
- static from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, max_num_tokens: int, system_prompt: str | None = None, override_language_group_size: int | None = None, override_language_group_offsets: list[int] | None = None, override_language_future_token_mask_size: int = 1) VisionLanguageModel
Creates a VisionLanguageModel object from a cached Hugging Face model.
- Parameters:
model_name – Model name. This is used as a file name prefix for the generated ONNX and Model SDK files.
hf_cache_path – Path to the cached Hugging Face model.
onnx_path – Path to the generated ONNX files.
sima_path – Path to the generated SiMa files.
max_num_tokens – Maximum number of tokens, including both input and output tokens.
system_prompt – System prompt.
- Returns:
A VisionLanguageModel object for file generation or evaluation.
- gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode = FileGenMode.ALL, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)
Generates files based on the provided file generation mode.
- Parameters:
gen_mode – File generation mode.
precision – The precision to be used for Model SDK quantization mode.
log_level – Logging level.
part – Name of the part to be generated.
part_idx – Specific index of the part to be generated. For pre/post model, the index is the layer index; for cache model, the index is the token index.
resume – Generate a file only if it does not already exist.
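As a hedged sketch of the dict form of the precision parameter, which assigns a different quantization precision per part (the part names "vision" and "language" used as keys here are illustrative assumptions, not confirmed by this reference):

```python
def quantize_mixed(model):
    """Quantize a generated model with per-part precision (sketch).

    `model` is assumed to be a VisionLanguageModel or WhisperModel
    instance; the part names below are illustrative guesses.
    """
    # Imported lazily so the sketch can be read and parsed without
    # sima_utils installed.
    from sima_utils.transformer.model import FileGenMode, FileGenPrecision

    model.gen_files(
        FileGenMode.MODEL_SDK_QUANTIZE,
        precision={
            "vision": FileGenPrecision.BF16,
            "language": FileGenPrecision.A_BF16_W_INT8,
        },
    )
```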
- evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, query: str, image: pathlib.Path | str | numpy.ndarray | None) str
Evaluates the model with the input query and the image in the specified mode.
- Parameters:
eval_mode – Evaluation mode.
query – User query.
image – Path to an image file, or a preprocessed image as a NumPy array.
- Returns:
Generated output text.
- run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) list[numpy.ndarray]
Runs the model based on the evaluation mode.
- get_language_embeddings_tensor() numpy.ndarray
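A minimal end-to-end sketch of the VisionLanguageModel workflow, assuming sima_utils is installed and the Hugging Face model has already been downloaded to the local cache; the model name, prompt, and output directories below are placeholders:

```python
from pathlib import Path

# Placeholder locations; adjust to your environment.
HF_CACHE = Path.home() / ".cache" / "huggingface"
ONNX_DIR = Path("out/onnx")
SIMA_DIR = Path("out/sima")

def describe_image(model_name: str, image_path: str) -> str:
    """Build a VisionLanguageModel, generate files, and evaluate (sketch)."""
    # Lazy import so the sketch parses without sima_utils installed.
    from sima_utils.transformer.model import (
        EvalMode, FileGenMode, VisionLanguageModel,
    )

    vlm = VisionLanguageModel.from_hf_cache(
        model_name=model_name,
        hf_cache_path=HF_CACHE,
        onnx_path=ONNX_DIR,
        sima_path=SIMA_DIR,
        max_num_tokens=512,
        system_prompt="You are a helpful assistant.",
    )
    # resume=True skips any files that already exist from a previous run.
    vlm.gen_files(FileGenMode.ALL, resume=True)
    return vlm.evaluate(EvalMode.SDK, "Describe the image.", image_path)
```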
- class sima_utils.transformer.model.WhisperModel
Whisper model implementation.
- use_future_token_mask: bool
- static from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, use_future_token_mask: bool) WhisperModel
Creates a WhisperModel object from a cached Hugging Face model.
- Parameters:
model_name – Model name. This is used as a file name prefix for the generated ONNX and Model SDK files.
hf_cache_path – Path to the cached Hugging Face model.
onnx_path – Path to the generated ONNX files.
sima_path – Path to the generated SiMa files.
- Returns:
A WhisperModel object for file generation or evaluation.
- gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)
Generates files based on the provided file generation mode.
- Parameters:
gen_mode – File generation mode.
precision – The precision to be used for Model SDK quantization mode.
log_level – Logging level.
part – Name of the part to be generated.
part_idx – Specific index of the part to be generated. For pre/post model, the index is the layer index; for cache model, the index is the token index.
resume – Generate a file only if it does not already exist.
- evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, audio: pathlib.Path | str | numpy.ndarray, language: str | None = None) str
Evaluates the model with the input audio in the specified mode.
- Parameters:
eval_mode – Evaluation mode.
audio – Path to an audio file, or preprocessed audio as a NumPy array.
language – Optional language of the input audio.
- Returns:
Generated output text.
- run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) list[numpy.ndarray]
Runs the model based on the evaluation mode.
- get_token_embeddings_tensor() numpy.ndarray
- get_position_embeddings_tensor() numpy.ndarray
- gen_devkit_files(resume: bool = False)
Generates files for the devkit.
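The WhisperModel workflow mirrors the vision-language one. A hedged sketch, assuming sima_utils is installed and the checkpoint is already cached; the model name and directories are placeholders:

```python
from pathlib import Path

def transcribe(audio_path: str) -> str:
    """Build a WhisperModel from a cached checkpoint and transcribe (sketch).

    The model name and output directories are placeholders; sima_utils
    must be installed and the Hugging Face checkpoint cached beforehand.
    """
    # Lazy import so the sketch parses without sima_utils installed.
    from sima_utils.transformer.model import EvalMode, WhisperModel

    whisper = WhisperModel.from_hf_cache(
        model_name="whisper-small",  # placeholder model name
        hf_cache_path=Path.home() / ".cache" / "huggingface",
        onnx_path=Path("out/onnx"),
        sima_path=Path("out/sima"),
        use_future_token_mask=True,
    )
    # Evaluate directly against the Hugging Face weights; the language
    # argument is optional.
    return whisper.evaluate(EvalMode.HF, audio_path, language="en")
```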