sima_utils.transformer.model
============================

.. py:module:: sima_utils.transformer.model


Submodules
----------

.. toctree::
   :maxdepth: 1

   /pages/api_reference/python-autoapi/sima_utils/transformer/model/base/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_cache_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_part_base/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_post_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/language_pre_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/vision_language_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/vision_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_cache_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_init_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_post_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_decoder_pre_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_encoder_model/index
   /pages/api_reference/python-autoapi/sima_utils/transformer/model/whisper_model/index


Classes
-------

.. autoapisummary::

   sima_utils.transformer.model.EvalMode
   sima_utils.transformer.model.FileGenPrecision
   sima_utils.transformer.model.FileGenMode
   sima_utils.transformer.model.VisionLanguageModel
   sima_utils.transformer.model.WhisperModel


Package Contents
----------------

.. py:class:: EvalMode

   Model evaluation mode.

   .. py:attribute:: HF
      :value: 'hf'

   .. py:attribute:: ONNX
      :value: 'onnx'

   .. py:attribute:: SDK
      :value: 'sdk'


.. py:class:: FileGenPrecision

   Precision used when generating files.

   .. py:attribute:: BF16
      :value: 'bf16'

   .. py:attribute:: A_BF16_W_INT8
      :value: 'a_bf16_w_int8'

   .. py:attribute:: A_BF16_W_INT4
      :value: 'a_bf16_w_int4'


.. py:class:: FileGenMode

   File generation mode.

   .. py:attribute:: ONNX

   .. py:attribute:: MODEL_SDK_QUANTIZE

   .. py:attribute:: MODEL_SDK_COMPILE

   .. py:attribute:: DEVKIT

   .. py:attribute:: ALL


.. py:class:: VisionLanguageModel

   Vision-language model implementation.

   .. py:method:: from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, max_num_tokens: int, system_prompt: str | None = None, override_language_group_size: int | None = None, override_language_group_offsets: list[int] | None = None, override_language_future_token_mask_size: int = 1) -> VisionLanguageModel
      :staticmethod:

      Creates a VisionLanguageModel object from a cached Hugging Face model.

      :param model_name: Model name. This is used as a file name prefix for the generated ONNX and Model SDK files.
      :param hf_cache_path: Path to the cached Hugging Face model.
      :param onnx_path: Path to the generated ONNX files.
      :param sima_path: Path to the generated SiMa files.
      :param max_num_tokens: Maximum number of tokens, including both input and output tokens.
      :param system_prompt: System prompt.
      :returns: A VisionLanguageModel object for file generation or evaluation.

   .. py:method:: gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode = FileGenMode.ALL, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

      Generates files based on the provided file generation mode.

      :param gen_mode: File generation mode.
      :param precision: The precision to be used for Model SDK quantization mode.
      :param log_level: Logging level.
      :param part: Name of the part to be generated.
      :param part_idx: Specific index of the part to be generated. For pre/post models, the index is the layer index; for cache models, the index is the token index.
      :param resume: Generate a file only if it does not already exist.

   .. py:method:: evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, query: str, image: pathlib.Path | str | numpy.ndarray | None) -> str

      Evaluates the model with the input query and the image in the specified mode.

      :param eval_mode: Evaluation mode.
      :param query: User query.
      :param image: Path to the image, or a preprocessed image as a NumPy array.
      :returns: Generated output text.

   .. py:method:: run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) -> list[numpy.ndarray]

      Runs the model based on the evaluation mode.

   .. py:method:: get_language_embeddings_tensor() -> numpy.ndarray


.. py:class:: WhisperModel

   Whisper model implementation.

   .. py:attribute:: use_future_token_mask
      :type: bool

   .. py:method:: from_hf_cache(model_name: str, hf_cache_path: pathlib.Path | str, onnx_path: pathlib.Path | str, sima_path: pathlib.Path | str, use_future_token_mask: bool) -> WhisperModel
      :staticmethod:

      Creates a WhisperModel object from a cached Hugging Face model.

      :param model_name: Model name. This is used as a file name prefix for the generated ONNX and Model SDK files.
      :param hf_cache_path: Path to the cached Hugging Face model.
      :param onnx_path: Path to the generated ONNX files.
      :param sima_path: Path to the generated SiMa files.
      :returns: A WhisperModel object for file generation or evaluation.

   .. py:method:: gen_files(gen_mode: sima_utils.transformer.model.base.FileGenMode, *, precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None, log_level: int = logging.NOTSET, num_processes: int = 1, part: str | None = None, part_idx: int | None = None, resume: bool = False)

      Generates files based on the provided file generation mode.

      :param gen_mode: File generation mode.
      :param precision: The precision to be used for Model SDK quantization mode.
      :param log_level: Logging level.
      :param part: Name of the part to be generated.
      :param part_idx: Specific index of the part to be generated. For pre/post models, the index is the layer index; for cache models, the index is the token index.
      :param resume: Generate a file only if it does not already exist.

   .. py:method:: evaluate(eval_mode: sima_utils.transformer.model.base.EvalMode, audio: pathlib.Path | str | numpy.ndarray, language: str | None = None) -> str

      Evaluates the model with the input audio in the specified mode.

      :param eval_mode: Evaluation mode.
      :param audio: Path to the audio, or preprocessed audio as a NumPy array.
      :returns: Generated output text.

   .. py:method:: run_model(eval_mode: sima_utils.transformer.model.base.EvalMode, ifms: list[numpy.ndarray]) -> list[numpy.ndarray]

   .. py:method:: get_token_embeddings_tensor() -> numpy.ndarray

   .. py:method:: get_position_embeddings_tensor() -> numpy.ndarray

   .. py:method:: gen_devkit_files(resume: bool = False)

      Generates files for devkit.
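
Example
-------

The following is a minimal sketch of how the two model classes might be used together with ``FileGenMode`` and ``EvalMode``, based only on the signatures documented on this page. The helper names (``common_cache_kwargs``, ``describe_image``, ``transcribe``), the model names, the sub-directory layout, and the argument values are all hypothetical. It also assumes that evaluation in ``EvalMode.ONNX`` needs only the files produced by ``FileGenMode.ONNX``, which this reference does not state.

```python
from pathlib import Path


def common_cache_kwargs(model_name: str, workdir: Path) -> dict:
    """Build the four arguments shared by both from_hf_cache methods.

    The sub-directory names (hf_cache, onnx, sima) are hypothetical.
    """
    return {
        "model_name": model_name,  # also the prefix of the generated files
        "hf_cache_path": workdir / "hf_cache" / model_name,
        "onnx_path": workdir / "onnx",
        "sima_path": workdir / "sima",
    }


def describe_image(workdir: Path, query: str, image: Path) -> str:
    """Generate ONNX files for a vision-language model, then evaluate it."""
    # Imported lazily so the helper above works without sima_utils installed.
    from sima_utils.transformer.model import (
        EvalMode,
        FileGenMode,
        VisionLanguageModel,
    )

    model = VisionLanguageModel.from_hf_cache(
        **common_cache_kwargs("my-vlm", workdir),  # placeholder model name
        max_num_tokens=1024,  # input plus output tokens; arbitrary value
    )
    # resume=True asks gen_files to skip files that already exist.
    model.gen_files(FileGenMode.ONNX, resume=True)
    # Assumption: ONNX evaluation needs only the ONNX files generated above.
    return model.evaluate(EvalMode.ONNX, query, image)


def transcribe(workdir: Path, audio: Path) -> str:
    """Generate ONNX files for a Whisper model, then transcribe one input."""
    from sima_utils.transformer.model import EvalMode, FileGenMode, WhisperModel

    model = WhisperModel.from_hf_cache(
        **common_cache_kwargs("my-whisper", workdir),  # placeholder model name
        use_future_token_mask=True,  # arbitrary choice for illustration
    )
    model.gen_files(FileGenMode.ONNX, resume=True)
    return model.evaluate(EvalMode.ONNX, audio)


# The path helper can be exercised on its own, without sima_utils.
kwargs = common_cache_kwargs("my-vlm", Path("workdir"))
```

Keeping the path arguments in one helper means both classes read from and write to the same directory tree, and ``model_name`` keeps the generated files apart because it is used as their file name prefix.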