sima_utils.transformer.model.vision_model

Classes

VisionModel

Vision model implementation.

VisionLayerModel

Vision model for each transformer layer with embedding or multimodal projection.

Module Contents

class sima_utils.transformer.model.vision_model.VisionModel

Vision model implementation.

is_single_vision_model: bool = True
actual_num_hidden_layers: int
gen_files(
    gen_mode: sima_utils.transformer.model.base.FileGenMode,
    *,
    precision: sima_utils.transformer.model.base.FileGenPrecision | dict[str, sima_utils.transformer.model.base.FileGenPrecision] | None = None,
    log_level: int = logging.NOTSET,
    num_processes: int = 1,
    part: str | None = None,
    part_idx: int | None = None,
    resume: bool = False,
)

Generates files based on the provided file generation mode.

Parameters:
  • gen_mode – File generation mode.

  • precision – Precision to use for Model SDK quantization mode: a single FileGenPrecision, a dict mapping string keys to FileGenPrecision values, or None (the default).

  • log_level – Logging level.

  • resume – If True, generate a file only when it cannot be found; existing files are skipped. Defaults to False.
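
The precision and resume parameters described above can be illustrated with a minimal, self-contained sketch. It does not call sima_utils. Precision is a hypothetical stand-in for FileGenPrecision with invented member names, resolve_precision and generate are illustrative helpers, and reading the dict keys as part names is an assumption rather than documented behavior.

```python
import tempfile
from enum import Enum
from pathlib import Path


class Precision(Enum):
    """Hypothetical stand-in for FileGenPrecision; member names are invented."""
    LOW = "low"
    HIGH = "high"


def resolve_precision(precision, part_name, default=None):
    """Pick the precision for one part from a single value, a dict, or None.

    Assumption: dict keys identify parts; the real key semantics may differ.
    """
    if precision is None:
        return default
    if isinstance(precision, dict):
        return precision.get(part_name, default)
    return precision


def generate(path, resume=False):
    """Write a placeholder file; with resume=True, skip files that already exist."""
    path = Path(path)
    if resume and path.exists():
        return False  # file found, so nothing is regenerated
    path.write_text("generated")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "vision_part.bin"
    first = generate(target, resume=True)    # file missing, so it is written
    second = generate(target, resume=True)   # file present, so it is skipped
    third = generate(target, resume=False)   # written again regardless
```

In this sketch, resume=True makes a repeated run cheap after an interruption, because only the missing outputs are produced.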

class sima_utils.transformer.model.vision_model.VisionLayerModel

Vision model for each transformer layer with embedding or multimodal projection.

layer_idx: int
num_layers: int
include_embeddings: bool
include_mm_proj: bool
gen_onnx_files()

Generates ONNX files.
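
As a rough illustration of how the four attributes above might relate, the sketch below builds one record per transformer layer. LayerSpec and build_layer_specs are hypothetical names, not part of sima_utils. Placing the embeddings on the first layer and the multimodal projection on the last layer is an assumption made for the example and is not confirmed by this reference.

```python
from dataclasses import dataclass


@dataclass
class LayerSpec:
    """Hypothetical stand-in that mirrors the VisionLayerModel attribute names."""
    layer_idx: int
    num_layers: int
    include_embeddings: bool
    include_mm_proj: bool


def build_layer_specs(num_layers):
    """Assumed layout: embeddings on the first layer, projection on the last."""
    return [
        LayerSpec(
            layer_idx=i,
            num_layers=num_layers,
            include_embeddings=(i == 0),
            include_mm_proj=(i == num_layers - 1),
        )
        for i in range(num_layers)
    ]


specs = build_layer_specs(3)
```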