sima_utils.transformer.model.base

Classes

FileGenMode

File generation mode.

FileGenPrecision

Precision used when generating files.

EvalMode

Model evaluation mode.

BaseModel

Base implementation for visual-language model file generation.

Module Contents

class sima_utils.transformer.model.base.FileGenMode

File generation mode.

ONNX
MODEL_SDK_QUANTIZE
MODEL_SDK_COMPILE
DEVKIT
ALL

class sima_utils.transformer.model.base.FileGenPrecision

Precision used when generating files.

BF16 = 'bf16'
A_BF16_W_INT8 = 'a_bf16_w_int8'
A_BF16_W_INT4 = 'a_bf16_w_int4'

class sima_utils.transformer.model.base.EvalMode

Model evaluation mode.

HF = 'hf'
ONNX = 'onnx'
SDK = 'sdk'
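
The string values listed for FileGenPrecision and EvalMode make it convenient to map user-supplied text (for example, a command-line option) to a member. The sketch below assumes both classes behave like standard enum.Enum types; the two classes defined here are stand-ins that mirror the documented values rather than the real sima_utils definitions, and parse_precision is a hypothetical helper.

```python
from enum import Enum


class FileGenPrecision(Enum):
    """Stand-in mirroring the documented members (not the real class)."""
    BF16 = "bf16"
    A_BF16_W_INT8 = "a_bf16_w_int8"
    A_BF16_W_INT4 = "a_bf16_w_int4"


class EvalMode(Enum):
    """Stand-in mirroring the documented members (not the real class)."""
    HF = "hf"
    ONNX = "onnx"
    SDK = "sdk"


def parse_precision(text: str) -> FileGenPrecision:
    """Hypothetical helper: look up a precision member by its string value."""
    # Enum lookup by value raises ValueError for unknown strings.
    return FileGenPrecision(text.strip().lower())


precision = parse_precision(" BF16 ")
eval_mode = EvalMode("onnx")
```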

class sima_utils.transformer.model.base.BaseModel

Base implementation for visual-language model file generation.

cfg

Configuration of the model.

model_name

Name of the model. This will be used to determine the generated files’ names.

onnx_path

Path to store the ONNX files.

sima_path

Path to store the SiMa-specific files.

hf_model

LocalHuggingFaceModel object for obtaining the parameters to generate ONNX files.

onnx_file_name

File name of the generated ONNX file.

weight_prefix

The prefix of weight tensor names in the source model.

cfg: sima_utils.transformer.vlm_config.BaseConfig
model_name: str
onnx_path: pathlib.Path = 'onnx_files'
sima_path: pathlib.Path = 'sima_files'
hf_model: sima_utils.transformer.hf_transformer.LocalHuggingFaceModel | None = None
vlm_helper: sima_utils.transformer.vlm_config.VlmHelper | None = None

gen_files(gen_mode: FileGenMode, *, precision: FileGenPrecision = FileGenPrecision.BF16, log_level: int = logging.NOTSET, resume: bool = False)

Generates files based on the provided file generation mode.

Parameters:
  • gen_mode – File generation mode.

  • precision – The precision to be used for Model SDK quantization mode.

  • log_level – Logging level.

  • resume – If set, generate a file only when it cannot be found; files that already exist are left as they are.
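
Because of the bare * in the signature, precision, log_level and resume must be passed by keyword. The sketch below shows a call in that shape. RecordingModel is a hypothetical stand-in that only records its arguments, and the plain strings stand in for FileGenMode and FileGenPrecision members; with the real library one would presumably pass values such as FileGenMode.ALL and FileGenPrecision.A_BF16_W_INT8 on a BaseModel subclass.

```python
import logging


class RecordingModel:
    """Hypothetical stand-in that records how gen_files was called."""

    def __init__(self):
        self.calls = []

    def gen_files(self, gen_mode, *, precision="bf16",
                  log_level=logging.NOTSET, resume=False):
        # Same parameter layout as the documented signature.
        self.calls.append((gen_mode, precision, log_level, resume))


def generate_with_resume(model, gen_mode, precision):
    """Request generation, skipping files that already exist."""
    # Options after the bare `*` are keyword-only, so name them explicitly.
    model.gen_files(gen_mode, precision=precision,
                    log_level=logging.INFO, resume=True)


model = RecordingModel()
generate_with_resume(model, "ALL", "a_bf16_w_int8")
```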

run_model(eval_mode: EvalMode, ifms: list[numpy.ndarray]) → list[numpy.ndarray]

Runs the model based on the evaluation mode.
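
One plausible use of run_model is to feed the same inputs through several evaluation modes and compare the results afterwards. The sketch below collects the outputs per mode. EchoModel is a hypothetical stand-in, the mode strings stand in for EvalMode members, and plain lists stand in for the numpy arrays a real model would take and return.

```python
def run_in_modes(model, eval_modes, ifms):
    """Run the same inputs once per evaluation mode, keyed by mode."""
    return {mode: model.run_model(mode, ifms) for mode in eval_modes}


class EchoModel:
    """Hypothetical stand-in exposing the documented run_model signature."""

    def run_model(self, eval_mode, ifms):
        # A real model would compute outputs; the stand-in returns a copy
        # of its inputs so the example runs without the library.
        return list(ifms)


results = run_in_modes(EchoModel(), ["hf", "onnx"], [[1.0, 2.0]])
```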

property vision_model_name: str
property language_model_name: str
property onnx_file_name: pathlib.Path
property sima_model_sdk_path: pathlib.Path

Path to the generated quantized Model SDK files.

property sima_mpk_path: pathlib.Path

Path to the generated MPK files.

property sdk_file_name: pathlib.Path

Path to the generated quantized Model SDK file.

property mpk_file_name: pathlib.Path

Path to the generated MPK file.

property sima_devkit_path: pathlib.Path

Path to the generated files for DEVKIT.

get_gen_file_name(gen_mode: FileGenMode) → pathlib.Path

gen_onnx_files()

Generates ONNX files.

gen_model_sdk_files(precision: FileGenPrecision, log_level: int)

Generates quantized Model SDK files.

Parameters:
  • precision – Precision used for quantization.

  • log_level – Logging level.

gen_mpk_files(log_level: int) → afe.apis.model.Model

Generates MPK files.

Parameters:
  • log_level – Logging level.

gen_devkit_files(resume: bool = False)

Generates files for devkit.
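
The resume flag is described for gen_files as generating only when the file cannot be found. Assuming gen_devkit_files applies the same rule, the behaviour can be sketched with a small helper. generate_if_needed and the file name model.bin are hypothetical and are not part of sima_utils.

```python
import tempfile
from pathlib import Path


def generate_if_needed(path: Path, generate, resume: bool = False) -> bool:
    """Return True if `generate` ran, False if an existing file was kept."""
    if resume and path.exists():
        return False  # resume mode: keep the file that is already there
    generate(path)
    return True


def write_placeholder(path: Path) -> None:
    # Stand-in for an expensive generation step.
    path.write_text("generated")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "model.bin"
    first = generate_if_needed(target, write_placeholder, resume=True)
    second = generate_if_needed(target, write_placeholder, resume=True)
    third = generate_if_needed(target, write_placeholder, resume=False)
```

With resume=True the second call is skipped because the file already exists; with resume=False the file is regenerated unconditionally.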

check_hf_param(name: str) → bool

Checks if a parameter tensor exists in the LocalHuggingFaceModel object.

Parameters:
  • name – Full name of the parameter.

Returns:

True if the parameter tensor exists.

get_hf_param(name: str) → numpy.ndarray

Gets the parameter tensor from the LocalHuggingFaceModel object.

Parameters:
  • name – Full name of the parameter.

Returns:

The parameter tensor as a NumPy array.
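
check_hf_param and get_hf_param pair naturally when a tensor such as a bias may be absent from the source model. The sketch below guards the lookup with the existence check. DictBackedModel is a hypothetical stand-in that keeps parameters in a dict, get_optional_param is a hypothetical helper, and the parameter names and list values are illustrative only; a real model would return numpy arrays.

```python
class DictBackedModel:
    """Hypothetical stand-in; parameters live in a plain dict."""

    def __init__(self, params):
        self._params = params

    def check_hf_param(self, name):
        return name in self._params

    def get_hf_param(self, name):
        return self._params[name]


def get_optional_param(model, name, default=None):
    """Return the named parameter, or `default` when it does not exist."""
    if model.check_hf_param(name):
        return model.get_hf_param(name)
    return default


model = DictBackedModel({"layer.weight": [1.0, 2.0]})
weight = get_optional_param(model, "layer.weight")
bias = get_optional_param(model, "layer.bias")
```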

create_onnx_builder()

Creates the ONNX builder.

gen_files_from_model_list(model_list: list[tuple[BaseModel, FileGenPrecision]], gen_mode: FileGenMode, num_processes: int, log_level: int, resume: bool)