afe.apis.loaded_net

Classes

LoadedNet

Functions

load_model – Load a machine learning model into the SiMa Model SDK for further processing such as quantization or compilation.

Module Contents
- class afe.apis.loaded_net.LoadedNet(mod: afe._tvm._defines.TVMIRModule, layout: str, target: sima_utils.common.Platform, *, output_labels: list[str] | None, model_path: str | None)[source]
- execute(inputs: afe.apis.defines.InputValues, *, log_level: int = logging.NOTSET) → list[numpy.ndarray][source]
Execute the loaded network using a software implementation of operators.
This method runs the network with a single set of input tensor values and returns the corresponding output tensor values. The execution does not simulate processor behavior but instead uses TVM operators for both FP32 and quantized models. Input and output tensors are automatically transposed if the model layout requires it.
- Parameters:
inputs (InputValues) – A dictionary mapping input names to their corresponding tensor data. Input tensors must be in channel-last layout (e.g., NHWC or NDHWC).
log_level (int, optional) – Sets the logging level for this API call. Defaults to logging.NOTSET.
- Returns:
A list of output tensors resulting from the model execution.
- Return type:
list[np.ndarray]
- Raises:
UserFacingException – If an error occurs during the execution process.
- Execution Details:
Inputs are automatically transposed to match the model's expected layout if necessary.
Outputs are also transposed back to channel-last layout for consistency with API requirements.
Supports 4D (NCHW/NHWC) and 5D (NCDHW/NDHWC) tensor formats.
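Example

A minimal usage sketch. The input name 'input_1' and the 1x224x224x3 (NHWC) shape are illustrative assumptions; substitute the input names and shapes of your own model.
import numpy as np

# Assumed: loaded_net was returned by load_model() for a model with a
# single input named 'input_1' expecting a channel-last (NHWC) tensor.
image = np.random.rand(1, 224, 224, 3).astype(np.float32)

outputs = loaded_net.execute(inputs={'input_1': image})
print(outputs[0].shape)  # first output tensor, in channel-last layout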
- quantize(calibration_data: Iterable[afe.apis.defines.InputValues], quantization_config: afe.apis.defines.QuantizationParams, *, automatic_layout_conversion: bool = False, arm_only: bool = False, simulated_arm: bool = False, model_name: str | None = None, log_level: int = logging.NOTSET) → afe.apis.model.Model[source]
Quantize the loaded neural network model using the provided calibration data and quantization configuration.
If arm_only is False, the model is calibrated and quantized for efficient execution on the SiMa MLSoC. If arm_only is True, quantization is skipped and the model is compiled for ARM execution, which is useful for testing.
- Parameters:
calibration_data (Iterable[InputValues]) – Dataset for calibration. Each sample is a dictionary mapping input names to calibration data.
quantization_config (QuantizationParams) – Parameters controlling the calibration and quantization process.
automatic_layout_conversion (bool, optional) – Enable automatic layout conversion during processing. Defaults to False.
arm_only (bool, optional) – Skip quantization and compile for ARM. Useful for testing. Defaults to False.
simulated_arm (bool, optional) – Reserved for internal use. Simulates ARM backend behavior without compilation. Defaults to False.
model_name (Optional[str], optional) – Name for the returned quantized model. Defaults to None.
log_level (int, optional) – Logging level for this API call. Defaults to logging.NOTSET.
- Returns:
The quantized model instance, or an ARM-prepared model if arm_only is True.
- Return type:
Model
- Raises:
ValueError – If an invalid combination of parameters is provided (e.g., both arm_only and simulated_arm set to True).
UserFacingException – If an error occurs during calibration or quantization.
Example
import numpy as np

# Load pre-processed calibration data
dataset_f = np.load('preprocessed_data.npz')
data = dataset_f['x']

# Prepare calibration data as a list of dictionaries
calib_data = []
calib_images = 100
for i in range(calib_images):
    inputs = {'input_1': data[i]}
    calib_data.append(inputs)

# Quantize the model
quant_model = loaded_net.quantize(
    calibration_data=calib_data,
    quantization_config=default_quantization,
    model_name='quantized_model'
)
- quantize_with_accuracy_feedback(calibration_data: Iterable[afe.apis.defines.InputValues], evaluation_data: Iterable[tuple[afe.apis.defines.InputValues, GroundTruth]], quantization_config: afe.apis.defines.QuantizationParams, *, accuracy_score: afe.driver.statistic.Statistic[tuple[list[numpy.ndarray], GroundTruth], float], target_accuracy: float, automatic_layout_conversion: bool = False, max_optimization_steps: int | None = None, model_name: str | None = None, log_level: int = logging.NOTSET) → afe.apis.model.Model[source]
Quantizes the model with accuracy feedback using a mixed-precision approach.
This method performs quantization with iterative accuracy feedback to ensure the final model meets the specified target accuracy. The process involves calibrating the model, evaluating its accuracy, and adjusting precision through multiple optimization steps if necessary.
- Parameters:
calibration_data (Iterable[InputValues]) – Required. The dataset used for model calibration. Each sample is a dictionary mapping input names to corresponding calibration data.
evaluation_data (Iterable[tuple[InputValues, GroundTruth]]) – Required. The dataset used to evaluate model accuracy, where each element is a tuple containing input data and the corresponding ground truth.
quantization_config (QuantizationParams) – Required. Configuration parameters that define how the quantization process is performed.
accuracy_score (Statistic[tuple[list[np.ndarray], GroundTruth], float]) – Required. The evaluation metric used to calculate accuracy during the quantization process.
target_accuracy (float) – Required. The target accuracy that the quantized model must achieve.
automatic_layout_conversion (bool, optional) – Enables automatic layout conversion during processing. Defaults to False.
max_optimization_steps (Optional[int], optional) – Maximum number of optimization steps for mixed-precision quantization. Must be greater than 1. Defaults to _MIXED_PRECISION_SEARCH_LIMIT if not specified.
model_name (Optional[str], optional) – The name for the resulting quantized model. Defaults to None.
log_level (int, optional) – Sets the logging level for the process. Defaults to logging.NOTSET.
- Returns:
The quantized model along with its corresponding floating-point model.
- Return type:
Model
- Raises:
UserFacingException –
If activation quantization parameters are unsupported (only 8-bit precision is supported).
If max_optimization_steps is less than or equal to 1.
If an error occurs during the mixed-precision quantization process.
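Example

A minimal sketch of a typical call. The variables calib_data and eval_data, and the statistic top1_accuracy, are illustrative assumptions prepared by the user; they are not provided by this API.
# Assumed: calib_data is an Iterable[InputValues]; eval_data is an
# Iterable[tuple[InputValues, GroundTruth]]; top1_accuracy is a
# user-supplied Statistic computing top-1 accuracy from
# (model outputs, ground truth) pairs.
quant_model = loaded_net.quantize_with_accuracy_feedback(
    calibration_data=calib_data,
    evaluation_data=eval_data,
    quantization_config=default_quantization,
    accuracy_score=top1_accuracy,
    target_accuracy=0.70,
    model_name='quantized_model_mp'
)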
- convert_to_sima_quantization(*, requantization_mode: afe.ir.defines.RequantizationMode = RequantizationMode.sima, model_name: str | None = None, log_level: int = logging.NOTSET) → afe.apis.model.Model[source]
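A hypothetical call sketch based only on the signature above, since the method is otherwise undocumented here:
# Assumed: loaded_net was returned by load_model(); keyword arguments
# follow the signature shown above.
sima_model = loaded_net.convert_to_sima_quantization(model_name='sima_quant_model')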
- afe.apis.loaded_net.load_model(params: afe.load.importers.general_importer.ImporterParams, *, target: sima_utils.common.Platform = gen1_target, log_level: int = logging.NOTSET) → LoadedNet[source]
Load a machine learning model into the SiMa Model SDK for further processing such as quantization or compilation.
This function validates the input parameters, detects the model format from the provided file paths, and ensures that the required fields (like input shapes, input names, output names) are populated according to the model type. If the model is successfully validated and imported, a LoadedNet instance is returned for downstream use.
- Parameters:
params (ImporterParams) – Import parameters including model file paths, input shapes, input types, names, and other configurations.
target (Platform, optional) – Target platform for which the model should be loaded. Defaults to gen1_target.
log_level (int, optional) – Logging level for the loading process. Defaults to logging.NOTSET.
- Returns:
An object representing the successfully loaded model, ready for quantization, compilation, or other SDK operations.
- Return type:
LoadedNet
- Raises:
UserFacingException –
If no model file paths are provided.
If the detected model format does not match the expected format.
If required parameters for the detected model format are missing or invalid.
If the model format is unsupported.
- Supported Model Formats and Required Parameters:
- ONNX, TFLite, Caffe, Caffe2:
Requires non-empty input_types and input_shapes.
- PyTorch:
Requires non-empty input_names and input_shapes.
- TensorFlow (v1 & v2):
Requires non-empty output_names and input_shapes.
- Keras:
Requires non-empty input_shapes.
Example
>>> params = ImporterParams(
...     file_paths=["model.onnx"],
...     input_shapes={"input_1": (1, 3, 224, 224)},
...     input_types={"input_1": "float32"}
... )
>>> loaded_model = load_model(params)
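For a PyTorch model, input_names and input_shapes are the required fields instead. A hypothetical sketch under that assumption (the file name and input name are illustrative, and the exact form of these fields may differ):
>>> pt_params = ImporterParams(
...     file_paths=["model.pt"],
...     input_names=["input_1"],
...     input_shapes={"input_1": (1, 3, 224, 224)}
... )
>>> loaded_model = load_model(pt_params)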