afe.apis.loaded_net
===================

.. py:module:: afe.apis.loaded_net


Attributes
----------

.. autoapisummary::

   afe.apis.loaded_net.GroundTruth


Classes
-------

.. autoapisummary::

   afe.apis.loaded_net.LoadedNet


Functions
---------

.. autoapisummary::

   afe.apis.loaded_net.load_model


Module Contents
---------------

.. py:data:: GroundTruth

.. py:class:: LoadedNet(mod: afe._tvm._defines.TVMIRModule, layout: str, target: sima_utils.common.Platform, *, output_labels: list[str] | None, model_path: str | None)

   .. py:method:: execute(inputs: afe.apis.defines.InputValues, *, log_level: int = logging.NOTSET) -> list[numpy.ndarray]

      Execute the loaded network using a software implementation of operators.

      This method runs the network with a single set of input tensor values and returns the
      corresponding output tensor values. The execution does not simulate processor behavior;
      instead, it uses TVM operators for both FP32 and quantized models. Input and output
      tensors are automatically transposed if the model layout requires it.

      :param inputs: A dictionary mapping input names to their corresponding tensor data.
          Input tensors must be in channel-last layout (e.g., NHWC or NDHWC).
      :type inputs: InputValues
      :param log_level: Sets the logging level for this API call. Defaults to ``logging.NOTSET``.
      :type log_level: int, optional
      :returns: A list of output tensors resulting from the model execution.
      :rtype: list[np.ndarray]
      :raises UserFacingException: If an error occurs during the execution process.

      Execution Details:

      - Inputs are automatically transposed to match the model's expected layout if necessary.
      - Outputs are transposed back to channel-last layout for consistency with API requirements.
      - Supports 4D (NCHW/NHWC) and 5D (NCDHW/NDHWC) tensor formats.
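      .. rubric:: Example

      A minimal sketch of preparing a channel-last input for ``execute``. The names
      ``loaded_net`` (a ``LoadedNet`` returned by ``load_model``) and ``input_1`` are
      assumptions for illustration; substitute your model's actual input names.

      .. code-block:: python

         import numpy as np

         # execute() expects channel-last inputs; if your data is NCHW, transpose first.
         nchw = np.random.rand(1, 3, 224, 224).astype(np.float32)
         nhwc = np.transpose(nchw, (0, 2, 3, 1))   # shape becomes (1, 224, 224, 3)

         # Hypothetical call; 'loaded_net' and 'input_1' are placeholders.
         # outputs = loaded_net.execute(inputs={'input_1': nhwc})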
   .. py:method:: quantize(calibration_data: Iterable[afe.apis.defines.InputValues], quantization_config: afe.apis.defines.QuantizationParams, *, automatic_layout_conversion: bool = False, arm_only: bool = False, simulated_arm: bool = False, model_name: str | None = None, log_level: int = logging.NOTSET) -> afe.apis.model.Model

      Quantize the loaded neural network model using the provided calibration data and
      quantization configuration.

      If ``arm_only`` is ``False``, the model is calibrated and quantized for efficient
      execution on the SiMa MLSoC. If ``arm_only`` is ``True``, quantization is skipped and
      the model is compiled for ARM execution, which is useful for testing.

      :param calibration_data: Dataset for calibration. Each sample is a dictionary mapping
          input names to calibration data.
      :type calibration_data: Iterable[InputValues]
      :param quantization_config: Parameters controlling the calibration and quantization
          process.
      :type quantization_config: QuantizationParams
      :param automatic_layout_conversion: Enable automatic layout conversion during
          processing. Defaults to ``False``.
      :type automatic_layout_conversion: bool, optional
      :param arm_only: Skip quantization and compile for ARM. Useful for testing.
          Defaults to ``False``.
      :type arm_only: bool, optional
      :param simulated_arm: Reserved for internal use. Simulates ARM backend behavior
          without compilation. Defaults to ``False``.
      :type simulated_arm: bool, optional
      :param model_name: Name for the returned quantized model. Defaults to ``None``.
      :type model_name: Optional[str], optional
      :param log_level: Logging level for this API call. Defaults to ``logging.NOTSET``.
      :type log_level: int, optional
      :returns: The quantized model instance, or an ARM-prepared model if ``arm_only`` is
          ``True``.
      :rtype: Model
      :raises ValueError: If an invalid combination of parameters is provided (e.g., both
          ``arm_only`` and ``simulated_arm`` set to ``True``).
      :raises UserFacingException: If an error occurs during calibration or quantization.
      .. rubric:: Example

      .. code-block:: python

         # Load pre-processed calibration data
         dataset_f = np.load('preprocessed_data.npz')
         data = dataset_f['x']

         # Prepare calibration data as a list of dictionaries
         calib_data = []
         calib_images = 100
         for i in range(calib_images):
             inputs = {'input_1': data[i]}
             calib_data.append(inputs)

         # Quantize the model
         quant_model = loaded_net.quantize(
             calibration_data=calib_data,
             quantization_config=default_quantization,
             model_name='quantized_model'
         )

   .. py:method:: quantize_with_accuracy_feedback(calibration_data: Iterable[afe.apis.defines.InputValues], evaluation_data: Iterable[tuple[afe.apis.defines.InputValues, GroundTruth]], quantization_config: afe.apis.defines.QuantizationParams, *, accuracy_score: afe.driver.statistic.Statistic[tuple[list[numpy.ndarray], GroundTruth], float], target_accuracy: float, automatic_layout_conversion: bool = False, max_optimization_steps: int | None = None, model_name: str | None = None, log_level: int = logging.NOTSET) -> afe.apis.model.Model

      Quantize the model with accuracy feedback using a mixed-precision approach.

      This method performs quantization with iterative accuracy feedback to ensure the final
      model meets the specified target accuracy. The process involves calibrating the model,
      evaluating its accuracy, and adjusting precision through multiple optimization steps if
      necessary.

      :param calibration_data: Required. The dataset used for model calibration. Each sample
          is a dictionary mapping input names to corresponding calibration data.
      :type calibration_data: Iterable[InputValues]
      :param evaluation_data: Required. The dataset used to evaluate model accuracy, where
          each element is a tuple containing input data and corresponding ground truth.
      :type evaluation_data: Iterable[tuple[InputValues, GroundTruth]]
      :param quantization_config: Required. Configuration parameters that define how the
          quantization process is performed.
      :type quantization_config: QuantizationParams
      :param accuracy_score: Required.
          The evaluation metric used to calculate accuracy during the quantization process.
      :type accuracy_score: Statistic[tuple[list[np.ndarray], GroundTruth], float]
      :param target_accuracy: Required. The target accuracy value that the quantized model
          must achieve.
      :type target_accuracy: float
      :param automatic_layout_conversion: Enables automatic layout conversion during
          processing. Defaults to ``False``.
      :type automatic_layout_conversion: bool, optional
      :param max_optimization_steps: Maximum number of optimization steps for mixed-precision
          quantization. Must be greater than 1. Defaults to ``_MIXED_PRECISION_SEARCH_LIMIT``
          if not specified.
      :type max_optimization_steps: Optional[int], optional
      :param model_name: The name for the resulting quantized model. Defaults to ``None``.
      :type model_name: Optional[str], optional
      :param log_level: Sets the logging level for the process. Defaults to
          ``logging.NOTSET``.
      :type log_level: int, optional
      :returns: The quantized model along with its corresponding floating-point model.
      :rtype: Model
      :raises UserFacingException:
          - If activation quantization parameters are unsupported (only 8-bit precision is
            supported).
          - If ``max_optimization_steps`` is less than or equal to 1.
          - If an error occurs during the mixed-precision quantization process.

   .. py:method:: convert_to_sima_quantization(*, requantization_mode: afe.ir.defines.RequantizationMode = RequantizationMode.sima, model_name: str | None = None, log_level: int = logging.NOTSET) -> afe.apis.model.Model

.. py:function:: load_model(params: afe.load.importers.general_importer.ImporterParams, *, target: sima_utils.common.Platform = gen1_target, log_level: int = logging.NOTSET) -> LoadedNet

   Load a machine learning model into the SiMa Model SDK for further processing such as
   quantization or compilation.
   This function validates the input parameters, detects the model format from the provided
   file paths, and ensures that the required fields (such as input shapes, input names, and
   output names) are populated according to the model type. If the model is successfully
   validated and imported, a `LoadedNet` instance is returned for downstream use.

   :param params: Import parameters including model file paths, input shapes, input types,
       names, and other configurations.
   :type params: ImporterParams
   :param target: Target platform for which the model should be loaded. Defaults to
       `gen1_target`.
   :type target: Platform, optional
   :param log_level: Logging level for the loading process. Defaults to `logging.NOTSET`.
   :type log_level: int, optional
   :returns: An object representing the successfully loaded model, ready for quantization,
       compilation, or other SDK operations.
   :rtype: LoadedNet
   :raises UserFacingException:
       - If no model file paths are provided.
       - If the detected model format does not match the expected format.
       - If required parameters for the detected model format are missing or invalid.
       - If the model format is unsupported.

   Supported Model Formats and Required Parameters:

   - ONNX, TFLite, Caffe, Caffe2: require non-empty `input_types` and `input_shapes`.
   - PyTorch: requires non-empty `input_names` and `input_shapes`.
   - TensorFlow (v1 & v2): requires non-empty `output_names` and `input_shapes`.
   - Keras: requires non-empty `input_shapes`.

   .. rubric:: Example

   >>> params = ImporterParams(
   ...     file_paths=["model.onnx"],
   ...     input_shapes={"input_1": (1, 3, 224, 224)},
   ...     input_types={"input_1": "float32"}
   ... )
   >>> loaded_model = load_model(params)
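   The ``accuracy_score`` passed to ``LoadedNet.quantize_with_accuracy_feedback`` aggregates
   per-sample scores over the evaluation data. As an illustration only, a per-sample top-1
   metric over ``execute``-style outputs might look like the helper below; this function is
   hypothetical and not part of the SDK, and how such logic is wrapped into a ``Statistic``
   is defined by ``afe.driver.statistic``.

   .. code-block:: python

      import numpy as np

      def top1_correct(outputs, ground_truth):
          # outputs: list of np.ndarray, as produced by LoadedNet.execute()
          # ground_truth: integer class label for the sample
          # Returns 1.0 when the argmax of the first output matches the label.
          return float(np.argmax(outputs[0]) == ground_truth)

   A mixed-precision search then averages such per-sample scores and compares the result
   against ``target_accuracy``.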