afe.apis.model
Classes
Module Contents
- class afe.apis.model.Model(net: afe.ir.net.AwesomeNet, fp32_net: afe.ir.net.AwesomeNet | None = None)
- execute(inputs: afe.apis.defines.InputValues, *, fast_mode: bool = False, log_level: int | None = logging.NOTSET, keep_layer_outputs: list[afe.ir.defines.NodeName] | str | None = None, output_file_path: str | None = None) -> List[numpy.ndarray]
- Run input data through the quantized model.
- Parameters:
- inputs – Dictionary mapping placeholder node names (str) to the input data.
- fast_mode – If True, use a fast implementation of operators. If False, use an implementation that exactly matches execution on the MLA.
- log_level – Logging level.
- keep_layer_outputs – List of quantized model layer output names to save. Each element of the list must be a valid layer output name in the model. If set to "all", all intermediate results are saved.
- output_file_path – Location where the layer outputs are saved. If set, keep_layer_outputs must also be provided.
 
- Returns: Outputs of the quantized model. The requested intermediate results are also saved to the output_file_path location.
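The constraint between output_file_path and keep_layer_outputs can be checked before calling execute. The following is a minimal sketch, not part of the afe API: run_quantized is a hypothetical helper, and model is assumed to be any object exposing the execute method documented above.

```python
from typing import Any, Mapping, Optional, Sequence, Union


def run_quantized(model: Any,
                  inputs: Mapping[str, Any],
                  keep_layer_outputs: Optional[Union[Sequence[str], str]] = None,
                  output_file_path: Optional[str] = None,
                  fast_mode: bool = False):
    """Hypothetical helper: validate the documented constraint, then call execute()."""
    # Documented rule: output_file_path is only valid together with keep_layer_outputs.
    if output_file_path is not None and keep_layer_outputs is None:
        raise ValueError("output_file_path requires keep_layer_outputs")
    # Pass "all" through unchanged; turn any other sequence of names into a list.
    if keep_layer_outputs is not None and not isinstance(keep_layer_outputs, str):
        keep_layer_outputs = list(keep_layer_outputs)
    # inputs maps placeholder node names to input data, as described above.
    return model.execute(
        dict(inputs),
        fast_mode=fast_mode,
        keep_layer_outputs=keep_layer_outputs,
        output_file_path=output_file_path,
    )
```

With a real Model instance, a call such as run_quantized(model, {"input_0": data}, keep_layer_outputs="all", output_file_path="layers") would request every intermediate result; the node name "input_0" and the path "layers" are placeholders chosen for illustration.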
 
- save(model_name: str, output_directory: str = '', *, log_level: int | None = logging.NOTSET) -> None
- Save the quantized model and its floating-point counterpart (if available) to the specified directory. Defaults to the current working directory if no output directory is provided.
- Parameters:
- model_name (str) – Name for the saved quantized model file with a .sima extension.
- output_directory (str, optional) – Directory to save the model files. Defaults to the current working directory.
- log_level (Optional[int], optional) – Logging level for the operation. Defaults to logging.NOTSET.
 
- Raises:
- UserFacingException – If an error occurs during the save process.
- Example:
  >>> model = Model(quantized_net, fp32_net)
  >>> model.save("my_model", output_directory="models/")
- static load(model_name: str, network_directory: str = '', *, log_level: int | None = logging.NOTSET) -> Model
- compile(output_path: str, batch_size: int = 1, compress: bool = True, log_level: int | None = logging.NOTSET, tessellate_parameters: afe.backends.mpk.interface.TessellateParameters | None = None, l2_caching_mode: afe.backends.mpk.interface.L2CachingMode = L2CachingMode.NONE, **kwargs) -> None
- Compile the quantized model into a .tar.gz package for deployment in an MPK package. The compiled package includes the binary model and a JSON structure file, saved in output_path as <model_name>_mpk.tar.gz. Batch size can be specified, though compiler optimizations may adjust it for optimal performance. The first dimension of input tensors must represent batch size.
- Parameters:
- output_path (str) – Directory to save the compiled package. Created if it doesn't exist.
- batch_size (int, optional) – Batch size for compilation. Defaults to 1.
- compress (bool, optional) – Enable DRAM data compression for the .lm file. Defaults to True.
- log_level (Optional[int], optional) – Logging level. Defaults to logging.NOTSET.
- tessellate_parameters (Optional[TessellateParameters], optional) – Internal use for MLA tessellation parameters.
- l2_caching_mode (L2CachingMode, optional) – Internal use for N2A compiler's L2 caching. Defaults to L2CachingMode.NONE.
- **kwargs – Additional internal options, including:
  - retained_temporary_directory_name (str): Path to retain intermediate files.
  - use_power_limits (bool): Enable power limits during compilation.
  - max_mla_power (float): Set maximum MLA power consumption.
  - layer_norm_use_fp32_intermediates (bool): Use FP32 intermediates for layer normalization.
  - rms_norm_use_fp32_intermediates (bool): Use FP32 intermediates for RMS normalization.
 
 
- Raises:
- UserFacingException – If compilation fails due to invalid parameters or errors.
- Example:
  >>> model = Model(quantized_net)
  >>> model.compile(output_path="compiled_models/", batch_size=4, compress=True)
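After compile returns, the package location follows from the naming rule described above (<model_name>_mpk.tar.gz inside output_path). The sketch below uses only the Python standard library to build that path and list the archive members. Both function names are hypothetical helpers, not afe APIs, and the caller is assumed to know the model name that the compiler used.

```python
import os
import tarfile
from typing import List


def expected_package_path(output_path: str, model_name: str) -> str:
    """Build the documented package location: <output_path>/<model_name>_mpk.tar.gz."""
    return os.path.join(output_path, f"{model_name}_mpk.tar.gz")


def list_package_members(package_path: str) -> List[str]:
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(package_path, "r:gz") as archive:
        return sorted(archive.getnames())
```

For instance, list_package_members(expected_package_path("compiled_models/", "my_model")) would show the files inside the package, which according to the description above should include the binary model and a JSON structure file. The name "my_model" is a placeholder.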
- static create_auxiliary_network(transforms: List[afe.apis.transform.Transform], input_types: Dict[afe.ir.defines.InputName, afe.ir.tensor_type.TensorType], *, target: sima_utils.common.Platform = gen1_target, log_level: int | None = logging.NOTSET) -> Model
- static compose(nets: List[Model], combined_model_name: str = 'main', log_level: int | None = logging.NOTSET) -> Model
- evaluate(evaluation_data: Iterable[Tuple[afe.apis.defines.InputValues, afe.apis.compilation_job_base.GroundTruth]], criterion: afe.apis.statistic.Statistic[Tuple[List[numpy.ndarray], afe.apis.compilation_job_base.GroundTruth], str], *, fast_mode: bool = False, log_level: int | None = logging.NOTSET) -> str
- Evaluate the model using the provided evaluation data and criterion. This method runs the model on the given dataset and computes an aggregate result using the specified criterion. It supports a fast execution mode for quicker evaluations and customizable logging levels for diagnostic purposes.
- Parameters:
- evaluation_data – An iterable of tuples, each containing the input values and the corresponding ground truth.
- criterion – A statistic that computes the evaluation metric from the model's outputs and the ground truth.
- fast_mode – Optional; if True, evaluation uses the fast implementation of operators instead of one that exactly matches execution on the MLA. Defaults to False.
- log_level – Optional; specifies the logging level for evaluation. Defaults to logging.NOTSET.
 
- Returns:
- A string representing the final result of the evaluation based on the criterion. 
- Raises:
- Exception – Raised if an error occurs during model execution; the error is sanitized and re-raised.
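The shape of evaluation_data can be illustrated with a small generator that pairs each input dictionary with its ground truth. This is a rough sketch under stated assumptions: make_evaluation_data and manual_accuracy are hypothetical helpers, and manual_accuracy is only a conceptual stand-in for the run-and-aggregate behaviour described above. It does not use the afe.apis.statistic.Statistic interface, which is not documented in this section.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple


def make_evaluation_data(samples: Iterable[Any],
                         labels: Iterable[Any],
                         input_name: str) -> Iterator[Tuple[Dict[str, Any], Any]]:
    """Yield (inputs, ground_truth) tuples, the element shape evaluate() accepts."""
    for sample, label in zip(samples, labels):
        # Each inputs dict maps a placeholder node name to one sample.
        yield {input_name: sample}, label


def manual_accuracy(model: Any,
                    evaluation_data: Iterable[Tuple[Dict[str, Any], Any]],
                    decode: Callable[[List[Any]], Any]) -> str:
    """Conceptual stand-in: run each sample, compare with ground truth, report a string."""
    correct = 0
    total = 0
    for inputs, truth in evaluation_data:
        outputs = model.execute(inputs, fast_mode=True)
        # decode is a caller-supplied function turning model outputs into a prediction.
        correct += int(decode(outputs) == truth)
        total += 1
    return f"accuracy: {correct}/{total}"
```

The generator output can be passed as evaluation_data to evaluate together with a real criterion; the manual loop is useful mainly for checking that the data pairs are built as intended.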
 
- analyze_quantization_error(evaluation_data: Iterable[afe.apis.defines.InputValues], error_metric: afe.core.graph_analyzer.utils.Metric, *, local_feed: bool, log_level: int | None = logging.NOTSET)
- get_performance_metrics(output_kpi_path: str, *, log_level: int | None = logging.NOTSET)
- generate_elf_and_reference_files(input_data: Iterable[afe.apis.defines.InputValues], output_dir: str, *, batch_size: int = 1, compress: bool = True, tessellate_parameters: afe.backends.mpk.interface.TessellateParameters | None = None, log_level: int | None = logging.NOTSET, l2_caching_mode: afe.backends.mpk.interface.L2CachingMode = L2CachingMode.NONE) -> None
- execute_in_accelerator_mode(input_data: Iterable[afe.apis.defines.InputValues], devkit: str, *, username: str = cp.DEFAULT_USERNAME, password: str = '', batch_size: int = 1, compress: bool = True, tessellate_parameters: afe.backends.mpk.interface.TessellateParameters | None = None, log_level: int | None = logging.NOTSET, l2_caching_mode: afe.backends.mpk.interface.L2CachingMode = L2CachingMode.NONE) -> List[numpy.ndarray]