afe.driver.passes

Compiler passes.

These functions wrap IR transforms into compiler passes to be called by driver or API functions. Driver code should construct a compilation pipeline in CompileStep, then run it.
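
The construct-then-run pattern described above can be sketched as follows. This is only an illustration: the CompileStep API is not documented in this section, so the `Step` class and its `then` and `run` methods are hypothetical stand-ins, as are the two toy passes.

```python
from typing import Callable, Generic, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


class Step(Generic[A]):
    """Hypothetical stand-in for CompileStep: a deferred computation."""

    def __init__(self, thunk: Callable[[], A]) -> None:
        self._thunk = thunk

    def then(self, make_next: Callable[[A], "Step[B]"]) -> "Step[B]":
        # Chain a pass onto this step. Nothing executes until run() is called.
        return Step(lambda: make_next(self._thunk()).run())

    def run(self) -> A:
        return self._thunk()


log: List[str] = []


def import_pass() -> Step[str]:
    # Toy pass (hypothetical): pretends to import a model.
    def work() -> str:
        log.append("import")
        return "model"
    return Step(work)


def quantize_pass(model: str) -> Step[str]:
    # Toy pass (hypothetical): pretends to quantize the model.
    def work() -> str:
        log.append("quantize")
        return model + "-int8"
    return Step(work)


pipeline = import_pass().then(quantize_pass)  # construction only
log_after_construction = list(log)            # nothing has run yet
result = pipeline.run()                       # the passes execute here
```

Separating construction from execution lets driver code assemble, inspect, or wrap a pipeline before any work is done.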

Attributes

`InputDict`
`MIXED_PRECISION_SEARCH_LIMIT`

Classes

BinarySearchState

State of binary search.

Functions

`import_model`(...)	Create a compiler pass to import a model.
`tvm_transformations`(...)	Create a compiler pass to run TVM transformations on a model.
`import_and_transform`(...)	Create a compiler pass to import and run TVM transformations on a model.
`update_quantization_configs`(...)	Create a compiler pass that records quantization parameters on each node in a network.
`calibration`(→ Callable[[afe.ir.net.AwesomeNet, ...)	Create a compiler pass to calibrate a network.
`quantization`(→ Callable[[afe.ir.net.AwesomeNet], ...)	Create a compiler pass to quantize a network.
`equalization`(→ Callable[[afe.ir.net.AwesomeNet, ...)	Run SmoothQuant and/or channel equalization if they are enabled in config.
`calibration_quantization`(...)	Run calibration and quantization.
`evaluation`(→ Callable[[afe.ir.net.AwesomeNet, ...)	Execute a model on an input set and compute an aggregate result from the model's outputs.
`binary_search`(...)	Do a binary search for the smallest integer n in the range [lo, hi] such that get_result(n) returns True in its bool result.
`noise_analysis`(→ Callable[[afe.ir.net.AwesomeNet, ...)	Analyze noise that is introduced by quantization.
`noise_based_mixed_precision_quantization`(...)	Do mixed-precision quantization using noise analysis to choose precision.
`dump_diagnostic_files`(...)	Save intermediate compilation results to files for diagnostic purposes.
`dump_diagnostic_files_after`(...)	Run a compiler pass, then save intermediate compilation results to files for diagnostic purposes.

Module Contents

afe.driver.passes.InputDict

afe.driver.passes.MIXED_PRECISION_SEARCH_LIMIT = 20

afe.driver.passes.import_model(config: afe.load.importers.general_importer.ImporterParams) → afe.driver.compile_step.CompileStep[Tuple[afe._tvm._defines.TVMIRModule, List[str] | None]]

Create a compiler pass to import a model.

Parameters:

config – Configuration for import of a model.

Returns:

A compiler step to import a model. It returns the imported TVM module and the module’s output names. Output names are only included if the source model has output names.

afe.driver.passes.tvm_transformations(*, layout: str | None = 'NCHW', index_to_backend_dict: Dict[int, afe.backends.Backend] | None = None, is_quantized: bool = False, name: str, framework: str | None = None) → Callable[[afe._tvm._defines.TVMIRModule], afe.driver.compile_step.CompileStep[afe._tvm._defines.TVMIRModule]]

Create a compiler pass to run TVM transformations on a model.

The TVM transformations include ConvertLayout.

Parameters:

layout – Data layout of activation tensors in the input model.
index_to_backend_dict – Assignment of nodes to backends. Assignments given here override the partitioning algorithm’s decision.
is_quantized – Whether the input is quantized. If quantized, partitioning will decide whether a given operator can execute on MLA. If not quantized, it will decide whether a given operator can be quantized and then execute on MLA.
name – Name of the model.

Returns:

Compiler pass that transforms Relay IR.

afe.driver.passes.import_and_transform(config: afe.load.importers.general_importer.ImporterParams, *, name: str, index_to_backend_dict: Dict[int, afe.backends.Backend] | None = None, is_quantized: bool = False) → afe.driver.compile_step.CompileStep[afe._tvm._defines.TVMIRModule]

Create a compiler pass to import and run TVM transformations on a model. See import_model and tvm_transformations for parameter documentation.

The returned compile step does the same processing as the old load_*_model functions.

afe.driver.passes.update_quantization_configs(quantization_config: afe.core.configs.QuantizationConfigs, *, custom_quantization_configs: Dict[afe.ir.net.NodeName, Dict[str, Any]] | None = None) → Callable[[afe.ir.net.AwesomeNet], afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]]

Create a compiler pass that records quantization parameters on each node in a network. The compiler pass modifies the network.

Parameters:

quantization_config – Global configuration parameters for quantization. These will be inserted on all nodes, but will not override previously inserted parameters.
custom_quantization_configs – Dictionary to override quantization settings for specific nodes. This parameter may only be used in tests. Where custom_quantization_configs[node_name][field_name] = value, it will set the given node’s given QuantizationConfigs field to the given value. For example, passing the value {“MLA_1/conv2d_add_84”: {“output_int32”: True}} will override the configuration of the node named “MLA_1/conv2d_add_84” by setting its output_int32 field to True.

Returns:

Compiler pass to update parameters. The pass mutates and returns its input.
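
The override rule for custom_quantization_configs can be sketched with plain data structures. This is a minimal illustration, not the pass itself: `NodeQuantConfig` and `apply_configs` are hypothetical names, only the `output_int32` field and the node name “MLA_1/conv2d_add_84” come from the text above, and the second node name is made up.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class NodeQuantConfig:
    """Hypothetical per-node settings; stands in for QuantizationConfigs."""
    output_int32: bool = False


def apply_configs(
    node_names: Iterable[str],
    global_config: NodeQuantConfig,
    custom: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, NodeQuantConfig]:
    # Every node starts from the global configuration; fields listed under
    # custom[node_name] replace the corresponding fields for that node only.
    custom = custom or {}
    return {
        name: replace(global_config, **custom.get(name, {}))
        for name in node_names
    }


configs = apply_configs(
    ["MLA_1/conv2d_add_84", "other_node"],  # "other_node" is a made-up name
    NodeQuantConfig(),
    {"MLA_1/conv2d_add_84": {"output_int32": True}},
)
```

Here only the node named in the override dictionary ends up with output_int32 set to True; every other node keeps the global value.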

afe.driver.passes.calibration(calibration_config: afe.core.configs.CalibrationConfigs) → Callable[[afe.ir.net.AwesomeNet, Iterable[InputDict]], afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]]

Create a compiler pass to calibrate a network.

Parameters:

calibration_config – Configuration for calibration.

Returns:

A compiler pass to calibrate a network. The pass mutates and returns its input.

afe.driver.passes.quantization(input_dataset: Iterable[InputDict] | None) → Callable[[afe.ir.net.AwesomeNet], afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]]

Create a compiler pass to quantize a network. Quantization configuration is set by the UpdateQuantizationConfigs transform and the calibration pass.

Returns:

A compiler pass to quantize a network. The pass mutates and returns its input.

afe.driver.passes.equalization(calibration_config: afe.core.configs.CalibrationConfigs, quantization_config: afe.core.configs.QuantizationConfigs) → Callable[[afe.ir.net.AwesomeNet, Iterable[InputDict]], afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]]

Run SmoothQuant and/or channel equalization if they are enabled in config. This pass should run before quantization.

afe.driver.passes.calibration_quantization(config: afe.core.configs.OptimizationConfigs, *, custom_quantization_configs: Dict[afe.ir.net.NodeName, Dict[str, Any]] | None = None) → Callable[[afe.ir.net.AwesomeNet, Iterable[InputDict]], afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]]

Run calibration and quantization. Quantization-related optimizations that can run at the same time are included here.

Parameters:

config – Parameters for calibration and quantization.
custom_quantization_configs – Dictionary to override quantization settings for specific nodes. This parameter may only be used in tests.

Returns:

A compiler pass to calibrate and quantize a network. The pass mutates and returns its input.

afe.driver.passes.evaluation(criterion: afe.driver.statistic.Statistic[Tuple[List[numpy.ndarray], afe.apis.compilation_job_base.GroundTruth], _A], *, fast_mode: bool = False) → Callable[[afe.ir.net.AwesomeNet, Iterable[Tuple[afe.apis.defines.InputValues, afe.apis.compilation_job_base.GroundTruth]]], afe.driver.compile_step.CompileStep[_A]]

Execute a model on an input set and compute an aggregate result from the model’s outputs.

The primary use case of this function is to estimate a model’s accuracy. In this case, criterion computes an accuracy metric for the model over the data set.

Parameters:

criterion – Function to compute on the model’s output and auxiliary data.
fast_mode – Whether to execute in fast mode.

Returns:

A compiler pass that takes an AwesomeNet and data source, runs the model, and returns the function’s result.
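
To illustrate what an accuracy criterion computes, the sketch below aggregates top-1 accuracy over pairs of model output and ground truth. It does not use the Statistic interface, which is not documented in this section: `top1_accuracy`, `evaluate`, and the toy model are hypothetical, and plain Python lists stand in for numpy arrays.

```python
from typing import Callable, Iterable, Sequence, Tuple

Scores = Sequence[float]


def top1_accuracy(results: Iterable[Tuple[Scores, int]]) -> float:
    """Fraction of samples whose highest score matches the ground-truth label."""
    correct = 0
    total = 0
    for scores, label in results:
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
        total += 1
    return correct / total if total else 0.0


def evaluate(
    model: Callable[[Scores], Scores],
    dataset: Iterable[Tuple[Scores, int]],
    criterion: Callable[[Iterable[Tuple[Scores, int]]], float],
) -> float:
    # Run the model on each input and hand (output, ground truth) pairs
    # to the criterion, which reduces them to one aggregate value.
    return criterion((model(x), truth) for x, truth in dataset)


def identity_model(x: Scores) -> Scores:
    # Toy model (hypothetical): returns its input as class scores.
    return x


dataset = [([0.1, 0.9], 1), ([0.8, 0.2], 0), ([0.3, 0.7], 0)]
accuracy = evaluate(identity_model, dataset, top1_accuracy)
```

Two of the three toy samples are classified correctly, so the aggregate is two thirds.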

class afe.driver.passes.BinarySearchState

State of binary search.

Parameters:

lo – Low bound of search range. This is the highest index that was found to not satisfy the search condition.
hi – High bound of search range. This is the lowest index that was found to satisfy the search condition.
hi_value – Value associated with high bound. This will be returned if the high value is selected when the search finishes.
iteration – Iteration of search, starting from zero. Used for deciding to stop early.

lo: int

hi: int

hi_value: _A

iteration: int

afe.driver.passes.binary_search(get_result: Callable[[int], afe.driver.compile_step.CompileStep[tuple[bool, _A]]], lo: int, hi: int, limit: int, *, procedure_name: str | None = None) → afe.driver.compile_step.CompileStep[_A | None]

Do a binary search for the smallest integer n in the range [lo, hi] such that get_result(n) returns True in its bool result. Return the second result of get_result for the best n that was found, or None if every call returned False. The search always tests lo and hi, so if one of these returns True then the search will find a satisfactory n.

The search assumes that get_result is monotonic, that is, there’s some n such that get_result(i) returns False for all i <= n and returns True for all i > n. If it is not monotonic, it may not find the optimal n.

Parameters:

get_result – How to evaluate the search at a given value of n. When it runs, it returns a success flag and caller-specific data.
lo – Lowest value of n to evaluate
hi – Highest value of n to evaluate
limit – Maximum number of binary search steps to perform
procedure_name – Name of the procedure to be printed in progress messages to the console. If None, do not print progress messages.

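Returns:

A compiler step that returns the second result of get_result for the best n that was found, or None if every call returned False.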
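
The search described for binary_search can be sketched without CompileStep as an ordinary function. This is a simplified illustration under stated assumptions: `binary_search_sketch` is a hypothetical name, get_result is a plain function rather than one returning a compile step, and the tests of lo and hi are assumed not to count toward limit.

```python
from typing import Callable, Optional, Tuple, TypeVar

A = TypeVar("A")


def binary_search_sketch(
    get_result: Callable[[int], Tuple[bool, A]],
    lo: int,
    hi: int,
    limit: int,
) -> Optional[A]:
    # Always test both ends of the range first.
    ok, value = get_result(lo)
    if ok:
        return value          # lo already satisfies the condition
    if hi == lo:
        return None
    ok, value = get_result(hi)
    if not ok:
        return None           # every call returned False

    # Invariant: lo is known to fail, hi is known to succeed with hi_value.
    hi_value = value
    iteration = 0
    while hi - lo > 1 and iteration < limit:
        mid = (lo + hi) // 2
        ok, value = get_result(mid)
        if ok:
            hi, hi_value = mid, value
        else:
            lo = mid
        iteration += 1
    return hi_value


def at_least_seven(n: int) -> Tuple[bool, str]:
    # Toy monotonic condition (hypothetical): succeeds from n = 7 upward.
    return n >= 7, f"n={n}"


best = binary_search_sketch(at_least_seven, 0, 20, limit=20)
early = binary_search_sketch(at_least_seven, 0, 20, limit=1)
```

With a generous limit the sketch converges on the smallest satisfying n. With limit=1 it stops after one midpoint test and returns the value for the lowest satisfying n seen so far, which is still valid but not minimal.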

afe.driver.passes.noise_analysis() → Callable[[afe.ir.net.AwesomeNet, afe.ir.net.AwesomeNet, Iterable[InputDict]], afe.driver.compile_step.CompileStep[Dict[afe.core.graph_analyzer.utils.Metric, afe.core.graph_analyzer.graph_analyzer.AnalyzedResultDict]]]

Analyze noise that is introduced by quantization.

Returns:

A compiler pass to analyze noise. The pass takes as parameters an un-quantized net, a quantized net derived from it, and the evaluation input. It executes both nets on the evaluation inputs and compares the values at each layer to estimate quantization noise. It returns the analysis results.
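
A per-layer comparison of this kind might look like the sketch below. The metrics actually available in afe.core.graph_analyzer.utils.Metric are not listed in this section, so mean squared error is used here purely as an assumed example; `layer_noise` and the layer names are hypothetical, and lists of floats stand in for tensors.

```python
from typing import Dict, List


def layer_noise(
    float_acts: Dict[str, List[float]],
    quant_acts: Dict[str, List[float]],
) -> Dict[str, float]:
    """Mean squared error between float and quantized values, per layer."""
    noise = {}
    for name, reference in float_acts.items():
        approx = quant_acts[name]
        squared = [(r - a) ** 2 for r, a in zip(reference, approx)]
        noise[name] = sum(squared) / len(squared)
    return noise


# Toy activations (hypothetical): "conv1" is reproduced exactly,
# while "fc" loses all of its signal after quantization.
float_acts = {"conv1": [1.0, 2.0], "fc": [0.5, -0.5]}
quant_acts = {"conv1": [1.0, 2.0], "fc": [0.0, 0.0]}
noise = layer_noise(float_acts, quant_acts)
```

Layers with larger values are the ones where quantization distorts the activations most.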

afe.driver.passes.noise_based_mixed_precision_quantization(config: afe.core.configs.OptimizationConfigs, criterion: afe.driver.statistic.Statistic[Tuple[List[numpy.ndarray], afe.apis.compilation_job_base.GroundTruth], float], *, target_accuracy: float, custom_quantization_configs: Dict[afe.ir.net.NodeName, Dict[str, Any]] | None = None, max_iterations: int = MIXED_PRECISION_SEARCH_LIMIT, fast_mode: bool = True) → Callable[[afe.ir.net.AwesomeNet, Iterable[InputDict], Iterable[InputDict], Iterable[tuple[afe.apis.defines.InputValues, afe.apis.compilation_job_base.GroundTruth]]], afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]]

Do mixed-precision quantization using noise analysis to choose precision.

It first quantizes the model with int8 precision and measures its quantization noise on the analysis dataset. It then searches for the smallest number of nodes to quantize with int16 precision such that the model achieves the target accuracy. It raises an exception if it cannot achieve the target accuracy.

The model achieves the target accuracy if evaluating it using ‘criterion’ returns a number that is at least ‘target_accuracy’.

Parameters:

config – Parameters for calibration and quantization. The quantization precision must be int8.
criterion – Method of evaluating accuracy on a data set.
target_accuracy – Desired accuracy of network.
custom_quantization_configs – Dictionary to override quantization settings for specific nodes. This parameter may only be used in tests.
max_iterations – Maximum number of binary search steps to perform.
fast_mode – Whether to use fast mode when executing the network.

Returns:

A compiler pass that takes a floating-point model, a calibration dataset, an analysis dataset, and an evaluation dataset, and returns the model quantized with mixed precision.
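
One way to picture the selection step is sketched below. How the pass actually uses the noise results is not spelled out in this section, so this sketch assumes that nodes are promoted to int16 in order of decreasing noise and that accuracy does not decrease as more nodes are promoted. `choose_int16_nodes`, the toy accuracy function, the node names, and the use of ValueError are all hypothetical.

```python
from typing import Callable, Dict, Set


def choose_int16_nodes(
    noise: Dict[str, float],
    accuracy_of: Callable[[Set[str]], float],
    target_accuracy: float,
) -> Set[str]:
    # Assumption: the noisiest nodes are promoted to int16 first.
    ranked = sorted(noise, key=noise.get, reverse=True)

    def accuracy_with(k: int) -> float:
        return accuracy_of(set(ranked[:k]))

    if accuracy_with(0) >= target_accuracy:
        return set()                      # pure int8 is already good enough
    if accuracy_with(len(ranked)) < target_accuracy:
        raise ValueError("target accuracy cannot be achieved")

    # Binary search for the smallest k; lo fails and hi succeeds throughout.
    lo, hi = 0, len(ranked)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if accuracy_with(mid) >= target_accuracy:
            hi = mid
        else:
            lo = mid
    return set(ranked[:hi])


def toy_accuracy(int16_nodes: Set[str]) -> float:
    # Toy model (hypothetical): each int16 node adds ten accuracy points.
    return 70.0 + 10.0 * len(int16_nodes)


noise = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.3}
chosen = choose_int16_nodes(noise, toy_accuracy, target_accuracy=90.0)
```

In this toy setup two promotions are needed, so the two noisiest nodes are selected.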

afe.driver.passes.dump_diagnostic_files(model_config: afe.core.configs.ModelConfigs, opt_config: afe.core.configs.OptimizationConfigs, *, prefix: str = '', suffix: str = '') → Callable[[afe.ir.net.AwesomeNet], afe.driver.compile_step.CompileStep[None]]

Save intermediate compilation results to files for diagnostic purposes.

The following files are saved in model_config.output_directory. The model is saved to {prefix}{model_config.name}{suffix}.yaml and {prefix}{model_config.name}{suffix}.npz. The configuration parameters are saved to {model_config.model_name}.yaml.

Parameters:

model_config – Model testing configuration
opt_config – Optimization parameters
prefix – Prefix to attach to filenames
suffix – Suffix to attach to filenames

Returns:

A compiler pass that dumps diagnostic files
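
The naming scheme for the model files can be sketched as a small helper. `diagnostic_paths` is a hypothetical function, the directory and model name in the usage are made up, and the `.npz` extension is assumed to be written with a leading dot.

```python
from pathlib import PurePosixPath
from typing import List


def diagnostic_paths(
    output_directory: str,
    name: str,
    prefix: str = "",
    suffix: str = "",
) -> List[str]:
    # Both model files share the stem {prefix}{name}{suffix}.
    base = PurePosixPath(output_directory)
    stem = f"{prefix}{name}{suffix}"
    return [str(base / f"{stem}.yaml"), str(base / f"{stem}.npz")]


paths = diagnostic_paths("out", "resnet", prefix="pre_", suffix="_quant")
```

A distinct prefix or suffix per pipeline stage keeps dumps from different stages from overwriting each other.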

afe.driver.passes.dump_diagnostic_files_after(step: afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet], model_config: afe.core.configs.ModelConfigs, opt_config: afe.core.configs.OptimizationConfigs, *, condition: bool = True, prefix: str = '', suffix: str = '') → afe.driver.compile_step.CompileStep[afe.ir.net.AwesomeNet]

Run a compiler pass, then save intermediate compilation results to files for diagnostic purposes. See dump_diagnostic_files for details.

Parameters:

step – Compilation step to run before dumping results
model_config – Model testing configuration
opt_config – Optimization parameters
prefix – Prefix to attach to filenames
suffix – Suffix to attach to filenames
condition – Whether to dump files. If False, only the compilation step runs.

Returns:

Compilation step extended to dump files at the end