afe.core.mixed_precision.interface

Functions

mixed_precision_analysis(→ bool)

Implements the mixed-precision quantization algorithm.

Module Contents

afe.core.mixed_precision.interface.mixed_precision_analysis(fx_mod: torch.nn.Module, calibration_data: Iterable[afe.apis.defines.InputValues], accuracy_metric: Callable, target_accuracy: float, annotated_onnx_filename: str) → bool

Implements the mixed-precision quantization algorithm. TODO: Implementation details.

Parameters:

fx_mod – Torch representation of the model.
calibration_data – Data used in calibration.
accuracy_metric – One-parameter function used to produce the accuracy metric.
target_accuracy – Accuracy value used to determine when the mixed-precision search algorithm terminates.
annotated_onnx_filename – File path to which the annotated ONNX model is to be written.

Returns:

bool. If the target accuracy is less than the 16-bit accuracy, saves an ONNX model containing precision annotations to the annotated_onnx_filename path and returns True; otherwise returns False.
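
The return contract above can be handled with a small wrapper. The following is a minimal sketch, not part of the afe API: run_mixed_precision is a hypothetical helper, and the analysis function is passed in as an argument (it is expected to be afe.core.mixed_precision.interface.mixed_precision_analysis) so the sketch does not depend on afe being importable. Arguments are forwarded positionally in the order given by the documented signature.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, Optional


def run_mixed_precision(
    analysis_fn: Callable[..., bool],
    fx_mod: Any,
    calibration_data: Iterable[Any],
    accuracy_metric: Callable[[Any], float],
    target_accuracy: float,
    annotated_onnx_filename: str,
) -> Optional[Path]:
    """Hypothetical helper: run the analysis and return the ONNX path, or None.

    analysis_fn is assumed to follow the documented signature of
    mixed_precision_analysis and to return a bool.
    """
    saved = analysis_fn(
        fx_mod,
        calibration_data,
        accuracy_metric,
        target_accuracy,
        annotated_onnx_filename,
    )
    # Per the documented contract, the annotated model is written only
    # when the function returns True.
    return Path(annotated_onnx_filename) if saved else None
```

A caller would treat a None result as "no annotated model was written" and avoid reading the output path in that case.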