afe.core.mixed_precision.interface

Functions

mixed_precision_analysis(→ bool)

Implements the mixed-precision quantization algorithm.

Module Contents

afe.core.mixed_precision.interface.mixed_precision_analysis(fx_mod: torch.nn.Module, calibration_data: Iterable[afe.apis.defines.InputValues], accuracy_metric: Callable, target_accuracy: float, annotated_onnx_filename: str) → bool

Implements the mixed-precision quantization algorithm. TODO: Implementation details.

Parameters:
  • fx_mod – Torch representation of the model.

  • calibration_data – Data used in calibration.

  • accuracy_metric – One-parameter function used to produce the accuracy metric.

  • target_accuracy – Accuracy value that determines when the mixed-precision search ends.

  • annotated_onnx_filename – File path to which the annotated ONNX model is to be written.

Returns:

bool. True if the target accuracy is less than the 16-bit accuracy, in which case an ONNX model containing precision annotations is saved to the annotated_onnx_filename path; otherwise False.
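
The return value is the only signal that an annotated ONNX model was produced, so a caller will usually want to check it before reading the file. The following is a minimal sketch of that pattern, not part of the afe API: run_mixed_precision is a hypothetical helper, and it takes the analysis function as an argument so the sketch runs without afe or torch installed. It assumes, as the return description suggests, that the file is written only when True is returned, and that the documented parameter names can be passed as keywords.

```python
from typing import Any, Callable, Iterable, Optional


def run_mixed_precision(
    analysis_fn: Callable[..., bool],
    fx_mod: Any,
    calibration_data: Iterable[Any],
    accuracy_metric: Callable,
    target_accuracy: float,
    annotated_onnx_filename: str,
) -> Optional[str]:
    """Hypothetical wrapper: return the ONNX path only if a model was saved.

    analysis_fn is expected to follow the documented signature of
    mixed_precision_analysis; the argument types here are left as Any
    so the sketch does not depend on torch or afe.
    """
    wrote_model = analysis_fn(
        fx_mod=fx_mod,
        calibration_data=calibration_data,
        accuracy_metric=accuracy_metric,
        target_accuracy=target_accuracy,
        annotated_onnx_filename=annotated_onnx_filename,
    )
    # Assumption: False means no annotated model was written,
    # so the path should not be used by the caller.
    return annotated_onnx_filename if wrote_model else None
```

In real use, analysis_fn would be afe.core.mixed_precision.interface.mixed_precision_analysis, fx_mod a torch.nn.Module, and calibration_data an iterable of afe.apis.defines.InputValues. A None result tells the caller not to look for the annotated file.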