afe.apis.defines
This file contains definitions of the types exposed by the development API for AFE.
Attributes

- default_quantization – Default quantization parameters for model quantization.
- int16_quantization – Int16 quantization parameters for model quantization.
Classes

- ExceptionFuncType – Generic enumeration.
- ColorSpaceStandard – Color space standards for YUV and RGB conversion.
- Color conversion direction.
- Chroma sub-sampling representation.
- Interpolation method used in resize transform.
- ResizeDepositLocation – Deposit location of resized image in padded frame.
- CalibrationMethod – Represents a calibration method for model quantization.
- MinMaxMethod – Represents a calibration method for model quantization.
- HistogramMSEMethod – Represents the histogram MSE calibration method for quantization.
- MovingAverageMinMaxMethod – Represents the moving average min-max calibration method for quantization.
- HistogramEntropyMethod – Represents the histogram entropy calibration method for quantization.
- HistogramPercentileMethod – Represents the histogram percentile calibration method for quantization.
- QuantizationScheme – Quantization scheme.
- QuantizationParams – Parameters controlling how to quantize a network.
Functions

- default_calibration()
- quantization_scheme() – Constructs a quantization scheme, which determines the range of quantizations that a quantization algorithm may choose from.
- bfloat16_scheme() – Constructs a bfloat16 quantization scheme.
Module Contents
- class afe.apis.defines.ExceptionFuncType[source]
Generic enumeration.
Derive from this class to define new enumerations.
- class afe.apis.defines.ColorSpaceStandard[source]
Color space standards for YUV and RGB conversion. BT601 is for SD video; BT709 is for HD video; BT2020 is for HDR.
- afe.apis.defines.BT_COLOR_COEFF: Dict[ColorSpaceStandard, List[float]][source]
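BT_COLOR_COEFF maps each ColorSpaceStandard to its conversion coefficients. As an illustration of what such coefficients encode, a minimal stdlib sketch using the well-known BT.601 luma weights (the actual contents of AFE's table are not reproduced here):

```python
# Standard BT.601 luma coefficients, shown for illustration only;
# AFE's BT_COLOR_COEFF table stores the per-standard coefficient lists.
BT601_LUMA = (0.299, 0.587, 0.114)

def luma(r: float, g: float, b: float) -> float:
    """Compute the Y (luma) component from RGB using BT.601 weights."""
    kr, kg, kb = BT601_LUMA
    return kr * r + kg * g + kb * b

print(round(luma(255, 255, 255)))  # pure white maps to full-scale luma
```

BT.709 and BT.2020 use the same formula shape with different weights, which is why a per-standard coefficient table suffices.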
- class afe.apis.defines.ResizeDepositLocation[source]
Deposit location of resized image in padded frame.
- class afe.apis.defines.CalibrationMethod[source]
Represents a calibration method for model quantization.
The CalibrationMethod class defines a base structure for various calibration techniques used during the quantization process. Each method is identified by a unique name and can be instantiated using the from_str method.
- static from_str(method: str)[source]
Creates a calibration method based on the provided method name.
- Supported methods:
- MIN_MAX / min_max: Uses the minimum and maximum values of the dataset to determine the quantization range.
- MSE / mse: Uses a histogram-based method that minimizes the mean squared error (MSE) between the original and quantized values. This method uses 2048 histogram bins for precise calibration.
- MOVING_AVERAGE / moving_average: Computes the quantization range by maintaining a moving average of the observed min and max values during calibration.
- HISTOGRAM_ENTROPY / entropy: Employs an entropy-based approach to find the optimal threshold for quantization by minimizing information loss. It uses 512 histogram bins.
- HISTOGRAM_PERCENTILE / percentile: Sets the quantization range based on a specified percentile of the distribution (defaulting to the 99.9th percentile). This method helps to ignore outliers and uses 1024 histogram bins.
- Parameters:
method (str) – The name of the calibration method to use.
- Returns:
The corresponding calibration method configured with default parameters.
- Return type:
CalibrationMethod
- Raises:
UserFacingException – If an unsupported calibration method is specified.
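To make the difference between these strategies concrete, here is a small self-contained sketch (plain Python, not AFE code) contrasting the MIN_MAX rule with a simplified percentile rule. The real HISTOGRAM_PERCENTILE method works on a running histogram rather than raw sorted values:

```python
def min_max_range(values):
    # MIN_MAX: take the observed extremes directly.
    return min(values), max(values)

def percentile_range(values, percentile=99.9):
    # Simplified percentile rule: clip both tails so that roughly the
    # central `percentile` mass defines the range, ignoring outliers.
    ordered = sorted(values)
    hi = int(len(ordered) * percentile / 100.0) - 1
    lo = len(ordered) - 1 - hi
    return ordered[max(lo, 0)], ordered[min(hi, len(ordered) - 1)]

values = list(range(1000)) + [100_000]   # one extreme outlier
print(min_max_range(values))             # range stretched by the outlier
print(percentile_range(values))          # outlier excluded from the range
```

The outlier dominates the MIN_MAX range (and hence the quantization scale), while the percentile rule discards it, which is the motivation stated above for the percentile method.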
- class afe.apis.defines.MinMaxMethod[source]
Represents a calibration method for model quantization.
The CalibrationMethod class defines a base structure for various calibration techniques used during the quantization process. Each method is identified by a unique name and can be instantiated using the from_str method.
- class afe.apis.defines.HistogramMSEMethod(num_bins)[source]
Represents the histogram MSE calibration method for quantization.
The HistogramMSEMethod records the running histogram of tensor values during calibration. It searches for the optimal min and max values based on the histogram distribution to minimize the mean squared error (MSE) between the quantized model and the floating-point model. These optimal values are then used to compute the quantization parameters in the same way as the MIN_MAX method.
By default, the number of bins used for histogram calculation is 2048 when instantiated via the from_str() method. To customize the number of bins, instantiate HistogramMSEMethod(num_bins) directly.
- Parameters:
num_bins (int) – The number of bins used for histogram calculation. Default is 2048.
- Returns:
The configured histogram MSE calibration method.
- Return type:
HistogramMSEMethod
- class afe.apis.defines.MovingAverageMinMaxMethod[source]
Represents the moving average min-max calibration method for quantization.
The MovingAverageMinMaxMethod computes the quantization range by maintaining a moving average of the min and max values observed during calibration. These values are then used to compute the quantization parameters in the same way as the MIN_MAX method.
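The moving-average idea can be sketched in a few lines of plain Python (the momentum constant below is illustrative, not AFE's actual value):

```python
def moving_average_min_max(batches, momentum=0.9):
    # Keep exponential moving averages of the per-batch min and max.
    # `momentum` is an illustrative smoothing constant, not AFE's default.
    mn = mx = None
    for batch in batches:
        b_min, b_max = min(batch), max(batch)
        if mn is None:
            mn, mx = b_min, b_max
        else:
            mn = momentum * mn + (1.0 - momentum) * b_min
            mx = momentum * mx + (1.0 - momentum) * b_max
    return mn, mx

batches = [[-1.0, 2.0], [-1.2, 1.8], [-0.9, 2.1]]
print(moving_average_min_max(batches))
```

Averaging smooths out batch-to-batch noise in the extremes, so a single unusual calibration batch does not dominate the final range the way it would under plain MIN_MAX.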
- class afe.apis.defines.HistogramEntropyMethod(num_bins)[source]
Represents the histogram entropy calibration method for quantization.
The HistogramEntropyMethod records the running histogram of tensor values during calibration. It searches for the optimal min and max values based on the distribution of the histogram to minimize the Kullback-Leibler (KL) divergence (relative entropy) between the quantized model and the floating-point model. These optimal values are then used to compute the quantization parameters in the same way as the MIN_MAX method.
By default, the number of bins used for histogram calculation is 512 when instantiated via the from_str() method. To customize the number of bins, instantiate HistogramEntropyMethod(num_bins) directly.
- class afe.apis.defines.HistogramPercentileMethod(percentile_value, num_bins)[source]
Represents the histogram percentile calibration method for quantization.
The HistogramPercentileMethod records the running histogram of tensor values during calibration. It determines the optimal min and max values based on the specified percentile of the histogram distribution. These values are then used to compute the quantization parameters in the same way as the MIN_MAX method.
When instantiated via the from_str method, the defaults are percentile_value = 99.9 and num_bins = 1024. To use custom values for the percentile or the number of bins, instantiate this class directly.
- Parameters:
percentile_value (float) – The percentile used to determine the quantization range. Defaults to 99.9.
num_bins (int) – The number of bins used for histogram calculation. Defaults to 1024.
- afe.apis.defines.default_calibration() → CalibrationMethod[source]
- class afe.apis.defines.QuantizationScheme[source]
Quantization scheme.
- Parameters:
asymmetric – Whether to use asymmetric quantization.
per_channel – Whether to use per-channel quantization.
bits – Number of bits of precision to use in the quantized representation.
bf16 – Whether to use bfloat16. If True, then asymmetric, per_channel, and bits are ignored.
- afe.apis.defines.quantization_scheme(asymmetric: bool, per_channel: bool, bits: int = 8) → QuantizationScheme[source]
Constructs a quantization scheme, which determines the range of quantizations that a quantization algorithm may choose from.
- Parameters:
asymmetric (bool) – Required. Specifies whether to use asymmetric (versus symmetric) quantization.
per_channel (bool) – Required. Specifies whether to use per-channel (versus per-tensor) quantization.
bits (int, optional) – The number of bits of precision to use for the quantized representation of activations. Must be either 8 (for int8) or 16 (for int16). Defaults to 8. The quantization of weights is fixed as int8.
- Returns:
The quantization scheme configured with the specified parameters.
- Return type:
QuantizationScheme
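The practical effect of the asymmetric flag and the bits parameter is on how a scale and zero point are derived from a calibrated range. Below is a hedged sketch of the standard integer-quantization arithmetic; AFE's internal implementation may differ in rounding and edge-case handling:

```python
def scale_and_zero_point(rmin, rmax, bits=8, asymmetric=True):
    # Standard integer quantization math, shown for illustration only.
    qmin = -(2 ** (bits - 1))      # e.g. -128 for int8
    qmax = 2 ** (bits - 1) - 1     # e.g.  127 for int8
    if asymmetric:
        # Asymmetric: map [rmin, rmax] onto the full integer range.
        scale = (rmax - rmin) / (qmax - qmin)
        zero_point = round(qmin - rmin / scale)
    else:
        # Symmetric: range centered on zero, zero point fixed at 0.
        scale = max(abs(rmin), abs(rmax)) / qmax
        zero_point = 0
    return scale, zero_point

print(scale_and_zero_point(0.0, 2.55))                      # asymmetric int8
print(scale_and_zero_point(-1.27, 1.27, asymmetric=False))  # symmetric int8
```

Asymmetric quantization wastes no integer codes on unused range (useful for one-sided activations such as ReLU outputs), while symmetric quantization keeps the zero point at 0, which simplifies the arithmetic for weights.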
- afe.apis.defines.bfloat16_scheme() → QuantizationScheme[source]
Constructs a bfloat16 quantization scheme. It directs the compiler to use bfloat16 instead of integer quantization.
- class afe.apis.defines.QuantizationParams[source]
Parameters controlling how to quantize a network.
- Parameters:
calibration_method – Calibration method.
activation_quantization_scheme – Quantization scheme for activation tensors.
weight_quantization_scheme – Quantization scheme for weight tensors.
requantization_mode – A way of doing quantized arithmetic.
node_names – Nodes to prevent from quantizing.
custom_quantization_configs – Dictionary setting a node's custom quantization options.
biascorr_type – Selection of bias correction: regular, iterative, or none.
channel_equalization – If True, channel equalization is enabled.
smooth_quant – If True, smooth quant is enabled.
- calibration_method: CalibrationMethod[source]
- activation_quantization_scheme: QuantizationScheme[source]
- weight_quantization_scheme: QuantizationScheme[source]
- requantization_mode: afe.ir.defines.RequantizationMode[source]
- biascorr_type: afe.ir.defines.BiasCorrectionType[source]
- with_calibration(method: CalibrationMethod) → QuantizationParams[source]
Sets the calibration method for activation tensors.
This method configures the calibration approach for activation tensors during quantization. An observer is inserted at each layer to collect statistics of the output tensors. The derived min and max values from these statistics are used to compute the quantization parameters (scale and zero point) for each layer.
- Parameters:
method (CalibrationMethod) – Required. The calibration method to use. Supported methods include various approaches to determine the optimal quantization range based on tensor statistics.
- Returns:
A new instance of quantization parameters with the updated calibration method.
- Return type:
QuantizationParams
- with_activation_quantization(scheme: QuantizationScheme) → QuantizationParams[source]
Sets the quantization scheme for activation tensors.
For activations, per-channel quantization is not supported. With per-tensor quantization, the asymmetric flag in the scheme can be set to either True or False to define the quantization behavior.
- Parameters:
scheme (QuantizationScheme) – Required. The quantization scheme to apply to the model activations.
- Returns:
A new instance of quantization parameters with the updated activation quantization scheme.
- Return type:
QuantizationParams
- with_weight_quantization(scheme: QuantizationScheme) → QuantizationParams[source]
Sets the quantization scheme for weight tensors.
For weights, asymmetric quantization is not supported. With symmetric quantization, the per_channel flag can be set to True or False to define the quantization behavior for weights.
- Parameters:
scheme (QuantizationScheme) – Required. The quantization scheme to apply to the model weights.
- Returns:
A new instance of quantization parameters using the chosen weight quantization scheme.
- Return type:
QuantizationParams
- with_requantization_mode(requantization_mode: afe.ir.defines.RequantizationMode)[source]
Sets the requantization mode for convolutions.
Two requantization modes are supported:
- RequantizationMode.sima: Uses arithmetic optimized for fast performance on SiMa's accelerator. This is the default mode.
- RequantizationMode.tflite: Uses TFLite's arithmetic with an 8-bit constant multiplier.
- Parameters:
requantization_mode (RequantizationMode) – Required. The requantization mode to apply.
- Returns:
A new instance of quantization parameters with the updated requantization mode.
- Return type:
QuantizationParams
- with_unquantized_nodes(node_names: Set[str]) → QuantizationParams[source]
Selects nodes to prevent from quantizing.
Nodes with the specified names will be excluded from the quantization process. This replaces the set of node names selected by any previous call to with_unquantized_nodes. Note that node names can be sensitive to changes in optimization or quantization settings, as some nodes may be created or renamed by the compiler.
- Parameters:
node_names (Set[str]) – Required. A set of strings specifying the names of nodes that should not be quantized.
- Returns:
A new instance of quantization parameters with the updated unquantized node configuration.
- Return type:
QuantizationParams
- with_custom_quantization_configs(custom_quantization_configs: Dict[afe.ir.defines.NodeName, Dict[str, Any]])[source]
Sets custom quantization options for specific nodes.
The custom_quantization_configs argument is a dictionary where each key is a node name and the corresponding value is a dictionary defining custom quantization options. This method is typically used to:
- Enable the int32 output of the last convolution node.
- Enable mixed-precision quantization.
Note: Users must obtain node names from the SiMa IR graph. To do this, perform int8 quantization of a model and inspect the .sima.json file using Netron to identify node names.
- Parameters:
custom_quantization_configs (Dict[NodeName, Dict[str, Any]]) – A dictionary where each key is a node name and the value is a dictionary of custom quantization settings for that node.
- Returns:
A new instance of quantization parameters with the custom quantization configuration applied.
- Return type:
QuantizationParams
- with_bias_correction(enable: bool | afe.ir.defines.BiasCorrectionType = True)[source]
Enables or disables bias correction for the quantization of convolutions with a bias.
Bias correction calculates a bias term based on the observed input mean and the quantized weights. This term is then added to the convolution output to compensate for quantization errors. The algorithm is described in detail in Section 4.2 of the referenced paper.
- Parameters:
enable (bool | BiasCorrectionType) – Required. Determines whether bias correction is enabled:
- True: Enables regular bias correction.
- False: Disables bias correction.
- BiasCorrectionType: Allows specifying a custom bias correction type.
- Returns:
A new instance of quantization parameters with bias correction enabled or disabled as specified.
- Return type:
QuantizationParams
- with_channel_equalization(enable: bool = True)[source]
Enables or disables channel equalization for the quantization parameters.
Channel equalization is a preprocessing step that balances the distribution of weight tensors across channels, which can enhance the accuracy of quantized models.
- Parameters:
enable (bool, optional) – Specifies whether to enable channel equalization. Defaults to True.
- Returns:
A new instance of quantization parameters with the updated channel equalization setting.
- Return type:
QuantizationParams
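Each with_* method above returns a new instance rather than mutating the receiver, so settings can be chained. The pattern can be sketched with a stdlib stand-in (fields and defaults below are illustrative, not the real QuantizationParams):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ParamsSketch:
    # Stand-in for QuantizationParams; fields and defaults are illustrative.
    calibration_method: str = "min_max"
    channel_equalization: bool = False

    def with_calibration(self, method: str) -> "ParamsSketch":
        # Return a copy with one field changed; the original is untouched.
        return replace(self, calibration_method=method)

    def with_channel_equalization(self, enable: bool = True) -> "ParamsSketch":
        return replace(self, channel_equalization=enable)

base = ParamsSketch()
tuned = base.with_calibration("mse").with_channel_equalization()
print(base.calibration_method, tuned.calibration_method)  # base is unchanged
```

Because the objects are immutable, a baseline configuration can be shared safely and customized per experiment without one run's settings leaking into another.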
- afe.apis.defines.default_quantization: QuantizationParams[source]
Default quantization parameters for model quantization.
This configuration can be used as a baseline for quantizing a neural network using quantize_net or as a starting point for customizing quantization parameters. It specifies default settings for the calibration method, activation and weight quantization schemes, and the requantization mode.
- afe.apis.defines.calibration_method
The calibration method used for quantization. Defaults to MSE().
- Type:
CalibrationMethod
- afe.apis.defines.activation_quantization_scheme
Defines the quantization scheme for activations: Asymmetric: True, Per-channel: False, Bits: 8.
- Type:
QuantizationScheme
- afe.apis.defines.weight_quantization_scheme
Defines the quantization scheme for weights: Asymmetric: False, Per-channel: True, Bits: 8.
- Type:
QuantizationScheme
- afe.apis.defines.requantization_mode
The mode used for requantization. Defaults to RequantizationMode.sima.
- Type:
RequantizationMode
- afe.apis.defines.node_names
A set of node names to apply custom quantization configurations. Defaults to an empty set ({''}).
- Type:
set
- afe.apis.defines.custom_quantization_configs
Custom configurations for specific nodes. Defaults to None.
- Type:
Optional[Dict]
- Returns:
The default quantization configuration for model quantization.
- Return type:
QuantizationParams
- afe.apis.defines.int16_quantization: QuantizationParams[source]
Int16 quantization parameters for model quantization.
This configuration is designed for quantizing neural networks where activations use 16-bit precision and weights use 8-bit precision. It can be used with quantize_net as a baseline for models requiring higher precision for activations, while maintaining efficient weight quantization.
- afe.apis.defines.calibration_method
The calibration method used for quantization. Defaults to MSE().
- Type:
CalibrationMethod
- afe.apis.defines.activation_quantization_scheme
Defines the quantization scheme for activations: Asymmetric: True, Per-channel: False, Bits: 16.
- Type:
QuantizationScheme
- afe.apis.defines.weight_quantization_scheme
Defines the quantization scheme for weights: Asymmetric: False, Per-channel: True, Bits: 8.
- Type:
QuantizationScheme
- afe.apis.defines.requantization_mode
The mode used for requantization. Defaults to RequantizationMode.sima.
- Type:
RequantizationMode
- afe.apis.defines.node_names
A set of node names to apply custom quantization configurations. Defaults to an empty set ({''}).
- Type:
set
- afe.apis.defines.custom_quantization_configs
Custom configurations for specific nodes. Defaults to None.
- Type:
Optional[Dict]
- Returns:
The quantization configuration using int16 for activations and int8 for weights.
- Return type:
QuantizationParams