afe.apis.defines ================ .. py:module:: afe.apis.defines .. autoapi-nested-parse:: This file contains definitions of the types exposed by the development API for AFE. Attributes ---------- .. autoapisummary:: afe.apis.defines.InputValues afe.apis.defines.gen1_target afe.apis.defines.gen2_target afe.apis.defines.BT_COLOR_COEFF afe.apis.defines.YUV2RGB_FULL_RANGE_CONSTANTS afe.apis.defines.default_quantization afe.apis.defines.int16_quantization Classes ------- .. autoapisummary:: afe.apis.defines.ExceptionFuncType afe.apis.defines.ColorSpaceStandard afe.apis.defines.ColorConversion afe.apis.defines.ChromaSampling afe.apis.defines.ResizeMethod afe.apis.defines.ResizeDepositLocation afe.apis.defines.CalibrationMethod afe.apis.defines.MinMaxMethod afe.apis.defines.HistogramMSEMethod afe.apis.defines.MovingAverageMinMaxMethod afe.apis.defines.HistogramEntropyMethod afe.apis.defines.HistogramPercentileMethod afe.apis.defines.QuantizationScheme afe.apis.defines.QuantizationParams Functions --------- .. autoapisummary:: afe.apis.defines.default_calibration afe.apis.defines.quantization_scheme afe.apis.defines.bfloat16_scheme Module Contents --------------- .. py:data:: InputValues .. py:data:: gen1_target .. py:data:: gen2_target .. py:class:: ExceptionFuncType Generic enumeration. Derive from this class to define new enumerations. .. py:attribute:: LOADED_NET_LOAD .. py:attribute:: LOADED_NET_EXECUTE .. py:attribute:: LOADED_NET_QUANTIZE .. py:attribute:: LOADED_NET_CONVERT .. py:attribute:: MODEL_EXECUTE .. py:attribute:: MODEL_SAVE .. py:attribute:: MODEL_LOAD .. py:attribute:: MODEL_COMPILE .. py:attribute:: MODEL_CREATE_AUXILIARY .. py:attribute:: MODEL_COMPOSE .. py:attribute:: MODEL_EVALUATE .. py:attribute:: MODEL_PERFORMANCE .. py:attribute:: GENERATE_ELF_FILES .. py:attribute:: QUANTIZATION_ERROR_ANALYSIS .. py:class:: ColorSpaceStandard Color space standards for YUV and RGB conversion. BT601 is for SD video; BT709 is for HD video; BT2020 is for HDR. .. py:attribute:: BT601 :value: 'BT601' .. py:attribute:: BT709 :value: 'BT709' .. py:attribute:: BT2020 :value: 'BT2020' .. py:data:: BT_COLOR_COEFF :type: Dict[ColorSpaceStandard, List[float]] .. py:data:: YUV2RGB_FULL_RANGE_CONSTANTS :type: Dict[str, List[float]] .. py:class:: ColorConversion Color conversion direction. .. py:attribute:: YUV2RGB :value: 'YUV2RGB' .. py:attribute:: RGB2YUV :value: 'RGB2YUV' .. py:attribute:: BGR2RGB :value: 'BGR2RGB' .. py:attribute:: RGB2BGR :value: 'RGB2BGR' .. py:class:: ChromaSampling Chroma sub-sampling representation. .. py:attribute:: NV12 :value: 'NV12' .. py:attribute:: YUV420 :value: 'YUV420' .. py:attribute:: YUV422 :value: 'YUV422' .. py:class:: ResizeMethod Interpolation method used in resize transform. .. py:attribute:: LINEAR :value: 'linear' .. py:attribute:: NEAREST :value: 'nearest' .. py:attribute:: AREA :value: 'area' .. py:attribute:: CUBIC :value: 'cubic' .. py:class:: ResizeDepositLocation Deposit location of resized image in padded frame. .. py:attribute:: TOPLEFT :value: 'topleft' .. py:attribute:: CENTER :value: 'center' .. py:attribute:: BOTTOMRIGHT :value: 'bottomright' .. py:class:: CalibrationMethod Represents a calibration method for model quantization. The ``CalibrationMethod`` class defines a base structure for various calibration techniques used during the quantization process. Each method is identified by a unique name and can be instantiated using the `from_str` method. .. py:property:: name .. 
py:method:: from_str(method: str) :staticmethod: Creates a calibration method based on the provided method name. Supported methods: - ``MIN_MAX`` / ``min_max``: Uses the minimum and maximum values of the dataset to determine the quantization range. - ``MSE`` / ``mse``: Utilizes a histogram-based method that minimizes the mean squared error (MSE) between the original and quantized values. This method uses 2048 histogram bins for precise calibration. - ``MOVING_AVERAGE`` / ``moving_average``: Computes the quantization range by maintaining a moving average of the observed min and max values during calibration. - ``HISTOGRAM_ENTROPY`` / ``entropy``: Employs an entropy-based approach to find the optimal threshold for quantization by minimizing information loss. It uses 512 histogram bins. - ``HISTOGRAM_PERCENTILE`` / ``percentile``: Sets the quantization range based on a specified percentile of the distribution (defaulting to the 99.9th percentile). This method helps to ignore outliers and uses 1024 histogram bins. :param method: The name of the calibration method to use. :type method: str :returns: The corresponding calibration method configured with default parameters. :rtype: CalibrationMethod :raises UserFacingException: If an unsupported calibration method is specified.
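The example below is a minimal sketch of selecting calibration methods by name. It assumes only that ``afe.apis.defines`` is importable and uses the names listed above; each method is created with its documented default parameters.

.. code-block:: python

    from afe.apis.defines import CalibrationMethod

    # Each supported name maps to a calibration method with its default settings.
    min_max = CalibrationMethod.from_str("min_max")        # observed min/max
    mse = CalibrationMethod.from_str("mse")                # histogram MSE, 2048 bins
    entropy = CalibrationMethod.from_str("entropy")        # KL divergence, 512 bins
    percentile = CalibrationMethod.from_str("percentile")  # 99.9th percentile, 1024 bins

The histogram-based classes documented below can be instantiated directly when a non-default bin count or percentile is required.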
.. py:class:: MinMaxMethod Represents the min-max calibration method for quantization. The ``MinMaxMethod`` uses the minimum and maximum tensor values observed during calibration to determine the quantization range.
.. py:class:: HistogramMSEMethod(num_bins) Represents the histogram MSE calibration method for quantization. The ``HistogramMSEMethod`` records the running histogram of tensor values during calibration. It searches for the optimal ``min`` and ``max`` values based on the histogram distribution to minimize the mean squared error (MSE) between the quantized model and the floating-point model. These optimal values are then used to compute the quantization parameters in the same way as the ``MIN_MAX`` method. By default, the number of bins used for histogram calculation is ``2048`` when instantiated via the ``from_str()`` method. To customize the number of bins, instantiate ``HistogramMSEMethod(num_bins)`` directly. :param num_bins: The number of bins used for histogram calculation. Default is ``2048``. :type num_bins: int :returns: The configured histogram MSE calibration method. :rtype: HistogramMSEMethod .. py:attribute:: num_bins :type: int
.. py:class:: MovingAverageMinMaxMethod Represents the moving average min-max calibration method for quantization. The ``MovingAverageMinMaxMethod`` records the minimum and maximum tensor values observed during calibration and maintains a moving average of those values. The averaged ``min`` and ``max`` are then used to compute the quantization parameters in the same way as the ``MIN_MAX`` method. .. attribute:: _name The internal name of the calibration method, set to ``'moving_average'``. :type: str
.. py:class:: HistogramEntropyMethod(num_bins) Represents the histogram entropy calibration method for quantization. The ``HistogramEntropyMethod`` records the running histogram of tensor values during calibration. It searches for the optimal ``min`` and ``max`` values based on the distribution of the histogram to minimize the Kullback-Leibler (KL) divergence (relative entropy) between the quantized model and the floating-point model. These optimal values are then used to compute the quantization parameters in the same way as the ``MIN_MAX`` method. By default, the number of bins used for histogram calculation is ``512`` when instantiated via the ``from_str()`` method. To customize the number of bins, instantiate ``HistogramEntropyMethod(num_bins)`` directly. .. attribute:: num_bins The number of bins used for histogram calculation. :type: int .. attribute:: _name The internal name of the calibration method, set to ``'entropy'``. :type: str .. py:attribute:: num_bins :type: int
.. py:class:: HistogramPercentileMethod(percentile_value, num_bins) Represents the histogram percentile calibration method for quantization. The ``HistogramPercentileMethod`` records the running histogram of tensor values during calibration. It determines the optimal ``min`` and ``max`` values based on the specified percentile of the histogram distribution. These values are then used to compute the quantization parameters in the same way as the ``MIN_MAX`` method. When instantiated using the ``from_str`` method, the default values are: - ``percentile_value``: 99.9 - ``num_bins``: 1024 To use custom values for the percentile or the number of bins, this class should be instantiated directly. :param percentile_value: The selected percentile for determining the quantization range. Defaults to ``99.9``. :type percentile_value: float :param num_bins: The number of bins used for histogram calculation. Defaults to ``1024``. :type num_bins: int .. py:attribute:: percentile_value :type: float .. py:attribute:: num_bins :type: int
.. py:function:: default_calibration() -> CalibrationMethod
.. py:class:: QuantizationScheme Quantization scheme. :param asymmetric: Whether to use asymmetric quantization. :param per_channel: Whether to use per-channel quantization. :param bits: Number of bits of precision to use in the quantized representation. :param bf16: Whether to use bfloat16. If True, then asymmetric, per_channel, and bits are ignored. .. py:attribute:: asymmetric :type: bool .. py:attribute:: per_channel :type: bool .. py:attribute:: bits :type: int :value: 8 .. py:attribute:: bf16 :type: bool :value: False
.. py:function:: quantization_scheme(asymmetric: bool, per_channel: bool, bits: int = 8) -> QuantizationScheme Constructs a quantization scheme, which determines the range of quantizations that a quantization algorithm may choose from. :param asymmetric: Required. Specifies whether to use asymmetric (versus symmetric) quantization. :type asymmetric: bool :param per_channel: Required. Specifies whether to use per-channel (versus per-tensor) quantization. :type per_channel: bool :param bits: The number of bits of precision to use for the quantized representation of activations. Must be either ``8`` (for int8) or ``16`` (for int16). Defaults to ``8``. The quantization of weights is fixed as int8. :type bits: int, optional :returns: The defined quantization scheme configured with the specified parameters. :rtype: QuantizationScheme
.. py:function:: bfloat16_scheme() -> QuantizationScheme Constructs a bfloat16 quantization scheme. It directs the compiler to use bfloat16 instead of integer quantization.
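As an illustration, the sketch below constructs the schemes described above; it assumes only the documented signatures of ``quantization_scheme`` and ``bfloat16_scheme``.

.. code-block:: python

    from afe.apis.defines import bfloat16_scheme, quantization_scheme

    # Per-tensor, asymmetric int8 activations (per-channel is not supported
    # for activations).
    act_int8 = quantization_scheme(asymmetric=True, per_channel=False)

    # Per-channel, symmetric int8 weights (asymmetric is not supported for weights).
    weight_int8 = quantization_scheme(asymmetric=False, per_channel=True)

    # Int16 activations for higher precision; weight quantization stays int8.
    act_int16 = quantization_scheme(asymmetric=True, per_channel=False, bits=16)

    # Direct the compiler to use bfloat16 instead of integer quantization.
    bf16 = bfloat16_scheme()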
.. py:class:: QuantizationParams Parameters controlling how to quantize a network. :param calibration_method: Calibration method. :param activation_quantization_scheme: Quantization scheme for activation tensors. :param weight_quantization_scheme: Quantization scheme for weight tensors. :param requantization_mode: A way of doing quantized arithmetic. :param node_names: Nodes to prevent from quantizing. :param custom_quantization_configs: Dictionary mapping node names to custom quantization options. :param biascorr_type: Selection of bias correction: regular, iterative, or none. :param channel_equalization: If True, channel equalization is enabled. :param smooth_quant: If True, smooth quant is enabled. .. py:attribute:: calibration_method :type: CalibrationMethod .. py:attribute:: activation_quantization_scheme :type: QuantizationScheme .. py:attribute:: weight_quantization_scheme :type: QuantizationScheme .. py:attribute:: requantization_mode :type: afe.ir.defines.RequantizationMode .. py:attribute:: node_names :type: Set[str] .. py:attribute:: custom_quantization_configs :type: Optional[Dict[afe.ir.defines.NodeName, Dict[str, Any]]] :value: None .. py:attribute:: biascorr_type :type: afe.ir.defines.BiasCorrectionType .. py:attribute:: channel_equalization :type: bool :value: False .. py:attribute:: smooth_quant :type: bool :value: False
.. py:method:: with_calibration(method: CalibrationMethod) -> QuantizationParams Sets the calibration method for activation tensors. This method configures the calibration approach for activation tensors during quantization. An observer is inserted at each layer to collect statistics of the output tensors. The derived min and max values from these statistics are used to compute the quantization parameters (scale and zero point) for each layer. :param method: Required. The calibration method to use. Supported methods include various approaches to determine the optimal quantization range based on tensor statistics. :type method: CalibrationMethod :returns: A new instance of quantization parameters with the updated calibration method. :rtype: QuantizationParams
.. py:method:: with_activation_quantization(scheme: QuantizationScheme) -> QuantizationParams Sets the quantization scheme for activation tensors. For activations, per-channel quantization is not supported. With per-tensor quantization, the ``asymmetric`` flag in the scheme can be set to either ``True`` or ``False`` to define the quantization behavior. :param scheme: Required. The quantization scheme to be applied for the model activations. :type scheme: QuantizationScheme :returns: A new instance of quantization parameters with the updated activation quantization scheme. :rtype: QuantizationParams
.. py:method:: with_weight_quantization(scheme: QuantizationScheme) -> QuantizationParams Sets the quantization scheme for weight tensors. For weights, the asymmetric quantization scheme is not supported. With symmetric quantization, the ``per_channel`` flag can be set to ``True`` or ``False`` to define the quantization behavior for weights. :param scheme: Required. The quantization scheme to be applied for the model weights. :type scheme: QuantizationScheme :returns: A new instance of quantization parameters using the chosen weight quantization scheme. :rtype: QuantizationParams
.. py:method:: with_requantization_mode(requantization_mode: afe.ir.defines.RequantizationMode) Sets the requantization mode for convolutions.
Two requantization modes are supported: - ``RequantizationMode.sima``: Uses arithmetic optimized for fast performance on SiMa’s accelerator. This is the default mode. - ``RequantizationMode.tflite``: Uses TFLite’s arithmetic with an 8-bit constant multiplier. :param requantization_mode: Required. The requantization mode to be applied. :type requantization_mode: RequantizationMode :returns: A new instance of quantization parameters with the updated requantization mode. :rtype: QuantizationParams
.. py:method:: with_unquantized_nodes(node_names: Set[str]) -> QuantizationParams Selects nodes to prevent from quantizing. Nodes with the specified names will be excluded from the quantization process. This replaces the set of node names selected by any previous call to ``with_unquantized_nodes``. Note that node names can be sensitive to changes in optimization or quantization settings, as some nodes may be created or renamed by the compiler. :param node_names: Required. A set of strings specifying the names of nodes that should not be quantized. :type node_names: Set[str] :returns: A new instance of quantization parameters with the updated unquantized node configuration. :rtype: QuantizationParams
.. py:method:: with_custom_quantization_configs(custom_quantization_configs: Dict[afe.ir.defines.NodeName, Dict[str, Any]]) Sets custom quantization options for specific nodes. The ``custom_quantization_configs`` argument is a dictionary where each key is a node name, and the corresponding value is a dictionary defining custom quantization options. This method is typically used in the following scenarios: 1. Enable the ``int32`` output of the last convolution node. 2. Enable mixed-precision quantization. **Note:** Users must obtain the node names from the SiMa IR graph. To do this, perform int8 quantization of a model and inspect the ``.sima.json`` file using Netron to identify node names. :param custom_quantization_configs: A dictionary where each key is a node name and the value is a dictionary of custom quantization settings for that node. :type custom_quantization_configs: Dict[NodeName, Dict[str, Any]] :returns: A new instance of quantization parameters with the custom quantization configuration applied. :rtype: QuantizationParams
.. py:method:: with_bias_correction(enable: bool | afe.ir.defines.BiasCorrectionType = True) Enables or disables bias correction for the quantization of convolutions with a bias. Bias correction calculates a bias term based on the observed input mean and the quantized weights. This term is then added to the convolution output to compensate for quantization errors. The algorithm is described in detail in Section 4.2 of the referenced paper. :param enable: Required. Determines whether bias correction is enabled or disabled. - ``True``: Enables regular bias correction. - ``False``: Disables bias correction. - ``BiasCorrectionType``: Allows specifying a custom bias correction type. :type enable: bool | BiasCorrectionType :returns: A new instance of quantization parameters with bias correction enabled or disabled as specified. :rtype: QuantizationParams
.. py:method:: with_channel_equalization(enable: bool = True) Enables or disables channel equalization for the quantization parameters. Channel equalization is a preprocessing step that aims to balance the distribution of weight tensors across different channels, which can enhance the accuracy of quantized models. :param enable: Specifies whether to enable channel equalization. Defaults to ``True``. :type enable: bool, optional :returns: A new instance of quantization parameters with the updated channel equalization setting. :rtype: QuantizationParams
.. py:method:: with_smooth_quant(enable: bool = True) Enables or disables smooth quant for the quantization parameters.
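The sketch below illustrates composing quantization parameters with the ``with_*`` methods above. It assumes the ``default_quantization`` preset documented below as a starting point; the node name passed to ``with_unquantized_nodes`` is hypothetical and must be taken from the compiled SiMa IR graph.

.. code-block:: python

    from afe.apis.defines import (
        CalibrationMethod,
        default_quantization,
        quantization_scheme,
    )

    # Start from the default parameters and override selected settings;
    # each with_* call returns a new QuantizationParams instance.
    params = (
        default_quantization
        .with_calibration(CalibrationMethod.from_str("entropy"))
        .with_activation_quantization(
            quantization_scheme(asymmetric=True, per_channel=False, bits=16)
        )
        .with_unquantized_nodes({"example_node_name"})  # hypothetical node name
        .with_bias_correction(True)
    )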
.. py:data:: default_quantization :type: QuantizationParams Default quantization parameters for model quantization. This configuration can be used as a baseline for quantizing a neural network using ``quantize_net`` or as a starting point for customizing quantization parameters. It specifies default settings for the calibration method, the activation and weight quantization schemes, and the requantization mode. .. attribute:: calibration_method The calibration method used for quantization. Defaults to ``MSE()``. :type: CalibrationMethod .. attribute:: activation_quantization_scheme Defines the quantization scheme for activations. - Asymmetric: ``True`` - Per-channel: ``False`` - Bits: ``8`` :type: QuantizationScheme .. attribute:: weight_quantization_scheme Defines the quantization scheme for weights. - Asymmetric: ``False`` - Per-channel: ``True`` - Bits: ``8`` :type: QuantizationScheme .. attribute:: requantization_mode The mode used for requantization. Defaults to ``RequantizationMode.sima``. :type: RequantizationMode .. attribute:: node_names The set of node names to exclude from quantization. Defaults to an empty set. :type: set .. attribute:: custom_quantization_configs Custom configurations for specific nodes. Defaults to ``None``. :type: Optional[Dict] :returns: The default quantization configuration for model quantization. :rtype: QuantizationParams
.. py:data:: int16_quantization :type: QuantizationParams Int16 quantization parameters for model quantization. This configuration is designed for quantizing neural networks where activations use 16-bit precision and weights use 8-bit precision. It can be used with ``quantize_net`` as a baseline for models requiring higher precision for activations, while maintaining efficient weight quantization. .. attribute:: calibration_method The calibration method used for quantization. Defaults to ``MSE()``. :type: CalibrationMethod .. attribute:: activation_quantization_scheme Defines the quantization scheme for activations. - Asymmetric: ``True`` - Per-channel: ``False`` - Bits: ``16`` :type: QuantizationScheme .. attribute:: weight_quantization_scheme Defines the quantization scheme for weights. - Asymmetric: ``False`` - Per-channel: ``True`` - Bits: ``8`` :type: QuantizationScheme .. attribute:: requantization_mode The mode used for requantization. Defaults to ``RequantizationMode.sima``. :type: RequantizationMode .. attribute:: node_names The set of node names to exclude from quantization. Defaults to an empty set. :type: set .. attribute:: custom_quantization_configs Custom configurations for specific nodes. Defaults to ``None``. :type: Optional[Dict] :returns: The quantization configuration using int16 for activations and int8 for weights. :rtype: QuantizationParams
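These presets can serve as starting points for further customization. The sketch below assumes the documented ``HistogramPercentileMethod`` constructor and reuses ``int16_quantization`` with a custom calibration method; the percentile and bin count shown are illustrative.

.. code-block:: python

    from afe.apis.defines import HistogramPercentileMethod, int16_quantization

    # Reuse the int16 preset but calibrate with a directly instantiated
    # percentile method instead of the default MSE-based calibration.
    params = int16_quantization.with_calibration(
        HistogramPercentileMethod(percentile_value=99.99, num_bins=2048)
    )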