afe.tvm_converter.quantization
==============================

.. py:module:: afe.tvm_converter.quantization

.. autoapi-nested-parse::

   Quantization code that is specific to the TVM converter.


Functions
---------

.. autoapisummary::

   afe.tvm_converter.quantization.correction_factors
   afe.tvm_converter.quantization.requantize_qnn_convolution_dense
   afe.tvm_converter.quantization.tflite_requantization_constants


Module Contents
---------------

.. py:function:: correction_factors(input_q: afe.ir.defines.Quantization, output_q: afe.ir.defines.Quantization) -> Tuple[float, float, int]

   Determine correction factors for requantizing from input_q to output_q.

   The correction factors consist of a scale correction ``sc``, a zero point
   correction ``zc``, and a shift ``n`` such that::

       output = (input * sc + zc) * 2**-n

   where ``sc`` is in the range 0.5 to 1.

   :param input_q: Quantization of data prior to requantization
   :param output_q: Quantization of data after requantization
   :return: Scale correction, zero point correction, and shift
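
   The decomposition above can be sketched with ``math.frexp``. This is a minimal
   illustration, not this module's implementation: ``decompose_affine`` is a
   hypothetical helper, and it assumes the requantization has already been written
   as ``output = a * input + b`` with ``a > 0``. How ``a`` and ``b`` follow from
   ``input_q`` and ``output_q`` depends on the ``Quantization`` convention, which is
   not described here.

   .. code-block:: python

      import math
      from typing import Tuple


      def decompose_affine(a: float, b: float) -> Tuple[float, float, int]:
          """Hypothetical helper: rewrite a*x + b as (x*sc + zc) * 2**-n."""
          if a <= 0.0:
              raise ValueError("the scale ratio must be positive")
          # frexp returns (m, e) with a == m * 2**e and 0.5 <= m < 1.
          sc, exponent = math.frexp(a)
          n = -exponent
          # Pre-scale the offset so that it survives the final 2**-n factor.
          zc = b * 2.0 ** n
          return sc, zc, n


      sc, zc, n = decompose_affine(0.3, 5.0)
      x = 7.0
      # Both expressions evaluate the same affine map.
      assert math.isclose((x * sc + zc) * 2.0 ** -n, 0.3 * x + 5.0)

   Keeping ``sc`` normalized to a fixed range means the shift carries the order of
   magnitude, so the scale correction keeps the same relative precision whatever the
   size of the ratio.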


.. py:function:: requantize_qnn_convolution_dense(weight: numpy.ndarray, bias: Optional[numpy.ndarray], data_zero_point: int, product_q: Union[afe.ir.defines.Quantization, List[afe.ir.defines.Quantization]], output_q: afe.ir.defines.Quantization, is_dense: bool) -> Tuple[numpy.ndarray, numpy.ndarray, Union[int, numpy.ndarray]]

   Convert the constant parameters of a Relay IR quantized convolution/dense,
   bias-add, and requantization into constant parameters for a SiMa IR
   convolution/dense.  The SiMa IR operator is equivalent to these three operators.
   Some precision is lost to rounding when converting between these parameters.

   :param weight: Weight tensor from QNN convolution, in HWIGO layout,
      or from QNN dense, in OI layout.
   :param bias: Bias tensor from QNN convolution/dense.  If None is given, it is
      treated as an array of zeros.
   :param data_zero_point: Zero point of the convolution's input activation matrix.
   :param product_q: Quantization of the input of the Relay IR requantize operator.
      When using per-tensor quantization, it is a single Quantization.  When using
      per-channel quantization, it is a list of Quantization with one item per channel.
   :param output_q: Quantization of the output of the Relay IR requantize operator.
      This is the same as the quantization of the output of the SiMa IR operator.
   :param is_dense: If True, the parameters are requantized for a dense operator;
      otherwise, for a convolution operator.
   :return: Weight, bias, and shift for SiMa IR convolution/dense.


.. py:function:: tflite_requantization_constants(weight: numpy.ndarray, bias: Optional[numpy.ndarray], data_zero_point: int, input_q: Union[afe.ir.defines.Quantization, List[afe.ir.defines.Quantization]], output_q: afe.ir.defines.Quantization, is_dense: bool) -> Union[Tuple[Optional[numpy.ndarray], int, int, int], Tuple[Optional[numpy.ndarray], numpy.ndarray, int, numpy.ndarray]]

   Compute constants for TFLite-style requantization.

   :param weight: Weight tensor from QNN convolution, in HWIGO layout,
      or from QNN dense, in OI layout.
   :param bias: Bias tensor from QNN convolution/dense.  If None is given, it is
      treated as an array of zeros.
   :param data_zero_point: Zero point of the convolution's input activation matrix.
   :param input_q: Quantization of the input of the Relay IR requantize operator.
      When using per-tensor quantization, it is a single Quantization.  When using
      per-channel quantization, it is a list of Quantization with one item per channel.
   :param output_q: Quantization of the output of the Relay IR requantize operator.
      This is the same as the quantization of the output of the SiMa IR operator.
   :param is_dense: If True, the parameters are requantized for a dense operator;
      otherwise, for a convolution operator.
   :return: Modified bias, scale correction, zero point correction, and shift.
      Scale correction and shift are integers for per-tensor quantization, or
      arrays for per-channel quantization.
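
   For background, the TensorFlow Lite reference kernels represent a real-valued
   multiplier as a 32-bit fixed-point integer plus a power-of-two shift. The sketch
   below mirrors that general technique. It is an illustration only:
   ``quantize_multiplier`` is a hypothetical name, and nothing here asserts that
   this function uses the same bit width or sign convention for its shift.

   .. code-block:: python

      import math
      from typing import Tuple


      def quantize_multiplier(multiplier: float) -> Tuple[int, int]:
          """Hypothetical helper: multiplier ~= q_fixed * 2**(shift - 31)."""
          if multiplier <= 0.0:
              raise ValueError("multiplier must be positive")
          # frexp returns (q, shift) with multiplier == q * 2**shift, 0.5 <= q < 1.
          q, shift = math.frexp(multiplier)
          # Round to the nearest integer with 31 fractional bits.
          q_fixed = int(math.floor(q * (1 << 31) + 0.5))
          # Rounding can reach 2**31 exactly; renormalize to stay in range.
          if q_fixed == (1 << 31):
              q_fixed //= 2
              shift += 1
          return q_fixed, shift


      q_fixed, shift = quantize_multiplier(0.3)
      # Reconstruct the multiplier from its integer representation.
      approx = q_fixed * 2.0 ** (shift - 31)
      assert math.isclose(approx, 0.3, rel_tol=1e-9)

   In a per-channel setting, a helper like this would be applied to each channel's
   multiplier in turn, yielding arrays instead of single integers.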