afe.tvm_converter.quantization
==============================

.. py:module:: afe.tvm_converter.quantization

.. autoapi-nested-parse::

   Quantization code that is specific to the TVM converter.


Functions
---------

.. autoapisummary::

   afe.tvm_converter.quantization.correction_factors
   afe.tvm_converter.quantization.requantize_qnn_convolution_dense
   afe.tvm_converter.quantization.tflite_requantization_constants


Module Contents
---------------

.. py:function:: correction_factors(input_q: afe.ir.defines.Quantization, output_q: afe.ir.defines.Quantization) -> Tuple[float, float, int]

   Determine correction factors for requantizing from input_q to output_q.

   The correction factors consist of a scale correction sc, a zero point correction zc, and a shift n such that ``output = (input * sc + zc) * 2**-n`` and sc is in the range 0.5 to 1.

   :param input_q: Quantization of the data prior to requantization.
   :param output_q: Quantization of the data after requantization.
   :return: Scale correction, zero point correction, and shift.


.. py:function:: requantize_qnn_convolution_dense(weight: numpy.ndarray, bias: Optional[numpy.ndarray], data_zero_point: int, product_q: Union[afe.ir.defines.Quantization, List[afe.ir.defines.Quantization]], output_q: afe.ir.defines.Quantization, is_dense: bool) -> Tuple[numpy.ndarray, numpy.ndarray, Union[int, numpy.ndarray]]

   Convert the constant parameters of a Relay IR quantized convolution/dense, bias-add, and requantization into the constant parameters of a SiMa IR convolution/dense.

   The SiMa IR operator is equivalent to these three operators. Some precision is lost to rounding when converting between the two sets of parameters.

   :param weight: Weight tensor from QNN convolution in HWIGO layout, or from QNN dense in OI layout.
   :param bias: Bias tensor from QNN convolution/dense. If None is given, it is treated as an array of zeros.
   :param data_zero_point: Zero point of the convolution's input activation matrix.
   :param product_q: Quantization of the input of the Relay IR requantize operator. With per-tensor quantization, it is a single Quantization. With per-channel quantization, it is a list of Quantization with one item per channel.
   :param output_q: Quantization of the output of the Relay IR requantize operator. This is the same as the quantization of the output of the SiMa IR operator.
   :param is_dense: If True, the function requantizes a dense operator; otherwise, a convolution operator.
   :return: Weight, bias, and shift for SiMa IR convolution/dense.


.. py:function:: tflite_requantization_constants(weight: numpy.ndarray, bias: Optional[numpy.ndarray], data_zero_point: int, input_q: Union[afe.ir.defines.Quantization, List[afe.ir.defines.Quantization]], output_q: afe.ir.defines.Quantization, is_dense: bool) -> Union[Tuple[Optional[numpy.ndarray], int, int, int], Tuple[Optional[numpy.ndarray], numpy.ndarray, int, numpy.ndarray]]

   Compute constants for TFLite-style requantization.

   :param weight: Weight tensor from QNN convolution in HWIGO layout, or from QNN dense in OI layout.
   :param bias: Bias tensor from QNN convolution/dense. If None is given, it is treated as an array of zeros.
   :param data_zero_point: Zero point of the convolution's input activation matrix.
   :param input_q: Quantization of the input of the Relay IR requantize operator. With per-tensor quantization, it is a single Quantization. With per-channel quantization, it is a list of Quantization with one item per channel.
   :param output_q: Quantization of the output of the Relay IR requantize operator. This is the same as the quantization of the output of the SiMa IR operator.
   :param is_dense: If True, the function requantizes a dense operator; otherwise, a convolution operator.
   :return: Modified bias, scale correction, zero point correction, and shift for convolution. Scale correction and shift are integers for per-tensor convolution, or arrays for per-channel convolution.
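
The relation documented for ``correction_factors``, ``output = (input * sc + zc) * 2**-n`` with sc between 0.5 and 1, can be illustrated with a minimal sketch. It is not the implementation in this module. The helpers ``normalize_affine`` and ``apply_correction`` and their parameters ``multiplier`` and ``offset`` are hypothetical names, and the sketch assumes the requantization has already been reduced to a plain affine map ``output = input * multiplier + offset``. How those two numbers are derived from the ``Quantization`` objects is not shown here.

.. code-block:: python

   import math
   from typing import Tuple


   def normalize_affine(multiplier: float, offset: float) -> Tuple[float, float, int]:
       """Rewrite input * multiplier + offset as (input * sc + zc) * 2**-n.

       Hypothetical helper for illustration only. The multiplier must be
       positive and finite.
       """
       if not math.isfinite(multiplier) or multiplier <= 0.0:
           raise ValueError("multiplier must be positive and finite")
       # frexp returns (m, e) with multiplier == m * 2**e and 0.5 <= m < 1.
       sc, exponent = math.frexp(multiplier)
       n = -exponent
       # The offset is scaled by 2**n so that the final shift by n restores it.
       zc = math.ldexp(offset, n)
       return sc, zc, n


   def apply_correction(x: float, sc: float, zc: float, n: int) -> float:
       """Evaluate (x * sc + zc) * 2**-n."""
       return math.ldexp(x * sc + zc, -n)


   sc, zc, n = normalize_affine(0.15, 3.0)
   direct = 10.0 * 0.15 + 3.0
   corrected = apply_correction(10.0, sc, zc, n)
   print(sc, zc, n, math.isclose(direct, corrected))

Because ``math.frexp`` returns a mantissa in the half-open interval from 0.5 up to but not including 1, the sketch never produces a scale correction of exactly 1. Whether ``correction_factors`` treats the endpoints the same way is not stated above.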