afe.ir.quantization_utils

Attributes

DTYPE_BOUNDS

Classes

QNNDtype

Data types used in QNN operations

Functions

`round_op`(→ float)	Rounding to the nearest larger integer
`calculate_normalization_shift`(→ Union[float, ...)	Calculate the number of shifts to normalize a scale.
`get_bound`(→ int)
`clip_to_targeted_range`(→ Union[int, numpy.ndarray])	Clip the x with targeted range determined by the given bit number.
`compute_scale`(→ float)	Compute a linear quantization scale for mapping the range (min_val, max_val) onto the quantized integer range
`compute_zero_point`(→ int)	Given min and max value, compute the zero point.
`significant_bits_signed`(→ int)	Get the smallest signed integer bit width that can represent the given integer.
`compute_power_of_2_scale_and_shift`(→ Union[Tuple[int, ...)	Given a float scale or a vector of scale and quantized bit number for input and output,
`compute_weight_scale`(→ float)	Compute weight scale. Weights are always quantized symmetrically.
`compute_weight_scale_per_channel`(→ numpy.ndarray)	Compute per-channel weight scales. The expected layout of weight is AwesomeConvWeightLayout.
`linear_scale`(→ numpy.ndarray)	Linear scale the input based on the scale. Clip the scaled input based on the bit number
`linear_scale_per_channel`(→ numpy.ndarray)	Linear scale the input based on the scale. Clip the scaled input based on the bit number
`linear_quantize`(→ numpy.ndarray)	quantized_input = (input / S) + zero_point.
`linear_quantize_with_quantization`(→ numpy.ndarray)	Apply a quantization to a floating-point tensor to produce a quantized tensor.
`quantize_value`(→ Any)	Quantize a value according to the given quantization.
`dequantize_value`(→ Any)	Dequantize a value according to the given quantization.
`get_zero_kernel_mask_per_channel`(→ numpy.ndarray)	Return the mask of zero kernel. The kernel layout of weight must be in AwesomeConvWeightLayout.
`dequantize`(→ numpy.ndarray)	Original equation:
`requantize`(→ numpy.ndarray)	Requantize a quantized tensor to another quantization domain
`float_requantization`(→ Tuple[float, float])	Calculate floating-point correction parameters to requantize integer data using
`power_of_2_requantization`(→ int)	Calculate a shift factor to requantize data by a power of 2 in integer arithmetic.
`requantization`(→ Tuple[int, int, int])	Calculate correction factors to requantize data in integer arithmetic.
`requantization_tflite`(→ Tuple[int, int, int])	Calculate correction factors to do TFLite requantization.
`is_quantized`(→ bool)
`dequantize_tensor`(→ Union[List[numpy.ndarray], ...)	Dequantize tensor. A tensor can be a List[int], a Tuple[np.ndarray, ...], or a np.ndarray.
`quantize_tensor`(→ Union[Tuple[numpy.ndarray, ...)	Quantize tensor. A tensor can be Tuple[np.ndarray, ...] or a np.ndarray.
`dequantize_input_dict`(→ Dict[afe.ir.defines.NodeName, ...)	Given an input_dict, input scales, and input zero points, dequantize each input in the input_dict
`quantize_input_dict`(→ Dict[afe.ir.defines.NodeName, ...)	Given an input_dict, input scales, and input zero points, quantize each input in the input_dict
`quantize_alpha`(→ Tuple[numpy.ndarray, int])	Quantize the alpha for PreluOp
`quantize_add_subtract`(→ Tuple[List[int], int, int])	Quantize the add/subtract operator
`quantize_multiply`(→ Tuple[int, ...)	Quantize the multiply operator.
`quantize_batch_matmul`(→ Tuple[int, ...)
`quantize_udf`(→ numpy.ndarray)	Create a lookup table for a user-defined function.
`get_input_quantization_func`(...)	Return a function that takes a numpy array and using the scale
`quantize_clip_attrs`(→ afe.ir.attributes.ClipQuantAttrs)	Quantize the attributes of clip operator
`quantize_activation`(...)	Quantize a simple activation function (clip, relu, or nothing) and simplify it if possible.
`requantize_activation`(...)	Requantize an activation function.
`requantize_quantization`(→ afe.ir.defines.Quantization)	Get the quantization of the result of requantizing a tensor.
`quantize_prelu`(→ Tuple[int, int])	Quantize the PRelu alphas and return the quantized alphas and right shifts
`quantize_reciprocal`(→ afe.ir.attributes.AwesomeCalibAttrs)	Quantize the reciprocal part of divide
`quantize_lrn`(→ afe.ir.attributes.LRNQuantAttrs)	Quantize LRN which is implemented based on quantized_local_response_normalization from ml_kernels repo:
`quantize_softmax`(→ afe.ir.attributes.SoftmaxQuantAttrs)	Quantize Softmax which is implemented based on softmax implementation from ml_kernels repo:
`quantize_layer_norm`(...)	Quantize LayerNorm which is implemented based on layer norm implementation from ml_kernels repo:
`quantize_instance_norm`(attrs, input_quant, mean_quant, ...)	Quantize Instance Normalization operator: (input - mean) / sqrt(variance + epsilon).
`quantize_rms_norm_core`(...)	Quantize RMS Normalization which is implemented based on rms norm implementation from ml_kernels repo:
`quantization_data_value_to_output_list`(...)	Convert a Data value of Quantization object(s) to lists of quantization-related values.
`fix_requantization`(...)	Change the data type of the right_shift array, if it is present, to uint8.
`cast_calibration_inputs`(values, cast)	Quantizes a list of tensors according to casts. Identity cast returns the original values.
`create_requantization_from_cast`(...)	Get the Requantization that implements the given cast.

Module Contents

class afe.ir.quantization_utils.QNNDtype

Data types used in QNN operations

INT8 = 'int8'

UINT8 = 'uint8'

INT32 = 'int32'

afe.ir.quantization_utils.DTYPE_BOUNDS

afe.ir.quantization_utils.round_op(x: float, rounding_type: ml_kernels.math_helpers.RoundType = RoundType.TOEVEN) → float

Rounding to the nearest larger integer

Parameters:

x – A float32 number to be rounded

return: Rounded result

afe.ir.quantization_utils.calculate_normalization_shift(scale: float | numpy.ndarray, rounding: ml_kernels.math_helpers.RoundType = RoundType.TRUNC) → float | numpy.ndarray

Calculate the number of shifts to normalize a scale. The original scale will be normalized, depending on the rounding type, after dividing by (2**shift).

afe.ir.quantization_utils.get_bound(bits: int, signed: bool = True) → int

afe.ir.quantization_utils.clip_to_targeted_range(x: int | numpy.ndarray, bits: int, restricted_range: bool = False) → int | numpy.ndarray

Clip x to the targeted range determined by the given bit number.

Parameters:

x – Numpy array or int
bits – Number of bits used to determine the min and max number
restricted_range – If true, the abs(a_min) == abs(a_max)
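As a rough illustration of the clipping rule, the sketch below assumes a signed two's-complement range for `bits`, and assumes that `restricted_range` drops the most negative value so that both bounds have equal magnitude. The helper name `clip_signed` is hypothetical; this is not the library's implementation.

```python
def clip_signed(x: int, bits: int, restricted_range: bool = False) -> int:
    """Clip x to the signed range implied by `bits` (assumed convention)."""
    a_max = (1 << (bits - 1)) - 1                       # e.g. 127 for 8 bits
    a_min = -a_max if restricted_range else -a_max - 1  # -127 or -128 for 8 bits
    return max(a_min, min(a_max, x))


print(clip_signed(300, 8))                          # 127
print(clip_signed(-300, 8))                         # -128
print(clip_signed(-300, 8, restricted_range=True))  # -127
```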

afe.ir.quantization_utils.compute_scale(asymmetry: bool, layer_bits: int, min_val: float, max_val: float, include_real_zero_point: bool = False) → float

Compute a linear quantization scale for mapping the range (min_val, max_val) onto the quantized integer range determined by layer_bits, include_real_zero_point, and asymmetry.

The computed scale is the reciprocal of the scale in TFLite’s convention.

Parameters:

asymmetry – If true, do asymmetric quantization.
layer_bits – Number of bits used for quantization.
min_val – Minimum value.
max_val – Maximum value.
include_real_zero_point – If True, force the float dynamic range covering zero.

return: Computed scale s such that real numbers r are converted to integers q by the formula q = round(s * r).
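The convention q = round(s * r) can be illustrated with a small sketch. The formulas below are assumptions chosen for illustration (full integer span for the asymmetric case, largest magnitude mapped to the largest positive integer for the symmetric case); `sketch_scale` is a hypothetical helper and may differ from what `compute_scale` actually computes, in particular with respect to `include_real_zero_point`.

```python
def sketch_scale(asymmetry: bool, bits: int, min_val: float, max_val: float) -> float:
    """Return s such that q = round(s * r); assumed formulas, for illustration only."""
    if asymmetry:
        # Spread the real range over all 2**bits - 1 integer steps.
        return (2 ** bits - 1) / (max_val - min_val)
    # Symmetric: map the largest magnitude onto the largest positive integer.
    return (2 ** (bits - 1) - 1) / max(abs(min_val), abs(max_val))


s = sketch_scale(False, 8, -2.0, 1.0)
print(s)                 # 63.5
print(round(s * -2.0))   # -127
tflite_scale = 1.0 / s   # TFLite's convention stores the reciprocal
```

A degenerate range (for example min_val == max_val) would divide by zero in this sketch; a real implementation has to handle that case.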

afe.ir.quantization_utils.compute_zero_point(asymmetry: bool, layer_bits: int, min_val: float, max_val: float, restricted_range: bool = False) → int

Given min and max value, compute the zero point.

Parameters:

asymmetry – If true, do asymmetric quantization.
layer_bits – Number of bits used for quantization.
min_val – Minimum value.
max_val – Maximum value.
restricted_range – If True, the dynamic range will be equal at negative and positive side.

return: Zero point.

afe.ir.quantization_utils.significant_bits_signed(n: int) → int

Get the smallest signed integer bit width that can represent the given integer.

> significant_bits_signed(-129) = 9
> significant_bits_signed(-128) = 8
> significant_bits_signed(127) = 8
> significant_bits_signed(128) = 9
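The four documented values are consistent with counting two's-complement bits. A minimal re-implementation sketch (the name `significant_bits_signed_sketch` is hypothetical, and the library's own code may be written differently):

```python
def significant_bits_signed_sketch(n: int) -> int:
    """Smallest two's-complement width that can hold n."""
    # For n >= 0 we need the magnitude bits of n plus one sign bit.
    # For n < 0, ~n == -n - 1 is non-negative and needs the same width.
    magnitude = n if n >= 0 else ~n
    return magnitude.bit_length() + 1


for n in (-129, -128, 127, 128):
    print(n, significant_bits_signed_sketch(n))   # 9, 8, 8, 9
```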

afe.ir.quantization_utils.compute_power_of_2_scale_and_shift(scale: float | numpy.ndarray, input_bit: int, output_bit: int) → Tuple[int, int] | Tuple[numpy.ndarray, numpy.ndarray]

Given a float scale or a vector of scales, and the quantized bit numbers for input and output, return a quantized scale and right shift.

Parameters:

scale – Union[float, np.ndarray]
input_bit – int. Number of bits used for input quantization
output_bit – int. Number of bits used for output quantization

return: Union[Tuple[int, int], Tuple[np.ndarray, np.ndarray]]. Tuple of (scale, right shift)

afe.ir.quantization_utils.compute_weight_scale(weight: numpy.ndarray, bits: int) → float

Compute weight scale. Weights are always quantized symmetrically.

Parameters:

weight – Weight tensor.
bits – Number of bits used to quantize weight.

return: Scale of weight.

afe.ir.quantization_utils.compute_weight_scale_per_channel(weight: numpy.ndarray, bits: int) → numpy.ndarray

Compute per-channel weight scales. The expected layout of weight is AwesomeConvWeightLayout.

Parameters:

weight – Weight tensor in AwesomeConvWeightLayout format.
bits – Number of bits used to quantize weight.

return: An array of scales of weight.

afe.ir.quantization_utils.linear_scale(input: numpy.ndarray, scale: float, bits: int, clip: bool = True) → numpy.ndarray

Linearly scale the input by the scale. Clip the scaled input based on the bit number.

Parameters:

input – A numpy array.
scale – A scale factor used to scale the input to a target range.
bits – Number of bits used to clip the scaled input.
clip – If true, clip the linear scale result to the given dynamic range.

return: Scaled input.

afe.ir.quantization_utils.linear_scale_per_channel(input: numpy.ndarray, scale: numpy.ndarray, bits: int, clip: bool = True) → numpy.ndarray

Linearly scale the input by the scale. Clip the scaled input based on the bit number. The output channel has to be at the last dimension.

Parameters:

input – A numpy array.
scale – A numpy array of scale factors used to scale the input to different target ranges in different channels.
bits – Number of bits used to clip the scaled input.
clip – If true, clip the linear scale results to the given dynamic range.

return: Scaled input.

afe.ir.quantization_utils.linear_quantize(input: numpy.ndarray, scale: float, zp: int, bits: int) → numpy.ndarray

quantized_input = (input / S) + zero_point.

Parameters:

input – A numpy array.
scale – scale = (1/S) in the above equation.
zp – Zero point of the quantized input.
bits – Number of bits used to clip the scaled input.

return: Quantized input.

afe.ir.quantization_utils.linear_quantize_with_quantization(input: numpy.ndarray, quantization: afe.ir.defines.Quantization) → numpy.ndarray

Apply a quantization to a floating-point tensor to produce a quantized tensor.

Parameters:

input – Floating-point tensor
quantization – Quantization to apply

Returns:

Quantized tensor

afe.ir.quantization_utils.quantize_value(value: Any, q: afe.ir.defines.DataValue[afe.ir.defines.Quantization | None]) → Any

Quantize a value according to the given quantization. Values consist of arrays and tuples.

Parameters:

value – Value to quantize. It must consist of numpy arrays and tuples.
q – Quantization of the value. None means that the value is not quantized and so it will be returned unchanged.

Returns:

Quantized value. It has the same tuple structure as the input.

afe.ir.quantization_utils.dequantize_value(value: Any, q: afe.ir.defines.DataValue[afe.ir.defines.Quantization | None]) → Any

Dequantize a value according to the given quantization. Values consist of arrays and tuples.

Parameters:

value – Value to dequantize. It must consist of numpy arrays and tuples.
q – Quantization of the value. None means that the value is not quantized and so it will be returned unchanged.

Returns:

Dequantized value. It has the same tuple structure as the input.

afe.ir.quantization_utils.get_zero_kernel_mask_per_channel(weight: numpy.ndarray, threshold: float) → numpy.ndarray

Return the mask of zero kernel. The kernel layout of weight must be in AwesomeConvWeightLayout.

Parameters:

weight – Weights for convolution in AwesomeConvWeightLayout layout.
threshold – If the sum of kernel’s absolute value is smaller than the threshold, the kernel will be treated as a zero kernel.

return: Mask of zero kernel. True means the kernel is a zero kernel.

afe.ir.quantization_utils.dequantize(input: numpy.ndarray, scale: float, zp: int) → numpy.ndarray

Original equation:

quantized_input = (input / S) + zero_point

Reverse it to get the dequantize equation:

dequantized_input = (quantized_input - zero_point) * S

Parameters:

input – A numpy array.
scale – scale = (1 / S) in the above equation.
zp – Zero point of the quantized input.

return: Dequantized input.
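The quantize and dequantize equations can be checked with a scalar round trip. This is a minimal sketch, assuming a signed clipping range and Python's built-in `round` (which rounds ties to even); the local `quantize` and `dequantize` helpers are stand-ins written for this example and are not the library functions.

```python
def quantize(r: float, scale: float, zp: int, bits: int = 8) -> int:
    """q = round(r * scale) + zp, clipped to a signed `bits`-wide range."""
    q = round(r * scale) + zp
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, q))


def dequantize(q: int, scale: float, zp: int) -> float:
    """r = (q - zp) / scale, i.e. (q - zp) * S with S = 1 / scale."""
    return (q - zp) / scale


scale, zp = 50.0, 3               # scale plays the role of 1 / S
q = quantize(0.5, scale, zp)
print(q)                          # 28
print(dequantize(q, scale, zp))   # 0.5
```

Values that fall outside the representable range saturate when quantized, so the round trip is only exact for inputs that land on the integer grid inside that range.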

afe.ir.quantization_utils.requantize(data: numpy.ndarray, bits: int, right_shifts: int | numpy.ndarray, zp: int | None = None, per_channel: bool = False, axis: int = -1, rounding_type: ml_kernels.math_helpers.RoundType = RoundType.UPWARD, *, result_type: afe.ir.tensor_type.ScalarType = ScalarType.int8) → numpy.ndarray

Requantize a quantized tensor to another quantization domain.

Parameters:

data – A numpy array.
bits – Number of bits used to clip the scaled input.
right_shifts – A numpy array. Each output channel has a number of bits shifted to the right. This acts as a hardware-friendly power-of-2 scale.
zp – Zero point of the quantized input.
per_channel – Default is False. If True, each output channel has one right_shift.
result_type – Numeric type of requantized tensor.

return: Requantized tensor in chosen numeric type.

afe.ir.quantization_utils.float_requantization(input_quantization: afe.ir.defines.Quantization, output_quantization: afe.ir.defines.Quantization) → Tuple[float, float]

Calculate floating-point correction parameters to requantize integer data using floating-point intermediate values.

It returns S and Z such that data can be requantized by the calculation:

quantized_output = round(S * float(quantized_input) + Z)

Parameters:

input_quantization – Quantization of input data
output_quantization – Quantization of output data

Returns:

Requantization scale correction and zero point correction
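The two correction terms follow from substituting one quantization into the other. The sketch below assumes both domains use q = round(s * r) + z (the reciprocal-of-TFLite scale convention described for compute_scale) and represents each Quantization by a plain (scale, zero point) pair; the function name and its arguments are hypothetical.

```python
def float_requant_params(s_in: float, z_in: int, s_out: float, z_out: int):
    """Return (S, Z) with q_out = round(S * q_in + Z).

    Assumes r = (q_in - z_in) / s_in and q_out = s_out * r + z_out.
    """
    S = s_out / s_in
    Z = z_out - S * z_in
    return S, Z


S, Z = float_requant_params(s_in=64.0, z_in=10, s_out=16.0, z_out=-5)
q_in = 74                      # represents r = (74 - 10) / 64 = 1.0
print(S, Z)                    # 0.25 -7.5
print(round(S * q_in + Z))     # 11, i.e. round(16 * 1.0) - 5
```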

afe.ir.quantization_utils.power_of_2_requantization(input_quantization: afe.ir.defines.Quantization, output_quantization: afe.ir.defines.Quantization) → int

Calculate a shift factor to requantize data by a power of 2 in integer arithmetic.

This should only be used if the input and output quantization were chosen for power of 2 requantization. It is not a good approximation in general.

It returns a shift such that data can be requantized by the calculation:

quantized_output = quantized_input >> shift

The shift should use rounding to nearest, with any tie-breaking method.

Parameters:

input_quantization – Quantization of input data
output_quantization – Quantization of output data

Returns:

Amount to shift right. May be negative.
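A right shift with rounding to nearest can be sketched as follows. Ties are rounded toward positive infinity here, which is one of the permitted tie-breaking choices, and a negative shift is treated as a left shift; `rounding_shift_right` is a hypothetical helper name.

```python
def rounding_shift_right(q: int, shift: int) -> int:
    """Shift right with round-to-nearest (ties toward +infinity)."""
    if shift <= 0:
        return q << -shift                      # negative shift means shift left
    return (q + (1 << (shift - 1))) >> shift    # add half an output step, then floor


print(rounding_shift_right(100, 3))    # 13  (100 / 8 = 12.5, tie rounds up)
print(rounding_shift_right(-100, 3))   # -12 (-12.5, tie rounds up)
print(rounding_shift_right(5, -2))     # 20
```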

afe.ir.quantization_utils.requantization(input_quantization: afe.ir.defines.Quantization, output_quantization: afe.ir.defines.Quantization, bits: int = 32, *, sc_correction_bits: int = 32) → Tuple[int, int, int]

Calculate correction factors to requantize data in integer arithmetic.

It returns S, Z, and shift such that data can be requantized by the calculation:

quantized_output = ((S * quantized_input) + Z) >> shift

The shift should use rounding to nearest, with any tie-breaking method.

Parameters:

input_quantization – Quantization of input data
output_quantization – Quantization of output data
bits – Integer precision of temporary values
sc_correction_bits – Integer precision of the scale correction. The returned scale correction, taken as a signed integer, will not exceed this many bits.

Returns:

Requantization scale correction, zero point correction, and right shift
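One way to obtain integer correction factors is to scale floating-point corrections by 2**shift and round them. The sketch below does that with a caller-supplied shift, which is an assumption made to keep the example short: the real function derives its values from bits and sc_correction_bits. Both helper names are hypothetical.

```python
def integer_requant_params(S: float, Z: float, shift: int):
    """Scale the float corrections by 2**shift and round them to integers."""
    return round(S * (1 << shift)), round(Z * (1 << shift)), shift


def apply_requant(q_in: int, S_int: int, Z_int: int, shift: int) -> int:
    """((S * q) + Z) >> shift with round-to-nearest; requires shift >= 1."""
    acc = S_int * q_in + Z_int
    return (acc + (1 << (shift - 1))) >> shift


S_int, Z_int, shift = integer_requant_params(0.25, -7.5, shift=8)
print(S_int, Z_int, shift)                      # 64 -1920 8
print(apply_requant(74, S_int, Z_int, shift))   # 11
```

A larger shift keeps more fractional precision in S and Z but needs a wider accumulator for the intermediate sum.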

afe.ir.quantization_utils.requantization_tflite(input_quantization: afe.ir.defines.Quantization, output_quantization: afe.ir.defines.Quantization) → Tuple[int, int, int]

Calculate correction factors to do TFLite requantization.

It returns S, Z, and shift such that data can be requantized by the calculation:

quantized_output = ((S * quantized_input) >> shift) + Z

The shift should use rounding to nearest, with any tie-breaking method. The product (S * quantized_input) is assumed not to overflow. It is designed for a datapath that calculates this product in 64-bit precision.

Parameters:

input_quantization – Quantization of input data. The input data’s zero point must be 0.
output_quantization – Quantization of output data

Returns:

Requantization scale correction, zero point correction, and right shift

afe.ir.quantization_utils.is_quantized(data: numpy.ndarray) → bool

afe.ir.quantization_utils.dequantize_tensor(data: List[numpy.ndarray] | Tuple[numpy.ndarray, Ellipsis] | numpy.ndarray, scales: List[float], zps: List[int]) → List[numpy.ndarray] | Tuple[numpy.ndarray, Ellipsis] | numpy.ndarray

Dequantize tensor. A tensor can be a List[np.ndarray], a Tuple[np.ndarray, …], or a np.ndarray.

afe.ir.quantization_utils.quantize_tensor(data: Tuple[numpy.ndarray, Ellipsis] | numpy.ndarray, scales: List[float | List[float]], zps: List[int | List[int]], layer_bits: List[int | List[int]]) → Tuple[numpy.ndarray, Ellipsis] | numpy.ndarray

Quantize tensor. A tensor can be Tuple[np.ndarray, …] or a np.ndarray.

afe.ir.quantization_utils.dequantize_input_dict(input_dict: Dict[afe.ir.defines.NodeName, numpy.ndarray | Tuple[numpy.ndarray, Ellipsis]], scales: List[float | List[float]], zps: List[int | List[int]]) → Dict[afe.ir.defines.NodeName, numpy.ndarray | Tuple[numpy.ndarray, Ellipsis]]

Given an input_dict, input scales, and input zero points, dequantize each input in the input_dict to float if the data type is QuantizedTensor.

Parameters:

input_dict – Dict[NodeName, Union[np.ndarray, Tuple[np.ndarray, …]]]. Input dictionary with (key: value) = (input_name: data)
scales – List[Union[float, List[float]]]. Input scale for each input data
zps – List[Union[int, List[int]]]. Input zero point for each input data

Returns:

A dequantized input_dict

afe.ir.quantization_utils.quantize_input_dict(input_dict: Dict[afe.ir.defines.NodeName, numpy.ndarray | Tuple[numpy.ndarray, Ellipsis]], scales: List[float | List[float]], zps: List[int | List[int]], layer_bits: List[int | List[int]]) → Dict[afe.ir.defines.NodeName, numpy.ndarray | Tuple[numpy.ndarray, Ellipsis]]

Given an input_dict, input scales, and input zero points, quantize each input in the input_dict to QuantizedTensor if the data type is not QuantizedTensor.

Parameters:

input_dict – Dict[NodeName, Union[np.ndarray, Tuple[np.ndarray, …]]]. Input dictionary with (key: value) = (input_name: data)
scales – List[Union[float, List[float]]]. Input scale for each input data
zps – List[Union[int, List[int]]]. Input zero point for each input data
layer_bits – Int, number of bit precision for QuantizedTensor

Returns:

A quantized input_dict

afe.ir.quantization_utils.quantize_alpha(alpha: numpy.ndarray, bits: int = 8) → Tuple[numpy.ndarray, int]

Quantize the alpha for PreluOp

Parameters:

alpha – Alpha
bits – Number of bits used for quantization

Returns:

Quantized alpha, shift value

afe.ir.quantization_utils.quantize_add_subtract(is_subtract: bool, input_scales: List[float], input_zps: List[int], scale: float, zero_point: int, layer_bits: int, in1_scale_const: int = 1, in2_scale_const: int = 1) → Tuple[List[int], int, int]

Quantize the add/subtract operator.

Parameters:

is_subtract – If True, the function is used to quantize the subtract operator, otherwise the add operator.
input_scales – Scales of the input nodes.
input_zps – Zero points of the input nodes.
scale – Scale of the current node.
zero_point – Zero point of the current node.
layer_bits – Number of bits used for quantization.
in1_scale_const – Const to be folded in 1st input scale.
in2_scale_const – Const to be folded in 2nd input scale.

afe.ir.quantization_utils.quantize_multiply(lhs_quant: afe.ir.defines.Quantization, rhs_quant: afe.ir.defines.Quantization, output_quant: afe.ir.defines.Quantization, allow_full_output_precision: bool) → Tuple[int, ml_kernels.requantization.BaseRequantization[numpy.ndarray], afe.ir.defines.Quantization]

Quantize the multiply operator.

Parameters:

lhs_quant – Quantization of the first input of multiply
rhs_quant – Quantization of the second input of multiply
output_quant – Quantization of the output of multiply. It may be ignored if allow_full_output_precision is True.
allow_full_output_precision – Whether 32-bit output is allowed. If True, then this function may ignore output_quant and output a 32-bit quantization. If false, then this function will quantize according to output_quant.

Returns:

Tuple of intrinsic shift amount, requantization to perform, and quantization of the output.

afe.ir.quantization_utils.quantize_batch_matmul(lhs_quant: afe.ir.defines.Quantization, rhs_quant: afe.ir.defines.Quantization, output_quant: afe.ir.defines.Quantization, error_reporter: afe.ir.defines.NodeReporter) → Tuple[int, ml_kernels.requantization.BaseRequantization[numpy.ndarray], afe.ir.defines.Quantization]

afe.ir.quantization_utils.quantize_udf(input_quant: afe.ir.defines.Quantization, output_quant: afe.ir.defines.Quantization, input_type: type, output_type: type, func: Callable[[numpy.ndarray], numpy.ndarray], invert_scales: bool = True) → numpy.ndarray

Create a lookup table for a user-defined function.

Parameters:

input_quant – Quantization of the input.
output_quant – Quantization of the output.
input_type – Type of LUT input.
output_type – Type of LUT output.
func – Function to be approximated by the lookup table.
invert_scales – If true, the input scale factors are inverted.

Returns:

Lookup table representing func for the quantized input and output. It is a numpy array of int8 or int16 values.
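The idea of a quantized lookup table can be sketched by enumerating every input code, dequantizing it, applying the function, and quantizing the result. The sketch assumes int8 input and output, the q = round(s * r) + z convention, and saturation at the output range; `build_lut` and its parameters are hypothetical and do not model the invert_scales option.

```python
import math


def build_lut(func, in_scale, in_zp, out_scale, out_zp, bits=8):
    """Tabulate func over every signed `bits`-wide input code."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    lut = []
    for q_in in range(lo, hi + 1):
        r = (q_in - in_zp) / in_scale                # dequantize the input code
        q_out = round(func(r) * out_scale) + out_zp  # quantize the function value
        lut.append(max(lo, min(hi, q_out)))          # saturate to the output range
    return lut


lut = build_lut(math.tanh, in_scale=32.0, in_zp=0, out_scale=127.0, out_zp=0)
print(len(lut))    # 256
print(lut[128])    # 0, the entry for input code 0 (index = code + 128)
```

At run time the quantized operator then reduces to an index into this table, with the index offset by the most negative input code.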

afe.ir.quantization_utils.get_input_quantization_func(scale: float, zp: int, layer_bit: int) → Callable[[numpy.ndarray], numpy.ndarray]

Return a function that takes a numpy array and quantizes it with the given scale and zero point, using the equation below:

quantized_input = (input / S) + zero_point

Parameters:

scale – scale = (1/S) in the above equation.
zp – Zero point of the quantized input.
layer_bit – Number of bits used to clip the scaled input.

afe.ir.quantization_utils.quantize_clip_attrs(attrs: afe.ir.attributes.ClipAttrs, scalar_type: afe.ir.tensor_type.ScalarType, quant: afe.ir.defines.Quantization) → afe.ir.attributes.ClipQuantAttrs

Quantize the attributes of clip operator

Calculate the boundaries of the clip operator based on its quantization parameters and data type.

Parameters:

attrs – Attributes of the clip operator
scalar_type – Scalar data type of the quantized clip operator
quant – Quantization parameters to apply to clip operator

Returns:

Attributes of the quantized clip operator containing boundary parameters calculated for quantized operator.

afe.ir.quantization_utils.quantize_activation(attrs: afe.ir.attributes.ClipAttrs | afe.ir.attributes.ReluAttrs | None, quantization: afe.ir.defines.Quantization, scalar_type: afe.ir.tensor_type.ScalarType, *, quant_config: afe.core.configs.QuantizationConfigs | None = None) → afe.ir.attributes.ClipQuantAttrs | afe.ir.attributes.ReluQuantAttrs | None

Quantize a simple activation function (clip, relu, or nothing) and simplify it if possible.

No requantization is introduced to these activation functions; the input and output quantization scales are always the same. Quantization may simplify an activation function by taking advantage of the clipping behavior of saturating arithmetic.

Parameters:

attrs – Attributes of the activation function to quantize
quantization – Quantization to apply to this activation function
scalar_type – Scalar data type that the activation function will be evaluated on. It is used to initialize ReluAttrs and has to be an integer type.
quant_config – Parameters that were used to choose ‘quantization’. Used for error checking.

Returns:

Attributes of the quantized activation function. It may be a different type than the input.

afe.ir.quantization_utils.requantize_activation(attrs: afe.ir.attributes.ClipQuantAttrs | afe.ir.attributes.ReluQuantAttrs | None, zero_point: int, requantization: ml_kernels.requantization.BaseRequantization[numpy.ndarray], scalar_type: afe.ir.tensor_type.ScalarType) → afe.ir.attributes.ClipQuantAttrs | afe.ir.attributes.ReluQuantAttrs | None

Requantize an activation function.

This represents transforming the expression requant(activ(x)), where the activation is evaluated before requantization, to an equivalent expression newactiv(requant(x)), where the new activation is evaluated after requantization. The new activation could be simpler by taking advantage of integer saturation.

Parameters:

attrs – Activation function’s attributes. This must be for a quantized activation.
zero_point – Original zero point of the activation function, before requantization. Ignored if attrs is None.
requantization – Requantization to perform. The input type of the requantization is assumed to be int16.
scalar_type – ScalarType used to initialize ReluAttrs. Has to be integer type.

Returns:

Transformed activation function’s attributes (clip, relu, or nothing).
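The rewrite from requant(activ(x)) to newactiv(requant(x)) relies on the requantization being monotone non-decreasing, so a clip commutes with it once its bounds are requantized too. The sketch below checks this exhaustively for a stand-in requantization (a rounding right shift by 2) and a clip to [0, 100]; both choices, and the helper names, are assumptions made for illustration.

```python
def requant(x: int, shift: int = 2) -> int:
    """A stand-in monotone requantization: rounding right shift."""
    return (x + (1 << (shift - 1))) >> shift


def clip(x: int, lo: int, hi: int) -> int:
    return max(lo, min(hi, x))


lo, hi = 0, 100                             # clip bounds before requantization
new_lo, new_hi = requant(lo), requant(hi)   # bounds of the moved activation

# requant(activ(x)) and newactiv(requant(x)) agree on every int16 input.
same = all(
    requant(clip(x, lo, hi)) == clip(requant(x), new_lo, new_hi)
    for x in range(-32768, 32768)
)
print(new_lo, new_hi)   # 0 25
print(same)             # True
```

If a requantized bound coincides with the saturation limit of the output type, that side of the clip becomes redundant, which is how the new activation can end up simpler.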

afe.ir.quantization_utils.requantize_quantization(quantization: afe.ir.defines.Quantization, requant: ml_kernels.requantization.BaseRequantization[numpy.ndarray]) → afe.ir.defines.Quantization

Get the quantization of the result of requantizing a tensor. This would be the quantization at the output of a Requantize node, for the given input and requantization.

Parameters:

quantization – Quantization of input tensor
requant – Requantization to perform

Returns:

Quantization of the result of applying requant to the input tensor

afe.ir.quantization_utils.quantize_prelu(layer_bits: int, alpha: numpy.ndarray | float) → Tuple[int, int]

Quantize the PRelu alphas and return the quantized alphas and right shifts.

Parameters:

layer_bits – Number of bits used for quantization
alpha – Union[np.ndarray, float]. alpha in float data type

return: Tuple[np.ndarray, np.ndarray]. Tuple of (quantized alpha, right shift)

afe.ir.quantization_utils.quantize_reciprocal(input_qtype: afe.ir.attributes.QuantResultTensorType) → afe.ir.attributes.AwesomeCalibAttrs

Quantize the reciprocal part of divide.

Parameters:

input_qtype – Quantization for rhs argument of divide.

return: Calibration attributes AwesomeCalibAttrs which are used in ReciprocalOp UDF.

afe.ir.quantization_utils.quantize_lrn(attrs: afe.ir.attributes.LRNAttrs, input_quant: afe.ir.defines.Quantization, quant: afe.ir.defines.Quantization) → afe.ir.attributes.LRNQuantAttrs

Quantize LRN which is implemented based on quantized_local_response_normalization from ml_kernels repo:

out = lut(square_sum(x)) * x

where lut function is: lambda x: (bias + alpha / size * x) ** (beta)

Parameters:

attrs – LRN attributes.
input_quant – Quantization of input data
quant – Layer quantization

Returns:

Tuple[List[int], List[int], List[int]]. A tuple of (re-scaled input scales, corrected input zero points, right shifts)

afe.ir.quantization_utils.quantize_softmax(attrs: afe.ir.attributes.SoftmaxAttrs, input_quant: afe.ir.defines.Quantization, quant: afe.ir.defines.Quantization, intermediate_min_max: Dict[str, Tuple[float, float]], enable_int16: bool) → afe.ir.attributes.SoftmaxQuantAttrs

Quantize Softmax which is implemented based on softmax implementation from ml_kernels repo:

exp = lut_exp(x)  # lut_exp(x) = exp(x)
exp_sum_rec = lut_rec(np.sum(exp))  # lut_rec(x) = 1/x
ofm = exp * exp_sum_rec

Parameters:

attrs – Softmax attributes.
input_quant – Quantization of input data
quant – Layer quantization
intermediate_min_max – Dict of intermediates min/max values.
enable_int16 – Whether to use int8 or int16 quantization.

Returns:

Quantized Softmax attributes
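For orientation, here is a floating-point version of the same three steps, with `math.exp` and a plain division standing in for the two lookup tables. It is a reference sketch only (`softmax_reference` is a hypothetical name) and says nothing about how the quantized attributes are chosen.

```python
import math


def softmax_reference(x):
    """Float version of the three steps; the quantized kernel uses LUTs instead."""
    exp = [math.exp(v) for v in x]   # stand-in for lut_exp
    exp_sum_rec = 1.0 / sum(exp)     # stand-in for lut_rec
    return [e * exp_sum_rec for e in exp]


ofm = softmax_reference([0.0, 1.0, 2.0])
print([round(v, 4) for v in ofm])    # [0.09, 0.2447, 0.6652]
```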

afe.ir.quantization_utils.quantize_layer_norm(attrs: afe.ir.attributes.LayerNormAttrs, input_quant: afe.ir.defines.Quantization, quant: afe.ir.defines.Quantization, intermediate_min_max: dict[str, tuple[float, float]]) → afe.ir.attributes.LayerNormQuantAttrs

Quantize LayerNorm which is implemented based on layer norm implementation from ml_kernels repo:

LayerNorm(input, axis, epsilon) = (input - m) / Sqrt(var + epsilon), where

m = ReduceMean(input, axis, keepdims=True),
var = ReduceMean((input - m) ** 2, axis, keepdims=True).

Use LUT for reciprocal of the sqrt function.

Parameters:

attrs – LayerNormAttrs attributes.
input_quant – Quantization of input data.
quant – Layer quantization.
intermediate_min_max – Dict of intermediates min/max values.

Returns:

Quantized LayerNormAttrs attributes.
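A floating-point reference for the LayerNorm formula, reduced over a single 1-D list, may help when checking quantized results. The reciprocal square root computed here is the quantity the quantized kernel reads from a LUT; `layer_norm_reference` and the default epsilon are assumptions made for this sketch.

```python
import math


def layer_norm_reference(x, epsilon=1e-5):
    """Float LayerNorm over a 1-D list (the reduction axis is the whole list)."""
    m = sum(x) / len(x)
    var = sum((v - m) ** 2 for v in x) / len(x)
    rsqrt = 1.0 / math.sqrt(var + epsilon)   # looked up in a LUT when quantized
    return [(v - m) * rsqrt for v in x]


out = layer_norm_reference([1.0, 2.0, 3.0, 4.0])
print(abs(sum(out)) < 1e-9)   # True: the output is zero-mean
```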

afe.ir.quantization_utils.quantize_instance_norm(attrs: afe.ir.attributes.InstanceNormAttrs, input_quant: afe.ir.defines.Quantization, mean_quant: afe.ir.defines.Quantization, variance_quant: afe.ir.defines.Quantization, quant: afe.ir.defines.Quantization)

Quantize Instance Normalization operator: (input - mean) / sqrt(variance + epsilon).

Parameters:

attrs – Instance Normalization attributes.
input_quant – Quantization of the input data.
mean_quant – Quantization of the mean input data.
variance_quant – Quantization of the variance input data.
quant – Layer quantization.

Returns:

Quantized Instance Normalization attributes.

afe.ir.quantization_utils.quantize_rms_norm_core(attrs: afe.ir.attributes.RMSNormAttrs, input_quant: afe.ir.defines.Quantization, quant: afe.ir.defines.Quantization, intermediate_min_max: Dict[str, Tuple[float, float]], enable_lut_int16: bool) → afe.ir.attributes.RMSNormCoreQuantAttrs

Quantize RMS Normalization which is implemented based on rms norm implementation from ml_kernels repo:

RMSNorm(x, axis, epsilon) = x / Sqrt(ReduceMean(x ** 2, axis, keepdims=True) + epsilon)

Use LUT for reciprocal of the sqrt function.

Parameters:

attrs – RMSNorm attributes.
input_quant – Quantization of input data.
quant – Layer quantization.
intermediate_min_max – Dict of intermediates min/max values.
enable_lut_int16 – If True, quantize LUT to int16 otherwise to int8.

Returns:

Quantized RMSNorm attributes.
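The RMSNorm formula differs from LayerNorm in that no mean is subtracted. A floating-point reference over a 1-D list, with a hypothetical helper name and an assumed default epsilon:

```python
import math


def rms_norm_reference(x, epsilon=1e-6):
    """Float RMSNorm over a 1-D list (the reduction axis is the whole list)."""
    mean_sq = sum(v * v for v in x) / len(x)
    rsqrt = 1.0 / math.sqrt(mean_sq + epsilon)   # looked up in a LUT when quantized
    return [v * rsqrt for v in x]


out = rms_norm_reference([3.0, 4.0])
mean_sq_out = sum(v * v for v in out) / len(out)
print(abs(mean_sq_out - 1.0) < 1e-6)   # True: the output has unit mean square
```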

afe.ir.quantization_utils.quantization_data_value_to_output_list(quantization: afe.ir.defines.DataValue[afe.ir.defines.Quantization]) → Tuple[List[float], List[int], List[int], List[int], List[int]]

Convert a Data value of Quantization object(s) to lists of quantization-related values. This is used for interfacing to code that stores quantization information in five separate lists.

Parameters:

quantization – DataValue of Quantization object(s) to convert to quantization parameters Tuple.

Returns:

Lists of scales, zero points, bits, minimum and maximum values.

afe.ir.quantization_utils.fix_requantization(requantization: ml_kernels.requantization.BaseRequantization[numpy.ndarray]) → ml_kernels.requantization.BaseRequantization[numpy.ndarray]

Change the data type of the right_shift array, if it is present, to uint8.

afe.ir.quantization_utils.cast_calibration_inputs(values: List[numpy.ndarray], cast: afe.ir.defines.QuantizationCast)

Quantizes a list of tensors according to casts. Identity cast returns the original values.

afe.ir.quantization_utils.create_requantization_from_cast(cast: afe.ir.defines.RequantCast) → ml_kernels.requantization.BaseRequantization[numpy.ndarray]

Get the Requantization that implements the given cast.

Parameters:

cast – Cast to perform

Returns:

Requantization