afe.ir.defines

Attributes

AwesomeDataLayout

AwesomeDataLayout5D

AwesomeConvWeightLayout

AwesomeConvWeightLayout5D

AwesomeDepthwiseConvWeightLayout

AwesomeDepthwiseConvWeightLayout5D

AwesomeTransposeConvWeightLayout5D

NoneType

NodeName

InputName

TensorFormat

InputShape

ConvPad

AwesomePad2D

AwesomeStrides2D

AwesomeDilation2D

AwesomePoolSize2D

AwesomePad3D

AwesomeStrides3D

AwesomeDilation3D

AwesomePoolSize3D

AwesomePad

AwesomeStrides

AwesomeDilation

AwesomePoolSize

Float

QuantizedTensor

QuantizedTensorNew

QuantizedTensorInt16

QuantizedParam

Classes

Status

Status for AwesomeNode

DataValue

An abstract value in a network. The type parameter represents the data type that stands in for a tensor value.

TensorValue

An abstract value associated with a tensor in a network.

TupleValue

An abstract value associated with a tuple in a network.

DataIndex

The position of an A within a DataValue[A]. This is an algebraic data type.

TensorIndex

Identifies the single value in a TensorValue.

TupleIndex

Identifies a position in a TupleValue.

NodeAssociatedValue

A set of abstract values associated with a network node's inputs and outputs.

RequantizationMode

A way of doing quantized arithmetic. Different modes make different arithmetic simplifications, embodying different speed/accuracy tradeoffs.

Quantization

A quantization scale. It represents an encoding of real numbers r as integers q.

RequantMethod

A requantization method as defined in ml_kernels. This enum is used to select the type of requantization when a network is quantized.

QuantizationCast

A quantization-related conversion on data. When the algorithm detects that a conversion needs to be inserted, it is recorded using this class.

IdentityCast

A conversion that does nothing. It represents the case where no conversion is needed.

QuantCast

A quantization cast. It represents a cast of a tensor having the given shape from float32 to an integer type.

DequantCast

A quantization cast. It represents a cast of a tensor having the given shape from an integer type to float32.

RequantCast

A quantization cast. It represents a cast of a tensor having the given shape from int32 to int16 or int8.

ConvertCast

A numeric conversion. It represents a conversion from one numeric type to the nearest approximation in another numeric type.

TupleCast

A tuple cast. It applies a cast to each element of the tuple.

InputsQuantCast

A set of quantization casts to apply to a node's inputs. The dict has an entry for each input.

QuantizationCasts

A set of quantization casts to apply to a model. The casts are collected during a traversal of the model, then applied afterwards.

LayerStats

Layer statistics. For each MLA node, quantization error is calculated and forwarded to the .sima.json file.

NodeReporter

A node reporter to display information or warning messages about a node during transformations

LogNodeReporter

A node reporter to display information or warning messages about a node during transformations

BiasCorrectionType

A bias correction method for convolution.

Functions

foreach_data_value(→ None)

Apply a function to each tensor value in a DataValue.

data_value_elements(→ List[_TENSOR])

Get all tensor values in a DataValue.

get_expected_tensor_value(→ _TENSOR)

Get a value from DataValue while expecting that the type of the DataValue is a TensorValue.

get_expected_tuple_values(→ List[_TENSOR])

Get a list of values from DataValue while expecting that the type of the DataValue is a non-nested TupleValue.

reduce_data_value(→ _A)

Combine all values in a DataValue using the given function.

map_data_value(→ DataValue[_B])

Transform each tensor value in a DataValue according to the given function, and return the results as a new DataValue.

zip_data_value(→ DataValue[_C])

Apply f to each pair of tensor values at the same positions in x and y, returning the results as a new DataValue.

reconstruct_data_value(→ DataValue[_TENSOR])

Convert a list to a DataValue, using heuristics to guess the data structure. This function is provided for compatibility with existing code.

index_data_value(→ _TENSOR)

Get the value at the given index.

Module Contents

afe.ir.defines.AwesomeDataLayout = 'NHWC'[source]
afe.ir.defines.AwesomeDataLayout5D = 'NDHWC'[source]
afe.ir.defines.AwesomeConvWeightLayout = 'HWIO'[source]
afe.ir.defines.AwesomeConvWeightLayout5D = 'DHWIO'[source]
afe.ir.defines.AwesomeDepthwiseConvWeightLayout = 'HWOI'[source]
afe.ir.defines.AwesomeDepthwiseConvWeightLayout5D = 'DHWOI'[source]
afe.ir.defines.AwesomeTransposeConvWeightLayout5D = 'DHWOI'[source]
afe.ir.defines.NoneType[source]
afe.ir.defines.NodeName[source]
afe.ir.defines.InputName[source]
afe.ir.defines.TensorFormat[source]
afe.ir.defines.InputShape[source]
afe.ir.defines.ConvPad[source]
afe.ir.defines.AwesomePad2D[source]
afe.ir.defines.AwesomeStrides2D[source]
afe.ir.defines.AwesomeDilation2D[source]
afe.ir.defines.AwesomePoolSize2D[source]
afe.ir.defines.AwesomePad3D[source]
afe.ir.defines.AwesomeStrides3D[source]
afe.ir.defines.AwesomeDilation3D[source]
afe.ir.defines.AwesomePoolSize3D[source]
afe.ir.defines.AwesomePad[source]
afe.ir.defines.AwesomeStrides[source]
afe.ir.defines.AwesomeDilation[source]
afe.ir.defines.AwesomePoolSize[source]
afe.ir.defines.Float[source]
afe.ir.defines.QuantizedTensor[source]
afe.ir.defines.QuantizedTensorNew[source]
afe.ir.defines.QuantizedTensorInt16[source]
afe.ir.defines.QuantizedParam[source]
class afe.ir.defines.Status[source]

Status for AwesomeNode

RELAY: Right after parsing from the TVM Relay IR module
CALIBRATED: Calibrated
SIMA_QUANTIZED: SiMa quantized
BACKEND_IR_LOWERED: After lowering MLA subgraphs to SiMa BackendIR
BACKEND_IR_COMPILED: After compilation using compile_awesomenet

RELAY = 'RELAY'[source]
CALIBRATED = 'CALIBRATED'[source]
SIMA_QUANTIZED = 'SIMA_QUANTIZED'[source]
BACKEND_IR_LOWERED = 'BACKEND_IR_LOWERED'[source]
BACKEND_IR_COMPILED = 'BACKEND_IR_COMPILED'[source]
class afe.ir.defines.DataValue[source]

An abstract value in a network. The type parameter represents the data type that stands in for a tensor value.

class afe.ir.defines.TensorValue[source]

An abstract value associated with a tensor in a network.

value: _TENSOR[source]
class afe.ir.defines.TupleValue[source]

An abstract value associated with a tuple in a network. An abstract value is associated with each element of the tuple.

elements: List[DataValue[_TENSOR]][source]
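
The DataValue/TensorValue/TupleValue hierarchy above forms an algebraic data type. A minimal illustrative sketch with stand-in classes (mirroring, but not taken from, the actual afe.ir.defines implementation):

```python
from dataclasses import dataclass
from typing import Generic, List, TypeVar, Union

_TENSOR = TypeVar("_TENSOR")

@dataclass
class TensorValue(Generic[_TENSOR]):
    # The abstract value for a single tensor.
    value: _TENSOR

@dataclass
class TupleValue(Generic[_TENSOR]):
    # One nested DataValue per tuple element.
    elements: List["DataValue[_TENSOR]"]

# A DataValue[A] is either a TensorValue[A] or a TupleValue[A].
DataValue = Union[TensorValue[_TENSOR], TupleValue[_TENSOR]]

# Example: the abstract value of a node producing a pair of tensors,
# using shapes as the stand-in tensor data.
v = TupleValue([TensorValue((1, 224, 224, 3)), TensorValue((1, 1000))])
```

The tuple case nests, so a DataValue can describe arbitrarily nested tuples of tensors.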
afe.ir.defines.foreach_data_value(f: Callable[[_TENSOR], None], v: DataValue[_TENSOR]) None[source]

Apply a function to each tensor value in a DataValue.

Parameters:
  • f – Function to apply

  • v – DataValue to traverse

afe.ir.defines.data_value_elements(v: DataValue[_TENSOR]) List[_TENSOR][source]

Get all tensor values in a DataValue.

Since the DataValue structure is ignored, this function is only suitable when it doesn’t matter where the tensor values are located inside the DataValue.

afe.ir.defines.get_expected_tensor_value(v: DataValue[_TENSOR]) _TENSOR[source]

Get a value from DataValue while expecting that the type of the DataValue is a TensorValue.

afe.ir.defines.get_expected_tuple_values(v: DataValue[_TENSOR]) List[_TENSOR][source]

Get a list of values from DataValue while expecting that the type of the DataValue is a non-nested TupleValue.

afe.ir.defines.reduce_data_value(f: Callable[[_A, _TENSOR], _A], v: DataValue[_TENSOR], initial: _A) _A[source]

Combine all values in a DataValue using the given function.

Parameters:
  • f – Combining function

  • v – DataValue to traverse

  • initial – Initial value of result

Returns:

Combined value

afe.ir.defines.map_data_value(f: Callable[[_A], _B], v: DataValue[_A]) DataValue[_B][source]

Transform each tensor value in a DataValue according to the given function, and return the results as a new DataValue.

Parameters:
  • f – Function to apply

  • v – DataValue to transform

Returns:

DataValue with all tensor values transformed
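
A sketch of how map_data_value plausibly recurses over the two DataValue cases (stand-in classes defined locally; the real ones live in afe.ir.defines):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class TensorValue:
    value: object

@dataclass
class TupleValue:
    elements: List[object]

def map_data_value(f: Callable, v):
    # Apply f to the single tensor value, or recurse into each tuple element,
    # preserving the DataValue structure.
    if isinstance(v, TensorValue):
        return TensorValue(f(v.value))
    if isinstance(v, TupleValue):
        return TupleValue([map_data_value(f, e) for e in v.elements])
    raise TypeError(f"Unexpected DataValue: {v!r}")

doubled = map_data_value(lambda x: x * 2, TupleValue([TensorValue(3), TensorValue(4)]))
```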

afe.ir.defines.zip_data_value(f: Callable[[_A, _B], _C], x: DataValue[_A], y: DataValue[_B]) DataValue[_C][source]

Apply f to each pair of tensor values at the same positions in x and y, which must have the same shape. Return the results as a new DataValue having the same shape as x and y.

Parameters:
  • f – Function to apply

  • x – DataValue to transform

  • y – DataValue to transform

Returns:

Transformed data

afe.ir.defines.reconstruct_data_value(values: List[_TENSOR]) DataValue[_TENSOR][source]

Convert a list to a DataValue, using heuristics to guess the data structure. This function is provided for compatibility with existing code that does not keep track of the data structure.

If the list has one item, it’s treated as representing a single tensor. If it has many items, it’s treated as representing a tuple of tensors.

Parameters:

values – Values to interpret as a DataValue
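
The heuristic described above can be sketched as follows (illustrative stand-in classes, not the library code):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TensorValue:
    value: object

@dataclass
class TupleValue:
    elements: List[object]

def reconstruct_data_value(values):
    # One item -> treated as a single tensor; several items -> a tuple of tensors.
    if len(values) == 1:
        return TensorValue(values[0])
    return TupleValue([TensorValue(v) for v in values])

single = reconstruct_data_value(["t0"])
many = reconstruct_data_value(["t0", "t1"])
```

Note the heuristic cannot recover nested tuples, which is why the docstring frames it as a compatibility aid rather than a general inverse of data_value_elements.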

class afe.ir.defines.DataIndex[source]

The position of an A within a DataValue[A]. This is an algebraic data type.

class afe.ir.defines.TensorIndex[source]

Identifies the single value in a TensorValue.

class afe.ir.defines.TupleIndex[source]

Identifies a position in a TupleValue.

index: int[source]
nested_index: DataIndex[source]
afe.ir.defines.index_data_value(v: DataValue[_TENSOR], i: DataIndex) _TENSOR[source]

Get the value at the given index.
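
How the DataIndex cases above plausibly drive the lookup (a sketch with local stand-in classes): TensorIndex terminates at a TensorValue, while TupleIndex selects element `index` and recurses with `nested_index`.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TensorValue:
    value: object

@dataclass
class TupleValue:
    elements: List[object]

@dataclass
class TensorIndex:
    pass

@dataclass
class TupleIndex:
    index: int
    nested_index: object  # a DataIndex for the selected element

def index_data_value(v, i):
    if isinstance(i, TensorIndex):
        # A TensorIndex must point at a TensorValue.
        assert isinstance(v, TensorValue)
        return v.value
    # Otherwise select the tuple element and keep indexing into it.
    assert isinstance(v, TupleValue)
    return index_data_value(v.elements[i.index], i.nested_index)

nested = TupleValue([TensorValue("a"), TupleValue([TensorValue("b")])])
result = index_data_value(nested, TupleIndex(1, TupleIndex(0, TensorIndex())))
```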

class afe.ir.defines.NodeAssociatedValue[source]

A set of abstract values associated with a network node's inputs and outputs.

Input values are held in an ordered dictionary mapping strings to data values. Inputs can be examined positionally or by name.

The output value is a single data value.

inputs: Dict[NodeName | InputName, DataValue[_TENSOR]][source]
output: DataValue[_TENSOR][source]
class afe.ir.defines.RequantizationMode[source]

A way of doing quantized arithmetic. Different modes make different arithmetic simplifications, embodying different speed/accuracy tradeoffs. TFLite-style quantization is expected to give better accuracy, while SiMa-style quantization runs faster. The requantization mode only applies to convolution operators.

sima[source]
tflite[source]
class afe.ir.defines.Quantization[source]

A quantization scale. It represents an encoding of real numbers r as integers q where:

L = -2^(bits-1)                               (integer range lower bound)
U = 2^(bits-1) - 1                            (integer range upper bound)
q_unbounded = round((r * scale) + zero_point) (linear mapping to representable range)
q = max(L, min(U, q_unbounded))               (clip to range)

Fields min_val and max_val give the range of floating-point values that are represented, for instance the range that was selected by calibration. This range must be representable within the integer range, that is,

L <= round((min_val * scale) + zero_point) <= round((max_val * scale) + zero_point) <= U

Often it spans the entire range from L to U. It may be smaller if the range was expanded due to constraints on the quantized representation, such as when using symmetric quantization for a numeric range that is not symmetric. If a larger numeric range was clipped when quantizing, min_val and max_val still describe the representable range and not the original range. When a tensor contains only zero, scale is set to 0. and min_val = max_val = 0.

The default values represent quantization of the floating-point range [-128, 127] using the integer range [-128, 127].

scale: float = 1.0[source]
zero_point: int = 0[source]
bits: int = 8[source]
min_val: float = -128.0[source]
max_val: float = 127.0[source]
static representable(scale: float, zero_point: int, bits: int) Quantization[source]

Create a quantization scale that includes the entire representable integer range. See Quantization for documentation of the parameters. For zero tensors, scale is 0. and min_val = max_val = 0.

Parameters:
  • scale – Quantization scale.

  • zero_point – Quantization zero point.

  • bits – Quantization bits.

Returns:

Quantization scale constructed from the given parameters.

class afe.ir.defines.RequantMethod[source]

A requantization method as defined in ml_kernels. This enum is used to select which type of requantization to use when a network is quantized.

fractional_zero[source]
arith_folded[source]
scaled_fz[source]
class afe.ir.defines.QuantizationCast[source]

A quantization-related conversion on data. When the algorithm detects that a conversion needs to be inserted in a model graph, it’s recorded using this class.

This is an algebraic data type.

class afe.ir.defines.IdentityCast[source]

A conversion that does nothing. It represents the case where no conversion is needed.

class afe.ir.defines.QuantCast[source]

A quantization cast. It represents a cast of a tensor having the given shape from float32 to int8 or int32 by computing round(r * scale + zero_point).

shape: Tuple[int, Ellipsis][source]
scale: float[source]
zero_point: int[source]
num_bits: int[source]
out_type: afe.ir.tensor_type.ScalarType[source]
class afe.ir.defines.DequantCast[source]

A quantization cast. It represents a cast of a tensor having the given shape from an integer type to float32 by computing (q - zero_point) / scale.

Parameters:
  • shape – Shape of tensor to dequantize

  • scale – Quantization scale

  • zero_point – Quantization zero point

  • input_dtype – Input data type. The valid Numpy data types are: np.int8, np.int16, or np.int32.

shape: Tuple[int, Ellipsis][source]
scale: float[source]
zero_point: int[source]
input_dtype: numpy.dtype[source]
output_dtype: numpy.dtype[source]
class afe.ir.defines.RequantCast[source]

A quantization cast. It represents a cast of a tensor having the given shape from an int32 type to int16/int8.

Parameters:
  • shape – Shape of a tensor

  • in_scale – Input quantization scale

  • in_zero_point – Input quantization zero point

  • out_scale – Output quantization scale

  • out_zero_point – Output quantization zero point

  • input_32_bit – If True, the input type is int32. If False, the input type is int16.

  • output_type – Output data type, can be int16 or int8

  • requantization_type – Type of requantization to use. If arith_folded is used, then the requantization will use only a shift; the scales and zero points must be related by a power of 2 factor to minimize rounding error.

shape: Tuple[int, Ellipsis][source]
in_scale: float[source]
in_zero_point: int[source]
out_scale: float[source]
out_zero_point: int[source]
min_val: float[source]
max_val: float[source]
input_32_bit: bool[source]
output_16_bit: bool[source]
requant_method: RequantMethod[source]
get_input_quantization() Quantization[source]
get_output_quantization() Quantization[source]
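
As a rough illustration of the arith_folded case described above (requantization using only a shift, when input and output scales are related by a power of two): the actual behavior is defined in ml_kernels, so this is an assumption-laden sketch, with zero points taken as zero and round-half-up rounding.

```python
def requant_shift(q32: int, shift: int, bits: int = 8) -> int:
    # int32 -> narrow integer: add half the shifted-out range to round,
    # shift right, then clip to the output integer range.
    low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    rounded = (q32 + (1 << (shift - 1))) >> shift
    return max(low, min(high, rounded))

requant_shift(1000, 4)   # (1000 + 8) >> 4 = 63
```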
class afe.ir.defines.ConvertCast[source]

A numeric conversion. It represents a conversion from one numeric type to the nearest approximation in another numeric type.

Parameters:
  • shape – Shape of a tensor

  • in_type – Scalar type of input

  • out_type – Scalar type of output

shape: Tuple[int, Ellipsis][source]
in_type: afe.ir.tensor_type.ScalarType[source]
out_type: afe.ir.tensor_type.ScalarType[source]
class afe.ir.defines.TupleCast[source]

A tuple cast. It applies a cast to each element of the tuple.

elements: List[QuantizationCast][source]
class afe.ir.defines.InputsQuantCast[source]

A set of quantization casts to apply to a node’s inputs. The dict has an entry for each input.

casts: Dict[InputName, QuantizationCast][source]
does_nothing() bool[source]

Return true if this cast does nothing.

class afe.ir.defines.QuantizationCasts[source]

A set of quantization casts to apply to a model. The casts are collected during a traversal of the model, then applied after the traversal is finished.

Field casts holds the casts to apply to node inputs. If a node does not need casts, it is omitted.

casts: Dict[NodeName, InputsQuantCast][source]
insert(node: NodeName, cast: InputsQuantCast)[source]
class afe.ir.defines.LayerStats[source]

Layer statistics. For each MLA node, quantization error is calculated; that information is then forwarded to the .sima.json file, and it can be viewed in Netron.

Parameters:
  • metric – Metric that is used for calculating error value.

  • error_value – Error value.

metric: str[source]
error_value: float[source]
class afe.ir.defines.NodeReporter[source]

A node reporter to display information or warning messages about a node during transformations

abstract info(msg: str)[source]
abstract debug(msg: str)[source]
abstract warn(msg: str)[source]
class afe.ir.defines.LogNodeReporter(node_name: NodeName)[source]

A node reporter to display information or warning messages about a node during transformations

Parameters:

node_name – Name of the node

node_name: NodeName[source]
info(msg: str)[source]
debug(msg: str)[source]
warn(msg: str)[source]
class afe.ir.defines.BiasCorrectionType[source]

A bias correction method for convolution.

REGULAR: Bias correction using input mean estimated during calibration
ITERATIVE: Bias correction using input mean estimated by executing the quantized model with a set of calibration inputs
NONE: No bias correction

REGULAR = 'REGULAR'[source]
ITERATIVE = 'ITERATIVE'[source]
NONE = 'NONE'[source]