afe.core.compile_networks

Classes

`APUCompilerConfig`
`BackendCompilerConfig`	Parameters controlling how to run backend compilers for a network.

Functions

`compile_network`(→ None)	Compile the quantized AwesomeNet using run_l1_based_model.
`get_zip_file_path`(→ str)	Construct the name of the tar.gz archive.
`compute_checksum`(→ str)	Compute the SHA-256 checksum of a file.
`compile_net_to_elf`(→ tuple[int, float])	Compile parts of a network to object code.
`compile_backend_code`(→ int)	Compile the nodes in an AwesomeNet that contain BackendIR.

Module Contents

afe.core.compile_networks.compile_network(net: afe.ir.net.AwesomeNet, model_config: afe.core.configs.ModelConfigs, opt_config: afe.core.configs.OptimizationConfigs, enable_large_tensors: bool = True) → None

Compile the quantized AwesomeNet using run_l1_based_model. Generate MLC files for each layer and save them to output_dir. Save the YAML if the SIMA_AFE_SAVED_FILES environment variable is set to 1.

This function is deprecated. Use translate_sub_awesome_net_to_modelgraph and compile_awesomenet to compile the AwesomeNet.

Parameters:

net – A quantized AwesomeNet.
model_config – A ModelConfigs instance containing model related information and status.
opt_config – Optimization configuration parameters.
enable_large_tensors – If True, the MLA will handle large tensors; otherwise, large tensors will raise an exception.

afe.core.compile_networks.get_zip_file_path(output_dir: str, network_name: str) → str

Construct the name of the tar.gz archive.

Parameters:

output_dir – Path in which the archive should be created.
network_name – Name of the model.

Returns:

String that represents the name of the archive.

afe.core.compile_networks.compute_checksum(file_path: str) → str

Compute the SHA-256 checksum of a file.

Parameters:

file_path – Path to the file.

Returns:

Hexadecimal checksum string.
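A minimal sketch of how a SHA-256 file checksum of this kind can be computed with the Python standard library. The function name `sha256_of_file` and the `chunk_size` parameter are illustrative assumptions, not part of the afe API, and the actual implementation of compute_checksum may differ.

```python
import hashlib


def sha256_of_file(file_path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal SHA-256 digest of the file at file_path.

    Hypothetical helper for illustration; not the afe implementation.
    """
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read fixed-size chunks so that large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is a 64-character hexadecimal string, which can be compared against a published checksum to verify that a compiled archive was not corrupted in transit.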

afe.core.compile_networks.compile_net_to_elf(net: afe.ir.net.AwesomeNet, output_elf_path: str, desired_batch_size: int = 1, compress: bool = True, tessellate_parameters: afe.backends.mla.afe_to_n2a_compiler.n2a_compiler_operations.TessellateParameters | None = None, compute_dcmp_ratio: bool = False, enable_large_tensors: bool = True, l2_caching_mode: afe.backends.mla.afe_to_n2a_compiler.n2a_compiler_operations.L2CachingMode = L2CachingMode.NONE, mlc_files_path: str | None = None, do_pack: bool = True, use_power_limits: bool = False, max_power: float | None = None, layer_norm_use_fp32_intermediates: bool = False, rms_norm_use_fp32_intermediates: bool = False) → tuple[int, float]

Compile parts of a network to object code. Use the Production Compiler for the MLA. Use TVM for the APU.

Parameters:

net – An AwesomeNet.
output_elf_path – Path in which output files should be created.
desired_batch_size – The desired batch size of the input to the model. The compiler may use a smaller value if it cannot support the desired value. The batch size actually used is returned as the first element of the returned tuple.
compress – If True, the .mlc file is compressed before the .elf file is generated.
tessellate_parameters – Dictionary defining the tessellation parameters for inputs and outputs of the MLA segments.
compute_dcmp_ratio – If True, the function calculates and returns dcmp_ratio. Used only in get_performance_metrics.
enable_large_tensors – If True, the MLA will handle large tensors; otherwise, large tensors will raise an exception.
l2_caching_mode – Specifies the L2 caching mode used by the n2a compiler.
mlc_files_path – Path for .mlc files. If provided, the .mlc files will be saved.
do_pack – Whether to produce a tar.gz archive containing compiled files. If True, produce an archive file that contains the compiled files. If False, produce the compiled files without archiving them.
use_power_limits – If True, the compiler will schedule instructions to conform to power limits.
max_power – Set to a positive float value to override the default maximum power when power limits are used.
layer_norm_use_fp32_intermediates – Use FP32 intermediate tensors in the BF16 LayerNorm kernel.
rms_norm_use_fp32_intermediates – Use FP32 intermediate tensors in the BF16 RMSNorm kernel.

Returns:

A tuple[int, float]. The first value (int) is the batch size used by the compiler. The second value (float) is the data compression ratio. If compute_dcmp_ratio is True, the function computes dcmp_ratio; otherwise it returns 0.0, and this value should be ignored.
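Because the second element of the returned tuple is only meaningful when compute_dcmp_ratio is True, callers may want to discard it otherwise. The following is a hedged sketch of one way to do that; `interpret_compile_result` is a hypothetical helper and not part of the afe API.

```python
from typing import Optional, Tuple


def interpret_compile_result(
    result: Tuple[int, float], compute_dcmp_ratio: bool
) -> Tuple[int, Optional[float]]:
    """Split a (batch_size, dcmp_ratio) tuple into usable values.

    Hypothetical helper: replaces the placeholder ratio with None when
    compute_dcmp_ratio was False, so it cannot be mistaken for a real value.
    """
    batch_size, dcmp_ratio = result
    return batch_size, (dcmp_ratio if compute_dcmp_ratio else None)
```

For example, passing the tuple returned by compile_net_to_elf together with the same compute_dcmp_ratio flag yields the batch size actually used and either a real compression ratio or None.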

class afe.core.compile_networks.APUCompilerConfig

class afe.core.compile_networks.BackendCompilerConfig

Parameters controlling how to run backend compilers for a network.

If optional backends are omitted, the graph being compiled must not have any nodes that use that backend.

output_dir: Path of directory where compiled files will be created.

temp_dir: Path of directory where temporary files will be created. The temporary directory may be deleted after compilation. This path may be the same as output_dir.

desired_batch_size: The AwesomeNet inputs’ desired batch size to be used in compilation. Compilation will query the backends for the batch size that they can support for the entire AwesomeNet. It will choose the largest supported batch size that is no larger than the desired batch size.

mla: Configuration for MLA compiler

apu: Configuration for APU compiler

output_dir: str

temp_dir: str

desired_batch_size: int = 1

mla: afe.backends.mla.afe_to_n2a_compiler.n2a_compiler_operations.MLACompilerConfig

apu: APUCompilerConfig | None = None
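The batch size rule described for desired_batch_size can be illustrated with a small sketch. It assumes, purely for illustration, that each backend reports a set of batch sizes it can support for the whole network; the function `choose_batch_size` and this data model are hypothetical and do not reflect how afe actually queries its backends.

```python
from typing import Iterable, Optional, Set


def choose_batch_size(desired: int, supported_per_backend: Iterable[Set[int]]) -> int:
    """Pick the largest batch size supported by every backend and <= desired.

    Hypothetical model: each element of supported_per_backend is the set of
    batch sizes one backend can support for the entire network.
    """
    common: Optional[Set[int]] = None
    for sizes in supported_per_backend:
        # Keep only the sizes that every backend seen so far can support.
        common = set(sizes) if common is None else common & set(sizes)
    candidates = [b for b in (common or set()) if b <= desired]
    if not candidates:
        raise ValueError("no supported batch size is <= the desired batch size")
    return max(candidates)
```

Under this model, a desired batch size of 8 with one backend supporting {1, 2, 4, 8, 16} and another supporting {1, 2, 4} resolves to 4, which matches the documented behavior of returning a batch size equal to or smaller than the one requested.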

afe.core.compile_networks.compile_backend_code(config: BackendCompilerConfig, net: afe.ir.net.AwesomeNet) → int

Compile the nodes in an AwesomeNet that contain BackendIR.

For the MLA backend, other parts of the model graph are modified to support changes in the code’s behavior when it is compiled by the Production Compiler.

Parameters:

config – Parameters controlling how to run backends.
net – Network whose backend code will be compiled. The network is modified.

Returns:

The batch size of the compiled code. It is equal to or smaller than the batch size in config.