afe.backends.mpk.interface
Attributes
Functions
generate_mla_plugin_mpk_data – Generate MPK JSON data for the MLA plugin.
generate_apu_plugin_mpk_data – Produce the MPK JSON data for executing an APU compiled object file using TVM's runtime.
generate_ev74_plugin_mpk_data – Produce the MPK JSON data for executing an EV74 plugin.
generate_pass_through_plugin – Generate MPK JSON data for the PassThrough plugin.
generate_input_nodes_mpk_data – Generate MPK JSON data for input nodes.
generate_output_nodes_mpk_data – Generate MPK JSON data for output nodes.
log_compilation_summary – Print compilation summary.
generate_plugins_mpk_data – Generate MPK data for MLA and EV74 plugins.
generate_mpk_json_data – Generate MPK JSON data.
generate_mpk_json_file – Generate MPK JSON file.
Module Contents
- afe.backends.mpk.interface.generate_mla_plugin_mpk_data(node: afe.ir.node.AwesomeNode, input_nodes: list[afe.backends.mpk.defines.PluginInputNodeMPKData], sequence: int, stage: int, model_name: str, desired_batch_size: int, actual_batch_size: int, output_names: dict[str, str]) → afe.backends.mpk.defines.PluginMPKData [source]
Generate MPK JSON data for the MLA plugin.
- Parameters:
node – The AwesomeNode to generate MLA plugin data for.
input_nodes – List of the node's inputs in the form of PluginInputNodeMPKData. May differ from node.input_node_names because the outputs of Tuple and TupleGetItem nodes are redirected to their inputs and the output of an Unpack node is split into multiple outputs.
sequence – Plugin’s position in the model’s execution sequence.
stage – Stage number of the graph. Every graph has a unique stage number that represents its order in the AwesomeNet.
model_name – Name of the model. Used to create an .elf file name.
desired_batch_size – Batch size requested by the user.
actual_batch_size – Input batch size value used by the compiler in .elf file generation.
output_names – Dictionary mapping Model SDK names to original model names.
- Returns:
PluginMPKData instance describing the MLA plugin.
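A minimal call sketch, assuming the node and its MPK input-node data have already been collected (this is normally done internally by generate_plugins_mpk_data, below); all concrete values here are hypothetical:

    from afe.backends.mpk.interface import generate_mla_plugin_mpk_data

    # Hypothetical inputs: `mla_node` is an AwesomeNode compiled for the MLA and
    # `plugin_inputs` is the list of PluginInputNodeMPKData describing its inputs.
    plugin_data = generate_mla_plugin_mpk_data(
        mla_node,
        input_nodes=plugin_inputs,
        sequence=0,                # position in the model's execution sequence
        stage=1,                   # stage number of the graph
        model_name="my_model",     # used to form the .elf file name (placeholder)
        desired_batch_size=1,
        actual_batch_size=1,
        output_names={},           # Model SDK name -> original model name
    )

The APU and EV74 generators below follow the same calling pattern.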
- afe.backends.mpk.interface.generate_apu_plugin_mpk_data(node: afe.ir.node.AwesomeNode, input_nodes: list[afe.backends.mpk.defines.PluginInputNodeMPKData], sequence: int, stage: int, model_name: str, output_names: dict[str, str]) → afe.backends.mpk.defines.PluginMPKData [source]
Produce the MPK JSON data for executing an APU compiled object file using TVM’s runtime.
- Parameters:
node – Node to produce code for. It must be a BackendIR node using the APU backend.
input_nodes – List of the node's inputs in the form of PluginInputNodeMPKData. May differ from node.input_node_names because the outputs of Tuple and TupleGetItem nodes are redirected to their inputs and the output of an Unpack node is split into multiple outputs.
sequence – Plugin’s position in the model’s execution sequence.
stage – Stage number of the graph. Every graph has a unique stage number that represents its order in the AwesomeNet.
model_name – Name of the model. Used to create a .so file name.
output_names – Dictionary mapping Model SDK names to original model names.
- Returns:
PluginMPKData instance describing the APU plugin.
- afe.backends.mpk.interface.generate_ev74_plugin_mpk_data(node: afe.ir.node.AwesomeNode, input_nodes: list[afe.backends.mpk.defines.PluginInputNodeMPKData], sequence: int, desired_batch_size: int, actual_batch_size: int, output_names: dict[str, str]) → afe.backends.mpk.defines.PluginMPKData [source]
Produce the MPK JSON data for executing an EV74 plugin.
- Parameters:
node – The AwesomeNode to generate EV74 plugin data for.
input_nodes – List of the AwesomeNode's inputs as they appear in the MPK JSON file, in the form of MPK JSON nodes. This list reflects how nodes were processed to eliminate tuples.
sequence – Plugin’s position in the model’s execution sequence.
desired_batch_size – Batch size requested by the user.
actual_batch_size – Batch size used in code generation.
output_names – Dictionary mapping Model SDK names to original model names.
- Returns:
PluginMPKData instance describing the EV74 plugin.
- afe.backends.mpk.interface.generate_pass_through_plugin(input_nodes: list[afe.backends.mpk.defines.InOutNodesMPKData], sequence: int, desired_batch_size: int, actual_batch_size: int) → afe.backends.mpk.defines.PluginMPKData [source]
Generate MPK JSON data for the PassThrough plugin.
- Parameters:
input_nodes – Input nodes of a plugin.
sequence – Plugin’s position in the model’s execution sequence.
desired_batch_size – Batch size requested by the user.
actual_batch_size – Batch size used in code generation.
- Returns:
PluginMPKData instance describing the PassThrough plugin.
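A sketch of building a PassThrough plugin entry, assuming input_node_data is a list of InOutNodesMPKData such as the one returned by generate_input_nodes_mpk_data (documented below); the sequence and batch sizes are placeholder values:

    from afe.backends.mpk.interface import generate_pass_through_plugin

    # `input_node_data` is assumed to be a list of InOutNodesMPKData.
    pass_through_plugin = generate_pass_through_plugin(
        input_node_data,
        sequence=0,
        desired_batch_size=1,
        actual_batch_size=1,
    )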
- afe.backends.mpk.interface.generate_input_nodes_mpk_data(net: afe.ir.net.AwesomeNet) → tuple[dict[afe.ir.defines.NodeName, afe.ir.defines.DataValue[afe.ir.defines.NodeName]], list[afe.backends.mpk.defines.InOutNodesMPKData]] [source]
Generate MPK JSON data for input nodes.
- Parameters:
net – AwesomeNet.
- Returns:
Tuple of a dictionary mapping input node names to DataValue objects of node names, and a list of InOutNodesMPKData instances describing the input nodes.
- afe.backends.mpk.interface.generate_output_nodes_mpk_data(net: afe.ir.net.AwesomeNet, output_nodes: list[afe.ir.defines.NodeName] | None = None) → list[afe.backends.mpk.defines.InOutNodesMPKData] [source]
Generate MPK JSON data for output nodes.
- Parameters:
net – AwesomeNet.
output_nodes – Optional list of output node names, used when the output is a tuple node.
- Returns:
List of InOutNodesMPKData instances describing the output nodes.
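A sketch showing how the input-node and output-node helpers are typically called together; net is assumed to be an already compiled afe.ir.net.AwesomeNet obtained elsewhere in the AFE pipeline:

    from afe.backends.mpk.interface import (
        generate_input_nodes_mpk_data,
        generate_output_nodes_mpk_data,
    )

    # `net` is assumed to be a compiled afe.ir.net.AwesomeNet.
    input_name_values, input_node_data = generate_input_nodes_mpk_data(net)
    output_node_data = generate_output_nodes_mpk_data(net)  # output_nodes defaults to None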
- afe.backends.mpk.interface.log_compilation_summary(data: afe.backends.mpk.defines.AwesomeNetMPKData, desired_batch_size: int, filenames: list[str]) → None [source]
Print compilation summary.
- Parameters:
data – AwesomeNetMPKData instance.
desired_batch_size – Batch size requested by the user.
filenames – List of file names generated after compilation.
- Returns:
None
- afe.backends.mpk.interface.generate_plugins_mpk_data(net: afe.ir.net.AwesomeNet, desired_batch_size: int, actual_batch_size: int) → tuple[list[afe.backends.mpk.defines.PluginMPKData], list[afe.backends.mpk.defines.InOutNodesMPKData]] [source]
Generate MPK data for MLA and EV74 plugins.
- Parameters:
net – AwesomeNet.
desired_batch_size – Batch size requested by the user.
actual_batch_size – Batch size used in code generation.
- Returns:
Tuple of a list of PluginMPKData instances and a list of InOutNodesMPKData instances.
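A sketch of generating plugin data for a whole network; net is assumed to be a compiled afe.ir.net.AwesomeNet and the batch sizes are placeholders:

    from afe.backends.mpk.interface import generate_plugins_mpk_data

    # Returns the plugin entries and the input/output node entries for the MPK JSON.
    plugin_data, in_out_node_data = generate_plugins_mpk_data(
        net,
        desired_batch_size=1,
        actual_batch_size=1,
    )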
- afe.backends.mpk.interface.generate_mpk_json_data(net: afe.ir.net.AwesomeNet, elf_file_path: str, batch_size: int, compress: bool, tessellate_parameters: afe.core.compile_networks.TessellateParameters | None, enable_large_tensors: bool = True, l2_caching_mode: afe.backends.mla.afe_to_n2a_compiler.defines.L2CachingMode = L2CachingMode.NONE, mlc_files_path: str | None = None, use_power_limits: bool = False, max_mla_power: float | None = None, layer_norm_use_fp32_intermediates: bool = False, rms_norm_use_fp32_intermediates: bool = False) → afe.backends.mpk.defines.AwesomeNetMPKData [source]
Generate MPK JSON data.
- Parameters:
net – AwesomeNet.
elf_file_path – ELF file directory path.
batch_size – The batch size of the input to the model.
compress – If True, the .mlc file is compressed before generating the .elf file.
tessellate_parameters – Dictionary defining the tessellation parameters for inputs and outputs of the MLA segments.
enable_large_tensors – If True, the MLA will handle large tensors; otherwise, large tensors will raise an exception.
l2_caching_mode – Specifies the mode of L2 caching in the n2a compiler.
mlc_files_path – Path for the .mlc files. If provided, the .mlc files will be saved.
use_power_limits – If True, the MLA compiler will schedule instructions to conform to power limits.
max_mla_power – Set to a positive float value to override the default maximum MLA power when power limits are used.
layer_norm_use_fp32_intermediates – Use FP32 intermediate tensors in the BF16 LayerNorm kernel.
rms_norm_use_fp32_intermediates – Use FP32 intermediate tensors in the BF16 RMSNorm kernel.
- Returns:
AwesomeNetMPKData instance.
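A sketch of producing the MPK data and printing a compilation summary; net is assumed to be a compiled afe.ir.net.AwesomeNet, and the directory and file name below are placeholders:

    from afe.backends.mpk.interface import (
        generate_mpk_json_data,
        log_compilation_summary,
    )

    # `net` is assumed to be a compiled afe.ir.net.AwesomeNet; paths are placeholders.
    mpk_data = generate_mpk_json_data(
        net,
        elf_file_path="./compiled",
        batch_size=1,
        compress=True,
        tessellate_parameters=None,
    )
    log_compilation_summary(
        mpk_data,
        desired_batch_size=1,
        filenames=["model_stage1_mla.elf"],  # hypothetical generated file names
    )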
- afe.backends.mpk.interface.generate_mpk_json_file(net: afe.ir.net.AwesomeNet, file_path: str, batch_size: int = 1, compress: bool = True, tessellate_parameters: afe.core.compile_networks.TessellateParameters | None = None, enable_large_tensors: bool = True, l2_caching_mode: afe.backends.mla.afe_to_n2a_compiler.defines.L2CachingMode = L2CachingMode.NONE, mlc_files_path: str | None = None, use_power_limits: bool = False, max_mla_power: float | None = None, layer_norm_use_fp32_intermediates: bool = False, rms_norm_use_fp32_intermediates: bool = False) → None [source]
Generate MPK JSON file.
- Parameters:
net – AwesomeNet.
file_path – ELF file directory path.
batch_size – The batch size of the input to the model.
compress – If True, the .mlc file is compressed before generating the .elf file.
tessellate_parameters – Dictionary defining the tessellation parameters for inputs and outputs of the MLA segments.
enable_large_tensors – If True, the MLA will handle large tensors; otherwise, large tensors will raise an exception.
l2_caching_mode – Specifies the mode of L2 caching in the n2a compiler.
mlc_files_path – Path for the .mlc files. If provided, the .mlc files will be saved.
use_power_limits – If True, the MLA compiler will schedule instructions to conform to power limits.
max_mla_power – Set to a positive float value to override the default maximum MLA power when power limits are used.
layer_norm_use_fp32_intermediates – Use FP32 intermediate tensors in the BF16 LayerNorm kernel.
rms_norm_use_fp32_intermediates – Use FP32 intermediate tensors in the BF16 RMSNorm kernel.
- Returns:
None
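A minimal end-to-end sketch relying on the default keyword arguments; net is assumed to be a compiled afe.ir.net.AwesomeNet and "./compiled" is a placeholder output directory:

    from afe.backends.mpk.interface import generate_mpk_json_file

    # Writes the MPK JSON file for the compiled network into the given directory.
    generate_mpk_json_file(net, file_path="./compiled", batch_size=1, compress=True)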