afe.backends.mpk.interface
==========================

.. py:module:: afe.backends.mpk.interface


Attributes
----------

.. autoapisummary::

   afe.backends.mpk.interface.InOutMPKDataDict


Functions
---------

.. autoapisummary::

   afe.backends.mpk.interface.generate_mla_plugin_mpk_data
   afe.backends.mpk.interface.generate_apu_plugin_mpk_data
   afe.backends.mpk.interface.generate_ev74_plugin_mpk_data
   afe.backends.mpk.interface.generate_pass_through_plugin
   afe.backends.mpk.interface.generate_input_nodes_mpk_data
   afe.backends.mpk.interface.generate_output_nodes_mpk_data
   afe.backends.mpk.interface.log_compilation_summary
   afe.backends.mpk.interface.generate_plugins_mpk_data
   afe.backends.mpk.interface.generate_mpk_json_data
   afe.backends.mpk.interface.generate_mpk_json_file


Module Contents
---------------

.. py:data:: InOutMPKDataDict

.. py:function:: generate_mla_plugin_mpk_data(node: afe.ir.node.AwesomeNode, input_nodes: list[afe.backends.mpk.defines.PluginInputNodeMPKData], sequence: int, stage: int, model_name: str, desired_batch_size: int, actual_batch_size: int, output_names: dict[str, str]) -> afe.backends.mpk.defines.PluginMPKData

   Generate MPK JSON data for an MLA plugin.

   :param node: AwesomeNode.
   :param input_nodes: List of the AwesomeNode's input node names. May differ from
                       node.input_node_names, because the outputs of Tuple and TupleGetItem
                       nodes are redirected to their inputs and the output of an Unpack node
                       is split into multiple outputs.
   :param sequence: Plugin's position in the model's execution sequence.
   :param stage: Stage number of the graph. Every graph has a unique stage number that
                 represents its order in the AwesomeNet.
   :param model_name: Name of the model. Used to create an .elf file name.
   :param desired_batch_size: Batch size requested by the user.
   :param actual_batch_size: Input batch size value used by the compiler in .elf file generation.
   :param output_names: Dictionary mapping Model SDK names to original model names.

   :returns: PluginMPKData class.


.. py:function:: generate_apu_plugin_mpk_data(node: afe.ir.node.AwesomeNode, input_nodes: list[afe.backends.mpk.defines.PluginInputNodeMPKData], sequence: int, stage: int, model_name: str, output_names: dict[str, str]) -> afe.backends.mpk.defines.PluginMPKData

   Produce the MPK JSON data for executing an APU compiled object file using TVM's runtime.

   :param node: Node to produce code for. It must be a BackendIR node using the APU backend.
   :param input_nodes: List of the AwesomeNode's input node names. May differ from
                       node.input_node_names, because the outputs of Tuple and TupleGetItem
                       nodes are redirected to their inputs and the output of an Unpack node
                       is split into multiple outputs.
   :param sequence: Plugin's position in the model's execution sequence.
   :param stage: Stage number of the graph. Every graph has a unique stage number that
                 represents its order in the AwesomeNet.
   :param model_name: Name of the model. Used to create a .so file name.
   :param output_names: Dictionary mapping Model SDK names to original model names.

   :returns: PluginMPKData instance representing the input node.


.. py:function:: generate_ev74_plugin_mpk_data(node: afe.ir.node.AwesomeNode, input_nodes: list[afe.backends.mpk.defines.PluginInputNodeMPKData], sequence: int, desired_batch_size: int, actual_batch_size: int, output_names: dict[str, str]) -> afe.backends.mpk.defines.PluginMPKData

   Produce the MPK JSON data for executing an EV74 plugin.

   :param node: AwesomeNode.
   :param input_nodes: List of the AwesomeNode's inputs as they appear in the MPK JSON file,
                       in the form of MPK JSON nodes. This list reflects the way nodes were
                       processed to eliminate tuples.
   :param sequence: Plugin's position in the model's execution sequence.
   :param desired_batch_size: Batch size requested by the user.
   :param actual_batch_size: Batch size used in code generation.
   :param output_names: Dictionary mapping Model SDK names to original model names.

   :returns: PluginMPKData class.


.. py:function:: generate_pass_through_plugin(input_nodes: list[afe.backends.mpk.defines.InOutNodesMPKData], sequence: int, desired_batch_size: int, actual_batch_size: int) -> afe.backends.mpk.defines.PluginMPKData

   Generate MPK JSON data for a PassThrough plugin.

   :param input_nodes: Input nodes of the plugin.
   :param sequence: Plugin's position in the model's execution sequence.
   :param desired_batch_size: Batch size requested by the user.
   :param actual_batch_size: Batch size used in code generation.

   :returns: PluginMPKData class.


.. py:function:: generate_input_nodes_mpk_data(net: afe.ir.net.AwesomeNet) -> tuple[dict[afe.ir.defines.NodeName, afe.ir.defines.DataValue[afe.ir.defines.NodeName]], list[afe.backends.mpk.defines.InOutNodesMPKData]]

   Generate MPK JSON data for input nodes.

   :param net: AwesomeNet.

   :returns: InOutNodesMPKData class.


.. py:function:: generate_output_nodes_mpk_data(net: afe.ir.net.AwesomeNet, output_nodes: list[afe.ir.defines.NodeName] | None = None) -> list[afe.backends.mpk.defines.InOutNodesMPKData]

   Generate MPK JSON data for output nodes.

   :param net: AwesomeNet.
   :param output_nodes: Optional. The output nodes, if the output is a tuple node.

   :returns: InOutNodesMPKData class.


.. py:function:: log_compilation_summary(data: afe.backends.mpk.defines.AwesomeNetMPKData, desired_batch_size: int, filenames: list[str]) -> None

   Print the compilation summary.

   :param data: AwesomeNetMPKData class.
   :param desired_batch_size: Batch size requested by the user.
   :param filenames: List of file names generated after compilation.

   :returns: None


.. py:function:: generate_plugins_mpk_data(net: afe.ir.net.AwesomeNet, desired_batch_size: int, actual_batch_size: int) -> tuple[list[afe.backends.mpk.defines.PluginMPKData], list[afe.backends.mpk.defines.InOutNodesMPKData]]

   Generate MPK data for MLA and EV74 plugins.

   :param net: AwesomeNet.
   :param desired_batch_size: Batch size requested by the user.
   :param actual_batch_size: Batch size used in code generation.

   :returns: List of PluginMPKData classes.


.. py:function:: generate_mpk_json_data(net: afe.ir.net.AwesomeNet, elf_file_path: str, batch_size: int, compress: bool, tessellate_parameters: afe.core.compile_networks.TessellateParameters | None, enable_large_tensors: bool = True, l2_caching_mode: afe.backends.mla.afe_to_n2a_compiler.defines.L2CachingMode = L2CachingMode.NONE, mlc_files_path: str | None = None, use_power_limits: bool = False, max_mla_power: float | None = None, layer_norm_use_fp32_intermediates: bool = False, rms_norm_use_fp32_intermediates: bool = False) -> afe.backends.mpk.defines.AwesomeNetMPKData

   Generate MPK JSON data.

   :param net: AwesomeNet.
   :param elf_file_path: ELF file directory path.
   :param batch_size: The batch size of the input to the model.
   :param compress: If True, the .mlc file is compressed before the .elf file is generated.
   :param tessellate_parameters: Dictionary defining the tessellation parameters for inputs
                                 and outputs of the MLA segments.
   :param enable_large_tensors: If True, the MLA will handle large tensors; otherwise large
                                tensors will raise an exception.
   :param l2_caching_mode: Specifies the mode of L2 caching in the n2a compiler.
   :param mlc_files_path: Path for .mlc files. If provided, the .mlc files will be saved.
   :param use_power_limits: If True, the MLA compiler will schedule instructions to conform
                            to power limits.
   :param max_mla_power: Set to a positive float value to override the default max MLA power
                         when power limits are used.
   :param layer_norm_use_fp32_intermediates: Use FP32 intermediate tensors in the BF16
                                             LayerNorm kernel.
   :param rms_norm_use_fp32_intermediates: Use FP32 intermediate tensors in the BF16
                                           RMSNorm kernel.

   :returns: AwesomeNetMPKData class.


.. py:function:: generate_mpk_json_file(net: afe.ir.net.AwesomeNet, file_path: str, batch_size: int = 1, compress: bool = True, tessellate_parameters: afe.core.compile_networks.TessellateParameters | None = None, enable_large_tensors: bool = True, l2_caching_mode: afe.backends.mla.afe_to_n2a_compiler.defines.L2CachingMode = L2CachingMode.NONE, mlc_files_path: str | None = None, use_power_limits: bool = False, max_mla_power: float | None = None, layer_norm_use_fp32_intermediates: bool = False, rms_norm_use_fp32_intermediates: bool = False) -> None

   Generate the MPK JSON file.

   :param net: AwesomeNet.
   :param file_path: ELF file directory path.
   :param batch_size: The batch size of the input to the model.
   :param compress: If True, the .mlc file is compressed before the .elf file is generated.
   :param tessellate_parameters: Dictionary defining the tessellation parameters for inputs
                                 and outputs of the MLA segments.
   :param enable_large_tensors: If True, the MLA will handle large tensors; otherwise large
                                tensors will raise an exception.
   :param l2_caching_mode: Specifies the mode of L2 caching in the n2a compiler.
   :param mlc_files_path: Path for .mlc files. If provided, the .mlc files will be saved.
   :param use_power_limits: If True, the MLA compiler will schedule instructions to conform
                            to power limits.
   :param max_mla_power: Set to a positive float value to override the default max MLA power
                         when power limits are used.
   :param layer_norm_use_fp32_intermediates: Use FP32 intermediate tensors in the BF16
                                             LayerNorm kernel.
   :param rms_norm_use_fp32_intermediates: Use FP32 intermediate tensors in the BF16
                                           RMSNorm kernel.

   :returns: None
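
The parameter descriptions for ``generate_mpk_json_file`` imply a few constraints between options: ``max_mla_power`` is meant to be a positive float and is described as taking effect when power limits are used. The following is a minimal sketch, under those assumptions, of collecting and checking the options before the call. The helper ``build_mpk_options`` is hypothetical and not part of ``afe``; rejecting ``max_mla_power`` without ``use_power_limits`` and requiring ``batch_size >= 1`` are illustrative choices, not documented library behaviour, and the path and power value are placeholders. The real call is left commented out because it needs a compiled ``AwesomeNet``.

```python
from typing import Any, Optional


def build_mpk_options(
    file_path: str,
    batch_size: int = 1,
    compress: bool = True,
    use_power_limits: bool = False,
    max_mla_power: Optional[float] = None,
    mlc_files_path: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: check and collect keyword arguments.

    The checks are an interpretation of the parameter descriptions,
    not behaviour guaranteed by afe itself.
    """
    # Assumption: a batch size below 1 is never meaningful.
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    if max_mla_power is not None:
        # Assumption: treat an override without power limits as a mistake.
        if not use_power_limits:
            raise ValueError("max_mla_power requires use_power_limits=True")
        # The documentation asks for a positive float value.
        if max_mla_power <= 0:
            raise ValueError("max_mla_power must be a positive float")

    options: dict[str, Any] = {
        "file_path": file_path,
        "batch_size": batch_size,
        "compress": compress,
        "use_power_limits": use_power_limits,
        "max_mla_power": max_mla_power,
    }
    # Only request saving .mlc files when a path was actually given.
    if mlc_files_path is not None:
        options["mlc_files_path"] = mlc_files_path
    return options


# Placeholder values chosen for illustration only.
options = build_mpk_options(
    "build/model",
    batch_size=4,
    use_power_limits=True,
    max_mla_power=2.5,
)

# With a compiled AwesomeNet bound to `net`, the call would then be:
# from afe.backends.mpk.interface import generate_mpk_json_file
# generate_mpk_json_file(net, **options)
```

Keeping the checks in a separate function means invalid combinations fail before compilation starts, and the resulting dictionary uses only keyword names that appear in the signature above.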