sima_utils.transformer.model.whisper_decoder_post_model

Classes

WhisperDecoderPostModel

Implementation for the post cache model of Whisper.

Module Contents

class sima_utils.transformer.model.whisper_decoder_post_model.WhisperDecoderPostModel

Implementation for the post cache model of Whisper.

This implements a simplified version of LanguagePostModel. The model is used only when generating new tokens, so num_tokens is assumed to be 1.

num_tokens: Number of tokens. Set to a value greater than 1 to consume multiple input tokens in one model.

layer_idx: Transformer layer index.

skip_encoder_kv_proj: Whether to skip the key/value projections in cross attention.

output_encoder_kv_cache: Whether to output the key/value projections from cross attention.

num_tokens: int

layer_idx: int

skip_encoder_kv_proj: bool

output_encoder_kv_cache: bool

gen_onnx_files(): Generates ONNX files.