sima_utils.transformer.model.whisper_decoder_post_model

Classes

WhisperDecoderPostModel

Implementation for the post cache model of Whisper.

Module Contents

class sima_utils.transformer.model.whisper_decoder_post_model.WhisperDecoderPostModel

Implementation for the post cache model of Whisper.

This is a simplified version of LanguagePostModel. It is used only when generating new tokens, so num_tokens is assumed to be 1.

num_tokens

Number of input tokens. Set to a value greater than 1 to have one model consume multiple input tokens.

layer_idx

Transformer layer index.

skip_encoder_kv_proj

Whether to skip the key/value projections in cross attention.

output_encoder_kv_cache

Whether to output the key/value projections from cross attention.

num_tokens: int
layer_idx: int
skip_encoder_kv_proj: bool
output_encoder_kv_cache: bool

gen_onnx_files()

Generates ONNX files.
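
The skip_encoder_kv_proj and output_encoder_kv_cache attributes described above can be read as two sides of one pattern: in Whisper-style cross attention the keys and values depend only on the encoder output, so they can be projected once and reused for later tokens. The following is a minimal sketch of that reading, not the sima_utils implementation. PostModelFlags, matmul, and encoder_kv are hypothetical names introduced for illustration, and the exact behavior of the real flags is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Matrix = List[List[float]]


@dataclass
class PostModelFlags:
    """Hypothetical stand-in mirroring the four documented attributes."""

    num_tokens: int = 1
    layer_idx: int = 0
    skip_encoder_kv_proj: bool = False
    output_encoder_kv_cache: bool = False


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def encoder_kv(
    flags: PostModelFlags,
    encoder_states: Matrix,
    w_k: Matrix,
    w_v: Matrix,
    cached_kv: Optional[Tuple[Matrix, Matrix]] = None,
) -> Tuple[Matrix, Matrix, Optional[Tuple[Matrix, Matrix]]]:
    """Return (key, value, exported) for cross attention.

    Assumed semantics: skipping the projection means reusing
    previously computed keys/values; outputting the cache means
    handing the keys/values back to the caller for later reuse.
    """
    if flags.skip_encoder_kv_proj:
        if cached_kv is None:
            raise ValueError("cached_kv is required when skip_encoder_kv_proj is set")
        key, value = cached_kv
    else:
        key = matmul(encoder_states, w_k)
        value = matmul(encoder_states, w_v)
    exported = (key, value) if flags.output_encoder_kv_cache else None
    return key, value, exported


# Toy data: two encoder positions, hidden size 2.
states = [[1.0, 2.0], [3.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
double = [[2.0, 0.0], [0.0, 2.0]]

# First step: project and export the encoder keys/values.
first = PostModelFlags(output_encoder_kv_cache=True)
key, value, exported = encoder_kv(first, states, identity, double)

# Later step: skip the projection and reuse the exported cache.
later = PostModelFlags(skip_encoder_kv_proj=True)
key2, value2, exported2 = encoder_kv(later, states, identity, double, cached_kv=exported)
```

Under this reading, a model that exports the cache performs the projection work once, and every later model that skips the projection takes the exported keys and values as inputs instead.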