sima_utils.transformer.model.language_post_model

Classes

LanguagePostModel

Base implementation for the post cache model of the language model.

Module Contents

class sima_utils.transformer.model.language_post_model.LanguagePostModel

Base implementation for the post cache model of the language model.

num_tokens

Number of tokens. Set to a value greater than 1 to consume multiple input tokens in a single model invocation.

layer_idx

Transformer layer index.

final_softcapping

Final logit soft-capping value for Gemma 2.

num_tokens: int
layer_idx: int
final_softcapping: float | None
gen_onnx_files()
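
To illustrate what final_softcapping refers to, the sketch below shows the tanh-based final logit soft capping used by Gemma 2, where each logit x becomes cap * tanh(x / cap). The helper name soft_cap_logits is hypothetical and not part of sima_utils; treating None as "no capping" is an assumption based on the float | None type, and how LanguagePostModel applies the value internally is not shown in this reference.

```python
import math
from typing import List, Optional, Sequence


def soft_cap_logits(logits: Sequence[float], cap: Optional[float]) -> List[float]:
    """Apply tanh soft capping to a sequence of logits.

    Hypothetical helper for illustration only; it is not a sima_utils API.
    """
    # Assumption: None means that soft capping is disabled.
    if cap is None:
        return list(logits)
    # Squash each logit smoothly into the range [-cap, cap].
    return [cap * math.tanh(x / cap) for x in logits]


if __name__ == "__main__":
    raw = [-1000.0, -1.0, 0.0, 1.0, 1000.0]
    # Small logits are nearly unchanged; large ones saturate near +/- cap.
    print(soft_cap_logits(raw, 30.0))
    # With cap=None the logits pass through unchanged.
    print(soft_cap_logits(raw, None))
```

Small logits pass through almost unchanged, while large magnitudes saturate toward the cap, which bounds the output range without a hard clip.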