sima_utils.transformer.model.language_post_model
Classes
LanguagePostModel: Base implementation for the post cache model of the language model.
Module Contents
- class sima_utils.transformer.model.language_post_model.LanguagePostModel
Base implementation for the post cache model of the language model.
- num_tokens
Number of tokens. Set to a value greater than 1 so that a single model consumes multiple input tokens.
- layer_idx
Transformer layer index.
- final_softcapping
Final logit soft-capping value for Gemma 2.
- num_tokens: int
- layer_idx: int
- final_softcapping: float | None
- gen_onnx_files()
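The final_softcapping attribute refers to the logit soft capping used by Gemma 2, where final logits are squashed into the range (-cap, cap) with a tanh. The following is a minimal standalone sketch of that formula, not the code used by LanguagePostModel. The helper name soft_cap_logits is hypothetical, and treating None as "no capping" is an assumption based on the float | None type.

```python
import math
from typing import List, Optional, Sequence


def soft_cap_logits(logits: Sequence[float], cap: Optional[float]) -> List[float]:
    """Soft-cap logits as cap * tanh(x / cap).

    Hypothetical helper for illustration only. A cap of None is assumed
    to mean that soft capping is disabled and logits pass through unchanged.
    """
    if cap is None:
        return list(logits)
    # tanh keeps values near zero almost unchanged and bounds large
    # magnitudes by the cap.
    return [cap * math.tanh(x / cap) for x in logits]


if __name__ == "__main__":
    raw = [-100.0, -1.0, 0.0, 1.0, 100.0]
    print(soft_cap_logits(raw, 30.0))
    print(soft_cap_logits(raw, None))
```

Small logits are left nearly untouched, while large ones saturate toward the cap, so the ordering of the logits is preserved.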