sima_utils.transformer.hf_transformer

Attributes

HF_SINGLE_MODEL_FILENAME

HF_WEIGHT_INDEX_FILENAME

HF_TOKENIZER_MODEL_FILENAME

DEFAULT_LAYER_NAMES

hf_path

Classes

LocalHuggingFaceModel

A representation of a local HF cache with all relevant parts.

Functions

find_file(→ pathlib.Path | bool)

find_file Utility function to recursively find a file within a directory.

Module Contents

sima_utils.transformer.hf_transformer.HF_SINGLE_MODEL_FILENAME = 'model.safetensors'
sima_utils.transformer.hf_transformer.HF_WEIGHT_INDEX_FILENAME = 'model.safetensors.index.json'
sima_utils.transformer.hf_transformer.HF_TOKENIZER_MODEL_FILENAME = 'tokenizer.model'
sima_utils.transformer.hf_transformer.DEFAULT_LAYER_NAMES
class sima_utils.transformer.hf_transformer.LocalHuggingFaceModel
A representation of a local HF cache with all relevant parts.

Parameters:
  • hf_cache – Path - The base directory for the HF model cache.

  • weights – dict - A map of the weight file name and the path to the cached weights.

  • config – dict - The HF Model config file as a dictionary.

  • weight_map – dict - A map of weight names to weight files.

  • metadata – dict - Any available/relevant metadata.

  • layer_names –

    dict[str, dict] - A dictionary of layer-name prefix and suffix data for each major model component, as described below (see the sketch after this list):

    • Language Models
      • Block prefix and suffix

      • Language Model Head prefix and suffix

    • Vision Language Models
      • Vision Model prefix and suffix

      • Projector prefix and suffix

      • Block prefix and suffix

      • Language Model Head prefix and suffix

  • tokenizer_path – Path to the tokenizer model.
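A minimal sketch of such a layer_names mapping for a language model. The keys and field names below are hypothetical; the authoritative structure comes from DEFAULT_LAYER_NAMES in this module.

    # Hypothetical layer_names structure; actual keys are defined by
    # DEFAULT_LAYER_NAMES and depend on the model architecture.
    layer_names = {
        "block":   {"prefix": "model.layers.", "suffix": ""},
        "lm_head": {"prefix": "lm_head.",      "suffix": ""},
    }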

hf_cache: pathlib.Path
weights: dict
config: dict
weight_map: dict
metadata: dict
layer_names: dict
tokenizer_path: pathlib.Path | None
params: dict | None = None
static create_from_directory(directory: str | pathlib.Path, layer_names: dict | None = None, find_tokenizer: bool = True) → LocalHuggingFaceModel

create_from_directory - Validate and build a local HF cache from a user-defined path.

  1. Verify that the model has a config file and that all the weight files are present.

  2. Build a LocalHuggingFaceModel object with resolved paths to the various weight/config files, the model configuration, the layer-name map, and the weight map, which tells us where to find a specific weight.

Parameters:
  • directory – User provided path to a local HF Cache.

  • layer_names – A mapping of layer prefixes, suffixes for each major component in the model.

  • find_tokenizer – Set to True to find the path to the HF_TOKENIZER_MODEL_FILENAME.

Returns:

A LocalHuggingFaceModel object with all relevant attributes.
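A minimal usage sketch, assuming a populated local HF cache at a hypothetical path:

    from pathlib import Path
    from sima_utils.transformer.hf_transformer import LocalHuggingFaceModel

    # Hypothetical cache location; substitute your own local HF model cache.
    cache_dir = Path.home() / ".cache/huggingface/hub/models--my-org--my-model"

    model = LocalHuggingFaceModel.create_from_directory(cache_dir)
    print(model.config.get("model_type"))  # architecture name from config.json
    print(sorted(model.weights))           # resolved weight file names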

param_exists(param_name: str) → bool
load_all_params()
unload_all_params()
property vision_model_param_base_name: str
property language_model_param_base_name: str
load_param(component: str, layer_idx: int, parameter_name: str) → numpy.ndarray

load_param Load a Hugging Face parameter given a component, layer index, and parameter name.

Parameters:
  • component – LLM component, from DEFAULT_LAYER_NAMES.

  • layer_idx – Layer index (int). If the index is non-negative, the parameter from that block index is used. If the index is negative, there is no block index associated with that parameter.

  • parameter_name – Parameter name.

Returns:

The parameter as a numpy array.
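A usage sketch; the component and parameter names below are hypothetical, since the real values depend on the model's weight map and its layer_names mapping:

    import numpy as np

    # "block" and the parameter name are placeholders, not guaranteed keys.
    w = model.load_param(component="block", layer_idx=0,
                         parameter_name="self_attn.q_proj.weight")
    assert isinstance(w, np.ndarray)
    print(w.shape, w.dtype)

    # A negative layer_idx addresses parameters with no block index,
    # such as the language model head.
    head = model.load_param(component="lm_head", layer_idx=-1,
                            parameter_name="weight")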

execute_hf(query: str, image: PIL.Image.Image | None = None, device: str = 'cpu', do_sample: bool = False, top_p: float | None = None, max_new_tokens: int = 20, num_return_sequences: int = 1) → dict

execute_hf Execute an LLM using the transformers library.

Resolving Inference Device:

Verify torch device - An invalid device ordinal (e.g. 'cuda:100' on a machine with one GPU) only raises an error once an operation is attempted on that device, so the device is validated explicitly before inference.
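A minimal sketch of such an eager check, written as a hypothetical helper (not the module's actual code). torch.device() accepts any ordinal; performing a small allocation is what forces validation:

    import torch

    def validate_device(device: str) -> torch.device:
        # Allocating a tensor forces torch to validate the device ordinal,
        # which constructing torch.device() alone does not.
        dev = torch.device(device)
        try:
            torch.zeros(1, device=dev)
        except (RuntimeError, AssertionError) as exc:
            raise ValueError(f"Invalid inference device {device!r}") from exc
        return dev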

Loading HF LLM:

The model path is the base cache path plus the relative path from the base to the model save files:

    <Model Root>
    ├── blobs
    ├── refs
    └── snapshots
        └── <hash>
            ├── config.json -> <symlink>
            ├── … LLM files
            └── model.safetensors.index.json -> <symlink>

Loading a transformer from a pretrained folder requires the unresolved snapshots/<hash>/ path. Every model must have a config saved in config.json.

Parameters:
  • query – Input query string.

  • image – Optional PIL image input for vision-language models. Defaults to None.

  • device – Inference device. Defaults to β€˜cpu’.

  • do_sample – Whether or not to use sampling; greedy decoding is used otherwise. Defaults to False (greedy decoding).

  • top_p – If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. Defaults to None.

  • max_new_tokens – The maximum length the generated tokens can have. Defaults to 20.

  • num_return_sequences – The number of independently computed returned sequences for each element in the batch. Defaults to 1.

Raises:

ValueError – When the provided inference device is invalid.

Returns:

A dict containing the prompt, input_ids, image, output tokens, and text.
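A usage sketch; the result key names are assumed from the return description above:

    result = model.execute_hf(
        query="What is the capital of France?",
        device="cpu",
        do_sample=False,     # greedy decoding
        max_new_tokens=20,
    )
    # "text" is an assumed key, per the description of the returned dict.
    print(result["text"])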

sima_utils.transformer.hf_transformer.find_file(directory: pathlib.Path, filename: str, resolve: bool = True) → pathlib.Path | bool
find_file Utility function to recursively find a file within a directory.
  • HF hashes/model names will be unique, but they might be nested differently, so this function is a workaround.

Parameters:
  • directory – Directory with model files/weights.

  • filename – Filename (stem).

  • resolve – Whether or not the path needs to be resolved.

Returns:

A resolved path to the file, accounting for the symlinks in HF caches. If the file is not found, it returns None.
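A usage sketch with a hypothetical cache path:

    from pathlib import Path
    from sima_utils.transformer.hf_transformer import find_file

    # Hypothetical cache directory; substitute your own.
    cache_dir = Path.home() / ".cache/huggingface/hub/models--my-org--my-model"

    config_path = find_file(cache_dir, "config.json", resolve=True)
    if config_path:
        print(config_path)  # symlink resolved to the file it points at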

sima_utils.transformer.hf_transformer.hf_path