sima_utils.transformer.hf_transformer

Attributes

HF_SINGLE_MODEL_FILENAME

HF_WEIGHT_INDEX_FILENAME

HF_TOKENIZER_MODEL_FILENAME

DEFAULT_LAYER_NAMES

hf_path

Classes

LocalHuggingFaceModel

A representation of a local HF cache with all relevant parts.

Functions

find_file(→ pathlib.Path | bool)

find_file Utility function to recursively find a file within a directory.

Module Contents

sima_utils.transformer.hf_transformer.HF_SINGLE_MODEL_FILENAME = 'model.safetensors'
sima_utils.transformer.hf_transformer.HF_WEIGHT_INDEX_FILENAME = 'model.safetensors.index.json'
sima_utils.transformer.hf_transformer.HF_TOKENIZER_MODEL_FILENAME = 'tokenizer.model'
sima_utils.transformer.hf_transformer.DEFAULT_LAYER_NAMES
class sima_utils.transformer.hf_transformer.LocalHuggingFaceModel
A representation of a local HF cache with all relevant parts.

Parameters:
  • hf_cache – Path - The base directory for the HF model cache.

  • weights – dict - A map of the weight file name and the path to the cached weights.

  • config – dict - The HF Model config file as a dictionary.

  • weight_map – dict - A map of weight names to weight files.

  • metadata – dict - Any available/relevant metadata.

  • layer_names –

    dict[str, dict] - A dictionary of layer-name prefix and suffix data for each major model component, as described below (see the sketch after this list):

    • Language Models
      • Block prefix and suffix

      • Language Model Head prefix and suffix

    • Vision Language Models
      • Vision Model prefix and suffix

      • Projector prefix and suffix

      • Block prefix and suffix

      • Language Model Head prefix and suffix

  • tokenizer_path – Path to the tokenizer model.
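A minimal sketch of such a layer_names mapping for a language model. The keys and field names below are hypothetical; the authoritative structure comes from DEFAULT_LAYER_NAMES in this module.

    # Hypothetical layer_names structure; actual keys are defined by
    # DEFAULT_LAYER_NAMES and depend on the model architecture.
    layer_names = {
        "block":   {"prefix": "model.layers.", "suffix": ""},
        "lm_head": {"prefix": "lm_head.",      "suffix": ""},
    }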

hf_cache: pathlib.Path
weights: dict
config: dict
weight_map: dict
metadata: dict
layer_names: dict
tokenizer_path: pathlib.Path | None
params: dict | None = None
static create_from_directory(directory: str | pathlib.Path, layer_names: dict | None = None, find_tokenizer: bool = True) → LocalHuggingFaceModel

create_from_directory - Validate and build a local HF cache from a user-defined path.

  1. Verify that the model has a config file and that all the weight files are present.

  2. Build a LocalHuggingFaceModel object with resolved paths to the various weight/config files, the model configuration, the layer-name map, and the weight map, which tells us where to find a specific weight.

Parameters:
  • directory – User provided path to a local HF Cache.

  • layer_names – A mapping of layer prefixes, suffixes for each major component in the model.

  • find_tokenizer – Set to True to find the path to the HF_TOKENIZER_MODEL_FILENAME.

Returns:

A LocalHuggingFaceModel object with all relevant attributes.
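A minimal usage sketch, assuming a populated local HF cache at a hypothetical path:

    from pathlib import Path
    from sima_utils.transformer.hf_transformer import LocalHuggingFaceModel

    # Hypothetical cache location; substitute your own local HF model cache.
    cache_dir = Path.home() / ".cache/huggingface/hub/models--my-org--my-model"

    model = LocalHuggingFaceModel.create_from_directory(cache_dir)
    print(model.config.get("model_type"))  # architecture name from config.json
    print(sorted(model.weights))           # resolved weight file names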

param_exists(param_name: str) → bool
load_all_params()
unload_all_params()
property vision_model_param_base_name: str
property language_model_param_base_name: str
load_param(component: str, layer_idx: int, parameter_name: str) → numpy.ndarray

load_param Load a Hugging Face parameter given a component, layer index, and parameter name.

Parameters:
  • component – LLM component, from DEFAULT_LAYER_NAMES.

  • layer_idx – Layer index (int). If the index is non-negative, the parameter from that block index is used. If the index is negative, there is no block index associated with that parameter.

  • parameter_name – Parameter name.

Returns:

The parameter as a numpy array.
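A usage sketch; the component and parameter names below are hypothetical, since the real values depend on the model's weight map and its layer_names mapping:

    import numpy as np

    # "block" and the parameter name are placeholders, not guaranteed keys.
    w = model.load_param(component="block", layer_idx=0,
                         parameter_name="self_attn.q_proj.weight")
    assert isinstance(w, np.ndarray)
    print(w.shape, w.dtype)

    # A negative layer_idx addresses parameters with no block index,
    # such as the language model head.
    head = model.load_param(component="lm_head", layer_idx=-1,
                            parameter_name="weight")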

execute_hf(query: str, image: PIL.Image.Image | None = None, device: str = 'cpu', do_sample: bool = False, top_p: float | None = None, max_new_tokens: int = 20, num_return_sequences: int = 1) → dict

execute_hf Execute an LLM using the transformers library.

Resolving Inference Device:

Verify torch device - An invalid device ordinal (e.g. 'cuda:100' on a machine with one GPU) only raises an error once an operation is attempted on that device, so the device is validated explicitly before inference.
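A minimal sketch of such an eager check, written as a hypothetical helper (not the module's actual code). torch.device() accepts any ordinal; performing a small allocation is what forces validation:

    import torch

    def validate_device(device: str) -> torch.device:
        # Allocating a tensor forces torch to validate the device ordinal,
        # which constructing torch.device() alone does not.
        dev = torch.device(device)
        try:
            torch.zeros(1, device=dev)
        except (RuntimeError, AssertionError) as exc:
            raise ValueError(f"Invalid inference device {device!r}") from exc
        return dev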

Loading HF LLM:

The model path is the base cache path plus the relative path from the base to the model save files:

    <Model Root>
    ├── blobs
    ├── refs
    └── snapshots
        └── <hash>
            ├── config.json -> <symlink>
            ├── … LLM files
            └── model.safetensors.index.json -> <symlink>

Loading a transformer from a pretrained folder requires the unresolved snapshots/<hash>/ path. Every model must have a config saved in config.json.

Parameters:
  • query – Input query string.

  • image – Optional PIL image input for vision-language models. Defaults to None.

  • device – Inference device. Defaults to β€˜cpu’.

  • do_sample – Whether or not to use sampling; greedy decoding is used otherwise. Defaults to False (greedy decoding).

  • top_p – If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. Defaults to None.

  • max_new_tokens – The maximum length the generated tokens can have. Defaults to 20.

  • num_return_sequences – The number of independently computed returned sequences for each element in the batch. Defaults to 1.

Raises:

ValueError – When the provided inference device is invalid.

Returns:

A dict containing the prompt, input_ids, image, output tokens, and text.
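A usage sketch; the result key names are assumed from the return description above:

    result = model.execute_hf(
        query="What is the capital of France?",
        device="cpu",
        do_sample=False,     # greedy decoding
        max_new_tokens=20,
    )
    # "text" is an assumed key, per the description of the returned dict.
    print(result["text"])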

sima_utils.transformer.hf_transformer.find_file(directory: pathlib.Path, filename: str, resolve: bool = True) → pathlib.Path | bool
find_file Utility function to recursively find a file within a directory.
  • HF hashes/model names will be unique, but they might be nested differently, so this function is a workaround.

Parameters:
  • directory – Directory with model files/weights.

  • filename – Filename (stem).

  • resolve – Whether or not the path needs to be resolved.

Returns:

A resolved path to the file, accounting for the symlinks in HF caches. If the file is not found, it returns None.
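A usage sketch with a hypothetical cache path:

    from pathlib import Path
    from sima_utils.transformer.hf_transformer import find_file

    # Hypothetical cache directory; substitute your own.
    cache_dir = Path.home() / ".cache/huggingface/hub/models--my-org--my-model"

    config_path = find_file(cache_dir, "config.json", resolve=True)
    if config_path:
        print(config_path)  # symlink resolved to the file it points at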

sima_utils.transformer.hf_transformer.hf_path