sima_utils.transformer.hf_transformer
=====================================

.. py:module:: sima_utils.transformer.hf_transformer


Attributes
----------

.. autoapisummary::

   sima_utils.transformer.hf_transformer.HF_SINGLE_MODEL_FILENAME
   sima_utils.transformer.hf_transformer.HF_WEIGHT_INDEX_FILENAME
   sima_utils.transformer.hf_transformer.HF_TOKENIZER_MODEL_FILENAME
   sima_utils.transformer.hf_transformer.DEFAULT_LAYER_NAMES
   sima_utils.transformer.hf_transformer.hf_path


Classes
-------

.. autoapisummary::

   sima_utils.transformer.hf_transformer.LocalHuggingFaceModel


Functions
---------

.. autoapisummary::

   sima_utils.transformer.hf_transformer.find_file


Module Contents
---------------

.. py:data:: HF_SINGLE_MODEL_FILENAME
   :value: 'model.safetensors'

.. py:data:: HF_WEIGHT_INDEX_FILENAME
   :value: 'model.safetensors.index.json'

.. py:data:: HF_TOKENIZER_MODEL_FILENAME
   :value: 'tokenizer.model'

.. py:data:: DEFAULT_LAYER_NAMES

.. py:class:: LocalHuggingFaceModel

   A representation of a local HF cache with all relevant parts.

   :param hf_cache: Path - The base directory for the HF model cache.
   :param weights: dict - A map from each weight file name to the path of the cached weights.
   :param config: dict - The HF model config file as a dictionary.
   :param weight_map: dict - A map of weight names to weight files.
   :param metadata: dict - Any available/relevant metadata.
   :param layer_names: dict[str, dict] - A dictionary of layer name prefix and suffix data,
                       organized as follows:

                       - Language models

                         - Block prefix and suffix
                         - Language model head prefix and suffix

                       - Vision language models

                         - Vision model prefix and suffix
                         - Projector prefix and suffix
                         - Block prefix and suffix
                         - Language model head prefix and suffix
   :param tokenizer_path: Path to the tokenizer model.

   .. py:attribute:: hf_cache
      :type: pathlib.Path

   .. py:attribute:: weights
      :type: dict

   .. py:attribute:: config
      :type: dict

   .. py:attribute:: weight_map
      :type: dict

   .. py:attribute:: metadata
      :type: dict

   .. py:attribute:: layer_names
      :type: dict

   .. py:attribute:: tokenizer_path
      :type: pathlib.Path | None

   .. py:attribute:: params
      :type: dict | None
      :value: None

   .. py:method:: create_from_directory(directory: str | pathlib.Path, layer_names: dict | None = None, find_tokenizer: bool = True) -> LocalHuggingFaceModel
      :staticmethod:

      Validate and build a local HF cache from a user-defined path.

      1. Verify that the model has a config file and that all weight files are present.
      2. Build a LocalHuggingFaceModel object with resolved paths to the various
         weight/config files, the model configuration, the layer-name map, and the
         weight map, which records where to find each specific weight.

      :param directory: User-provided path to a local HF cache.
      :param layer_names: A mapping of layer prefixes and suffixes for each major component in the model.
      :param find_tokenizer: Set True to find the path to the HF_TOKENIZER_MODEL_FILENAME.
      :returns: A LocalHuggingFaceModel object with all relevant attributes.

   .. py:method:: param_exists(param_name: str) -> bool

   .. py:method:: load_all_params()

   .. py:method:: unload_all_params()

   .. py:property:: vision_model_param_base_name
      :type: str

   .. py:property:: language_model_param_base_name
      :type: str

   .. py:method:: load_param(component: str, layer_idx: int, parameter_name: str) -> numpy.ndarray

      Load a Hugging Face parameter from a component, a layer index, and the
      parameter name.

      :param component: LLM component, from DEFAULT_LAYER_NAMES.
      :param layer_idx: Layer index. If the integer is positive, the parameter from
                        that block index is used; if it is negative, there is no
                        block index associated with that parameter.
      :param parameter_name: Parameter name.
      :returns: The parameter as a numpy array.

   .. py:method:: execute_hf(query: str, image: PIL.Image.Image | None = None, device: str = 'cpu', do_sample: bool = False, top_p: bool = None, max_new_tokens: int = 20, num_return_sequences: int = 1) -> dict

      Execute an LLM using the transformers library.

      Resolving the inference device:
         Torch accepts an invalid device ordinal (e.g. ``'cuda:100'`` on a machine
         with one GPU) and only raises an error once an operation runs on that
         device, so the device is verified explicitly up front.

      Loading the HF LLM:
         The model is loaded from the base cache path plus the relative path from
         the base path to the model save files::

            ├── blobs
            ├── refs
            └── snapshots
                └── <hash>
                    ├── config.json ->
                    ├── ... LLM files
                    └── model.safetensors.index.json ->

         Loading a transformer from a pretrained folder requires the unresolved
         ``snapshots/<hash>/`` directory. Every model must have a config saved in a
         ``config.json``.

      :param query: The query string to run through the model.
      :param image: Optional input image. Defaults to None.
      :param device: Inference device. Defaults to 'cpu'.
      :param do_sample: Whether or not to use sampling; uses greedy decoding otherwise. Defaults to False (greedy decoding).
      :param top_p: If set to a float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. Defaults to None.
      :param max_new_tokens: The maximum length the generated tokens can have. Defaults to 20.
      :param num_return_sequences: The number of independently computed returned sequences for each element in the batch. Defaults to 1.
      :raises ValueError: When the provided inference device is invalid.
      :returns: A dict containing the prompt, input_ids, image, output tokens, and text.

.. py:function:: find_file(directory: pathlib.Path, filename: str, resolve: bool = True) -> pathlib.Path | bool

   Utility function to recursively find a file within a directory.

   HF hashes/model names are unique, but they may be nested at different depths,
   so this function searches recursively as a workaround.

   :param directory: Directory with model files/weights.
   :param filename: Filename (stem).
   :param resolve: Whether or not the path needs to be resolved.
   :returns: A resolved path to the file, accounting for the symlinks in HF caches.
             If the file is not found, returns None.

.. py:data:: hf_path