sima_utils.transformer.hf_transformer
=====================================

.. py:module:: sima_utils.transformer.hf_transformer


Attributes
----------

.. autoapisummary::

   sima_utils.transformer.hf_transformer.HF_SINGLE_MODEL_FILENAME
   sima_utils.transformer.hf_transformer.HF_WEIGHT_INDEX_FILENAME
   sima_utils.transformer.hf_transformer.HF_TOKENIZER_MODEL_FILENAME
   sima_utils.transformer.hf_transformer.DEFAULT_LAYER_NAMES
   sima_utils.transformer.hf_transformer.hf_path


Classes
-------

.. autoapisummary::

   sima_utils.transformer.hf_transformer.LocalHuggingFaceModel


Functions
---------

.. autoapisummary::

   sima_utils.transformer.hf_transformer.find_file


Module Contents
---------------

.. py:data:: HF_SINGLE_MODEL_FILENAME
   :value: 'model.safetensors'

.. py:data:: HF_WEIGHT_INDEX_FILENAME
   :value: 'model.safetensors.index.json'

.. py:data:: HF_TOKENIZER_MODEL_FILENAME
   :value: 'tokenizer.model'

.. py:data:: DEFAULT_LAYER_NAMES

.. py:class:: LocalHuggingFaceModel

   A representation of a local HF cache with all relevant parts.

   :param hf_cache: Path - The base directory for the HF model cache.
   :param weights: dict - A map from each weight file name to the path of the cached weights.
   :param config: dict - The HF model config file as a dictionary.
   :param weight_map: dict - A map of weight names to weight files.
   :param metadata: dict - Any available/relevant metadata.
   :param layer_names: dict[str, dict] - A dictionary of layer name prefix and suffix data,
                       organized as follows:

                       - Language models

                         - Block prefix and suffix
                         - Language model head prefix and suffix

                       - Vision language models

                         - Vision model prefix and suffix
                         - Projector prefix and suffix
                         - Block prefix and suffix
                         - Language model head prefix and suffix
   :param tokenizer_path: Path to the tokenizer model.

   .. py:attribute:: hf_cache
      :type: pathlib.Path

   .. py:attribute:: weights
      :type: dict

   .. py:attribute:: config
      :type: dict

   .. py:attribute:: weight_map
      :type: dict

   .. py:attribute:: metadata
      :type: dict

   .. py:attribute:: layer_names
      :type: dict

   .. py:attribute:: tokenizer_path
      :type: pathlib.Path | None

   .. py:attribute:: params
      :type: dict | None
      :value: None

   .. py:method:: create_from_directory(directory: str | pathlib.Path, layer_names: dict | None = None, find_tokenizer: bool = True) -> LocalHuggingFaceModel
      :staticmethod:

      Validate and build a local HF cache from a user-defined path.

      1. Verify that the model has a config file and that all weight files are present.
      2. Build a LocalHuggingFaceModel object with resolved paths to the various
         weight/config files, the model configuration, the layer-name map, and the
         weight map, which records where to find each specific weight.

      :param directory: User-provided path to a local HF cache.
      :param layer_names: A mapping of layer prefixes and suffixes for each major component in the model.
      :param find_tokenizer: Set True to find the path to the HF_TOKENIZER_MODEL_FILENAME.
      :returns: A LocalHuggingFaceModel object with all relevant attributes.

   .. py:method:: param_exists(param_name: str) -> bool

   .. py:method:: load_all_params()

   .. py:method:: unload_all_params()

   .. py:property:: vision_model_param_base_name
      :type: str

   .. py:property:: language_model_param_base_name
      :type: str

   .. py:method:: load_param(component: str, layer_idx: int, parameter_name: str) -> numpy.ndarray

      Load a Hugging Face parameter from a component, a layer index, and the
      parameter name.

      :param component: LLM component, from DEFAULT_LAYER_NAMES.
      :param layer_idx: Layer index. If the integer is positive, the parameter from
                        that block index is used; if it is negative, there is no
                        block index associated with that parameter.
      :param parameter_name: Parameter name.
      :returns: The parameter as a numpy array.

   .. py:method:: execute_hf(query: str, image: PIL.Image.Image | None = None, device: str = 'cpu', do_sample: bool = False, top_p: bool = None, max_new_tokens: int = 20, num_return_sequences: int = 1) -> dict

      Execute an LLM using the transformers library.

      Resolving the inference device:
         Torch accepts an invalid device ordinal (e.g. ``'cuda:100'`` on a machine
         with one GPU) and only raises an error once an operation runs on that
         device, so the device is verified explicitly up front.

      Loading the HF LLM:
         The model is loaded from the base cache path plus the relative path from
         the base path to the model save files::

            ├── blobs
            ├── refs
            └── snapshots
                └── <hash>
                    ├── config.json ->
                    ├── ... LLM files
                    └── model.safetensors.index.json ->

         Loading a transformer from a pretrained folder requires the unresolved
         ``snapshots/<hash>/`` directory. Every model must have a config saved in a
         ``config.json``.

      :param query: The query string to run through the model.
      :param image: Optional input image. Defaults to None.
      :param device: Inference device. Defaults to 'cpu'.
      :param do_sample: Whether or not to use sampling; uses greedy decoding otherwise. Defaults to False (greedy decoding).
      :param top_p: If set to a float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. Defaults to None.
      :param max_new_tokens: The maximum length the generated tokens can have. Defaults to 20.
      :param num_return_sequences: The number of independently computed returned sequences for each element in the batch. Defaults to 1.
      :raises ValueError: When the provided inference device is invalid.
      :returns: A dict containing the prompt, input_ids, image, output tokens, and text.

.. py:function:: find_file(directory: pathlib.Path, filename: str, resolve: bool = True) -> pathlib.Path | bool

   Utility function to recursively find a file within a directory.

   HF hashes/model names are unique, but they may be nested at different depths,
   so this function searches recursively as a workaround.

   :param directory: Directory with model files/weights.
   :param filename: Filename (stem).
   :param resolve: Whether or not the path needs to be resolved.
   :returns: A resolved path to the file, accounting for the symlinks in HF caches.
             If the file is not found, returns None.

.. py:data:: hf_path