sima_utils.transformer.vision_preprocessor

Attributes

`PILImageResampling`
`IMAGE_RGB_STATS`
`vm_type`

Classes

ImageProcessor

Image processor for CLIP and SigLIP vision models.

Module Contents

sima_utils.transformer.vision_preprocessor.PILImageResampling

sima_utils.transformer.vision_preprocessor.IMAGE_RGB_STATS

class sima_utils.transformer.vision_preprocessor.ImageProcessor(model_type: str, target_size: int)

Image processor for CLIP and SigLIP vision models.

model_type: The type of vision model, “clip” or “siglip”.

image_size: The target image size for the vision model.

keep_aspect: If true, keep aspect ratio by squaring before resize.

image_mean: The mean of RGB images used in model training.

image_std: The std-dev of RGB images used in model training.

resample: The method of resampling used to resize an image.

model_type: str

image_size: tuple[int, int]

keep_aspect: bool

image_mean: list[float]

image_std: list[float]

resample: PILImageResampling

load_image_from_file(image_files: list[str])

expand2square(pil_img: PIL.Image.Image)
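The source does not describe what expand2square does internally. One common way to square an image without distorting it is to pad the shorter side with a solid color so the original stays centered. The following is a minimal sketch of that idea, not the library's implementation; the function name expand2square_sketch and the background_color parameter are hypothetical, and RGB input is assumed.

```python
from PIL import Image


def expand2square_sketch(pil_img, background_color=(0, 0, 0)):
    """Pad the shorter side so the image becomes square (illustrative only).

    background_color is a hypothetical parameter; the documented
    expand2square signature takes only pil_img.
    """
    width, height = pil_img.size
    if width == height:
        return pil_img  # already square, nothing to pad
    side = max(width, height)
    canvas = Image.new(pil_img.mode, (side, side), background_color)
    # Paste the original so it is centered along the padded axis.
    canvas.paste(pil_img, ((side - width) // 2, (side - height) // 2))
    return canvas
```

Padding before resizing means the later resize to a square target scales both axes by the same factor, which is how the aspect ratio is kept.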

preprocess(images: list[PIL.Image.Image], channel_first: bool = True) → list[numpy.ndarray]

Preprocess a list of images as input to a vision model.

Parameters:

images – A list of RGB images.
channel_first – A flag to output CHW if true, or HWC if false.

Returns:

A list of processed images as numpy arrays.
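To make the role of image_size, image_mean, image_std, resample, and channel_first more concrete, here is a minimal sketch of a typical CLIP-style preprocessing pipeline: resize, rescale to [0, 1], normalize per channel, then choose the layout. It is an assumption about the general technique, not the actual body of preprocess. The name preprocess_sketch, the 224x224 default size, the bicubic resampling choice, and the 0.5 statistics are placeholders; the real statistics are held in IMAGE_RGB_STATS.

```python
import numpy as np
from PIL import Image

# Hypothetical statistics for illustration only; not taken from IMAGE_RGB_STATS.
ASSUMED_MEAN = [0.5, 0.5, 0.5]
ASSUMED_STD = [0.5, 0.5, 0.5]


def preprocess_sketch(images, size=(224, 224), channel_first=True):
    """Resize, rescale to [0, 1], normalize per channel, and set the layout."""
    mean = np.asarray(ASSUMED_MEAN, dtype=np.float32)
    std = np.asarray(ASSUMED_STD, dtype=np.float32)
    outputs = []
    for img in images:
        # size is (width, height), as PIL expects.
        resized = img.convert("RGB").resize(size, resample=Image.BICUBIC)
        arr = np.asarray(resized, dtype=np.float32) / 255.0  # HWC in [0, 1]
        arr = (arr - mean) / std  # broadcasts over the trailing channel axis
        if channel_first:
            arr = arr.transpose(2, 0, 1)  # HWC -> CHW
        outputs.append(arr)
    return outputs
```

With channel_first left at its default, each array has shape (3, height, width); passing channel_first=False keeps the (height, width, 3) layout.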

sima_utils.transformer.vision_preprocessor.vm_type
