Model — Sentence Transformers documentation

BaseModel

This is the base class for all models in the Sentence Transformers library.

class sentence_transformers.base.model.BaseModel(model_name_or_path: str | None = None, *, modules: list[Module] | OrderedDict[str, Module] | None = None, device: str | None = None, prompts: dict[str, str] | None = None, default_prompt_name: str | None = None, cache_folder: str | None = None, trust_remote_code: bool = False, revision: str | None = None, local_files_only: bool = False, token: bool | str | None = None, model_kwargs: dict[str, Any] | None = None, processor_kwargs: dict[str, Any] | None = None, config_kwargs: dict[str, Any] | None = None, model_card_data: CardData | None = None, backend: Literal['torch', 'onnx', 'openvino'] = 'torch')[source]

Base class for SentenceTransformer, SparseEncoder, and CrossEncoder models.

This class provides common functionality shared by all model types.

All models inherit from nn.Sequential and are composed of a sequence of modules that are called sequentially in the forward pass.
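This sequential composition can be sketched in plain Python. The sketch below is a simplified stand-in for nn.Sequential, not the library's actual internals; the module names and the features dict are illustrative:

```python
class Lowercase:
    """Toy pipeline module: lowercases the text in a features dict."""
    def __call__(self, features):
        features["text"] = features["text"].lower()
        return features

class TokenCount:
    """Toy pipeline module: counts whitespace-separated tokens."""
    def __call__(self, features):
        features["n_tokens"] = len(features["text"].split())
        return features

class SequentialModel:
    """Each module receives the previous module's output, like nn.Sequential."""
    def __init__(self, modules):
        self.modules = list(modules)

    def forward(self, features):
        for module in self.modules:
            features = module(features)
        return features

model = SequentialModel([Lowercase(), TokenCount()])
out = model.forward({"text": "Hello Sentence Transformers"})
# out["text"] == "hello sentence transformers", out["n_tokens"] == 3
```

In the real library, the first module typically tokenizes or preprocesses the raw input and later modules (transformer, pooling, etc.) transform the resulting features dict.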

Initialize a BaseModel instance.

Parameters:

property device: torch.device

Get the torch.device of the module, assuming that the whole module is on a single device. If there are no PyTorch parameters, fall back to CPU.

property dtype: torch.dtype | None

The dtype of the module (assuming that all the module parameters have the same dtype).

Type:

torch.dtype

evaluate(evaluator: BaseEvaluator, output_path: str | None = None) → dict[str, float] | float[source]

Evaluate the model using the given evaluator.

Parameters:

Returns:

The evaluation results.

get_backend() → Literal['torch', 'onnx', 'openvino'][source]

Return the backend used for inference, which can be one of “torch”, “onnx”, or “openvino”.

Returns:

The backend used for inference.

Return type:

str

get_max_seq_length() → int | None[source]

Deprecated: use the max_seq_length property instead.

Returns the maximal sequence length that the first module of the model accepts. Longer inputs will be truncated.

Returns:

The maximal sequence length that the model accepts, or None if it is not defined.

Return type:

Optional[int]

get_model_kwargs() → list[str][source]

Get the keyword arguments specific to this model for inference methods like encode or predict.

Example

>>> from sentence_transformers import SentenceTransformer, SparseEncoder
>>> SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").get_model_kwargs()
[]
>>> SentenceTransformer("jinaai/jina-embeddings-v4", trust_remote_code=True).get_model_kwargs()
['task', 'truncate_dim']
>>> SparseEncoder("opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill").get_model_kwargs()
['task']

Returns:

A list of keyword arguments for the forward pass.

Return type:

list[str]

gradient_checkpointing_enable(gradient_checkpointing_kwargs: dict[str, Any] | None = None) → None[source]

Enable gradient checkpointing for the model.

is_singular_input(inputs: Any) → bool[source]

Check if the input represents a single example or a batch of examples.

Parameters:

inputs – The input to check.

Returns:

True if the input is a single example, False if it is a batch.

Return type:

bool
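The intent of this check can be sketched in plain Python. This is a hedged sketch, not the library's actual implementation (the real method also handles tensors, images, message lists, and the other input types listed under preprocess): a lone string or dict is treated as one example, while a list or tuple is treated as a batch.

```python
def is_singular_input(inputs):
    """Sketch: return True for a single example, False for a batch.

    A str or dict is one example; a list or tuple is a batch.
    (Illustrative only; the real method covers more input types.)
    """
    return not isinstance(inputs, (list, tuple))

print(is_singular_input("a single sentence"))           # True
print(is_singular_input(["sentence 1", "sentence 2"]))  # False
print(is_singular_input({"text": "t", "image": None}))  # True
```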

property max_seq_length: int | None

Returns the maximal input sequence length for the model. Longer inputs will be truncated.

Returns:

The maximal input sequence length, or None if not defined.

Return type:

Optional[int]

property modalities: list[Literal['text', 'image', 'audio', 'video', 'message'] | tuple[Literal['text', 'image', 'audio', 'video'], ...]]

Return the list of modalities supported by this model, e.g. ["text"] or ["text", "image", "message"].

model_card_data_class[source]

alias of BaseModelCardData

preprocess(inputs: list[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | MessageDict | list[MessageDict] | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict] | tuple[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict], str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict]] | list[str | Image | ndarray | Tensor | AudioDict | None | VideoDict | dict[Literal['text', 'image', 'audio', 'video'], str | Image | ndarray | Tensor | AudioDict | None | VideoDict]]], prompt: str | None = None, **kwargs) → dict[str, Tensor | Any][source]

Preprocesses the inputs for the model.

Parameters:

Returns:

A dictionary of tensors with the preprocessed inputs.

Return type:

dict[str, Tensor | Any]

property processor: Any

Property to get the processor used by this model.

push_to_hub(repo_id: str, token: str | None = None, private: bool | None = None, safe_serialization: bool = True, commit_message: str | None = None, local_model_path: str | None = None, exist_ok: bool = False, replace_model_card: bool = False, train_datasets: list[str] | None = None, revision: str | None = None, create_pr: bool = False) → str[source]

Uploads all elements of this model to a HuggingFace Hub repository, creating it if it doesn’t exist.

Parameters:

Returns:

The URL of the commit of your model in the repository on the Hugging Face Hub.

Return type:

str

save_pretrained(path: str, model_name: str | None = None, create_model_card: bool = True, train_datasets: list[str] | None = None, safe_serialization: bool = True) → None[source]

Saves a model and its configuration files to a directory, so that it can be loaded again.

Parameters:

save_to_hub(repo_id: str, organization: str | None = None, token: str | None = None, private: bool | None = None, safe_serialization: bool = True, commit_message: str = 'Add new model.', local_model_path: str | None = None, exist_ok: bool = False, replace_model_card: bool = False, train_datasets: list[str] | None = None) → str[source]

DEPRECATED, use push_to_hub instead.

Uploads all elements of this model to a new HuggingFace Hub repository.

Parameters:

Returns:

The URL of the commit of your model in the repository on the Hugging Face Hub.

Return type:

str

start_multi_process_pool(target_devices: list[str] | None = None) → dict[Literal['input', 'output', 'processes'], Any][source]

Starts a multi-process pool to infer with several independent processes.

This method is recommended if you want to predict on multiple GPUs or CPUs. It is advised to start only one process per GPU. This method works together with predict and stop_multi_process_pool.

Parameters:

target_devices (List [ str ] , optional) – PyTorch target devices, e.g. [“cuda:0”, “cuda:1”, …], [“npu:0”, “npu:1”, …], or [“cpu”, “cpu”, “cpu”, “cpu”]. If target_devices is None and CUDA/NPU is available, then all available CUDA/NPU devices will be used. If target_devices is None and CUDA/NPU is not available, then 4 CPU devices will be used.

Returns:

A dictionary with the target processes, an input queue, and an output queue.

Return type:

Dict[str, Any]

static stop_multi_process_pool(pool: dict[Literal['input', 'output', 'processes'], Any]) → None[source]

Stops all processes started with start_multi_process_pool.

Parameters:

pool (Dict [ str , object ]) – A dictionary containing the input queue, output queue, and process list.

Returns:

None
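The start/stop lifecycle documented above (start workers, feed an input queue, drain an output queue, then send a stop signal and join) can be sketched as follows. For portability this sketch uses threads and queue.Queue; the real implementation spawns independent processes per device via torch.multiprocessing, and all names here are illustrative:

```python
import queue
import threading

SENTINEL = None  # placed on the input queue to tell a worker to exit

def worker(input_queue, output_queue):
    while True:
        item = input_queue.get()
        if item is SENTINEL:
            break
        chunk_id, texts = item
        # Stand-in for model inference on one chunk of inputs
        results = [len(t) for t in texts]
        output_queue.put((chunk_id, results))

def start_pool(num_workers=2):
    input_queue, output_queue = queue.Queue(), queue.Queue()
    processes = [
        threading.Thread(target=worker, args=(input_queue, output_queue))
        for _ in range(num_workers)
    ]
    for p in processes:
        p.start()
    # Same shape as the documented return value
    return {"input": input_queue, "output": output_queue, "processes": processes}

def stop_pool(pool):
    for _ in pool["processes"]:
        pool["input"].put(SENTINEL)
    for p in pool["processes"]:
        p.join()

pool = start_pool()
pool["input"].put((0, ["hello", "world"]))
pool["input"].put((1, ["hi"]))
collected = dict(pool["output"].get() for _ in range(2))
stop_pool(pool)
# collected == {0: [5, 5], 1: [2]}
```

Tagging each chunk with an id, as above, is what lets results arriving out of order from independent workers be reassembled into the original input order.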

supports(modality: Literal['text', 'image', 'audio', 'video', 'message'] | tuple[Literal['text', 'image', 'audio', 'video'], ...]) → bool[source]

Check if the model supports the given modality.

A modality is supported if:

  1. It is directly listed in modalities (including tuple modalities that are explicitly listed), or
  2. It is a tuple of modalities (e.g. ("image", "text")) where each part is individually supported and the model also supports "message" format, which is used to combine multiple modalities into a single input.

Parameters:

modality – A single modality string (e.g. "text", "image") or a tuple of modality strings (e.g. ("image", "text")).

Returns:

Whether the model supports the given modality.

Return type:

bool

Example:

>>> from sentence_transformers import SentenceTransformer
>>> model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
>>> model.supports("text")
True
>>> model.supports("image")
False
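The two rules above can be sketched in plain Python. This is illustrative logic only, not the library's actual implementation; the modality lists are hypothetical:

```python
def supports(modalities, modality):
    """Sketch of the documented rules for modality support.

    1. The modality is supported if listed directly (including tuples).
    2. A tuple is supported if every part is individually supported and
       the model also supports the "message" format.
    """
    if modality in modalities:
        return True
    if isinstance(modality, tuple):
        return "message" in modalities and all(part in modalities for part in modality)
    return False

text_only = ["text"]
multimodal = ["text", "image", "message"]
print(supports(text_only, "text"))              # True
print(supports(text_only, "image"))             # False
print(supports(multimodal, ("image", "text")))  # True
```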

tokenize(texts: list[str] | list[dict] | list[tuple[str, str]], **kwargs) → dict[str, Tensor][source]

Deprecated: tokenize is deprecated, use preprocess instead.

property tokenizer: Any

Property to get the tokenizer used by this model.

property transformers_model: PreTrainedModel | None

Property to get the underlying transformers PreTrainedModel instance, if it exists. Note that it’s possible for a model to have multiple underlying transformers models, but this property will return the first one it finds in the module hierarchy.

Note

This property can also return e.g. ORTModelForFeatureExtraction or OVModelForFeatureExtraction instances from the optimum-intel and optimum-onnx libraries, if the model is loaded using backend="onnx" or backend="openvino".

Returns:

The underlying transformers model or None if not found.

Return type:

PreTrainedModel or None

BaseModelCardData

class sentence_transformers.base.model_card.BaseModelCardData(language: str | list[str] | None = <factory>, license: str | None = None, model_name: str | None = None, model_id: str | None = None, train_datasets: list[dict[str, str]] = <factory>, eval_datasets: list[dict[str, str]] = <factory>, task_name: str | None = 'retrieval', tags: list[str] = <factory>, local_files_only: bool = False, generate_widget_examples: bool = True)[source]

A dataclass storing data used in the model card.

Parameters:

Tip

Install codecarbon to automatically track carbon emission usage and include it in your model cards.