Training Examples — Sentence Transformers documentation
Getting Started
- Installation
- Quickstart
- Migration Guide
- Migrating from v4.x to v5.x
* Migration for model.encode
* Migration for Asym to Router
* Migration of advanced usage
- Migrating from v3.x to v4.x
* Migration for parameters on CrossEncoder initialization and methods
* Migration for specific parameters from CrossEncoder.fit
* Migration for CrossEncoder evaluators
- Migrating from v2.x to v3.x
* Migration for specific parameters from SentenceTransformer.fit
* Migration for custom Datasets and DataLoaders used in SentenceTransformer.fit
Sentence Transformer
- Usage
- Computing Embeddings
* Initializing a Sentence Transformer Model
* Calculating Embeddings
* Prompt Templates
* Input Sequence Length
* Multi-Process / Multi-GPU Encoding
- Semantic Textual Similarity
* Similarity Calculation
- Semantic Search
* Background
* Symmetric vs. Asymmetric Semantic Search
* Manual Implementation
* Optimized Implementation
* Speed Optimization
* Elasticsearch
* OpenSearch
* Approximate Nearest Neighbor
* Retrieve & Re-Rank
* Examples
- Retrieve & Re-Rank
* Retrieve & Re-Rank Pipeline
* Retrieval: Bi-Encoder
* Re-Ranker: Cross-Encoder
* Example Scripts
* Pre-trained Bi-Encoders (Retrieval)
* Pre-trained Cross-Encoders (Re-Ranker)
- Clustering
* k-Means
* Agglomerative Clustering
* Fast Clustering
* Topic Modeling
- Paraphrase Mining
* paraphrase_mining()
- Translated Sentence Mining
* Margin Based Mining
* Examples
- Image Search
* Installation
* Usage
* Examples
- Embedding Quantization
* Binary Quantization
* Scalar (int8) Quantization
* Additional extensions
* Demo
* Try it yourself
- Creating Custom Models
* Structure of Sentence Transformer Models
* Sentence Transformer Model from a Transformers Model
* Advanced: Custom Modules
- Evaluation with MTEB
* Installation
* Evaluation
* Additional Arguments
* Results Handling
* Leaderboard Submission
- Speeding up Inference
* PyTorch
* ONNX
* OpenVINO
* Benchmarks
- Pretrained Models
- Training Overview
- Dataset Overview
- Loss Overview
- Training Examples
- Semantic Textual Similarity
* Training data
* Loss Function
- Natural Language Inference
* Data
* SoftmaxLoss
* MultipleNegativesRankingLoss
- Paraphrase Data
* Pre-Trained Models
- Quora Duplicate Questions
* Training
* MultipleNegativesRankingLoss
* Pretrained Models
- MS MARCO
* Bi-Encoder
- Matryoshka Embeddings
* Use Cases
* Results
* Training
* Inference
* Code Examples
- Adaptive Layers
* Use Cases
* Results
* Training
* Inference
* Code Examples
- Multilingual Models
* Extend your own models
* Training
* Datasets
* Sources for Training Data
* Evaluation
* Available Pre-trained Models
* Usage
* Performance
* Citation
- Model Distillation
* Knowledge Distillation
* Speed - Performance Trade-Off
* Dimensionality Reduction
* Quantization
- Augmented SBERT
* Motivation
* Extend to your own datasets
* Methodology
* Scenario 1: Limited or small annotated datasets (few labeled sentence-pairs)
* Scenario 2: No annotated datasets (Only unlabeled sentence-pairs)
* Training
* Citation
- Training with Prompts
* What are Prompts?
* Why would we train with Prompts?
* How do we train with Prompts?
- Training with PEFT Adapters
* Compatibility Methods
* Adding a New Adapter
* Loading a Pretrained Adapter
* Training Script
- Training with Unsloth
* Examples in this repository
* Unsloth Colab notebooks
* Fine-tuning via FastSentenceTransformer
* Inference and deployment
* Benchmarks
- Unsupervised Learning
* TSDAE
* SimCSE
* CT
* CT (In-Batch Negative Sampling)
* Masked Language Model (MLM)
* GenQ
* GPL
* Performance Comparison
- Domain Adaptation
* Domain Adaptation vs. Unsupervised Learning
* Adaptive Pre-Training
* GPL: Generative Pseudo-Labeling
- Hyperparameter Optimization
* HPO Components
* Putting It All Together
* Example Scripts
- Distributed Training
* Comparison
* FSDP
Cross Encoder
- Usage
- Cross-Encoder vs Bi-Encoder
* Cross-Encoder vs. Bi-Encoder
* When to use Cross- / Bi-Encoders?
* Cross-Encoders Usage
* Combining Bi- and Cross-Encoders
* Training Cross-Encoders
- Retrieve & Re-Rank
* Retrieve & Re-Rank Pipeline
* Retrieval: Bi-Encoder
* Re-Ranker: Cross-Encoder
* Example Scripts
* Pre-trained Bi-Encoders (Retrieval)
* Pre-trained Cross-Encoders (Re-Ranker)
- Speeding up Inference
* PyTorch
* ONNX
* OpenVINO
* Benchmarks
- Pretrained Models
- Training Overview
- Loss Overview
- Training Examples
- Semantic Textual Similarity
* Training data
* Loss Function
* Inference
- Natural Language Inference
* Data
* CrossEntropyLoss
* Inference
- Quora Duplicate Questions
* Training
* Inference
- MS MARCO
* Cross Encoder
* Training Scripts
* Inference
- Rerankers
* BinaryCrossEntropyLoss
* CachedMultipleNegativesRankingLoss
* Inference
- Model Distillation
* Cross Encoder Knowledge Distillation
* Inference
- Distributed Training
* Comparison
* FSDP
Sparse Encoder
- Usage
- Computing Sparse Embeddings
* Initializing a Sparse Encoder Model
* Calculating Embeddings
* Input Sequence Length
* Controlling Sparsity
* Interpretability with SPLADE Models
* Multi-Process / Multi-GPU Encoding
- Semantic Textual Similarity
* Similarity Calculation
- Semantic Search
* Manual Search
* Vector Database Search
* Qdrant Integration
* OpenSearch Integration
* Elasticsearch Integration
* Seismic Integration
* SPLADE-index Integration
- Retrieve & Re-Rank
* Overview
* Interactive Demo: Simple Wikipedia Search
* Comprehensive Evaluation: Hybrid Search Pipeline
* Pre-trained Models
- Sparse Encoder Evaluation
* Example with Retrieval Evaluation
- Speeding up Inference
* PyTorch
* ONNX
* OpenVINO
* Benchmarks
- Pretrained Models
- Training Overview
- Dataset Overview
- Loss Overview
- Training Examples
- Model Distillation
* MarginMSE
- MS MARCO
* SparseMultipleNegativesRankingLoss
- Semantic Textual Similarity
* Training data
* Loss Function
- Natural Language Inference
* Data
* SpladeLoss
- Quora Duplicate Questions
* Training
- Information Retrieval
* SparseMultipleNegativesRankingLoss (MNRL)
* Inference & Evaluation
- Distributed Training
* Comparison
* FSDP
Package Reference
- Sentence Transformer
- SentenceTransformer
* SentenceTransformer
* SentenceTransformerModelCardData
* SimilarityFunction
- Trainer
* SentenceTransformerTrainer
- Training Arguments
* SentenceTransformerTrainingArguments
- Losses
* BatchAllTripletLoss
* BatchHardSoftMarginTripletLoss
* BatchHardTripletLoss
* BatchSemiHardTripletLoss
* ContrastiveLoss
* OnlineContrastiveLoss
* ContrastiveTensionLoss
* ContrastiveTensionLossInBatchNegatives
* CoSENTLoss
* AnglELoss
* CosineSimilarityLoss
* DenoisingAutoEncoderLoss
* GISTEmbedLoss
* CachedGISTEmbedLoss
* GlobalOrthogonalRegularizationLoss
* MSELoss
* MarginMSELoss
* MatryoshkaLoss
* Matryoshka2dLoss
* AdaptiveLayerLoss
* MegaBatchMarginLoss
* MultipleNegativesRankingLoss
* CachedMultipleNegativesRankingLoss
* MultipleNegativesSymmetricRankingLoss
* CachedMultipleNegativesSymmetricRankingLoss
* SoftmaxLoss
* TripletLoss
* DistillKLDivLoss
- Samplers
* BatchSamplers
* MultiDatasetBatchSamplers
- Evaluation
* BinaryClassificationEvaluator
* EmbeddingSimilarityEvaluator
* InformationRetrievalEvaluator
* NanoBEIREvaluator
* MSEEvaluator
* ParaphraseMiningEvaluator
* RerankingEvaluator
* SentenceEvaluator
* SequentialEvaluator
* TranslationEvaluator
* TripletEvaluator
- Datasets
* ParallelSentencesDataset
* SentenceLabelDataset
* DenoisingAutoEncoderDataset
* NoDuplicatesDataLoader
- Modules
* Main Modules
* Further Modules
* Base Modules
- quantization
* quantize_embeddings()
* semantic_search_faiss()
* semantic_search_usearch()
- Cross Encoder
- CrossEncoder
* CrossEncoder
* CrossEncoderModelCardData
- Trainer
* CrossEncoderTrainer
- Training Arguments
* CrossEncoderTrainingArguments
- Losses
* BinaryCrossEntropyLoss
* CrossEntropyLoss
* LambdaLoss
* ListMLELoss
* PListMLELoss
* ListNetLoss
* MultipleNegativesRankingLoss
* CachedMultipleNegativesRankingLoss
* MSELoss
* MarginMSELoss
* RankNetLoss
- Evaluation
* CrossEncoderRerankingEvaluator
* CrossEncoderNanoBEIREvaluator
* CrossEncoderClassificationEvaluator
* CrossEncoderCorrelationEvaluator
- Sparse Encoder
- SparseEncoder
* SparseEncoder
* SparseEncoderModelCardData
* SimilarityFunction
- Trainer
* SparseEncoderTrainer
- Training Arguments
* SparseEncoderTrainingArguments
- Losses
* SpladeLoss
* CachedSpladeLoss
* FlopsLoss
* CSRLoss
* CSRReconstructionLoss
* SparseMultipleNegativesRankingLoss
* SparseMarginMSELoss
* SparseDistillKLDivLoss
* SparseTripletLoss
* SparseCosineSimilarityLoss
* SparseCoSENTLoss
* SparseAnglELoss
* SparseMSELoss
- Samplers
* BatchSamplers
* MultiDatasetBatchSamplers
- Evaluation
* SparseInformationRetrievalEvaluator
* SparseNanoBEIREvaluator
* SparseEmbeddingSimilarityEvaluator
* SparseBinaryClassificationEvaluator
* SparseTripletEvaluator
* SparseRerankingEvaluator
* SparseTranslationEvaluator
* SparseMSEEvaluator
* ReciprocalRankFusionEvaluator
- Modules
* SPLADE Pooling
* MLM Transformer
* SparseAutoEncoder
* SparseStaticEmbedding
- Callbacks
* SpladeRegularizerWeightSchedulerCallback
- Search Engines
* semantic_search_elasticsearch()
* semantic_search_opensearch()
* semantic_search_qdrant()
* semantic_search_seismic()
- util
- Helper Functions
* community_detection()
* http_get()
* is_training_available()
* mine_hard_negatives()
* normalize_embeddings()
* paraphrase_mining()
* semantic_search()
* truncate_embeddings()
- Model Optimization
* export_dynamic_quantized_onnx_model()
* export_optimized_onnx_model()
* export_static_quantized_openvino_model()
- Similarity Metrics
* cos_sim()
* dot_score()
* euclidean_sim()
* manhattan_sim()
* pairwise_cos_sim()
* pairwise_dot_score()
* pairwise_euclidean_sim()
* pairwise_manhattan_sim()
Supervised Learning