Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Published on Aug 27, 2019

Abstract

Sentence-BERT (SBERT) improves semantic sentence similarity search efficiency while maintaining accuracy by using siamese and triplet network structures.

BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences be fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine-similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
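The "about 50 million" figure in the abstract matches the number of unordered sentence pairs in a collection of 10,000, namely n(n-1)/2. The sketch below reproduces that count and shows, under stated assumptions, how fixed-size embeddings can be compared with cosine similarity instead. The function names and the toy 2-dimensional vectors are hypothetical illustrations; they are not SBERT outputs and not part of any library API.

```python
import math
from itertools import combinations


def cross_encoder_pair_count(n):
    """Number of unordered sentence pairs a pairwise model must score."""
    return n * (n - 1) // 2


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_pair(embeddings):
    """Indices (i, j) with i < j of the pair with the highest cosine similarity."""
    return max(
        combinations(range(len(embeddings)), 2),
        key=lambda ij: cosine_similarity(embeddings[ij[0]], embeddings[ij[1]]),
    )


# Made-up vectors standing in for precomputed sentence embeddings (assumption).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

print(cross_encoder_pair_count(10_000))   # 49995000
print(most_similar_pair(toy_embeddings))  # (0, 1)
```

With embeddings, each sentence passes through the network once (10,000 forward passes here), and the pairwise step is reduced to cheap vector arithmetic; with a pairwise model, every one of the roughly 50 million pairs needs its own forward pass.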

Get this paper in your agent:

hf papers read 1908.10084

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 1,000+

sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 Sentence Similarity • 0.1B • Updated Jan 28 • 29M • 1.19k

sentence-transformers/paraphrase-multilingual-mpnet-base-v2 Sentence Similarity • 0.3B • Updated Aug 19, 2025 • 4.28M • 458

lightonai/Reason-ModernColBERT Sentence Similarity • 0.1B • Updated Sep 9, 2025 • 13.9k • 240

sentence-transformers/distiluse-base-multilingual-cased-v2 Sentence Similarity • 0.1B • Updated Mar 6, 2025 • 713k • 208

Datasets citing this paper 3

avduarte333/arXivTection Viewer • Updated Sep 24, 2024 • 1.55k • 766 • 2

codeparrot/self-instruct-starcoder Viewer • Updated Oct 23, 2023 • 9.63k • 520 • 63

chungimungi/arxiv-hard-negatives-cross-encoder Viewer • Updated Dec 30, 2025 • 7.25k • 15

Spaces citing this paper 2,169

Collections including this paper 10
