Paper page - AnglE-optimized Text Embeddings

Published on Sep 22, 2023

Abstract

AnglE, a novel angle-optimized text embedding model, addresses vanishing gradient issues in semantic textual similarity tasks by mitigating cosine function saturation.

High-quality text embedding is pivotal in improving semantic textual similarity (STS) tasks, which are crucial components in Large Language Model (LLM) applications. However, a common challenge existing text embedding models face is the problem of vanishing gradients, primarily due to their reliance on the cosine function in the optimization objective, which has saturation zones. To address this issue, this paper proposes a novel angle-optimized text embedding model called AnglE. The core idea of AnglE is to introduce angle optimization in a complex space. This novel approach effectively mitigates the adverse effects of the saturation zone in the cosine function, which can impede gradients and hinder optimization processes. To set up a comprehensive STS evaluation, we experimented on existing short-text STS datasets and a newly collected long-text STS dataset from GitHub Issues. Furthermore, we examine domain-specific STS scenarios with limited labeled data and explore how AnglE works with LLM-annotated data. Extensive experiments were conducted on various tasks including short-text STS, long-text STS, and domain-specific STS tasks. The results show that AnglE outperforms the state-of-the-art (SOTA) STS models that ignore the cosine saturation zone. These findings demonstrate the ability of AnglE to generate high-quality text embeddings and the usefulness of angle optimization in STS.
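The saturation zones mentioned in the abstract can be illustrated with a small numeric sketch. This is not the paper's implementation; it only assumes the standard calculus fact that the derivative of cos(θ) with respect to θ is -sin(θ), and the helper name `cosine_grad_wrt_angle` is hypothetical. It shows that when two embeddings are nearly aligned or nearly opposite, the cosine value barely changes as the angle changes, so a cosine-based objective passes back very little gradient there.

```python
import math


def cosine_grad_wrt_angle(theta: float) -> float:
    """Derivative of cos(theta) with respect to theta, i.e. -sin(theta).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    return -math.sin(theta)


# Angles between two embedding vectors, in degrees.
angles_deg = [1, 5, 45, 90, 135, 175, 179]

for deg in angles_deg:
    theta = math.radians(deg)
    cos_value = math.cos(theta)
    grad_magnitude = abs(cosine_grad_wrt_angle(theta))
    # Near 0 and 180 degrees the gradient magnitude approaches zero
    # (the saturation zones); near 90 degrees it is close to one.
    print(f"{deg:>3} deg  cos={cos_value:+.4f}  |dcos/dtheta|={grad_magnitude:.4f}")
```

Under these assumptions, the gradient magnitude is largest near 90 degrees and shrinks toward zero at both ends of the range, which is the behavior the abstract says angle optimization is meant to mitigate.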


Get this paper in your agent:

hf papers read 2309.12871


Models citing this paper 44

mixedbread-ai/mxbai-embed-large-v1 Feature Extraction • 0.3B • Updated Jan 23 • 3.62M • 781

WhereIsAI/UAE-Large-V1 Feature Extraction • Updated Jul 29, 2025 • 1.17M • 237

deepvk/USER-bge-m3 Sentence Similarity • 0.4B • Updated Jul 18, 2024 • 239k • 75

mixedbread-ai/deepset-mxbai-embed-de-large-v1 Feature Extraction • 0.5B • Updated Mar 24, 2025 • 113k • 58


Datasets citing this paper 1

WhereIsAI/github-issue-similarity Viewer • Updated May 3, 2024 • 32.7k • 139 • 8

Spaces citing this paper 311

Collections including this paper 8
