Distilling Dense Representations for Ranking using Tightly-Coupled Teachers
Published on Oct 22, 2020
Abstract
Summary: The approach improves the query latency and storage requirements of the ColBERT model by distilling its MaxSim scoring into a simple dot product, and combines the resulting dense representations with sparse representations to approach the effectiveness of a cross-encoder reranker.
We present an approach to ranking with dense representations that applies knowledge distillation to improve the recently proposed late-interaction ColBERT model. Specifically, we distill the knowledge from ColBERT's expressive MaxSim operator for computing relevance scores into a simple dot product, thus enabling single-step ANN search. Our key insight is that during distillation, tight coupling between the teacher model and the student model enables more flexible distillation strategies and yields better learned representations. We empirically show that our approach improves query latency and greatly reduces the onerous storage requirements of ColBERT, while only making modest sacrifices in terms of effectiveness. By combining our dense representations with sparse representations derived from document expansion, we are able to approach the effectiveness of a standard cross-encoder reranker using BERT that is orders of magnitude slower.
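To make the contrast concrete, here is a minimal sketch of the two scoring functions the abstract describes: ColBERT's token-level MaxSim teacher versus the distilled single-vector dot product of the student. The pooling choice (mean pooling) is an illustrative assumption, not necessarily the exact pooling used in the paper.

```python
import numpy as np

def maxsim_score(q_embs: np.ndarray, d_embs: np.ndarray) -> float:
    # ColBERT MaxSim (teacher): for each query token embedding, take the
    # maximum similarity over all document token embeddings, then sum.
    sims = q_embs @ d_embs.T          # (num_q_tokens, num_d_tokens)
    return float(sims.max(axis=1).sum())

def dot_product_score(q_embs: np.ndarray, d_embs: np.ndarray) -> float:
    # Distilled student: pool each side into a single vector so that
    # relevance is one dot product, enabling single-step ANN search.
    # Mean pooling here is an assumption for illustration.
    return float(q_embs.mean(axis=0) @ d_embs.mean(axis=0))

# Toy example: two query token vectors, three document token vectors.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(maxsim_score(q, d))       # per-token maxima 1.0 + 1.0 = 2.0
print(dot_product_score(q, d))  # [0.5, 0.5] . [2/3, 2/3] = 2/3
```

The dot-product form trades MaxSim's fine-grained token interaction for a representation that a standard ANN index can search directly, which is the latency and storage win the abstract refers to.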
Get this paper in your agent:
hf papers read 2010.11386
Models citing this paper 13
castorini/tct_colbert-msmarco Updated Apr 21, 2021 • 1.96k
Browse 13 models citing this paper
Datasets citing this paper 0
No dataset links to this paper.
Cite arxiv.org/abs/2010.11386 in a dataset README.md to link it from this page.
Spaces citing this paper 2
Collections including this paper 0
No collection includes this paper.
Add this paper to a collection to link it from this page.