Efficient Passage Retrieval with Hashing for Open-domain Question Answering

Abstract

Binary Passage Retriever (BPR) reduces memory cost without accuracy loss by integrating learning-to-hash into Dense Passage Retriever (DPR).

Most state-of-the-art open-domain question answering systems use a neural retrieval model to encode passages into continuous vectors and extract them from a knowledge source. However, such retrieval models often require large memory to run because of the massive size of their passage index. In this paper, we introduce Binary Passage Retriever (BPR), a memory-efficient neural retrieval model that integrates a learning-to-hash technique into the state-of-the-art Dense Passage Retriever (DPR) to represent the passage index using compact binary codes rather than continuous vectors. BPR is trained with a multi-task objective over two tasks: efficient candidate generation based on binary codes and accurate reranking based on continuous vectors. Compared with DPR, BPR substantially reduces the memory cost from 65GB to 2GB without a loss of accuracy on two standard open-domain question answering benchmarks: Natural Questions and TriviaQA. Our code and trained models are available at https://github.com/studio-ousia/bpr.
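The two-stage retrieval described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names are hypothetical, hashing is simplified to taking the sign of each dimension, and the reranking step is assumed to score the continuous query vector against the ±1 passage codes so that only binary codes need to be stored. The abstract says only that reranking is "based on continuous vectors". The passage count (21 million) and embedding size (768) in the memory estimate are also assumptions, chosen because they are roughly consistent with the reported 65GB and 2GB.

```python
import numpy as np

def to_binary_code(vec):
    """Hash continuous vector(s) to packed binary codes using the sign of each dimension."""
    assert vec.shape[-1] % 8 == 0, "sketch assumes the dimension is a multiple of 8"
    bits = (vec > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)  # 8 dimensions per byte

def hamming_distances(query_code, passage_codes):
    """Count differing bits between one query code and every passage code."""
    xor = np.bitwise_xor(passage_codes, query_code)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

def retrieve(query_vec, passage_codes, num_candidates=100, top_k=10):
    """Stage 1: Hamming-distance candidate generation. Stage 2: rerank the candidates."""
    query_code = to_binary_code(query_vec)
    dists = hamming_distances(query_code, passage_codes)
    candidates = np.argsort(dists, kind="stable")[:num_candidates]
    # Assumed reranking: inner product of the continuous query with +/-1 passage codes.
    bits = np.unpackbits(passage_codes[candidates], axis=-1).astype(np.float32)
    scores = (2.0 * bits - 1.0) @ query_vec
    order = np.argsort(-scores, kind="stable")[:top_k]
    return candidates[order]

def float_index_bytes(num_passages, dim):
    """Size of an index of float32 vectors."""
    return num_passages * dim * 4

def binary_index_bytes(num_passages, dim):
    """Size of an index of binary codes, one bit per dimension."""
    return num_passages * dim // 8

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    passages = rng.standard_normal((1000, 64)).astype(np.float32)
    codes = to_binary_code(passages)      # only these codes would be kept in memory
    print(retrieve(passages[3], codes, num_candidates=50, top_k=5))
    # Assumed corpus size and dimension, not taken from the abstract.
    print(float_index_bytes(21_000_000, 768) / 1e9, "GB as float32")
    print(binary_index_bytes(21_000_000, 768) / 1e9, "GB as binary codes")
```

Under these assumptions, a binary code needs one bit per dimension where a float32 vector needs 32, which is where the roughly 32-fold reduction in index size comes from.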


Get this paper in your agent:

hf papers read 2106.00882


Models citing this paper 2

castorini/bpr-nq-ctx-encoder

castorini/bpr-nq-question-encoder Feature Extraction • Updated Sep 5, 2021 • 32
