Compression of a Set of Correlated Bitmaps

International ACM SIGIR Conference on Research and Development in Information Retrieval, 1991

Abstract

In large IR systems, information about word occurrence may be stored as a bit matrix, with rows corresponding to different words and columns to documents. Such a matrix is generally very large and very sparse. New methods for compressing such matrices are presented, which exploit possible correlations between rows and between columns. The methods are based on partitioning the matrix into small blocks and predicting the 1-bit distribution within a block by means of various bit generation models. Each block is then encoded using Huffman or arithmetic coding. Preliminary experimental results indicate improvements over previous methods.
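The block-then-encode idea can be illustrated with a minimal Python sketch. It is not the paper's method: in place of the bit generation models mentioned in the abstract, it simply counts how often each block pattern occurs in one row and builds a Huffman code from those counts. The function names (`split_blocks`, `huffman_code`, `encode_row`, `decode_row`) and the block size `k=4` are assumptions chosen for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def split_blocks(bits, k):
    """Cut a row of 0/1 values into consecutive k-bit blocks (zero-padded at the end)."""
    padded = list(bits) + [0] * (-len(bits) % k)
    return [tuple(padded[i:i + k]) for i in range(0, len(padded), k)]


def huffman_code(freqs):
    """Build a prefix code {symbol: bitstring} from a frequency table."""
    if len(freqs) == 1:
        # A single distinct block still needs a one-bit codeword.
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode_row(bits, k=4):
    """Encode one bitmap row block by block; returns (bitstring, code table)."""
    blocks = split_blocks(bits, k)
    code = huffman_code(Counter(blocks))
    return "".join(code[b] for b in blocks), code


def decode_row(encoded, code, n):
    """Invert encode_row; n is the original row length (drops the padding)."""
    inverse = {c: s for s, c in code.items()}
    out, cur = [], ""
    for ch in encoded:
        cur += ch
        if cur in inverse:
            out.extend(inverse[cur])
            cur = ""
    return out[:n]


# Hypothetical sparse row: a single 1-bit among 64 positions.
row = [0] * 40 + [1, 0, 0, 0] + [0] * 20
encoded, code = encode_row(row, k=4)
restored = decode_row(encoded, code, len(row))
```

Because all-zero blocks dominate a sparse row, they receive the shortest codeword, which is where the saving comes from. A real scheme would also have to store or predict the code table, and the abstract's point is that a model of the 1-bit distribution, informed by correlations between rows and columns, can supply those block probabilities instead of raw counts.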

Shmuel Tomi Klein
