The Canterbury Corpus


Welcome to the Canterbury Corpus


The Canterbury Corpus is a benchmark that enables researchers to evaluate lossless compression methods. This site provides the test files, along with compression test results for many research compression methods.


Site Contents


purpose

What the Corpus is, and why

summary

A summary of the compression test results

details

More detailed results, including some statistical analysis

corpora

Descriptions of the various corpora

methods

Descriptions of the compression methods

research

Research on the corpus and compression in general (includes papers and reports in PDF format)

related

Links to related web sites dealing with lossless compression and compression in general

credits

Who did what