The Canterbury Corpus
Welcome to the Canterbury Corpus
The Canterbury Corpus is a benchmark that enables researchers to evaluate lossless compression methods. This site includes the test files and compression test results for many compression methods from the research literature.
Site Contents
What the Corpus is, and why
A summary of the compression test results
More detailed results, including some statistical analysis
Descriptions of the various corpora
Descriptions of the compression methods
Research on the corpus and compression in general (includes papers and reports in PDF format)
Links to related web sites dealing with lossless compression and compression in general
Who did what