The .xz file format
The .xz file format is a container format for compressed streams. There are no archiving capabilities, that is, the .xz format can hold only a single file just like the .gz and .bz2 file formats used by gzip and bzip2, respectively.
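As a rough illustration of the single-stream container described above, the following sketch uses Python's standard `lzma` module (an assumption of this example; the page itself does not name any library). It compresses one byte string into an .xz container and reads it back. The variable names are made up for the example.

```python
import lzma

payload = b"example payload\n" * 1000

# FORMAT_XZ selects the .xz container; it is also the default for lzma.compress().
blob = lzma.compress(payload, format=lzma.FORMAT_XZ)

# Every .xz stream starts with the same six magic bytes.
assert blob[:6] == b"\xfd7zXZ\x00"

# The container holds one compressed stream and nothing else:
# no file names and no directory structure, so a round trip returns only the bytes.
restored = lzma.decompress(blob)
assert restored == payload
```

To bundle several files, an archiver such as tar is typically used first and the resulting archive is then compressed as a single stream.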
- **Streamable:** It is always possible to create and decompress .xz files in a pipe; no seeking is required.
- **Random-access reading:** The data can be split into independently compressed blocks. Every .xz file contains an index of the blocks, which makes limited random-access reading possible when the block size is small enough.
- **Multiple filters (algorithms):** It is possible to add support for new filters, so no new file format is needed every time a new algorithm has been developed. Developers can use a developer-specific filter ID space for experimental filters.
- **Filter chaining:** Up to four filters can be chained, which is very similar to piping on the command line. Chaining can improve compression ratio with some file types. A different filter chain can be used for every independently compressed block.
- **Integrity checks:** Metadata is always protected with CRC32. The integrity of the actual data may be verified with CRC32, CRC64, or SHA-256, or the check may be omitted completely. It is possible to add new integrity checks in the future, but there is no possibility for developer-specific check IDs like there is for filter IDs.
- **Concatenation:** Just like with .gz and .bz2 files, .xz files can be concatenated. A decompressor can handle a concatenated file as if it were a regular single-stream .xz file.
- **Padding:** Binary zeros may be appended to .xz files to pad them to fill, for example, a block on a backup tape. Padding must be a multiple of four bytes because the size of a valid .xz file is always a multiple of four bytes.
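Several of the features listed above can be tried out with Python's standard `lzma` module. This is a minimal sketch, not a reference implementation: the choice of Python, the sample data, and the chunk sizes are assumptions made for illustration. It covers streaming, filter chaining, integrity check selection, concatenation, and the four-byte size alignment. It does not exercise multi-block random access or the decoding of padded files.

```python
import lzma

first = b"first member\n" * 500
second = bytes(range(256)) * 64

# Streamable: feed the compressor piece by piece, as a pipe would deliver data.
comp = lzma.LZMACompressor(format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)
stream_a = b"".join(
    comp.compress(first[i:i + 1024]) for i in range(0, len(first), 1024)
)
stream_a += comp.flush()

# Filter chaining: a delta filter feeding LZMA2 (a chain may hold up to four filters).
chain = [
    {"id": lzma.FILTER_DELTA, "dist": 1},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]
# Integrity checks: this stream uses CRC64 for its data instead of CRC32.
stream_b = lzma.compress(
    second, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC64, filters=chain
)

# Streamable on the reading side too: decode incrementally, without seeking.
decomp = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
decoded = b"".join(
    decomp.decompress(stream_a[i:i + 64]) for i in range(0, len(stream_a), 64)
)
assert decoded == first
assert decomp.check == lzma.CHECK_CRC32  # check type recorded in the stream

# Concatenation: two streams joined byte for byte decode as one file.
combined = lzma.decompress(stream_a + stream_b)
assert combined == first + second

# The size of a valid .xz stream is always a multiple of four bytes,
# which is why any appended padding must be as well.
assert len(stream_a) % 4 == 0
assert len(stream_b) % 4 == 0
```

The two streams deliberately use different filter chains and check types, which shows that concatenated members do not need to share settings.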
Once a new filter or integrity check has been added to the .xz file format specification, it won’t be removed. This ensures that all .xz files that use only the filters defined in the specification can always be decompressed in the future.
New filters, integrity checks, or other additions to the .xz file format are unlikely to occur very often. New filters are only added to the official list when they are clearly useful.