[Python-Dev] Question on bz2 codec. Is this a bug?

Chris Bergstresser chris at subtlety.com
Wed Sep 29 22:06:16 CEST 2010

Hi all --

I looked through the bug tracker but didn't see this listed. I was trying to use the bz2 codec, but it doesn't seem very useful in its current form (and I'm not sure whether it's being added back to py3k, so this may be a moot point). It looks like the codec writes every piece of data fed to it as a separate compressed block. If you're writing a lot of small bursts of data, the resulting compressed file is significantly larger than the uncompressed data. It also leads to interesting oddities like this:

import codecs

with codecs.open('text.bz2', 'w', 'bz2') as f:
    for x in xrange(20):
        f.write('This is data %i\n' % x)

with codecs.open('text.bz2', 'r', 'bz2') as f:
    print f.read()

This prints "This is data 0" and exits, because the codec won't read beyond the first compressed block.
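
For comparison, here is a rough sketch of the same effect using the bz2 module directly, without the codec. It assumes the codec compresses each write on its own, as described above; the variable names are only for illustration. It compares that against feeding the same data through a single bz2.BZ2Compressor, and shows a decompressor stopping at the end of the first stream.

```python
import bz2

# The same 20 short records as in the example above, as byte strings.
lines = [('This is data %i\n' % x).encode('ascii') for x in range(20)]
raw = b''.join(lines)

# Assumed codec behaviour: every write is compressed independently,
# so the file ends up as 20 complete bz2 streams back to back.
per_write = b''.join(bz2.compress(line) for line in lines)

# Alternative: one incremental compressor produces a single stream.
comp = bz2.BZ2Compressor()
single = b''.join(comp.compress(line) for line in lines) + comp.flush()

print('raw:       %d bytes' % len(raw))
print('per write: %d bytes' % len(per_write))
print('single:    %d bytes' % len(single))

# A decompressor object stops at the end of the first stream and
# leaves everything after it in unused_data.
d = bz2.BZ2Decompressor()
first = d.decompress(per_write)
print(repr(first))
print('left unread: %d bytes' % len(d.unused_data))
```

The per-write output is larger than the raw data because every tiny stream carries its own header, tables and checksum, while the single stream decompresses back to all 20 lines.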

My question is, is this known, intended behavior? Should I open a bug report? Is it going away in py3k, so there's no real point in fixing it?

-- Chris
