[Python-Dev] versioned .so files for Python 3.2

John Arbash Meinel john.arbash.meinel at gmail.com
Wed Jul 7 23:56:23 CEST 2010


Scott Dial wrote:

On 6/30/2010 2:53 PM, Barry Warsaw wrote:

It might be amazing, but it's still a significant overhead. As I've described, multiply that by all the py files in all the distro packages containing Python source code, and then still try to fit it on a CDROM.

I decided to prove to myself that it was not a significant issue to have parallel directory structures in a .tar.bz2, and I was surprised to find it much worse at that than I had imagined. For example,

# cd /usr/lib/python2.6/site-packages
# tar --exclude="*.pyc" --exclude="*.pyo" \
  -cjf mercurial.tar.bz2 mercurial
# du -h mercurial.tar.bz2
640K mercurial.tar.bz2
# cp -a mercurial mercurial2
# tar --exclude="*.pyc" --exclude="*.pyo" \
  -cjf mercurial2.tar.bz2 mercurial mercurial2
# du -h mercurial2.tar.bz2
1.3M mercurial2.tar.bz2

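I believe the standard (and largest) block size for .bz2 is 900kB, and I think that is measured on the uncompressed data. If so, a second copy of a 2.6MB tree always lands in a different block from the first copy, and bz2 cannot match one against the other. I do know that bz2 can chain blocks, since it can compress all NULL bytes extremely well (multiple GB down to kB, IIRC).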
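To check the block-size explanation without a real source tree, here is a minimal sketch using Python's bz2 module. The helper names (pseudo_random_bytes, bz2_doubling_ratio) and the payload sizes are my own choices for illustration, and random bytes are only a stand-in for source files: they are incompressible on their own, so any saving has to come from spotting the duplicate.

```python
import bz2
import random


def pseudo_random_bytes(n, seed=0):
    """Return n reproducible, essentially incompressible bytes (hypothetical helper)."""
    return random.Random(seed).getrandbits(8 * n).to_bytes(n, "little")


def bz2_doubling_ratio(payload):
    """Size of bz2(payload stored twice) divided by size of bz2(payload stored once)."""
    once = len(bz2.compress(payload, 9))       # level 9 selects 900 kB blocks
    twice = len(bz2.compress(payload * 2, 9))  # the copy starts len(payload) bytes later
    return twice / once
```

Calling bz2_doubling_ratio(pseudo_random_bytes(300_000)) puts both copies inside one block, while bz2_doubling_ratio(pseudo_random_bytes(1_200_000)) pushes the copy more than a block away. If the explanation is right, the second ratio should come out close to 2 and the first noticeably lower.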

There was a question as to whether LZMA would do better here. I'm using 7zip, but .xz should perform similarly.

$ du -sh mercurial*
2.6M mercurial
2.6M mercurial2

366K mercurial.tar.bz2
734K mercurial2.tar.bz2

303K mercurial.7z
310K mercurial2.7z

So LZMA with the 'normal' compression has a big enough window to find almost all of the redundancy, and 310kB is certainly a very small increase over the 303kB. And clearly bz2 does not, since 734kB is actually slightly more than 2x 366kB.
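For anyone who wants to repeat the comparison on another package, here is a rough sketch with Python's tarfile, bz2 and lzma modules. It is not what I ran (I used tar and 7zip from the shell). The function names are made up for illustration, lzma.compress writes .xz at its default preset rather than .7z, and the filter stands in for the --exclude options in the quoted tar commands.

```python
import bz2
import io
import lzma
import tarfile


def skip_bytecode(tarinfo):
    """tarfile filter: drop .pyc/.pyo members, keep everything else."""
    if tarinfo.name.endswith((".pyc", ".pyo")):
        return None
    return tarinfo


def tar_of_copies(directory, copies):
    """Build an uncompressed tar in memory holding `copies` copies of directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for i in range(copies):
            # Each copy gets its own top-level name, like mercurial and mercurial2.
            tar.add(directory, arcname=f"copy{i}", filter=skip_bytecode)
    return buf.getvalue()


def compare_duplicated_tree(directory):
    """Return {format: (size with one copy, size with two copies)} in bytes."""
    one = tar_of_copies(directory, 1)
    two = tar_of_copies(directory, 2)
    results = {}
    for name, compress in (("bz2", bz2.compress), ("xz", lzma.compress)):
        results[name] = (len(compress(one)), len(compress(two)))
    return results
```

Running compare_duplicated_tree on a site-packages directory returns the single and doubled sizes for each format, so the two ratios can be read off directly. The tree needs to be larger than one bz2 block for the difference to show up.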

John =:->


