[Python-Dev] test_gzip/test_tarfile failure on AMD64

Thomas Wouters thomas at python.org
Mon May 29 02:34:45 CEST 2006


On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:

> On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:
>
> > I'm seeing a dubious failure of test_gzip and test_tarfile on my
> > AMD64 machine. It's triggered by the recent struct changes, but I'd
> > say it's probably caused by a bug/misfeature in zlibmodule:
> > zlib.crc32 is the result of a zlib 'crc32' function call, which
> > returns an unsigned long. zlib.crc32 turns that unsigned long into
> > a (signed) Python int, which means a number beyond 1<<31 goes
> > negative on 32-bit systems and other systems with 32-bit longs, but
> > stays positive on systems with 64-bit longs:
> >
> > (32-bit)
> > >>> zlib.crc32("foobabazr")
> > -271938108
> >
> > (64-bit)
> > >>> zlib.crc32("foobabazr")
> > 4023029188
> >
> > The old structmodule coped with that:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > '\xc4\x8d\xca\xef'
> >
> > The new one does not:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "Lib/struct.py", line 63, in pack
> >     return o.pack(*args)
> > struct.error: 'l' format requires -2147483647 <= number <= 2147483647
> >
> > The structmodule should be fixed (and a test added ;) but I'm also
> > wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my
> > suggested fix would be to change the PyInt_FromLong() call to
> > PyLong_FromUnsignedLong(), making zlib always return positive
> > numbers -- it might break some code on 32-bit platforms, but that
> > code is already broken on 64-bit platforms. But I guess I'm okay
> > with the long being changed into an actual 32-bit signed number on
> > 64-bit platforms, too.
>
> The struct module isn't what's broken here. All of the struct types
> have always had well-defined bit sizes and alignment if you
> explicitly specify an endian: >I and >L are 32 bits everywhere, and
> >Q is supported on platforms that don't have long long. The only
> thing that's changed is that it actually checks for errors
> consistently now.
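To make the sign issue in the quoted message concrete, here is a minimal sketch showing that the two integers are the same 32-bit pattern, read once as signed and once as unsigned. It assumes a modern Python 3 interpreter rather than the 2.5 development tree under discussion, and the helper names to_unsigned32 and to_signed32 are hypothetical, not part of zlib or struct.

```python
import struct

# The two values reported in the thread for the same CRC.
SIGNED_CRC = -271938108      # as seen with 32-bit longs
UNSIGNED_CRC = 4023029188    # as seen with 64-bit longs


def to_unsigned32(value):
    """Reduce an integer to the range 0 .. 2**32 - 1 (two's-complement wrap)."""
    return value & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret the low 32 bits of an integer as a signed 32-bit number."""
    value &= 0xFFFFFFFF
    if value >= 0x80000000:
        return value - 0x100000000
    return value


# Each value converts into the other, so they carry the same bits.
assert to_unsigned32(SIGNED_CRC) == UNSIGNED_CRC
assert to_signed32(UNSIGNED_CRC) == SIGNED_CRC

# Packing each value with the format that matches its signedness
# yields identical little-endian bytes.
packed_signed = struct.pack("<l", SIGNED_CRC)
packed_unsigned = struct.pack("<L", UNSIGNED_CRC)
assert packed_signed == packed_unsigned
```

Normalizing with one of these helpers before calling struct.pack is one way for calling code to stay independent of the platform's long size; whether zlib or struct should do that work instead is the question the thread is debating.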

Yes. And that breaks things. I'm certain the behaviour is used in real-world code (and I don't mean just the gzip module.) It has always, as far as I can remember, accepted 'unsigned' values for the signed versions of ints, longs and long-longs (but not chars or shorts.) I agree that that's wrong, but I don't think changing struct to do the right thing should be done in 2.5. I don't even think it should be done in 2.6 -- although 3.0 is fine.
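For illustration only, the lenient behaviour described above can be approximated in user code. This is a sketch under stated assumptions, not the old struct module's actual implementation: it targets Python 3, handles only the 32-bit little-endian case, and the names pack_int32_lenient and strict_pack_fails are hypothetical.

```python
import struct


def pack_int32_lenient(value):
    """Pack value as 4 little-endian bytes.

    Accepts both the signed range (-2**31 .. 2**31 - 1) and the
    unsigned range (0 .. 2**32 - 1); anything else is rejected.
    """
    if not -0x80000000 <= value <= 0xFFFFFFFF:
        raise struct.error("value does not fit in 32 bits")
    # Masking maps negative values onto their two's-complement pattern.
    return struct.pack("<L", value & 0xFFFFFFFF)


def strict_pack_fails(value):
    """Return True if struct.pack('<l', value) raises struct.error."""
    try:
        struct.pack("<l", value)
    except struct.error:
        return True
    return False


# A range-checking struct rejects the 'unsigned' value for a signed format...
assert strict_pack_fails(4023029188)
# ...while the lenient helper packs both spellings to the same bytes.
assert pack_int32_lenient(4023029188) == pack_int32_lenient(-271938108)
```

A wrapper like this keeps the tolerance in the caller, so existing code that passes 'unsigned' values can keep working even if struct itself becomes strict.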

-- Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


