[Python-Dev] int vs ssize_t in unicode

Neal Norwitz nnorwitz at gmail.com
Fri Apr 14 09:10:30 CEST 2006


On 4/13/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> Neal Norwitz wrote:
> > I just grepped for INT_MAX and there's a ton of them still (well 83 in
> > */*.c). Some aren't an issue like posixmodule.c, those are
> > _SC_INT_MAX. marshal is probably ok, but all uses should be verified.
> > Really all uses of {INT,LONG}_{MIN,MAX} should be verified and
> > converted to PY_SSIZE_T_{MIN,MAX} as appropriate.
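As a rough illustration of why a size checked against INT_MAX is a problem on 64-bit builds, the sketch below uses ctypes to mimic storing a length in a C int versus a pointer-sized signed integer. The helper names are hypothetical, ctypes.c_ssize_t is only a stand-in for Py_ssize_t, and the INT_MAX value assumes a 32-bit C int.

```python
import ctypes

# Assumption: C int is 32 bits wide, as on common 32- and 64-bit platforms.
INT_MAX = 2**31 - 1


def as_c_int(n):
    """Return n as it reads back after being stored in a C int.

    ctypes integer types do no overflow checking, so the value wraps
    the same way an unchecked (int) cast would.
    """
    return ctypes.c_int(n).value


def as_ssize_t(n):
    """Return n as it reads back from a pointer-sized signed integer."""
    return ctypes.c_ssize_t(n).value


size = INT_MAX + 1        # one element past the 2 GB boundary
print(as_c_int(size))     # wraps to a negative number where int is 32 bits
print(as_ssize_t(size))   # preserved on a 64-bit build
```

A length that goes negative this way is what an unverified INT_MAX check or (int) cast can produce once an object grows past 2 GB.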

BTW, it would be great if someone could try to put together some tests for bigmem machines. I'll add it to the todo wiki. The tests should be broken up by those that require 2+ GB of memory, those that take 4+, etc. Many people won't have boxes with that much memory.

The test cases should test all methods (don't forget slicing operations) at boundary points, particularly just before and after 2GB. Strings are probably the easiest. There's unicode too. lists, dicts are good but will take more than 16 GB of RAM, so those can be pushed out some.
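A minimal sketch of how such tests might be grouped by memory requirement and aimed at the 2 GB boundary. Everything here is an assumption for illustration: the MEMORY_LIMIT setting, the requires_memory decorator, and boundary_sizes are hypothetical names, and the memory estimate is a rough guess with headroom.

```python
import unittest

_2G = 2**31  # where a 32-bit signed int overflows

# Hypothetical setting: bytes of memory the tests may use on this machine.
# 0 means every big-memory test is skipped.
MEMORY_LIMIT = 0


def requires_memory(nbytes):
    """Skip the decorated test unless MEMORY_LIMIT allows nbytes."""
    return unittest.skipUnless(
        MEMORY_LIMIT >= nbytes,
        "needs about %d bytes of memory" % nbytes)


def boundary_sizes(boundary=_2G):
    """Sizes just before, at, and just after the boundary."""
    return [boundary - 1, boundary, boundary + 1]


class StrBoundaryTest(unittest.TestCase):

    @requires_memory(3 * 2**30)  # rough estimate: one 2 GB string plus headroom
    def test_len_and_slice_near_2g(self):
        for size in boundary_sizes():
            s = "a" * size
            self.assertEqual(len(s), size)
            # Slice starting just below the boundary; the result is tiny.
            self.assertEqual(len(s[_2G - 2:]), size - (_2G - 2))
            self.assertEqual(s[-1], "a")
            del s  # release before allocating the next size
```

With MEMORY_LIMIT raised on a large machine, the same file runs the 2 GB tier; tests needing 4 GB or more would carry a larger requires_memory value and stay skipped elsewhere.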

I have some machines available for testing.

> I replaced all the trivial ones; the remaining ones are (IMO) more
> involved, or correct. In particular:
>
> - collectionsmodule: deque is still restricted to 2GiB elements
> - cPickle: pickling does not support huge strings (and probably
>   shouldn't); likewise marshal
> - sre is still limited to INT_MAX completely

I've got outstanding changes somewhere to clean up _sre.

> - not sure why the mbcs codec is restricted to INT_MAX; somebody should
>   check the Win64 API whether the restriction can be removed (most
>   likely, it can)
> - PyObject_CallFunction must be duplicated for PY_SSIZE_T_CLEAN, then
>   exceptions.c can be extended.

My new favorite static analysis tool is grep:

  grep '(int)' */*.c | egrep -v 'sizeof(int)' | wc -l
  418

I know a bunch of those aren't problematic, but a bunch are. Same with long casts.
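The same kind of rough count can be sketched in Python. This is only an approximation of the idea, not the command used for the figure quoted here: the function name and sample lines are made up, and the parentheses are escaped in both patterns, so results may differ from a shell pipeline where unescaped parentheses in an extended regex act as a group.

```python
import re

CAST = re.compile(r"\(int\)")          # a literal "(int)"
SIZEOF = re.compile(r"sizeof\(int\)")  # a literal "sizeof(int)"


def count_int_cast_lines(lines):
    """Count lines that contain "(int)" but not "sizeof(int)".

    Like any line-based filter, this drops a line that has both a real
    cast and a sizeof(int), so it is a starting point for review only.
    """
    return sum(1 for line in lines
               if CAST.search(line) and not SIZEOF.search(line))


# Made-up sample lines, for illustration only.
sample = [
    "n = (int)strlen(s);",
    "buf = malloc(n * sizeof(int));",
    "len = (int)PyString_GET_SIZE(v);",
    "x = (long)y;",
]
print(count_int_cast_lines(sample))
```

Each hit still has to be read by hand to decide whether the cast can truncate a size.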

n


