[Python-Dev] More int/long integration issues

Chad Netzer cnetzer@mail.arc.nasa.gov
13 Mar 2003 12:25:17 -0800


On Thu, 2003-03-13 at 08:42, Aahz wrote:

> On Thu, Mar 13, 2003, David Abrahams wrote:
>
> > Now that we have a kind of long/int integration, maybe it makes sense
> > to update xrange()? Or is that really a 2.4 feature?
>
> IIRC, it was decided that doing that wouldn't make sense until the
> standard sequences (lists/tuples) can support more than 2**31 items.

I'm working on a patch that allows both range() and xrange() to work with large (PyLong) values. Currently, with my patch, the length of a range is still limited to a C long (due to memory issues anyway), and xrange() could conceptually support longer sequences, although indexing them is still limited to C int indices.

I noticed the need for at least supporting long values when I found some bugs in code that did things like:

    a = 1/1e-5
    range(a-20, a)

or

    a = 1/1e-6
    b = 1/1e-5
    c = 1/1e-4
    range(a, b, c)

Now, this example is hardcoded, but in graphing software, or other numerical work, the actual values come from the data set. All of a sudden, you could be dealing with very small numbers (say, because you want to examine error values), and you get:

    a = 1/1e-21
    b = 1/1e-20
    c = 1/1e-19
    range(a, b, c)

And your piece of code now fails. Judging by the comments I've seen, this failure tends to come as a big surprise (people simply expect range() to be able to work with PyLong values over short lengths).
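For a sense of scale, here is a quick check (a sketch assuming a 32-bit C long, which was typical at the time) showing why endpoints like these overflow:

```python
# Largest value a signed 32-bit C long can hold (assumed word size).
C_LONG_MAX_32 = 2**31 - 1

# The endpoints from the example above are floats whose magnitudes
# dwarf the C long limit, which is why the old range() rejects them.
a = 1 / 1e-21   # 1e+21
b = 1 / 1e-20   # 1e+20
assert a > C_LONG_MAX_32
assert b > C_LONG_MAX_32
```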

Also, someone working with large files (> C long on his machine) claimed to be having problems with xrange() failing (although, if he is indexing the xrange object, my patch can't help anyway).

I've seen enough people asking in the newsgroups about this behavior (at least four in the past five months or so), and I've submitted some application patches to make things work in these cases (i.e., by explicitly subtracting out the large common base of each parameter, and adding it back in after the list is generated). So I decided to make a patch to change the range() behavior.
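The application-level workaround mentioned above can be sketched as follows (shifted_range is a hypothetical helper name, not from the actual patches; modern Pythons no longer need it, but it shows the idea):

```python
def shifted_range(start, stop, step=1):
    # Workaround sketch: subtract the large common base (start) from
    # the parameters so range() only ever sees small values, then add
    # the base back to each element of the generated list.
    base = start
    small = range(0, int(stop - base), int(step))  # values fit in a machine word
    return [base + i for i in small]

huge = 10**21  # far larger than any C long
print(shifted_range(huge, huge + 10, 2))
```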

Fixing range() was relatively easy, and could be done with no performance penalty (the code to handle long ranges is only invoked after the existing code path fails; the common case is unaltered). Fixing xrange() is trickier, and I'm opting to maintain backwards compatibility as much as possible.
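The "fall back only after the fast path fails" strategy can be illustrated with a small sketch (the helper names here are hypothetical stand-ins, not the actual C patch):

```python
C_LONG_MAX = 2**31 - 1  # assume a 32-bit C long, as was common in 2003

def _fast_c_range(start, stop, step=1):
    # Stand-in for the existing C implementation: it refuses any
    # parameter that does not fit in a C long.
    for v in (start, stop, step):
        if not (-C_LONG_MAX - 1 <= v <= C_LONG_MAX):
            raise OverflowError("value out of C long range")
    return list(range(start, stop, step))

def _long_range(start, stop, step=1):
    # Stand-in for the long-capable path: arbitrary precision, slower.
    out, v = [], start
    while (step > 0 and v < stop) or (step < 0 and v > stop):
        out.append(v)
        v += step
    return out

def range_compat(start, stop, step=1):
    # The patch's strategy: the common case takes the fast path
    # unaltered; the long path runs only after an overflow.
    try:
        return _fast_c_range(start, stop, step)
    except OverflowError:
        return _long_range(start, stop, step)
```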

In any case, I should have the patch ready to submit within the next week or so (just a few hours more work is needed for testing and cleanup).

Then the argument about whether it should ever be included can begin in earnest. But I have seen enough examples of people surprised that ranges of long values fail (where the range length is well within the addressable limit, but the range values must be PyLongs) that I think at least range() should be fixed. And if range() is fixed, then sadly, xrange() should be fixed as well (IMO).

BTW, I'm all for deprecating xrange() with all deliberate speed. Doing so would only make updating range behavior easier.

Chad