[Python-Dev] Bad interaction of __index__ and sequence repeat
Travis Oliphant oliphant.travis at ieee.org
Mon Jul 31 20:28:09 CEST 2006
Nick Coghlan wrote:
> Nick Coghlan wrote:
>> Armin Rigo wrote:
>>> Hi,
>>>
>>> There is an oversight in the design of __index__() that only just
>>> surfaced :-(  It is responsible for the following behavior, on a
>>> 32-bit machine with >= 2GB of RAM:
>>>
>>>     >>> s = 'x' * (2**100)       # works!
>>>     >>> len(s)
>>>     2147483647
>>>
>>> This is because PySequence_Repeat(v, w) works by applying
>>> w.__index__ in order to call v->sq_repeat.  However, __index__ is
>>> defined to clip the result to fit in a Py_ssize_t.  This means that
>>> the above problem exists with all sequences, not just strings,
>>> given enough RAM to create such sequences with 2147483647 items.
>>>
>>> For reference, in 2.4 we correctly get an OverflowError.
>>>
>>> Argh!  What should be done about it?
>>
>> I've now got a patch on SF that aims to fix this properly [1].
>
> I revised this patch to further reduce the code duplication associated
> with the indexing code in the standard library.
>
> The patch now has three new functions in the abstract C API:
>
>   PyNumber_Index (used in a dozen or so places)
>     - raises IndexError on overflow
>   PyNumber_AsSsize_t (used in 3 places)
>     - raises OverflowError on overflow
>   PyNumber_AsClippedSsize_t() (used once, by _PyEval_SliceIndex)
>     - clips to PY_SSIZE_T_MIN/MAX on overflow
>
> All 3 have an int * output argument allowing type errors to be flagged
> directly to the caller rather than through PyErr_Occurred().
>
> Of the 3, only PyNumber_Index is exposed through the operator module.
>
> Probably the most interesting thing now would be for Travis to review
> it, and see whether it makes things easier to handle for the Numeric
> scalar types (given the amount of code the patch deleted from the
> builtin and standard library data types, hopefully the benefits to
> Numeric will be comparable).
I noticed most of the checks for PyInt were removed in the patch. If I remember correctly, I left these in for "optimization." Other than that, I think the patch is great.
As for helping NumPy, I think it will let us remove the special checks for all the different integer types, but this has not yet been done in the NumPy code.
-Travis
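
The three overflow policies listed in the quoted patch summary (raise IndexError, raise OverflowError, or clip) can be illustrated with a rough Python sketch. The helper names below are hypothetical stand-ins and not the C functions from the patch; the sketch assumes sys.maxsize plays the role of PY_SSIZE_T_MAX and leaves out the int * type-error output argument.

```python
import operator
import sys

# Assumption: sys.maxsize stands in for PY_SSIZE_T_MAX on this build.
SSIZE_MAX = sys.maxsize
SSIZE_MIN = -sys.maxsize - 1


def index_or_index_error(obj):
    """Hypothetical analogue of the IndexError-on-overflow policy."""
    n = operator.index(obj)  # calls obj.__index__(); TypeError if absent
    if not SSIZE_MIN <= n <= SSIZE_MAX:
        raise IndexError("index does not fit in a Py_ssize_t")
    return n


def as_ssize_t(obj):
    """Hypothetical analogue of the OverflowError-on-overflow policy."""
    n = operator.index(obj)
    if not SSIZE_MIN <= n <= SSIZE_MAX:
        raise OverflowError("value does not fit in a Py_ssize_t")
    return n


def as_clipped_ssize_t(obj):
    """Hypothetical analogue of the clipping policy used for slice bounds."""
    n = operator.index(obj)
    return max(SSIZE_MIN, min(SSIZE_MAX, n))
```

Under this sketch, a sequence repeat count would go through something like as_ssize_t(), so an oversized count fails loudly, while slice bounds would go through as_clipped_ssize_t(), where silently clamping to the ends of the sequence is the expected behavior.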