[Python-3000] long/int unification
martin at v.loewis.de
Fri Aug 25 03:49:55 CEST 2006
- Previous message: [Python-3000] Removing 'old-style' ('simple') slices from Py3K.
- Next message: [Python-3000] long/int unification
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Here is a quick status of the int_unification branch, summarizing what I did at the Google sprint in NYC.
- the int type has been dropped; the builtins int and long now both refer to the long type
- all PyInt_* API is forwarded to the PyLong_* API. Few changes to the C code are necessary; the most common offender is PyInt_AS_LONG((PyIntObject*)v), since I completely removed PyIntObject.
- Much of the test suite passes, although it still has a number of bugs.
- There are timing tests for allocation and for addition. On allocation, the current implementation is about a factor of 2 slower; integer addition is about 1.5 times slower; the initial slowdown was a factor of 3. The pystones dropped about 10% (pybench fails to run on p3yk).
A couple of interesting observations:
- bool was a subtype of int, and is now a subtype of long. In order to avoid knowing the internal representation of long, the bool type compares addresses against Py_True and Py_False, instead of looking at ob_ival.
- to add the small ints cache, an array of statically allocated longs is used, rather than heap-allocating them.
- after adding the small ints cache, a lot of things broke, e.g. for code like
    py> x = 4
    py> x = -4
    py> x
    -4
    py> 4
    -4
  This happened because long methods just toggle the sign of the object they got, messing up the small ints cache.
- to further speed up the implementation, I added special casing for one-digit numbers. As they are always in range(-32767,32768), the arithmetic operations don't need overflow checking anymore (even multiplication won't overflow a 32-bit int).
- I found that in 2.x, long objects overallocate 2 bytes on a 32-bit machine, and 6 bytes on a 64-bit machine, because sizeof(PyLongObject) rounds up.
- pickle and marshal have been changed to deal with the loss of int; pickle generates INT codes even for longs now provided the value is in the range for the code.
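The small ints cache breakage described above can be modelled in a few lines of Python. This is only a toy sketch, not the branch's C code: ToyLong, SmallIntCache, negate_in_place and negate_safely are hypothetical names, and the cached range is an arbitrary assumption. It shows why flipping the sign of an object in place is only safe when nobody else shares that object.

```python
class ToyLong:
    """Toy stand-in for a long object: a sign plus a magnitude (hypothetical)."""

    def __init__(self, value):
        self.sign = -1 if value < 0 else 1
        self.magnitude = abs(value)

    def value(self):
        return self.sign * self.magnitude


class SmallIntCache:
    """Preallocated, shared objects for small values (range is an assumption)."""

    def __init__(self, low=-5, high=257):
        self._objects = {n: ToyLong(n) for n in range(low, high)}

    def get(self, n):
        # Small values return the shared object; others get a fresh one.
        cached = self._objects.get(n)
        return cached if cached is not None else ToyLong(n)


def negate_in_place(obj):
    # Buggy once a cache exists: the object may be shared with every
    # other user of the same small value.
    obj.sign = -obj.sign
    return obj


def negate_safely(cache, obj):
    # Leaves the operand alone and asks the cache for the negated value.
    return cache.get(-obj.value())


def demo():
    buggy = SmallIntCache()
    negate_in_place(buggy.get(4))        # corrupts the shared "4"
    corrupted = buggy.get(4).value()

    fixed = SmallIntCache()
    negate_safely(fixed, fixed.get(4))   # shared "4" is untouched
    intact = fixed.get(4).value()
    return corrupted, intact


print(demo())
```

In the buggy variant, asking the cache for 4 afterwards yields an object whose value is -4, which matches the session shown in the list above.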
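The claim that one-digit arithmetic cannot overflow a 32-bit int can be checked with plain arithmetic. The sketch below assumes a 15-bit digit, which is what range(-32767,32768) implies; the constant and function names are made up for illustration. Since sums, differences and products take their extreme values at the corners of the operand range, testing the corners is enough.

```python
DIGIT_BITS = 15                      # assumed digit width
DIGIT_MAX = (1 << DIGIT_BITS) - 1    # 32767, largest one-digit magnitude
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def worst_case_results():
    """Results of +, - and * at the corners of the one-digit range."""
    extremes = (-DIGIT_MAX, DIGIT_MAX)
    results = []
    for a in extremes:
        for b in extremes:
            results.extend((a + b, a - b, a * b))
    return results


def single_digit_ops_fit_int32():
    return all(fits_int32(r) for r in worst_case_results())


print(single_digit_ops_fit_int32(), max(worst_case_results()))
```

The largest result is 32767 * 32767 = 1073676289, roughly half of the signed 32-bit maximum of 2147483647.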
I'm not sure whether this performance change is acceptable; at this point, I'm running out of ideas for how to further improve the performance. Using a plain 32-bit int as the representation could be another try, but I somewhat doubt it helps, given that the supposedly-simpler single-digit case is so slow.
Regards, Martin