[Python-Dev] Tagged integers

James Y Knight foom at fuhm.net
Wed Jul 14 08:41:19 CEST 2004


So I was saying to someone the other day, "Gee, I wonder why Python doesn't use tagged integers; it seems like it would be a lot faster than allocating new objects all the time," and they said, "Cause you'd have to change everything, too much work!" and I said, "Nah, you only need to change a few things to use macros; it'd only take a few hours, mostly query-replace."

So, of course, I had to do it then, and it only took a couple hours, and appears to be at least somewhat faster.

On the test that probably puts my change in the most positive light possible: "x = 0; while x < 50000000: x = x + 1", it achieves about a 50% increase in speed. More normal integer-heavy things seem to be at most 20% faster.

My implementation works as follows:

So, why doesn't Python already use tagged integers? Surely someone's thought to "just do it" before? I only see discussion of it in relation to PyPy.

A couple things:

Here's the patch I have against Python-2.3.3. Please note this is just a couple-hour hack; it may have errors. Most of the diff is quite boring and repetitious. <http://fuhm.net/~jknight/python233-tagint.diff.gz>.

So, any thoughts? Worth continuing on with this? If this is something that people are interested in actually doing, I could update the patch against latest CVS and put the changes in #ifdefs so it's compile-time selectable.

Thoughts for future development: There is space available for 2 more tagged data types. Could they be productively used? Perhaps one for single-element tuples. Perhaps one for single-character unicode strings. Dunno if those are easily doable and would actually increase performance.
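
For reference, the two-bit tag space could be laid out roughly as in the sketch below. Only the 0x03 mask and the convention that tag 0 means an ordinary pointer come from the macros in the patch excerpt; the enum names and the choice of 1 for integers are assumptions made for illustration, not what the patch necessarily does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Two low bits give four tag values. The names below, and the choice of
   1 for integers, are hypothetical; the patch excerpt only shows the
   0x03 mask and that tag 0 marks a real pointer. */
enum {
    TAG_MASK    = 0x03,
    TAG_POINTER = 0,  /* ordinary heap-allocated object */
    TAG_INT     = 1,  /* assumed tag for small integers */
    TAG_SPARE_A = 2,  /* free for another tagged type */
    TAG_SPARE_B = 3   /* free for another tagged type */
};

/* Extract the tag from the low bits of a pointer-sized word. */
static unsigned tag_of(const void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}
```

This only works if every real object pointer is at least 4-byte aligned, so that its low two bits are always zero and tag 0 is never ambiguous.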

James

PS: Here are the interesting portions of the changes. Yes, I realize typeof() and ({ are GCC extensions, but this was the most straightforward way to implement inline expression macros that may use their arguments more than once. Maybe they should be inline functions instead of macros?

===== object.h

#define Py_OBJ_TAG(op) (((Py_uintptr_t)(op)) & 0x03)

#define Py_GETTYPE(op) ({typeof((op)) __op = (op); \
    (!Py_OBJ_TAG(__op))?__op->ob_type:Py_tagged_types[Py_OBJ_TAG(__op)]; })

#define Py_GETREF(op) ({typeof((op)) __op = (op); \
    Py_OBJ_TAG(__op)?1:__op->ob_refcnt; })

#define Py_SETREF(op, val) ({typeof((op)) __op = (op); \
    if(!Py_OBJ_TAG(__op)) \
        __op->ob_refcnt = (val); \
    })

#define Py_ADDREF(op, amt) ({typeof((op)) ___op = (op); \
    if(!Py_OBJ_TAG(___op)) \
        ___op->ob_refcnt += (amt); \
    Py_GETREF(___op); \
    })

#define Py_INCREF(op) ( \
    _Py_INC_REFTOTAL _Py_REF_DEBUG_COMMA \
    Py_ADDREF(op, 1), (void)0)

#define Py_DECREF(op) \
    if (_Py_DEC_REFTOTAL _Py_REF_DEBUG_COMMA \
        Py_ADDREF(op, -1) != 0) \
        _Py_CHECK_REFCNT(op) \
    else \
        _Py_Dealloc((PyObject *)(op))
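
On the question of inline functions instead of macros: the refcount helpers might look something like the sketch below, which avoids the GCC-only typeof() and ({ ... }) constructs. This is a minimal illustration against a made-up mock_object struct, not the real PyObject, and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PyObject; only the refcount matters here. */
typedef struct mock_object {
    long ob_refcnt;
} mock_object;

/* Low two bits of the pointer, as in Py_OBJ_TAG. */
static inline int obj_tag(const mock_object *op)
{
    return (int)((uintptr_t)op & 0x03);
}

/* Like Py_GETREF: tagged values always report a refcount of 1. */
static inline long obj_getref(const mock_object *op)
{
    return obj_tag(op) ? 1 : op->ob_refcnt;
}

/* Like Py_ADDREF: adjust the count only for real objects, then return
   the resulting count. Tagged values are never dereferenced. */
static inline long obj_addref(mock_object *op, long amt)
{
    if (!obj_tag(op))
        op->ob_refcnt += amt;
    return obj_getref(op);
}

/* Returns 1 when the caller should deallocate, 0 otherwise. Because a
   tagged value always reports 1, it is never deallocated. */
static inline int obj_decref(mock_object *op)
{
    return obj_addref(op, -1) == 0;
}
```

Since each argument is an ordinary function parameter, it is evaluated exactly once, which is the property the statement-expression macros were emulating.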

===== intobject.h

#define PyInt_AS_LONG(op) ({typeof((op)) __op = (op); \
    Py_OBJ_TAG(__op)?(((long)__op) >> 2):((PyIntObject *)__op)->ob_ival; })
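
To make the encoding behind PyInt_AS_LONG concrete, here is a small standalone sketch of packing an integer into a pointer-sized word and getting it back out. The tag value 1 and all names are assumptions for illustration; the excerpt above only shows the decode side (a right shift by 2).

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 2
#define INT_TAG  1  /* assumed tag value; not shown in the excerpt */

/* Two bits go to the tag, so the payload has two fewer bits than a pointer. */
#define TAGGED_MAX (INTPTR_MAX >> TAG_BITS)
#define TAGGED_MIN (-TAGGED_MAX - 1)

/* Values outside this range would need a normal heap-allocated int. */
static int fits_tagged(intptr_t v)
{
    return v >= TAGGED_MIN && v <= TAGGED_MAX;
}

/* Encode: shift as unsigned (left-shifting a negative signed value is
   undefined in C), then set the tag bits. Caller must check fits_tagged. */
static void *tag_int(intptr_t v)
{
    return (void *)(((uintptr_t)v << TAG_BITS) | INT_TAG);
}

/* Decode: same idea as PyInt_AS_LONG. Relies on >> of a negative signed
   value being an arithmetic shift, which GCC provides but ISO C leaves
   implementation-defined. */
static intptr_t untag_int(const void *p)
{
    return (intptr_t)(uintptr_t)p >> TAG_BITS;
}
```

On a 32-bit build this leaves 30 bits for the value, so anything larger still has to fall back to an ordinary allocated integer object.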


