Issue 1153056: PyXxx_Check() speed-up - Python tracker (original) (raw)

Created on 2005-02-27 21:14 by arigo, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
typecheck_fast.diff arigo,2005-02-27 21:14 patch #1
typecheck_fast_2.diff arigo,2005-12-14 13:11 updated to svn head
Messages (11)
msg47858 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-02-27 21:14
Once upon a time, the macros PyInt_Check(), PyString_Check(), etc., used to be very fast: just a pointer comparison. But 2.2 came. Suddenly, types could inherit from each other. Thus the macros became: (ob->type == &PyXxx_Check | PyType_IsSubType(ob->type, &PyXxx_Check) Subtyping of built-in types still being rare, the common cases here are: (1) ob is indeed exactly of type Xxx, (2) ob is absolutely not an instance of Xxx. The macro is fast for case 1 but becomes a call to a slow function in case 2. It sounds minor -- but a program can make half a million "case 2" calls in 5 seconds. The following patch proposes to fix that, based on a few measurements that show which are the more common "case 2" calls in a few applications. It adds a family of type flags meaning "this type is not at all an Xxx", for Xxx in [module, tuple, float, int, unicode, long, complex, str]. I cannot measure the performance gain on this machine (figures vary all the time) but it looks like some 0.2s for the above-mentioned 5s program.
msg47859 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2005-02-27 23:44
Logged In: YES user_id=21627 I would rather see a constant-time subtype check implemented in Python, using strategies that other interpreters have been using for a long time. Essentially, each type gets a unique number (e.g. in the order of PyType_Ready calls), and a bit-field of all supertypes, with a bit set for each type that is one of its supertypes. Creation of the bitfield could be delayed until a subtype check is actually needed. The tricky part is to update the supertype bitmask when __super__ is assigned.
msg47860 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-02-27 23:56
Logged In: YES user_id=4771 I'm not sure I understand your suggestion. Would each type have an array of N bits, where N is the total number of types that exist? That looks quite unreasonable in any program that builds a lot of types (I know some that build them programatically in loops). What I did is a limited version of that: a bitfield with room for only some commonly checked-for superclasses. (With some care, my patch could be modified to build the bitfield in PyType_Ready(); this would allow the PyXxx_Check() macros to become a single bit check operation.) Changing the base classes isn't an issue in my limited patch because you cannot ever add or remove the core built-in types I check for from the bases of an existing type.
msg47861 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2005-02-28 06:20
Logged In: YES user_id=21627 No. Each type would have an array of M bits, where M is the maximum number of a super class; each type would also store the value of M. As types are created and deleted dynamically, it might be necessary to keep track of assigned type numbers, and recycle them as types are deleted.
msg47862 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-02-28 09:26
Logged In: YES user_id=4771 Ah, but that's not so different: M can still be arbitrarily large... However, see also bug #1153075. I think that PyInt_Check() using PyType_IsSubtype() is essentially buggy: the purpose of PyInt_Check() is to check if the PyObject* is a PyIntObject*, not if its type pretends to have 'int' in its mro. If we fix this, then making PyType_IsSubtype() faster will become unrelated to making PyXxx_Check() faster. If we want to fix PyXxx_Check() for the bug described above and make it fast too, then we probably only need an array of N bits, where N is the number of C-level PyXxxObject structures. The idea of only including the most common such structures instead of all of them was based on simplicity and performance (it's faster and easier to check for a known bit than for a bit which itself is computed at run-time). If we want to make PyType_IsSubtype() fast, it might be worth the effort or not; we should measure it. But I think it's unrelated.
msg47863 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2005-04-14 08:05
Logged In: YES user_id=6656 (I haven't seen any mail about this patch!) > array of N bits, where N is the number of C-level PyXxxObject > structures That's unbounded too, in the presence of extension modules. (I still think my fix for #1153075 suffices, and is probably a good thing). I can believe that special casing int and str subclasses is worth it -- but where to stop?
msg47864 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-14 09:22
Logged In: YES user_id=4771 I am convinced now by #1153075 that PyCheck_Xxx should not be semantically changed, and really means walking the mro. So the patch here remains valid. Yes, I know that N is unbounded, that's why the next sentence was "let's only include the most common of them". Where to stop? Don't guess, measure. That's what I did, and in a few applications there were systematically a set of types that were PyChecked quite more often than the others. So my patch special-cases exactly these types.
msg47865 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2005-04-14 10:57
Logged In: YES user_id=6656 I suppose I should read the patch then :) It seems very odd that _PyType_FastTypeCheck mutates a->tp_flags. In fact, shoudn't it be possible to not have _PyType_FastTypeCheck at all?
msg47866 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-04-14 13:48
Logged In: YES user_id=4771 There might be problems with types that don't go through PyType_Ready(). We can't just compute the correct flags in there and do a single bit test in PyXxx_Check(), as far as I can see. Hence the idea to add the flag to a type only when it is dynamically found not to be a subtype of the given base type. I can't think of anything bad happening because of this mutation...
msg47867 - (view) Author: Björn Lindqvist (sonderblade) Date: 2005-05-08 23:26
Logged In: YES user_id=51702 With pystone.py I could measure a speedup of atleast 1000 pystones. From 36764 stones without the patch to 38461 with the patch. Granted, I didn't test very scientifically but the speedup is noticeable.
msg47868 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-12-14 13:11
Logged In: YES user_id=4771 I measured the performance gain more precisely. On a couple of modern Pentium 4s it makes no difference at all. On my old laptop (Pentium III 700Mhz) results are a bit inconsistent. As usual, Pystone shows speed-ups but not really anything else. So I'm just closing this. For reference I attach the patch updated to the svn head.
History
Date User Action Args
2022-04-11 14:56:09 admin set github: 41629
2005-02-27 21:14:15 arigo create