[Python-Dev] Speeding up instance attribute access (original) (raw)

Guido van Rossum guido@python.org
Fri, 08 Feb 2002 14🔞20 -0500

Previous message: [Python-Dev] Speeding up instance attribute access
Next message: [Python-Dev] Speeding up instance attribute access
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

(By mistake I sent an incomplete version of this earlier. Please ignore that and read this instead.)

Inspired by the second half of Jeremy's talk on DevDay, here's my alternative approach for speeding up instance attribute access. Like my idea for globals, it uses double indirection rather than recompilation.

We only care about attributes of 'self' (which is identified as the first argument of a method, not by name). We can exclude functions from our analysis that make any assignment to self -- this is extremely rare and would throw off our analysis. We should also exclude static methods and class methods, since their first argument doesn't have the same role.
Static analysis of the source code of a class (without access to the base class) can determine attributes of the class, and to some extent instance variables. Without also analyzing the base classes, this analysis cannot reliably distinguish between instance variables and methods inherited from a base class; it can distinguish between instance variables and methods defined in the current class.
We can guess the status of un-assigned-to inherited attributes by seeing whether they are called or not. This is not 100% accurate, so we need things to work (if slower) even when we guess wrong.
For instance variable references and stores of the form self., the bytecode compiler emits opcodes LOAD_SELF_IVAR and STORE_SELF_IVAR , where is a small int identifying the instance variable (ivar). A particular ivar is identified by the same throughout all methods defined in the same class statement, but there is no attempt to coordinate this across different classes related by inheritance.
Some data structure describing the mapping from to attribute name, and whether it's an ivar or a method, is produced by the compiler and stored in the class, in a way that a user can't change it. The function objects representing methods also contain a pointer to this data structure. (Or the code objects? But it needs to be shared. Details, details.)
At run time, when a class object is created, another data structure is created that accumulates the -to-name mappings from that class and all its base classes. This has more accurate information because it can collect information from the bases (though it can still be fooled by dynamic manipulations of classes). In particular, it has the canonical set of known instance variables for instances of this class. This is stored in the class object, in a way that a user can't change it.
When an instance is created, the run-time data structure stored in the class is consulted to know how many instance variables to allocate. An array is allocated at the end of the instance (or in a separately allocated block?) with one PyObject pointer for each known instance variable. There's also a pointer to the instance dict, to hold instance variables that the parser didn't spot, but this pointer starts off as NULL -- a dictionary is created for it only when needed.
There is no requirement that the layout of the ivar array for instances of a subclass is an extension of the layout of the ivar array for its base classes. But there is a requirement that all instances of the same class that are not instances of a subclass (i.e., all x and y where x.class is y.class) have the same layout, and this layout is determined by the run-time data structure stored in the class.
Now all we need is an efficient way to map LOAD_SELF_IVAR to an index in the array of ivars. Two different classes play a role here: the class of self (the run-time class) and the class that defined the method containing the LOAD_SELF_IVAR opcode (the compile-time class). We assume the run-time class is a subclass of the compile-time class. The correct mapping can easily be calculated from the data structures left behind by the compiler in the compile-time class and by class construction at run time in the run-time class. Since this mapping doesn't change, it can be calculated once and then cached. The cache can be held in the run-time class; it can be a dictionary whose keys are compile-time classes. This means a single dict lookup that must be done once per method call (and only when LOAD_SELF_IVAR or STORE_SELF_IVAR is used). We could save even that dict lookup in most cases by caching the run-time class and the outcome with the method.
We need fallbacks for various exceptional cases:
- If the compile-time class uses LOAD_SELF_IVAR but the run-time class doesn't think that is an instance variable, LOAD_SELF_IVAR must fall back to look in the instance dict (if non-NULL) and then down the run-time class and its base classes.
- If the ivar slot in the instance corresponding to exists but is NULL, LOAD_SELF_IVAR must fall back to searching the run-time class and its base classes.
For both cases, the mapping from to attribute names must be available. The language and the current code generation guarantee that 'self' is the first local variable.
The instance dict must become a proxy that knows whether a given name is stored in the array of ivars or in the overflow dict; this is much like Jeremy's DLict.
Note that there are two sources of savings: the major savings (probably) comes from avoiding a dict lookup for every ivar access; an additional minor savings comes from collapsing two opcodes:
```
LOAD_FAST       0 (self)
LOAD_ATTR       1 (foo)
```
into one:
```
LOAD_SELF_IVAR  0 (self.foo)
```
I don't know if we should try to generate LOAD_SELF_IVAR only for things that really are (likely) ivars, or for all attributes. Maybe it would be nice if we also had a single-opcode way to express a method call on self, e.g. CALL_SELF_METHOD , , where identifies the method, and and are the number of positional and keyword arguments. Or maybe we should just have LOAD_SELF_METHOD which looks in the instance overflow dict (but only if non-NULL) and in the class and bases, but can avoid looking in the ivars array if does not describe a known ivar (again this information can be cached).
The required global analysis is a bit hairy, and not something we already do. I believe that Jeremy thinks PyChecker already does this; I'm not sure if that means we can borrow code or just ideas.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Previous message: [Python-Dev] Speeding up instance attribute access
Next message: [Python-Dev] Speeding up instance attribute access
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]