[Python-3000] Draft PEP: Dropping PyObject_HEAD

Brett Cannon brett at python.org
Sat Apr 28 01:30:50 CEST 2007


Second PEP today. Martin is on a roll! =)

On 4/27/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:

I propose the PEP below for Py3k.

Regards,
Martin

PEP: 3122
Title: Dropping PyObject_HEAD
Version: $Revision: 54998 $
Last-Modified: $Date: 2007-04-27 10:31:58 +0200 (Fr, 27 Apr 2007) $
Author: Martin v. Löwis <martin at v.loewis.de>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Apr-2007
Python-Version: 3.0
Post-History:

Abstract
========

Python currently relies on undefined C behavior, with its usage of PyObject_HEAD. This PEP proposes to change that into standard C.

Rationale
=========

Standard C defines that an object must be accessed only through a pointer of its type, and that all other accesses are undefined behavior, with a few exceptions. In particular, the following code has undefined behavior::

  struct FooObject{
    PyObject_HEAD
    int data;
  };

  PyObject *foo(struct FooObject *f){
    return (PyObject*)f;
  }

  int bar(){
    struct FooObject *f = malloc(sizeof(struct FooObject));
    struct PyObject *o = foo(f);
    f->ob_refcnt = 0;
    o->ob_refcnt = 1;
    return f->ob_refcnt;
  }

The problem here is that the storage is both accessed as if it where struct PyObject, and as struct FooObject.

Historically, compilers did not cause any problems with this

Reads easier if you replace "cause" with "have".

code. However, modern compiler use that clause as an

Probably want to pluralize "compiler".

Your use of "clause" really confused me until I realized what you were talking about.

optimization opportunity, finding that f->ob_refcnt and o->ob_refcnt cannot possibly refer to the same memory, and that therefore the function should return 0, without having to fetch the value of ob_refcnt at all in the return statement. For GCC, Python now uses -fno-strict-aliasing to work around that problem; with other compilers, it may just see undefined behavior. Even with GCC, using -fno-strict-aliasing may pessimize the generated code unnecessarily.

Specification
=============

Standard C has one specific exception to its aliasing rules precisely designed to support the case of Python: a value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int, the struct* may also be cast to an int*, allowing to write int values into the first field.

For Python, PyObject_HEAD and PyObject_VAR_HEAD will be dropped, and PyObject gets defined to contain all fields explicitly::

  typedef struct _object{
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
  }PyObject;

  typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size;
  } PyVarObject;

Types defined as fixed-size structure will then include PyObject as its first field; variable-sized objects PyVarObject. E.g.::

  typedef struct{
    PyObject ob_base;
    PyObject *start, *stop, *step;
  } PySliceObject;

  typedef struct{
    PyVarObject ob_base;
    PyObject **ob_item;
    Py_ssize_t allocated;
  } PyListObject;

As a convention, the base field SHOULD be called ob_base. However, all accesses to ob_refcnt and ob_type MUST cast the object pointer to PyObject* (unless the pointer is already known to have that type), and SHOULD use the respective accessor macros. To simplify access to ob_type, a macro::

  #define Py_Type(o) (((PyObject*)o)->ob_type)

is added.
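The first-member rule described in the Specification can be tried out without the Python headers. The following is a minimal sketch, not CPython code: DemoObject, DemoType, FooObject, DEMO_TYPE, DEMO_REFCNT, foo_new and foo_incref_and_read are all hypothetical stand-ins invented for illustration, and ptrdiff_t is used in place of Py_ssize_t. It shows a write through the base pointer being read back through the embedded ob_base member, which standard C permits because the base struct is the first member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a type object and for PyObject;
   these are illustrative names, not CPython's real definitions. */
typedef struct demo_type {
    const char *tp_name;
} DemoType;

typedef struct demo_object {
    ptrdiff_t ob_refcnt;
    DemoType *ob_type;
} DemoObject;

/* A fixed-size object embeds the base struct as its first member. */
typedef struct {
    DemoObject ob_base;
    int data;
} FooObject;

/* Accessor macros in the spirit of the draft's Py_Type(o). */
#define DEMO_TYPE(o)   (((DemoObject *)(o))->ob_type)
#define DEMO_REFCNT(o) (((DemoObject *)(o))->ob_refcnt)

DemoType Foo_Type = { "Foo" };

/* Allocate a FooObject with a reference count of 1. */
FooObject *foo_new(int data)
{
    FooObject *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->ob_base.ob_refcnt = 1;
    f->ob_base.ob_type = &Foo_Type;
    f->data = data;
    return f;
}

/* Increment the count through a base pointer, then read it back
   through the derived pointer's ob_base member. Both lvalues have
   type DemoObject, so the compiler must assume they may alias. */
ptrdiff_t foo_incref_and_read(FooObject *f)
{
    DemoObject *o = (DemoObject *)f;   /* points at f->ob_base */
    assert((void *)o == (void *)&f->ob_base);
    o->ob_refcnt += 1;
    return f->ob_base.ob_refcnt;
}
```

Under these assumptions, calling foo_incref_and_read on a freshly created object yields 2, with no need for -fno-strict-aliasing.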

An example of how this will change current code would be good. E.g., o->ob_type->tp_name becomes Py_Type(o)->tp_name or o->ob_base.ob_type->tp_name.
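A rough sketch of what such a before-and-after example might look like follows. It uses stand-in types rather than the real Python headers: SketchType, SketchObject, SketchVarObject, SketchList, SKETCH_TYPE, name_via_macro and name_via_members are hypothetical names chosen for illustration. It also assumes the layout from the draft, where a variable-sized object nests the base struct two levels deep.

```c
#include <stddef.h>

/* Hypothetical stand-ins mirroring the draft's PyObject / PyVarObject. */
typedef struct sketch_type {
    const char *tp_name;
} SketchType;

typedef struct sketch_object {
    ptrdiff_t ob_refcnt;
    SketchType *ob_type;
} SketchObject;

typedef struct {
    SketchObject ob_base;
    ptrdiff_t ob_size;
} SketchVarObject;

/* A variable-sized object: the base is nested two levels deep. */
typedef struct {
    SketchVarObject ob_base;
    int *ob_item;
} SketchList;

/* Stand-in for the proposed Py_Type(o) macro. */
#define SKETCH_TYPE(o) (((SketchObject *)(o))->ob_type)

/* With the old header macro this would have been written as
   l->ob_type->tp_name. With the new layout, the macro form is the
   same for every object type. */
const char *name_via_macro(SketchList *l)
{
    return SKETCH_TYPE(l)->tp_name;
}

/* Explicit member access must spell out each nesting level. */
const char *name_via_members(SketchList *l)
{
    return l->ob_base.ob_base.ob_type->tp_name;
}
```

In this sketch both functions return the same pointer. The macro form stays the same regardless of nesting depth, which may be one reason to prefer it over spelling out ob_base at each level.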

Otherwise I am all for cleaning up the codebase and thus support this PEP.

-Brett


