[Python-Dev] [Python-checkins] r79397 - in python/trunk: Doc/c-api/capsule.rst Doc/c-api/cobject.rst Doc/c-api/concrete.rst Doc/data/refcounts.dat Doc/extending/extending.rst Include/Python.h Include/cStringIO.h Include/cobject.h Include/datetime.h Include/py_curses.h Include/pycapsule.h Include/pyexpat.h Include/ucnhash.h Lib/test/test_sys.py Makefile.pre.in Misc/NEWS Modules/_ctypes/callproc.c Modules/_ctypes/cfield.c Modules/_ctypes/ctypes.h Modules/_cursesmodule.c Modules/_elementtree.c Modules/_testcapimodule.c Modules/cStringIO.c Modules/cjkcodecs/cjkcodecs.h Modules/cjkcodecs/multibytecodec.c Modules/cjkcodecs/multibytecodec.h Modules/datetimemodule.c Modules/pyexpat.c Modules/socketmodule.c Modules/socketmodule.h Modules/unicodedata.c Objects/capsule.c Objects/object.c Objects/unicodeobject.c PC/VS7.1/pythoncore.vcproj PC/VS8.0/pythoncore.vcproj PC/os2emx/python27.def PC/os2vacpp/python.def Python/compile.c Python/getargs.c

Larry Hastings larry at hastings.org
Thu Mar 25 18:38:54 CET 2010


M.-A. Lemburg wrote:

Backporting PyCapsule is fine, but the changes you made to all those PyCObject uses does not look backwards compatible.

The C APIs exposed by the modules (e.g. the datetime module) are used in lots of 3rd party extension modules and changing them from PyCObject to PyCapsule is a major change in the module API.

You're right, my changes aren't backwards compatible. I thought it was reasonable for four reasons:

  1. The CObject API isn't safe. It's easy to crash Python 2.6 in just a few lines by mixing and matching CObjects. Switching Python to capsules prevents a class of exploits. I've included a script at the bottom of this message that demonstrates three such crashes. The script runs in Python 2 and 3, but 3.1 doesn't crash because it's using capsules.

  2. As I just mentioned, Python 3.1 already uses capsules everywhere instead of CObjects. Since part of the purpose of Python 2.7 is to prepare developers for the upgrade to 3.1, getting them to switch to capsules now is just one more way they are prepared.

  3. Because CObject is unsafe, I want to deprecate it in 2.7, and if we ever make a 2.8 I want to remove it completely.

  4. When Python publishes an API using a CObject, it describes the thing the CObject points to in a header file. In nearly all cases that header file also provides a macro or inline function that does the importing work for you. I changed those to use capsules too. So if the third-party code uses the macro or inline function, all you need do is recompile it against 2.7 and it works fine. Sadly I know of one exception: pyexpat.expat_CAPI. The header file just describes the struct pointed to by the CObject, but callers

I can suggest four ways to ameliorate the problem.

First, we could do as Antoine Pitrou suggests on the bug (issue 7992): wherever the CObject used to be published as a module attribute to expose an API, we could provide both a CObject and a capsule; internally Python would only use the capsules. This would allow third-party libraries to run against 2.7 unchanged. The major problem with this is that third-party libraries would still be vulnerable to the mix-and-match CObject crash. A secondary, minor concern: obviously we'd store the CObject attribute with the existing name, and the capsule attribute would have to get some new name. But in Python 3.1, these attributes already expose a capsule. Therefore, people who convert to using the capsules now would have to convert again when moving to 3.1.
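To make the mix-and-match concern concrete: a capsule carries a name, and the unpacking call checks that name before handing back the pointer, which a bare CObject cannot do. The following is a minimal sketch of that check, assuming CPython 3 and using `ctypes.pythonapi` to reach the real `PyCapsule_New`, `PyCapsule_IsValid`, and `PyCapsule_GetPointer` calls. The capsule names and the `unpack` helper are hypothetical and exist only for illustration.

```python
import ctypes

api = ctypes.pythonapi  # CPython's own C API, reached through ctypes

# Declare the signatures of the capsule calls used below.
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_IsValid.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyCapsule_IsValid.restype = ctypes.c_int
api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyCapsule_GetPointer.restype = ctypes.c_void_p

# Hypothetical names. They are kept as module globals because a capsule
# stores the name pointer rather than a copy of the string.
GOOD_NAME = b"example.good_CAPI"
OTHER_NAME = b"example.other_CAPI"

payload = ctypes.c_int(42)            # stand-in for a published C API struct
address = ctypes.addressof(payload)
capsule = api.PyCapsule_New(address, GOOD_NAME, None)  # None means no destructor


def unpack(capsule, name):
    """Return the stored address, or None when the name does not match."""
    try:
        return api.PyCapsule_GetPointer(capsule, name)
    except ValueError:
        # A name mismatch surfaces as ValueError instead of a bad pointer.
        return None


matched = unpack(capsule, GOOD_NAME)
mismatched = unpack(capsule, OTHER_NAME)
print("right name returns the payload address:", matched == address)
print("wrong name returns:", mismatched)
```

A consumer that asks for the wrong name gets an exception rather than a pointer to some unrelated struct, which is the property that publishing a plain CObject alongside the capsule would not give to third-party code still reading the CObject.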

Second, we could make CObject internally support unpacking capsules. If you gave a capsule to PyCObject_AsVoidPtr() it would unpack it and return the pointer within. (We could probably also map the capsule "context" to the CObject "desc", if any of the Python use cases needed it.) I wouldn't change anything else about CObjects; creating and using them would continue to work as normal. This would also allow third-party libraries to run against Python 2.7 unchanged. The only problem is that it's unsafe, as indeed allowing any use of PyCObject_AsVoidPtr() is unsafe.
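The behaviour of this second approach can be modelled in a few lines of pure Python. This is only a toy sketch under stated assumptions: `ToyCObject`, `ToyCapsule`, and the three functions are hypothetical stand-ins, not the real C API, and the integer "pointers" are arbitrary. It shows why the approach keeps old callers working and why it stays unsafe: the CObject-style accessor never checks a name.

```python
class ToyCObject:
    """Toy stand-in for a CObject: a bare pointer plus an optional desc."""

    def __init__(self, pointer, desc=None):
        self.pointer = pointer
        self.desc = desc


class ToyCapsule:
    """Toy stand-in for a capsule: pointer, name, and optional context."""

    def __init__(self, pointer, name, context=None):
        self.pointer = pointer
        self.name = name
        self.context = context


def capsule_get_pointer(capsule, name):
    """Capsule-style access: refuse unless the caller names what it expects."""
    if not isinstance(capsule, ToyCapsule) or capsule.name != name:
        raise ValueError("incorrect capsule name")
    return capsule.pointer


def cobject_as_void_ptr(obj):
    """CObject-style access under the second approach: also unpack capsules.

    No name is checked, so any CObject or capsule is accepted.
    """
    if isinstance(obj, (ToyCObject, ToyCapsule)):
        return obj.pointer
    raise TypeError("expected a CObject or a capsule")


def cobject_get_desc(obj):
    """Map a capsule's context onto the CObject desc."""
    return obj.context if isinstance(obj, ToyCapsule) else obj.desc


# Two unrelated APIs published as capsules (addresses are made up).
datetime_api = ToyCapsule(0x1000, "datetime.datetime_CAPI")
ucnhash_api = ToyCapsule(0x2000, "unicodedata.ucnhash_CAPI", context="ctx")

# A legacy caller keeps working without source changes...
legacy_ok = cobject_as_void_ptr(datetime_api)
# ...but it is just as happy to accept the wrong API's pointer.
legacy_wrong = cobject_as_void_ptr(ucnhash_api)

# The capsule-style accessor rejects the same mix-up.
try:
    capsule_get_pointer(ucnhash_api, "datetime.datetime_CAPI")
    rejected = False
except ValueError:
    rejected = True
```

In this model the legacy accessor returns `0x2000` for a caller that wanted the datetime API, which mirrors the mix-and-match problem; only the name-checking accessor notices.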

Third, I've been pondering writing a set of preprocessor macros, shipped in their own header file distributed independently of Python and released to the public domain, that would make it easy to use either CObjects or capsules depending on what version of Python you were compiling against. Obviously, using these macros would require a source code change in the third-party library. But these macros would make it a five-minute change. This could complement the first or second approaches.

Fourth, we could back out of the changes to published APIs and convert them back to CObjects. -1.

Your thoughts?

/larry/


import sys

def log(message):
    print(message)
    sys.stdout.flush()

def crash1():
    log("Running crash1...")
    try:
        import datetime
        import cStringIO
        cStringIO.cStringIO_CAPI = datetime.datetime_CAPI

        import cPickle
        s = cPickle.dumps([1, 2, 3])
    except ImportError:
        # This test isn't translatable to Python 3.
        pass
    log("Survived crash1!")

def crash2():
    log("Running crash2...")
    try:
        import unicodedata
        import _socket
        _socket.CAPI = unicodedata.ucnhash_CAPI
        import ssl
    except AttributeError:
        # Congratulations, you didn't crash.
        pass
    log("Survived crash2!")

def crash3():
    log("Running crash3...")
    try:
        import unicodedata
        import _multibytecodec
        _multibytecodec.__create_codec(unicodedata.ucnhash_CAPI)
    except ValueError:
        # Congratulations, you didn't crash.
        pass
    log("Survived crash3!")

import sys

if len(sys.argv) > 1:
    if sys.argv[1] == '1':
        crash1()
        sys.exit(0)
    elif sys.argv[1] == '2':
        crash2()
        sys.exit(0)
    elif sys.argv[1] == '3':
        crash3()
        sys.exit(0)

crash1()
crash2()
crash3()


