[Python-Dev] [Python-checkins] cpython: pyexpat uses the new Unicode API

Victor Stinner victor.stinner at haypocalc.com
Tue Oct 4 11:48:42 CEST 2011

Previous message: [Python-Dev] [Python-checkins] cpython: fix compiler warnings
Next message: [Python-Dev] Python-Dev Digest, Vol 99, Issue 7

On 03/10/2011 11:10, Amaury Forgeot d'Arc wrote:

>> changeset:   72548:a1be34457ccf
>> user:        Victor Stinner <victor.stinner at haypocalc.com>
>> date:        Sat Oct 01 01:05:40 2011 +0200
>> summary:
>>   pyexpat uses the new Unicode API
>>
>> files:
>>   Modules/pyexpat.c |  12 +++++++-----
>>   1 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c
>> --- a/Modules/pyexpat.c
>> +++ b/Modules/pyexpat.c
>> @@ -1234,11 +1234,13 @@
>>  static PyObject *
>>  xmlparse_getattro(xmlparseobject *self, PyObject *nameobj)
>>  {
>> -    const Py_UNICODE *name;
>> +    Py_UCS4 first_char;
>>      int handlernum = -1;
>>
>>      if (!PyUnicode_Check(nameobj))
>>          goto generic;
>> +    if (PyUnicode_READY(nameobj))
>> +        return NULL;
>
> Why is this PyUnicode_READY necessary? Can tp_getattro pass unfinished unicode objects? I hope we don't have to update all extension modules?

The Unicode API is supposed to deliver only ready strings. But all extensions written for Python 3.2 use the "legacy" API (PyUnicode_FromUnicode and PyUnicode_FromStringAndSize(NULL, size)), so the strings they create are not ready.

But no, you don't have to update an extension that only reads strings to add a call to PyUnicode_READY. You only have to call PyUnicode_READY if you use the new API (e.g. PyUnicode_READ_CHAR), that is, only if you modify your code. Another extract of my commit (on pyexpat):

-    name = PyUnicode_AS_UNICODE(nameobj);
+    first_char = PyUnicode_READ_CHAR(nameobj, 0);
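
To illustrate why the new accessors need a ready string, here is a small toy model in plain C. It is not CPython code: ToyStr, toy_ready and toy_read_char are hypothetical names invented for this sketch, and the layout is only an assumption that mimics the idea of a "legacy" buffer being converted once into a compact 1, 2 or 4 byte representation. The return convention (0 on success, non-zero on error) matches how PyUnicode_READY is used in the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: hypothetical types and functions, NOT CPython's
   implementation. */
typedef struct {
    size_t length;     /* number of characters */
    uint32_t *legacy;  /* buffer filled in by a "legacy" constructor */
    int kind;          /* bytes per character once ready (1, 2 or 4);
                          0 means "not ready yet" */
    void *data;        /* compact storage, NULL until toy_ready() runs */
} ToyStr;

/* Build the compact representation from the legacy buffer.
   Returns 0 on success and -1 on allocation failure. */
static int toy_ready(ToyStr *s)
{
    uint32_t max = 0;
    size_t i;
    int kind;
    void *data;

    if (s->kind != 0)
        return 0;  /* already ready: nothing to do */

    /* The widest character decides the storage width. */
    for (i = 0; i < s->length; i++)
        if (s->legacy[i] > max)
            max = s->legacy[i];
    kind = max < 0x100 ? 1 : (max < 0x10000 ? 2 : 4);

    data = malloc(s->length ? s->length * (size_t)kind : 1);
    if (data == NULL)
        return -1;

    for (i = 0; i < s->length; i++) {
        if (kind == 1)
            ((uint8_t *)data)[i] = (uint8_t)s->legacy[i];
        else if (kind == 2)
            ((uint16_t *)data)[i] = (uint16_t)s->legacy[i];
        else
            ((uint32_t *)data)[i] = s->legacy[i];
    }
    s->data = data;
    s->kind = kind;
    return 0;
}

/* "New API" style accessor: it only looks at the compact storage,
   so the caller must have called toy_ready() first. */
static uint32_t toy_read_char(const ToyStr *s, size_t i)
{
    assert(s->kind != 0);
    if (s->kind == 1)
        return ((const uint8_t *)s->data)[i];
    if (s->kind == 2)
        return ((const uint16_t *)s->data)[i];
    return ((const uint32_t *)s->data)[i];
}
```

In this model, code that only reads the legacy buffer never needs toy_ready; only a caller that switches to toy_read_char has to call it first and check the result, which mirrors the rule described above.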

Victor
