[Python-Dev] Encoding of PyFrameObject members

M.-A. Lemburg mal at egenix.com
Fri Feb 6 11:44:46 CET 2015


On 06.02.2015 00:27, Francis Giraldeau wrote:

> I need to access frame members from within a signal handler for tracing purposes. My first attempt to access co_filename was like this (omitting error checking):
>
> PyFrameObject *frame = PyEval_GetFrame();
> PyObject *ob = PyUnicode_AsUTF8String(frame->f_code->co_filename);
> char *str = PyBytes_AsString(ob);
>
> However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(), which is not reentrant. If the signal handler nests over PyObject_Malloc(), it causes a segfault, and it could also deadlock.
>
> Instead, I access members directly:
>
> char *str = PyUnicode_DATA(frame->f_code->co_filename);
> size_t len = PyUnicode_GET_DATA_SIZE(frame->f_code->co_filename);
>
> Is it safe to assume that unicode objects co_filename and co_name are always UTF-8 data for loaded code? I looked at PyTokenizer_FromString() and it seems to convert everything to UTF-8 upfront, and I would like to make sure this assumption is valid.

The macros won't work in all cases, as they don't pay attention to the different kinds used in the Unicode implementation.
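To illustrate the point about kinds: since PEP 393, a str object stores its code points in 1, 2 or 4 bytes per element, depending on the largest code point in the string. Only pure-ASCII strings happen to have a raw buffer that is also valid UTF-8. The following standalone sketch mimics the kind-aware read that PyUnicode_READ() performs; the names demo_kind and demo_read are made up for this example and are not part of the C API.

```c
#include <stddef.h>
#include <stdint.h>

/* Element widths in bytes. CPython's PyUnicode_1BYTE_KIND,
   PyUnicode_2BYTE_KIND and PyUnicode_4BYTE_KIND use the same values;
   the DEMO_* names are hypothetical. */
enum demo_kind { DEMO_1BYTE = 1, DEMO_2BYTE = 2, DEMO_4BYTE = 4 };

/* Read the code point at `index` from a buffer whose element width is
   given by `kind`, similar in spirit to PyUnicode_READ(). */
uint32_t
demo_read(enum demo_kind kind, const void *data, size_t index)
{
    switch (kind) {
    case DEMO_1BYTE:
        return ((const uint8_t *)data)[index];
    case DEMO_2BYTE:
        return ((const uint16_t *)data)[index];
    default:
        return ((const uint32_t *)data)[index];
    }
}
```

For example, "café" fits the 1-byte kind and is stored as the four bytes 63 61 66 E9, whereas its UTF-8 form is the five bytes 63 61 66 C3 A9. Interpreting the raw buffer as UTF-8 would therefore give an invalid sequence for that filename.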

I don't think there's any API you can use to extract the underlying data without going through PyObject_Malloc() at some point (you may be lucky if there already is a UTF-8 version available, but it's not guaranteed).

I guess your best bet is to write your own UTF-8 codec which then copies the data to a buffer that you can control. Have a look at Objects/stringlib/codecs.h: utf8_encode.
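A minimal sketch of such a codec, assuming the code points have already been read out kind-aware (e.g. with PyUnicode_READ()) into an array. It is not the stringlib implementation; the function name ucs4_to_utf8 and its signature are made up for illustration. It writes into a caller-supplied buffer and makes no library calls, so it does not go through PyObject_Malloc().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode n code points from src as UTF-8 into dst, which has room for
 * cap bytes including the terminating NUL. Returns the number of bytes
 * written (not counting the NUL), or (size_t)-1 if the output does not
 * fit or a code point is above U+10FFFF.
 *
 * Lone surrogates are encoded as plain 3-byte sequences, which is not
 * strict UTF-8 but avoids failing on unusual file names.
 */
size_t
ucs4_to_utf8(const uint32_t *src, size_t n, char *dst, size_t cap)
{
    size_t i, out = 0;

    if (cap == 0)
        return (size_t)-1;

    for (i = 0; i < n; i++) {
        uint32_t ch = src[i];
        size_t need = ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;

        /* keep one byte free for the terminating NUL */
        if (ch > 0x10FFFF || out + need >= cap)
            return (size_t)-1;

        if (need == 1) {
            dst[out++] = (char)ch;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (ch >> 6));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (ch >> 12));
            dst[out++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (ch >> 18));
            dst[out++] = (char)(0x80 | ((ch >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

In a signal handler the destination would typically be a fixed-size buffer set up in advance, and the loop could read each code point straight from the string with PyUnicode_READ() instead of going through an intermediate array.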

Alternatively, you can copy the data to a Py_UCS4 buffer which you allocate using code such as this (untested, adapted from the UTF-8 encoder):

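Py_UCS4 *buf, *p;
enum PyUnicode_Kind repkind;
void *repdata;
Py_ssize_t repsize, k;

/* rep is the str object to copy, e.g. frame->f_code->co_filename */
if (PyUnicode_READY(rep) < 0)
    goto error;
repkind = PyUnicode_KIND(rep);
repdata = PyUnicode_DATA(rep);
repsize = PyUnicode_GET_LENGTH(rep);

buf = malloc((repsize + 1) * sizeof(Py_UCS4));
if (buf == NULL)
    goto error;
/* write through p so that buf keeps pointing at the start */
p = buf;
for (k = 0; k < repsize; k++) {
    *p++ = PyUnicode_READ(repkind, repdata, k);
}
/* 0-terminate */
*p = 0;

...

/* free the start of the buffer, not the advanced write pointer */
free(buf);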

-- Marc-Andre Lemburg eGenix.com



