[Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings

Victor Stinner victor.stinner at haypocalc.com
Mon Feb 6 22:57:46 CET 2012


2012/2/6 Jim Jewett <jimjjewett at gmail.com>:

> I realize that _Py_Identifier is a private name, and that PEP 3131 requires anything (except test cases) in the standard library to stick with ASCII ... but somehow, that feels like too long of a chain.
>
> I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier, or at least a comment stating that Identifiers will (per PEP 3131) always be ASCII -- preferably with an assert to back that up.

_Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can only be ASCII: the C language doesn't accept non-ASCII identifiers. I thought that the _Py_IDENTIFIER() macro was the only way to create an identifier, and so ASCII was enough... but there is also _Py_static_string.

_Py_static_string(name, value) lets you specify an arbitrary string, so you may pass a non-ASCII value. I don't see any use case where you need a non-ASCII value in the Python core.
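To make the difference between the two macros concrete, here is a minimal standalone sketch. It does not use the real CPython headers: demo_Identifier, DEMO_IDENTIFIER, DEMO_STATIC_STRING, demo_is_ascii and demo_check are hypothetical stand-ins, loosely modeled on the private macros, and the check at the end is only one possible shape for the assert requested above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the private _Py_Identifier struct. */
typedef struct {
    const char *string;  /* the name, as a C string */
    void *object;        /* lazily created string object (unused here) */
} demo_Identifier;

/* Like _Py_static_string: the value is an arbitrary C string. */
#define DEMO_STATIC_STRING(varname, value) \
    static demo_Identifier varname = { value, NULL }

/* Like _Py_IDENTIFIER: the value is the stringified token, so it has
   to be something the C compiler accepts as part of an identifier. */
#define DEMO_IDENTIFIER(varname) \
    DEMO_STATIC_STRING(PyId_##varname, #varname)

/* Return 1 if every byte of s is in the ASCII range 0..127. */
static int demo_is_ascii(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s > 127)
            return 0;
    }
    return 1;
}

DEMO_IDENTIFIER(append);                       /* defines PyId_append = "append" */
DEMO_STATIC_STRING(demo_dot, ".");             /* ASCII, but not an identifier */
DEMO_STATIC_STRING(demo_cafe, "caf\xc3\xa9");  /* UTF-8 bytes of U+00E9: not ASCII */

/* The kind of check an assert could make before an ASCII-only path. */
static void demo_check(const demo_Identifier *id)
{
    assert(demo_is_ascii(id->string));
}
```

With these definitions, demo_check(&PyId_append) and demo_check(&demo_dot) pass, while demo_check(&demo_cafe) aborts in a debug build. Only the second macro can produce a value like that.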

-        id->object = PyUnicode_DecodeUTF8Stateful(id->string,
-                                                  strlen(id->string),
-                                                  NULL, NULL);
+        id->object = unicode_fromascii((unsigned char*)id->string,
+                                       strlen(id->string));

This is just an optimization.
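A rough sketch of why the two paths are interchangeable for ASCII input: ASCII is a subset of UTF-8, so both decoders produce the same code points, and the ASCII path can skip inspecting lead bytes. The functions demo_from_ascii and demo_from_utf8 are hypothetical and simplified (the UTF-8 one assumes well-formed input and does no error handling); they are not the CPython implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASCII path: each byte is exactly one code point. */
static size_t demo_from_ascii(const unsigned char *s, size_t n, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        assert(s[i] < 128);  /* the precondition that makes this path valid */
        out[i] = s[i];
    }
    return n;
}

/* Simplified UTF-8 path: has to look at every lead byte to find out
   how many continuation bytes follow. Assumes well-formed input. */
static size_t demo_from_utf8(const unsigned char *s, size_t n, uint32_t *out)
{
    size_t i = 0, count = 0;
    while (i < n) {
        unsigned char b = s[i++];
        uint32_t cp;
        size_t extra;
        if (b < 0x80)      { cp = b;        extra = 0; }  /* 1-byte (ASCII) */
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }  /* 2-byte sequence */
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }  /* 3-byte sequence */
        else               { cp = b & 0x07; extra = 3; }  /* 4-byte sequence */
        for (; extra > 0 && i < n; extra--)
            cp = (cp << 6) | (s[i++] & 0x3F);
        out[count++] = cp;
    }
    return count;
}
```

For a name such as "append" both functions fill the output with the same six code points; they only differ once a byte above 127 appears, which is exactly the case the ASCII path rules out.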

If you think that _Py_static_string() is useful, I can revert my change. Otherwise, _Py_static_string() should be removed.

Victor


