[Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings

Victor Stinner victor.stinner at haypocalc.com
Tue Feb 7 09:55:06 CET 2012


2012/2/7 "Martin v. Löwis" <martin at v.loewis.de>:

>> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can only be ASCII: the C language doesn't accept non-ASCII identifiers.
>
> That's not exactly true. In C89, source code is in the "source character set", which is implementation-defined, except that it must contain the "basic character set". I believe that it allows for implementation-defined characters in identifiers.

Hum, I hope that these C89 compilers use UTF-8.

> In C99, this is extended to include "universal character names" (\u escapes). They may appear in identifiers as long as the characters named are listed in annex D.59 (which I cannot locate).

Does C99 specify the encoding? Can we expect UTF-8?

Python is supposed to work on many platforms and so has to support a lot of compilers, not only the ones that accept non-ASCII identifiers.
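To make the first point concrete, here is a minimal sketch of why the macro argument has to be a valid C identifier. It is not the actual CPython definition: my_identifier, MY_IDENTIFIER, the MyId_ prefix, is_ascii and self_check are hypothetical names used only for illustration. The idea is that the argument is both pasted into a variable name and turned into a string literal, so whatever the compiler accepts as an identifier ends up in the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an identifier structure. */
typedef struct {
    const char *string;   /* identifier text, produced by #name */
    void *object;         /* placeholder for a lazily created, cached object */
} my_identifier;

/* name is pasted into a variable name (##) and stringified (#),
   so it has to be something the compiler accepts as an identifier. */
#define MY_IDENTIFIER(name) \
    static my_identifier MyId_##name = { #name, NULL }

MY_IDENTIFIER(append);
MY_IDENTIFIER(keys);

/* Return 1 if every byte of s is in the ASCII range, 0 otherwise. */
static int is_ascii(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s > 127)
            return 0;
    }
    return 1;
}

/* Check the identifiers defined above; returns 1 if all checks pass. */
int self_check(void)
{
    assert(strcmp(MyId_append.string, "append") == 0);
    assert(strcmp(MyId_keys.string, "keys") == 0);
    assert(is_ascii(MyId_append.string));
    assert(is_ascii(MyId_keys.string));
    return 1;
}
```

With a compiler that only accepts the basic character set in identifiers, every string produced this way is ASCII. With a compiler that accepts more, the encoding of the resulting string would depend on that compiler.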

Victor


