[Python-Dev] PEP-393/PEP-3118: unicode format specifiers

Stefan Krah stefan at bytereef.org
Wed Mar 7 11:50:44 CET 2012


"Martin v. Löwis" <martin at v.loewis.de> wrote:

> I think it would be nice for Python3.3 to implement the PEP-3118
> suggestion:
>
> 'c' -> UCS1
>
> 'u' -> UCS2
>
> 'w' -> UCS4

What is the use case for these format codes?

Unfortunately I've only worked with UTF-8 so far and I'm not too familiar with UCS2 and UCS4.

If the arrays that Victor mentioned give one character per array location, then memoryview(str) could be used for zero-copy slicing etc.
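As a rough illustration of "one character per array location": str itself does not export a buffer in CPython, so the sketch below substitutes a UTF-32 encoded bytes object and views it as 4-byte items. It assumes the native 'I' format is 4 bytes wide, which holds on common platforms but is not guaranteed; the sample text and variable names are made up for the example.

```python
import sys

text = "h\u00e9llo w\u00f6rld"

# Fixed-width encoding in native byte order, without a BOM.
codec = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"
buf = text.encode(codec)

# Assumption: native 'I' (unsigned int) is 4 bytes, so each item
# of the view is exactly one code point.
view = memoryview(buf).cast("I")

# Slicing a memoryview gives another view on the same buffer,
# so no character data is copied here.
word = view[6:11]

# Turn the code points back into a str (this step does copy).
decoded = "".join(chr(cp) for cp in word)
```

With a fixed-width layout, index 6 in the view is the seventh character of the text, which is not true of a UTF-8 byte buffer.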

The main reason why I raised the issue is this: If Python-3.3 is shipped with 'u' -> UCS4 in the array module and then someone figures out that the above format codes are a great idea, we'd be stuck with yet another format code incompatibility.
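To make the incompatibility concrete, here is a small sketch of what happens when an exporter and a consumer disagree on the width behind a format code. It does not use the array module's 'u' code; it imitates the two readings with native 'I' and 'H' items, assumed to be 4 and 2 bytes wide. The names payload, as_ucs4 and as_ucs2 are illustrative only.

```python
import sys

suffix = "-le" if sys.byteorder == "little" else "-be"

# An exporter that means 'u' -> UCS4 writes four bytes per character.
payload = "abc".encode("utf-32" + suffix)

# A consumer that also reads 'u' as UCS4 (assumed: 'I' is 4 bytes)
# recovers the characters.
as_ucs4 = [chr(cp) for cp in memoryview(payload).cast("I")]

# A consumer that reads 'u' as UCS2 (assumed: 'H' is 2 bytes) sees
# twice as many items, with NUL characters between the real ones.
as_ucs2 = [chr(cp) for cp in memoryview(payload).cast("H")]
```

The same buffer yields three items under one reading and six under the other, which is why the meaning of each code needs to be settled before a release ships.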

Stefan Krah


