[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Chris Angelico rosuav at gmail.com
Sun Jan 12 23:28:31 CET 2014

Previous message: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
Next message: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop <juraj.sukop at gmail.com> wrote:

> On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> First, "utf16string" confuses me. What is it? If it is a Unicode string, i.e.:
>
> It is a Unicode string which happens to contain code points outside U+00FF (as with the TTF example above), so that it triggers the (at least) 2-bytes memory representation in CPython 3.3+. I agree, I chose the variable name poorly, my bad.

When I'm talking about Unicode strings based on their maximum codepoint, I usually call them something like "ASCII string", "Latin-1 string", "BMP string", and "SMP string". Still not wholly accurate, but less confusing than naming an encoding... oh wait, two of those are encodings :| But you could use "narrow string" for the first two. Or "string(0..127)" for ASCII, "string(0..255)" for Latin-1, and then for consistency "string(0..65535)" and "string(0..1114111)" for the others, except that I doubt that'd be helpful :) At any rate, "BMP" as a term for "includes characters outside of Latin-1 but all on the Basic Multilingual Plane" would probably be close enough to get away with.
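For illustration only, here is a rough sketch of how those labels could be assigned from a string's maximum code point. The function name is made up for this example, and the labels are just the informal ones used above, not anything CPython exposes.

```python
def classify_by_max_codepoint(s: str) -> str:
    """Label a str by its largest code point (informal labels, hypothetical helper)."""
    # An empty string has no code points; treat it as 0 so it counts as ASCII.
    top = max(map(ord, s), default=0)
    if top <= 0x7F:
        return "ASCII string"    # string(0..127)
    if top <= 0xFF:
        return "Latin-1 string"  # string(0..255)
    if top <= 0xFFFF:
        return "BMP string"      # string(0..65535)
    return "SMP string"          # string(0..1114111); loosely named, covers every plane past the BMP


if __name__ == "__main__":
    for sample in ("abc", "caf\u00e9", "\u0394x", "\U0001F600"):
        print(ascii(sample), "->", classify_by_max_codepoint(sample))
```

The first two labels are the ones that could both be called "narrow string", and it is the third and fourth that push CPython 3.3+ beyond one byte per character.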

ChrisA
