[Python-Dev] PEP 461 Final?

Ethan Furman ethan at stoneleaf.us
Sat Jan 18 23:01:03 CET 2014


On 01/18/2014 05:48 AM, Nick Coghlan wrote:

On 18 Jan 2014 11:52, "Ethan Furman" wrote:

I'll admit to being somewhat on the fence about %a. It seems there are two possibilities with %a:

  1) have it be ascii(repr(obj))

  2) have it be str(obj).encode('ascii', 'strict')

This gets very close to crossing the line into implicit encoding of text again. Binary interpolation is being added back for the specific use case of working with ASCII compatible segments in binary formats, and it's at best arguable that supporting %a will help with that use case.

Agreed.

However, without it, there may be a greater temptation to inappropriately define __bytes__ just to support binary interpolation, rather than because a type truly has an appropriate translation directly to bytes.

True.

By allowing %a, we avoid that temptation. This is also potentially useful specifically in the case of binary logging formats and as a quick way to request backslash escaping of non-ASCII characters in text.

Call it +0.5 for allowing %a. I don't expect it to be used heavily, but I think it will head off a fair bit of potential misuse of __bytes__.

So, if %a is added it would act like:


"%a" % some_obj

    tmp = str(some_obj)
    res = b''
    for ch in tmp:
        if ord(ch) < 256:
            res += bytes([ord(ch)])
        else:
            res += unicode_escape(ch)

where 'unicode_escape' would yield something like "\u0440" ?
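
For anyone who wants to poke at this, here is a rough runnable sketch of the loop above. The function name percent_a_sketch is made up for illustration, and I am assuming 'unicode_escape' can be approximated by encoding the character with the 'backslashreplace' error handler; neither is a settled part of the proposal. The last line shows option 1 from earlier in the thread (ascii() of the object, then encoded) for comparison.

```python
def percent_a_sketch(obj):
    """Hypothetical helper: a runnable version of the loop sketched above."""
    res = b''
    for ch in str(obj):
        if ord(ch) < 256:
            # Code points 0-255 are copied through as one raw byte each.
            res += bytes([ord(ch)])
        else:
            # Assumed stand-in for 'unicode_escape': emit a backslash escape.
            res += ch.encode('ascii', 'backslashreplace')
    return res


text = 'caf\xe9 \u0440'

# The loop as written: U+00E9 becomes the raw byte 0xE9, U+0440 is escaped.
print(percent_a_sketch(text))

# Option 1 from the thread: pure-ASCII output, quotes included.
print(ascii(text).encode('ascii'))
```

One thing the sketch makes visible: with a cutoff of 256, code points 128-255 come out as raw non-ASCII bytes, so the result is not guaranteed to be ASCII. A cutoff of 128 would presumably be needed if ASCII-only output is the goal.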

-- Ethan


