[Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

Ethan Furman ethan at stoneleaf.us
Wed Mar 26 15:35:52 CET 2014


On 03/26/2014 03:10 AM, Victor Stinner wrote:

2014-03-25 23:37 GMT+01:00 Ethan Furman:

%a will call ascii() on the interpolated value.

I'm not sure that I understood correctly: is the "%a" format supported? The result of ascii() is a Unicode string. Does it mean that ("%a" % obj) should give the same result as ascii(obj).encode('ascii', 'strict')?

Changed to:

%a will give the equivalent of repr(some_obj).encode('ascii', 'backslashreplace') on the interpolated value. Use cases include developing a new protocol and writing landmarks into the stream; debugging data going into an existing protocol to see if the problem is the protocol itself or bad data; a fall-back for a serialization format; or any situation where defining __bytes__ would not be appropriate but a readable/informative representation is needed [8].
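A minimal sketch of that definition, assuming an interpreter that implements this PEP (bytes %-formatting shipped in Python 3.5). The helper name ``percent_a`` is hypothetical and only spells out the wording above:

```python
def percent_a(obj):
    """Spell out the %a definition: repr(), then ASCII with backslashreplace."""
    return repr(obj).encode('ascii', 'backslashreplace')

# Compare the spelled-out form with the real operator for a few values.
# The one-element tuple keeps % from treating the value as an argument tuple.
for value in (3.14, 'def', b'abc', None, ['caf\xe9']):
    assert b'%a' % (value,) == percent_a(value)

# A non-ASCII character in the repr comes out as a backslash escape.
print(percent_a('caf\xe9'))
```

The loop only passes if the operator and the spelled-out form agree for every sample, which is the property the new wording promises.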

Would it be possible to add a table or list to summarize supported format characters? I found:

- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, "etc." (can you please complete "etc."?)
- bytes and __bytes__ method: %s
- ascii(): %a

Changed to:

%-interpolation

All the numeric formatting codes (d, i, o, u, x, X, e, E, f, F, g, G, and any that are subsequently added to Python 3) will be supported, and will work as they do for str, including the padding, justification and other related modifiers (currently #, 0, -, ' ' (space), and + (plus any added to Python 3)). The only non-numeric codes allowed are c, s, and a.
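As a rough illustration of the non-numeric codes c and s, the sketch below assumes Python 3.5 or later, where this PEP is implemented. The accepted argument types shown are the interpreter's behaviour rather than something stated in the paragraph above, and the class name ``Landmark`` is made up for the example:

```python
class Landmark:
    """Hypothetical type that knows how to render itself as bytes."""
    def __bytes__(self):
        return b'<landmark>'

# %c accepts an integer in range(256) or a bytes object of length 1.
assert b'%c' % 65 == b'A'
assert b'%c' % b'A' == b'A'

# %s accepts bytes-like objects and objects that define __bytes__.
assert b'%s' % b'abc' == b'abc'
assert b'%s' % Landmark() == b'<landmark>'

# A str is never encoded implicitly; %s rejects it with TypeError.
try:
    b'%s' % 'abc'
except TypeError:
    str_rejected = True
else:
    str_rejected = False
assert str_rejected
```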

For the numeric codes, the only difference between str and bytes (or bytearray) interpolation is that the results from these codes will be ASCII-encoded text, not unicode. In other words, for any numeric formatting code %x::
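That equivalence can be sketched as follows, assuming Python 3.5 or later. The helper name ``numeric_via_str`` and the sample values are illustrative only:

```python
def numeric_via_str(fmt, value):
    """Format with str %-interpolation, then ASCII-encode the text."""
    return (fmt % value).encode('ascii')

cases = [
    ('%d', 42), ('%+d', 42), ('%-5d|', 42), ('% d', 42),
    ('%#x', 255), ('%X', 255), ('%o', 8),
    ('%08.3f', 3.14159), ('%e', 1234.5), ('%G', 0.00001),
]
for fmt, value in cases:
    bytes_fmt = fmt.encode('ascii')
    # bytes template: same text as str formatting, but as ASCII bytes.
    assert bytes_fmt % value == numeric_via_str(fmt, value)
    # bytearray templates follow the same rule but return a bytearray.
    assert bytearray(bytes_fmt) % value == bytearray(numeric_via_str(fmt, value))
```

The modifiers (#, 0, -, space, +) ride along unchanged, since the bytes result is just the ASCII encoding of what str formatting would have produced.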

I don't understand the purpose of this sentence. Does it mean that %a must not be used? IMO this sentence can be removed.

The sentence about %a being for debugging has been removed.

Non-ASCII values will be encoded to either \xnn or \unnnn representation.

Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'

Removed. With the explicit reference to the 'backslashreplace' error handler, anyone who wants to know what the output might look like can refer to that.
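For readers who do want to see it, here is a small sketch of the three escape widths that 'backslashreplace' produces when encoding to ASCII. The sample characters are arbitrary picks, one per range:

```python
samples = {
    '\xe9': b'\\xe9',               # U+00E9, below U+0100: \xnn
    '\u20ac': b'\\u20ac',           # U+20AC, below U+10000: \unnnn
    '\U0001f600': b'\\U0001f600',   # U+1F600, outside the BMP: \Unnnnnnnn
    chr(0x10ffff): b'\\U0010ffff',  # the highest code point
}
for char, escaped in samples.items():
    assert char.encode('ascii', 'backslashreplace') == escaped
```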

.. note::

   If a str is passed into %a, it will be surrounded by quotes.

And:

- bytes gets a "b" prefix and is surrounded by quotes as well (b'...')
- the quote ' is escaped as \' if the string contains both quotes ' and "

Shouldn't be an issue now with the new definition which no longer references the ascii() function.

Can you also please add examples for %a?


Examples::

 >>> b'%a' % 3.14
 b'3.14'

 >>> b'%a' % b'abc'
 b"b'abc'"

 >>> b'%a' % 'def'
 b"'def'"

Proposed variations
===================

It would be fair to mention also a whole different PEP, Antoine's PEP 460!

My apologies for the omission.

A competing PEP, PEP 460 Add binary interpolation and formatting [9], also exists.

.. [9] http://python.org/dev/peps/pep-0460/

Thank you, Victor.


