[Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3
Ethan Furman ethan at stoneleaf.us
Wed Mar 26 15:35:52 CET 2014
On 03/26/2014 03:10 AM, Victor Stinner wrote:
2014-03-25 23:37 GMT+01:00 Ethan Furman:
``%a`` will call ``ascii()`` on the interpolated value.

I'm not sure that I understood correctly: is the "%a" format supported? The result of ascii() is a Unicode string. Does it mean that ("%a" % obj) should give the same result as ascii(obj).encode('ascii', 'strict')?
Changed to:
``%a`` will give the equivalent of
``repr(some_obj).encode('ascii', 'backslashreplace')`` on the interpolated
value. Use cases include developing a new protocol and writing landmarks
into the stream; debugging data going into an existing protocol to see if
the problem is the protocol itself or bad data; a fall-back for a serialization
format; or any situation where defining ``__bytes__`` would not be appropriate
but a readable/informative representation is needed [8].
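
As a rough illustration of the revised wording, the sketch below spells out the stated equivalent in plain Python. The helper name ``percent_a`` and the ``Landmark`` class are hypothetical, invented here for the "landmark in the stream" use case; the comparison against ``b'%a'`` assumes an interpreter that implements this PEP (CPython 3.5 or later).

```python
def percent_a(obj):
    """Hypothetical helper: the equivalent given in the revised %a text."""
    return repr(obj).encode('ascii', 'backslashreplace')


class Landmark:
    """Hypothetical marker object with a readable repr and no __bytes__."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return 'Landmark(%r)' % self.name


mark = Landmark('start-of-frame')

# The helper and the real %a code should agree on the resulting bytes.
via_helper = percent_a(mark)
via_format = b'%a' % (mark,)

print(via_helper)  # b"Landmark('start-of-frame')"
print(via_format)  # b"Landmark('start-of-frame')"
```

The object is wrapped in a one-element tuple so that the example keeps working if the interpolated value is itself a tuple.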
Would it be possible to add a table or list to summarize supported format characters? I found:
- single byte: %c
- integer: %d, %u, %i, %o, %x, %X, %f, %g, "etc." (can you please complete "etc."?)
- bytes and __bytes__ method: %s
- ascii(): %a
Changed to:
%-interpolation
All the numeric formatting codes (``d``, ``i``, ``o``, ``u``, ``x``, ``X``,
``e``, ``E``, ``f``, ``F``, ``g``, ``G``, and any that are subsequently added
to Python 3) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers (currently ``#``, ``0``,
``-``, ``' '`` (space), and ``+`` (plus any added to Python 3)). The only
non-numeric codes allowed are ``c``, ``s``, and ``a``.
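
A small sketch of what "work as they do for str" looks like in practice, assuming an interpreter that implements this PEP (CPython 3.5 or later). The format strings and values are arbitrary samples chosen for illustration; each bytes result is checked against the ASCII-encoded result of the same format applied to ``str``.

```python
# (format, value) pairs covering the listed modifiers: 0, -, +, space, #.
numeric_samples = [
    (b'%d', 42),
    (b'%05d', 42),      # zero padding
    (b'%-5d|', 42),     # left justification
    (b'%+d', 42),       # explicit sign
    (b'% d', 42),       # space for positive numbers
    (b'%#x', 255),      # alternate form
    (b'%X', 255),
    (b'%.2f', 3.14159),
    (b'%e', 1234.5),
]

results = {}
for fmt, value in numeric_samples:
    as_bytes = fmt % value
    as_text = fmt.decode('ascii') % value
    # The bytes result is the ASCII encoding of the str result.
    assert as_bytes == as_text.encode('ascii')
    results[fmt] = as_bytes
    print(fmt, '->', as_bytes)

# The non-numeric codes c and s (a is covered by the %a text).
single_from_int = b'%c' % 65         # b'A'
single_from_bytes = b'%c' % b'A'     # b'A'
passthrough = b'%s' % b'abc'         # b'abc'

# bytearray interpolation yields a bytearray.
ba_result = bytearray(b'%05d') % 42
```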
For the numeric codes, the only difference between ``str`` and ``bytes`` (or
``bytearray``) interpolation is that the results from these codes will be
ASCII-encoded text, not unicode. In other words, for any numeric formatting
code ``%x``::
I don't understand the purpose of this sentence. Does it mean that %a must not be used? IMO this sentence can be removed.
The sentence about %a being for debugging has been removed.
Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn`` representation.

Unicode is larger than that! print(ascii(chr(0x10ffff))) => '\U0010ffff'
Removed. With the explicit reference to the 'backslashreplace' error handler any who want to know what it might look like can refer to that.
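
For readers who do want to see what it looks like, here is a brief sketch of the 'backslashreplace' handler applied to ``repr()`` output. The three sample characters are arbitrary picks, one from each escape width; nothing here is taken from the PEP text itself.

```python
# One character per escape width: \xNN, \uNNNN, \UNNNNNNNN.
samples = ['\xe9', '\u20ac', '\U0010ffff']

escaped_forms = [
    repr(text).encode('ascii', 'backslashreplace') for text in samples
]

for form in escaped_forms:
    print(form)
# b"'\\xe9'"
# b"'\\u20ac'"
# b"'\\U0010ffff'"
```

The last case covers the point raised above: code points beyond the Basic Multilingual Plane come out in the eight-digit ``\U`` form.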
.. note::

    If a ``str`` is passed into ``%a``, it will be surrounded by quotes.

And:

- bytes gets a "b" prefix and is surrounded by quotes as well (b'...')
- the quote ' is escaped as \' if the string contains both quotes ' and "
Shouldn't be an issue now with the new definition which no longer references the ascii() function.
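
To make the quoting behavior concrete, the sketch below runs a few hand-picked values through ``%a``. It assumes CPython 3.5 or later, and the sample values are illustrative only. Because the result follows ``repr()``, a ``bytes`` argument keeps its ``b`` prefix and quotes.

```python
samples = ['def', "it's", 'say "hi"', 'it\'s "both"', b'abc']

# Wrap each value in a tuple so it is always treated as a single argument.
rendered = [b'%a' % (obj,) for obj in samples]

for obj, out in zip(samples, rendered):
    print(repr(obj), '->', out.decode('ascii'))
# 'def' -> 'def'
# "it's" -> "it's"
# 'say "hi"' -> 'say "hi"'
# 'it\'s "both"' -> 'it\'s "both"'
# b'abc' -> b'abc'
```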
Can you also please add examples for %a?
Examples::

    >>> b'%a' % 3.14
    b'3.14'

    >>> b'%a' % b'abc'
    b"b'abc'"

    >>> b'%a' % 'def'
    b"'def'"
Proposed variations
===================
It would be fair to mention also a whole different PEP, Antoine's PEP 460!
My apologies for the omission.
A competing PEP, ``PEP 460 Add binary interpolation and formatting`` [9], also
exists.
.. [9] http://python.org/dev/peps/pep-0460/
Thank you, Victor.