[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5 (original) (raw)

Ethan Furman ethan at stoneleaf.us
Sat Jan 11 01:23:53 CET 2014


On 01/08/2014 02:42 PM, Antoine Pitrou wrote:

With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation.

From the PEP:

Python 3 generally mandates that text be stored and manipulated as unicode (i.e. str objects, not bytes). In some cases, though, it makes sense to manipulate bytes objects directly. Typical usage is binary network protocols, where you can want to interpolate and assemble several bytes object (some of them literals, some of them compute) to produce complete protocol messages. For example, protocols such as HTTP or SIP have headers with ASCII names and opaque "textual" values using a varying and/or sometimes ill-defined encoding. Moreover, those headers can be followed by a binary body... which can be chunked and decorated with ASCII headers and trailers!

As it stands now, the PEP talks about ASCII, about how it makes sense sometimes to work directly with bytes objects, and then refuses to allow % to embed ASCII text in the byte stream.

All other features present in formatting of str objects (either through the percent operator or the str.format() method) are unsupported. Those features imply treating the recipient of the operator or method as text, which goes counter to the text / bytes separation (for example, accepting %d as a format code would imply that the bytes object really is a ASCII-compatible text string).

No, it implies that portion of the byte stream is ASCII compatible. And we have several examples: PDF, HTML, DBF, just about every network protocol (not counting M$), and, I'm sure, plenty I haven't heard of.

-1 on the PEP as it stands now.

-- Ethan



More information about the Python-Dev mailing list