[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5
Nick Coghlan ncoghlan at gmail.com
Sat Jan 11 09:17:07 CET 2014
On 11 January 2014 08:58, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
>> On Fri, 10 Jan 2014 17:33:57 -0500 "Eric V. Smith" <eric at trueblade.com> wrote:
>>> On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
>>>> On Fri, 10 Jan 2014 12:56:19 -0500 "Eric V. Smith" <eric at trueblade.com> wrote:
>>>>> I agree. I don't see any reason to exclude int and float. See Guido's messages http://bugs.python.org/issue3982#msg180423 and http://bugs.python.org/issue3982#msg180430 for some justification and discussion.
>>>>
>>>> If you are representing int and float, you're really formatting a text message, not bytes. Basically if you allow the formatting of int and float instances, there's no reason not to allow the formatting of arbitrary objects through str. It doesn't make sense to special-case those two types and nothing else.
>>>
>>> It might not for .format(), but I'm not convinced. But for %-formatting, str is already special-cased for these types.
>>
>> That's not what I'm saying. str.__mod__ is able to represent all kinds of types through %s and calling str. It doesn't make sense for bytes.__mod__ to only support int and float. Why only them?
>
> Because embedding the ASCII equivalent of ints and floats in byte streams is a common operation?

It's emphatically NOT a binary interpolation operation though - the binary representation of the integer 1 is the byte value 1, not the byte value 49. If you want the byte value 49 to appear in the stream, then you need to interpolate the ASCII encoding of the string "1", not the integer 1.
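As a minimal illustration of that distinction (a sketch in Python 3; the variable names are only for this example), the integer 1 and the ASCII text "1" produce different bytes:

```python
# The integer 1 as a raw byte, versus the ASCII text "1" as a byte.
binary_one = bytes([1])              # a single byte whose value is 1
ascii_one = str(1).encode("ascii")   # a single byte whose value is 49 (b"1")

print(binary_one)    # b'\x01'
print(ascii_one)     # b'1'
print(ascii_one[0])  # 49
```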
If you want to manipulate text representations, do it in the text domain. If you want to manipulate binary representations, do it in the binary domain. The whole point of the text model change in Python 3 is to force programmers to decide which domain they're operating in at any given point in time - while the approach of blurring the boundaries between the two can be convenient for wire protocol and file format manipulation, it is a horrendous bug magnet everywhere else.
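A rough sketch of what keeping the two domains separate can look like in practice. The helper names `build_content_length` and `build_length_prefix` are hypothetical and exist only for this illustration; the first formats in the text domain and encodes once at the boundary, the second stays in the binary domain throughout:

```python
import struct

def build_content_length(length):
    # Text domain: format the number as text, then encode explicitly.
    header_text = "Content-Length: %d\r\n" % length
    return header_text.encode("ascii")

def build_length_prefix(length):
    # Binary domain: pack the number as a 4-byte big-endian unsigned integer.
    return struct.pack(">I", length)

text_header = build_content_length(42)
binary_prefix = build_length_prefix(42)
```

Here `text_header` carries the ASCII digits "42", while `binary_prefix` carries the four-byte binary encoding of the same number; which one is right depends entirely on what the protocol specifies.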
PEP 460 is just about adding back some missing functionality in the binary domain (interpolating binary sequences together), not about bringing back the problematic text model that allows particular text representations to be interpreted as if they were also binary data.
That said, I actually think there's a valid use case for a Python 3 type that allows the bytes/text boundary to be blurred in making it easier to port certain kinds of Python 2 code to Python 3 (specifically, working with wire protocols and file formats that contain a mixture of encodings, but all encodings are known to at least be ASCII compatible). It is highly unlikely that such a type will ever be part of the standard library, though - idiomatic Python 3 code shouldn't need it, affected Python 2 code can be ported without it (but may look more complicated due to the use of explicit decoding and encoding operations, rather than relying on implicit ones), and it should be entirely possible to implement it as an extension module (modulo one bug in CPython that may impact the approach, but we won't know for sure until people actually try it out).
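For comparison, here is a sketch of the explicit decode/encode style that ported code can use without such a type. It assumes a made-up header format in which field names are ASCII and the value's encoding is known to the caller; `parse_header` and `build_header` are hypothetical helpers, not part of any library:

```python
def parse_header(line, value_encoding):
    # Split in the binary domain, then decode each part explicitly.
    name, _, value = line.partition(b":")
    return name.decode("ascii").strip(), value.strip().decode(value_encoding)

def build_header(name, value, value_encoding):
    # Encode each part explicitly, then join in the binary domain.
    return name.encode("ascii") + b": " + value.encode(value_encoding) + b"\r\n"

raw = b"X-Name: Andr\xe9\r\n"          # value bytes are Latin-1 here
name, value = parse_header(raw, "latin-1")
rebuilt = build_header(name, value, "latin-1")
```

This is more verbose than the implicit Python 2 style, but every point where bytes become text (or the reverse) is visible in the code.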
Fortunately, after years of my suggesting the idea to almost everyone that complained about the move away from the broken POSIX text model in Python 3, Benno Rice has started experimenting with such a type based on a preliminary test case I wrote at linux.conf.au last week: https://github.com/jeamland/asciicompat/blob/master/tests/ncoghlan.py
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia