[Python-Dev] PEP 460 reboot

Nick Coghlan ncoghlan at gmail.com
Mon Jan 13 10:13:48 CET 2014


On 13 Jan 2014 17:43, "Ethan Furman" <ethan at stoneleaf.us> wrote:

On 01/12/2014 10:51 PM, Nick Coghlan wrote:

I am a strong -1 on the more lenient proposal, as it makes binary interpolation in Python 3 an unsafe operation for ASCII incompatible binary formats.

No more unsafe than calling .upper() on ASCII incompatible streams.

Right - Guido's proposal is completely useless for arbitrary binary data. You can't trust it.

However, Python 3 has no equivalent binary interpolation feature that is safe for arbitrary binary data, so the lenient version will be a bug magnet if it is the only version of binary interpolation provided.

However, if new formatb and formatb_map methods were included in the proposal with the current strict PEP 460 semantics, then my objections would be reduced substantially. In that case, we'd still be providing the new binary interpolation feature in addition to restoring the ASCII compatible interpolation feature, so the latter would be less of an attractive nuisance when writing code that needs to handle arbitrary binary formats and can't assume ASCII compatibility.

With that approach, I'd even support the idea of implicit strict ASCII encoding of text inputs for the ASCII compatible version.
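To make the two sets of semantics concrete, here is a minimal sketch. The helper names strict_formatb and ascii_format, and the b"{}" placeholder syntax, are hypothetical and chosen only for illustration; they are not the API proposed in PEP 460. The strict helper accepts only bytes-like arguments, while the ASCII-compatible helper additionally encodes str with the strict ASCII codec and renders int as ASCII digits.

```python
def strict_formatb(template: bytes, *args) -> bytes:
    """Hypothetical strict interpolation: bytes in, bytes out, no encoding."""
    parts = template.split(b"{}")
    if len(parts) - 1 != len(args):
        raise ValueError("number of placeholders does not match arguments")
    out = bytearray(parts[0])
    for arg, tail in zip(args, parts[1:]):
        # Only bytes-like objects are accepted; str and int are rejected.
        if not isinstance(arg, (bytes, bytearray, memoryview)):
            raise TypeError(f"expected bytes-like object, got {type(arg).__name__}")
        out += bytes(arg)
        out += tail
    return bytes(out)


def ascii_format(template: bytes, *args) -> bytes:
    """Hypothetical ASCII-compatible interpolation built on the strict one."""
    converted = []
    for arg in args:
        if isinstance(arg, str):
            # Strict ASCII: non-ASCII text raises UnicodeEncodeError.
            arg = arg.encode("ascii")
        elif isinstance(arg, int):
            # Numbers are rendered as ASCII digits, which assumes an
            # ASCII-compatible output format.
            arg = str(arg).encode("ascii")
        converted.append(arg)
    return strict_formatb(template, *converted)


# Arbitrary binary data: only the strict helper is appropriate here.
frame = strict_formatb(b"\x00{}\xff", b"\x01\x02")

# ASCII-compatible wire protocol: implicit conversions are convenient.
header = ascii_format(b"Content-Length: {}\r\n", 42)
```

With both helpers available, code handling arbitrary binary formats can stick to the strict one and never pick up an implicit ASCII assumption by accident.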

The existing binary operations that assume ASCII do so inherently - they're not input driven, the operation itself assumes ASCII, so if you're working with data that may not be ASCII compatible, you simply don't use them (these are operations like title(), upper(), lower(), the default arguments for split() and strip(), etc).

How is this different from not using % interpolation when the byte stream is incompatible? It isn't.
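A short illustration of what "the operation itself assumes ASCII" means for the existing bytes methods. The sample data is made up for the example; the behaviour shown is that of the standard bytes.upper() and bytes.split() methods.

```python
# U+0161 (LATIN SMALL LETTER S WITH CARON) encodes to the bytes 0x61 0x01
# in UTF-16-LE, and 0x61 happens to be ASCII 'a'.
text = "\u0161"
data = text.encode("utf-16-le")

# bytes.upper() rewrites every byte in the ASCII a-z range, regardless of
# what the byte means in the actual encoding.
mangled = data.upper()

# The result is still valid UTF-16-LE, but it is a different character:
# U+0141 (LATIN CAPITAL LETTER L WITH STROKE).
decoded = mangled.decode("utf-16-le")

# The default split() treats ASCII whitespace bytes as separators, so a
# binary record containing 0x20 as a data byte is silently cut in two.
record = b"\x01\x20\x02"
fields = record.split()
```

Neither call raises an error; the data is just quietly wrong, which is why these methods are only suitable for ASCII-compatible formats.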

Because I want to use the PEP 460 binary interpolation API, but wouldn't be able to use Guido's more lenient proposal, as it is a bug magnet in the presence of arbitrary binary data. Provide both APIs and my objections go away - ASCII interpolation just becomes another way to translate between structured and text data, while binary interpolation would be a strictly binary only operation.

And what do you mean by "input driven"? If the LHS is bytes, the result is bytes, no matter what the input is. This is not the Py2 world where you may end up with str or unicode; you always end up with bytes if the LHS is bytes.

The LHS may or may not be tainted with assumptions about ASCII compatibility, which means it effectively is tainted with such assumptions, which means code that needs to handle arbitrary binary data can't use it and is left without a binary interpolation feature.

That's why adding formatb to Guido's more lenient proposal resolves my objections: it provides the binary interpolation feature I want, and maintains Python 3's clear distinction between the text domain and the binary domain.

Cheers, Nick.

[snip the rest that seems to flow from these misunderstandings] -- Ethan

