[Python-Dev] PEP 460 reboot
Nick Coghlan ncoghlan at gmail.com
Thu Jan 16 01:35:42 CET 2014
On 15 Jan 2014 20:58, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
Aside: OK, Guido, ya got me. I have a separate screed recounting the reasons for my apostasy, but that's probably not interesting any more. I'll send it to individuals on request.

> But in terms of explaining the text model, that
> separation is important enough that
>
> (1) We should be reluctant to strengthen the
> "its really just ASCII" messages.

True. I think the right message is "Unless you know why you desperately want this, not only don't you need it, but using it is the Python equivalent of skydiving without a parachute."

N.B. Don't take the metaphor as an insult. I think it's become clear that those who "desperately want this" not only use parachutes, they pack their own. No need to worry about them.

> (2) It may be worth creating a virtual
> split in the documentation.

Please don't. All we need to tell naive users is:

Look at the structure of the bytes. If that structure is "text", convert to str using .decode(). Please don't use bytes.

If that structure isn't text, you're in a specialist domain, and it's your problem. Many structured uses of bytes use ASCII-encoded keywords: we provide bytes methods for handling them, but you must be aware that these methods cannot distinguish "bytes representing text encoded as ASCII" from "any old bytes". Be warned: They will happily -- and silently -- corrupt the latter. Make sure you respect the higher-level structure of your data when using them.
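The advice above can be made concrete with a minimal sketch. It assumes Python 3, and every sample value and field layout in it is made up for illustration: text gets decoded to str, while an ASCII-assuming method such as bytes.upper() silently rewrites binary data that only happens to contain bytes in the a-z range.

```python
import struct

# Case 1: the bytes are text. Decode once, then work with str.
raw_text = "naïve café".encode("utf-8")
text = raw_text.decode("utf-8")
upper_text = text.upper()            # str.upper() understands non-ASCII letters

# bytes.upper() only maps the ASCII letters a-z and leaves every other byte alone.
upper_bytes = raw_text.upper()

# Case 2: the bytes are NOT text. Hypothetical payload: four little-endian
# signed 16-bit samples, some of which contain bytes in the a-z range.
samples = struct.pack("<4h", 0x6162, 0x0041, 0x7A7A, -1)
mangled = samples.upper()            # no error is raised; the data is just changed

# Case 3: respect the higher-level structure. Only the keyword field is
# ASCII text, so only that field is passed to an ASCII-assuming method.
message = b"open " + samples         # hypothetical "keyword, space, payload" format
keyword, _, payload = message.partition(b" ")
keyword = keyword.upper()
```

In this sketch the payload survives unchanged only because upper() is applied to the keyword field alone; applying it to the whole message would alter the samples just as in case 2.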
Yes, I'm currently thinking the appropriate approach to the docs will be to remove the current "these have most of the str methods too" paragraph for binary sequences and instead create three completely explicit lists of methods:
- provided, works with arbitrary data
- provided, assumes the use of an ASCII compatible data format
- not provided
PEP 461 would add a fourth category: provided, but with more restricted semantics.
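As a rough illustration of what such lists might look like, here is a small sketch. The groupings are an assumption made for this example, not the lists that would appear in the docs, and the last part needs Python 3.5 or later, where the bytes % formatting proposed in PEP 461 is available.

```python
# Hypothetical groupings for illustration only; not the official lists.
works_with_arbitrary_data = ["find", "replace", "join", "startswith", "partition"]
assumes_ascii_compatible = ["upper", "lower", "title", "isalpha", "isdigit"]
not_provided = ["encode", "format", "casefold", "isnumeric", "isidentifier"]

provided = [name for name in works_with_arbitrary_data + assumes_ascii_compatible
            if hasattr(bytes, name)]
missing = [name for name in not_provided if not hasattr(bytes, name)]

# A method from the first group is safe on arbitrary binary data,
# because it is told exactly which bytes to look for.
blob = b"\x00\xffkey\x00\xff"
replaced = blob.replace(b"\x00", b"\x01")

# Fourth category (Python 3.5+): %-formatting exists on bytes, but with
# restricted semantics. Numbers are formatted as ASCII digits, while a str
# argument to %s is rejected instead of being implicitly encoded.
formatted = b"%d items" % 3
try:
    b"%s" % "text"
    str_rejected = False
except TypeError:
    str_rejected = True
```

The hasattr() checks only show which names exist on bytes; they say nothing about which category a method belongs to, which is why explicit lists in the documentation would still be needed.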
Cheers, Nick.
> Virtual subclass ASCIIStructuredBytes
> ====================================
>
> One particularly common use of bytes is to represent
> the contents of a file, or of a network message. In
> these cases, the bytes will often represent Text
> in a specific encoding and that encoding will usually
> be a superset of ASCII. Rather than create and support
> an ASCIIStructuredBytes subclass, Python simply added
> support for these use cases straight to Bytes objects,
> and assumes that this support simply won't be used
> when it does not make sense. For example, bytes literals

This is going quite the wrong direction, I think. The only people who should care about "Text in a specific encoding and that encoding will usually be a superset of ASCII" are codec writers, and by now writing those is a very rare task. Everybody else uses ASCII keywords in a simple formal language.

> could be used to construct a sound sample, but the
> literals will be far easier to read when they are used
> to represent (encoded) ASCII text, such as "OPEN".