[Python-Dev] PEP 460 reboot (original) (raw)
Stephen J. Turnbull stephen at xemacs.org
Wed Jan 15 11:57:16 CET 2014
Aside: OK, Guido, ya got me.
I have a separate screed recounting the reasons for my apostasy, but that's probably not interesting any more. I'll send it to individuals on request.
But in terms of explaining the text model, that separation is important enough that
(1) We should be reluctant to strengthen the "it's really just ASCII" messages.
True. I think the right message is "Unless you know why you desperately want this, not only don't you need it, but using it is the Python equivalent of skydiving without a parachute."
N.B. Don't take the metaphor as an insult. I think it's become clear that those who "desperately want this" not only use parachutes, they pack their own. No need to worry about them.
(2) It *may* be worth creating a virtual split in the documentation.
Please don't. All we need to tell naive users is:
Look at the structure of the bytes. If that structure is "text",
convert to str using .decode(). Please don't use bytes.
If that structure isn't text, you're in a specialist domain, and
it's your problem. Many structured uses of bytes use ASCII-
encoded keywords: we provide bytes methods for handling them, but
you *must* be aware that these methods *cannot* distinguish "bytes
representing text encoded as ASCII" from "any old bytes". Be
warned: They will happily -- and silently -- corrupt the latter.
Make sure you respect the higher-level structure of your data when
using them.
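
To make that warning concrete, here is a minimal sketch of the three cases. The sample text, the request line, and the sample values are made up for illustration; only `bytes.decode()`, `bytes.split()`, `bytes.upper()`, and the `struct` module are real.

```python
import struct

# Case 1: the bytes are text. Decode first, then work with str.
raw = "naïve café".encode("utf-8")
text = raw.decode("utf-8")
shouted = text.upper()  # str.upper() handles non-ASCII letters

# Case 2: an ASCII keyword inside a structured message (hypothetical request line).
# Using a bytes method on the keyword alone respects the higher-level structure.
request = b"get /index.html HTTP/1.1\r\n"
verb = request.split(b" ", 1)[0].upper()

# Case 3: "any old bytes": four 16-bit little-endian samples (made-up values).
samples = (0x6161, 0x0102, 0x7A7A, -1)
packed = struct.pack("<4h", *samples)

# bytes.upper() cannot tell these are not text. Every byte in the range
# 0x61..0x7A is rewritten, with no error and no warning.
mangled = struct.unpack("<4h", packed.upper())
```

Here `verb` comes out as `b"GET"`, which is what was wanted, while `mangled` no longer equals `samples`: two of the four values changed silently.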
Virtual subclass ASCIIStructuredBytes
=====================================

One particularly common use of bytes is to represent the contents of a file, or of a network message. In these cases, the bytes will often represent Text *in a specific encoding* and that encoding will usually be a superset of ASCII. Rather than create and support an ASCIIStructuredBytes subclass, Python simply added support for these use cases straight to Bytes objects, and assumes that this support simply won't be used when it does not make sense. For example, bytes literals
This is going quite the wrong direction, I think. The only people who should care about "Text in a specific encoding and that encoding will usually be a superset of ASCII" are codec writers, and by now writing those is a very rare task. Everybody else uses ASCII keywords in a simple formal language.
*could* be used to construct a sound sample, but the literals will be far easier to read when they are used to represent (encoded) ASCII text, such as "OPEN".
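
As a rough illustration of that readability point, the sketch below writes the same kind of object both ways. The sample values are invented, not taken from any real audio format.

```python
# An ASCII keyword reads naturally as a bytes literal...
keyword = b"OPEN"
# ...and is the same object as its numeric spelling.
same_keyword = bytes([0x4F, 0x50, 0x45, 0x4E])

# Six 8-bit sound samples (made-up values): legal as a literal,
# but much easier to follow as a list of numbers.
sample_literal = b"\x80\x9c\xb5\xc8\xd4\xd8"
sample_values = bytes([128, 156, 181, 200, 212, 216])
```

Both pairs compare equal. Note also that the repr of `same_keyword` is `b'OPEN'`, so bytes that merely happen to fall in the ASCII range are displayed as text whether or not they were meant as text.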