[Python-Dev] PEP 460 reboot

Wed Jan 15 11:57:16 CET 2014

(1)  We should be reluctant to strengthen the
     "it's really just ASCII" messages.
(2)  It *may* be worth creating a virtual
     split in the documentation.

Look at the structure of the bytes.  If that structure is "text",
convert to str using .decode().  Please don't use bytes.

If that structure isn't text, you're in a specialist domain, and
it's your problem.  Many structured uses of bytes use ASCII-
encoded keywords: we provide bytes methods for handling them, but
you *must* be aware that these methods *cannot* distinguish "bytes
representing text encoded as ASCII" from "any old bytes".  Be
warned: They will happily -- and silently -- corrupt the latter.
Make sure you respect the higher-level structure of your data when
using them.
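
As a minimal sketch of the warning above (the header line and the packed
integer are made-up illustrations, not taken from any real protocol): text-
structured bytes are decoded before use, while an ASCII-oriented bytes
method applied to binary data changes it without raising any error.

```python
import struct

# Text-structured bytes: decode to str first, then use str methods.
raw_text = b"content-type: text/plain"      # hypothetical header line
header = raw_text.decode("ascii").upper()

# Non-text bytes: a little-endian 16-bit integer whose two bytes happen
# to fall inside the ASCII lowercase range.
packed = struct.pack("<H", 0x6162)          # the bytes b'ba'
damaged = packed.upper()                    # no error is raised here

original_value = struct.unpack("<H", packed)[0]
damaged_value = struct.unpack("<H", damaged)[0]

print(header)
print(hex(original_value), hex(damaged_value))
```

bytes.upper() has no way to know that `packed` is an integer rather than
text, so the stored value silently changes from 0x6162 to 0x4142.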

Virtual subclass ASCIIStructuredBytes
=====================================

One particularly common use of bytes is to represent
the contents of a file, or of a network message.  In
these cases, the bytes will often represent text
*in a specific encoding*, and that encoding will usually
be a superset of ASCII.  Rather than create and support
an ASCIIStructuredBytes subclass, Python simply added
support for these use cases straight to bytes objects,
and assumes that this support simply won't be used
when it does not make sense.  For example, bytes literals
*could* be used to construct a sound sample, but the
literals will be far easier to read when they are used
to represent (encoded) ASCII text, such as "OPEN".
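
A small sketch of the readability point, assuming a made-up four-byte
"sound sample" purely for illustration: both values are ordinary bytes
objects, but only the ASCII keyword reads naturally as a literal.

```python
# An ASCII-encoded keyword: the literal reads like the text it encodes.
command = b"OPEN"

# A hypothetical 8-bit sound sample: the same kind of object, but the
# literal form is much harder to read than a list of integers.
sample = b"\x00\x7f\xff\x80"
same_sample = bytes([0, 127, 255, 128])

# The type does not record which interpretation is intended: sample
# data that happens to be printable is displayed as ASCII text too.
looks_like_text = bytes([79, 80, 69, 78])

print(sample == same_sample)
print(repr(looks_like_text))
```

The last value compares equal to `command`, which is why the ASCII-
oriented support lives on bytes itself rather than on a separate subclass:
nothing in the object distinguishes the two uses.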