[Python-Dev] bytes type discussion

Guido van Rossum guido at python.org
Wed Feb 15 21:33:10 CET 2006


On 2/14/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

Fred L. Drake, Jr. wrote:

> The proper response in this case is often to re-start decoding with the correct encoding, since some of the data extracted so far may have been decoded incorrectly.

If the protocol has been sensibly designed, that shouldn't happen, since everything up to the coding marker should be ascii (or some other protocol-defined initial coding).

For protocols that are not sensibly designed (or if you're just trying to guess) what you suggest may be needed. But it would be good to have a nicer way of going about it for when the protocol is sensible.

I think that the implementation of encoding-guessing or auto-encoding-upgrade techniques should be left out of the standard library design for now. I know that XML does something like this, but fortunately we employ dedicated C code to parse XML so that particular case should be taken care of without complicating the rest of the standard I/O library.

As for searching bytes objects, that shouldn't be a problem as long as the search 'string' is also specified as a bytes object.
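
To illustrate the point about searching, here is a minimal sketch using the bytes type as it later shipped in Python 3 (which postdates this message). The wire format, with an ASCII "coding: <name>" header line followed by the payload, is invented purely for illustration, as are all the variable names; it is not a format from any real protocol discussed in this thread.

```python
# Hypothetical wire format (an assumption for this example): an ASCII header
# line "coding: <name>\n" followed by a payload in the named encoding.
data = b"coding: latin-1\n" + "caf\u00e9".encode("latin-1")

marker = b"coding: "          # the search 'string' is itself a bytes object
start = data.find(marker)     # bytes.find() takes a bytes argument
end = data.find(b"\n", start)

# By assumption, everything up to the end of the header line is ASCII,
# so it can be decoded safely before the real encoding is known.
encoding = data[start + len(marker):end].decode("ascii")
payload = data[end + 1:].decode(encoding)

# Searching bytes with a text string is rejected instead of guessing
# an encoding for the pattern.
try:
    data.find("coding: ")
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Because both the haystack and the needle are bytes, no decoding is needed to locate the marker; only the header value and the payload are decoded, each with a known encoding.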

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


