[Python-Dev] PEP 263 considered faulty (for some Japanese)

M.-A. Lemburg mal@lemburg.com
Tue, 12 Mar 2002 18:55:19 +0100

Guido van Rossum wrote:

> > My position on this is not to introduce more defaults -- explicit
> > is better than implicit and in this particular case (encodings)
> > it'll result in a net win.
>
> I'd like to believe you. But the fact that apparently there are
> Japanese users who are willing to give up part of the language and
> library just so they can have a certain default suggests that the
> need for defaults is a strong force. Maybe we would've been better
> off leaving sys.setdefaultencoding() enabled -- then those people
> might have put sys.setdefaultencoding("utf-16") at the top of their
> program rather than hacking site.py... :-(

(Actually, they'll tweak sitecustomize.py.) It's good that they can only apply the change in this one location. Placing it inside the various modules would cause a maintenance nightmare, removing one of the great advantages of Python over other languages.

Anyway, I tend to believe that changes to the default encoding are only rarely needed and then only to overcome problems with occasional use of Unicode.

It's a myth that you can port a program to Unicode by tweaking the default encoding to fit your environment alone.
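To illustrate why a single default cannot fit every data source, here is a minimal sketch. It assumes a program that receives the same text from two sources in different encodings; `implicit_decode` is a hypothetical stand-in for an implicit byte-string-to-Unicode conversion, not an actual interpreter hook, and the sketch is written for a current Python 3.

```python
# Two byte strings arriving from different sources, each in its own encoding.
from_file = "日本語".encode("shift_jis")
from_network = "日本語".encode("utf-8")


def implicit_decode(raw, assumed_default):
    """Hypothetical stand-in for an implicit conversion that uses one global default."""
    return raw.decode(assumed_default, errors="replace")


# Pick the default that suits the local files...
default = "shift_jis"

# ...and only the data that happens to match it survives intact.
file_ok = implicit_decode(from_file, default) == "日本語"
network_ok = implicit_decode(from_network, default) == "日本語"

print(file_ok, network_ok)
```

Whichever value is chosen for `default`, one of the two inputs is decoded wrongly, and with a lenient error handler this happens silently.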

A true port will have to follow the Unicode object through the complete processing chain and apply the needed changes along the chain (which is a lot of work, but certainly possible). A different strategy would be treating text data as binary data and not using Unicode at all. It all depends on the application scope.
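As a rough sketch of what following the Unicode object through the chain can look like: decode explicitly where bytes enter, work on Unicode in between, and encode explicitly where bytes leave. The function names and the choice of encodings below are assumptions made for illustration, and the code is written for a current Python 3.

```python
def read_text(raw: bytes, encoding: str) -> str:
    """Input boundary: bytes become Unicode, with the encoding named explicitly."""
    return raw.decode(encoding)


def process(text: str) -> str:
    """Middle of the chain: operates on Unicode only, never on raw bytes."""
    return text.upper()


def write_text(text: str, encoding: str) -> bytes:
    """Output boundary: Unicode becomes bytes, again with an explicit encoding."""
    return text.encode(encoding)


# Hypothetical input in Shift_JIS, output in UTF-8.
raw_in = "日本語 text".encode("shift_jis")
raw_out = write_text(process(read_text(raw_in, "shift_jis")), "utf-8")

print(raw_out.decode("utf-8"))  # 日本語 TEXT
```

No step relies on a process-wide default, so the result does not change when the code runs on a machine configured differently.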

> > > In the light of the post by Atsuo Ishimoto and the responses from both
> > > Marc-Andre Lemburg and Martin von Loewis, however, I'm not sure
> > > whether Suzuki Hisao's response represents the Japanese community, and
> > > it's possible that nothing needs to be done.
> >
> > Well, users using non-ASCII coding in their source files
> > should start to be explicit about the encoding (in phase 1
> > they'll get a warning printed which makes them aware of the
> > problem), but other than that, I don't see a need for
> > changes to the strategy.
>
> Suzuki won't get the warning, because his source files are pure
> ASCII -- but his Unicode string literals will be interpreted as
> utf-16, which will break his programs. The question is, do we care
> about him and others like him, or do we decide that their habits
> are bad for them and they have to change them?

We do care (after all, the PEP was designed for non-ASCII users), but it was never intended that we allow encodings like UTF-16 to be used for Python source code.

I'm afraid there's nothing much we can do. For UTF-8 they would just have to add a single coding comment to all source files, but there's nothing we can offer them for UTF-16.
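For illustration, a small sketch of the difference. PEP 263 places the coding comment in the first or second line of the file, where it must be readable before the encoding is known; that works for ASCII-compatible encodings such as UTF-8, but UTF-16 puts NUL bytes between the ASCII characters. The demo below uses `compile()` on byte strings under a current Python 3, which is an assumption of this sketch and not the 2002 implementation.

```python
source_text = "# -*- coding: utf-8 -*-\ngreeting = 'こんにちは'\n"

# UTF-8: the coding comment stays plain ASCII and the source compiles.
utf8_bytes = source_text.encode("utf-8")
namespace = {}
exec(compile(utf8_bytes, "<utf8-demo>", "exec"), namespace)

# UTF-16: every ASCII character is paired with a NUL byte, so the
# comment can no longer be found by scanning the raw bytes.
utf16_bytes = source_text.encode("utf-16")
has_nul_bytes = b"\x00" in utf16_bytes

try:
    compile(utf16_bytes, "<utf16-demo>", "exec")
    utf16_compiles = True
except (ValueError, SyntaxError):
    # The exception type differs between Python versions.
    utf16_compiles = False

print(namespace["greeting"], has_nul_bytes, utf16_compiles)
```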

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH

Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
