[Python-Dev] deleting setdefaultencoding in site.py is evil

exarkun at twistedmatrix.com
Tue Aug 25 18:23:05 CEST 2009


On 04:08 pm, chris at simplistix.co.uk wrote:

Hi All,

Would anyone object if I removed the deletion of sys.setdefaultencoding in site.py? I'm guessing "yes!" so thought I'd state my reasons now: This deletion appears to be pretty flimsy; reload(sys) and you have it back. Which is lucky, because I need it after it's been deleted...

The ability to change the default encoding is a misfeature. There's essentially no way to write correct Python code in the presence of this feature.

Using setdefaultencoding is never the sensible way to deal with encoded strings. Actually exposing this function in the sys module would lead all kinds of people who haven't fully grasped how str, unicode, and encodings work to do horrible things and create broken programs. It's bad enough that it's already possible to get this function back with the reload(sys) trick.
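As a rough illustration of why a single process-wide default cannot be right for every byte string, the sketch below decodes the same bytes under three different encodings. The sample bytes and variable names are made up for the example, and it uses explicit decode() calls rather than setdefaultencoding itself.

```python
# The same byte string means different things under different encodings,
# so any implicit, process-wide choice is wrong for some inputs.
raw = b"caf\xc3\xa9"  # the UTF-8 encoding of "cafe" with an acute accent

as_utf8 = raw.decode("utf-8")      # four characters, the intended text
as_latin1 = raw.decode("latin-1")  # five characters, silently wrong text

# The strict ASCII codec refuses the non-ASCII bytes instead of guessing.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

Under ASCII the mistake is loud (an exception); under any other global default it can be silent, which is arguably worse.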

Why? Well, because you can no longer put sitecustomize.py in a project-specific location (http://bugs.python.org/issue1734860), and because for some projects the only way I can deal with encoded strings sensibly is to use setdefaultencoding, in my case at the start of a script generated by zc.buildout's zc.recipe.egg. (I know all the encodings in this project are utf-8, but I don't want to go playing whack-a-mole with whatever modules this rather large project uses that haven't been made properly unicode aware.) Yes, it needs to be used as early as possible, and the docs should say this, but deleting it seems petty as a way of stopping its use when sitecustomize.py is too early and too system-wide, and spraying .decode('utf-8') calls all over a code base made up of a load of eggs managed by buildout simply isn't feasible... Thoughts?

It may be a major task, but the best thing you can do is find each str and unicode operation in the software you're working with and make them correct with respect to your inputs and outputs. Flipping a giant switch for the entire process is just going to change which things are wrong.
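A minimal sketch of what "correct with respect to your inputs and outputs" might look like: decode bytes to text once where data enters, work only with text inside, and encode once where data leaves. The helper names (read_text, write_text) and the utf-8 default are assumptions for illustration, not anything taken from the project under discussion.

```python
def read_text(raw_bytes, encoding="utf-8"):
    """Decode bytes at the input boundary, naming the encoding explicitly."""
    return raw_bytes.decode(encoding)


def write_text(text, encoding="utf-8"):
    """Encode text at the output boundary, naming the encoding explicitly."""
    return text.encode(encoding)


# Input boundary: bytes come in and are decoded exactly once.
incoming = b"caf\xc3\xa9"
text = read_text(incoming)

# Inside the program, operate on text only; no encoding is involved here.
shouted = text.upper()

# Output boundary: text goes out and is encoded exactly once.
outgoing = write_text(shouted)
```

Because each boundary states its own encoding, a module that handles latin-1 input can sit alongside one that handles utf-8 without either depending on a process-wide setting.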

Jean-Paul


