[Python-3000] Unicode and OS strings
Gregory P. Smith greg at krypto.org
Sat Sep 15 22:36:49 CEST 2007
- Previous message: [Python-3000] Move argv[0]? (Re: Unicode and OS strings)
- Next message: [Python-3000] Unicode and OS strings
On 9/14/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
Hagen Fürstenau wrote:
> sys.argv could be of type bytes and sys.arguments (or whatever) could be
> a function taking an encoding parameter (which defaults to UTF-8) and
> returning strings.
>
> Of course that's backwards incompatible and I'm not sure if it's too
> late for something like this now.
It would be pretty disruptive to ask everyone to change their habit of thinking of sys.argv as a list of strings.
Would it? We're already asking them to convert between bytes and unicode strings anywhere else I/O is done. I see the command line and environment as merely more forms of input. The only way to parse them into data structures automatically is to keep them as bytes. They are C concepts and can't imply an encoding. As it is, it's entirely possible to have -multiple- encodings on a command line at once, as well as in environment variables. They're all context sensitive. This isn't going to change.
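To make the mixed-encoding point concrete, here is a small sketch. The argument list is made up for illustration (it is not read from a real command line), and `try_decode` is a hypothetical helper, not an existing API. It shows one argument list where no single encoding decodes every item the way its author intended.

```python
# Hypothetical raw arguments, as bytes, the way a C main() would see them.
raw_argv = [
    b"prog",
    "Fürstenau".encode("utf-8"),    # e.g. typed in a UTF-8 terminal
    "Fürstenau".encode("latin-1"),  # e.g. a file name created under a Latin-1 locale
]


def try_decode(arg, encoding):
    """Return arg decoded with encoding, or None if the bytes are invalid in it."""
    try:
        return arg.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoding everything as UTF-8 fails outright on the Latin-1 argument.
utf8_view = [try_decode(a, "utf-8") for a in raw_argv]

# Decoding everything as Latin-1 never fails, but silently garbles the UTF-8 one.
latin1_view = [try_decode(a, "latin-1") for a in raw_argv]
```

Only the program knows which argument is a file name, which is user text, and so on, so only the program can pick the right encoding for each one.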
I would suggest doing it the other way around -- have sys.argv be an object that automatically converts to unicode on access, and something else, such as sys.argbytes, for getting the raw bytes if that fails.
I'd leave sys.argv as bytes and make sys.args/arguments/argstrs be some best-effort parsing. argv is the C/C++ name for bytes; let's not confuse people. Similarly for the environment: the os.environ dict should have bytes objects as keys and values (or perhaps a bytes subclass that refuses null bytes). The os.getenv and os.putenv functions should take care of any best-effort decoding/encoding, with getenv taking an optional encoding= parameter to specify one explicitly.
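A rough sketch of what I mean follows. Everything in it is hypothetical: `decode_best_effort`, this `getenv`, `environ_bytes` and `argv_bytes` are illustrative stand-ins, not the real os or sys API, and the "try UTF-8, then fall back to Latin-1" policy is just one assumption about what "best effort" could mean.

```python
def decode_best_effort(raw, encoding=None):
    """Decode raw bytes to str.

    With an explicit encoding, decode strictly and let errors propagate.
    Without one, try UTF-8 first and fall back to Latin-1 (an assumed policy).
    """
    if encoding is not None:
        return raw.decode(encoding)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail,
        # though the result may not be what the user meant.
        return raw.decode("latin-1")


def getenv(environ_bytes, name, default=None, encoding=None):
    """Look up name in a bytes-keyed environment and return a decoded str."""
    key = name.encode(encoding or "utf-8")
    raw = environ_bytes.get(key)
    if raw is None:
        return default
    return decode_best_effort(raw, encoding)


# Stand-ins for a bytes-only os.environ and sys.argv.
environ_bytes = {b"HOME": b"/home/gps", b"NAME": b"F\xfcrstenau"}
argv_bytes = [b"prog", b"--name", "Fürstenau".encode("utf-8")]

# The "argstrs" view: a best-effort decoded copy; the bytes stay authoritative.
argstrs = [decode_best_effort(a) for a in argv_bytes]
```

Callers who know the encoding pass it, e.g. `getenv(environ_bytes, "NAME", encoding="latin-1")`, and get a hard error on a mismatch instead of a guess; everyone else gets the guess and can still reach the raw bytes.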
-gps