[Python-Dev] Python-3.0, unicode, and os.environ
Toshio Kuratomi a.badger at gmail.com
Fri Dec 5 23:21:50 CET 2008
Victor Stinner wrote:
>>> It would be maybe easier if os.environ supports bytes and unicode keys.
>>> But we have to keep these assertions:
>>>   os.environ[bytes] -> bytes
>>>   os.environ[str] -> str
>> I think the same choices have to be made here. If LANG=C, we still have to decide what to do when os.environ[str] is set to a non-ASCII string.
> If the charset is US-ASCII, os.environ will drop non-ASCII values. But most variables are ASCII only. Examples with my shell:

Yes. But you still have the question of what to do when:

  os.environ[str] = chr(0x10000)

So I don't think it makes things simpler than having separate os.environ and os.environb that update the same data behind the scenes.
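
As a rough illustration of "two views over the same data": later Python releases (3.2 and up) did ship an os.environb on POSIX, with os.environ decoding through the filesystem encoding and the surrogateescape error handler. That design postdates this message, so treat the sketch below as an assumption about a newer interpreter, not as what Python 3.0 does. The helper name bytes_write_text_read is made up for the example.

```python
import os


def bytes_write_text_read(name: bytes, raw: bytes):
    """Write through the bytes view, then read back through the text view.

    Returns None on platforms without os.environb (for example Windows).
    """
    if not os.supports_bytes_environ:
        return None
    os.environb[name] = raw
    try:
        # Same underlying entry, seen as str through os.environ.
        text_value = os.environ[os.fsdecode(name)]
    finally:
        del os.environb[name]
    # os.fsencode reverses the decoding that os.environ applied,
    # so the original bytes can be recovered from the str value.
    return text_value, os.fsencode(text_value)
```

Calling bytes_write_text_read(b"DEMO_VAR", b"caf\xe9") should give back a str plus the original bytes, which is the round trip a shared backing store makes possible.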
>> Additionally, the subprocess question makes using the key value undesirable compared with having a separate os.environb that accesses the same underlying data.
> The user should be able to choose bytes or unicode. Examples:

The subprocess question was posed further up the thread as, basically: does the user need to access os.environb in order to override things in the environment when calling subprocess? I think the answer is yes, since you might want to start with your environment and modify it slightly when you call programs via subprocess. If you just copy os.environ, and os.environ only iterates through the decodable env vars, that doesn't work: the undecodable variables never reach the child. If you have an os.environb to copy, it becomes possible.
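
A minimal sketch of that "start with your environment and modify it slightly" pattern, assuming a newer Python (3.7+) where os.environb and subprocess.run exist. The function name and the child one-liner are invented for the example; on platforms without a bytes environment it falls back to copying os.environ.

```python
import os
import subprocess
import sys


def run_child_with_override(name: str, value: str) -> str:
    """Run a child Python with one variable overridden; return what it saw."""
    if os.supports_bytes_environ:
        # Copy the raw bytes environment so nothing is lost to decoding.
        env = os.environb.copy()
        env[os.fsencode(name)] = os.fsencode(value)
    else:
        # Assumed fallback for platforms with a native Unicode environment.
        env = os.environ.copy()
        env[name] = value
    child_code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", child_code, name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The copy is a plain dict, so the parent's own environment is left untouched.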
> - subprocess.Popen('ls') => use unicode environment (os.environ)
> - subprocess.Popen(b'ls') => use bytes environment (os.environb)

That's... not what I'd expect :-(
If I never touch os.environ and invoke subprocess the normal way, I'd still expect the whole environment to be passed on to the program being called. This is how invoking programs manually, shell scripting, and invoking programs from perl, python2, etc. all work.
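
The expectation can be checked with a small experiment, sketched here under the assumption of a newer Python (3.7+) on a POSIX system with os.environb. The helper name is hypothetical. It plants a value that may not be decodable, launches a child without passing env, and asks the child for the raw bytes.

```python
import os
import subprocess
import sys


def child_sees_raw_value(name: bytes, raw: bytes):
    """Return True if a child started with the default env sees `raw` unchanged.

    Returns None where os.environb is unavailable.
    """
    if not os.supports_bytes_environ:
        return None
    os.environb[name] = raw
    try:
        child_code = (
            "import os, sys; "
            "sys.stdout.buffer.write(os.environb[os.fsencode(sys.argv[1])])"
        )
        # No env= argument: the child inherits the whole process environment.
        out = subprocess.run(
            [sys.executable, "-c", child_code, os.fsdecode(name)],
            capture_output=True,
            check=True,
        ).stdout
    finally:
        del os.environb[name]
    return out == raw
```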
Also, it's not really a good fit with the other things that key off of the initial argument. os.listdir(b'.') changes the output to bytes, whereas subprocess.Popen(b'ls') would change which environment gets passed into the call.
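
For reference, the os.listdir behaviour being compared against looks roughly like this: the type of the argument decides the type of the results. The helper name is made up and the temporary directory is only there to keep the sketch self-contained.

```python
import os
import tempfile


def listdir_both_ways():
    """List the same directory with a str path and with a bytes path."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create one file so both listings have something to show.
        open(os.path.join(tmp, "example.txt"), "w").close()
        as_text = os.listdir(tmp)                 # str in, str out
        as_bytes = os.listdir(os.fsencode(tmp))   # bytes in, bytes out
    return as_text, as_bytes
```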
>> Here's my problem with it, though. With these semantics any program that works on arbitrary files and runs on *NIX has to check os.listdir(b'') and do the conversion manually.
> Only programs that have to support strange environments like yours (mixing Shift-JIS and UTF-8) :-) Most programs don't have to support this charset mixture.

Any program that is intended to be distributed, accesses arbitrary files, and works on *nix platforms needs to take this into account. Just because the environment inside of my organization is sane doesn't mean that, when we release the code to customers, clients, or the free software community, the places it runs will be as strict about these things.
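
"Do the conversion manually" might look something like the sketch below: take the bytes names and sort them into those that decode and those that don't. The function name and the default of UTF-8 are assumptions for the example; a real program would pick the encoding from its own context.

```python
def decode_names(raw_names, encoding="utf-8"):
    """Split bytes filenames into decodable and undecodable ones.

    Returns (decoded, undecodable): a list of str and a list of bytes.
    """
    decoded, undecodable = [], []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Keep the raw bytes so the file can still be opened or reported.
            undecodable.append(raw)
    return decoded, undecodable
```

It would typically be fed with something like decode_names(os.listdir(b'.')), and every distributed program that walks arbitrary directories ends up carrying a helper of this kind.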
Are most programs specific to one organization or are they distributed to other people? I can't answer that... everything I work on (except passwords :-) is distributed, from sysadmin cron jobs to web applications, since I'm lucky that my whole job is devoted to working on free software.
-Toshio