[Python-Dev] Bytes for the command line, process arguments and environment variables
Victor Stinner victor.stinner at haypocalc.com
Sat Jan 3 04:29:11 CET 2009
Hi,
Python 3.0 is released and supports unicode everywhere, great! But as several people have pointed out, bytes are required on non-Windows OSes for backward compatibility. This email is just a summary of the many issues and email threads on the subject.
Problems with Python 3.0:
(1) Invalid unicode string on the command line => some people want to get the command line arguments as bytes, so that Python starts even if undecodable strings are present on the command line => http://bugs.python.org/issue3023
(2) Undecodable environment variables are skipped in os.environ => create os.environb (or anything else) to get these variables as bytes (and to be able to set new variables as bytes) => read the email thread "Python-3.0, unicode, and os.environ" (December 2008) opened by Toshio Kuratomi
(3) Support bytes for os.exec*() and subprocess.Popen(): process arguments and the environment variables => http://bugs.python.org/issue4035: my patch for os.exec*() => http://bugs.python.org/issue4036: my patch for subprocess.Popen()
Command line
I like the current behaviour and I don't want to change it. Feel free to propose a solution to this issue ;-)
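To make the command-line problem concrete, here is a minimal sketch of what strict decoding of the raw arguments does. The helper decode_argv() is hypothetical and only models the failure mode described in issue 3023; it is not how the interpreter itself is implemented, and the UTF-8 default is an assumption.

```python
# Hypothetical helper: models strict decoding of raw command-line bytes.
# It only illustrates the failure mode, not the interpreter's real startup code.
def decode_argv(raw_args, encoding="utf-8"):
    """Decode every raw argument strictly, failing on the first bad one."""
    return [arg.decode(encoding) for arg in raw_args]


# A valid UTF-8 argument decodes without trouble.
good = decode_argv([b"prog", b"caf\xc3\xa9"])

# b"\xff" can never appear in valid UTF-8, so strict decoding raises.
try:
    decode_argv([b"prog", b"\xff"])
    failed = False
except UnicodeDecodeError:
    failed = True
```

Keeping the arguments as bytes avoids the exception, at the price of pushing the decoding decision onto every program that reads them.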
Environment
I already proposed "os.environb", which would have an API similar to "os.environ" but with bytes. Relations between os.environb and os.environ:
- For an undecodable variable value in os.environb, os.environ will raise a KeyError. Example with the UTF-8 charset and os.environb[b'PATH'] = b'\xff': path = os.environ['PATH'] will raise a KeyError, which keeps the current behaviour.
- os.environ raises a UnicodeEncodeError if the key or value cannot be encoded in the current charset. Example with the ASCII charset: os.environ['PATH'] = '/home/hayp\xf4'
- Apart from undecodable variable values in os.environb, os.environ and os.environb will be consistent. Example: deleting a variable in os.environb will also delete the key in os.environ.
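The three rules above can be sketched roughly as follows. This is only a toy model of the proposal, not an implementation for the os module: the class names EnvBytes and EnvText are hypothetical, and the sketch assumes the simplest design, where the text mapping stores nothing itself and is a decoded view over the bytes mapping. Note that in Python a failed str-to-bytes conversion raises UnicodeEncodeError.

```python
from collections.abc import MutableMapping


class EnvBytes(MutableMapping):
    """Hypothetical stand-in for the proposed os.environb (bytes only)."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(key, bytes) or not isinstance(value, bytes):
            raise TypeError("bytes expected")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class EnvText(MutableMapping):
    """Hypothetical stand-in for os.environ: a decoded view over EnvBytes."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw
        self._encoding = encoding

    def __getitem__(self, key):
        try:
            return self._raw[key.encode(self._encoding)].decode(self._encoding)
        except UnicodeError:
            # Undecodable value (or unencodable key): behave as if it is absent.
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        # Strict encoding: an unencodable key or value raises UnicodeEncodeError.
        self._raw[key.encode(self._encoding)] = value.encode(self._encoding)

    def __delitem__(self, key):
        del self._raw[key.encode(self._encoding)]

    def __iter__(self):
        for raw_key, raw_value in self._raw.items():
            try:
                key = raw_key.decode(self._encoding)
                raw_value.decode(self._encoding)
            except UnicodeDecodeError:
                continue  # undecodable entries are hidden from the text view
            yield key

    def __len__(self):
        return sum(1 for _ in self)


environb = EnvBytes({b"HOME": b"/home/haypo", b"PATH": b"\xff"})
environ = EnvText(environb)

home = environ["HOME"]             # decodable value, visible as str
path_visible = "PATH" in environ   # undecodable value, lookup raises KeyError
del environb[b"HOME"]              # deleting on the bytes side...
home_visible = "HOME" in environ   # ...also removes the key from the text view
```

Because the view holds no state of its own, consistency is automatic in this model. A real implementation would also have to deal with putenv() and with code that modifies the process environment behind Python's back, which is where the complexity comes from.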
I think that most of these points (or all points) are ok for everyone (especially ok for Toshio Kuratomi and me :-)).
Now I have to try to write an implementation of this, but it's complex, especially keeping os.environ and os.environb consistent!
Processes
I proposed patches to fix non-Windows OSes, but Antoine Pitrou also wants bytes on Windows. Amaury wrote that it's possible using the ANSI version of the Windows API. I don't know this API, so I cannot contribute on this point.
Rejected idea
Using a private Unicode block causes interoperability problems:
- the block may already be used by other programs/libraries
- 3rd party programs/libraries don't understand this block and may have problems displaying or processing the data
(Is the idea really rejected? At the very least, it has many problems.)
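Here is a small sketch of the first problem. The mapping used here (an undecodable byte 0xNN becomes the code point U+E000 + 0xNN) is an assumption made for illustration, since no specific block or mapping is fixed in this email; decode_with_pua() and encode_with_pua() are hypothetical names.

```python
PUA_BASE = 0xE000  # assumption: start of the BMP Private Use Area


def decode_with_pua(raw, encoding="utf-8"):
    """Decode bytes, mapping each undecodable byte to U+E000 + byte value."""
    out = []
    pos = 0
    while pos < len(raw):
        try:
            out.append(raw[pos:].decode(encoding))
            break
        except UnicodeDecodeError as exc:
            # exc.start is relative to the slice raw[pos:]
            out.append(raw[pos:pos + exc.start].decode(encoding))
            out.append(chr(PUA_BASE + raw[pos + exc.start]))
            pos += exc.start + 1
    return "".join(out)


def encode_with_pua(text, encoding="utf-8"):
    """Reverse the mapping: U+E080..U+E0FF become the raw bytes 0x80..0xFF."""
    out = bytearray()
    for ch in text:
        offset = ord(ch) - PUA_BASE
        if 0x80 <= offset <= 0xFF:
            out.append(offset)
        else:
            out += ch.encode(encoding)
    return bytes(out)


# An undecodable byte survives the round trip.
undecodable = b"a\xffb"
roundtrip_ok = encode_with_pua(decode_with_pua(undecodable)) == undecodable

# But data that already uses U+E0FF legitimately does not: the valid
# UTF-8 sequence for U+E0FF comes back as the single byte 0xFF.
legit = "\ue0ff".encode("utf-8")
collision = encode_with_pua(decode_with_pua(legit))
```

The second case is the "block may already be used" problem: after decoding, a character that was really in the input cannot be told apart from an escaped byte.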
I don't have any new solutions; this is just an email to restart the discussion about bytes ;-) Martin also asked for a PEP to change the posix module API to support bytes.
-- Victor Stinner aka haypo http://www.haypocalc.com/blog/