msg74784 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-10-15 00:45 |
Python3 skips environment variables which cannot be parsed and decoded as unicode strings. But the exec*() functions keep the original environment, so the child process environment differs from the Python environment (os.environ). I propose to remove these variables to avoid strange behaviours, but also to avoid possible security issues. The attached patch implements this idea using a custom implementation of unsetenv(): the argument of _Py_unsetenv() is not the name of the variable but the raw variable including the value (e.g. "a=b"). So it's also possible to drop truncated variables like "a" (no value and no "=" character). This issue also affects Python2, since Python2 also skips variables with no value, but the variables still exist in memory (and so child processes get them). |
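A minimal Python sketch of the filtering rule described above. This is not the attached C patch: the function name `split_environment` and the use of strict UTF-8 as the decoding are assumptions made for illustration. It takes raw entries as bytes (as they would appear in the C `environ` array), keeps those that contain "=" and decode cleanly, and collects the rest as candidates for removal.

```python
def split_environment(raw_entries, encoding="utf-8"):
    """Split raw b"name=value" entries into (kept, dropped).

    kept    -- dict of str -> str for entries that parse and decode
    dropped -- list of raw entries that would be removed from the environment
    """
    kept = {}
    dropped = []
    for entry in raw_entries:
        name, sep, value = entry.partition(b"=")
        if not sep:
            # Truncated entry such as b"a": no "=" character, so no value.
            dropped.append(entry)
            continue
        try:
            kept[name.decode(encoding)] = value.decode(encoding)
        except UnicodeDecodeError:
            # Undecodable bytes: the entry cannot be represented as str.
            dropped.append(entry)
    return kept, dropped


kept, dropped = split_environment([b"a=a", b"b=--\xff--", b"c=c", b"d"])
print(kept)     # {'a': 'a', 'c': 'c'}
print(dropped)  # [b'b=--\xff--', b'd']
```

Passing the whole raw entry around, rather than just the name, is what makes it possible to report an entry like b"d" that has no name/value split at all.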
|
|
msg74785 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-10-15 00:48 |
Note: I don't have Windows at home to test my patch on Windows. At least, the patch works correctly on Ubuntu Gutsy/i386.

Example to demonstrate the issue:

$ env -i a=a b=$(echo -e '--\xff--') c=c ./python -c "import os; os.execvp('/usr/bin/env', ['/usr/bin/env'])"
a=a
b=--�--
c=c

Patched Python:

$ env -i a=a b=$(echo -e '--\xff--') c=c ./python -c "import os; os.execvp('/usr/bin/env', ['/usr/bin/env'])"
a=a
c=c

I tested Python 2.5: b is also removed, but Python 2.6 keeps the variable b. |
|
|
msg74786 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-10-15 01:01 |
The while+strcmp() loop in _Py_unsetenv() is useless since we already have the pointer to the evil variable. The new patch is shorter. |
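The patches themselves are not reproduced in this thread, so the following is only a rough Python model of the simplification being described, with a list standing in for the C `environ` array. Both function names are hypothetical. The first version searches for the entry again by comparing contents; the second relies on the caller already knowing where the entry is.

```python
def remove_by_search(entries, target):
    """Scan the array for an entry equal to `target`, then remove it."""
    i = 0
    while i < len(entries) and entries[i] != target:  # models while + strcmp()
        i += 1
    if i < len(entries):
        del entries[i]


def remove_at(entries, index):
    """Remove the entry at a position the caller already holds."""
    del entries[index]


by_search = [b"a=a", b"b=--\xff--", b"c=c"]
remove_by_search(by_search, b"b=--\xff--")

by_index = [b"a=a", b"b=--\xff--", b"c=c"]
remove_at(by_index, 1)

print(by_search == by_index)  # True
```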
|
|
msg74788 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-10-15 01:05 |
See also issue #4006 which asks the opposite (keep invalid variables, even in os.environ(b)) :-) |
|
|
msg74789 - (view) |
Author: Toshio Kuratomi (a.badger) * |
Date: 2008-10-15 02:00 |
Yep :-) I am against throwing away valid data just because we can't interpret it automatically. Environment variables in Unix hold bytes. Those bytes are usually ASCII characters; however, they do not have to be. This is a case of being on the border between Python and the outside world, so we need to be able to pass in bytes if the user requests it.

Let's say that you have a local directory /home/\xff/username/bin in your PATH environment variable and a command named myapp.sh in there. At the shell you can happily run myapp.sh and it will do its thing. Now you open your Python shell and do: subprocess.call(['myapp.sh']) and it doesn't work. This is non-intuitive behaviour for people who are used to how the shell works. All this patch will do is take away the workaround of subprocess.call(['bash', 'myapp.sh']).

""" I tested Python 2.5: b is also removed, but Python 2.6 keeps the variable b. """

I just tested python-2.5.1 and b is kept, not removed. |
|
|
msg74796 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-10-15 08:24 |
About your subprocess example: we choose to refuse it because we don't mix bytes (your non decodable PATH) and unicode ('myapp.sh'). Using my patch in issue #4036 you will be able to run a program with bytes arguments (including the program name), but there is no way (yet?) in Python3 to read the raw (bytes) environment variable PATH. About Python2: I retested and all versions (2.5.2 and trunk (2.7a0)) keep the variables. But Python2 doesn't convert values to unicode, so it's not the right example. Python2 skips variables with no "=" character, but I don't know how to reproduce this case (maybe using a C program calling execve). |
|
|
msg74805 - (view) |
Author: Toshio Kuratomi (a.badger) * |
Date: 2008-10-15 15:25 |
""" About your subprocess example: we choose to refuse it because we don't mix bytes (your non decodable PATH) and unicode ('myapp.sh') """

If Python3 is doing things right, we shouldn't be mixing bytes and unicode here:
1) The programmer is only sending unicode to subprocess, not a mixture of bytes and unicode.
2) Python should be converting the arguments to subprocess.call() into bytes before combining them with PATH, at least on Unix. The conversion to bytes is something Python has to do at some point before looking on the filesystem for the command, as filenames are a sequence of bytes in Unix.

Note: your patch for #4036 looks like the right thing to do for the args argument, but as you point out, that has no bearing on the environment. |
|
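The argument in the message above, that the command name can be converted to bytes before it is combined with PATH, can be sketched roughly as follows. This is an illustration only, not how subprocess is implemented: `candidate_paths` is a hypothetical helper, `posixpath` is used to fix Unix-style joining, and the PATH value is passed in explicitly as bytes instead of being read from the environment.

```python
import os
import posixpath


def candidate_paths(command, raw_path):
    """Return the byte paths a Unix-style PATH search would try.

    command  -- program name as str, as the programmer passed it
    raw_path -- the PATH value as bytes, possibly not decodable
    """
    # Convert the str argument to bytes first, so everything below is bytes.
    command_bytes = os.fsencode(command)
    return [
        posixpath.join(directory, command_bytes)
        for directory in raw_path.split(b":")
    ]


paths = candidate_paths("myapp.sh", b"/home/\xff/username/bin:/usr/bin")
print(paths)
# [b'/home/\xff/username/bin/myapp.sh', b'/usr/bin/myapp.sh']
```

Nothing in the lookup needs to decode the \xff byte: once the str side has been encoded, the search stays entirely in bytes.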
|
msg87957 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2009-05-17 04:36 |
This patch is out of date with PEP 383 |
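For context, PEP 383 introduced the `surrogateescape` error handler, which lets bytes that cannot be decoded be carried inside a str and encoded back unchanged. A short sketch of that round trip, using the value from the earlier example and assuming UTF-8 as the encoding:

```python
raw = b"--\xff--"

# Strict decoding fails on the invalid byte ...
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ... while surrogateescape maps it to a lone surrogate (U+DCFF) instead.
text = raw.decode("utf-8", "surrogateescape")
round_trip = text.encode("utf-8", "surrogateescape")

print(strict_ok)          # False
print(ascii(text))        # '--\udcff--'
print(round_trip == raw)  # True
```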
|
|