[Python-3000] Unicode and OS strings

Jim Jewett jimjjewett at gmail.com
Fri Sep 21 16:00:38 CEST 2007


On 9/18/07, James Y Knight <foom at fuhm.net> wrote:

> On Sep 18, 2007, at 11:11 AM, Guido van Rossum wrote:

> One of the more common things to do with command line arguments is open them. So, it'd really be nice if:

> python -c 'import sys; open(sys.argv[1])' [some filename]

> would always work, regardless of the current system encoding and what characters make up the filename.

Outside ASCII, if you treat sys.argv as text, that is probably impossible without filesystem support. Before Python even sees the data, the terminal itself is allowed to substitute one canonically equivalent form for another (a precomposed character versus a base letter plus combining mark, say), and those forms have different binary representations.
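
To make the canonical-equivalence problem concrete, here is a minimal sketch using the standard unicodedata module. The file name is made up for illustration, and UTF-8 is assumed as the encoding; the point is only that two spellings which compare equal after normalization are different byte strings.

```python
import unicodedata

# Two canonically equivalent spellings of the same (made-up) file name.
composed = unicodedata.normalize("NFC", "caf\u00e9.txt")    # é as one code point
decomposed = unicodedata.normalize("NFD", "caf\u00e9.txt")  # e + combining acute

# Assumption for this sketch: the file name is encoded as UTF-8.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")

# Equal as text once normalized, but not equal as bytes.
same_text = unicodedata.normalize("NFC", decomposed) == composed
same_bytes = composed_bytes == decomposed_bytes
print(same_text, same_bytes)
```

A filesystem that compares names byte for byte would treat these as two different files, which is why open(sys.argv[1]) can miss.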

It does sound like we need a way to get to the original bytes, similar to sys.stdin.buffer. Is it reasonable to expose sys.argv.buffer? (Since this would be bytes rather than text, I assume this would be a single array, rather than a list of already separated arguments.)
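
For comparison, here is a sketch of a different technique from the sys.argv.buffer idea: the "surrogateescape" error handler that later Python 3 releases provide, which lets text round-trip back to the original bytes. The argument value is hypothetical, and UTF-8 is assumed as the decoding codec.

```python
# Hypothetical raw argument: Latin-1 encoded "café", which is not valid UTF-8.
raw_arg = b"caf\xe9"

# Undecodable bytes become lone surrogates instead of raising an error.
text_arg = raw_arg.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler recovers the original bytes exactly.
recovered = text_arg.encode("utf-8", errors="surrogateescape")
print(recovered == raw_arg)
```

This only preserves bytes that reach Python unchanged; it does nothing about a terminal that has already swapped one canonical form for another.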

Similarly, could os.environ have a bytes mirror, where the keys and values are (immutable) bytes?
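
As a rough illustration of what such a mirror might look like, here is a sketch built on os.fsencode and types.MappingProxyType, both of which come from later Python 3 releases. The helper name bytes_environ and the sample mapping are hypothetical, and this produces a read-only snapshot rather than a live view of os.environ.

```python
import os
from types import MappingProxyType


def bytes_environ(environ=os.environ):
    """Return a read-only bytes snapshot of a str-keyed environment mapping.

    Hypothetical helper for illustration; not an existing os API.
    """
    return MappingProxyType(
        {os.fsencode(key): os.fsencode(value) for key, value in environ.items()}
    )


# Sample mapping (made up) so the sketch does not depend on the real environment.
env_b = bytes_environ({"HOME": "/home/jj"})
print(env_b[b"HOME"])
```

Keys and values are bytes objects, so they are immutable, and the proxy rejects item assignment with a TypeError.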

-jJ


