[Python-3000] Unicode and OS strings
"Martin v. Löwis" martin at v.loewis.de
Fri Sep 28 23:00:29 CEST 2007
msvcrt ships with the operating system - I'd call that a conforming implementation.
Yes, but it's not part of the operating system interface; Microsoft documents it as "for future use only by system-level components".
I still regard handling argv as anything other than the raw bytes that come from the host as bad.
The point is that you cannot use "raw bytes" in Win32 without potential loss of data. If you pass arbitrary bytes to os.spawn*, they get converted to Unicode, and the resulting Unicode command line is passed to the child process. So the native API is Unicode, not arbitrary bytes. The C library also supports _wmain, if you want the command line broken down into arguments but without character set conversions.
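As a rough illustration of the data-loss point, here is a minimal Python sketch. It does not call any Win32 API; the helper name survives_text_conversion is hypothetical, and UTF-8 merely stands in for whatever encoding the platform would apply. It only shows that arbitrary bytes need not survive a bytes -> text -> bytes round trip.

```python
def survives_text_conversion(raw: bytes, encoding: str) -> bool:
    """Return True if raw survives a bytes -> str -> bytes round trip.

    Hypothetical helper for illustration only; the encoding is an
    assumption, not the conversion Win32 actually performs.
    """
    # Undecodable bytes are replaced with U+FFFD, which is where data is lost.
    text = raw.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace") == raw


valid = "café".encode("utf-8")   # well-formed text in the chosen encoding
arbitrary = b"caf\xe9"           # a lone 0xE9 byte is not valid UTF-8

print(survives_text_conversion(valid, "utf-8"))
print(survives_text_conversion(arbitrary, "utf-8"))
```

Bytes that happen to be valid text in the assumed encoding come back unchanged; anything else is altered by the conversion, so the original bytes cannot be recovered on the other side.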
If we're going to call something sys.argv, then presumably that was done because there was a conventionally accepted meaning to it, and I would argue that meaning comes from standard C.
Yes, but in C, too, the meaning is "characters", not "bytes". ISO C 99 5.1.2.2.1p2 specifies that they are strings passed by the host environment, and elaborates that if the host environment is not capable of supplying mixed-case strings, it should convert them all into lower case. So the intention clearly is that argv[] is text, not bytes.
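The lower-case provision only makes sense for text, which a small Python sketch can show. The sample string and the choice of UTF-8 are assumptions made for illustration: case conversion on a str handles non-ASCII letters, while bytes.lower() only touches ASCII bytes and leaves everything else alone.

```python
# Case conversion is an operation on characters, not on raw bytes.
arg_text = "ÉCOLE"                    # sample argument, chosen for illustration
arg_bytes = arg_text.encode("utf-8")  # the same argument as encoded bytes

lowered_text = arg_text.lower()       # lowers every letter, including É
lowered_bytes = arg_bytes.lower()     # lowers only the ASCII bytes C, O, L, E

# The two results disagree: the bytes version still contains the encoded É.
print(lowered_text.encode("utf-8") == lowered_bytes)
print(lowered_bytes.decode("utf-8"))
```

Lowering the bytes yields a string that still begins with an upper-case É, so a requirement to "convert to lower case" cannot be met without knowing which characters the bytes represent.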
Regards, Martin