[Python-Dev] Python-3.0, unicode, and os.environ
Steve Holden steve at holdenweb.com
Thu Dec 11 18:46:57 CET 2008
- Previous message: [Python-Dev] Python-3.0, unicode, and os.environ
- Next message: [Python-Dev] Python-3.0, unicode, and os.environ
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Ulrich Eckhardt wrote:
> On Thursday 11 December 2008, Steve Holden wrote:
>> Ulrich Eckhardt wrote:
>>> What I'd just like some feedback on is the approach to return a distinct type (neither a byte string nor a Unicode string) from readdir(). In order to use this, a programmer will have to convert it explicitly, otherwise e.g. printing it will just produce <envstring at 0x01234567>. This will immediately bump each programmer with their heads on the issue of unknown encodings and they will have to make the application-specific choice whether an approximation of the filename, an exception or ignoring the file is the right choice. Also, it presents the options for doing this conversion in a single class, which I personally find much better than providing overloads for hundreds of functions.
>> [...]
>>
>> Seems to me this just threatens to add to the confusion.
>>
>> If you know what your filesystem produces, you can take the appropriate action to convert it into a type that makes sense to the user. If you don't, then at least if you have the string in its bytes form you can
> ^^^^^^^^^^^^^^^^^^^
> There are operating systems that don't use bytes to represent a file path, namely all the MS Windows variants. Even worse, when you use a byte string there, it typically means that you want to use the obsolete encoding that is based on codepages.
>
> Why can we not preserve the representation of a path as it is? Why do we have to convert it to anything at all, without even knowing if this conversion is needed? I just want to do something to a file's content, why does its path have to be converted to something and then be converted back in order for the system to digest it?

You don't: that was my point. You only need to perform any kind of conversion when the filename has to be presented to something other than the file system.

>> re-present it to the filesystem to manipulate the file. What are we supposed to do with the "special type"?
>
> You receive it from readdir() and pass it to stat(), simple as that. No conversions from the native representation needed. If you need a textual representation, then you have to convert it and you have to do so explicitly according to whatever logic your application requires.

Exactly.
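
A minimal sketch of the flow agreed on above, written against today's Python `os` module rather than the proposed special type: the name comes back from the directory listing, goes straight to `os.stat()` untouched, and is decoded only at the point of display. Using `bytes` as the "native representation" is an assumption that fits POSIX systems only (as noted above, Windows paths are not bytes), and the helper names `list_sizes` and `display_name`, as well as the UTF-8 with replacement policy, are hypothetical choices for illustration.

```python
import os
import tempfile


def list_sizes(directory: bytes) -> dict:
    """Map each raw entry name to its size without ever decoding the name."""
    sizes = {}
    for name in os.listdir(directory):        # bytes in, bytes out
        full = os.path.join(directory, name)  # still bytes
        sizes[name] = os.stat(full).st_size   # handed back to the OS unchanged
    return sizes


def display_name(name: bytes) -> str:
    """Explicit, application-chosen conversion used for presentation only."""
    # Assumption: approximate undecodable bytes instead of raising.
    return name.decode("utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "report.txt"), "wb") as fh:
        fh.write(b"hello")
    sizes = list_sizes(os.fsencode(tmp))
    for raw, size in sizes.items():
        print(display_name(raw), size)
```

The only place an encoding is named is `display_name()`, so a wrong guess can garble what the user sees but cannot stop the program from reaching the file.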

> If readdir() returned Unicode text, people would start taking that for granted. If it returned bytes, just the same. Returning a completely unrelated type will give them enough hint that for this thing they have to rethink their assumptions. This runs along the lines of "In the face of ambiguity, refuse the temptation to guess.", as it makes guessing rather impossible.

So you are suggesting this "special object" be used only to represent files to users? Now I understand.
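
To make the idea of a "completely unrelated type" concrete, here is a rough sketch of what such a class might look like. Nothing in it comes from an actual proposal or from the standard library: the class name `NativeName` and its three conversion methods are hypothetical, chosen to mirror the three choices named earlier in the thread (an approximation, an exception, or ignoring the file). It again assumes a bytes-based native form, and it relies on the `__fspath__` protocol of current Python versions so that `os` functions accept the object without decoding it.

```python
import os


class NativeName:
    """Hypothetical opaque wrapper around a name exactly as the OS reported it."""

    def __init__(self, raw: bytes):
        self._raw = raw

    def __fspath__(self) -> bytes:
        # Lets os.stat(), open() and friends take the object unchanged.
        return self._raw

    def __repr__(self) -> str:
        # Deliberately unhelpful: printing reveals nothing about the name,
        # so the programmer has to pick one of the conversions below.
        return "<NativeName at 0x%x>" % id(self)

    def to_text_strict(self, encoding: str = "utf-8") -> str:
        """Raise UnicodeDecodeError if the name is not valid in `encoding`."""
        return self._raw.decode(encoding)

    def to_text_approx(self, encoding: str = "utf-8") -> str:
        """Return an approximation, replacing undecodable bytes."""
        return self._raw.decode(encoding, errors="replace")

    def to_text_or_none(self, encoding: str = "utf-8"):
        """Return None so the caller can skip names that do not decode."""
        try:
            return self._raw.decode(encoding)
        except UnicodeDecodeError:
            return None


name = NativeName(b"caf\xe9")     # Latin-1 bytes, not valid UTF-8
print(name)                       # opaque representation only
print(name.to_text_approx())      # approximation with a replacement character
print(name.to_text_or_none())     # None, so the caller may ignore the file
```

Because `__str__` is not defined, `print()` falls back to the opaque `__repr__`, which is the behaviour described for the proposed type; whether that is preferable to raising an error is a design choice this sketch leaves open.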

> I just don't see a case where using a separate path class would break things. Further, the special handling that is required would be made even clearer by using such a class.

But it does have to be implemented ...

regards
 Steve

Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/