[Python-Dev] os.path.normcase rationale?
Guido van Rossum guido at python.org
Sat Sep 25 16:45:38 CEST 2010
On Fri, Sep 24, 2010 at 8:25 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sat, 25 Sep 2010 09:22:47 am Guido van Rossum wrote:
>> I think that, like os.path.realpath(), it should not fail if the file does not exist.
>>
>> Maybe the API could be called os.path.unnormpath(), since it is in a sense the opposite of normpath() (which removes case)? But I would want to write it so that even on Unix it scans the filesystem, in case the filesystem is case-preserving (like the default fs on OS X).
>
> It is not entirely clear to me what this function is meant to actually do? Should it:
>
> 1. Return the case of a filename in some canonical form which depends on the file system?
> 2. Return the case of a filename as it is actually stored on disk?
This one. This is actually useful (on case-preserving filesystems). There is no doubt in my mind that this is the requested and needed functionality.
> 3. Something else?
>
> and just for completeness:
>
> 4. Return the case of a filename in some arbitrarily-chosen canonical form which does not depend on the file system?
>
> These are not the same, either conceptually or in practice. If you want #4, you already have it in os.path.normcase. I think that the OP, Chris, wants #1, but it isn't entirely clear to me.
I don't think this is where the issue lies.
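For reference, option #4 above (a canonical form that ignores the file system) can be seen without touching the disk. The snippet below is only an illustration: it calls the `ntpath` and `posixpath` implementations directly so the result does not depend on the platform it runs on, and the sample path is made up.

```python
import ntpath
import posixpath

# A made-up path, used purely for illustration.
name = "C:/Users/Chris/ReadMe.TXT"

# The Windows flavour lowercases and converts forward slashes to backslashes.
windows_form = ntpath.normcase(name)

# The POSIX flavour returns the string unchanged.
posix_form = posixpath.normcase(name)

print(windows_form)
print(posix_form)
```

Neither call looks at the file system, which is why this cannot answer the question of how a name is actually stored on disk.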
> It's possible that he wants #2.
>
> Various people have posted links to recipes that solve case #2. Note though that this necessarily demands that if the file doesn't exist, it should raise an exception.
No it needn't; realpath() uses the filesystem but leaves non-existing parts alone. Also some of the path may exist (e.g. a parent directory).
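A small sketch of the realpath() behaviour referred to here, assuming a temporary directory as a stand-in for "a parent directory that exists" and made-up names for the missing parts:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Resolve the existing parent first so any symlinks in it are expanded.
    base = os.path.realpath(tmp)

    # "no_such_dir" and "File.TXT" are invented names that do not exist.
    missing = os.path.join(base, "no_such_dir", "File.TXT")

    # realpath() consults the filesystem for the parts that exist and
    # passes the non-existing tail through without raising.
    resolved = os.path.realpath(missing)

print(resolved == missing)
```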
> In the case of #1, if the file system doesn't exist, we can't predict what the canonical form should be.
>
> The very concept of canonical form for file names is troublesome. If the file system is case-preserving, the file system doesn't define a canonical form: the case of the file name will depend on how the file is initially named. If the file system is case-destructive the behaviour will depend on the file system itself: e.g. FAT12 and ISO 9660 both uppercase file names, but other file systems may make other choices.
>
> For some arbitrary path, where we don't know what file system it is, or if the path doesn't actually exist, we have no way of telling what the file system's canonical form will be, or even whether it will have one.
>
> Note that I've been talking about case preservation, not case sensitivity. That's because case preservation is orthogonal to sensitivity. You can see three of the four combinations, e.g.:
>
> Preserving + insensitive: fat32, NTFS under Win32, normally HFS+
> Preserving + sensitive: ext3, NTFS under POSIX, optionally HFS+
> Destructive + insensitive: fat12, fat16 without long file name support
>
> To the best of my knowledge, destructive + sensitive doesn't exist. It could, in principle, but it would be silly to do so.
>
> Note that just knowing the file system type is not enough to tell what its behaviour will be. Given an arbitrary file system, there's no obvious way to determine what it will do to file names short of trying to create a file and see what happens.
This operation should not do any writes.
The solution may well be OS specific. Solutions for Windows and OS X have already been pointed out. If it can't be done for other Unix versions, I think returning the input unchanged on those platforms is a fine fallback (as it is for non-existent filenames).
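To make the requested behaviour concrete, here is a rough, read-only sketch. It is not one of the recipes mentioned in this thread and not a proposed stdlib implementation; the name `actual_case` is hypothetical. It only reads directory listings, leaves non-existing parts alone, and falls back to the input spelling whenever it cannot find a unique match.

```python
import os

def actual_case(path):
    """Hypothetical helper: spell each existing component of `path`
    the way the directory listing reports it. Performs no writes."""
    drive, tail = os.path.splitdrive(os.path.abspath(path))
    result = drive + os.sep
    parts = [p for p in tail.split(os.sep) if p]
    for i, part in enumerate(parts):
        if not os.path.lexists(os.path.join(result, part)):
            # The file system does not resolve this spelling, so this
            # part and everything after it is returned unchanged.
            return os.path.join(result, *parts[i:])
        try:
            entries = os.listdir(result)
        except OSError:
            entries = []  # unreadable directory: keep the caller's spelling
        if part not in entries:
            # The name exists but is listed under a different spelling,
            # which suggests a case-insensitive, case-preserving file system.
            matches = [e for e in entries if e.casefold() == part.casefold()]
            if len(matches) == 1:
                part = matches[0]
        result = os.path.join(result, part)
    return result
```

On a case-sensitive file system the `lexists()` check means a differently-cased name is treated as non-existent and returned as given, which matches the fallback described above. The sketch ignores Unicode normalization differences and symlinks; a real implementation would likely use platform APIs instead.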
--
--Guido van Rossum (python.org/~guido)