[Python-Dev] os.path.normcase rationale?
Steven D'Aprano steve at pearwood.info
Tue Oct 5 13:04:39 CEST 2010
On Tue, 5 Oct 2010 07:21:15 pm Chris Withers wrote:
> On 25/09/2010 04:25, Steven D'Aprano wrote:
> > 1. Return the case of a filename in some canonical form which
> >    depends on the file system?
> > 2. Return the case of a filename as it is actually stored on disk?
>
> How do 1 and 2 differ?
Case #1 imposes a particular canonical form, regardless of what is actually stored on disk. It is similar to normpath, except that we could have different canonical forms depending on what the file system was. normpath merely generalises from the operating system, and never looks at the file system.
Some file systems are case-preserving, and don't have a canonical form. We might choose to arbitrarily impose one, as normcase already does. Some are case-folding, in which case it might be sensible to choose the same canonical form as the file system actually uses. However, this may be implementation-dependent: for example, under FAT12 or FAT16, the file system will take a file name like pArRoT.tXt and fold it to PARROT.TXT, or possibly parrot.txt, or Parrot.txt. Even if that's not the case for FAT12, it may be the case for other case-folding file systems. And the behaviour of FAT16 will differ according to whether or not it has been built with support for long file names.
Case #2 says to actually look at the file and see what the file system considers its name to be. Consider an NTFS file system. By default it is case-preserving and case-insensitive, although that can be changed. (Just because a file system is NTFS doesn't mean that it will be case-insensitive. NTFS can also run in a POSIX mode which is case-sensitive. But I digress.)
For simplicity, suppose you're on Windows using NTFS with the standard non-POSIX behaviour. You create a file named pArRoT.tXt. This will be stored on disk using the exact characters that you typed. The file system does no case-folding and merely uses whatever characters are fed to it, which in the case of Windows apps is likely to be whatever characters the user types. In this case, we don't try to impose a particular case on file names, but return whatever actually exists on disk.
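A rough sketch of what case #2 could look like in Python: list the directory and return the entry as the file system reports it. The helper name stored_name is hypothetical, and str.casefold() is only an approximation of whatever comparison rule a given file system really uses.

```python
import os
import tempfile


def stored_name(directory, name):
    """Return the entry in `directory` matching `name`, spelled as stored on disk.

    An exact match wins; otherwise the first case-insensitive match is used.
    Returns None if nothing matches. (Hypothetical helper, for illustration.)
    """
    entries = os.listdir(directory)
    if name in entries:
        return name
    wanted = name.casefold()  # approximation of the file system's own folding
    for entry in entries:
        if entry.casefold() == wanted:
            return entry
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Create the file with mixed case, then ask for it in upper case.
        open(os.path.join(d, "pArRoT.tXt"), "w").close()
        print(stored_name(d, "PARROT.TXT"))
```

On a case-preserving file system this should print the name exactly as it was created; on a case-folding one it would print whatever the file system folded it to. Note that it has to touch the disk, which normcase never does.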
> FWIW, the use case that setuptools has (and for which it currently
> incorrectly uses normpath) is number 2.
>
> > 4. Return the case of a filename in some arbitrarily-chosen
> >    canonical form which does not depend on the file system?
>
> This is what normpath does, but only if you're on Windows ;-)
Not quite. macpath.normcase() also lowercases the path. So does the module for OS/2.
In any case, Windows is not a file system. It is quite possible to have virtually any combination of case-destroying, case-preserving, -sensitive and -insensitive file systems on the one Windows system. Say, a FAT12 floppy, an NTFS partition, and an ext2 USB stick. Windows doesn't ship with native support for ext2, but that doesn't mean it can't be installed with third party drivers.
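Since case behaviour belongs to the file system rather than the operating system, one way to find out is to probe the directory in question. A minimal sketch, assuming the directory is writable; is_case_sensitive is a hypothetical helper, not anything in os.path:

```python
import os
import tempfile


def is_case_sensitive(directory):
    """Probe whether `directory` is on a case-sensitive file system.

    Creates a temporary file with lower-case letters in its name, then checks
    whether the upper-cased spelling of that name also resolves to a file.
    (Hypothetical helper; requires write access to `directory`.)
    """
    with tempfile.NamedTemporaryFile(prefix="casetest", dir=directory) as f:
        head, tail = os.path.split(f.name)
        # If the upper-case name exists too, lookups here ignore case.
        return not os.path.exists(os.path.join(head, tail.upper()))


if __name__ == "__main__":
    print(is_case_sensitive(tempfile.gettempdir()))
```

The answer can differ from one directory to the next on the same machine, which is exactly the floppy / NTFS partition / USB stick situation described above.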
normcase pays no attention to any of this, and just lowercases the path. At least that's cheap, and consistent, even if it solves the wrong problem :)
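To see that the lowercasing is a pure string operation, it is enough to feed it a path that does not exist. The sketch below uses ntpath and posixpath, the standard library modules behind os.path on Windows and POSIX respectively, so it can be run on either platform; the path itself is made up.

```python
import ntpath
import posixpath

# A made-up path; nothing here needs to exist on disk.
name = "C:/No/Such/Dir/pArRoT.tXt"

windows_form = ntpath.normcase(name)   # lowercases, and turns / into \
posix_form = posixpath.normcase(name)  # returns the path unchanged

print(windows_form)
print(posix_form)
```

The Windows flavour gives the same answer whatever file system the path would live on, and the POSIX flavour leaves the string alone.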
-- Steven D'Aprano