[Python-Dev] Re: os.path.commonprefix breakage

Skip Montanaro skip@mojam.com (Skip Montanaro)
Wed, 16 Aug 2000 23:41:59 -0500 (CDT)


Fred> I'd guess that the path separator should only be appended if it's
Fred> part of the passed-in strings; that would make it a legitimate
Fred> part of the prefix.  If it isn't present for all of them, it
Fred> shouldn't be part of the result:

>>> os.path.commonprefix(["foo", "foo/bar"])
'foo'

Hmmm... I think you're looking at it character-by-character again. I see three possibilities:

* it's invalid to have a path with a trailing separator

* it's okay to have a path with a trailing separator

* it's required to have a path with a trailing separator

In the first and third cases, you have no choice. In the second you have to decide which would be best.
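To make the character-by-character point concrete, here is a small sketch against a present-day standard library rather than the interpreter under discussion in this thread. It assumes Python 3.5 or later, where os.path.commonprefix compares strings one character at a time and os.path.commonpath compares whole path components; posixpath is used so the results do not depend on the platform.

```python
import posixpath  # POSIX flavour of os.path, so results match on any platform

paths = ["/home/swen", "/home/swanson"]

# Character-wise comparison: the result can stop in the middle of a component.
char_prefix = posixpath.commonprefix(paths)

# Component-wise comparison: only whole directory names are considered.
comp_prefix = posixpath.commonpath(paths)

# Fred's example: "foo" has no trailing separator, so none appears in the result.
fred_prefix = posixpath.commonprefix(["foo", "foo/bar"])

print(char_prefix, comp_prefix, fred_prefix)
```

Neither result carries a trailing separator here, which is the case Fred describes.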

On Unix my preference would be to not include the trailing "/" for aesthetic reasons. The shell's pwd command, the os.getcwd function and the os.path.normpath function all return directories without the trailing slash. Also, while Python may not have this problem (and os.path.join seems to normalize things), some external tools will interpret doubled "/" characters as a single separator, while others (most notably Emacs) will treat the second slash as "erase the prefix and start from /".
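A quick way to check the normalization claims above, sketched with posixpath so it behaves the same on any platform. The directory names are only illustrative.

```python
import posixpath  # POSIX flavour of os.path

# normpath drops a trailing separator and collapses a doubled one.
no_trailing = posixpath.normpath("/home/skip/")
collapsed = posixpath.normpath("/home//skip")

# join inserts exactly one separator whether or not the first part ends with one.
joined_plain = posixpath.join("/home", "skip")
joined_slash = posixpath.join("/home/", "skip")

print(no_trailing, collapsed, joined_plain, joined_slash)
```

All four values come out as the same path, so code that builds paths with join never needs the prefix to end in a slash.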

In fact, the more I think about it, the more I think that Mark's reliance on the trailing slash is a bug waiting to happen (indeed, it just happened ;-). There's certainly nothing wrong (on Unix anyway) with paths that don't contain a trailing slash, so if you're going to join paths together, you ought to be using os.path.join. To whack off prefixes, perhaps we need something more general than os.path.split, so instead of

prefix = os.path.commonprefix(files)
for file in files:
   tail_portion = file[len(prefix):]

Mark would have used

prefix = os.path.commonprefix(files)
for file in files:
   tail_portion = os.path.splitprefix(prefix, file)[1]

The assumption being that

os.path.splitprefix("/home", "/home/beluga/skip")

would return

["/home", "beluga/skip"]
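One possible reading of the proposed function, as a minimal sketch. splitprefix is hypothetical: it is not part of os.path, and the choices below (a "/" separator by default, ignoring a trailing separator on the prefix, raising ValueError when the prefix does not end on a component boundary) are assumptions made for illustration.

```python
def splitprefix(prefix, path, sep="/"):
    """Hypothetical helper: return [prefix, remainder_of_path].

    The prefix must match whole leading components of path.
    """
    if prefix == "":
        return ["", path]
    # Ignore a trailing separator on the prefix; the root "/" stays "/".
    base = prefix.rstrip(sep) or sep
    if path == base:
        return [base, ""]
    # The prefix has to be followed by a separator in path.
    head = base if base.endswith(sep) else base + sep
    if not path.startswith(head):
        raise ValueError("%r is not a leading part of %r" % (prefix, path))
    return [base, path[len(head):]]


print(splitprefix("/home", "/home/beluga/skip"))
```

With this reading, "/home" and "/home/" give the same answer, and a partial component such as "/ho" is rejected instead of producing "me/beluga/skip".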

Alternatively, how about os.path.suffixes? It would work similarly to os.path.commonprefix, but instead of returning the common prefix of a group of files, it would return a list of the suffixes that remain after the common prefix is stripped from each one:

>>> files = ["/home/swen", "/home/swanson", "/home/jules"]
>>> prefix = os.path.commonprefix(files)
>>> print prefix
"/home"
>>> suffixes = os.path.suffixes(prefix, files)
>>> print suffixes
["swen", "swanson", "jules"]
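A rough sketch of how such a function might look. suffixes is hypothetical and does not exist in os.path; the "/" separator and the ValueError for names outside the prefix are assumptions. The prefix is computed with posixpath.commonpath (available from Python 3.5), which compares whole components, so it stands in for the component-wise behaviour assumed in the session above.

```python
import posixpath  # POSIX flavour of os.path


def suffixes(prefix, files, sep="/"):
    """Hypothetical helper: strip prefix (plus one separator) from each name."""
    # An empty prefix strips nothing; otherwise use exactly one separator.
    head = prefix.rstrip(sep) + sep if prefix else ""
    result = []
    for name in files:
        if not name.startswith(head):
            raise ValueError("%r does not start with %r" % (name, prefix))
        result.append(name[len(head):])
    return result


files = ["/home/swen", "/home/swanson", "/home/jules"]
prefix = posixpath.commonpath(files)
tails = suffixes(prefix, files)
print(prefix, tails)
```

Because the helper adds the separator itself, callers get the same tails whether the prefix is written "/home" or "/home/".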

Skip