[Python-Dev] A wart which should have been repaired in 3.0?

Steven D'Aprano steve at pearwood.info
Sat Dec 27 07:37:20 CET 2008


On Sat, 27 Dec 2008 10:58:07 am Nick Coghlan wrote:

skip at pobox.com wrote:
> The doc for os.path.commonprefix states:
>
> Return the longest path prefix (taken character-by-character)
> that is a prefix of all paths in list. If list is empty, return the
> empty string (''). Note that this may return invalid paths because
> it works a character at a time.
>
> I remember encountering this in an earlier version of Python 2.x
> (maybe 2.2 or 2.3?) and "fixed" it to work by pathname components
> instead of by characters. That had to be reverted because it was a
> behavior change and broke code which used it for strings which
> didn't represent paths. After the reversion I then forgot about
> it.
>
> I just stumbled upon it again. It seems to me this would have been
> a good thing to fix in 3.0. Is this something which could change
> in 3.1 (or be deprecated in 3.1 with deletion in 3.2)?

Why can't we add an "allowfragment" keyword that defaults to True? Then "allowfragment=False" will stop at the last full directory name and ignore any partial matches on the filenames or the next subdirectory (depending on where the common prefix ends).
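For concreteness, here is a minimal sketch of how such a keyword could behave. It wraps the existing os.path.commonprefix; the name commonprefix_sketch and the sep parameter are placeholders invented for illustration, not anything in the standard library, and the handling of trailing separators is just one possible choice.

```python
import os.path


def commonprefix_sketch(paths, allowfragment=True, sep="/"):
    """Hypothetical commonprefix variant with an allowfragment switch.

    With allowfragment=True this is plain os.path.commonprefix.
    With allowfragment=False the result is cut back so that it never
    ends in the middle of a path component.
    """
    prefix = os.path.commonprefix(paths)
    if allowfragment or not prefix:
        return prefix
    # The prefix already ends on a component boundary if every path
    # either stops there or continues with a separator.
    n = len(prefix)
    if all(len(p) == n or p[n] == sep for p in paths):
        return prefix
    # Otherwise drop the partial component after the last separator.
    cut = prefix.rfind(sep)
    return prefix[:cut + 1] if cut >= 0 else ""


paths = ["/usr/lib", "/usr/local"]
print(commonprefix_sketch(paths))                       # character-wise
print(commonprefix_sketch(paths, allowfragment=False))  # whole components only
```

With the default the two paths share "/usr/l", which is not a real directory; with allowfragment=False the sketch stops at "/usr/".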

For what it's worth, I think that the two pieces of functionality are different enough that in an ideal world they should be two different functions rather than one function with a switch. I think os.path.commonprefix should only operate on path components, and if character-by-character prefix matching on general strings is useful, then it should be a string method.
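To make the distinction concrete, here is a rough sketch of the two behaviours as separate functions. The names common_components and common_chars, and the sep parameter, are made up for this example; they are not existing stdlib functions or string methods, and the sketch ignores platform details such as drive letters and normalisation.

```python
from itertools import takewhile


def _all_equal(items):
    """True if every element of the tuple is the same."""
    return len(set(items)) == 1


def common_components(paths, sep="/"):
    """Longest run of leading path components shared by all paths."""
    if not paths:
        return ""
    split = [p.split(sep) for p in paths]
    # zip(*split) walks the paths one component at a time.
    shared = takewhile(_all_equal, zip(*split))
    return sep.join(parts[0] for parts in shared)


def common_chars(strings):
    """Longest character-by-character prefix shared by all strings."""
    if not strings:
        return ""
    # zip(*strings) walks the strings one character at a time.
    shared = takewhile(_all_equal, zip(*strings))
    return "".join(chars[0] for chars in shared)


paths = ["/usr/lib", "/usr/local"]
print(common_components(paths))  # stops at a directory boundary
print(common_chars(paths))       # may stop in the middle of a name
```

The two functions share almost all of their logic but answer different questions, which is why a single function with a switch feels like the wrong shape to me.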

-- Steven D'Aprano


