[Python-Dev] Proposal to revert r54204 (splitext change)

Phillip J. Eby pje at telecommunity.com
Tue Mar 20 19:24:12 CET 2007


At 04:47 PM 3/20/2007 +0100, Ronald Oussoren wrote:

> On 20 Mar, 2007, at 15:54, Phillip J. Eby wrote:
>
>> At 09:24 AM 3/20/2007 +0100, Ronald Oussoren wrote:
>>> I don't agree. "allext=True" won't do the right thing in a significant subset of filenames

>> Yes, that's understood. The problem is that splitext() in general "won't do the right thing", for many definitions of "the right thing", unless you're applying it to a fairly constrained range of filenames, or unless you add other code. This won't change, unless we get rid of splitext() altogether.
>
> I know that; I actually read most of the messages in this thread. The reason I'm pointing this out for the 'allext=True' case is that adding this flag could give naive users even more reason to believe that splitext will magically do the right thing.

Well, that's where we need to shore up the documentation, which needs to point out the folly of expecting DWIM. We should give some examples of where splitext() will not DWIM.
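
A few cases of the kind the documentation could list, as a minimal sketch. The filenames are made up for illustration, and the results shown are what os.path.splitext() returns in current CPython releases, where a leading dot on the basename is ignored:

```python
import os.path

# Hypothetical filenames chosen to show where splitext() may surprise.
names = ["archive.tar.gz", "product-1.2.3", ".profile", "README"]

# Map each name to the (root, ext) pair that splitext() returns.
results = {name: os.path.splitext(name) for name in names}

for name, (root, ext) in results.items():
    print(f"{name!r}: root={root!r}, ext={ext!r}")

# 'archive.tar.gz': only the last suffix is split off  -> ('archive.tar', '.gz')
# 'product-1.2.3':  a version component looks like one -> ('product-1.2', '.3')
# '.profile':       the leading dot is not an extension -> ('.profile', '')
# 'README':         no dot, so no extension             -> ('README', '')
```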

>> If you're trying to match an archive extension, for example, you'll probably need to loop on repeated splitext() calls until you find an extension that matches. One benefit of using both the new keyword arguments together is that it allows you to make your loop proceed from longest match to shortest, so that if you are matching product-X.Y.Z.tar.gz, you're going to go through matching .Y.Z.tar.gz, then .Z.tar.gz, then .tar.gz.
>
> I don't know if this is worth the additional API complexity, especially given the inherent problems of a splitext function.
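
The longest-to-shortest matching loop described above can be sketched with the existing one-argument splitext() alone. This is only an illustration under stated assumptions: the helper names candidate_extensions and match_extension and the set of known extensions are hypothetical, and the proposed keyword arguments are not used:

```python
import os.path

def candidate_extensions(filename):
    """Yield compound extensions of filename, longest first (hypothetical helper)."""
    root = os.path.basename(filename)
    parts = []
    # Peel off one suffix at a time with repeated splitext() calls.
    while True:
        root, ext = os.path.splitext(root)
        if not ext:
            break
        parts.insert(0, ext)
    # Join the suffixes back together, dropping the leftmost one each round.
    for i in range(len(parts)):
        yield "".join(parts[i:])

def match_extension(filename, known):
    """Return the longest compound extension found in `known`, or None."""
    for ext in candidate_extensions(filename):
        if ext in known:
            return ext
    return None

# Assumed set of archive extensions, for illustration only.
KNOWN = {".tar.gz", ".gz", ".zip"}
print(list(candidate_extensions("product-X.Y.Z.tar.gz")))
print(match_extension("product-X.Y.Z.tar.gz", KNOWN))
```

Because candidates are tried from longest to shortest, ".tar.gz" is found before the shorter ".gz" gets a chance to match.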

>>> The ignoreleadingdot argument also doesn't buy you anything that can't trivially be implemented in other ways.
>>
>> I don't understand. Example?
>
> You conveniently ignored my other arguments ;-).
>
> Given a splitext that ignores leading dots, the following function doesn't:
>
>     # from os.path import *
>     def splitext2(path):
>         dn = dirname(path)
>         bn, ext = splitext(basename(path))
>         if bn.startswith('.') and ext == '':
>             return dn, bn + ext
>         else:
>             return join(dn, bn), ext
>
> I'd say that's a trivial function.
>
> What I don't understand is why 'ignoreleadingdot==False' is considered to be a valid use case at all, except for the fact that os.path.splitext did this until py2.5. I'm definitely in the camp that considers '.profile' not to have an extension.
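
For reference, a self-contained variant of the quoted wrapper, as a sketch only. It assumes a splitext() that ignores leading dots (which is how current CPython releases behave), and the name splitext_dotfile_as_ext is hypothetical:

```python
import os.path

def splitext_dotfile_as_ext(path):
    """Split path so that a bare dotfile name counts as an extension.

    Hypothetical helper mirroring the quoted splitext2(); it relies on
    os.path.splitext() leaving names such as '.profile' unsplit.
    """
    dn = os.path.dirname(path)
    bn, ext = os.path.splitext(os.path.basename(path))
    if bn.startswith(".") and ext == "":
        # The whole basename is a dotfile: report it as the extension.
        return dn, bn
    return os.path.join(dn, bn), ext

print(splitext_dotfile_as_ext(".profile"))
print(splitext_dotfile_as_ext("home/user/.profile"))
print(splitext_dotfile_as_ext("archive.tar.gz"))
```

Note that in the dotfile branch the root is the bare directory name without a trailing separator, so concatenating root and extension does not reproduce the original path.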

Okay, the part I'm confused about is what's your position on what should be done about this. Are you favoring no change? Deprecating it and ripping it out? Or what?


