[Python-Dev] Path object design
Mike Orr sluggoster at gmail.com
Thu Nov 2 02:46:49 CET 2006
- Previous message: [Python-Dev] Path object design
- Next message: [Python-Dev] Path object design
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 11/1/06, glyph at divmod.com <glyph at divmod.com> wrote:
> On 06:14 pm, fredrik at pythonware.com wrote:
> >glyph at divmod.com wrote:
> >
> >> I assert that it needs a better[1] interface because the current
> >> interface can lead to a variety of bugs through idiomatic, apparently
> >> correct usage. All the more because many of those bugs are related to
> >> critical errors such as security and data integrity.
>
> >instead of referring to some esoteric knowledge about file systems that
> >us non-twisted-using mere mortals may not be evolved enough to
> >understand,
>
> On the contrary, twisted users understand even less, because (A) we've been demonstrated to get it wrong on numerous occasions in highly public and embarrassing ways and (B) we already have this class that does it all for us and we can't remember how it works :-).
This is ironic coming from one of Python's celebrity geniuses. "We made this class but we don't know how it works." Actually, it's downright alarming coming from someone who knows Twisted inside and out yet still can't make sense of path platform oddities.
> * This is confusing as heck:
>   >>> os.path.join("hello", "/world")
>   '/world'
That's in the documentation. I'm not sure it's "wrong". What should it do in this situation? Pretend the slash isn't there?
This came up in the directory-tuple proposal. I said there was no reason to change the existing behavior of join. Noam favored an exception.
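To make the two positions concrete, here is a minimal sketch: the documented behavior of join next to a strict variant that raises on an absolute component. The name strict_join and the choice of ValueError are my own assumptions for illustration, not anything taken from the directory-tuple proposal. I use posixpath so the result does not depend on the platform running it.

```python
import posixpath


def strict_join(first, *rest):
    """Hypothetical join that refuses absolute components after the first."""
    for part in rest:
        if posixpath.isabs(part):
            # Assumption: ValueError is just one plausible exception type.
            raise ValueError("absolute component in join: %r" % (part,))
    return posixpath.join(first, *rest)


# Documented behavior: an absolute component discards everything before it.
print(posixpath.join("hello", "/world"))  # /world

# The strict variant behaves like join for relative components.
print(strict_join("hello", "world"))      # hello/world
```

With this variant, strict_join("hello", "/world") raises instead of silently returning '/world'. Whether that is an improvement is exactly the open question.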
>   >>> os.path.join("hello", "slash/world")
>   'hello/slash/world'
That has always been a loophole in the function, and many programs depend on it. Again, is it "wrong"? Should an embedded separator in an argument be an error? Obviously this depends on the user's knowledge that the separator happens to be slash.
>   >>> os.path.join("hello", "slash//world")
>   'hello/slash//world'
Again a case of what "should" it do? The filesystem treats it as a single slash. The user didn't call normpath, so should we normalize it anyway?
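For reference, a short sketch of what the standard library does today with embedded and doubled separators. It uses posixpath so that the separator is slash regardless of platform. Join leaves the argument untouched, and collapsing the double slash is a separate, explicit normpath call.

```python
import posixpath

# An embedded separator is passed through unchanged.
embedded = posixpath.join("hello", "slash/world")
print(embedded)                      # hello/slash/world

# A doubled separator is also preserved by join ...
doubled = posixpath.join("hello", "slash//world")
print(doubled)                       # hello/slash//world

# ... and only collapsed when the caller asks for normalization.
print(posixpath.normpath(doubled))   # hello/slash/world
```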
> * Sometimes a path isn't a path; the zip "paths" in sys.path are a good example. This is why I'm a big fan of including a polymorphic interface of some kind: this information is already being persisted in an ad-hoc and broken way now, so it needs to be represented; it would be good if it were actually represented properly. URL manipulation-as-path-manipulation is another; the recent perforce use-case mentioned here is a special case of that, I think.
Good point, but exactly what functionality do you want to see for zip files and URLs? Just pathname manipulation? Or the ability to see whether a file exists and extract it, copy it, etc?
> * you have to care about unicode sometimes. rarely enough that none of your tests will ever account for it, but often enough that some users will notice breakage if your code is ever widely distributed.
This is a Python-wide problem. The move to universal unicode will lessen this, or at least move the problem to one place (creating the unicode object), where every Python programmer will get bitten by it and we'll develop a few standard strategies to deal with it.
(The problem is that if str and unicode are mixed in expressions, Python will promote the str to unicode and you'll get a UnicodeDecodeError if it contains non-ASCII characters. Figuring out all the ways such strings can slip into a program is difficult if you're dealing with user strings from an unknown charset, or your MySQL server is configured differently than you thought it was, or the string contains Windows curly quotes et al which are undefined in Latin-1.)
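A rough sketch of the failure and of the "decode in one place" strategy. The implicit promotion amounts to an ASCII decode, so I spell that decode out explicitly here; written this way the snippet also runs on interpreters where byte strings and unicode never mix implicitly. The sample byte strings are made up for illustration.

```python
# UTF-8 bytes for "café", standing in for a user string of unknown charset.
raw = b"caf\xc3\xa9"

# What the implicit promotion effectively does: decode as ASCII and fail.
try:
    raw.decode("ascii")
    promoted_ok = True
except UnicodeDecodeError:
    promoted_ok = False
print("ASCII promotion succeeded:", promoted_ok)

# Strategy: decode once, at the boundary, with the charset you actually expect.
text = raw.decode("utf-8")

# The Windows curly quote: byte 0x93 is a left double quotation mark in
# cp1252, but decoding it as Latin-1 yields a C1 control character instead.
quote = b"\x93"
as_cp1252 = quote.decode("cp1252")
as_latin1 = quote.decode("latin-1")
print(repr(as_cp1252), repr(as_latin1))
```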
> * the documentation really can't emphasize enough how bad using 'os.path.exists/isfile/isdir', and then assuming the file continues to exist when it is a contended resource, is. It can be handy, but it is always a race condition.
What else can you do? It's either os.path.exists()/os.remove() or "do it anyway and catch the exception". And sometimes you have to check the filetype in order to determine what to do.
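For what it's worth, here is a minimal sketch of the "do it anyway and catch the exception" variant for the removal case. The helper name remove_quietly is hypothetical. It closes the window between the check and the removal, though it does nothing for the cases where you must inspect the file type first.

```python
import errno
import os
import tempfile


def remove_quietly(path):
    """Remove path; return True if removed, False if it was already gone."""
    try:
        os.remove(path)
    except OSError as exc:
        # Only swallow "no such file"; permission errors etc. still propagate.
        if exc.errno != errno.ENOENT:
            raise
        return False
    return True


# Usage: the second call finds nothing to remove and reports that, with no
# separate os.path.exists() check that another process could invalidate.
fd, path = tempfile.mkstemp()
os.close(fd)
first = remove_quietly(path)
second = remove_quietly(path)
print(first, second)  # True False
```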
-- Mike Orr <sluggoster at gmail.com>