[Python-Dev] Defining a path protocol

Paul Moore p.f.moore at gmail.com
Wed Apr 6 15:32:39 EDT 2016


On 6 April 2016 at 19:32, Brett Cannon <brett at python.org> wrote:

>>> Now we need clear details. :) Some open questions are:
>>>
>>> 1. Name: __path__, __fspath__, or something else?

>> __fspath__
>
> +1 for __path__, +0 for __fspath__ (I don't know how widespread the notion
> that "fs" means "file system" is).

Agreed. But if we have a builtin, it should follow the name of the special attribute/method. And I'm not that keen on having a builtin with a generic name like 'path'.

>>> 2. Method or attribute? (changes what kind of one-liner you might use
>>> in libraries, but I think historically all protocols have been
>>> methods and the serialized string representation might be costly to
>>> build)

>> I would prefer an attribute, but yeah I think dunders are typically
>> methods, and I don't see this being special enough to not follow that
>> trend.
>
> Depends on what we want to tell 3rd-party libraries to do to support
> pathlib if they are on 3.3 or if they are worried about people using
> Python 3.4.2 or 3.5.1. An attribute still works with
> getattr(path, '__path__', path). But with a method you probably want
> either path.__path__() if hasattr(path, '__path__') else path
> or getattr(path, '__path__', lambda: path)().

I'm a little confused by this. To support the older pathlib, they have to do patharg = str(patharg), because none of the proposed attributes (path or __path__) will exist.
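A minimal sketch of that str() fallback, for code that must run against a pathlib with neither attribute. The helper name normalize_patharg is hypothetical and only illustrates the conversion:

```python
import pathlib


def normalize_patharg(patharg):
    """Hypothetical helper: coerce a str or pathlib path to a plain str."""
    # str() works on every pathlib version, but it also accepts any
    # object at all, so it cannot reject arguments that are not paths.
    return str(patharg)


print(normalize_patharg(pathlib.PurePosixPath("/tmp/example.txt")))
print(normalize_patharg("/tmp/example.txt"))
```

The catch is visible in the comment: str() never fails, so a mistaken argument such as an integer is silently turned into a "path".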

The getattr trick is needed to support the new pathlib, when you need a real string. Currently you need a string if you call stdlib functions or builtins. If we fix the stdlib/builtins, the need goes away for those cases, but remains if you need to call libraries that don't support pathlib (os.path will likely be one of those) or do direct string manipulation.

In practice, I see the getattr trick as an "easy fix" for libraries that want to add support but in a minimally-intrusive way. On that basis, making the trick easy to use is important, which argues for an attribute.
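To make the difference concrete, here is a small sketch of the getattr trick in both flavours. The classes AttrPath and MethodPath and the two helper functions are hypothetical, and __path__ is only the name under discussion here, not a settled API:

```python
class AttrPath:
    """Hypothetical path type exposing the protocol as an attribute."""

    def __init__(self, p):
        self.__path__ = p


class MethodPath:
    """Hypothetical path type exposing the protocol as a method."""

    def __init__(self, p):
        self._p = p

    def __path__(self):
        return self._p


def as_str_attr(path):
    # Attribute flavour: a plain string has no __path__, so the
    # argument itself is used as the default.
    return getattr(path, '__path__', path)


def as_str_method(path):
    # Method flavour: the default must be a callable that returns the
    # argument, because the result of getattr() is always called.
    return getattr(path, '__path__', lambda: path)()


print(as_str_attr(AttrPath("a/b")), as_str_attr("a/b"))
print(as_str_method(MethodPath("a/b")), as_str_method("a/b"))
```

Both work, but the attribute form is the one a library author can add without thinking about it.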

>>> 3. Built-in? (name is dependent on #1 if we add one)

>> fspath() -- and it would be handy to have a function that returns either
>> the __fspath__ results, or the string (if it was one), or raises an
>> exception if neither of the above work out.

fspath regardless of the name chosen in #1 - a new builtin called path just has too much likelihood of clashing with user code.

But I'm not sure we need a builtin. I'm not at all clear how frequently we expect user code to need to use this protocol. Users can't use the builtin if they want to be backward compatible. But code that doesn't need backward compatibility can probably just work with pathlib (and the stdlib support for it) directly. For display, the implicit conversion to str is fine. For "get me a string representing the path", is the "path" attribute being abandoned in favour of this special method? I'm inclined to think that if you are writing "pure pathlib" code, pathobj.path looks more readable than fspath(pathobj) - certainly no less readable.

But I'm not one of the people who disliked using .path, so I'm probably not best placed to judge. It would be good if someone who does feel strongly could explain why fspath(pathobj) is better than pathobj.path.

> So:
>
>   # Attribute
>   def fspath(path):
>       if hasattr(path, '__path__'):
>           return path.__path__
>       if isinstance(path, str):
>           return path
>       raise NotImplementedError  # Or TypeError?
>
>   # Method
>   def fspath(path):
>       try:
>           return path.__path__()
>       except AttributeError:
>           if isinstance(path, str):
>               return path
>           raise TypeError  # Or NotImplementedError?

You could of course use try/except for the attribute case. Or hasattr for the method case, where it would avoid masking AttributeError exceptions raised within the dunder method call (a possibility if user classes implement their own version of the protocol).
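The masking problem can be shown with a short sketch. The names fspath_try, fspath_hasattr and BuggyPath are hypothetical, and __path__ is again just the name under discussion:

```python
def fspath_try(path):
    # try/except flavour: the except clause also catches an
    # AttributeError raised *inside* __path__(), hiding the real bug.
    try:
        return path.__path__()
    except AttributeError:
        if isinstance(path, str):
            return path
        raise TypeError("expected str or path object")


def fspath_hasattr(path):
    # hasattr flavour: only the attribute lookup is guarded, so errors
    # raised inside __path__() propagate unchanged.
    if hasattr(path, '__path__'):
        return path.__path__()
    if isinstance(path, str):
        return path
    raise TypeError("expected str or path object")


class BuggyPath:
    """Hypothetical user class whose protocol method has a bug."""

    def __path__(self):
        return self.missing_attribute  # raises AttributeError


for func in (fspath_try, fspath_hasattr):
    try:
        func(BuggyPath())
    except Exception as exc:
        print(func.__name__, "raised", type(exc).__name__)
```

With the try/except version the caller sees a TypeError suggesting the object is not a path at all; with the hasattr version the original AttributeError surfaces.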

Paul


