[Python-Dev] PEP 420 - dynamic path computation is missing rationale

PJ Eby pje at telecommunity.com
Mon May 21 20:08:16 CEST 2012


On Mon, May 21, 2012 at 9:55 AM, Guido van Rossum <guido at python.org> wrote:

> Ah, I see. But I disagree that this is a reasonable constraint on sys.path. The magic __path__ object of a toplevel namespace module should know it is a toplevel module, and explicitly refetch sys.path rather than just keeping around a copy.

That's fine by me - the class could actually be defined to take a module name and attribute (e.g. 'sys', 'path' or 'foo', '__path__'), and then there'd be no need to special case anything: it would behave exactly the same way for subpackages and top-level packages.
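As a rough illustration of that idea (a sketch only, not the PEP 420 implementation), a path object parameterized by a parent module name and attribute name could look something like the following. The class name `DynamicPath`, its `subdir` argument, and the directory-scanning logic are all hypothetical, chosen just to show the refetch-on-access behavior:

```python
import os
import sys


class DynamicPath:
    """Hypothetical auto-updating path object (illustrative only)."""

    def __init__(self, parent_module, parent_attr, subdir):
        self._parent_module = parent_module  # e.g. 'sys' or 'foo'
        self._parent_attr = parent_attr      # e.g. 'path' or '__path__'
        self._subdir = subdir                # directory name of this package
        self._seen = None                    # parent path at last computation
        self._entries = []

    def _parent_path(self):
        # Refetch the parent path on every access instead of keeping a copy.
        module = sys.modules[self._parent_module]
        return tuple(getattr(module, self._parent_attr))

    def _current(self):
        parent = self._parent_path()
        if parent != self._seen:
            # The parent path changed, so rescan it for matching directories.
            candidates = (os.path.join(entry, self._subdir) for entry in parent)
            self._entries = [c for c in candidates if os.path.isdir(c)]
            self._seen = parent
        return self._entries

    def __iter__(self):
        return iter(self._current())

    def __len__(self):
        return len(self._current())

    def __repr__(self):
        # Deliberately not the same as a list's repr.
        return "DynamicPath(%r)" % (self._current(),)
```

With this shape, `DynamicPath('sys', 'path', 'foo')` and `DynamicPath('foo', '__path__', 'bar')` go through exactly the same code, which is the "no special case" property described above.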

> This leaves the magic __path__ objects for namespace modules, which I could live with, as long as their repr was not the same as a list, and assuming a good rationale is given. Although I'd still prefer plain lists here as well; I'd like to be able to manually construct a namespace package and force its directories to be a specific set of directories that I happen to know about, regardless of whether they are related to sys.path or not. And I'd like to know that my setup in that case would not be disturbed by changes to sys.path.

To do that, you just assign to __path__, the same as now, ala __path__ = pkgutil.extend_path(). The auto-updating is in the initially-assigned __path__ object, not the module object or some sort of generalized magic.
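A minimal sketch of that override, assuming an interpreter where PEP 420 namespace packages exist (Python 3.3 or later). The package name `demo_static_ns`, the helper `make_portion`, and the temporary directories are made up for the demonstration:

```python
import importlib
import os
import sys
import tempfile


def make_portion(root, package, module):
    """Create root/package/module.py with no __init__.py (hypothetical helper)."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write("VALUE = %r\n" % module)
    return pkg_dir


root_a = tempfile.mkdtemp()
root_b = tempfile.mkdtemp()
dir_a = make_portion(root_a, "demo_static_ns", "part_a")
dir_b = make_portion(root_b, "demo_static_ns", "part_b")

sys.path.append(root_a)
importlib.invalidate_caches()
import demo_static_ns.part_a

# Replace the initially-assigned path object with a plain list.
# From here on, the package no longer follows sys.path.
demo_static_ns.__path__ = [dir_a]

sys.path.append(root_b)
importlib.invalidate_caches()
try:
    import demo_static_ns.part_b
    found = True
except ImportError:
    # The static list does not include dir_b, so the import fails.
    found = False
```

The point is that the pinning is an ordinary attribute assignment on the module; nothing else about the package changes.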

> I'd like to hear more about this from Philip -- is that feature actually widely used?

Well, it's built into setuptools, so yes. ;-) It gets used any time a dynamically specified dependency is used that might contain a namespace package. This means, for example, that every setup script out there using "setup.py test", every project using certain paste.deploy features... it's really difficult to spell out the scope of things that are using this, in the context of setuptools and distribute, because there are an immense number of ways to indirectly rely on it.

This doesn't mean that the feature can't continue to be implemented inside setuptools' dynamic dependency system, but the code to do it in setuptools is MUCH more complicated than the PEP 420 code, and doesn't work if you manually add something to sys.path without asking setuptools to do it. It's also somewhat timing-sensitive, depending on when and whether you import 'site' and pkg_resources, and whether you are mixing eggs and non-eggs in your namespace packages.

In short, the implementation is a huge mess that the PEP 420 approach would vastly simplify.

But... that wasn't the original reason why I proposed it. The original reason was simply that it makes namespace packages act more like the equivalents do in other languages. While being able to override __path__ can be considered a feature of Python, its being static by default is NOT a feature, in the same way that requiring an __init__.py is not really a feature.

The principle of least surprise says (at least IMO) that if you add a directory to sys.path, you should be able to import stuff from it. That whether it works depends on whether or not you already imported part of a namespace package earlier is both surprising and confusing. (More on this below.)
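To make the scenario concrete, here is a small sketch, assuming Python 3.3 or later, where namespace packages get an auto-updating __path__. The package name `demo_dynamic_ns`, the helper `make_portion`, and the temporary directories are invented for the example:

```python
import importlib
import os
import sys
import tempfile


def make_portion(root, package, module):
    """Create root/package/module.py with no __init__.py (hypothetical helper)."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write("VALUE = %r\n" % module)
    return pkg_dir


root_a = tempfile.mkdtemp()
root_b = tempfile.mkdtemp()
make_portion(root_a, "demo_dynamic_ns", "part_a")
make_portion(root_b, "demo_dynamic_ns", "part_b")

# Import one portion of the namespace package first...
sys.path.append(root_a)
importlib.invalidate_caches()
import demo_dynamic_ns.part_a

# ...then add the directory holding the other portion afterwards.
sys.path.append(root_b)
importlib.invalidate_caches()
import demo_dynamic_ns.part_b  # found, because __path__ follows sys.path

portions = list(demo_dynamic_ns.__path__)
```

With a static __path__, the second import would depend on whether `demo_dynamic_ns` had already been imported before `root_b` was added, which is the surprising case described above.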

> What would a package have to do if the feature didn't exist?

Continue to depend on setuptools to do it for them, or use some hypothetical update API... but that's not really the right question. ;-)

The right question is, what would happen to package users if the feature didn't exist?

And the answer to that question is, "you must call this hypothetical update API every time you change sys.path, because otherwise your imports might break, depending on whether or not some other package imported something from a namespace before you changed sys.path".

And of course, you also need to make sure that any third-party code you use does this too, if it adds something to sys.path for you.

And if you're writing cross-Python-version code, you need to check whether the API is actually available.

And if you're someone helping Python newbies, you need to add this to your list of debugging questions for import-related problems.

And remember: if you forget to do this, it might not break now. It'll break later, when you add that other plugin or update that random module that dynamically decides to import something that just happens to be in a namespace package, so be prepared for it to break your application in the field, when an end-user is using it with a collection of plugins that you haven't tested together, or in the same import sequence...

The people using setuptools won't have these problems, but new Python users will, as people begin using a PEP 420 that lacks this feature.

The key scope question, I think, is: "How often do programs change sys.path at runtime, and what have they imported up to that point?" (Because for the other part of the scope, I think it's a fairly safe bet that namespace packages are going to become even more popular than they are now, once PEP 420 is in place.)

But the key API/usability question is: "What's the One Obvious Way to add/change what's importable?"

And I believe the answer to that question is, "change sys.path", not "change sys.path, and then import some other module to call another API to say, 'yes, I really meant to update sys.path, thank you very much.'"

(Especially since NOT requiring that extra API isn't going to break any existing code.)

> I'd really much rather not have this feature, which reeks of too much magic to me. (An area where Philip and I often disagree. :-)

My take on it is that it only SEEMS like magic, because we're used to static __path__. But other languages don't have per-package __path__ in the first place, so there's nothing to "automatically update", and so it's not magic at all that other subpackages/modules can be found when the system path changes!

So, under the PEP 420 approach, it's static __path__ that's really the weird special case, and should be considered so. (After all, __path__ is and was primarily an implementation optimization and compatibility hack, rather than a user-facing "feature" of the import system.)

For example, when would you want to explicitly spell out a namespace package path, and restrict it from seeing sys.path changes? I've not seen anybody ask for this feature in the context of setuptools; it's only ever been bug reports about when the more complicated implementation fails to detect an update.

So, to wrap up:


