Ugh; this was supposed to be sent to the list, not just Guido. (I wish Gmail defaulted to reply-all in the edit box.)

---------- Forwarded message ----------
From: PJ Eby <pje@telecommunity.com>

Date: Mon, Mar 12, 2012 at 12:16 AM
Subject: Re: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)?
To: Guido van Rossum <guido@python.org>


On Sun, Mar 11, 2012 at 10:39 PM, Guido van Rossum <guido@python.org> wrote:
> I'm leaning towards PEP 402 or some variant. Let's have a pow-wow at
> the sprint tomorrow (I'll arrive in Santa Clara between 10 and 10:30).
> I do want to understand Nick's argument better; I haven't studied PEP
> 395 yet.

Note that PEP 395 can stay compatible with PEP 402 by a fairly straightforward change: instead of implicitly and automagically guessing the needed sys.path\[0\] change, it could be made explicit by adding something like this to the top of script/modules that are inside a package:

    import pkgutil
    pkgutil.script_module(__name__, 'mypackage.thismodule')

Assuming \_\_name\_\_ == '\_\_main\_\_', the API would set \_\_main\_\_.\_\_qualname\_\_, set sys.modules\[qualname\] = \_\_main\_\_, and fix up sys.path\[0\] if and only if it is still the parent directory of \_\_main\_\_.\_\_file\_\_. (If \_\_name\_\_ != '\_\_main\_\_' and it's not equal to the second argument either, it'd be an error.)
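To make those steps concrete, here is a rough sketch of what such a function might look like. Nothing like it exists in pkgutil today; the name script\_module, its signature, the choice of ImportError, and the rule of climbing one directory per dot in the qualified name are all assumptions made for illustration.

```python
import os
import sys


def script_module(name, qualname):
    """Hypothetical sketch of the proposed pkgutil.script_module()."""
    if name == qualname:
        # Imported normally under its real name: nothing to fix up.
        return
    if name != '__main__':
        # Neither run as a script nor imported under the declared name.
        raise ImportError("%r declares itself as %r" % (name, qualname))

    main = sys.modules['__main__']
    main.__qualname__ = qualname
    # Alias the script under its qualified name, so that a later import
    # of that name reuses this module instead of loading a second copy.
    sys.modules[qualname] = main

    filename = getattr(main, '__file__', None)
    if filename is None:
        return  # e.g. interactive session: no sys.path[0] to adjust
    script_dir = os.path.dirname(os.path.abspath(filename))

    # Touch sys.path[0] only if it still is the script's own directory.
    if sys.path and os.path.abspath(sys.path[0] or os.curdir) == script_dir:
        root = script_dir
        # Assumption: one directory level per package level in qualname.
        for _ in range(qualname.count('.')):
            root = os.path.dirname(root)
        sys.path[0] = root
```

With this sketch, running /proj/mypackage/thismodule.py directly would leave sys.path\[0\] pointing at /proj, so that imports of mypackage resolve to the containing package rather than to sibling files.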

Then, in the event of broken relative imports or module aliasing, the error message can suggest adding a script\_module() declaration to explicitly make the file a "dual citizen" -- i.e., script/module. (It's already possible for PEP 395 to be confused by stray \_\_init\_\_.py files or \_\_path\_\_ manipulation; using error messages and explicit declaration instead of guessing seems like a better route for 395 to take.)

Of course, it's also possible to fix the 395/402 incompatibility by reintroducing some sort of marker, such as .pyp directory extensions or by including \*.pyp marker files within package directories. The problem is that these markers work against the intuitive nature of PEP 402 if they are required, and they do not help 395 if nobody uses them due to their optionality. ;-)

(Last, but not least, the compromise approach: allow explicit script/module declaration as a workaround for virtual packages, AND support automagic \_\_qualname\_\_ recognition for self-contained packages... but still give error messages for broken relative imports and aliasing that suggest the explicit declaration.)

Anyway, the other open issues for 402 are:

\* Dealing with updates to sys.path
\* Iterating available virtual packages

There was a Python-Dev discussion about the first, in which I realized that sys.path updates can actually be handled transparently by making virtual \_\_path\_\_ objects be special iterables rather than lists; but the PEP hasn't been updated to reflect that. (I was actually waiting for some sign of BDFL interest before adding a potential complication like that to the PEP.) The relevant proposal was:

> This seems to lean in favor of making a simple reiterable wrapper   
> type for the \_\_path\_\_, that only allows you to take the length and   
> iterate over it. With an appropriate design, it could actually   
> update itself automatically, given a subname and a parent   
> \_\_path\_\_/sys.path. That is, it could keep a tuple copy of the   
> last-seen parent path, and before iteration, compare   
> tuple(self.parent\_path) to self.last\_seen\_path. If they're   
> different, it rebuilds the value to be iterated over.

> Voila: transparent updating of all virtual \_\_path\_\_ values from
> sys.path changes (or modifications to self-contained \_\_path\_\_
> parents, btw), and trying to change it (or read an item from it
> positionally) will not create any silent failures.

> Alright... *if* we support automatic updates to virtual \_\_paths\_\_,
> this is probably how we should do it. (It will require, though, that
> imp.find_module be changed to use a different iteration method than
> PyList_GetItem, as it's quite possible a virtual __path__ will get
> passed into it.)

I actually drafted an implementation of this to work with importlib, so it seems pretty feasible to support automatically updated virtual paths that change on the next import attempt if sys.path (or any parent \_\_path\_\_) has changed since the last time.
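For illustration only, a minimal sketch of the reiterable wrapper described in the quoted proposal might look like the following. This is not the draft mentioned above; the class name VirtualPath and the rule that a portion is any existing subdirectory named after the package are assumptions.

```python
import os


class VirtualPath:
    """Hypothetical read-only, reiterable stand-in for a virtual __path__."""

    def __init__(self, subname, parent_path):
        self.subname = subname          # last component of the package name
        self.parent_path = parent_path  # live sys.path or parent __path__
        self.last_seen_path = None      # tuple snapshot of parent_path
        self._entries = ()

    def _refresh(self):
        current = tuple(self.parent_path)
        if current != self.last_seen_path:
            # The parent path changed since the last look: rebuild.
            self.last_seen_path = current
            candidates = (os.path.join(entry or os.curdir, self.subname)
                          for entry in current)
            self._entries = tuple(c for c in candidates if os.path.isdir(c))

    def __iter__(self):
        self._refresh()
        return iter(self._entries)

    def __len__(self):
        self._refresh()
        return len(self._entries)
```

Because the class defines neither item access nor any mutating methods, indexing it raises TypeError and calling something like append() raises AttributeError, so code that assumes a list fails loudly rather than silently.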


Iterating virtual packages is a somewhat harder problem, since it's not really practical to do an unbounded subdirectory search for importable files. Probably, the pkgutil module-walking APIs just need to grow some extra flags for virtual package searching, with some reasonable defaults.
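As a sketch of what a bounded search could look like (not an existing pkgutil API; the function name, the max\_depth parameter, and its default are all hypothetical), the following treats a directory as a virtual package candidate when no portion of it contains an \_\_init\_\_.py and at least one portion directly contains a .py file. That rule is a simplification chosen for illustration, and the code assumes a modern Python 3.

```python
import os


def iter_virtual_packages(search_path, max_depth=2, prefix=''):
    """Yield dotted names of candidate virtual packages (hypothetical)."""
    if max_depth < 1:
        return  # depth limit reached: stop instead of searching forever
    portions = {}  # directory name -> every directory contributing to it
    for entry in search_path:
        try:
            names = os.listdir(entry)
        except OSError:
            continue  # missing or unreadable path entry
        for name in names:
            full = os.path.join(entry, name)
            if name.isidentifier() and os.path.isdir(full):
                portions.setdefault(name, []).append(full)
    for name in sorted(portions):
        dirs = portions[name]
        if any(os.path.exists(os.path.join(d, '__init__.py')) for d in dirs):
            continue  # self-contained package, not a virtual one
        if any(f.endswith('.py') for d in dirs for f in os.listdir(d)):
            yield prefix + name
        # Look one level deeper, treating the portions as the new path.
        for sub in iter_virtual_packages(dirs, max_depth - 1,
                                         prefix + name + '.'):
            yield sub
```

The depth limit is what keeps the walk bounded; a real API would presumably expose it as one of the extra flags.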