On Sun Nov 30 2014 at 3:55:39 PM Guido van Rossum <guido@python.org> wrote:

On Sun, Nov 30, 2014 at 11:29 AM, Nathaniel Smith <njs@pobox.com> wrote:
On Sun, Nov 30, 2014 at 2:54 AM, Guido van Rossum <guido@python.org> wrote:

> All the use cases seem to be about adding some kind of getattr hook to
> modules. They all seem to involve modifying the CPython C code anyway. So
> why not tackle that problem head-on and modify module_getattro() to look for
> a global named __getattr__ and if it exists, call that instead of raising
> AttributeError?

You need to allow overriding __dir__ as well for tab-completion, and some people wanted to use the properties API instead of raw __getattr__, etc. Maybe someone will want __getattribute__ semantics, I dunno.

Hm... I agree about __dir__ but the other things feel too speculative.

So since we're *so close* to being able to just use the subclassing machinery, it seemed cleaner to try and get that working instead of reimplementing bits of it piecewise.

That would really be option 1, right? It's the one that looks cleanest from the user's POV (or at least from the POV of a developer who wants to build a framework using this feature -- for a simple one-off use case, __getattr__ sounds pretty attractive). I think that if we really want option 1, the issue of PyModuleType not being a heap type can be dealt with.

That said, __getattr__ + __dir__ would be enough for my immediate use cases.

Perhaps it would be a good exercise to try and write the "lazy submodule import"(*) use case three ways: (a) using only CPython 3.4; (b) using __class__ assignment; (c) using customizable __getattr__ and __dir__. I think we can learn a lot about the alternatives from this exercise. I presume there's already a version of (a) floating around, but if it's been used in practice at all, it's probably too gnarly to serve as a useful comparison (though its essence may be extracted to serve as such).
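For comparison, variant (c) can be sketched with the module-level __getattr__ and __dir__ hooks that later shipped in Python 3.7 (PEP 562); they did not exist when this message was written. The `lazy_xml` facade, the `_SUBMODULES` tuple and the helper names are hypothetical, and the module is built in memory only so that the example fits in one file. Treat it as an illustration under those assumptions, not as the implementation discussed in the thread.

```python
import importlib
import sys
import types

# Hypothetical stand-in for a package's __init__.py. In a real package the
# two hook functions would simply be defined at the top level of that file.
pkg = types.ModuleType("lazy_xml")

# Assumption: these are the submodules we want to expose lazily. The real
# stdlib "xml" package is used as the source of submodules.
_SUBMODULES = ("dom", "etree", "parsers", "sax")


def _module_getattr(name):
    """Called only when normal attribute lookup on the module fails."""
    if name in _SUBMODULES:
        submodule = importlib.import_module("xml." + name)
        # Cache the result in the module dict so the hook is not hit again.
        setattr(pkg, name, submodule)
        return submodule
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


def _module_dir():
    """Advertise lazy names so that tab-completion can see them."""
    return sorted(set(vars(pkg)) | set(_SUBMODULES))


# Python 3.7+ looks these two names up in the module's own dict.
pkg.__getattr__ = _module_getattr
pkg.__dir__ = _module_dir
```

Accessing `pkg.etree` imports `xml.etree` on first use and stores it on the facade; `dir(pkg)` lists the lazy names before anything has been imported.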

FWIW I believe all proposals here have a big limitation: the module *itself* cannot benefit much from all these shenanigans, because references to globals from within the module's own code are just dictionary accesses, and we don't want to change that.
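The limitation can be seen in a small sketch that uses __class__ assignment on a module, which is allowed from Python 3.5 on. The `HookedModule` class, the `demo` module and the `answer` attribute are all hypothetical names chosen for illustration.

```python
import types


class HookedModule(types.ModuleType):
    """Module subclass whose __getattr__ supplies a virtual attribute."""

    def __getattr__(self, name):
        if name == "answer":
            return 42
        raise AttributeError(name)


# Build a throwaway module whose own code reads the global "answer".
mod = types.ModuleType("demo")
exec(
    "def read_answer():\n"
    "    return answer  # plain lookup in the module dict, then builtins\n",
    mod.__dict__,
)

# Swap the class in place; attribute access from outside now uses the hook.
mod.__class__ = HookedModule

outside_value = mod.answer  # goes through HookedModule.__getattr__

try:
    mod.read_answer()  # the global lookup never consults __getattr__
    inside_error = None
except NameError as exc:
    inside_error = exc
```

Code outside the module sees `answer`, while the module's own function fails with NameError, because its global lookup goes straight to the module dictionary.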

(*) I originally wrote "lazy import", but I realized that messing with the module class object probably isn't the best way to implement that -- it requires a proxy for the module that's managed by an import hook. But if you think it's possible, feel free to use this example, as "lazy import" seems a pretty useful thing to have in many situations. (At least that's how I would do it. And I would probably add some atrocious hack to patch up the importing module's globals once the module is actually loaded, to reduce the cost of using the proxy over the lifetime of the process.)

Start at https://hg.python.org/cpython/file/64bb01bce12c/Lib/importlib/util.py#l207 and read down the rest of the file. It really only requires changing __class__ to drop the proxy, and that's done immediately after the lazy import. The laziness also kicks in after the finder has run, so at least a missing file doesn't surface as a delayed ImportError.
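As a rough illustration, and assuming the linked code corresponds to what shipped in Python 3.5 as importlib.util.LazyLoader, a lazy import can be sketched as follows. The `lazy_import` helper is a hypothetical name; the explicit check for a missing spec is added here to make the eager finder step visible.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code runs only on first attribute access."""
    # The finder runs eagerly, so a missing module is reported right here.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)

    # Wrap the real loader; LazyLoader defers executing the module body.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module

    # Installs the lazy proxy class on the module instead of running it.
    loader.exec_module(module)
    return module
```

For example, `lazy_import("colorsys")` returns a proxy-typed module; the first attribute access such as `colorsys.rgb_to_hsv` executes the module and resets its class to the ordinary module type.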