[Python-Dev] requirements for moving import over to importlib?

Brett Cannon brett at python.org
Thu Feb 9 16:05:22 CET 2012


On Wed, Feb 8, 2012 at 20:26, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Thu, Feb 9, 2012 at 2:09 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> I guess my point was: why is there a function call in that case? The
>> "import" statement could look up sys.modules directly.
>> Or the built-in __import__ could still be written in C, and only defer
>> to importlib when the module isn't found in sys.modules.
>> Practicality beats purity.

> I quite like the idea of having builtin __import__ be a very thin
> veneer around importlib that just does the "is this in sys.modules
> already so we can just return it from there?" checks and delegates
> other more complex cases to Python code in importlib.
>
> Poking around in importlib.__import__ [1] (as well as
> importlib._gcd_import), I'm thinking what we may want to do is break up
> the logic a bit so that there are multiple helper functions that a C
> version can call back into so that we can optimise certain simple code
> paths to not call back into Python at all, and others to only do so
> selectively.
>
> Step 1: separate out the "fromlist" processing from __import__ into a
> separate helper function
>
>     def _process_fromlist(module, fromlist):
>         # Perform any required imports as per existing code:
>         # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l987

Fine by me.
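As a rough illustration of what such a fromlist helper might do, here is a minimal sketch. The name `process_fromlist` and its exact behaviour are assumptions made for this example, not the code at the URL above: it imports any names in the fromlist that are not yet attributes of a package, and leaves plain modules alone.

```python
import importlib


def process_fromlist(module, fromlist):
    """Hypothetical helper: import submodules named in ``fromlist``.

    Only packages (modules with ``__path__``) can have submodules, so
    plain modules are returned untouched.
    """
    if not hasattr(module, "__path__"):
        return module
    for name in fromlist:
        if name == "*":
            # Assumption: expand "*" through __all__ when it is defined.
            process_fromlist(module, getattr(module, "__all__", ()))
        elif not hasattr(module, name):
            full_name = f"{module.__name__}.{name}"
            try:
                importlib.import_module(full_name)
            except ModuleNotFoundError as exc:
                # A missing submodule is not an error here; the "from"
                # statement reports a missing attribute on its own.
                # Failures inside the submodule must still propagate.
                if exc.name != full_name:
                    raise
    return module


import xml

# "from xml import dom" needs xml.dom imported and bound on the package.
process_fromlist(xml, ["dom", "no_such_name"])
print(hasattr(xml, "dom"))
```

Keeping this logic in its own function is what would let a C implementation call it only when a fromlist is actually present.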

> Step 2: separate out the relative import resolution from _gcd_import
> into a separate helper function.
>
>     def _resolve_relative_name(name, package, level):
>         assert hasattr(name, 'rpartition')
>         assert hasattr(package, 'rpartition')
>         assert level > 0
>         name = # Recalculate as per the existing code:
>         # http://hg.python.org/cpython/file/aba513307f78/Lib/importlib/_bootstrap.py#l889
>         return name

I was actually already thinking of exposing this as importlib.resolve_name() so breaking it out makes sense.
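For readers who want to see the resolution step concretely, a minimal sketch follows. The function name `resolve_relative_name` is hypothetical, and the logic is my reading of how relative names are usually resolved (each level beyond the first strips one trailing package component); it is not taken from the linked revision.

```python
def resolve_relative_name(name, package, level):
    """Hypothetical helper: turn a relative name into an absolute one."""
    if level < 1:
        raise ValueError("level must be >= 1 for a relative import")
    # level=1 means "the current package"; each extra level strips one
    # trailing component from the package name.
    bits = package.rsplit(".", level - 1)
    if len(bits) < level:
        raise ImportError("attempted relative import beyond top-level package")
    base = bits[0]
    return f"{base}.{name}" if name else base


# "from ..x import y" executed inside package a.b.c
print(resolve_relative_name("x", "a.b.c", 2))
```

In current Python versions, `importlib.util.resolve_name()` offers the same resolution for names written with leading dots.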

I also think it might be possible to expose a sort of importlib.find_module() that does nothing more than find the loader for a module (if available).
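To make the "find the loader without importing" idea concrete, here is a small sketch. Note the assumptions: `importlib.find_module()` above is only a proposed name, so this example instead relies on `importlib.util.find_spec()`, which exists in later Python versions (3.4+), and the wrapper name `find_loader_only` is invented for illustration.

```python
import importlib.util
import sys


def find_loader_only(name):
    """Hypothetical helper: return the loader for ``name``, or None.

    For a top-level name this consults the finders without executing
    the module. For a dotted name, find_spec() imports parent packages.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.loader


was_loaded = "colorsys" in sys.modules
loader = find_loader_only("colorsys")
# Looking up the loader does not change whether the module is loaded.
print(loader is not None, ("colorsys" in sys.modules) == was_loaded)
```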

> Step 3: Implement builtin __import__ in C (pseudo-code below):
>
>     def __import__(name, globals={}, locals={}, fromlist=[], level=0):
>         if level > 0:
>             name = importlib._resolve_relative_import(name)
>         try:
>             module = sys.modules[name]
>         except KeyError:
>             # Not cached yet, need to invoke the full import machinery
>             # We already resolved any relative imports though, so
>             # treat it as an absolute import
>             return importlib.__import__(name, globals, locals, fromlist, 0)
>         # Got a hit in the cache, see if there's any more work to do
>         if not fromlist:
>             # Duplicate relevant importlib.__import__ logic as C code
>             # to find the right module to return from sys.modules
>         elif hasattr(module, "__path__"):
>             importlib._process_fromlist(module, fromlist)
>         return module
>
> This would then be similar to the way main.c already works when it
> interacts with runpy - simple cases are handled directly in C, more
> complex cases get handed over to the Python module.
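The pseudo-code above is meant to be written in C, but the control flow can be tried out in pure Python. The sketch below is an assumption-laden simplification, not the proposed implementation: `fast_import` is a hypothetical name, it only short-circuits the simplest case (absolute import, no fromlist, already cached), and it hands everything else to `importlib.__import__`.

```python
import importlib
import sys


def fast_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Hypothetical thin veneer: serve cache hits, delegate the rest."""
    if level == 0 and not fromlist:
        module = sys.modules.get(name)
        # "import a.b.c" binds the top-level package "a", so that is
        # what a cache hit has to return.
        top = sys.modules.get(name.partition(".")[0])
        if module is not None and top is not None:
            # Caveat: a real fast path must also cope with modules that
            # are still being initialised by another thread.
            return top
    # Relative imports, fromlist handling and cache misses all go to
    # the pure-Python machinery.
    return importlib.__import__(name, globals, locals, fromlist, level)


import os.path

print(fast_import("os.path") is os)
```

The point of the split is that the first branch touches nothing but a dictionary, which is the part that is cheap to duplicate in C.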

I suspect that if people want the case where you load from bytecode to be fast, then this will have to expand beyond this to include C functions and/or classes which can be used as accelerators. While this accelerates the common case of a sys.modules hit, it (probably) won't make Antoine happy enough for importing a small module from bytecode (importing large modules like decimal is already fast enough).
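To show which steps sit on the "load from bytecode" path being discussed, here is a small sketch that compiles a throwaway module and then loads it from its .pyc file only. The helper name `load_from_bytecode` and the module name `demo_mod` are made up for this example, and it uses today's importlib API rather than anything that existed when this was written.

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile


def load_from_bytecode(source_text, module_name="demo_mod"):
    """Hypothetical helper: compile source, then load only the .pyc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, module_name + ".py")
        pyc = os.path.join(tmp, module_name + ".pyc")
        with open(src, "w", encoding="utf-8") as fh:
            fh.write(source_text)
        py_compile.compile(src, cfile=pyc, doraise=True)
        os.remove(src)  # make sure the source cannot be used
        # The loader reads the file, validates the header, unmarshals
        # the code object and executes it in the new module.
        loader = importlib.machinery.SourcelessFileLoader(module_name, pyc)
        spec = importlib.util.spec_from_file_location(
            module_name, pyc, loader=loader
        )
        module = importlib.util.module_from_spec(spec)
        loader.exec_module(module)
        return module


mod = load_from_bytecode("ANSWER = 6 * 7\n")
print(mod.ANSWER)
```

For a small module, the work in the loader (file reads, header checks, unmarshalling) dominates, which is why this path is a candidate for C accelerators.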


