msg153566
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 16:38
This patch makes importlib's _FileFinder more than 10x faster by keeping a cache of directory entries. It's actually faster than imp.find_module()!

-> imp.find_module:

$ ./python -m timeit -s "import imp" "imp.find_module('logging', None)"
10000 loops, best of 3: 69.9 usec per loop
$ ./python -m timeit -s "import imp" "imp.find_module('bisect', None)"
10000 loops, best of 3: 108 usec per loop

-> unpatched importlib:

$ ./python -m timeit -s "from importlib._bootstrap import _DefaultPathFinder as Finder" "Finder.find_module('logging', None)"
1000 loops, best of 3: 431 usec per loop
$ ./python -m timeit -s "from importlib._bootstrap import _DefaultPathFinder as Finder" "Finder.find_module('bisect', None)"
1000 loops, best of 3: 468 usec per loop

-> patched importlib:

$ ./python -m timeit -s "from importlib._bootstrap import _DefaultPathFinder as Finder" "Finder.find_module('logging', None)"
10000 loops, best of 3: 37.5 usec per loop
$ ./python -m timeit -s "from importlib._bootstrap import _DefaultPathFinder as Finder" "Finder.find_module('bisect', None)"
10000 loops, best of 3: 36.9 usec per loop
|
|
msg153572
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 17:13
Note that the cost of filling the cache itself isn't trivial:

$ ./python -m timeit -s "import sys; from importlib import _bootstrap" "_bootstrap._file_path_hook('Lib')._fill_cache()"
1000 loops, best of 3: 1.88 msec per loop
|
|
msg153575
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 17:17
Updated patch with faster cache filling. |
|
|
msg153582
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 17:37
I have not had a chance to deep-dive into the patch, but I just wanted to (a) double-check that this implemented the idea PJE and I discussed on python-dev, and (b) ask that you re-generate the patch, as I pushed some changes to importlib today.
|
|
msg153583
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 17:41
> I have not had a chance to deep-dive into the patch, but I just wanted
> to (a) double-check that this implemented the idea PJE and I discussed
> on python-dev

I haven't followed this discussion.

> and (b) ask that you re-generate the patch, as I pushed some changes to
> importlib today.

Will do.
|
|
msg153584
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 17:45
Regenerated patch. |
|
|
msg153587
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 17:59
Just to fill you in, the discussion centred on the idea of doing a listdir() of the directory the FileFinder was in charge of watching and caching that. Then, when it had to look up a file all it had to do was stat the directory to look for a change before it simply looked at a frozenset of directory contents. That does away with having to stat for every file type (directory, module, extension, etc.) and replaces it with one upfront listdir() and then a single directory stat per lookup. |
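
To make that concrete, here is a minimal illustrative sketch of the scheme (the class name, the suffix tuple and the find_module() signature are made up for the example, not taken from the patch):

import os

_SUFFIXES = ('.py', '.pyc', '.so')    # illustrative subset of the real loader suffixes

class CachedDirFinder:
    # Sketch: one listdir() per directory, revalidated by a single stat() per lookup.

    def __init__(self, path):
        self.path = path or '.'
        self._dir_mtime = -1
        self._contents = frozenset()

    def _refresh(self):
        mtime = os.stat(self.path).st_mtime
        if mtime != self._dir_mtime:       # first use, or the directory changed: re-list it
            self._contents = frozenset(os.listdir(self.path))
            self._dir_mtime = mtime

    def find_module(self, name):
        self._refresh()                    # one stat() instead of one stat per suffix
        tail = name.rpartition('.')[2]
        if tail in self._contents:         # candidate package directory
            return os.path.join(self.path, tail)
        for suffix in _SUFFIXES:
            if tail + suffix in self._contents:
                return os.path.join(self.path, tail + suffix)
        return None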
|
|
msg153588
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 18:02
> Just to fill you in, the discussion centred on the idea of doing a
> listdir() of the directory the FileFinder was in charge of watching
> and caching that. Then, when it had to look up a file all it had to do
> was stat the directory to look for a change before it simply looked at
> a frozenset of directory contents. That does away with having to stat
> for every file type (directory, module, extension, etc.) and replaces
> it with one upfront listdir() and then a single directory stat per
> lookup.

Ah, then it's basically what my patch does (except that it also computes all potential module names instead of dumping the os.listdir() result in a frozenset).
|
|
msg153590
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 18:57
OK, I have gone ahead and done a review over on rietveld for the code as-is. But I do have a design question.

Why pre-calculate everything? In the most common case any single module will be imported once, if at all. And once it is imported it will get cached in sys.modules, alleviating the need to hit the finder again. So from a performance standpoint wouldn't it be better not to do all of the pre-calculation and instead do that as needed, assuming that sys.modules will shield the finder from having to do repetitive things like figuring out what loader is needed? You will have to do the calculation regardless for the first import, but you might not need to import every module in a directory. Plus if the finder gets its cache invalidated frequently it will simply be wasting its time. I'm not going to argue from the perspective that most modules won't get imported, as that's purely a stdlib thing; I am willing to bet most projects import nearly all of their modules in each execution.

Otherwise it's good to know three of us now have independently come up with fundamentally the same idea for speeding up imports. =)
|
|
msg153592
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 19:31
> Why pre-calculate everything? In the most common case any single
> module will be imported once, if at all. And once it is imported it
> will get cached in sys.modules, alleviating the need to hit the finder
> again. So from a performance standpoint wouldn't it be better not to
> do all of the pre-calculation and instead do that as needed, assuming
> that sys.modules will shield the finder from having to do repetitive
> things like figuring out what loader is needed?

I figured it would avoid repetitive tests for all 10 suffixes.
That said, I have now tried the alternative: find_module() is around 50% slower, but building the cache is 10x faster. Perhaps this is a winner. It would depend on the situation (short or long sys.path, few or many imports, etc.). Perhaps you can try both patches on your bootstrap repo?

> Plus if the finder gets its cache invalidated frequently it will
> simply be wasting its time.

Well, in real-world situations I don't think the cache will ever get invalidated: because imports are mostly done at startup, and because invalidating the cache means you are installing new libraries or updating existing ones while a running program is about to import something.

> Otherwise it's good to know three of us now have independently come up
> with fundamentally the same idea for speeding up imports. =)

Yup :)
|
|
msg153593
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 19:32
Patch with light-weight alternative. |
|
|
msg153596
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 20:24
On Fri, Feb 17, 2012 at 14:31, Antoine Pitrou <report@bugs.python.org> wrote:

> > Why pre-calculate everything? In the most common case any single
> > module will be imported once, if at all. And once it is imported it
> > will get cached in sys.modules, alleviating the need to hit the finder
> > again. So from a performance standpoint wouldn't it be better not to
> > do all of the pre-calculation and instead do that as needed assuming
> > that sys.modules will shield the finder from having to do repetitive
> > things like figuring out what loader is needed?
>
> I figured it would avoid repetitive tests for all 10 suffixes.
> That said, I have now tried the alternative: find_module() is around 50%
> slower, but building the cache is 10x faster. Perhaps this is a winner.

What is the time increase for find_module() vs. the speed-up of building the cache? I.e. how many imports are needed before doing the full calculation is a benefit? And would it make sense to have a hybrid of caching the contents for fast start-up but then caching full details after a successful find? That would mean no work is ever simply tossed out and forgotten.

> It would depend on the situation (short or long sys.path, few or many
> imports, etc.). Perhaps you can try both patches on your bootstrap repo?

Yep, that's not hard (and it will only get faster as I replace the bodies of __import__() and _gcd_import() with C code so that sys.modules is C-fast again). Question is what to benchmark against? I should probably get the standard benchmarks up and running and see how those are affected (especially the start-up ones).

> > Plus if the finder gets its cache invalidated frequently it will
> > simply be wasting its time.
>
> Well, in real-world situations I don't think the cache will ever get
> invalidated: because imports are mostly done at startup, and because
> invalidating the cache means you are installing new libraries or
> updating existing ones while a running program is about to import
> something.

I agree, but it was just something to consider.

> > Otherwise it's good to know three of us now have independently come up
> > with fundamentally the same idea for speeding up imports. =)
>
> Yup :)
|
|
msg153597
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 20:28
And while I'm thinking about it, the hybrid caching approach of caching just the directory contents to start and then caching full details on successful searches (and failures) would also let you cache the full file path as the strings will be interned and used again by the loaders for setting __file__. |
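
Roughly, and building on the illustrative sketch earlier in the thread (again hypothetical names, not the patch itself), the hybrid could look like:

class HybridCachedDirFinder(CachedDirFinder):
    # Keep the cheap directory listing for start-up, but also remember the full
    # result of every per-name search so it is never recomputed.

    def __init__(self, path):
        super().__init__(path)
        self._results = {}            # name -> full path, or None for a cached miss

    def _refresh(self):
        old_mtime = self._dir_mtime
        super()._refresh()
        if self._dir_mtime != old_mtime:
            self._results.clear()     # directory changed: per-name results are stale too

    def find_module(self, name):
        self._refresh()
        if name not in self._results:
            self._results[name] = super().find_module(name)   # cache hits and misses
        return self._results[name]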
|
|
msg153600
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 21:41
> And while I'm thinking about it, the hybrid caching approach of
> caching just the directory contents to start and then caching full
> details on successful searches (and failures) would also let you cache
> the full file path as the strings will be interned and used again by
> the loaders for setting __file__.

Well, you shouldn't call find_module() on the same module twice (except for benchmarks), so I'm not sure what caching the results would bring.
|
|
msg153601
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 21:48
Updated light-weight caching patch (now checks PYTHONCASEOK dynamically). |
|
|
msg153605
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 22:01
The latest patch (listdir_cache2) LGTM. Have you by any chance patched it in and run ./python -m importlib.test.regrtest (which runs the entire test suite with builtins.__import__() substituted with importlib.__import__())? If not, I can do it on my ride home tonight to make sure semantics didn't surprisingly change (I don't see how). And then I can benchmark against the standard benchmark suite to see if it speeds things up in my bootstrap branch.
|
|
msg153607
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 22:36
> The latest patch (listdir_cache2) LGTM. Have you by any chance
> patched it in and run ./python -m importlib.test.regrtest (which runs
> the entire test suite with builtins.__import__() substituted with
> importlib.__import__())?

There are a couple of test failures (in test_import, test_runpy, test_reprlib). Apparently this is due to false positives when comparing directory modification times. If I sleep() enough between tests, the failures disappear... I wonder if this could be a problem in real life (say, someone compiling some DSL to Python code on the fly and expecting import to pick it up without any timestamp problems).
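
The failure mode can be reproduced outside the test suite; the file name below is made up for the demonstration:

import os
import tempfile

d = tempfile.mkdtemp()
before = os.stat(d).st_mtime
with open(os.path.join(d, 'freshly_generated.py'), 'w') as f:
    f.write('x = 1\n')
after = os.stat(d).st_mtime
# On filesystems with coarse timestamp resolution, before == after can hold
# here even though the directory contents changed, so a finder comparing
# cached directory mtimes keeps serving its stale listing until the clock
# ticks over to the next representable timestamp.
print(before == after)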
|
|
msg153608
Author: Brett Cannon (brett.cannon)
Date: 2012-02-17 22:45
Well, it's the reason you added file size to .pyc files. =) I assume if you store the system clock as well to see if no time has changed it would kill the performance gain. We could also have two versions of _FileFinder such that people could choose which they preferred. |
|
|
msg153609
Author: Antoine Pitrou (pitrou)
Date: 2012-02-17 22:50
> Well, it's the reason you added file size to .pyc files. =)

pyc files only store the integer timestamp. Here I store the original floating-point timestamp, and I assumed it would be sufficient :)

> I assume if you store the system clock as well to see if no time has
> changed it would kill the performance gain.

I don't know. I might give it a try but it might still be fragile.
|
|
msg153655
Author: Antoine Pitrou (pitrou)
Date: 2012-02-18 19:55
Another possibility would be to include an explicit invalidate_caches() function in importlib. |
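
From user code it would be used along these lines (the generated module name is hypothetical):

import importlib
import os
import sys

# Write out a module that did not exist when the finders last listed the directory.
with open(os.path.join(sys.path[0], 'generated_mod.py'), 'w') as f:
    f.write('answer = 42\n')

importlib.invalidate_caches()    # ask every cached finder to forget its directory listing
import generated_mod             # the import now sees the freshly written file
print(generated_mod.answer)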
|
|
msg153658
Author: Antoine Pitrou (pitrou)
Date: 2012-02-18 20:52
This is a patch demonstrating invalidate_caches(). As for test_runpy, it seems runpy uses imp.get_loader(), so I don't see how it can remain synchronized with importlib's state. |
|
|
msg153660
Author: Brett Cannon (brett.cannon)
Date: 2012-02-18 23:42
I wouldn't worry too much about test_runpy as it has always been a touchy test for me. |
|
|
msg153747
Author: Roundup Robot (python-dev)
Date: 2012-02-20 00:55
New changeset bbaab666e6c7 by Antoine Pitrou in branch 'default':
Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new importlib.invalidate_caches() function.
http://hg.python.org/cpython/rev/bbaab666e6c7
|
|
msg153749
Author: Antoine Pitrou (pitrou)
Date: 2012-02-20 00:57
Committed now (with invalidate_caches() as well). |
|
|