[Python-Dev] Rethinking intern() and its data structure
Robert Collins robert.collins at canonical.com
Fri Apr 10 11:19:39 CEST 2009
- Previous message: [Python-Dev] Lazy importing (was Rethinking intern() and its data structure)
- Next message: [Python-Dev] Rethinking intern() and its data structure
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Thu, 2009-04-09 at 21:26 -0700, Guido van Rossum wrote:
> Just to add some skepticism, has anyone done any kind of instrumentation of bzr start-up behavior?
We sure have. 'bzr --profile-imports' reports the time to import different modules (both cumulatively and individually).
We have a lazy module loader that allows us to defer loading modules we might not use (though if they are needed we are in fact going to pay for loading them eventually).
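A lazy module loader in this spirit can be sketched with a small proxy that defers the real import until first attribute access (this is a simplified illustration of the idea, not bzrlib's actual lazy_import implementation):

```python
import importlib

class LazyModule:
    """Defer importing a module until an attribute is first accessed."""

    def __init__(self, name):
        self._name = name
        self._mod = None

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails, i.e. for
        # anything other than _name/_mod; import on first real use.
        if self._mod is None:
            self._mod = importlib.import_module(self._name)
        return getattr(self._mod, attr)

json = LazyModule("json")
# Nothing has been imported yet; the cost is paid on first use:
print(json.dumps({"lazy": True}))
```

If the module is never touched, its import cost is never paid; if it is touched, the cost simply moves from start-up to first use, as the parenthetical above notes.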
We monkeypatch the standard library where modules we want are unreasonably expensive to import (for instance, by making a regex we might never use lazily compiled rather than compiled at import time).
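The lazily-compiled-regex trick can be sketched with a wrapper that postpones re.compile until the pattern is first used (the pattern and the name SOME_RE here are hypothetical; in practice a module-level compiled attribute would be replaced with such a wrapper):

```python
import re

class LazyPattern:
    """Compile a regex on first use instead of at import time."""

    def __init__(self, pattern, flags=0):
        self._args = (pattern, flags)
        self._compiled = None

    def __getattr__(self, attr):
        # Compile lazily, then delegate match/search/etc. to the
        # real compiled pattern object.
        if self._compiled is None:
            self._compiled = re.compile(*self._args)
        return getattr(self._compiled, attr)

# Instead of SOME_RE = re.compile(r"\d{4}-\d{2}-\d{2}") at import time:
SOME_RE = LazyPattern(r"\d{4}-\d{2}-\d{2}")
print(SOME_RE.match("2009-04-10") is not None)
```

Code using the pattern is unchanged, which is what makes this viable as a monkeypatch over stdlib modules.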
> IIRC every time I was asked to reduce the start-up cost of some Python app, the cause was too many imports, and the solution was either to speed up import itself (.pyc files were the first thing ever that came out of that -- importing from a single .zip file is one of the more recent tricks) or to reduce the number of modules imported at start-up (or both :-). Heavy-weight frameworks are usually the root cause, but usually there's nothing that can be done about that by the time you've reached this point. So, amen on the good luck, but please start with a bit of analysis.
Certainly, import time is part of it:

robertc at lifeless-64:~$ python -m timeit -s 'import sys; import bzrlib.errors' "del sys.modules['bzrlib.errors']; import bzrlib.errors"
10 loops, best of 3: 18.7 msec per loop
(errors.py is 3027 lines long with 347 exception classes).
We've also looked lower - Python does a lot of stat operations searching for imports and determining whether the pyc is up to date; these appear to only really matter on cold-cache imports (but they matter a lot then); in hot-cache situations they are insignificant.
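To see where those stats come from: for each module, the import search probes every sys.path entry for several candidate filenames, each probe costing roughly one stat. A hypothetical sketch of that search space (the real machinery of the era checked more suffixes, e.g. package __init__ files and platform-specific extensions):

```python
import os
import sys

def candidate_probes(module_name):
    """Illustrative list of paths an import search might stat
    for a top-level module; simplified, not the real algorithm."""
    paths = []
    for entry in sys.path:
        if not entry:
            entry = "."  # empty entry means the current directory
        for suffix in (".py", ".pyc", ".so"):
            paths.append(os.path.join(entry, module_name + suffix))
    return paths

probes = candidate_probes("example")
print(len(probes), "potential stat() calls, e.g.:", probes[0])
```

With a long sys.path, the number of probes multiplies quickly, which is why the cost is dominated by cold-cache runs: every probe that misses the disk cache is a real disk seek.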
Uhm, there's probably more - but I just wanted to note that we have done quite a bit of analysis. I think a large chunk of our problem is having too much code loaded when only a small fraction will be used in any one operation. Consider importing bzrlib errors - 10% of the startup time for 'bzr help'. In any operation only a few of those exceptions will be used - and typically 0.
-Rob