Issue 13150: Most of Python's startup time is sysconfig

Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, barry, doko, eric.araujo, ezio.melotti, jcon, nadeem.vawda, ncoghlan, pitrou, python-dev, rosslagerwall, rpetrov, tarek, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2011-10-11 03:02 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name            Uploaded
sysconfigdata.patch  pitrou, 2011-10-11 14:53
Messages (24)
msg145328 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 03:02
sysconfig is imported and used by site.py.

$ time ./python -S -c ''
real 0m0.019s
user 0m0.013s
sys 0m0.005s

$ time ./python -S -c 'import sysconfig'
real 0m0.047s
user 0m0.046s
sys 0m0.002s

$ time ./python -S -c 'import sysconfig; sysconfig.get_path("purelib")'
real 0m0.053s
user 0m0.047s
sys 0m0.005s

$ time ./python -c ''
real 0m0.058s
user 0m0.054s
sys 0m0.003s
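The measurements above can be approximated from Python itself. This is only a rough sketch: it times the running interpreter (`sys.executable`) rather than a freshly built `./python`, and the helper name `time_startup` is made up for illustration. Absolute numbers will differ by machine and Python version.

```python
import subprocess
import sys
import time


def time_startup(extra_args, repeats=5):
    """Return the best wall-clock time, in seconds, of starting the interpreter."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Run an empty program so that only interpreter startup is measured.
        subprocess.run([sys.executable, *extra_args, "-c", ""], check=True)
        best = min(best, time.perf_counter() - start)
    return best


# -S skips the import of site.py, which is what pulled in sysconfig here.
without_site = time_startup(["-S"])
with_site = time_startup([])
print(f"-S: {without_site:.3f}s  default: {with_site:.3f}s")
```

Taking the minimum over several runs reduces noise from the operating system, at the cost of hiding cold-cache effects.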
msg145342 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 13:37
Actually, a big part of that is compiling some regexes in the tokenize module. Just relying on the re module's internal caching shaves off 20% of total startup time.

Before:

$ time ./python -S -c 'import tokenize'
real 0m0.034s
user 0m0.030s
sys 0m0.003s

$ time ./python -c ''
real 0m0.055s
user 0m0.050s
sys 0m0.005s

After:

$ time ./python -S -c 'import tokenize'
real 0m0.021s
user 0m0.019s
sys 0m0.001s

$ time ./python -c ''
real 0m0.044s
user 0m0.038s
sys 0m0.006s
msg145344 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-11 13:49
New changeset df950158dc33 by Antoine Pitrou in branch 'default': Issue #13150: The tokenize module doesn't compile large regular expressions at startup anymore. http://hg.python.org/cpython/rev/df950158dc33
msg145346 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2011-10-11 14:07
I am curious: wouldn't there be a way of keeping the compiled expressions in a static cache somewhere, so we would compile them just once and have both import time and runtime fast?
msg145347 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-11 14:11
New changeset ed0bc92fed68 by Antoine Pitrou in branch 'default': Use a dict for faster sysconfig startup (issue #13150) http://hg.python.org/cpython/rev/ed0bc92fed68
msg145348 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 14:13
> I am curious: wouldn't there be a way of keeping the compiled expressions in
> a static cache somewhere, so we would compile them just once and have
> both import time and runtime fast?

Runtime shouldn't be affected. The re module has its own LRU caching.

That said, it seems regular expressions are pickleable:

b'\x80\x03cre\n_compile\nq\x00X\x00\x00\x00\x00q\x01K \x86q\x02Rq\x03.'
msg145349 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 14:13
Arg damn roundup e-mail gateway. I wanted to paste:

>>> pickle.dumps(re.compile(''))
b'\x80\x03cre\n_compile\nq\x00X\x00\x00\x00\x00q\x01K \x86q\x02Rq\x03.'
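A small round-trip sketch of pickling a compiled pattern follows; the example pattern is made up. Note that the pickled bytes quoted above reference `re._compile` together with the pattern string and flags, which suggests that unpickling recompiles the expression rather than restoring a precompiled form, so pickling by itself may not save compilation time.

```python
import pickle
import re

# Hypothetical pattern, used only to demonstrate the round trip.
original = re.compile(r"(\w+)@(\w+)\.org", re.IGNORECASE)

payload = pickle.dumps(original)
restored = pickle.loads(payload)

# The restored object has the same pattern text and flags as the original.
print(restored.pattern == original.pattern, restored.flags == original.flags)
print(restored.match("Dev@Python.org").groups())
```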
msg145350 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-11 14:53
Pre-parsing and building a cached module of build-time variables (from Makefile and pyconfig.h) under POSIX also removes more than 15% of startup time. Patch attached.
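The approach can be sketched roughly as follows. This is not the attached patch: the parser handles only simple `NAME = value` lines, and the names `parse_makefile_text`, `write_cache_module`, `load_cache_module`, `build_time_vars`, and `_demo_sysconfigdata` are chosen for illustration. The point is that parsing happens once, at build time, and startup only imports a module containing a dict literal.

```python
import importlib.util
import pprint
import re
import tempfile
from pathlib import Path

# Simplified NAME = value matcher; real Makefile syntax (variable
# interpolation, continuation lines) is deliberately ignored here.
_ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")


def parse_makefile_text(text):
    """Collect simple NAME = value assignments into a dict."""
    variables = {}
    for line in text.splitlines():
        match = _ASSIGNMENT.match(line.strip())
        if match:
            name, value = match.groups()
            variables[name] = value.strip()
    return variables


def write_cache_module(variables, path):
    """Write the parsed variables as an importable Python module."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("# Generated at build time; do not edit.\n")
        f.write("build_time_vars = ")
        pprint.pprint(variables, stream=f)


def load_cache_module(path):
    """Import the generated module and return its dict, with no parsing step."""
    spec = importlib.util.spec_from_file_location("_demo_sysconfigdata", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.build_time_vars


SAMPLE_MAKEFILE = "CC = gcc\nCFLAGS = -O2 -g\n# a comment\nprefix = /usr/local\n"

with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "_demo_sysconfigdata.py"
    # "Build time": parse once and write the cache module.
    write_cache_module(parse_makefile_text(SAMPLE_MAKEFILE), cache_path)
    # "Startup": read the variables back by importing the module.
    cached_vars = load_cache_module(cache_path)

print(cached_vars)
```

A generated Python module also benefits from the normal bytecode cache, so later imports do not even need to re-parse the dict literal.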
msg145358 - (view) Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2011-10-11 18:44
#11454 is another case where pre-parsing and pickling the regular expressions in the email module may improve import time considerably.
msg145397 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-12 15:47
#9878 should also help with start-up time.
msg145398 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-12 15:50
Actually, #9878 should supersede this bug: it proposes to generate a C module to avoid parsing Makefile and pyconfig.h, and your patch proposes to generate a Python module with the same goal.
msg145402 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-12 16:01
> Actually, #9878 should supersede this bug: it proposes to generate a C
> module to avoid parsing Makefile and pyconfig.h, and your patch
> proposes to generate a Python module with the same goal.

Well, #9878 doesn't have a patch, but perhaps Barry is willing to work on one. Also, if we have a pure Python solution, perhaps a C module isn't needed. The main advantage of the C solution, though, would be to avoid dubious parsing altogether, even at build time.
msg145423 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-10-12 19:52
Since #9878 proposes an *alternate* solution to *part* of the sysconfig problem, I disagree with 'supersede'. A Python solution would be more useful for other implementations if enough of the sysconfig info is not CPython specific.

A CPython design feature is that it parses and compiles Python code just once per run, and imported modules just once until the code changes (or might have). For functions, everything possible is put into a behind-the-scenes code object. So even inner functions are parsed and compiled just once.

The problem with sysconfig, it appears, is that it lacks the equivalent design feature and instead does the equivalent of re-parsing and re-compiling inner functions with each outer function call.
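The "parse once" property described above can be illustrated with a memoized parser. This is a generic sketch, not how sysconfig is implemented; `parse_config` and `parse_count` are hypothetical names, and the input format is a simplified `NAME=value` list.

```python
import functools
import types

parse_count = 0  # counts real parses, for demonstration only


@functools.lru_cache(maxsize=None)
def parse_config(text):
    """Parse NAME=value lines once per distinct input text."""
    global parse_count
    parse_count += 1
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    # A read-only view keeps callers from mutating the shared cached result.
    return types.MappingProxyType({k.strip(): v.strip() for k, v in pairs})


first = parse_config("CC=gcc\nprefix=/usr/local")
second = parse_config("CC=gcc\nprefix=/usr/local")
print(first is second, parse_count)
```

This only avoids repeated work within one process; avoiding the parse on every interpreter startup requires a result that persists across runs.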
msg145458 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-13 15:40
> Since #9878 proposes an *alternate* solution to *part* of the
> sysconfig problem, I disagree with 'supersede'.

It’s also an older issue.

> A Python solution would be more useful for other implementations
> if enough of the sysconfig info is not CPython specific.

That’s the point: the info currently parsed at runtime by sysconfig is specific to CPython (Makefile and pyconfig.h), so adding a CPython-specific C module was thought the way to go.
msg145462 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-13 15:50
> > A Python solution would be more useful for other implementations
> > if enough of the sysconfig info is not CPython specific.
> That’s the point: the info currently parsed at runtime by sysconfig is
> specific to CPython (Makefile and pyconfig.h), so adding a
> CPython-specific C module was thought the way to go.

A module doesn't have to be written in C to be CPython-specific.
msg145822 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-18 15:59
New changeset 70160b53117f by Antoine Pitrou in branch 'default': Issue #13150: sysconfig no longer parses the Makefile and config.h files http://hg.python.org/cpython/rev/70160b53117f
msg145823 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-10-18 16:00
Done! If someone wants to give life to the C approach, they are welcome :)
msg145838 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-10-18 16:35
BTW, distutils2 backports the sysconfig module and cfg file from 3.3, so now the two versions will diverge.
msg145870 - (view) Author: Roumen Petrov (rpetrov) * Date: 2011-10-18 21:39
10x for solution, 10x for commit. Goodbye cross compilation! Any attempt to improve the Python build system to support cross-build, multilib build, or build outside the source tree with different options is useless.
msg145983 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-19 22:40
New changeset 677e625e2ef1 by Victor Stinner in branch 'default': Issue #13150: Add a comment in _sysconfigdata to explain the origin of this file http://hg.python.org/cpython/rev/677e625e2ef1
msg165323 - (view) Author: Matthias Klose (doko) * (Python committer) Date: 2012-07-12 16:58
The current ability to cross-build Python now relies on being able to run the build python with the host library, using the _sysconfigdata.py from the host. If somebody decides to implement _sysconfigdata as a C extension, please ensure that this information can still be passed to the build python.
msg184916 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-03-21 22:02
New changeset 66e30c4870bb by doko in branch '2.7': - Issue #13150: sysconfig no longer parses the Makefile and config.h files http://hg.python.org/cpython/rev/66e30c4870bb
msg184967 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-03-22 14:37
New changeset d174cb3f5b9e by Benjamin Peterson in branch '2.7': backout 66e30c4870bb for breaking OSX (#13150) http://hg.python.org/cpython/rev/d174cb3f5b9e
msg186332 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-04-08 19:20
New changeset be3b4aa2ad28 by doko in branch '2.7': - Issue #13150, #17512: sysconfig no longer parses the Makefile and config.h http://hg.python.org/cpython/rev/be3b4aa2ad28
History
Date User Action Args
2022-04-11 14:57:22 admin set github: 57359
2013-04-08 19:20:25 python-dev set messages: +
2013-03-22 14:37:59 python-dev set messages: +
2013-03-21 22:02:22 python-dev set messages: +
2012-07-12 16:58:07 doko set nosy: + doko; messages: +
2011-10-19 22:40:57 python-dev set messages: +
2011-10-18 21:39:19 rpetrov set nosy: + rpetrov; messages: +
2011-10-18 17:32:50 Arfrever set nosy: + Arfrever
2011-10-18 16:35:46 eric.araujo set messages: +
2011-10-18 16:00:25 pitrou set status: open -> closed; resolution: fixed; messages: +; stage: resolved
2011-10-18 15:59:33 python-dev set messages: +
2011-10-13 15:50:28 pitrou set messages: +
2011-10-13 15:40:52 eric.araujo set messages: +
2011-10-13 00:07:52 jcon set nosy: + jcon
2011-10-12 19:52:06 terry.reedy set messages: +
2011-10-12 16:01:11 pitrou set nosy: + barry; messages: +
2011-10-12 15:50:01 eric.araujo set messages: +
2011-10-12 15:48:38 vstinner set nosy: + vstinner
2011-10-12 15:47:32 eric.araujo set messages: +
2011-10-11 18:44:56 rosslagerwall set nosy: + rosslagerwall; messages: +
2011-10-11 14:53:08 pitrou set files: + sysconfigdata.patch; keywords: + patch; messages: +
2011-10-11 14:13:56 pitrou set messages: +
2011-10-11 14:13:31 pitrou set messages: +
2011-10-11 14:11:16 python-dev set messages: +
2011-10-11 14:07:34 tarek set messages: +
2011-10-11 13:49:42 python-dev set nosy: + python-dev; messages: +
2011-10-11 13:37:04 pitrou set messages: +
2011-10-11 09:32:26 nadeem.vawda set nosy: + nadeem.vawda
2011-10-11 03:04:20 ezio.melotti set nosy: + ezio.melotti
2011-10-11 03:02:57 pitrou create