Issue 29585: site.py imports relatively large sysconfig module.

Created on 2017-02-17 09:57 by methane, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 136 merged methane,2017-02-17 09:57
PR 2476 closed methane,2017-06-28 15:43
PR 2477 merged vstinner,2017-06-28 16:02
PR 2478 closed vstinner,2017-06-28 16:15
PR 2483 merged methane,2017-06-29 05:59
PR 2927 merged ned.deily,2017-07-28 06:43
PR 2928 merged methane,2017-07-28 11:27
Messages (22)
msg287981 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-02-17 09:57
site.py uses the sysconfig module (and, through it, sysconfigdata and _osx_support) to compute the user site-packages path. But sysconfig is not lightweight, and it is very rarely used: in the stdlib, only the tests and distutils use it. It takes about 7% of startup time, only to find the user-site path.

I tried to port a minimal subset of sysconfig into site.py (GH-136). But 'PYTHONFRAMEWORK' is only available in sysconfigdata, so I couldn't get rid of the sysconfig dependency completely.

How can I solve this?

a) Drop "osx_framework_user" (`~/Library/Python/3.7/`) support completely.
b) Add a "sys._osx_framework" attribute.
c) Create a minimal sysconfigdata only for site.py.
d) Anything else?
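Whether sysconfig is pulled in at startup can be checked on any given interpreter by starting a child process and inspecting `sys.modules`. The following is only a sketch: the helper name `loaded_modules` is made up for illustration, and the result depends on the Python version and platform (interpreters that include the GH-136 change are expected not to list `sysconfig` in most configurations).

```python
import subprocess
import sys


def loaded_modules(extra_args=()):
    """Return the names in sys.modules right after a fresh interpreter starts.

    Hypothetical helper for illustration; it is not part of site.py.
    """
    code = "import sys; print('\\n'.join(sorted(sys.modules)))"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    return set(result.stdout.split())


if __name__ == "__main__":
    with_site = loaded_modules()
    without_site = loaded_modules(["-S"])  # -S skips "import site"
    print("sysconfig imported at startup:", "sysconfig" in with_site)
    print("modules added by site:", sorted(with_site - without_site))
```

Comparing the default run against a `-S` run shows which imports are attributable to site.py alone.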
msg287983 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-02-17 10:20
Instead of using the slow sysconfig module and loading the big _sysconfig_data dictionary in memory, would it be possible to extract the minimum set of sysconfig needed by the site module and put it in a builtin module?

In site.py, I only found 4 variables:

    from sysconfig import get_config_var
    USER_BASE = get_config_var('userbase')

    from sysconfig import get_path
    USER_SITE = get_path('purelib', 'osx_framework_user')
    USER_SITE = get_path('purelib', '%s_user' % os.name)

    from sysconfig import get_config_var
    framework = get_config_var("PYTHONFRAMEWORK")

Because of the site module, the _sysconfig_data module dictionary is always loaded in memory, even for a dummy print("Hello World!").

I suggest starting to build a _site builtin module: a subset of site.py which would avoid sysconfig and reimplement things in C for best performance.

speed.python.org:

* python_startup: 14 ms
* python_startup_nosite: 8 ms

Importing site takes 6 ms: 42% of 14 ms... I'm interested to know if it would be possible to reduce these 6 ms by rewriting some parts of site.py in C.
msg287984 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-02-17 10:23
Serhiy collected interesting numbers; this is a copy/paste of his message: http://bugs.python.org/issue28637#msg280380

On my computer:

Importing empty module: 160 us
Creating empty class: 30 us
Creating empty function: 0.16 us
Creating empty Enum/IntEnum: 125/150 us
Creating Enum/IntEnum member: 25/27 us
Creating empty namedtuple: 600 us
Creating namedtuple member: 50 us
Importing the itertools module: 40 us
Importing the io module: 900 us
Importing the os module: 1600 us
Importing the functools module: 2100 us
Importing the re module (with all sre submodules): 3300 us
Python startup time: 43000 us
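Per-module import costs like the ones above can be approximated on Python 3.7 and later with the `-X importtime` option, which writes one line per import to stderr. The sketch below parses that output; `parse_importtime` and `measure` are illustrative names, not existing APIs, and the timings will differ from machine to machine.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Parse `python -X importtime` output into {module: cumulative_us}."""
    timings = {}
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative_us = int(fields[1])
        except ValueError:
            continue  # header line: "self [us] | cumulative | imported package"
        # Nested imports are indented in the last column; strip that.
        timings[fields[2].strip()] = cumulative_us
    return timings


def measure(module):
    """Import `module` in a fresh interpreter; return cumulative time in us.

    Returns None if the import was not logged at all.
    """
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_importtime(result.stderr).get(module)


if __name__ == "__main__":
    for name in ("sysconfig", "functools", "re"):
        print(name, measure(name), "us")
```

A module that is already imported during startup is reported with the cost of that first import, so the figure for such a module reflects startup, not the explicit `import` statement.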
msg287985 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-02-17 10:33
What's your platform, Inada? Are you running macOS? I optimized site.py for Linux and BSD users a couple of years ago.
msg287988 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-02-17 11:16
Christian: I'm using macOS at the office and Linux at home. sysconfig is imported even on Linux:
https://github.com/python/cpython/blob/master/Lib/site.py#L247-L248
https://github.com/python/cpython/blob/master/Lib/site.py#L263-L271
msg287990 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2017-02-17 11:45
I don't think rewriting parts of site.py in C is a good idea. It's a rather maintenance-intensive module.

However, optimizing access is certainly something that's possible, e.g. by placing the few variables that are actually needed by site.py into a bootstrap module for sysconfig, which only contains the few variables needed by interpreter startup.

Alternatively, sysconfig data could be made available via a C lookup function, with the complete dictionary only being created on demand. get_config_var() already is such a lookup API which could be used as a front-end.
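The "bootstrap values plus on-demand dictionary" idea can be sketched in pure Python as follows. Everything here is hypothetical: the class name `LazyConfig`, its parameters, and the empty bootstrap value are illustrative only, and the proposal in the message concerns a C lookup function, which this sketch does not implement.

```python
import sysconfig


class LazyConfig:
    """Serve a few startup variables cheaply; build the full dict on demand.

    Hypothetical illustration, not an existing sysconfig API.
    """

    def __init__(self, bootstrap, loader):
        self._bootstrap = dict(bootstrap)  # small set needed at startup
        self._loader = loader              # expensive: builds the full dict
        self._full = None

    def get(self, name):
        if name in self._bootstrap:
            return self._bootstrap[name]
        if self._full is None:
            # First request outside the bootstrap set pays the full cost.
            self._full = self._loader()
        return self._full.get(name)


if __name__ == "__main__":
    # Assumption: an empty string stands for "not a framework build".
    config = LazyConfig({"PYTHONFRAMEWORK": ""}, sysconfig.get_config_vars)
    print(repr(config.get("PYTHONFRAMEWORK")))  # no full dictionary built
    print(config.get("prefix"))                 # triggers the full load
```

With this shape, a startup path that only asks for bootstrap variables never triggers the expensive load.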
msg287997 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-02-17 12:06
Marc-Andre Lemburg added the comment:
> I don't think rewriting parts of site.py in C is a good idea. It's a rather maintenance-intensive module.
>
> However, optimizing access is certainly something that's possible, e.g. by placing the few variables that are actually needed by site.py into a bootstrap module for sysconfig, which only contains the few variables needed by interpreter startup.

Right, I don't propose to rewrite the 598 lines of site.py in C, but only to rewrite the parts which have a huge impact on the startup time. It seems like the minimum would be to write a _site module which provides the 4 variables currently read from sysconfig.

I'm proposing to add a new private module because I don't want to pollute site, which already contains too many things.

I looked at the site.py history: I don't see *major* changes in the last 2 years. Only small enhancements, updates and fixes.

> Alternatively, sysconfig data could be made available via a C lookup function; with the complete dictionary only being created on demand. get_config_var() already is such a lookup API which could be used as front-end.

I don't think that it's worth it to partially reimplement sysconfig in C. This module is huge, complex, and platform dependent.

Well, I'm not sure what the best approach is, but I'm sure that we can do something to optimize site.py. 6 ms is a lot!

I never liked site.py. It seems like a huge workaround. I also dislike having a different behaviour depending on whether site is imported or not. That's why I asked Steve Dower to move the code that creates the cpXXX alias for the mbcs codec from site.py to encodings/__init__.py: see commit f5aba58480bb0dd45181f609487ac2ecfcc98673. I'm happy that this code was removed from site.py!
msg287999 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-02-17 12:14
Instead of _site, would it make sense to include the four vars in sys, perhaps as a named structure like sys.flags?
msg288000 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-02-17 12:28
FYI, here is a profile of site: https://gist.github.com/methane/1f1fe4385dad84f03eb429359f0f917b
msg288001 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-02-17 12:46
no-site:
python_startup_no_site: Median +- std dev: 9.13 ms +- 0.02 ms

default:
python_startup: Median +- std dev: 15.6 ms +- 0.0 ms

GH-136 + skip abs_paths():
python_startup: Median +- std dev: 14.2 ms +- 0.0 ms

Profile of GH-136 + skip abs_paths(): https://gist.github.com/methane/26fc0a2382207655a6819a92f867620c

Most of the time is consumed by importlib.
msg288012 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2017-02-17 15:30
On 17.02.2017 13:06, STINNER Victor wrote:
>> Alternatively, sysconfig data could be made available via a C lookup function; with the complete dictionary only being created on demand. get_config_var() already is such a lookup API which could be used as front-end.
>
> I don't think that it's worth it to partially reimplement sysconfig in C. This module is huge, complex, and platform dependent.

Sorry, I was just referring to the data part of sysconfig, not sysconfig itself.

Having a lookup function, much like the one we have for unicodedata, makes things much more manageable, since you don't need to generate a dictionary in memory for all the values in the config data. Creating that dictionary takes a while (in terms of ms).
msg288020 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-02-17 16:42
I created #29592 for abs_paths(). Let's focus on sysconfig in this issue.

PR 136 already ports the part of sysconfig that is really needed into site.py. 'PYTHONFRAMEWORK' on macOS is the only variable we still need to import from sysconfig.

Would adding a `site.cfg`, like `pyvenv.cfg`, make sense?
msg288057 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-02-18 05:06
PR 136 now adds `sys._framework` and a 'PYTHONFRAMEWORK' macro in pyconfig.h.
msg297192 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-06-28 15:31
New changeset a8f8d5b4bd30dbe0828550469d98f12d2ebb2ef4 by INADA Naoki in branch 'master': bpo-29585: optimize site.py startup time (GH-136) https://github.com/python/cpython/commit/a8f8d5b4bd30dbe0828550469d98f12d2ebb2ef4
msg297194 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-28 16:10
Test fails on macOS: http://buildbot.python.org/all/builders/x86-64%20Sierra%203.x/builds/402/steps/test/logs/stdio

======================================================================
FAIL: test_getsitepackages (test.test_site.HelperFunctionsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.billenstein-sierra/build/Lib/test/test_site.py", line 266, in test_getsitepackages
    self.assertEqual(len(dirs), 2)
AssertionError: 1 != 2
msg297197 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-28 16:34
New changeset b01c574ad6d796025152b5d605eceb7816e6f7a7 by Victor Stinner in branch 'master': bpo-29585: Define PYTHONFRAMEWORK in PC/pyconfig.h (#2477) https://github.com/python/cpython/commit/b01c574ad6d796025152b5d605eceb7816e6f7a7
msg297258 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-06-29 06:31
New changeset 6b42eb17649bed9615b6e6cecaefdb2f46990b2c by INADA Naoki in branch 'master': bpo-29585: Fix sysconfig.get_config_var("PYTHONFRAMEWORK") (GH-2483) https://github.com/python/cpython/commit/6b42eb17649bed9615b6e6cecaefdb2f46990b2c
msg299368 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-07-28 06:30
test_get_path fails on macOS installed framework builds:

======================================================================
FAIL: test_get_path (test.test_site.HelperFunctionsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/nad/Projects/PyDev/active/dev/3x/root/fwd_macports/Library/Frameworks/pytest_10.12.framework/Versions/3.7/lib/python3.7/test/test_site.py", line 188, in test_get_path
    sysconfig.get_path('purelib', os.name + '_user'))
AssertionError: '/Users/nad/Library/pytest_10.12/3.7/lib/python/site-packages' != '/Users/nad/Library/pytest_10.12/3.7/lib/python3.7/site-packages'
- /Users/nad/Library/pytest_10.12/3.7/lib/python/site-packages
+ /Users/nad/Library/pytest_10.12/3.7/lib/python3.7/site-packages
?                                               +++

----------------------------------------------------------------------
Ran 27 tests in 0.471s

FAILED (failures=1, skipped=4)
msg299371 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-07-28 07:02
New changeset c22bd58d933efaec26d1f77f263b2845473b7e15 by Ned Deily in branch 'master': bpo-28095: Re-enable temporarily disabled part of test_startup_imports on macOS (#2927) https://github.com/python/cpython/commit/c22bd58d933efaec26d1f77f263b2845473b7e15
msg299379 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-07-28 11:16
https://docs.python.org/3.6/library/site.html#site.USER_SITE
> ~/Library/Python/X.Y/lib/python/site-packages for Mac framework builds

So it seems I broke sysconfig.get_path('purelib', 'posix_user').
msg299380 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-07-28 11:20
https://github.com/python/cpython/pull/136/files

- if sys.platform == 'darwin':
-     from sysconfig import get_config_var
-     if get_config_var('PYTHONFRAMEWORK'):
-         USER_SITE = get_path('purelib', 'osx_framework_user')
-         return USER_SITE
+ if USER_SITE is None:
+     USER_SITE = _get_path(userbase)

OK, I need to use `osx_framework_user` instead of os.name + '_user' on framework builds.
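The scheme selection described here can be sketched as below. This is an illustration under stated assumptions, not the code merged in GH-2928: `user_scheme` is a made-up helper, and it assumes that the `sys._framework` attribute mentioned earlier in this issue is a non-empty string only on framework builds.

```python
import os
import sys


def user_scheme(platform=sys.platform, framework=None, os_name=os.name):
    """Pick the sysconfig install scheme name for the user site directory.

    Hypothetical helper; the parameters exist only to make it testable.
    """
    if framework is None:
        # Assumption: sys._framework is "" on non-framework builds and may
        # be missing entirely on interpreters without the GH-136 change.
        framework = getattr(sys, "_framework", "")
    if platform == "darwin" and framework:
        return "osx_framework_user"
    return os_name + "_user"


if __name__ == "__main__":
    import sysconfig

    scheme = user_scheme()
    print(scheme, sysconfig.get_path("purelib", scheme))
```

On a macOS framework build this selects `osx_framework_user`, whose purelib path is the `lib/python/site-packages` form quoted from the site documentation earlier in this issue, so site and sysconfig agree again.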
msg299383 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-07-28 12:28
New changeset ba9ddb7eea39a651ba7f1ab3eb012e4129c03620 by INADA Naoki in branch 'master': bpo-29585: fix test fail on macOS Framework build (GH-2928) https://github.com/python/cpython/commit/ba9ddb7eea39a651ba7f1ab3eb012e4129c03620
History
Date User Action Args
2022-04-11 14:58:43 admin set github: 73771
2017-07-28 12:35:27 methane set status: open -> closed; resolution: fixed; stage: needs patch -> resolved
2017-07-28 12:28:22 methane set messages: +
2017-07-28 11:27:37 methane set pull_requests: + pull_request2980
2017-07-28 11:20:22 methane set messages: +
2017-07-28 11:16:21 methane set messages: +
2017-07-28 07:02:13 ned.deily set messages: +
2017-07-28 06:43:24 ned.deily set pull_requests: + pull_request2979
2017-07-28 06:30:57 ned.deily set status: closed -> open; nosy: + ned.deily; messages: +; resolution: fixed -> (no value); stage: resolved -> needs patch
2017-06-30 22:03:22 ned.deily link issue30795 superseder
2017-06-29 06:32:11 methane set status: open -> closed; resolution: fixed; stage: resolved
2017-06-29 06:31:40 methane set messages: +
2017-06-29 05:59:46 methane set pull_requests: + pull_request2542
2017-06-28 16:34:44 vstinner set messages: +
2017-06-28 16:15:04 vstinner set pull_requests: + pull_request2532
2017-06-28 16:10:22 vstinner set messages: +
2017-06-28 16:02:02 vstinner set pull_requests: + pull_request2531
2017-06-28 15:43:05 methane set pull_requests: + pull_request2530
2017-06-28 15:31:56 methane set messages: +
2017-02-20 23:28:55 gregory.p.smith set nosy: + gregory.p.smith
2017-02-18 05:06:25 methane set messages: +
2017-02-17 18:53:40 eric.araujo set nosy: + eric.araujo
2017-02-17 16:42:40 methane set messages: +
2017-02-17 15:30:39 lemburg set messages: +
2017-02-17 12:46:11 methane set messages: +
2017-02-17 12:28:46 methane set messages: +
2017-02-17 12:14:05 christian.heimes set messages: +
2017-02-17 12:06:23 vstinner set messages: +
2017-02-17 11:45:59 lemburg set nosy: + lemburg; messages: +
2017-02-17 11:16:31 methane set messages: +
2017-02-17 10:33:35 christian.heimes set messages: +
2017-02-17 10:32:02 christian.heimes set nosy: + christian.heimes
2017-02-17 10:23:08 vstinner set messages: +
2017-02-17 10:20:36 vstinner set nosy: + vstinner; messages: +
2017-02-17 09:57:03 methane create