[Python-Dev] startup time repeated? why not daemon
Nick Coghlan ncoghlan at gmail.com
Thu Jul 20 23:49:53 EDT 2017
On 21 July 2017 at 10:19, Nathaniel Smith <njs at pobox.com> wrote:
> I'm not sure either of these make much sense when python startup is already in the single digit milliseconds. While it's certainly great if we can lower that further, my impression is that for any real application, startup time is overwhelmingly spent importing user packages, not in the interpreter start up itself. And this is difficult to optimize with a daemon or memory dump, because you need a full list of modules to preload and it'll differ between programs.
>
> This suggests that optimizations to finding/loading/executing modules are likely to give the biggest startup time wins.
Agreed, and this is where both lazy loading and Cython precompilation are genuinely interesting:
- Cython precompilation can have a significant impact on startup time, as it replaces module level code execution at import time with a combination of Cython translation to C code at build time and Python C API calls at import time
- Lazy loading can have a significant impact on startup time, as it means you don't have to pay for the cost of finding and loading modules that you don't actually end up using on that particular run
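As a rough illustration of the lazy loading point, here is a minimal sketch built on importlib.util.LazyLoader from the standard library (available since Python 3.5). The helper name lazy_import is made up for this example, and colorsys is just an arbitrary module that is unlikely to be imported already. The module body only runs on first attribute access, so a run that never touches the module does not pay the cost of executing it.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body is executed on first attribute access.

    Hypothetical helper for illustration, not an existing stdlib function.
    """
    # If something already imported the module, there is nothing to defer.
    if name in sys.modules:
        return sys.modules[name]

    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError("cannot find a loader for module %r" % name)

    # Wrap the real loader so that executing the module is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arranges for deferred execution.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")
# The module body runs here, on the first attribute lookup.
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
```

Note that locating the module (find_spec) still happens eagerly in this sketch; only the execution is deferred. The debugging cost mentioned below is visible here too: an error in the module body surfaces at the first attribute access rather than at the import site.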
We've historically resisted adopting these techniques for the standard library because they do make things more complicated and harder to debug relative to plain old eagerly imported dynamic Python code. However, if we're going to recommend them as good practices for 3rd party developers looking to optimise the startup time of their Python applications, then it makes sense for us to embrace them for the standard library as well, rather than having our first reaction be to write more hand-crafted C code.
On that last point, it's also worth keeping in mind that we have a much harder time finding new C-level contributors than we do new Python-level ones, and have every reason to expect that problem to get worse over time rather than better (since writing and maintaining handcrafted C code is likely to go the way of writing and maintaining handcrafted assembly code as a skillset: while it will still be genuinely necessary in some contexts, it will also be an increasingly niche technical specialty).
Starting to migrate to using Cython for our acceleration modules instead of plain C should thus prove to be a win for everyone:
- Cython structurally avoids a lot of typical bugs that arise in hand-coded extensions (e.g. refcount bugs)
- by design, it's much easier to mentally switch between Python & Cython than it is between Python & C
- Cython accelerated modules are easier to adapt to other interpreter implementations than handcrafted C modules
- keeping Python modules and their C accelerated counterparts in sync will be easier, as they'll mostly be using the same code
- we'd be able to start writing C API test cases in Cython rather than in handcrafted C (which currently mostly translates to only testing them indirectly)
- CPython's own test suite would naturally help test Cython compatibility with any C API updates
- we'd have an inherent incentive to help enhance Cython to take advantage of new C API features
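To make the "keeping in sync" point concrete, here is a small sketch of the arrangement we have today: a pure-Python implementation, an optional hand-written C accelerator, and a check that the two agree. It uses bisect_left purely as an example; the names py_bisect_left, c_bisect_left and implementations_agree are made up for illustration, and the pure-Python function is a simplified stand-in rather than the actual stdlib source.

```python
def py_bisect_left(a, x):
    """Pure-Python reference: leftmost insertion index for x in sorted a."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


try:
    # CPython ships a hand-written C accelerator for bisect; other
    # interpreters may not provide it, hence the fallback.
    from _bisect import bisect_left as c_bisect_left
except ImportError:
    c_bisect_left = None

# Public name: prefer the accelerator when it is available.
bisect_left = c_bisect_left if c_bisect_left is not None else py_bisect_left


def implementations_agree(data, probes):
    """Check that both implementations give the same answer for each probe."""
    if c_bisect_left is None:
        return True  # nothing to compare against on this interpreter
    return all(py_bisect_left(data, p) == c_bisect_left(data, p) for p in probes)


in_sync = implementations_agree(list(range(0, 20, 2)), range(-1, 21))
```

With two separately written implementations, a check like this is the only thing tying them together. If the accelerated version were instead compiled by Cython from largely the same source, most of that duplication would go away.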
There are some genuine downsides in increasing the complexity of bootstrapping CPython when all you're starting with is a VCS clone and a C compiler, but those complications are ultimately no worse than those we already have with Argument Clinic, and hence amenable to the same solution: if we need to, we can check in the generated C files in order to make bootstrapping easier.
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia