[Python-Dev] startup time repeated? why not daemon
Nathaniel Smith njs at pobox.com
Thu Jul 20 20:19:19 EDT 2017
- Previous message (by thread): [Python-Dev] startup time repeated? why not daemon
- Next message (by thread): [Python-Dev] startup time repeated? why not daemon
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Jul 20, 2017 14:18, "Eric Snow" <ericsnowcurrently at gmail.com> wrote:

> On Thu, Jul 20, 2017 at 11:53 AM, Jim J. Jewett <jimjjewett at gmail.com> wrote:
>> I agree that startup time is a problem, but I wonder if some of the pain could be mitigated by using a persistent process.
>>
>> [snip] Is it too hard to create a daemon server? Is the communication and context switch slower than a new startup? Is the pattern just not well-enough advertised?
>
> A couple years ago I suggested the same idea (i.e. "pythond") during a conversation with MAL at PyCon UK. IIRC, security and complexity were the two major obstacles. Assuming you use fork, you must ensure that the daemon gets into just the right state. Otherwise you're leaking (potentially sensitive) info into the forked processes or you're wasting cycles/memory.
>
> Relatedly, at PyCon this year Barry and I were talking about the idea of bootstrapping the interpreter from a memory snapshot on disk, rather than from scratch (thus drastically reducing the number of IO events). From what I gather, emacs does (or did) something like this.
There's a fair amount of prior art for both of these. The prestart/daemon approach is apparently somewhat popular in the Java world, because the JVM is super slow to start up. E.g.:

https://github.com/ninjudd/drip

The interesting thing there is probably their README's comparison of their strategy to previous attempts at the same thing. (They've explicitly moved away from a persistent daemon approach.)
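For concreteness, here is a rough sketch of what the fork-based prestart idea quoted above could look like in Python. It is only an illustration, not how drip or any real "pythond" works: the `PRELOAD` list and the `preload` and `run_in_child` helpers are made-up names, it relies on `os.fork` and so is POSIX-only, and it ignores the security and state-leak issues Eric mentions.

```python
import os
import sys

# Hypothetical preload list; a real daemon would need one per application.
PRELOAD = ["json", "decimal"]


def preload(names):
    """Import each module once in the parent so forked children inherit it."""
    for name in names:
        __import__(name)


def run_in_child(func):
    """Fork, run func() in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: everything the parent imported is already in sys.modules.
        status = 1  # reported if func() raises
        try:
            status = int(func() or 0)
        finally:
            os._exit(status)  # never fall back into the parent's code path
    # Parent: wait for the child and decode its exit status.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -os.WTERMSIG(wait_status)  # negative signal number if killed


if __name__ == "__main__":
    preload(PRELOAD)
    code = run_in_child(lambda: 0 if "decimal" in sys.modules else 1)
    print("child exit status:", code)
```

Note that everything the parent has loaded, not just the modules on the list, is visible in each child, which is exactly the "just the right state" problem.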
The emacs memory dump approach is really challenging. They've been struggling to move away from it for years. Have you ever wondered why jemalloc is so much better than the default glibc malloc on Linux? Apparently it's because for many years it was impossible to improve glibc malloc's internal memory layout because it would break emacs.
I'm not sure either of these makes much sense when python startup is already in the single-digit milliseconds. While it's certainly great if we can lower that further, my impression is that for any real application, startup time is overwhelmingly spent importing user packages, not in the interpreter startup itself. And this is difficult to optimize with a daemon or memory dump, because you need a full list of modules to preload and it'll differ between programs.
This suggests that optimizations to finding/loading/executing modules are likely to give the biggest startup time wins.
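If you want to check this on your own machine, something like the following rough sketch compares a bare interpreter launch against one that also imports a few modules. The `time_startup` helper and the particular stdlib modules are arbitrary choices for illustration; substitute whatever your application actually imports.

```python
import subprocess
import sys
import time


def time_startup(code, repeats=5):
    """Return the best wall-clock time (seconds) to run `python -c code`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Launch a fresh interpreter so startup and import costs are included.
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    bare = time_startup("pass")
    with_imports = time_startup("import json, decimal, argparse")
    print(f"bare interpreter: {bare * 1000:.1f} ms")
    print(f"with imports:     {with_imports * 1000:.1f} ms")
```

The numbers include process-spawn overhead and will vary a lot between systems. On Python 3.7 and later, `python -X importtime` gives a per-module breakdown of where the import time goes.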
-n