[Python-Dev] Investigating time for import requests

Koos Zevenhoven k7hoven at gmail.com
Sun Oct 8 11:13:19 EDT 2017


On Sun, Oct 8, 2017 at 11:02 AM, David Cournapeau <cournape at gmail.com> wrote:

> On Mon, Oct 2, 2017 at 6:42 PM, Raymond Hettinger <raymond.hettinger at gmail.com> wrote:

>> On Oct 2, 2017, at 12:39 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>>> "What requests uses" can identify a useful set of avoidable imports.
>>> A Flask "Hello world" app could likely provide another such sample,
>>> as could some example data analysis notebooks.
>>
>> Right. It is probably worthwhile to identify which parts of the library
>> are typically imported but are never actually used. And likewise, to
>> identify a core set of commonly used tools that are going to be almost
>> unavoidable in sufficiently interesting applications (like using
>> requests to access a REST API, running a micro web framework, or
>> invoking mercurial).
>>
>> Presumably, if any of this is going to make a difference to end users,
>> we need to see whether there is any avoidable work that takes a
>> significant fraction of the total time from invocation through the
>> point where the user first sees meaningful output. That would include
>> loading from nonvolatile storage, executing the various imports, and
>> doing the actual application work.
>>
>> I don't expect to find anything that would help users of Django, Flask,
>> and Bottle, since those are typically long-running apps where we value
>> response time more than startup time.
>>
>> For scripts using the requests module, there will be some fruit because
>> not everything that is imported is used. However, that may not be
>> significant because scripts using requests tend to be I/O bound. In the
>> timings below, 6% of the running time is used to load and run
>> python.exe, another 16% is used to import requests, and the remaining
>> 78% is devoted to the actual task of running a simple REST API query.
>>
>> It would be interesting to see how much of the 16% could be avoided
>> without major alterations to requests, to urllib3, and to the standard
>> library.
>
> It is certainly true that for a CLI tool that actually does any network
> I/O, especially SSL, import times quickly become negligible. It becomes
> tricky for complex tools because of error management. For example, a
> common pattern I have used in the past is to have a high-level "catch
> all exceptions" function that dispatches the CLI command:
>
>     try:
>         mainfunction(...)
>     except ErrorKind1:
>         ...
>     except requests.exceptions.SSLError:
>         # give a complete message about options when receiving SSL
>         # errors, e.g. an invalid certificate
>         ...
>
> This pattern requires importing requests every time the command is run,
> even if no network I/O is actually done. For complex CLI tools, maybe
> most commands don't use network I/O (the tool in question was a complete
> package manager), but you pay ~100 ms for the requests import on every
> command. It is particularly visible because command latency starts to be
> felt around 100-150 ms, and while you can do a lot in Python in 100-150
> ms, you can't do much in 0-50 ms.

Yes. OTOH, it can also happen that the imports are in fact what use the
network I/O. At the office, I usually import from a network drive. For
instance, import requests takes a little less than a second, and import
IPython usually takes more than a second, with some variation.
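For anyone who wants to reproduce the kind of breakdown Raymond
describes, here is a rough sketch (just an illustration, with made-up
names like timed_run, not the measurement he actually used) that
separates interpreter startup from the cost of importing requests by
timing fresh subprocesses:

    import subprocess
    import sys
    import time

    def timed_run(code):
        # Wall-clock time for a fresh interpreter executing `code`.
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        return time.perf_counter() - start

    startup = timed_run("pass")                   # interpreter startup alone
    with_requests = timed_run("import requests")  # startup + import
    print("startup:         %.3f s" % startup)
    print("import requests: %.3f s" % (with_requests - startup))

On a new enough CPython (the -X importtime flag that recently landed in
the 3.7 branch), running python -X importtime -c "import requests"
should also give a per-module breakdown of where that import time goes.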
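As for the catch-all handler, one way to avoid paying for requests on
commands that never touch the network could be to import it lazily in
the commands that need it, and have the handler look it up in
sys.modules. A sketch of the idea (run_command here is just a stand-in
for the real dispatcher in such a tool):

    import sys

    def run_command(args):
        # Stand-in for the real dispatcher; only the sub-commands that
        # actually hit the network do "import requests" inside their
        # own function bodies.
        ...

    def main(args):
        try:
            run_command(args)
        except Exception as exc:
            # requests is only present in sys.modules if some command
            # imported it, so network-free commands never pay for it.
            requests = sys.modules.get("requests")
            if requests is not None and isinstance(
                    exc, requests.exceptions.SSLError):
                print("SSL error; check your certificate options",
                      file=sys.stderr)
                return 1
            raise
        return 0

Whether that is worth the extra indirection obviously depends on the
tool, but it keeps the ~100 ms out of the commands that never go near
the network.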

-- Koos



