gh-69605: Add module autocomplete to PyREPL by tomasr8 · Pull Request #129329 · python/cpython
Thanks for taking the time to review this, I really appreciate it!
@tomasr8 can you try to play a bit with some edge cases (tons of packages, etc) to see how the performance is for extreme circumstances? I want to be aware of the ways this can "break"
When it comes to performance, the biggest bottleneck is the call to pkgutil.iter_modules(), which finds all top-level packages. The result is cached, so we pay that cost only the first time you hit TAB.
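To make the caching idea concrete, here is a minimal sketch of a lazily filled cache around pkgutil.iter_modules(). The class name `ModuleNameCache` and its methods are hypothetical and only illustrate the pattern; this is not the code from the PR.

```python
import pkgutil


class ModuleNameCache:
    """Hypothetical helper: scan top-level modules once, then reuse the result."""

    def __init__(self):
        self._names = None  # filled lazily on the first completion request

    def names(self):
        if self._names is None:
            # Expensive step: visits every search location on the file system.
            self._names = sorted({m.name for m in pkgutil.iter_modules()})
        return self._names

    def complete(self, prefix):
        # Cheap step: a prefix filter over the already cached names.
        return [name for name in self.names() if name.startswith(prefix)]


cache = ModuleNameCache()
print(cache.complete("js"))
```

Only the first call to `names()` touches the file system; later completions are a list scan, which is consistent with the gap between first and subsequent timings reported here.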
Here are some timings with lots of packages. I installed the top 1000 PyPI packages (or at least those that can be installed on 3.14, which came to 836 packages).
The timings are from a laptop with an Intel Core Ultra 9 185H CPU and a new-ish SSD (relevant because pkgutil.iter_modules interacts with the file system).
For those ~800 packages in a single location, pkgutil.iter_modules takes about 0.03s:
>>> import time, pkgutil
>>> start = time.time(); list(pkgutil.iter_modules()); time.time() - start
0.03072071075439453
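If you want to reproduce this kind of measurement, a slightly more robust variant might look like the sketch below. It assumes time.perf_counter() as the clock and repeats the scan a few times; the function name `time_iter_modules` is made up for illustration.

```python
import pkgutil
import time


def time_iter_modules(repeats=5):
    """Return per-run wall-clock timings for a full pkgutil.iter_modules() scan."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        modules = list(pkgutil.iter_modules())
        timings.append(time.perf_counter() - start)
    return timings, len(modules)


timings, count = time_iter_modules()
print(f"{count} top-level modules; first run {timings[0]:.4f}s, best run {min(timings):.4f}s")
```

The first run is the one closest to what a user sees on the first TAB press. Later runs can be faster because the OS file-system cache and the importer cache are already warm.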
Typing import <tab> takes about 0.03s as well, so most of the cost is really inside pkgutil.iter_modules:
import <tab>: 0.03162884712219238s
Subsequent completion requests are much faster since the results are cached:
import <tab>: 0.002469778060913086s
Another thing I tried is multiple search locations (i.e. multiple sys.path entries):
5 different locations and ~4000 packages total:
pkgutil.iter_modules: 0.3535459041595459s
Initial import <tab>: 0.3592982292175293s
Subsequent import <tab>: 0.0021576881408691406s
10 different locations and ~8000 packages total:
pkgutil.iter_modules: 0.6836235523223877s
Initial import <tab>: 0.6652779579162598s
Subsequent import <tab>: 0.0002522468566894531s
The initial search is about 0.35s and 0.68s respectively, while the subsequent ones are again very cheap.
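A rough way to try the multi-location scenario without installing thousands of packages is sketched below. It is a synthetic setup under stated assumptions: the helper names (`make_locations`, `time_scan`) are hypothetical, and each "package" is just an empty .py file in a temporary directory, so absolute numbers will not match the ones above.

```python
import pkgutil
import tempfile
import time
from pathlib import Path


def make_locations(root, n_locations, modules_per_location):
    """Create directories filled with empty, uniquely named .py files."""
    locations = []
    for i in range(n_locations):
        location = Path(root) / f"site_{i}"
        location.mkdir()
        for j in range(modules_per_location):
            (location / f"mod_{i}_{j}.py").touch()
        locations.append(str(location))
    return locations


def time_scan(locations):
    """Time one pkgutil.iter_modules() pass over the given search locations."""
    start = time.perf_counter()
    found = list(pkgutil.iter_modules(locations))
    return time.perf_counter() - start, len(found)


with tempfile.TemporaryDirectory() as root:
    locations = make_locations(root, n_locations=5, modules_per_location=200)
    elapsed, found = time_scan(locations)
    print(f"{found} modules in {len(locations)} locations: {elapsed:.4f}s")
```

Passing the directories explicitly to pkgutil.iter_modules() keeps the experiment isolated from the real sys.path. Varying `n_locations` and `modules_per_location` gives a feel for how the initial scan scales.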