bpo-35936: Updates to modulefinder by brandtbucher · Pull Request #11787 · python/cpython (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation36 Commits43 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
This patch includes a few useful fixes to Lib/modulefinder.py
. Namely:
- It doesn't crash if it encounters a syntax error (bpo-17396).
- It doesn't report certain name collisions as bad (bpo-35376).
- It doesn't use mutable default arguments in its initializer.
- Most importantly, it now uses
importlib
instead ofimp
, which is deprecated (bpo-25160, bpo-20020) As a benefit, frozen built-in modules (zipimport
, for example) are now correctly reported.
With the exception of these bug fixes, I've been very careful to preserve the original behavior, and have not changed any part of the API. This patch also includes 2 new regression tests.
https://bugs.python.org/issue35936
SyntaxErrors in the target module will rise normally, while SyntaxErrors in dependencies will be added to badmodules. This includes a new regression test.
This fixes an issue where a "fromlist" import with the same name as a previously failed import would be incorrectly added to badmodules. This includes a new regression test.
Bound empty lists have been replaced with the "if param is None" idiom.
Constants imported from imp have been moved to private module-level constants, and ModuleFinder.find_module has been refactored to use importlib. Other than an improvement on how frozen builtin imports are reported (as the frozen imports they are, rather than the stdlib modules they may have originated from), these changes maintain complete compatibility with past versions... including odd behavior for returning relative (below current directory, but not a C extension) vs. absolute (above current directory, or a C extension) paths.
Specifically, when checking whether the returned path should be relative or absolute.
Hopefully, this will help the failing tests on Windows.
This changeset makes a substantial change to the semantics of modulefinder: using importlib.util.find_spec
can change the import state of the script. In particular find_spec("package.module")
results in the importing of package
(and adds it to sys.modules
).
Furthermore using importlib
makes it close to impossible to use an alternate search path (the "path" argument for the constructor) due to the API of importlib. There are workarounds for this, but that probably requires replacing the entire import state (not just sys.path
as this pull request does) around calls to importlib functions.
P.S. For modulegraph2, which is a rewrite of modulegraph which itself is a library that uses code heavily inspired by modulefinder I choosen to take away the possibility to use an alternate search path.
@ronaldoussoren I see, thanks for the input. As a concrete example, we can see import side-effects of the parent module when running $ python -c 'import importlib; importlib.util.find_spec("this.that")'
. I'll dig into workarounds a bit more today. If nothing seems viable, as you mentioned, I'll probably revert those changes and leave the other fixes for merging.
It may be worth exploring a ground-up redesign, since the documented API is so small. I need a side-project, so I'd be willing to work on that if others feel it would be valuable.
importlib.util.find_spec(fullname, parent), where fullname contains a dot, results in the parent module being imported. This fix ensures that children are correctly located while also avoiding import side-effects by temporarily clearing sys.modules and modifying sys.path. Now, neither a dotted name nor parent are required.
I have a hunch the failing tests are due to a race condition in how sys.path updates. Sleep a bit here to see.
More test are passing, which supports my theory.
I guess that wasn't the issue...
I’ve tweaked the implementation to not need a dotted fullname
, avoiding the issues mentioned above. I am still getting (what appear to be) random test failures, I suspect somehow stemming from my modifications to sys.path
. This weird behavior is even referenced in an old comment in test_modulefinder.py
. I’m not able to reproduce the failures on my own machine.
Unless someone sees a clear fix for this behavior, I think I’ll just revert the new importlib
bit of this PR. Bummer.
This will be replaced with several more specific entries detailing the various bug fixes.
Perhaps there is some underlying reference issue with straight-up sys.path replacement. Once again, fingers crossed!
Perhaps test threads are stomping on each others' sys.path?
These are processes, after all...
This is a longshot, but it's pretty much all I have left in terms of ideas.
I've also added helpful documentation outlining the reasoning behind various design decisions.
Identity checking is sufficient here.
This is more robust than using a string comparison with "init".
This isn't actually needed (or wanted) here.
This is probably a better idea than using the extension alone.
Interestingly, it looks like PathFinder does this for us! That's great, because it really cleans up this section a lot.
Hopefully we can get a tiny boost here by checking likely outcomes first.
This makes the unintuitive reasoning behind this call explicit.
Thanks @ronaldoussoren for the suggestions... everything is much cleaner now!
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same situation as before - while I'm personally happy with the way the importlib APIs are now being used, @ronaldoussoren is the main reviewer from the actual modulefinder
functionality side.
It's been six weeks since @ronaldoussoren last requested changes. Should I keep waiting for a response from him, or can this be merged?
@brandtbucher Aye, I believe you've properly addressed all of @ronaldoussoren's feedback, so I've gone ahead and merged it since it appears he isn't currently available for a final review. Thanks for the update!
@ronaldoussoren If you spot anything further that I missed, just add a new comment and we can look at doing a further follow-up PR.