bpo-35936: Updates to modulefinder by brandtbucher · Pull Request #11787 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation36 Commits43 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

brandtbucher

This patch includes a few useful fixes to Lib/modulefinder.py. Namely:

With the exception of these bug fixes, I've been very careful to preserve the original behavior, and have not changed any part of the API. This patch also includes 2 new regression tests.

https://bugs.python.org/issue35936

@brandtbucher

SyntaxErrors in the target module will rise normally, while SyntaxErrors in dependencies will be added to badmodules. This includes a new regression test.

@brandtbucher

This fixes an issue where a "fromlist" import with the same name as a previously failed import would be incorrectly added to badmodules. This includes a new regression test.

@brandtbucher

Bound empty lists have been replaced with the "if param is None" idiom.

@brandtbucher

Constants imported from imp have been moved to private module-level constants, and ModuleFinder.find_module has been refactored to use importlib. Other than an improvement on how frozen builtin imports are reported (as the frozen imports they are, rather than the stdlib modules they may have originated from), these changes maintain complete compatibility with past versions... including odd behavior for returning relative (below current directory, but not a C extension) vs. absolute (above current directory, or a C extension) paths.

@brandtbucher

@brandtbucher

Specifically, when checking whether the returned path should be relative or absolute.

@brandtbucher

@blurb-it

@brandtbucher

Hopefully, this will help the failing tests on Windows.

pganssle

@ronaldoussoren

This changeset makes a substantial change to the semantics of modulefinder: using importlib.util.find_spec can change the import state of the script. In particular find_spec("package.module") results in the importing of package (and adds it to sys.modules).

Furthermore using importlib makes it close to impossible to use an alternate search path (the "path" argument for the constructor) due to the API of importlib. There are workarounds for this, but that probably requires replacing the entire import state (not just sys.path as this pull request does) around calls to importlib functions.

P.S. For modulegraph2, which is a rewrite of modulegraph which itself is a library that uses code heavily inspired by modulefinder I choosen to take away the possibility to use an alternate search path.

@brandtbucher

@ronaldoussoren I see, thanks for the input. As a concrete example, we can see import side-effects of the parent module when running $ python -c 'import importlib; importlib.util.find_spec("this.that")'. I'll dig into workarounds a bit more today. If nothing seems viable, as you mentioned, I'll probably revert those changes and leave the other fixes for merging.

It may be worth exploring a ground-up redesign, since the documented API is so small. I need a side-project, so I'd be willing to work on that if others feel it would be valuable.

@brandtbucher

importlib.util.find_spec(fullname, parent), where fullname contains a dot, results in the parent module being imported. This fix ensures that children are correctly located while also avoiding import side-effects by temporarily clearing sys.modules and modifying sys.path. Now, neither a dotted name nor parent are required.

@brandtbucher

I have a hunch the failing tests are due to a race condition in how sys.path updates. Sleep a bit here to see.

@brandtbucher

More test are passing, which supports my theory.

@brandtbucher

@brandtbucher

I guess that wasn't the issue...

@brandtbucher

I’ve tweaked the implementation to not need a dotted fullname, avoiding the issues mentioned above. I am still getting (what appear to be) random test failures, I suspect somehow stemming from my modifications to sys.path. This weird behavior is even referenced in an old comment in test_modulefinder.py. I’m not able to reproduce the failures on my own machine.

Unless someone sees a clear fix for this behavior, I think I’ll just revert the new importlib bit of this PR. Bummer.

@brandtbucher

This will be replaced with several more specific entries detailing the various bug fixes.

@brandtbucher

Perhaps there is some underlying reference issue with straight-up sys.path replacement. Once again, fingers crossed!

@brandtbucher

Perhaps test threads are stomping on each others' sys.path?

@brandtbucher

These are processes, after all...

@brandtbucher

This is a longshot, but it's pretty much all I have left in terms of ideas.

@brandtbucher

@blurb-it

@blurb-it

@blurb-it

@brandtbucher

I've also added helpful documentation outlining the reasoning behind various design decisions.

ronaldoussoren

@brandtbucher

Identity checking is sufficient here.

@brandtbucher

This is more robust than using a string comparison with "init".

@brandtbucher

This isn't actually needed (or wanted) here.

@brandtbucher

This is probably a better idea than using the extension alone.

@brandtbucher

Interestingly, it looks like PathFinder does this for us! That's great, because it really cleans up this section a lot.

@brandtbucher

Hopefully we can get a tiny boost here by checking likely outcomes first.

@brandtbucher

This makes the unintuitive reasoning behind this call explicit.

@brandtbucher

Thanks @ronaldoussoren for the suggestions... everything is much cleaner now!

@brandtbucher

@brandtbucher

ncoghlan

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same situation as before - while I'm personally happy with the way the importlib APIs are now being used, @ronaldoussoren is the main reviewer from the actual modulefinder functionality side.

@brandtbucher

@brandtbucher

It's been six weeks since @ronaldoussoren last requested changes. Should I keep waiting for a response from him, or can this be merged?

@ncoghlan

@brandtbucher Aye, I believe you've properly addressed all of @ronaldoussoren's feedback, so I've gone ahead and merged it since it appears he isn't currently available for a final review. Thanks for the update!

@ronaldoussoren If you spot anything further that I missed, just add a new comment and we can look at doing a further follow-up PR.