Issue 2439: Patch to add a get_data function to pkgutil (original) (raw)

Issue2439

Created on 2008-03-20 20:58 by paul.moore, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pkgutil.patch paul.moore,2008-03-21 21:09
Messages (21)
msg64208 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-20 20:58
This patch adds a new get_data function to the pkgutil module, allowing access to data stored in the package directory. It wraps the PEP 302 loader "get_data" function, so that it works with such loaders (for example, zipimport). The patch includes documentation and tests (I created a new test_pkgutil test module, but did not add tests for the existing pkgutil functionality). All tests pass on Windows against svn rev 61679 (except test_tcl, which was skipped as I didn't build TCL support).
msg64211 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2008-03-20 21:36
Hi Paul. AFAICT, this doesn't look like it will actually work for filesystem data. get_loader() will return None for "regular", filesystem-installed modules, at least in Python 2.5. Perhaps you should add a test case for that?
msg64219 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-20 22:51
I'm not sure I understand. It uses pkgutil.get_loader, which returns a wrapper for filesystem modules. You pointed me to it. It seems to work, that's what test_getdata_filesys is testing in test_pkgutil.py. Can you supply a testcase that fails? (BTW, this is a patch for the trunk - ie 2.6-to be. I can't see why 2.5 would be different, but maybe that's the problem for you?)
msg64224 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2008-03-20 23:28
Oops, my bad. I'm thinking of an entirely unrelated get_loader() function. Meanwhile, I misread your patch entirely, and thought you had some dead code for os.path processing that is in fact live. So there is "another" problem (really the only one) that I spotted on this re-read. Your patch is calling load_module() even if the module is already in sys.modules. This will *reload* the module, potentially causing problems elsewhere in the system. You can test this by adding an assertion to your test's load_module(), that the module isn't already in sys.modules. Then call get_data for the same module twice. Sorry again for the mixup.
msg64225 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-20 23:55
Is that true? loader.load_module(pkg) should return sys.modules[pkg] without reloading if it already exists. See PEP 302 "Specification part 1: The Importer Protocol": The load_module() method has a few responsibilities that it must fulfil *before* it runs any code: - If there is an existing module object named 'fullname' in sys.modules, the loader must use that existing module. (Otherwise, the reload() builtin will not work correctly.) If a module named 'fullname' does not exist in sys.modules, the loader must create a new module object and add it to sys.modules. I've added a test for this, but I'm not 100% sure it's right. It certainly works with unchanged code. If you can make it fail, let me know and I'll work up a fix. But I think repeated calls to load_module() should be safe (assuming loaders work as required by PEP 302).
msg64226 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2008-03-21 00:04
reload() is implemented by calling the PEP 302 load_module() method on the existing module object.
msg64231 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-03-21 04:49
To clarify the intent of the section of PEP 302 Paul quoted: the *module* object gets reused, but the contents of mod.__dict__ are clobbered and the module code re-executed. So it's the same module object, but with brand new contents (this is why "from foo import bar" and "reload(foo)" do not play nicely with each other, but doing "import foo" and then invoking "foo.bar" picks up the new version of "bar" after a reload).
msg64237 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 12:24
Nick, thanks I now see the issue. I'll work up a test to check for this (and then correct it :-)).
msg64238 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2008-03-21 12:35
An easy way to test it: just change your load_module() to raise an AssertionError if the module is already in sys.modules, and the main body of the test to make two get_data() calls for the same module.
msg64240 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 13:07
But that's not a valid loader. I'm still struggling here. I see what you're trying to get at, but I can't see how I can write a valid loader that does this. To test the problem you're suggesting (that the code calls load_module when the module is already loaded) I need a valid loader which does something detectably different of the module is already loaded when it runs. Obviously, I can fix the get_data code - that's not even remotely hard. But I'd rather create a failing test with the current code, so that I can confirm that the "problem" is fixed. At the moment, I can't even demonstrate a problem!
msg64241 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2008-03-21 13:51
Why does it need to be a valid loader? It's a mock, not a real loader. But if it really bothers you, have it increment a global in the module or something, and put the assertion in the test proper. Heck, have it update a counter in the module it returns, making it into a loader that keeps track of how many times you reload the same module. Then it's a useful example loader. :)
msg64242 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 14:15
It has to be a valid loader, as I see no reason to support loaders that aren't valid. In any case, I did try incrementing a counter and it doesn't demonstrate the problem. If you try the currently attached patch, you should see that. (I assume you've tried or at least read the current patch - but the fact that you're suggesting the approach I have implemented makes me wonder. I did re-upload the patch after you reported the issue - msg 64225 - maybe you didn't notice this, as I deleted the old patch?) If you do see what I mean, please tell me where my code is wrong. I don't want to add a fix without a test showing why the current behaviour is wrong. The test_alreadyloaded test is intended to do that, but the current pkgutil code doesn't fail the test - so either the test is wrong (and I'd appreciate help fixing the test) or the "problem" isn't real, and I can leave the code as is.
msg64243 - (view) Author: PJ Eby (pje) * (Python committer) Date: 2008-03-21 15:04
But I'm getting a failure on that test, and the other one, too: $ ./python Lib/test/test_pkgutil.py test_alreadyloaded (__main__.PkgutilTests) ... FAIL test_getdata_filesys (__main__.PkgutilTests) ... FAIL test_getdata_pep302 (__main__.PkgutilTests) ... ok ====================================================================== FAIL: test_alreadyloaded (__main__.PkgutilTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_pkgutil.py", line 45, in test_alreadyloaded self.assertEqual(foo.loads, 1) AssertionError: 2 != 1 ====================================================================== FAIL: test_getdata_filesys (__main__.PkgutilTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_pkgutil.py", line 30, in test_getdata_filesys self.assert_('PkgutilTests' in this_file) AssertionError ---------------------------------------------------------------------- Ran 3 tests in 0.017s FAILED (failures=2) I'm rebuilding my entire 2.6 checkout to double-check there's not something else going on, but I know my Lib/ tree is up-to-date.
msg64260 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 19:30
Thanks for the update. Something's seriously screwy here. I am getting no failures, so you can probably see why I was confused. Can you tell me what platform, etc (anything that might be relevant) and I'll try to see what's going on.
msg64265 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 19:49
By the way, for comparison, I'm running the test (under Windows) using rt -q -v test_pkgutil from the PCBuild directory. I can't see how test_getdata_filesys can fail, as long as you're running it from the correct place - it reads the test_pkgutil.py file relative to the test package, so it won't run outside of there (maybe I should change this, but that's a separate issue for now). Here's my output: >rt -q -v test_pkgutil .\\python -E -tt ../lib/test/regrtest.py -v test_pkgutil test_pkgutil test_alreadyloaded (test.test_pkgutil.PkgutilTests) ... ok test_getdata_filesys (test.test_pkgutil.PkgutilTests) ... ok test_getdata_pep302 (test.test_pkgutil.PkgutilTests) ... ok ---------------------------------------------------------------------- Ran 3 tests in 0.000s OK 1 test OK.
msg64266 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 19:52
Ah, no. I see your command line. I get the same failure as you in that case. I can see why test_getdata_filesys fails in that case, I'll fix that. I'm confused as to why test_alreadyloaded fails there but not via rt.bat, but nevertheless I can fix that now I can see it. Thanks for your patience. Leave it with me.
msg64282 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-03-21 21:09
OK, I found it. There were 2 problems, the double-loading one, and a problem with adding my importer to sys.meta_path - if the test failed, I wasn't removing it (so it was there for the next test, and interfering with it). Here's a fixed patch. Thanks, Phillip, for persevering!
msg65508 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-04-15 09:49
I should be able to review/commit this for the next alpha (looking into it right now).
msg65509 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2008-04-15 10:19
Thanks, Paul.
msg65510 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-04-15 10:26
Commited as rev 62350
msg65511 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2008-04-15 10:28
Test file added in rev 62351 (forgot to svn add it the first time around)
History
Date User Action Args
2022-04-11 14:56:32 admin set github: 46691
2008-04-15 10:28:51 ncoghlan set messages: +
2008-04-15 10:26:34 ncoghlan set status: open -> closedresolution: acceptedmessages: +
2008-04-15 10:19:59 paul.moore set messages: +
2008-04-15 09:49:20 ncoghlan set assignee: ncoghlanmessages: +
2008-03-21 21:09:32 paul.moore set files: - pkgutil.patch
2008-03-21 21:09:17 paul.moore set files: + pkgutil.patchmessages: +
2008-03-21 19:52:50 paul.moore set messages: +
2008-03-21 19:49:38 paul.moore set messages: +
2008-03-21 19:30:42 paul.moore set messages: +
2008-03-21 15:04:55 pje set messages: +
2008-03-21 14:15:07 paul.moore set messages: +
2008-03-21 13:51:36 pje set messages: +
2008-03-21 13:07:05 paul.moore set messages: +
2008-03-21 12:35:08 pje set messages: +
2008-03-21 12:24:04 paul.moore set messages: +
2008-03-21 04:49:31 ncoghlan set nosy: + ncoghlanmessages: +
2008-03-21 00:04:33 pje set messages: +
2008-03-20 23:55:50 paul.moore set files: - pkgutil.patch
2008-03-20 23:55:32 paul.moore set files: + pkgutil.patchmessages: +
2008-03-20 23:28:51 pje set messages: +
2008-03-20 22:51:52 paul.moore set messages: +
2008-03-20 21:36:27 pje set nosy: + pjemessages: +
2008-03-20 20:58:32 paul.moore create