Issue 15651: PEP 3121, 384 refactoring applied to elementtree module (original) (raw)

Issue15651

Created on 2012-08-14 18:25 by Robin.Schreiber, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
_elementtree_pep3121-384_v0.patch Robin.Schreiber,2012-08-14 18:25
_elementtree_pep3121-384_v1.patch Robin.Schreiber,2012-12-14 16:37
etree_3121.patch pitrou,2013-08-06 20:29 review
pystate-findmodule-clarify.1.patch eli.bendersky,2013-08-11 00:45 review
Messages (26)
msg168217 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-08-14 18:25
Changes proposed in PEP3121 and PEP384 have now been applied to the elementtree module!
msg168231 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-14 19:12
See review comments :)
msg177465 - (view) Author: Robin Schreiber (Robin.Schreiber) * (Python triager) Date: 2012-12-14 16:37
Patch updated to work with current 3.4 Branch version of elementtree.
msg178501 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-12-29 15:12
Thanks for the patch. I'll take a look.
msg179899 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-01-13 22:52
I looked at the patch a bit more in depth and must admit that I'm reluctant to apply it. It's a very large patch with very little documentation about what steps are taken and why, and I just don't see the motivation. The way I see it, PEP 384 is great for compatibility of third-party extensions and embeddings of Python, but much less critical for a module that's always distributed as part of stdlib and thus is kept in exact sync with the ABI of the Python version it comes with. Correct me if I'm wrong. That said, I won't object to some refactoring if it improves the code. But when such large changes are proposed, I really prefer to see small, incremental patches that replace just a part of the code. Such patches should come with an explanation of why the change is made (i.e. which part of PEP 384 does it adhere to).
msg194570 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-06 20:29
Here is a simplified patch tackling only the PEP 3121 compliance. Eli, I think this would be good to go.
msg194582 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-06 21:26
Bless you Antoine, I've been just planning to do this myself to tackle the re-importing troubles I was having in tests the other day :-) I'll take a look at this soon, promise!
msg194607 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-07 13:11
Antoine, some questions about the patch: First, I think it omits expat_capi from the state. Is that intentional? Second, I'm not sure if this approach is fully aligned with PEP 3121. A global, shared state is still used. Instead of actually having a different module state per subinterpreter, this patch will have shared state. Another problem seems to be using PyModule_FindModule without using PyModule_AddModule first. These problems could be shared to all of Robin's original patches. Of course, there's also the possibility that I don't fully understand PEP 3121 yet :)
msg194608 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-07 13:28
> First, I think it omits expat_capi from the state. Is that > intentional? What would it do in the state? There's nothing to release. > Second, I'm not sure if this approach is fully aligned with PEP 3121. > A global, shared state is still used. Instead of actually having a > different module state per subinterpreter, this patch will have > shared state. I don't understand what you are talking about. Perhaps you haven't looked what PyState_FindModule() does?
msg194610 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-07 13:47
On Wed, Aug 7, 2013 at 6:28 AM, Antoine Pitrou <report@bugs.python.org>wrote: > > Antoine Pitrou added the comment: > > > First, I think it omits expat_capi from the state. Is that > > intentional? > > What would it do in the state? There's nothing to release. > That's true, but I thought one of the goals of PEP 3121 is to separate states between sub-interpreters. So that one can't corrupt another. I'm not sure how much it matters in practice in this case of the pyexpat capsule; need to look into it more. > > Second, I'm not sure if this approach is fully aligned with PEP 3121. > > A global, shared state is still used. Instead of actually having a > > different module state per subinterpreter, this patch will have > > shared state. > > I don't understand what you are talking about. Perhaps you haven't looked > what PyState_FindModule() does? > I did not look at the implementation yet. But the documentation says: """Returns the module object that was created from *def* for the current interpreter. This method requires that the module object has been attached to the interpreter state with PyState_AddModule()<http://docs.python.org/dev/c-api/module.html?highlight=pymoduledef_base#PyState_AddModule>beforehand. In case the corresponding module object is not found or has not been attached to the interpreter state yet, it returns NULL.""" I don't see a call to PyState_AddModule. What am I missing?
msg194612 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-07 14:00
> That's true, but I thought one of the goals of PEP 3121 is to > separate > states between sub-interpreters. So that one can't corrupt another. > I'm not > sure how much it matters in practice in this case of the pyexpat > capsule; > need to look into it more. pyexpat's "capi" object is a static struct inside pyexpat.c, so that wouldn't change anything. Separating states between sub-interpreters only matters when said state is mutable, which it isn't here. > I don't see a call to PyState_AddModule. What am I missing? It is called implicitly when an extension module is imported.
msg194674 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-08 13:56
Thanks Antoine. I think I understand the patch better now. Just a couple small questions and otherwise LGTM This code in the beginning in PyInit__elementtree: m = PyState_FindModule(&elementtreemodule); if (m) { Py_INCREF(m); return m; } Can you explain what use case it tries to cover? I couldn't find similar code in other modules we have that implement PEP 3121 (_csv, readline, io, etc.) This code has at least one adverse effect, for testing. The problem with re-importing _elementtree I raised in http://mail.python.org/pipermail/python-dev/2013-August/127766.html is solved by moving to PEP 3121, but this piece of code above ruins it. This is because I want to set sys.modules['pyexpat'] = None and re-import _elementtree (this is what support.import_fresh_module does). But with this code in place, if _elementtree was imported any time in the past (say, in a previous test), I'll just get the instance back without attempting to do the full module initialization. >> I don't see a call to PyState_AddModule. What am I missing? >It is called implicitly when an extension module is imported. Do you think this should be documented in the C API docs? The way they read now, it seems that calling PyState_AddModule is needed manually by extension writers.
msg194675 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-08 13:58
> Can you explain what use case it tries to cover? What I'm trying to say is that although I think I understand its effect (if the same sub-interpreter re-imports _elementtree bypassing the Python module cache, this is an additional level of caching), I'm not sure what *real* use cases it aims for.
msg194677 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-08 14:00
> This code in the beginning in PyInit__elementtree: > > m = PyState_FindModule(&elementtreemodule); > if (m) { > Py_INCREF(m); > return m; > } > > Can you explain what use case it tries to cover? I couldn't find > similar code in other modules we have that implement PEP 3121 (_csv, > readline, io, etc.) I don't know :-) I just re-used Robin's original patch. > >> I don't see a call to PyState_AddModule. What am I missing? > >It is called implicitly when an extension module is imported. > > Do you think this should be documented in the C API docs? The way > they read now, it seems that calling PyState_AddModule is needed > manually by extension writers. Well, how to deal with module state should probably be better documented. Not sure how, though.
msg194679 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-08 14:08
On Thu, Aug 8, 2013 at 7:00 AM, Antoine Pitrou <report@bugs.python.org>wrote: > > Antoine Pitrou added the comment: > > > This code in the beginning in PyInit__elementtree: > > > > m = PyState_FindModule(&elementtreemodule); > > if (m) { > > Py_INCREF(m); > > return m; > > } > > > > Can you explain what use case it tries to cover? I couldn't find > > similar code in other modules we have that implement PEP 3121 (_csv, > > readline, io, etc.) > > I don't know :-) I just re-used Robin's original patch. > Would you mind removing it from the patch, due to the case described above? ISTM that in real scenarios the sys.modules cache kicks in anyway. It should not be really bypassed for any given sub-interpreter in sane code. > > > >> I don't see a call to PyState_AddModule. What am I missing? > > >It is called implicitly when an extension module is imported. > > > > Do you think this should be documented in the C API docs? The way > > they read now, it seems that calling PyState_AddModule is needed > > manually by extension writers. > > Well, how to deal with module state should probably be better > documented. Not sure how, though. > I'll think about it some more and will try to propose a documentation patch. This can be done incrementally; we don't have to go to perfect docs on the first try ;-)
msg194680 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-08 14:14
> Would you mind removing it from the patch, due to the case described > above? > ISTM that in real scenarios the sys.modules cache kicks in anyway. It > should not be really bypassed for any given sub-interpreter in sane > code. Yeah, I think you're right. I'll submit an updated patch or, if it's the only issue with it, perhaps you can simply remove the 3 offending lines?
msg194684 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-08 15:16
On Thu, Aug 8, 2013 at 7:14 AM, Antoine Pitrou <report@bugs.python.org>wrote: > > Antoine Pitrou added the comment: > > > Would you mind removing it from the patch, due to the case described > > above? > > ISTM that in real scenarios the sys.modules cache kicks in anyway. It > > should not be really bypassed for any given sub-interpreter in sane > > code. > > Yeah, I think you're right. I'll submit an updated patch or, if it's > the only issue with it, perhaps you can simply remove the 3 offending > lines? > Sure, I'll do that. Thanks.
msg194797 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-10 15:01
New changeset 8a060e2de608 by Eli Bendersky in branch 'default': Issue #15651: PEP 3121 refactoring for _elementtree http://hg.python.org/cpython/rev/8a060e2de608
msg194798 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-10 15:07
Antoine, I committed your patch (with a bit of comments added), *leaving the module caching in*. This is because removing it breaks the tests, unfortunately. The _elementtree tests are so crooked that they manage to create a situation in which the module under test throws ParseError which is a different class from ET.ParseError. This is "achieved" by multiple invocations of import_fresh_module with various fresh & blocked parameters, and I still haven't fully traced all the problems yet. Since this caching only potentially harms other tests, it's ok to leave in. A longer term solution to all this will be http://mail.python.org/pipermail/python-dev/2013-August/127766.html - I want to eventually run all "monkey-patch the import environment to simulate some situation" sub-tests of ET in different subprocesses to they are kept independent.
msg194822 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-10 17:38
> Antoine, I committed your patch (with a bit of comments added), > *leaving the module caching in*. Thanks! > A longer term solution to all this will be > http://mail.python.org/pipermail/python-dev/2013-August/127766.html - > I want to eventually run all "monkey-patch the import environment to > simulate some situation" sub-tests of ET in different subprocesses to > they are kept independent. I find it useful that the test suite stresses module unloading or reloading. There's probably a bug either in ET or in the ET tests. I'm not saying it's a very important issue of course, but IMHO it would be better if we don't try to swipe it under the carpet :-)
msg194835 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-10 21:12
On Sat, Aug 10, 2013 at 10:38 AM, Antoine Pitrou <report@bugs.python.org>wrote: > > Antoine Pitrou added the comment: > > > Antoine, I committed your patch (with a bit of comments added), > > *leaving the module caching in*. > > Thanks! > > > A longer term solution to all this will be > > http://mail.python.org/pipermail/python-dev/2013-August/127766.html - > > I want to eventually run all "monkey-patch the import environment to > > simulate some situation" sub-tests of ET in different subprocesses to > > they are kept independent. > > I find it useful that the test suite stresses module unloading or > reloading. There's probably a bug either in ET or in the ET tests. I'm > not saying it's a very important issue of course, but IMHO it would be > better if we don't try to swipe it under the carpet :-) > I have no intention swiping things under the carpet. I'll get to the bottom of this to understand the exact flow that causes this to happen. But I still think ET tests should be logically separated into subprocesses. If we want to stress test module unloading and reloading, let's have specific, targeted tests for that.
msg194837 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-08-10 21:36
> If we want to stress test module unloading and reloading, let's have > specific, targeted tests for that. Agreed.
msg194855 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-11 00:30
Found some problems in the interaction of PEP 3121 and import_fresh_module: http://mail.python.org/pipermail/python-dev/2013-August/127862.html I'd still like to see the in-PyInit__elementtree caching deleted eventually, without harming test coverage. So this issue will remain open for now, until we decide what to do.
msg194856 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-08-11 00:44
BTW, Antoine, w.r.t documentation - I agree that the whole PyState_* sequence needs better documentation and examples, but in the meantime I'm attaching a simple patch for c-api/module.rst. It clarifies that PyState_AddModule doesn't really have to be called explicitly in extensions.
msg383280 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-18 00:45
Fixed by: commit a6109ef68d421712ba368ef502c4789e8de113e0 Author: Erlend Egeberg Aasland <erlend.aasland@innova.no> Date: Fri Nov 20 13:36:23 2020 +0100 bpo-1635741: Convert _sre types to heap types and establish module state (PEP 384) (GH-23393)
msg383286 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-18 01:06
> Fixed by: (...) bpo-1635741: Convert _sre types to heap types... Oops, I commented the wrong issue. I reopen it.
History
Date User Action Args
2022-04-11 14:57:34 admin set github: 59856
2021-12-08 15:25:16 iritkatriel set status: open -> closedsuperseder: PEP 3121, 384 refactoring applied to elementtree moduleresolution: duplicate
2021-12-08 15:25:16 iritkatriel link issue15651 superseder
2020-12-18 01:06:06 vstinner set status: closed -> opensuperseder: Py_Finalize() doesn't clear all Python objects at exit -> (no value)resolution: duplicate -> (no value)messages: +
2020-12-18 00:45:03 vstinner set status: open -> closedsuperseder: Py_Finalize() doesn't clear all Python objects at exitnosy: + vstinnermessages: + resolution: duplicatestage: patch review -> resolved
2013-08-11 00:45:03 eli.bendersky set files: + pystate-findmodule-clarify.1.patch
2013-08-11 00:44:52 eli.bendersky set messages: +
2013-08-11 00:30:09 eli.bendersky set messages: +
2013-08-10 21:36:21 pitrou set messages: +
2013-08-10 21:12:21 eli.bendersky set messages: +
2013-08-10 17:38:42 pitrou set messages: +
2013-08-10 15:07:26 eli.bendersky set messages: +
2013-08-10 15:01:21 python-dev set nosy: + python-devmessages: +
2013-08-08 15:16:39 eli.bendersky set messages: +
2013-08-08 14:14:32 pitrou set messages: +
2013-08-08 14:08:32 eli.bendersky set messages: +
2013-08-08 14:00:10 pitrou set messages: +
2013-08-08 13:58:13 eli.bendersky set messages: +
2013-08-08 13:56:34 eli.bendersky set messages: +
2013-08-07 14:00:07 pitrou set messages: +
2013-08-07 13:47:16 eli.bendersky set messages: +
2013-08-07 13:28:36 pitrou set messages: +
2013-08-07 13:11:17 eli.bendersky set messages: +
2013-08-06 21:26:01 eli.bendersky set messages: +
2013-08-06 20:29:38 pitrou set files: + etree_3121.patchmessages: +
2013-01-13 22:52:36 eli.bendersky set messages: +
2012-12-29 15:12:05 eli.bendersky set assignee: eli.benderskymessages: +
2012-12-14 17:44:45 Arfrever set nosy: + Arfrever
2012-12-14 16:37:24 Robin.Schreiber set files: + _elementtree_pep3121-384_v1.patchkeywords: + patchmessages: +
2012-11-08 13:39:49 Robin.Schreiber set keywords: + pep3121, - patch
2012-08-27 03:41:57 belopolsky link issue15787 dependencies
2012-08-17 16:36:58 asvetlov set nosy: + asvetlov
2012-08-14 19:12:39 pitrou set nosy: + eli.bendersky, pitroumessages: + stage: patch review
2012-08-14 18:33:34 Robin.Schreiber set nosy: + effbot
2012-08-14 18:25:20 Robin.Schreiber create