msg309010 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2017-12-24 18:51 |
Currently, the Python implementation of the Element class in xml.etree.ElementTree defines a method called copy() which the C implementation does not define, whereas the C implementation defines a __copy__() method (and a __deepcopy__() method) which the Python implementation does not define. Given that the documentation makes no mention of a copy() method and that its definition would be masked by a standard import of xml.etree.ElementTree, I propose that it be renamed to __copy__() so that copy.copy() can make use of it. |
|
|
msg309074 - (view) |
Author: Julien Palard (mdk) *  |
Date: 2017-12-26 23:54 |
As the C implementation is already shadowing the Python implementation when available (last lines of Lib/xml/etree/ElementTree.py), using `.copy()` is already a bug magnet (works when the Python implem is used, does not work when shadowed by the C implem). So renaming the copy method to __copy__ in the python implementation looks good to me. Ultimately the Python implementation would require a __deepcopy__ too? |
|
|
msg309089 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2017-12-27 15:23 |
Ultimately, yeah, the Python version should probably define __deepcopy__ as well. But since this is just a rename of an existing method, I figure we can defer that to another time. |
|
|
msg309116 - (view) |
Author: Srinivas Reddy Thatiparthy(శ్రీనివాస్ రెడ్డి తాటిపర్తి) (thatiparthy) * |
Date: 2017-12-28 07:13 |
Renaming the method `copy()` to `__copy__` breaks the API. I would rather have an alias for `__copy__`. I agree that it is not documented, but some users tend to assume public methods are documented and use them. So i think it is better to not to break their code. |
|
|
msg309121 - (view) |
Author: Andrew Svetlov (asvetlov) *  |
Date: 2017-12-28 11:12 |
Agree with Srinivas |
|
|
msg309122 - (view) |
Author: Julien Palard (mdk) *  |
Date: 2017-12-28 11:19 |
I don't agree with the API breakage, the C implementation of the module does not expose copy(), so code using copy() is already broken (since 2005) in cases the C implementation is used. I also expect the C implementation to be used almost anywhere, but I may be wrong on this point: in which case the Python implementation may still be used? |
|
|
msg309124 - (view) |
Author: Andrew Svetlov (asvetlov) *  |
Date: 2017-12-28 11:38 |
pypy for example has Element.copy() |
|
|
msg309131 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2017-12-28 15:07 |
Two notes: * It appears that pypy is based on no more recent than Python 3.5, so this wouldn't immediately break them. (3.6 support is maybe in development?) * pypy appears to have already made other adjustments due to the differences between the Python and C implementations: https://bitbucket.org/pypy/pypy/diff/lib-python/3/xml/etree/ElementTree.py?diff2=0939e3a8a08d&at=py3.5 That said, I'm neutral on the subject. I'm happy to implement whichever option you decide on. |
|
|
msg309140 - (view) |
Author: Andrew Svetlov (asvetlov) *  |
Date: 2017-12-28 19:42 |
I mean dropping `.copy()` breaks not only compatibility between pure python versions but also between pypy releases and pypy/CPython code. That's why I suggest to keep `.copy()` as an alias for `__copy__()`. |
|
|
msg309142 - (view) |
Author: Paul Ganssle (p-ganssle) *  |
Date: 2017-12-28 21:48 |
> I mean dropping `.copy()` breaks not only compatibility between pure python versions but also between pypy releases and pypy/CPython code. That's why I suggest to keep `.copy()` as an alias for `__copy__()`. If the main issue is that pypy users have been using this, can't pypy just keep an alias for `copy` around, possibly with a DeprecationWarning? |
|
|
msg309143 - (view) |
Author: Andrew Svetlov (asvetlov) *  |
Date: 2017-12-28 22:16 |
For derivative Python implementation there is a standard strategy: they uses a copy of CPython standard library. Tthat's why any new module should have a pure Python implementation and that's why https://www.python.org/dev/peps/pep-0399/ exists. Jython has copy() method as well. IronPython has it. I doubt if an implementation without xml.etree.ElementTree.Element.copy exists at all. The method removal breaks backward compatibility without a reason, __copy__ alias is valuable and backward compatible change. |
|
|
msg309144 - (view) |
Author: Paul Ganssle (p-ganssle) *  |
Date: 2017-12-28 22:22 |
@Andrew Are you suggesting that a `copy` method be *added* to the C implementation, as an alias to __copy__? Because it makes no sense to keep the pure Python and C implementations out of sync just so that it's easier for projects forking the pure Python implementation to maintain backwards compatibility with their earlier implementations. |
|
|
msg309145 - (view) |
Author: Andrew Svetlov (asvetlov) *  |
Date: 2017-12-28 22:25 |
Yes. For designing from scratch copy() is not necessary but if we have it for decade -- better to unify pure Python and C implementation by adding missing methods. |
|
|
msg309146 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-12-28 22:48 |
Paul Ganssle: Even if Andrew were not suggesting adding copy to the C implementation (I have no opinion on that currently), it would still be correct to maintain backward compatibility in the python version in the standard library. We do not consider the standard library to be CPython specific, even though parts of it are. There is no "fork" involved in other projects using it, from our point of view. Quite the opposite, where possible. We have given commit rights to developers from other python implementations specifically so they can contribute things that improve compatibility in the standard library and standard library test suite for those other python implementations. |
|
|
msg309147 - (view) |
Author: Paul Ganssle (p-ganssle) *  |
Date: 2017-12-28 23:01 |
> There is no "fork" involved in other projects using it, from our point of view. I did not mean "fork" in some judgmental sense, I'm mostly just familiar with pypy, but I was under the impression that other projects were *literally* forking the standard library as a practical matter. I get the impression that other interpreters maintain a patched interpreter often for the purposes of adding fixes like the one pointed out in which bring the behavior of the Python implementation of the standard library into line with the C implementations, since the C version is the de facto standard (since most code is written and tested only for CPython). I'd be fine with adding a C alias for `copy`, though as a purely practical matter, I strongly suspect that dropping `copy` from the pure Python implementation would not be a huge deal, but if that breaks the contract it breaks the contract. |
|
|
msg309149 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-12-28 23:29 |
Yes, that's why I said "from our point of view" :) I know there is usually a fork in the practical sense, but we want to make it as easy as practical to sync that fork, which includes not breaking things in the python versions without good reason. |
|
|
msg309179 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2017-12-29 16:29 |
Given the discussion, I've gone ahead and created a new PR that synchronizes the three copy methods between implementations and deprecates Element.copy(). |
|
|
msg309184 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-12-29 18:03 |
There is an existing test for copying. No new tests are needed. Just add a case for the copy() method in test_copy. Special __deepcopy__() method for Python implementation is not needed. copy.deepcopy() already works well. |
|
|
msg309263 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2017-12-31 04:21 |
As discussed above, starting with , __deepcopy__() is being added to the Python implementation because it already exists in the C implementation. And additional tests have in fact uncovered further discrepancies between the Python and C implementations with regard to copying behavior. PR 5046 has changes that I believe are ready for review. |
|
|
msg309953 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-01-15 08:12 |
> As discussed above, starting with , __deepcopy__() is being added to the Python implementation because it already exists in the C implementation. Python implementation shouldn't copy all implementation details of the C implementation. This is cumbersome and often impossible. Both implementation should have the same behavior, and they do have. > And additional tests have in fact uncovered further discrepancies between the Python and C implementations with regard to copying behavior. What are these discrepancies? In any case it is better to extend existing tests by adding missed checks than duplicate tests. |
|
|
msg310362 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2018-01-20 22:20 |
>> As discussed above, starting with , __deepcopy__() is being added to the Python implementation because it already exists in the C implementation. > > Python implementation shouldn't copy all implementation details of the C implementation. This is cumbersome and often impossible. Both implementation should have the same behavior, and they do have. As it stands, calling Element.__deepcopy__() will succeed with the C implementation and fail with the Python implementation. That seems problematic to me, and making that *not* be the case is trivial, so I don't understand the generalization to "all implementation details". Additionally, if copy.deepcopy() already works well with the Python implementation, why would it not work on the C implementation? What is different about the C implementation that requires the definition of __deepcopy__() in the first place? >> And additional tests have in fact uncovered further discrepancies between the Python and C implementations with regard to copying behavior. > > What are these discrepancies? As I mentioned in the PR, the C implementation does not make use of Element.__init__() when it creates new instances of the class, opting instead to use create_new_element() (which is not used in the C implementation of Element.__init__()). However, create_new_element() does not make a copy of the attrib dict that is passed in, meaning that the internal attrib dict of an Element instance is mutable by changing the dict that was passed in. I have modified the C implementation to fix this problem by adding a copy line in create_new_element(). Without this change, some of the new tests in the PR will fail in the C implementation. The Python implementation does not have this problem because it inherently always goes through Element.__init__(). > In any case it is better to extend existing tests by adding missed checks than duplicate tests. As I mentioned in the PR: I have added the unit tests in the way that I have because the existing checks are generally integration tests, and most of the test coverage for this module comes as a side effect of pickle tests. It is my goal to expand the unit tests later to test each method individually, so starting that work here makes it easier. |
|
|
msg340381 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-04-17 06:50 |
> As it stands, calling Element.__deepcopy__() will succeed with the C > implementation and fail with the Python implementation. You should not call it directly. Use copy.deepcopy(). > What is > different about the C implementation that requires the definition of > __deepcopy__() in the first place? __deepcopy__() in the C implementation is merely optimization. It is less efficient to convert from the efficient internal representation of Element object to tuples and dicts and back. > However, create_new_element() > does not make a copy of the attrib dict that is passed in, meaning that > the internal attrib dict of an Element instance is mutable by changing > the dict that was passed in. create_new_element() usually is called with a new copy of the attrib dict. Making yet one copy inside it is just a waste of time. If in some cases it is called with an externally referred dict, fix these cases by making a copy before calling create_new_element(). |
|
|
msg340589 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2019-04-21 00:00 |
Taking a step back, I want to make sure I understand the action items for resolving this: * Add copy() as an alias to __copy__() in the C implementation to match the Python implementation. * Deprecate copy() in favor of copy.copy() in both the Python and C implementations, bypassing a PendingDeprecation step because the implementations have been out of sync for so long as it is. Any other issues discussed here are orthogonal to the goal of this ticket. Are we in agreement on this? |
|
|
msg340592 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2019-04-21 00:09 |
Opened for discussion of the attrib copy issue. |
|
|
msg340600 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-21 06:36 |
We should not add anything to the implementation that we consider legacy elsewhere. Py3 has "always" used the C accelerator module instead of the Python implementation, which suggests that its interface is probably the righter one. So: make sure that the Obvious Ways, copy.copy() and copy.deepcopy(), work nicely for both implementations, and deprecate Element.copy() in Py3.8. |
|
|
msg340602 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-04-21 07:29 |
Concur with Stefan. Adding the method deprecated from the born looks silly. We should either deprecate the copy() method in the Python implementation, or add an undeprecated method in the C implementation. The latter should be considered as a new feature. |
|
|
msg341021 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-04-28 04:41 |
New changeset 50fed0b64faa3053300008ef5607b570fe209de6 by Stefan Behnel (Gordon P. Hemsley) in branch 'master': bpo-32424: Improve test coverage for xml.etree.ElementTree (GH-12891) https://github.com/python/cpython/commit/50fed0b64faa3053300008ef5607b570fe209de6 |
|
|
msg341447 - (view) |
Author: Gordon P. Hemsley (gphemsley) * |
Date: 2019-05-05 13:55 |
It seems the final open question on this is whether the mismatch between the Python and C implementations is enough to bypass PendingDeprecationWarning for copy() in favor of jumping straight to DeprecationWarning. |
|
|
msg351698 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-09-10 15:21 |
Let's just deprecate it directly (in Py3.9, I guess), given that it was never documented. |
|
|
msg351699 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2019-09-10 15:22 |
New changeset 7d952ded6813c896ea3f4234bb8db5247dcb5484 by Stefan Behnel (Gordon P. Hemsley) in branch 'master': bpo-32424: Deprecate xml.etree.ElementTree.Element.copy() in favor of copy.copy() (GH-12995) https://github.com/python/cpython/commit/7d952ded6813c896ea3f4234bb8db5247dcb5484 |
|
|