Issue 9257: cElementTree iterparse requires events as bytes; ElementTree uses strings

Created on 2010-07-14 03:27 by eric-talevich, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (12)
msg110252 - (view) Author: Eric Talevich (eric-talevich) Date: 2010-07-14 03:27
In xml.etree, ElementTree and cElementTree implement different interfaces for the iterparse function/class.

In ElementTree, the argument "events" must be a tuple of strings:

    from xml.etree import ElementTree as ET
    for result in ET.iterparse('example.xml', events=('start', 'end')):
        print(result)

That works, given a valid XML file 'example.xml'. If the event names are given as bytes instead of strings (b'start', b'end'), there's no crash, but no events are recognized.

In cElementTree, it's the opposite: the events argument must be a tuple of bytes:

    from xml.etree import cElementTree as cET
    for result in cET.iterparse('example.xml', events=(b'start', b'end')):
        print(result)

Giving a tuple of strings instead of bytes results in:

    >>> for result in cET.iterparse('example.xml', events=('start', 'end')):
    ...     print(result)
    TypeError: invalid event tuple

This makes it difficult to use ElementTree as a backup for cElementTree, or at least very awkward.
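For context, the "backup" arrangement mentioned in this report is usually written as a guarded import. The following is a minimal sketch of that pattern, not code from the report: the `SAMPLE` document and the `collect_events` helper are hypothetical names made up for illustration, and an in-memory buffer stands in for 'example.xml'. It passes the event names as strings, which is the form both modules accept from Python 3.2 onward according to the later messages in this thread.

```python
import io

# Prefer the C accelerator where it still exists as a separate module;
# fall back to the pure-Python name otherwise. (xml.etree.cElementTree
# was removed in Python 3.9, so the fallback is taken there.)
try:
    from xml.etree import cElementTree as ET
except ImportError:
    from xml.etree import ElementTree as ET

# Hypothetical sample document, used instead of a file on disk.
SAMPLE = b"<root><item>a</item><item>b</item></root>"


def collect_events(xml_bytes, events=("start", "end")):
    """Return (event, tag) pairs produced by iterparse for xml_bytes."""
    source = io.BytesIO(xml_bytes)
    return [(event, elem.tag) for event, elem in ET.iterparse(source, events=events)]


if __name__ == "__main__":
    for pair in collect_events(SAMPLE):
        print(pair)
```

With string event names the same call works regardless of which branch of the import was taken, which is the property the report asks for.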
msg110574 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-07-17 16:16
It seems that this has been fixed in the py3k branch (r78942). Now both bytes and unicode are accepted. Can someone check?
msg113739 - (view) Author: Eric Talevich (eric-talevich) Date: 2010-08-13 01:40
This bug seems to be still present in Python 3.1.2. (Unless I'm doing something wrong.) Was r78942 included in the 3.1.2 release?
msg114042 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-08-16 10:04
No, apparently, r78942 was not included in 3.1.2.
msg126287 - (view) Author: Peter (maubp) Date: 2011-01-14 19:01
This wasn't fixed in Python 3.1.3 either. Is the commit Amaury identified on the py3k branch (r78942) suitable to backport to Python 3.1.x?
msg126290 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-01-14 19:15
Unfortunately, r78942 is quite large. But just patching _elementtree.c::xmlparser_setevents() should be possible. This would at least fix the "invalid event tuple" error.
msg152837 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-02-08 03:45
At this point, 3.1 won't be fixed with such changes any longer. Is this fixed in 3.2/3.3?
msg152864 - (view) Author: Eric Talevich (eric-talevich) Date: 2012-02-08 14:42
It's more-or-less fixed in Python 3.2:

- With cElementTree, both bytes and strings are accepted for events;
- With ElementTree, only strings are accepted, and bytes raise a ValueError (unknown event).

A small inconsistency remains, but it's fine to just use strings in all cases.
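Code that still receives event names as bytes (for example, code written against the old cElementTree behavior) can follow the "just use strings" advice by converting them before calling iterparse. This is only a sketch under that assumption: `normalize_events` is a hypothetical helper, not part of xml.etree, and it assumes the event names are plain ASCII.

```python
import io
from xml.etree import ElementTree as ET


def normalize_events(events):
    """Return the event names as a tuple of str, decoding any bytes entries.

    Hypothetical helper: iterparse event names are plain ASCII, so an
    ASCII decode is assumed to be sufficient.
    """
    return tuple(
        name.decode("ascii") if isinstance(name, bytes) else name
        for name in events
    )


def iterparse_any(source, events=("end",)):
    """Call iterparse with event names that may be bytes or str."""
    return ET.iterparse(source, events=normalize_events(events))


if __name__ == "__main__":
    data = io.BytesIO(b"<root><child/></root>")
    for event, elem in iterparse_any(data, events=(b"start", "end")):
        print(event, elem.tag)
```

Normalizing at the call site keeps the rest of the code independent of which implementation is in use.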
msg152926 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-02-09 04:21
Eric,

Thanks for checking. I agree that this behavior is acceptable, but a documentation fix would be appropriate. The documentation of iterparse should mention the events it accepts, also saying that those are strings. The events are listed at http://effbot.org/zone/element-iterparse.htm

Would you like to try your hand at submitting a patch for Python 3.2? I will review and apply it to 3.2 and 3.3.
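To illustrate the kind of event list such documentation would cover: besides "start" and "end", iterparse also accepts the namespace events "start-ns" and "end-ns", all given as strings. The sketch below is not taken from the documentation patch; the `SAMPLE` document and the `describe_events` helper are hypothetical. It shows that the second item of each result differs by event type: an element for "start"/"end", a (prefix, uri) tuple for "start-ns", and None for "end-ns".

```python
import io
from xml.etree import ElementTree as ET

# Hypothetical document with one namespace declaration.
SAMPLE = b'<root xmlns:x="http://example.com/x"><x:item/></root>'

ALL_EVENTS = ("start", "end", "start-ns", "end-ns")


def describe_events(xml_bytes):
    """Return (event, payload) pairs, replacing elements with their tags."""
    described = []
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=ALL_EVENTS):
        if event in ("start", "end"):
            # payload is an Element; record its (namespace-qualified) tag
            described.append((event, payload.tag))
        else:
            # "start-ns" carries a (prefix, uri) tuple, "end-ns" carries None
            described.append((event, payload))
    return described


if __name__ == "__main__":
    for pair in describe_events(SAMPLE):
        print(pair)
```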
msg152927 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-02-09 04:22
Changing the target version(s) and adding some documentation experts to the nosy list
msg153072 - (view) Author: Eric Talevich (eric-talevich) Date: 2012-02-10 18:46
Well, this is not the best month for me to try digging into a new codebase... I would not mind if someone else did the patch for this.
msg155999 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-16 06:44
New changeset 84e4d76bd146 by Eli Bendersky in branch '3.2':
Issue #9257: clarify the events iterparse accepts
http://hg.python.org/cpython/rev/84e4d76bd146

New changeset 00c7142ee54a by Eli Bendersky in branch 'default':
Issue #9257: clarify the events iterparse accepts
http://hg.python.org/cpython/rev/00c7142ee54a
History
Date User Action Args
2022-04-11 14:57:03 admin set github: 53503
2012-03-16 06:44:40 eli.bendersky set status: open -> closed; resolution: fixed; stage: needs patch -> resolved
2012-03-16 06:44:20 python-dev set nosy: + python-dev; messages: +
2012-02-10 18:46:05 eric-talevich set messages: +
2012-02-09 04:22:37 eli.bendersky set nosy: + ezio.melotti, eric.araujo, sandro.tosi; messages: +; versions: + Python 3.2, Python 3.3, - Python 3.1
2012-02-09 04:21:40 eli.bendersky set messages: +
2012-02-08 14:42:49 eric-talevich set messages: +
2012-02-08 03:45:45 eli.bendersky set nosy: + eli.bendersky; messages: +
2011-11-08 22:48:17 ezio.melotti set nosy: + flox; type: behavior
2011-01-18 21:43:43 vstinner set nosy: + vstinner
2011-01-14 19:15:31 amaury.forgeotdarc set messages: +
2011-01-14 19:01:55 maubp set messages: +
2010-08-16 10:04:43 amaury.forgeotdarc set messages: +
2010-08-13 01:40:35 eric-talevich set messages: +
2010-07-23 10:03:30 maubp set nosy: + maubp
2010-07-17 16:16:56 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: +; stage: needs patch
2010-07-14 03:27:45 eric-talevich create