Issue 9257: cElementTree iterparse requires events as bytes; ElementTree uses strings
Created on 2010-07-14 03:27 by eric-talevich, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (12) | ||
---|---|---|
msg110252 - (view) | Author: Eric Talevich (eric-talevich) | Date: 2010-07-14 03:27 |
In xml.etree, ElementTree and cElementTree implement different interfaces for the iterparse function/class.

In ElementTree, the argument "events" must be a tuple of strings:

    from xml.etree import ElementTree as ET
    for result in ET.iterparse('example.xml', events=('start', 'end')):
        print(result)

That works, given a valid XML file 'example.xml'. If the event names are given as bytes instead of strings (b'start', b'end'), there's no crash, but no events are recognized.

In cElementTree, it's the opposite: the events argument must be a tuple of bytes:

    from xml.etree import cElementTree as cET
    for result in cET.iterparse('example.xml', events=(b'start', b'end')):
        print(result)

Giving a tuple of strings instead of bytes results in:

    >>> for result in cET.iterparse('example.xml', events=('start', 'end')):
    ...     print(result)
    TypeError: invalid event tuple

This makes it difficult to use ElementTree as a backup for cElementTree, or at least very awkward. | ||
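To check how a given interpreter behaves, a small probe along the following lines may help. This is only a sketch: the helper name `probe` and the in-memory document `XML` are made up for illustration (they stand in for 'example.xml'), and the outcome for bytes event names depends on the Python version and on whether the C accelerator is in use, so no result is claimed for that case.

```python
import io
from xml.etree import ElementTree as ET

# Hypothetical in-memory document standing in for 'example.xml'.
XML = b"<root><child/></root>"


def probe(events):
    """Run iterparse with the given event names and summarize the outcome."""
    try:
        seen = [event for event, _elem in ET.iterparse(io.BytesIO(XML), events=events)]
    except (TypeError, ValueError) as exc:
        # Report the rejection instead of crashing, so both cases can be compared.
        return "{}: {}".format(type(exc).__name__, exc)
    return "{} events recognized".format(len(seen))


for label, events in [("str", ("start", "end")), ("bytes", (b"start", b"end"))]:
    print(label, "->", probe(events))
```

Swapping the import for `from xml.etree import cElementTree as ET` would probe the C module on versions that still ship it.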
msg110574 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) | Date: 2010-07-17 16:16 |
It seems that this has been fixed in the py3k branch (r78942). Now both bytes and unicode are accepted. Can someone check? | ||
msg113739 - (view) | Author: Eric Talevich (eric-talevich) | Date: 2010-08-13 01:40 |
This bug seems to be still present in Python 3.1.2. (Unless I'm doing something wrong.) Was r78942 included in the 3.1.2 release? | ||
msg114042 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) | Date: 2010-08-16 10:04 |
No, apparently, r78942 was not included in 3.1.2. | ||
msg126287 - (view) | Author: Peter (maubp) | Date: 2011-01-14 19:01 |
This wasn't fixed in Python 3.1.3 either. Is the trunk commit Amaury identified from py3k branch (r78942) suitable to back port to Python 3.1.x? | ||
msg126290 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) | Date: 2011-01-14 19:15 |
r78942 is quite large, unfortunately. But just patching _elementtree.c::xmlparser_setevents() should be possible. This would at least fix the "invalid event tuple" error. | ||
msg152837 - (view) | Author: Eli Bendersky (eli.bendersky) | Date: 2012-02-08 03:45 |
At this point, 3.1 won't be fixed with such changes any longer. Is this fixed in 3.2/3.3? | ||
msg152864 - (view) | Author: Eric Talevich (eric-talevich) | Date: 2012-02-08 14:42 |
It's more-or-less fixed in Python 3.2:

- With cElementTree, both bytes and strings are accepted for events;
- With ElementTree, only strings are accepted, and bytes raise a ValueError (unknown event).

A small inconsistency remains, but it's fine to just use strings in all cases. | ||
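With string event names, the usual fallback import can be written once for both modules. The sketch below assumes a Python version where the fix described above is present; `collect_events` and the in-memory `XML` document are illustrative names, not part of the library.

```python
import io

try:
    # The C module discussed in this issue; it may be missing on newer
    # Python versions, in which case the pure-Python name is used instead.
    from xml.etree import cElementTree as ET
except ImportError:
    from xml.etree import ElementTree as ET

# Hypothetical in-memory document standing in for 'example.xml'.
XML = b"<root><child>text</child></root>"


def collect_events(source, events=("start", "end")):
    """Return (event, tag) pairs from iterparse, using str event names only."""
    return [(event, elem.tag) for event, elem in ET.iterparse(source, events=events)]


pairs = collect_events(io.BytesIO(XML))
print(pairs)
```

Because only str event names are passed, the same call works whichever module the import resolves to.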
msg152926 - (view) | Author: Eli Bendersky (eli.bendersky) | Date: 2012-02-09 04:21 |
Eric,

Thanks for checking. I agree that this behavior is acceptable, but a documentation fix would be appropriate. The documentation of iterparse should mention the events it accepts, also saying that those are strings. The events are listed at http://effbot.org/zone/element-iterparse.htm

Would you like to try your hand at submitting a patch for Python 3.2? I will review and apply it to 3.2 and 3.3. | ||
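For reference, the event names given in the standard-library documentation for iterparse are "start", "end", "start-ns" and "end-ns", all passed as strings. The following sketch shows what each one yields; the sample document, the namespace URI and the `summary` list are invented for illustration.

```python
import io
from xml.etree import ElementTree as ET

# Hypothetical document with one namespace declaration.
XML = b'<root xmlns:x="http://example.com/ns"><x:item/></root>'

# The four event names, given as str.
EVENTS = ("start", "end", "start-ns", "end-ns")

summary = []
for event, payload in ET.iterparse(io.BytesIO(XML), events=EVENTS):
    if event == "start-ns":
        # payload is a (prefix, uri) tuple for the namespace being declared
        summary.append((event, payload))
    elif event == "end-ns":
        # payload is None when a namespace mapping goes out of scope
        summary.append((event, payload))
    else:
        # for "start" and "end", payload is the Element itself
        summary.append((event, payload.tag))

for entry in summary:
    print(entry)
```

If `events` is omitted, only "end" events are reported.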
msg152927 - (view) | Author: Eli Bendersky (eli.bendersky) | Date: 2012-02-09 04:22 |
Changing the target version(s) and adding some documentation experts to the nosy list | ||
msg153072 - (view) | Author: Eric Talevich (eric-talevich) | Date: 2012-02-10 18:46 |
Well, this is not the best month for me to try digging into a new codebase... I would not mind if someone else did the patch for this. | ||
msg155999 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-16 06:44 |
New changeset 84e4d76bd146 by Eli Bendersky in branch '3.2':
Issue #9257: clarify the events iterparse accepts
http://hg.python.org/cpython/rev/84e4d76bd146

New changeset 00c7142ee54a by Eli Bendersky in branch 'default':
Issue #9257: clarify the events iterparse accepts
http://hg.python.org/cpython/rev/00c7142ee54a |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:03 | admin | set | github: 53503 |
2012-03-16 06:44:40 | eli.bendersky | set | status: open -> closed; resolution: fixed; stage: needs patch -> resolved |
2012-03-16 06:44:20 | python-dev | set | nosy: + python-dev; messages: + |
2012-02-10 18:46:05 | eric-talevich | set | messages: + |
2012-02-09 04:22:37 | eli.bendersky | set | nosy: + ezio.melotti, eric.araujo, sandro.tosi; messages: +; versions: + Python 3.2, Python 3.3, - Python 3.1 |
2012-02-09 04:21:40 | eli.bendersky | set | messages: + |
2012-02-08 14:42:49 | eric-talevich | set | messages: + |
2012-02-08 03:45:45 | eli.bendersky | set | nosy: + eli.bendersky; messages: + |
2011-11-08 22:48:17 | ezio.melotti | set | nosy: + flox; type: behavior |
2011-01-18 21:43:43 | vstinner | set | nosy: + vstinner |
2011-01-14 19:15:31 | amaury.forgeotdarc | set | messages: + |
2011-01-14 19:01:55 | maubp | set | messages: + |
2010-08-16 10:04:43 | amaury.forgeotdarc | set | messages: + |
2010-08-13 01:40:35 | eric-talevich | set | messages: + |
2010-07-23 10:03:30 | maubp | set | nosy: + maubp |
2010-07-17 16:16:56 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: +; stage: needs patch |
2010-07-14 03:27:45 | eric-talevich | create |