msg282376 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-12-04 23:03 |
The Element class in the xml.etree.ElementTree module is a collection. It can contain other Element's. But unlike to common Python collections (list, dict, etc) and pure Python classes, C implementation of Element doesn't support unlimited recursion. As result, destroying very deep Element tree can cause stack overflow. Example: import xml.etree.cElementTree as ElementTree y = x = ElementTree.Element('x') for i in range(200000): y = ElementTree.SubElement(y, 'x') del x |
|
|
msg283211 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-12-14 18:47 |
Proposed patch fixes the crash. |
|
|
msg283739 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-12-21 10:56 |
New changeset 957091874ea0 by Serhiy Storchaka in branch '3.5': Issue #28871: Fixed a crash when deallocate deep ElementTree. https://hg.python.org/cpython/rev/957091874ea0 New changeset 78bf34b6a713 by Serhiy Storchaka in branch '2.7': Issue #28871: Fixed a crash when deallocate deep ElementTree. https://hg.python.org/cpython/rev/78bf34b6a713 |
|
|
msg284145 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-12-28 02:11 |
bm_xml_etree.py benchmark started to crash on Python 2.7 because of the change 78bf34b6a713. Python 2.7 @ 78bf34b6a713: bug.py does crash Python 2.7 @ 32cc37a89b58: no crash Full script: https://github.com/python/performance/blob/master/performance/benchmarks/bm_xml_etree.py |
|
|
msg284146 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-12-28 02:15 |
haypo@selma$ gdb -args ./python ~/bug.py (gdb) run python: Objects/object.c:2453: _PyTrash_thread_deposit_object: Assertion `PyObject_IS_GC(op)' failed. Program received signal SIGABRT, Aborted. (gdb) py-bt Traceback (most recent call first): File "/home/haypo/bug.py", line 130, in bench_parse root1 = etree.parse(xml_file).getroot() File "/home/haypo/bug.py", line 171, in bench_etree bench_func(etree, file_path, xml_data, xml_root) File "/home/haypo/bug.py", line 197, in bench_etree(1, ET, bench_func) (gdb) where #0 0x00007ffff711892f in raise () from /lib64/libc.so.6 #1 0x00007ffff711a52a in abort () from /lib64/libc.so.6 #2 0x00007ffff7110e37 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007ffff7110ee2 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000463abe in _PyTrash_thread_deposit_object (op=<Element at remote 0x7fffec9152e0>) at Objects/object.c:2453 #5 0x00007fffecd29b17 in element_dealloc (self=0x7fffec9152e0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:566 #6 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9152e0>) at Objects/object.c:2262 #7 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915280) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #8 0x00007fffecd29abc in element_dealloc (self=0x7fffec915280) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #9 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915280>) at Objects/object.c:2262 #10 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915220) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #11 0x00007fffecd29abc in element_dealloc (self=0x7fffec915220) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #12 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915220>) at Objects/object.c:2262 #13 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec9151c0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #14 0x00007fffecd29abc in element_dealloc (self=0x7fffec9151c0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #15 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9151c0>) at Objects/object.c:2262 #16 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915160) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #17 0x00007fffecd29abc in element_dealloc (self=0x7fffec915160) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #18 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915160>) at Objects/object.c:2262 #19 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915100) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #20 0x00007fffecd29abc in element_dealloc (self=0x7fffec915100) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #21 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915100>) at Objects/object.c:2262 #22 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec9150a0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #23 0x00007fffecd29abc in element_dealloc (self=0x7fffec9150a0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #24 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9150a0>) at Objects/object.c:2262 #25 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915040) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #26 0x00007fffecd29abc in element_dealloc (self=0x7fffec915040) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #27 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915040>) at Objects/object.c:2262 #28 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90ffa0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #29 0x00007fffecd29abc in element_dealloc (self=0x7fffec90ffa0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #30 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90ffa0>) at Objects/object.c:2262 #31 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90ff40) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #32 0x00007fffecd29abc in element_dealloc (self=0x7fffec90ff40) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #33 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90ff40>) at Objects/object.c:2262 #34 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fee0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #35 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fee0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #36 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fee0>) at Objects/object.c:2262 #37 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fe80) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #38 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fe80) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #39 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fe80>) at Objects/object.c:2262 #40 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fe20) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #41 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fe20) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #42 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fe20>) at Objects/object.c:2262 #43 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fdc0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #44 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fdc0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #45 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fdc0>) at Objects/object.c:2262 #46 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fd60) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 #47 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fd60) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561 #48 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fd60>) at Objects/object.c:2262 #49 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fd00) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301 (...) |
|
|
msg284155 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-12-28 07:13 |
Ah, I tested only with non-debug build in which asserts were ignored! In 2.7 Element doesn't support garbage collection, and the trashcan mechanism Py_TRASHCAN_SAFE_BEGIN/Py_TRASHCAN_SAFE_END can't be applied. I see three alternatives: 1. Just revert the changes. Let deep ElementTree crashing. 2. Add the support of garbage collection. This will increase the size of empty Element by 1.5 times. This looks less appropriate that the first option since this harms working code. 3. Try to implement different mechanism. By using external list object as a stack or using other field for creating a linked list. I'll revert the patch (except tests fix) and will try to implement different mechanism. |
|
|
msg284203 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-12-28 22:12 |
This issue seems theorical to me, whereas the breakage of benchmarks is very concrete for me. So I suggest to revert the change in Python 2.7. (2) looks like the right design and it was implemented in Python 3 (no?). I don't think that it's worth it to backport the change to Python 2. You are the first one to report the issue and the backport is risky. |
|
|
msg285365 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-01-13 07:31 |
Changes were reverted by 78bf34b6a713. It is very uneasy to implement an alternative mechanism (without increasing the size of Element objects). It adds duplication of low level garbage collecting code. I think it is better to left all as is in 2.7. Yet one argument for moving to Python 3. |
|
|
msg285367 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-01-13 07:50 |
> Yet one argument for moving to Python 3. Yep, thanks ;-) |
|
|