msg302101 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2017-09-13 17:28 |
The "XMLParser.__init__()" method in "_elementtree.c" contains this code: self->handle_start = PyObject_GetAttrString(target, "start"); self->handle_data = PyObject_GetAttrString(target, "data"); self->handle_end = PyObject_GetAttrString(target, "end"); self->handle_comment = PyObject_GetAttrString(target, "comment"); self->handle_pi = PyObject_GetAttrString(target, "pi"); self->handle_close = PyObject_GetAttrString(target, "close"); self->handle_doctype = PyObject_GetAttrString(target, "doctype"); PyErr_Clear(); This ignores all exceptions, not only AttributeError. It also passes live exceptions into the later lookup calls, which may execute arbitrary user code. |
|
|
msg302103 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-09-13 18:20 |
Oh, I see many other issues with C implementation of ElementTree. 1. If XMLParser.__init__ is called twice, it leaks references and the Expat parser. Possible solution: use the Py_XSETREF() macro instead of simple assignment. The Expat parser needs special handling. Other possible solution: drop __init__() and make all initialization in __new__(). But this is a solution only for 3.7 because can break compatibility. 2. If XMLParser.__init__ is not called or if it fails to initialize the Expat parser, self->entity and self->target are NULL. Later in xmlparser_getattro() they are increfed unconditionally. |
|
|
msg302107 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2017-09-13 19:13 |
Feel free to provide a separate pull request. These issues seem independent of the exception handling problem that I wrote a fix for. |
|
|
msg302169 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-09-14 12:11 |
I agree. While these issues are not totally independent (they changes the same code), my issue needs uncommon use of __new__ and __init__, while your issue can be exposed in normal cases. |
|
|
msg302205 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-09-14 20:00 |
New changeset c8d8e15bfc24abeeaaf3d8be9073276b0c011cdf by Serhiy Storchaka (scoder) in branch 'master': bpo-31455: Fix an assertion failure in ElementTree.XMLParser(). (#3545) https://github.com/python/cpython/commit/c8d8e15bfc24abeeaaf3d8be9073276b0c011cdf |
|
|
msg302224 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-09-14 23:13 |
New changeset 49caab46f687eb201898fb6c2c40d47bdcb0e58b by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6': [3.6] bpo-31455: Fix an assertion failure in ElementTree.XMLParser(). (GH-3545) (#3585) https://github.com/python/cpython/commit/49caab46f687eb201898fb6c2c40d47bdcb0e58b |
|
|
msg304324 - (view) |
Author: Oren Milman (Oren Milman) * |
Date: 2017-10-13 08:05 |
Serhiy, in addition to the problems you mentioned with not calling __init__(), it seems that calling every method of an uninitialized XMLParser object would crash. If you don't mind, i would be happy to open an issue to fix these crashes. |
|
|
msg304338 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-10-13 13:51 |
It would be nice. But I see you already have opened for reference leaks. I think that other problems can be solved in the same issue. Do you mind to backport your patch to 2.7 Stefan? If this makes sense. Otherwise I'll just close this issue. Live exceptions don't cause crash in 2.7. |
|
|
msg304383 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2017-10-14 06:58 |
Backport PR for 2.7 added: 3992 |
|
|
msg314358 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-03-24 06:01 |
New changeset 0694b6a651ba2a53f6323ffb3b23358f43885815 by Serhiy Storchaka (scoder) in branch '2.7': bpo-31544: Avoid calling "PyObject_GetAttrString()" (and potentially executing user code) with a live exception set. (GH-3992) https://github.com/python/cpython/commit/0694b6a651ba2a53f6323ffb3b23358f43885815 |
|
|
msg314722 - (view) |
Author: Zachary Ware (zach.ware) *  |
Date: 2018-03-30 21:38 |
This added a refleak on 2.7, see http://buildbot.python.org/all/#/builders/78/builds/122 and later builds. Also, the bug ID in the blurb is incorrect (31544 vs 31455). |
|
|
msg314733 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2018-03-31 08:28 |
Right, Zachary, thanks for noticing. Py2.7 is actually way more different than I thought, and I hadn't paid enough attention to that. Py3 does all of this in "__init__", whereas Py2 essentially implements "__new__" in C, which requires more cleanup. BTW, the implementation in Py3 would also benefit from refactoring the error handling code and moving it all in one place. But it shouldn't suffer from the same kind of problem, at least. |
|
|
msg314735 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2018-03-31 10:13 |
PR 6318 fixes the reference leak for Py2.7. |
|
|
msg314741 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-03-31 12:28 |
New changeset c498cd8bf81fc47acf2f1f702e8b3bc9bd4aed65 by Serhiy Storchaka (scoder) in branch '2.7': bpo-31544: Fix a reference leak to 'self' after the previous target error handling fixes. (GH-6318) https://github.com/python/cpython/commit/c498cd8bf81fc47acf2f1f702e8b3bc9bd4aed65 |
|
|