msg331604 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-12-11 10:54 |
There is an issue with using PyDict_GetItem(). Since it silences all exceptions, it can return an incorrect result when an exception such as MemoryError or KeyboardInterrupt is raised in a user-defined __hash__() or __eq__(). In addition, PyDict_GetItemString() and _PyDict_GetItemId() swallow a MemoryError raised when they fail to allocate a temporary string object. Also, PyDict_GetItemWithError() is a tiny bit faster than PyDict_GetItem(), because it avoids checking the exception state in the successful case. The proposed PR replaces most calls of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() with calls of PyDict_GetItemWithError(), _PyDict_GetItemStringWithError() and _PyDict_GetItemIdWithError(). |
|
|
msg331605 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-12-11 10:56 |
My previous attempt: bpo-20615. |
|
|
msg331621 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-12-11 13:31 |
Opened for documenting flaws of PyDict_GetItem(). |
|
|
msg331744 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-12-13 07:13 |
Most of the changes are straightforward: PyDict_GetItem*() was replaced with PyDict_GetItem*WithError() and a check for PyErr_Occurred() was added. PyDict_GetItemString() with a constant argument was replaced with _PyDict_GetItemIdWithError() for performance. Some code was left unchanged. This was mostly in files where errors are very unlikely and error checking is not performed, or where errors are silenced in any case (Python/compile.c, Python/symtable.c, Objects/structseq.c, etc). These cases need separate issues. The most non-trivial change is in Objects/typeobject.c. The check for duplicated descriptors (in add_methods(), add_members() and add_getset()) was moved after creating the descriptor object. This improves performance by avoiding the creation of temporary string objects. Duplicate descriptor names are a very uncommon case -- there were only two cases in the stdlib (besides tests), and one of them is already fixed (PR 11053). |
|
|
msg335541 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-02-14 16:37 |
Eric asked me to run the benchmark suite. Here are the results. I do not know how to interpret them. Likely all differences are random. |
|
|
msg335651 - (view) |
Author: Eric Snow (eric.snow) *  |
Date: 2019-02-15 22:20 |
Thanks, Serhiy. While the benchmark suite is our best tool available for measuring performance, I'm not sure what slowdown is significant in those results. @Victor, any thoughts? |
|
|
msg336536 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-02-25 15:59 |
New changeset a24107b04c1277e3c1105f98aff5bfa3a98b33a0 by Serhiy Storchaka in branch 'master': bpo-35459: Use PyDict_GetItemWithError() instead of PyDict_GetItem(). (GH-11112) https://github.com/python/cpython/commit/a24107b04c1277e3c1105f98aff5bfa3a98b33a0 |
|
|
msg336563 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-02-25 21:47 |
It seems like the change introduced a regression: bpo-36110. |
|
|
msg336647 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-02-26 11:21 |
> It seems like the change introduced a regression: bpo-36110.

While the test has been fixed, IMHO we need to better document this subtle behavior change since it can be surprising. The test failed because PyObject_GetAttr() no longer ignores exceptions when getting the attribute from the type dictionary. It's a significant change compared to Python 3.7. Right now, the change doesn't even have a NEWS entry, even though it's a backward incompatible change! Don't get me wrong: the change is correct, we don't want to ignore arbitrary exceptions, it's just a matter of documentation. |
|
|
msg336686 - (view) |
Author: Josh Rosenberg (josh.r) *  |
Date: 2019-02-26 15:51 |
#36110 was closed as a duplicate; the superseder is #36109 (which has been fixed). The change should still be documented, just in case anyone gets bitten by it. |
|
|
msg339490 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2019-04-05 10:13 |
Serhiy, can this issue be closed? |
|
|
msg339616 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-04-08 11:34 |
New changeset 7a0630c530121725136526a88c49589b54da6492 by Serhiy Storchaka in branch 'master': Add a What's New entry for bpo-35459. (GH-12706) https://github.com/python/cpython/commit/7a0630c530121725136526a88c49589b54da6492 |
|
|
msg339830 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-04-10 07:31 |
There are a few occurrences of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() in cases where they are unlikely to fail. These cases will be considered in separate issues. |
|
|
msg355667 - (view) |
Author: Raphaël M (raphaelm) |
Date: 2019-10-29 18:39 |
I stumbled upon this issue while reading Python 3.8 and this made me curious. I've tried writing some Python code to reproduce this bug, but I'm unable to -- I must be missing something. Is there a simple snippet showing the issue? Also, the changelog states that "The CPython interpreter can swallow exceptions in some circumstances". Are there other documented cases of those circumstances? |
|
|
msg359480 - (view) |
Author: Raphaël M (raphaelm) |
Date: 2020-01-06 22:58 |
Any pointer would also be welcome, and a piece of code showing the bug would help me a lot. Thank you. |
|
|
msg359489 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2020-01-07 05:38 |
> I've tried writing some Python code to reproduce this bug, but I'm unable to -- I must be missing something. Is there a simple snippet showing the issue?

Note that this is a very old bug. It survived so long because it cannot happen in regular cases, so it is difficult to reproduce.

See PR 11112: the _csv module was changed to use PyDict_GetItemWithError(). Let's try it on Python 3.7.

Python 3.7.6 (default, Dec 30 2019, 19:38:28)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class S(str):
...     def __hash__(self):
...         raise MemoryError
...
>>> import _csv
>>> _csv.Dialect(S("excel"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_csv.Error: unknown dialect

You can see the MemoryError is suppressed. Let's try it on Python 3.8.

$ python3
Python 3.8.1 (default, Jan 6 2020, 16:02:33)
(snip)
>>> _csv.Dialect(S("excel"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __hash__
MemoryError

You can see the MemoryError is not suppressed. |
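The interactive session above can also be condensed into a standalone script. This is a minimal sketch that assumes CPython (the private `_csv` module is an implementation detail); the `probe()` helper and its return strings are hypothetical names introduced only for illustration.

```python
import sys

import _csv  # CPython's C implementation behind the csv module (assumed present)


class S(str):
    """A str subclass whose __hash__() always fails, as in the session above."""

    def __hash__(self):
        raise MemoryError


def probe():
    """Report how the dialect lookup treats the failing __hash__()."""
    try:
        _csv.Dialect(S("excel"))
    except MemoryError:
        # Behaviour shown above for Python 3.8: the real error reaches the caller.
        return "propagated"
    except _csv.Error:
        # Behaviour shown above for Python 3.7: MemoryError was replaced
        # by "unknown dialect".
        return "suppressed"
    return "no error"


outcome = probe()
print("Python %d.%d: MemoryError %s" % (sys.version_info[0], sys.version_info[1], outcome))
```

Running the same file under 3.7 and 3.8 should show the two behaviours side by side without retyping the class definition.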
|
|
msg359490 - (view) |
Author: Raphaël M (raphaelm) |
Date: 2020-01-07 06:42 |
Thank you very much! |
|
|