Issue 29816: Get rid of C limitation for shift count in right shift

Created on 2017-03-15 08:55 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
patchDraft1.diff Oren Milman, 2017-03-15 21:00 A patch draft for reference only. Also handles big positive ints.
long-shift-overflow-long-long.diff serhiy.storchaka, 2017-03-20 19:02
long-shift-overflow-divrem1.diff serhiy.storchaka, 2017-03-20 19:02
Pull Requests
URL Status Linked
PR 680 merged serhiy.storchaka, 2017-03-15 20:25
PR 1258 merged serhiy.storchaka, 2017-04-22 17:50
Messages (19)
msg289650 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 08:55
Currently the value of the right operand of the right shift operator is limited by the C Py_ssize_t type.

>>> 1 >> 10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> (-1) >> 10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> 1 >> -10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> (-1) >> -10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t

But this is an artificial limitation. Right shift can be extended to support arbitrary integers. `x >> very_large_value` should be 0 for non-negative x and -1 for negative x. `x >> negative_value` should raise ValueError.

>>> 1 >> 10
0
>>> (-1) >> 10
-1
>>> 1 >> -10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: negative shift count
>>> (-1) >> -10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: negative shift count
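Editor's sketch: the semantics proposed in this message can be written as a small pure-Python reference function. The name `rshift_reference` is hypothetical and this is not the code from the patch; it only illustrates the rule that a shift count at least as large as the bit length of x leaves 0 for non-negative x and -1 for negative x.

```python
def rshift_reference(x: int, count: int) -> int:
    """Reference semantics for x >> count with an arbitrarily large count."""
    if count < 0:
        raise ValueError("negative shift count")
    # Once count reaches the bit length of x, every significant bit has been
    # shifted out; only the sign remains (floor division semantics).
    if count >= x.bit_length():
        return -1 if x < 0 else 0
    # Here count is smaller than x.bit_length(), so it is a small number
    # and the built-in operator can be used directly.
    return x >> count


print(rshift_reference(1, 10**100))
print(rshift_reference(-1, 10**100))
```

The early return means the huge count is only ever compared, never converted to a machine-sized integer.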
msg289651 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-03-15 08:57
If we change something, I suggest to be consistent with lshift. I expect a memory error on "1 << (1 << 1024)" (no unlimited loop before a global system collapse please ;-))
msg289652 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-03-15 09:00
FYI I saw recently that the C limitation of len() was reported in the "owasp-pysec" project: https://github.com/ebranca/owasp-pysec/wiki/Overflow-in-len-function I don't understand why such a "deliberate" limitation was reported in a hardened CPython project.
msg289654 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 09:19
> If we change something, I suggest to be consistent with lshift. I expect a memory error on "1 << (1 << 1024)" (no unlimited loop before a global system collapse please ;-))

I agree that left shift should raise a ValueError rather than an OverflowError for large negative shifts. But it is hard to handle large positive shifts. `1 << count` consumes `count*2/15` bytes of memory. There is a gap between the maximal number of bits representable as Py_ssize_t (PY_SSIZE_T_MAX) and the number of bits of the maximal Python int (PY_SSIZE_T_MAX*15/2). _PyLong_NumBits() suffers from the same issue. I think an OverflowError is appropriate here for denoting the platform and implementation limitation.
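Editor's sketch: the arithmetic in this message can be checked with a few lines of Python. The figures `count*2/15` and `PY_SSIZE_T_MAX*15/2` assume 15-bit digits stored in 2 bytes each; `sys.int_info` reports the values for the running build. The function names are hypothetical, and the estimate ignores the object header.

```python
import sys


def estimated_digit_bytes(count, bits_per_digit=sys.int_info.bits_per_digit,
                          sizeof_digit=sys.int_info.sizeof_digit):
    """Approximate bytes of digit storage needed to hold 1 << count."""
    # 1 << count has count + 1 bits, which need this many digits.
    digits = count // bits_per_digit + 1
    return digits * sizeof_digit


def max_int_bits(max_bytes=sys.maxsize,
                 bits_per_digit=sys.int_info.bits_per_digit,
                 sizeof_digit=sys.int_info.sizeof_digit):
    """Bits in the largest int whose digits fit in max_bytes bytes."""
    return (max_bytes // sizeof_digit) * bits_per_digit


# With 15-bit digits on a 32-bit platform, the largest int has far more
# bits than a Py_ssize_t shift count can express.
ssize_t_max_32 = 2**31 - 1
print(max_int_bits(ssize_t_max_32, 15, 2) > ssize_t_max_32)
```

This is the gap the message describes: a bit count can exceed PY_SSIZE_T_MAX even though the integer itself still fits in memory addressable by Py_ssize_t.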
msg289658 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 09:52
This may be a part of this issue or a separate issue: bytes(-1) raises a ValueError, but bytes(-10**100) raises an OverflowError.
msg289660 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-03-15 10:03
> I think an OverflowError is appropriate here for denoting the platform and implementation limitation.

It's common that integer overflow on memory allocation in C code raises a MemoryError, not an OverflowError.

>>> "x" * (2**60)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

I suggest to raise a MemoryError.
msg289662 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 10:17
This is not a MemoryError. On a 32-bit platform `1 << (sys.maxsize + 1)` raises an OverflowError, but `1 << sys.maxsize << 1` can be calculated.
msg289692 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-15 20:30
Unfortunately it is hard to totally avoid OverflowError in right shift. A right shift of a huge positive value can give a non-zero result even if the shift count is larger than PY_SSIZE_T_MAX. PR 680 just decreases the opportunity of getting an OverflowError.
msg289697 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-03-15 21:00
I played a little with a patch earlier today, but stopped because I am short on time. Anyway, just in case my code is not total rubbish, I attach my patch draft, which should also avoid OverflowError for big positive ints. (Of course, I don't suggest using my code instead of PR 680. I just put it here in case it might be useful to someone.) (On my Windows 10 machine, it passed some manual tests of mine and the test module, except for test_venv, which also fails without the patch.)
msg289751 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-17 09:36
Thank you, Oren, but your code doesn't work when PY_SSIZE_T_MAX < b < PY_SSIZE_T_MAX * PyLong_SHIFT and a > 2 ** b. When you drop wordshift and leave only loshift_d, you should also drop the lower wordshift digits of a. The code for left shift would be even more complex.
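Editor's sketch: the point about dropping the lower wordshift digits can be illustrated with a digit-level model of right shift in Python. This is not the code from either patch; `SHIFT = 15` stands in for PyLong_SHIFT, the helper names are hypothetical, and only non-negative a is handled. The shift count b is split into a whole-digit part (wordshift) and a within-digit part (remshift); the wordshift lowest digits are discarded first, then the remaining digits are shifted by remshift bits.

```python
SHIFT = 15                 # assumed digit width, standing in for PyLong_SHIFT
MASK = (1 << SHIFT) - 1


def to_digits(n):
    """Little-endian base-2**SHIFT digits of a non-negative int."""
    digits = []
    while n:
        digits.append(n & MASK)
        n >>= SHIFT
    return digits


def from_digits(digits):
    """Inverse of to_digits."""
    n = 0
    for d in reversed(digits):
        n = (n << SHIFT) | d
    return n


def rshift_nonneg(a, b):
    """Right shift of non-negative a by b, working digit by digit."""
    if b < 0:
        raise ValueError("negative shift count")
    wordshift, remshift = divmod(b, SHIFT)
    digits = to_digits(a)
    if wordshift >= len(digits):
        return 0                      # every digit is shifted out
    kept = digits[wordshift:]         # drop the lower wordshift digits
    out = []
    for i, d in enumerate(kept):
        lo = d >> remshift
        # Bits carried down from the next higher digit, if there is one.
        nxt = kept[i + 1] if i + 1 < len(kept) else 0
        hi = (nxt << (SHIFT - remshift)) & MASK
        out.append(lo | hi)
    return from_digits(out)


print(rshift_nonneg(2**100 + 12345, 37) == (2**100 + 12345) >> 37)
```

Because wordshift is compared against the digit count before it is used as an index, the model accepts a shift count of any size.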
msg289767 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-17 16:06
Updated PR. Now OverflowError is never raised if the result is representable. Mark, could you please make a review?
msg289878 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-03-20 08:53
> Mark, could you please make a review?

I'll try to find time this week. At least in principle, the change sounds good to me.
msg289898 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-20 19:01
Here are two patches. The first uses C long long arithmetic (it corresponds to the current PR 680), the second uses PyLong arithmetic. Which is easier to read and verify?
msg289984 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-03-22 14:04
I much prefer the `divrem1`-based version: it makes fewer assumptions about relative sizes of long / long long / size_t and about the number of bits per digit. I'd rather not have another place that would have to be carefully examined in the future if the number of bits per digit changed again. Overall, Objects/longobject.c is highly portable, and I'd like to keep it that way as much as possible.
msg290011 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-22 19:35
Updated the PR to the divrem1-based version. The drawback is that divrem1 can fail with MemoryError, while C long long arithmetic always works for integers smaller than 1 exbibyte.
msg290012 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-22 19:52
The special case would not be needed if Python ints on 32-bit platforms were limited to approximately 2**2**28. int.bit_length() could be simpler too.
msg290824 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-30 06:47
New changeset 918403cfc3304d27e80fb792357f40bb3ba69c4e by Serhiy Storchaka in branch 'master': bpo-29816: Shift operation now has less opportunity to raise OverflowError. (#680) https://github.com/python/cpython/commit/918403cfc3304d27e80fb792357f40bb3ba69c4e
msg290826 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-30 07:00
Thank you for your review Mark.
msg292133 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-22 18:50
New changeset 997a4adea606069e01beac6269920709db3994d1 by Serhiy Storchaka in branch 'master': Remove outdated note about constraining of the bit shift right operand. (#1258) https://github.com/python/cpython/commit/997a4adea606069e01beac6269920709db3994d1
History
Date User Action Args
2022-04-11 14:58:44 admin set github: 74002
2017-04-22 18:50:11 serhiy.storchaka set messages: +
2017-04-22 17:50:19 serhiy.storchaka set pull_requests: + pull_request1371
2017-03-30 07:00:24 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: +; stage: patch review -> resolved
2017-03-30 06:47:09 serhiy.storchaka set messages: +
2017-03-22 19:52:55 serhiy.storchaka set messages: +
2017-03-22 19:35:29 serhiy.storchaka set messages: +
2017-03-22 14:04:56 mark.dickinson set messages: +
2017-03-20 19:02:33 serhiy.storchaka set files: + long-shift-overflow-divrem1.diff
2017-03-20 19:02:18 serhiy.storchaka set files: + long-shift-overflow-long-long.diff
2017-03-20 19:01:21 serhiy.storchaka set messages: +
2017-03-20 08:53:27 mark.dickinson set messages: +
2017-03-17 16:06:38 serhiy.storchaka set messages: +
2017-03-17 09:36:10 serhiy.storchaka set messages: +
2017-03-17 08:36:16 serhiy.storchaka link issue29833 dependencies
2017-03-15 21:00:49 Oren Milman set files: + patchDraft1.diff; keywords: + patch; messages: +
2017-03-15 20:30:52 serhiy.storchaka set messages: +; stage: needs patch -> patch review
2017-03-15 20:25:42 serhiy.storchaka set pull_requests: + pull_request557
2017-03-15 10:17:07 serhiy.storchaka set messages: +
2017-03-15 10:03:27 vstinner set messages: +
2017-03-15 09:52:04 serhiy.storchaka set messages: +
2017-03-15 09:19:13 serhiy.storchaka set messages: +
2017-03-15 09:06:14 serhiy.storchaka link issue15988 dependencies
2017-03-15 09:00:29 vstinner set messages: +
2017-03-15 08:57:40 vstinner set nosy: + vstinner; messages: +
2017-03-15 08:55:17 serhiy.storchaka create