Issue 17016: _sre: avoid relying on pointer overflow (original) (raw)

Issue17016

Created on 2013-01-22 15:54 by Nickolai.Zeldovich, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pp.patch Nickolai.Zeldovich,2013-01-22 15:54 Proposed patch for _sre.c review
pp-2.patch Nickolai.Zeldovich,2013-03-10 18:54 review
pp-3.patch Nickolai.Zeldovich,2013-03-11 14:38 review
Messages (14)
msg180403 - (view) Author: Nickolai Zeldovich (Nickolai.Zeldovich) * Date: 2013-01-22 15:54
Modules/_sre.c relies on pointer overflow in 5 places to check that the supplied offset does not cause wraparound when added to a base pointer; e.g.: SRE_CODE prefix_len; GET_ARG; prefix_len = arg; GET_ARG; /* Here comes the prefix string */ if (code+prefix_len < code | code+prefix_len > newcode) FAIL; however, pointer wraparound is undefined behavior in C, and gcc will optimize away (code+prefix_len < code) to (true), since prefix_len is an unsigned value. This will happen with -O2 and even with -fwrapv: nickolai@sahara:/tmp$ cat x.c void bar(); void foo(int *p, unsigned int x) { if (p + x < p) bar(); } nickolai@sahara:/tmp$ gcc x.c -S -o - -O2 -fwrapv ... foo: .LFB0: .cfi_startproc rep ret .cfi_endproc ... nickolai@sahara:/tmp$ On a 32-bit platform with the development version of cpython, prefix_len seems to end up being an 'unsigned int', so I suspect that supplying a large prefix_len value (perhaps 0xffffffff) could lead to the subsequent loop writing garbage all over memory, or worse (but I have not tried to construct a concrete input that triggers this bug, so maybe there are some checks that make it difficult to trigger the bug). In any case, this might be worth fixing -- the attached patch provides one proposed fix. Another option might be to add -fno-strict-overflow to the gcc flags, which may be a reasonable additional measure to take, to avoid such problems biting Python in the future, but I would suggest doing this in addition to fixing the code (since not all compilers support such a flag to disable certain optimizations).
msg180411 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-22 16:50
LGTM. There are other doubtful places, at lines: 658, 678, 1000, 1084, 2777, 3111.
msg180443 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2013-01-23 03:03
Lines 1000 and 1084 will be a problem only if you're near the top of the address space. This is because: 1. ctx->pattern[1] will always be <= ctx->pattern[2]. 2. A value of 65535 in ctx->pattern[2] means unlimited, even though SRE_CODE is now UCS4. See also issue #13169. If the 'unlimited' value is raised then fixing those lines will become more urgent.
msg180473 - (view) Author: Nickolai Zeldovich (Nickolai.Zeldovich) * Date: 2013-01-23 16:56
Lines 2777 and 3111 do indeed look suspect, because gcc can compile (ptr + offset < ptr) into (offset < 0): nickolai@sahara:/tmp$ cat x.c void bar(); void foo(char* ptr, int offset) { if (ptr + offset < ptr) bar(); } nickolai@sahara:/tmp$ gcc x.c -S -o - -O2 ... foo: .LFB0: .cfi_startproc testl %esi, %esi js .L4 rep ret .p2align 4,,10 .p2align 3 .L4: xorl %eax, %eax jmp bar .cfi_endproc ... nickolai@sahara:/tmp$ Lines 658, 678, 1000, 1084 are potentially problematic -- I don't know of current compilers that will do something unexpected, but it might be worth rewriting the code to avoid undefined behavior anyway.
msg180484 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2013-01-23 18:13
You're checking "int offset", but what happens with "unsigned int offset"?
msg180485 - (view) Author: Nickolai Zeldovich (Nickolai.Zeldovich) * Date: 2013-01-23 18:14
For an unsigned int offset, see my original bug report: gcc eliminates the check altogether, since offset >= 0 by definition.
msg182231 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-16 15:49
Nickolai, are you want to update your patch with fixes for other possible pointer overflows? Note, that the maximal repetition number has been increased now.
msg183891 - (view) Author: Nickolai Zeldovich (Nickolai.Zeldovich) * Date: 2013-03-10 18:54
Sorry for the delay. Attached is an updated patch that should fix all of the issues mentioned in this bug report.
msg183959 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-03-11 13:22
Nickolai, can you please submit a contributor form? http://python.org/psf/contrib/contrib-form/ http://python.org/psf/contrib/
msg183960 - (view) Author: Nickolai Zeldovich (Nickolai.Zeldovich) * Date: 2013-03-11 13:31
I just submitted the contributor form -- thanks for the reminder.
msg183965 - (view) Author: Nickolai Zeldovich (Nickolai.Zeldovich) * Date: 2013-03-11 14:38
I get an HTTP error when trying to upload another patch through Rietveld, so here's a revised patch that avoids the need for Py_uintptr_t (thanks Serhiy).
msg184184 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-03-14 19:51
Of course it would be nice to have the tests for so much cases as possible, but I am afraid that it will not be easy. The patch LGTM.
msg186778 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-04-13 18:18
New changeset 27162465316f by Serhiy Storchaka in branch '2.7': Issue #17016: Get rid of possible pointer wraparounds and integer overflows http://hg.python.org/cpython/rev/27162465316f New changeset 2673d207c524 by Serhiy Storchaka in branch '3.3': Issue #17016: Get rid of possible pointer wraparounds and integer overflows http://hg.python.org/cpython/rev/2673d207c524 New changeset f280786d0e64 by Serhiy Storchaka in branch 'default': Issue #17016: Get rid of possible pointer wraparounds and integer overflows http://hg.python.org/cpython/rev/f280786d0e64
msg186840 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-13 20:49
Thank you, Nickolai, for the patch.
History
Date User Action Args
2022-04-11 14:57:40 admin set github: 61218
2013-10-31 15:03:20 vstinner set nosy: + vstinner
2013-04-13 20:49:35 serhiy.storchaka set messages: +
2013-04-13 20:48:30 serhiy.storchaka set status: open -> closedstage: patch review -> resolvedresolution: fixedcomponents: + Extension Modulesversions: - Python 3.2
2013-04-13 18🔞52 python-dev set nosy: + python-devmessages: +
2013-03-14 19:51:09 serhiy.storchaka set messages: +
2013-03-11 14:38:42 Nickolai.Zeldovich set files: + pp-3.patchmessages: +
2013-03-11 13:31:44 Nickolai.Zeldovich set messages: +
2013-03-11 13:22:56 serhiy.storchaka set messages: +
2013-03-10 18:54:30 Nickolai.Zeldovich set files: + pp-2.patchmessages: +
2013-02-16 15:49:44 serhiy.storchaka set assignee: serhiy.storchakamessages: +
2013-01-23 18:14:41 Nickolai.Zeldovich set messages: +
2013-01-23 18:13:47 mrabarnett set messages: +
2013-01-23 16:56:35 Nickolai.Zeldovich set messages: +
2013-01-23 03:03:08 mrabarnett set messages: +
2013-01-22 16:50:43 serhiy.storchaka set messages: +
2013-01-22 15:56:51 ezio.melotti set nosy: + ezio.melotti, mark.dickinson, serhiy.storchaka, mrabarnettstage: patch reviewcomponents: + Regular Expressions, - Noneversions: + Python 2.7, Python 3.2, Python 3.3, Python 3.4, - Python 3.5
2013-01-22 15:54:28 Nickolai.Zeldovich create