Issue 18427: str.replace causes segfault for long strings (original) (raw)

Created on 2013-07-11 09:40 by tobiasmarschall, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue-18427-py27.txt ronaldoussoren,2013-07-11 10:50
Messages (12)
msg192855 - (view) Author: Tobias Marschall (tobiasmarschall) Date: 2013-07-11 09:40
Running the following two lines of code causes a segfault: s = 'A' * 3142044943 t = s.replace('\n','') My setup: Python 2.7.5 (default, Jul 11 2013, 11:17:50) [GCC 4.6.3] on linux2 Linux 3.2.0-45-generic #70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux I could reproduce this behavior on Python 2.6.8 and 2.5.6. Please let me know if I can provide additional information.
msg192858 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-11 10:20
#0 memchr () at ../sysdeps/x86_64/memchr.S:155 #1 0x0000000000467ca4 in countchar (maxcount=, c=10 '\n', target_len=-1152922353, target=0x7fff3b196034 'A' <repeats 200 times>...) at Objects/stringobject.c:2341 #2 replace_delete_single_character (maxcount=, from_c=10 '\n', self=0x7fff3b196010) at Objects/stringobject.c:2427 #3 replace (maxcount=, to_len=, to_s=, from_len=1, from_s=, self=0x7fff3b196010) at Objects/stringobject.c:2780 #4 string_replace (self=0x7fff3b196010, args=) at Objects/stringobject.c:2854 target_len=-1152922353 looks fishy. I think countchar()'s target_len argument has a wrong argument type. It should be Py_ssize_t instead of int. -countchar(const char *target, int target_len, char c, Py_ssize_t maxcount) +countchar(const char *target, Py_ssize_t target_len, char c, Py_ssize_t maxcount)
msg192859 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-11 10:21
PS: The test case no longer segfaults with Py_ssize_t.
msg192861 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 10:39
I agree: target_len should have type Py_ssize_t. It's probably worthwhile to check other functions as well.
msg192862 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 10:45
There is also this one in string_print: fwrite(data, 1, (int)size, fp); "size" is a Py_ssize_t variable with the length of the string, and is casted to int which looses information. The exected argument to fwrite is size_t... These two appear to be the only suspect uses of 'int' in stringobject.h.
msg192863 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 10:50
The attached patch should fix both issues (but doesn't contain a test, not sure if its worthwhile to add a testcase that uses more than 4 GByte of memory) FWIW: the corresponding code in Objects/bytesobject.c in the 3.3 and default branches is already correct.
msg192865 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-11 11:05
LGTM!
msg192866 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 11:16
Great. I'll apply after running the full testsuite, and after rebooting my machine as it doesn't really like using that much additional memory and pushed some applications to disk :-(
msg192867 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-11 11:35
New changeset 2921f6c2009e by Ronald Oussoren in branch '2.7': Issue #18427: str.replace could crash the interpreter with huge strings. http://hg.python.org/cpython/rev/2921f6c2009e
msg192892 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-07-11 19:47
I think NEWS entry for this issue should be in the "Core and Builtins" section.
msg193117 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-15 16:35
New changeset 06d9f106c57e by Ronald Oussoren in branch '2.7': Move entry from #18427 to the right section in the NEWS file http://hg.python.org/cpython/rev/06d9f106c57e
msg193118 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-15 16:36
Serhiy: you're right. I've moved the message to the correct section.
History
Date User Action Args
2022-04-11 14:57:47 admin set github: 62627
2013-07-15 16:36:35 ronaldoussoren set messages: +
2013-07-15 16:35:57 python-dev set messages: +
2013-07-11 19:47:24 serhiy.storchaka set messages: +
2013-07-11 11:35:53 ronaldoussoren set status: open -> closedresolution: fixedstage: patch review -> resolved
2013-07-11 11:35:28 python-dev set nosy: + python-devmessages: +
2013-07-11 11:16:52 ronaldoussoren set messages: +
2013-07-11 11:05:31 christian.heimes set messages: +
2013-07-11 10:50:11 ronaldoussoren set keywords: + patch, needs reviewfiles: + issue-18427-py27.txtmessages: + stage: patch review
2013-07-11 10:45:39 ronaldoussoren set messages: +
2013-07-11 10:39:30 ronaldoussoren set nosy: + ronaldoussorenmessages: +
2013-07-11 10:21:23 christian.heimes set messages: +
2013-07-11 10:20:30 christian.heimes set nosy: + christian.heimesmessages: +
2013-07-11 10:06:09 serhiy.storchaka set nosy: + serhiy.storchaka
2013-07-11 09:40:39 tobiasmarschall create