Issue 1526585: Concatenation on a long string breaks

Created on 2006-07-21 17:18 by exarkun, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name        Uploaded                     Description
big-str.diff     nnorwitz, 2006-07-25 03:46   v1
big-str-2.diff   arigo, 2006-07-25 18:25      Fix namespace corruption problem too.
Messages (16)
msg29234 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2006-07-21 17:18
Consider this transcript:

exarkun@charm:~/Projects/python/trunk$ ./python
Python 2.5b2 (trunk:50698, Jul 18 2006, 10:08:36)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 'x' * (2 ** 31 - 1)
>>> x = x + 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: Objects/stringobject.c:4103: bad argument to internal function
>>> len(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>>

I would expect some exception other than SystemError, and for the locals namespace to not become corrupted.
msg29235 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2006-07-22 09:00
Confirmed with 2.4. Ouch.
msg29236 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-07-22 14:11
Part of the problem appears to be that ceval.c's string_concatenate() doesn't check for overflow in the second argument here:

    if (_PyString_Resize(&v, v_len + w_len) != 0)

The Windows malloc on my box returns NULL (and so Python raises MemoryError) on the initial

    x = 'x' * (2 ** 31 - 1)

attempt, so I never get that far. I'm afraid this is muddier in strange ways on Linux, because the Linux malloc() is pathologically optimistic (it can return a non-NULL value pointing at "memory" that can't actually be used for anything).
msg29237 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-07-25 03:46
Attached a patch against 2.5. The patch should work against 2.4 too, but you will need to change all Py_ssize_t to int.

Tim, since we know both lens are >= 0, is this test sufficient for verifying overflow?

    Py_ssize_t new_len = v_len + w_len;
    if (new_len < 0) {

Jp, can you test the patch?
msg29238 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-07-25 03:57
[Neal]
> Tim since we know both lens are >= 0, is this test
> sufficient for verifying overflow:
>
>     Py_ssize_t new_len = v_len + w_len;
>     if (new_len < 0) {

Right! That's all the new checking needed in string_concatenate().
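(For illustration: a minimal, standalone sketch of the check being discussed. ptrdiff_t stands in for Py_ssize_t and checked_concat_len is an illustrative name, not the CPython function; the check discussed above simply tests new_len < 0 after the addition, while the variant below avoids relying on signed wraparound.)

    /* Standalone sketch of the overflow check; not the actual ceval.c code. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Returns -1 when v_len + w_len cannot be represented. */
    static ptrdiff_t
    checked_concat_len(ptrdiff_t v_len, ptrdiff_t w_len)
    {
        /* Both operands are string lengths, hence >= 0, so overflow can
         * be detected up front instead of relying on signed wraparound
         * (which is undefined behaviour in C). */
        if (v_len > PTRDIFF_MAX - w_len)
            return -1;
        return v_len + w_len;
    }

    int main(void)
    {
        printf("%td\n", checked_concat_len(3, 4));            /* 7  */
        printf("%td\n", checked_concat_len(PTRDIFF_MAX, 1));  /* -1 */
        return 0;
    }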
msg29239 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-07-25 18:25
The check should be done before the variable is removed from the namespace, so that 'x' doesn't just disappear. Attached a patch that does this. Also, let's produce an exception consistent with what PyString_Concat() raises in the same case, which is an OverflowError instead of a MemoryError.
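(A toy, self-contained model of the ordering described here; the str type, inplace_concat, and the use of realloc are illustrative stand-ins, not the real ceval.c/stringobject.c code. The point is that everything that can fail, including the size check, runs before the target binding is disturbed, so a failing `x = x + 'x'` leaves `x` bound to its old value.)

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct { char *buf; ptrdiff_t len; } str;

    /* `slot` plays the role of the namespace entry that `x = x + 'x'` rebinds. */
    static int
    inplace_concat(str **slot, const char *w, ptrdiff_t w_len)
    {
        str *v = *slot;

        /* Overflow check first; in the buggy ordering the binding had
         * already been cleared when this failed, which is why `x`
         * vanished in the original report. */
        if (v->len > PTRDIFF_MAX - w_len)
            return -1;                  /* caller raises OverflowError */

        char *grown = realloc(v->buf, (size_t)(v->len + w_len) + 1);
        if (grown == NULL)
            return -1;                  /* binding still untouched */

        memcpy(grown + v->len, w, (size_t)w_len + 1);
        v->buf = grown;
        v->len += w_len;
        *slot = v;                      /* rebind only on success */
        return 0;
    }

    int main(void)
    {
        str x;
        x.len = 1;
        x.buf = malloc(2);
        memcpy(x.buf, "x", 2);

        str *binding = &x;
        int rc = inplace_concat(&binding, "x", 1);
        printf("rc=%d value=%s bound=%d\n",
               rc, binding ? binding->buf : "(unbound)", binding != NULL);
        free(x.buf);
        return 0;
    }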
msg29240 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2006-07-25 20:06
Tested Armin's patch; it looks like it addresses the issue. One thing I noticed, though: ('x' * (2 ** 32 - 2) + 'x') fails the same way as ('x' * (2 ** 32 - 1) + 'x'). I don't really understand why, just thought I'd mention it.
msg29241 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-07-26 04:19
Armin, can you check in your patch and backport? Also a news entry and a test in test_bigmem would be great. Thanks. If not, let me know (assign to me) and I'll finish this up. This fix should be backported as well.
msg29242 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-07-26 09:01
I'm unsure about how the bigmem tests should be used. I think that I am not supposed to set a >2G limit on a 32-bit machine, right? When I do, I get 9 failures: 8 OverflowErrors and a stranger AssertionError in test_hash. I think that these tests are meant to test the int/Py_ssize_t difference on 64-bit machines instead. The bug this tracker is about only shows up on 32-bit machines.

My concern is that if we add a test for the current bug in test_bigmem, then with a memory limit < 2GB the test is essentially skipped, and with a memory limit > 2GB then 9 other tests fail anyway.
msg29243 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-07-26 15:50
You're correct that bigmem is primarily for testing int/Py_ssize_t. But it doesn't have to be. It has support for machines with largish amounts of memory (and limiting test runs). I didn't know where else to put such a test. I agree that this bug would only occur on 32-bit platforms. Most machines can't run it, so about the only other option I can think of would be to put it in its own file and add a -u option. That seemed like even more work. I'm not tied to bigmem at all, but it would be nice to have a test for this somewhere. I'm sure there are a bunch of other places where we have this sort of overflow, and it would be good to test them somewhere. Do whatever you think is best.
msg29244 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-07-27 08:38
We could reuse the --memlimit option of regrtest in the following way:

At the moment it makes no sense to specify a --memlimit larger than Py_ssize_t, like 3GB on 32-bit systems. At least test_bigmem fails completely in this case. From this it seems that the --memlimit actually tells, more precisely, how much of its *address space* the Python test process is allowed to consume. So the value should be clamped to a maximum of MAX(Py_ssize_t). This would solve the current test_bigmem issue.

If we do so, then the condition "--memlimit >= MAX(Py_ssize_t)" is precisely what should be checked to know whether we can run the test for the bug in the present tracker, and other tests of the same kind, which check what occurs when the *address space* is exhausted. In this way, specifying --memlimit=3G would enable either test_bigmem (on 64-bit systems) or some new test_filladdressspace (on 32-bit systems), as appropriate.

Sounds reasonable?
msg29245 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-07-27 08:48
Almost missed kuran's note. Kuran: I suppose you meant to use 2**31 instead of 2**32, but you've found another important bug:

>>> s = 'x' * (2**32-2)
>>> N = len(s)
>>> N
2147483647
>>> 2**32
4294967296L

Argh! Another check is missing somewhere.
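(The repetition problem itself was moved to another tracker item, as the closing message below notes, but the missing guard is of the same family. A standalone sketch of the kind of check repetition needs, with illustrative names and ptrdiff_t standing in for Py_ssize_t: both the requested count and count * item_size must fit the size type.)

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Returns -1 when `count` repetitions of an `item_size`-byte string
     * cannot be represented as a ptrdiff_t; whether a given count fits
     * is build-dependent (32- vs 64-bit). */
    static ptrdiff_t
    checked_repeat_size(ptrdiff_t item_size, long long count)
    {
        if (count < 0)
            count = 0;
        if (count > (long long)PTRDIFF_MAX)
            return -1;                  /* count itself does not fit */
        if (item_size != 0 && (ptrdiff_t)count > PTRDIFF_MAX / item_size)
            return -1;                  /* total size overflows */
        return item_size * (ptrdiff_t)count;
    }

    int main(void)
    {
        /* 'x' * (2**32 - 2): rejected on a 32-bit build, fine on 64-bit. */
        printf("%td\n", checked_repeat_size(1, (1LL << 32) - 2));
        /* A total size that overflows on any build. */
        printf("%td\n", checked_repeat_size(2, (long long)PTRDIFF_MAX));
        return 0;
    }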
msg29246 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-08-04 05:27
Armin, yes that sounds reasonable. Please checkin as soon as possible now that the trunk is not frozen.
msg29247 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-08-08 09:24
I was away. I will try to get around to it before release candidate one.
msg29248 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-08-09 15:39
Committed in rev 51178. Closing this report; the repetition problem is in another tracker and is mentioned in PEP 356 (the 2.5 release schedule).
msg29249 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-08-09 15:42
Committed in rev 51178. Closing this report; the repetition problem is in another tracker and is mentioned in PEP 356 (the 2.5 release schedule).
History
Date User Action Args
2022-04-11 14:56:19 admin set github: 43706
2006-07-21 17:18:58 exarkun create