msg76849 - (view) |
Author: (gumpy) |
Date: 2008-12-03 22:06 |
I'm unsure of the expected behavior in this case, but it seems odd. The bytearray in the following example can be resized to a length of 5-10 bytes without throwing an exception.

Python 3.0rc3 (r30rc3:67312, Dec 3 2008, 10:38:14)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
>>> b = bytearray(b'x' * 10)
>>> v = memoryview(b)
>>> b[:] = b'y' * 11
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BufferError: Existing exports of data: object cannot be re-sized
>>> b[:] = b'y' * 5
>>> b
bytearray(b'yyyyy')
>>> v.tobytes()
b'yyyyy\x00xxxx'
>>> v2 = memoryview(b)
>>> v2.tobytes()
b'yyyyy'
|
|
msg76880 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-04 12:25 |
It's not a memoryview bug, but a bytearray oddity. The bytearray uses a variable-sized buffer underneath, and it tries to minimize the number of reallocations when the object length changes, using some simple heuristics. Therefore, a bytearray has both a logical size (the one seen from Python, e.g. by len()) and a physical size (which can be greater than the logical size because of those heuristics). The bug here is that, when there is a buffer pointing to it, the bytearray only prohibits changing the physical size, not the logical size. This also explains the "\x00" you are witnessing at index 5 when calling tobytes() on the memoryview object: it is the end-of-string marker that was inserted when the logical size changed. |
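To make the expected behavior concrete, here is a minimal sketch of a check one could run on an interpreter that includes the fix. The helper name `try_resize` is made up for illustration. The assumption is that any change to the logical length is refused while a memoryview is exported, and that a same-length assignment is still allowed.

```python
def try_resize(new_len):
    """Try to resize a 10-byte bytearray while a memoryview is exported."""
    b = bytearray(b"x" * 10)
    v = memoryview(b)
    try:
        b[:] = b"y" * new_len
        outcome = "ok"
    except BufferError:
        outcome = "BufferError"
    finally:
        v.release()  # drop the export so the bytearray is resizable again
    return outcome, bytes(b)


# 5 shrinks the logical size, 10 keeps it, 11 grows past the original length.
results = {n: try_resize(n) for n in (5, 10, 11)}
for n, (outcome, data) in results.items():
    print(n, outcome, data)
```

On a fixed interpreter I would expect both the shrinking and the growing assignment to raise BufferError and leave the contents untouched, with only the same-length assignment going through.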
|
|
msg77054 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-05 18:56 |
Please see patch at http://codereview.appspot.com/10049 |
|
|
msg77095 - (view) |
Author: (gumpy) |
Date: 2008-12-06 02:10 |
I found another related bug. In bytes_setslice, when the buffer is resized to a smaller size, a memmove happens regardless of whether the resize is successful or not.

>>> b = bytearray(range(10))
>>> m = memoryview(b)
>>> b[1:8] = b'X'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BufferError: Existing exports of data: object cannot be re-sized
>>> b
bytearray(b'\x00\x01\x08\t\x04\x05\x06\x07\x08\t')

The same problem also applies to bytes_remove:

>>> b
bytearray(b'\x02\x03\x04\x05\x06\x07\x08\t')
>>> b.remove(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BufferError: Existing exports of data: object cannot be re-sized
>>> b
bytearray(b'\x03\x04\x05\x06\x07\x08\t\x00')

There may be other places this can happen, but I haven't checked yet. |
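The property at stake here is that a refused resize should leave the bytearray exactly as it was. A rough regression-style sketch of that check follows. The helper names `mutate_with_export`, `shrink_slice` and `remove_two` are invented for this example, and it assumes an interpreter where memoryview works as a context manager.

```python
def mutate_with_export(initial, mutate):
    """Run `mutate` on a bytearray that has a live memoryview export.

    Returns (raised_buffer_error, contents_unchanged).
    """
    b = bytearray(initial)
    before = bytes(b)
    with memoryview(b):
        try:
            mutate(b)
            raised = False
        except BufferError:
            raised = True
    return raised, bytes(b) == before


def shrink_slice(b):
    b[1:8] = b"X"  # would shrink the bytearray by six bytes


def remove_two(b):
    b.remove(2)  # would shrink the bytearray by one byte


slice_result = mutate_with_export(range(10), shrink_slice)
remove_result = mutate_with_export(range(2, 10), remove_two)
print(slice_result, remove_result)
```

If the memmove still ran before the failed resize, the second element of each result would be False.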
|
|
msg77098 - (view) |
Author: (gumpy) |
Date: 2008-12-06 04:11 |
Sorry, forgot to give this issue a more accurate title earlier. |
|
|
msg77100 - (view) |
Author: (gumpy) |
Date: 2008-12-06 05:25 |
I've found that arrays from the array module have similar issues:

>>> a = array.array('i', range(2))
>>> m = memoryview(a)
>>> bytes(m)
b'\x00\x00\x00\x00\x01\x00\x00\x00'
>>> a.pop(0)
0
>>> bytes(m)
b'\x01\x00\x00\x00\x01\x00\x00\x00'
|
|
msg77119 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-06 12:46 |
> There may be other places this can happen but I haven't checked yet.

PyByteArray_Resize() is called in various places in bytearrayobject.c, and in some of them the underlying storage must be mutated before it is reallocated. The solution would be to have a separate function that checks whether resizing is allowed (although this would not solve the problem when a MemoryError occurs). |
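The "check first, then mutate" idea can be illustrated with a toy model in pure Python. This is only a sketch of the pattern and not the code in bytearrayobject.c: the `ToyBuffer` class, its export counter and its method names are all hypothetical.

```python
class ToyBuffer:
    """Toy stand-in for a resizable buffer that tracks exported views."""

    def __init__(self, data):
        self._data = list(data)
        self._exports = 0  # number of live views (hypothetical bookkeeping)

    def export(self):
        self._exports += 1

    def release(self):
        self._exports -= 1

    def _can_resize(self):
        # The separate check: resizing is allowed only without exports.
        return self._exports == 0

    def set_slice(self, lo, hi, values):
        values = list(values)
        growth = len(values) - (hi - lo)
        # Check before touching the storage, so a refusal leaves it intact.
        if growth != 0 and not self._can_resize():
            raise BufferError("existing exports: cannot resize")
        self._data[lo:hi] = values

    def tolist(self):
        return list(self._data)


buf = ToyBuffer(range(10))
buf.export()
try:
    buf.set_slice(1, 8, [99])  # would shrink: refused while exported
except BufferError:
    pass
unchanged = buf.tolist() == list(range(10))

buf.set_slice(0, 2, [7, 7])  # same length: allowed even while exported
buf.release()
buf.set_slice(1, 8, [99])  # no exports left: shrinking is allowed
print(unchanged, buf.tolist())
```

As noted above, a check like this only covers the export case. A reallocation that fails for lack of memory after the data has been moved would need separate handling.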
|
|
msg77135 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-06 15:31 |
New bytearray patch at http://codereview.appspot.com/10049. I think I've covered all bases. array.array will need another patch (I must admit I care a bit less about it, since it's not a builtin type). The patch will have to be backported for 2.6/2.7 as well, but memoryview doesn't exist there, so the tests will have to be disabled. |
|
|
msg77186 - (view) |
Author: (gumpy) |
Date: 2008-12-06 23:21 |
It turns out the problems in array are more serious than I thought: they allow writing to unallocated memory through a memoryview, leading to memory corruption, segfaults and possibly exploits. The following example extends an array enough to trigger a realloc of the array's buffer.

Python 3.0 (r30:67503, Dec 4 2008, 13:30:57)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from array import array
>>> a = array('i', range(16))
>>> m = memoryview(a)
>>> a.extend(array('i', range(48))
... )
>>> m[:] = array('i', [0] * (len(m) // m.itemsize))
*** glibc detected *** python3.0: corrupted double-linked list: 0x0822c1f8 ***
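For comparison, here is a small sketch of what I would expect on an interpreter where array refuses to resize while a buffer is exported. That refusal is an assumption about the fixed behavior, not something shown in the session above. The variable names are only for illustration.

```python
from array import array

a = array("i", range(16))
m = memoryview(a)

# While the view is alive, extending would reallocate the storage under it.
try:
    a.extend(array("i", range(48)))
    extend_blocked = False
except BufferError:
    extend_blocked = True
length_while_exported = len(a)

# Once the view is released, the array is free to grow again.
m.release()
a.extend(array("i", range(48)))
length_after_release = len(a)

print(extend_blocked, length_while_exported, length_after_release)
```

Releasing the view explicitly, instead of waiting for garbage collection, makes it clear at which point resizing becomes legal again.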
|
|
msg77196 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-07 00:07 |
The segfault happens even when the array is not being resized; I've opened a separate bug for it: #4509. |
|
|
msg77197 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-07 00:09 |
The bytearray patch has been committed to 2.6, 2.7, 3.0, 3.1. Now the array.array problem remains. |
|
|
msg77198 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-07 00:12 |
Sorry, typo: the segfault issue is #4569. |
|
|
msg77249 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-12-07 20:32 |
Found another bug with memoryview and arrays in #4580. |
|
|
msg77262 - (view) |
Author: (gumpy) |
Date: 2008-12-07 21:34 |
I've opened a new memoryview/array segfault issue since #4569 was closed: #4583 |
|
|
msg90143 - (view) |
Author: Alexandre Vassalotti (alexandre.vassalotti) *  |
Date: 2009-07-05 05:40 |
Fixed the array bug in r73850. Are there any bugs left to fix that were reported in this issue? |
|
|
msg90802 - (view) |
Author: Alexandre Vassalotti (alexandre.vassalotti) *  |
Date: 2009-07-22 04:15 |
Closing as I don't see any other bugs in this issue to fix. |
|
|