msg95952 - (view) |
Author: Pauli Virtanen (pv) * |
Date: 2009-12-03 22:40 |
The following code causes a segmentation fault (or glibc error, or other problems): >>> x = someobject() >>> y = memoryview(x) >>> z = memoryview(y) The problem is that someobject.bf_releasebuffer will be called two times with an identical Py_buffer structure. This can be seen in memoryobject.c: static int memory_getbuf(PyMemoryViewObject *self, Py_buffer *view, int flags) { int res = 0; /* XXX for whatever reason fixing the flags seems necessary */ if (self->view.readonly) flags &= ~PyBUF_WRITABLE; if (self->view.obj != NULL) res = PyObject_GetBuffer(self->view.obj, view, flags); if (view) dup_buffer(view, &self->view); return res; } At the end of the call, view and self->view contain identical data because of the call to dup_buffer. static void memory_releasebuf(PyMemoryViewObject *self, Py_buffer *view) { PyBuffer_Release(view); } But when the outer memoryview is destroyed, memory_releasebuf calls PyBuffer_Release for the original object once. And when the inner memoryview is destroyed, PyBuffer_Release is called by memory_dealloc the second time. Both calls supply an identical Py_buffer structure. Now, if the original object's bf_getbuffer and bf_releasebuffer allocate some memory dynamically, this will likely cause a double-free of memory, usually leading to a segmentation fault. There is no feasible way the bf_releasebuffer can keep track of how many calls to it have been made. So probably the code in memory_getbuf is wrong -- at least the dup_buffer function looks wrong. |
|
|
msg95961 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-12-04 13:29 |
Why do you say that: > There is no feasible way the bf_releasebuffer can keep track of how many > calls to it have been made. Because that's exactly what e.g. bytearray objects do (see the ob_exports field in Objects/bytearrayobject.c). |
|
|
msg95963 - (view) |
Author: Pauli Virtanen (pv) * |
Date: 2009-12-04 13:56 |
> Why do you say that: > > > There is no feasible way the bf_releasebuffer can keep track of how > > many calls to it have been made. I was probably thinking about allocating new temporary arrays for strides etc. on each *_getbuffer -- if that's done, then manually keeping track of all the allocated memory seems like a waste of effort (ie. not feasible). But yes, if memory allocated for entries in Py_buffer is shared between all exported buffer views, that sounds better -- for some reason I didn't think about that... So we'll do it like this in Numpy then. But still, I take it that the way it currently works is not the intended behavior? The segmentation faults caused by this came as a bit of a surprise to me, as the assumption about paired *_getbuffer and *_releasebuffer calls is very natural. |
|
|
msg95964 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-12-04 14:05 |
> I was probably thinking about allocating new temporary arrays for > strides etc. on each *_getbuffer -- if that's done, then manually > keeping track of all the allocated memory seems like a waste of effort > (ie. not feasible). Yes, I know it looks very painful to do so. I am not responsible for the Py_buffer / memorview design, however. Travis Oliphant is, and I hear he's a member of the Numpy community: you might want to ask him for advice. (the general problem is that managing Py_buffers can entail memory allocations, but Py_buffer is not a PyObject and therefore you can't take advantage of Python's general object management facilities) > But still, I take it that the way it currently works is not the intended > behavior? The segmentation faults caused by this came as a bit of a > surprise to me, as the assumption about paired *_getbuffer and > *_releasebuffer calls is very natural. Well, those calls still /are/ paired, aren't they? There is one odd thing which you must be careful about, it is that *_getbuffer can be called with a NULL Py_buffer pointer. |
|
|
msg95965 - (view) |
Author: Pauli Virtanen (pv) * |
Date: 2009-12-04 14:24 |
I think this is an implementation issue in MemoryView rather than an issue with the buffer interface. PEP 3118 states, "This same bufferinfo structure must be passed to bf_releasebuffer (if available) when the consumer is done with the memory." -- this is not guaranteed by the current MemoryView implementation. The calls are not paired: the *_getbuf calls fill in structures with data "view1", and "view2". The *_releasebuf calls receive structures with data "view1", and "view1". The data filled in the second getbuf call ("view2") is never passed back to *_releasebuf, as it is overwritten with "view1" data by dup_buffer. To work around this, *_releasebuf must be written so that it does not use the "view" pointer passed to it -- the data structure may have been shallow copied and any memory pointers in it may have already been freed. I can try to cook up a patch fixing this. |
|
|
msg95967 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-12-04 14:39 |
> To work around this, *_releasebuf must be written so that it does not > use the "view" pointer passed to it -- the data structure may have been > shallow copied and any memory pointers in it may have already been freed. > > I can try to cook up a patch fixing this. If you can do it without breaking the current unit tests (Lib/test/test_memoryview.py) this would be nice indeed. |
|
|
msg102277 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2010-04-03 15:38 |
> I can try to cook up a patch fixing this. Did you wrote something? I don't see any patch attached to this issue :-/ |
|
|
msg141861 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2011-08-10 12:43 |
This should be fixed in features/pep-3118. |
|
|
msg147971 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2011-11-19 21:10 |
My last comment could be misinterpreted: This *is* actually already fixed in features/pep-3118. |
|
|
msg152992 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2012-02-09 22:31 |
Closing, since it's fixed in #10181. |
|
|
msg154239 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-02-25 11:25 |
New changeset 3f9b3b6f7ff0 by Stefan Krah in branch 'default': - Issue #10181: New memoryview implementation fixes multiple ownership http://hg.python.org/cpython/rev/3f9b3b6f7ff0 |
|
|