[Python-Dev] Allocation of shape and strides fields in Py_buffer
Nick Coghlan ncoghlan at gmail.com
Tue Dec 9 14:37:11 CET 2008
Antoine Pitrou wrote:
> Alexander Belopolsky <alexander.belopolsky gmail.com> writes:
>> I did not follow numpy development for the last year or more, so I
>> won't qualify as "the numpy folks," but my understanding is that numpy
>> does exactly what Nick recommended: the viewed object owns shape and
>> strides just as it owns the data. The viewing object increases the
>> reference count of the viewed object and thus assures that data, shape
>> and strides don't go away prematurely.
>
> That doesn't work if e.g. you take a slice of a memoryview object, since
> the shape changes in the process. See http://bugs.python.org/issue4580
Note that the PEP is unambiguous as to who owns the pointers in the view object: "The exporter is responsible for making sure that any memory pointed to by buf, format, shape, strides, and suboffsets is valid until releasebuffer is called. If the exporter wants to be able to change an object's shape, strides, and/or suboffsets before releasebuffer is called then it should allocate those arrays when getbuffer is called (pointing to them in the buffer-info structure provided) and free them when releasebuffer is called."
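The ownership rule quoted above can be observed from pure Python. The following is a minimal sketch, assuming a current CPython 3 interpreter rather than the 2008 build under discussion: bytearray, acting as the exporter, refuses to resize while a view is outstanding, and allows it again once the view is released. The variable names are illustrative only.

```python
buf = bytearray(b"abcdef")
view = memoryview(buf)          # the consumer now holds an exported buffer

try:
    buf.extend(b"gh")           # resizing would invalidate the exported pointer
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()                  # roughly the Python-level analogue of releasebuffer
buf.extend(b"gh")               # allowed again: no outstanding exports
```

This only illustrates the exporter side of the contract; it says nothing about how shape and strides are allocated at the C level.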
The problem with memoryview appears to be related to the way it calculates its own length (since that is the check that is failing when the view blows up):
>>> from array import array
>>> a = array('i', range(10))
>>> m = memoryview(a)
>>> len(m)  # This is the length in bytes, which is WRONG!
40
>>> m2 = memoryview(a)[2:8]
>>> len(m2)  # This is correct
6
>>> a2 = array('i', range(6))
>>> m[:] = a    # But this works
>>> m2[:] = a2  # and this does not
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot modify size of memoryview object
>>> len(memoryview(a2))  # Ah, 24 != 6 is our problem!
24
Looks to me like there are a couple of bugs here:
The first is that memoryview is treating the len field in the Py_buffer struct as the number of objects in the view in a few places instead of as the total number of bytes being exposed (it is actually the latter, as defined in PEP 3118).
The second is that the getbuf implementation in array.array is broken. It is ONLY OK for shape to be null when ndim=0 (i.e. a scalar value). An array is NOT a scalar value, so the array objects should be setting the shape pointer to point to a single-item array (where shape[0] is the length of the array).
memoryview can then be fixed to use shape[0] instead of len to get the number of objects in the view.
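As a hedged illustration of the distinction drawn above, the sketch below assumes a current CPython 3 (memoryview.nbytes was added in Python 3.3), not the interpreter the session above was run on. There the item count and the byte count are exposed separately, a one-dimensional array reports a one-element shape, and a ctypes scalar is used here as an example of an ndim=0 export.

```python
import ctypes
from array import array

a = array('i', range(10))
m = memoryview(a)

item_count = len(m)        # number of items in the view
byte_count = m.nbytes      # total bytes exposed (the Py_buffer len field)
first_dim = m.shape[0]     # shape[0] is the item count for a 1-D view

# A ctypes scalar exports a zero-dimensional buffer with an empty shape.
scalar_view = memoryview(ctypes.c_int(5))
```

The Python-level shape tuple does not show whether the underlying C pointer is NULL, so this only reflects what memoryview reports, not the exporter's internals.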
memoryview also currently gets the shape wrong on slices:

>>> m.shape
(10,)
>>> m2.shape
(10,)
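For comparison, here is a small sketch of the behaviour one would expect once both issues are addressed, again assuming a current CPython 3 rather than the build discussed in this thread: the slice reports its own shape, and assigning a same-sized array through the slice succeeds.

```python
from array import array

a = array('i', range(10))
m = memoryview(a)
m2 = m[2:8]                  # view of items 2..7

a2 = array('i', range(6))
m2[:] = a2                   # same format and item count, so this is accepted

full_shape = m.shape
slice_shape = m2.shape
result = a.tolist()          # the underlying array sees the write
```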
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia