[Python-Dev] Hashable memoryviews

Stefan Krah stefan at bytereef.org
Sun Nov 13 12:56:23 CET 2011


Antoine Pitrou <solipsis at pitrou.net> wrote:

> > > I would propose the following algorithm:
> > >
> > > 1) try to calculate the original object's hash; if it fails, consider
> > >    the memoryview unhashable (the buffer is probably mutable)
> >
> > With slices or the new casts (See: http://bugs.python.org/issue5231,
> > implemented in http://hg.python.org/features/pep-3118#memoryview ),
> > it is possible to have different hashes for equal objects:
> >
> >   >>> b1 = bytes([1,2,3,4])
> >   >>> b2 = bytes([4,3,2,1])
> >   >>> m1 = memoryview(b1)
> >   >>> m2 = memoryview(b2)[::-1]

> I don't understand this feature. How do you represent a reversed buffer using the buffer API, and how do you ensure that consumers (especially those written in C) see the buffer reversed?

In this case, view->buf points to the last memory location and view->strides[0] is -1. In general, any PEP-3118 compliant consumer must access the elements of a buffer only via PyBuffer_GetPointer() or in an equivalent manner.

Basically, this means that you start at view->buf (which may be any location in the memory block) and follow the strides until you reach the desired element.

Objects/abstract.c:

void*
PyBuffer_GetPointer(Py_buffer *view, Py_ssize_t *indices)
{
    char* pointer;
    int i;
    pointer = (char *)view->buf;
    for (i = 0; i < view->ndim; i++) {
        pointer += view->strides[i]*indices[i];
        if ((view->suboffsets != NULL) && (view->suboffsets[i] >= 0)) {
            pointer = *((char**)pointer) + view->suboffsets[i];
        }
    }
    return (void*)pointer;
}
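To make the stride walk concrete, here is a rough Python sketch of the same arithmetic for the reversed view from the example. The helper name element_offset is hypothetical (not part of any API), offsets into a bytes object stand in for C pointers, and suboffsets are left out entirely:

```python
def element_offset(start, strides, indices):
    """Mirror the stride arithmetic of PyBuffer_GetPointer().

    `start` plays the role of view->buf, expressed as an offset into the
    exporter's memory block. Suboffsets are not modelled in this sketch.
    """
    offset = start
    for stride, index in zip(strides, indices):
        offset += stride * index
    return offset


block = bytes([4, 3, 2, 1])        # memory order
view = memoryview(block)[::-1]     # logical order is 1, 2, 3, 4

# For the reversed view, buf is the last byte of the block and the
# single stride is -1, so element i lives at offset (len - 1) - i.
start = len(block) - 1
logical = [block[element_offset(start, view.strides, (i,))]
           for i in range(len(view))]
```

A consumer that follows the strides this way sees the elements in logical order, whatever their order in memory.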

> Regardless, it's simply a matter of getting the hash algorithm right (i.e. iterate in logical order rather than memory order).

If you know how the original object computes the hash then this would work. It's not obvious to me how this would work beyond bytes objects though.
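For the bytes case, hashing in logical order could look roughly like the sketch below. The function logical_order_hash is a hypothetical helper, not an existing API, and it assumes a one-dimensional view with format 'B' whose exporter hashes like bytes:

```python
def logical_order_hash(view):
    """Hypothetical helper: hash a 1-D byte view in logical order."""
    if view.format != 'B' or view.ndim != 1:
        raise TypeError("this sketch only handles 1-D views with format 'B'")
    # Indexing follows the strides, so a reversed or sliced view is read
    # in logical order regardless of the layout in memory.
    return hash(bytes(view[i] for i in range(len(view))))


b1 = bytes([1, 2, 3, 4])
b2 = bytes([4, 3, 2, 1])
m1 = memoryview(b1)
m2 = memoryview(b2)[::-1]

views_equal = (m1 == m2)
hashes_equal = logical_order_hash(m1) == logical_order_hash(m2)
```

Under these assumptions the two equal views get the same hash, and that hash matches hash(b1), whereas redirecting to the exporters would compare hash(b1) with hash(b2).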

> >   >>> a = array.array('L', [0])
> >   >>> b = b'\x00\x00\x00\x00\x00\x00\x00\x00'
> >   >>> m_array = memoryview(a)
> >   >>> m_bytes = memoryview(b)
> >   >>> m_cast = m_array.cast('B')
> >   >>> m_bytes == m_cast
> >   True
> >   >>> hash(b) == hash(a)
> >   Traceback (most recent call last):
> >     File "<stdin>", line 1, in <module>
> >   TypeError: unhashable type: 'array.array'

> In this case, the memoryview wouldn't be hashable either.

Hmm, the point was that one could take the hash of m_bytes but not of m_cast, even though they are equal. Perhaps I misunderstood your proposal. I assumed that hash requests would be redirected to the original exporting object.

As above, it would be possible to write a custom hash function for objects with format 'B'.
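A sketch of what such a content-based hash might do for the array example follows. The name byte_view_hash is hypothetical, and the bytes object is sized from a.itemsize because the width of 'L' is platform dependent:

```python
import array


def byte_view_hash(view):
    """Hypothetical content-based hash for views with format 'B'."""
    if view.format != 'B':
        raise TypeError("only format 'B' is handled in this sketch")
    # tobytes() copies the elements out in logical order.
    return hash(view.tobytes())


a = array.array('L', [0])
b = bytes(a.itemsize)              # as many zero bytes as one 'L' item
m_cast = memoryview(a).cast('B')
m_bytes = memoryview(b)

try:
    hash(a)
    exporter_hashable = True
except TypeError:                  # array.array is unhashable
    exporter_hashable = False

same_hash = byte_view_hash(m_cast) == byte_view_hash(m_bytes)
```

Here the equal views hash equally even though the exporter of m_cast is unhashable. The array stays mutable, though, so such a hash is only stable while its contents do not change.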

Stefan Krah
