Issue 5231: Change format of a memoryview
Created on 2009-02-12 21:39 by pitrou, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (21) | ||
---|---|---|
msg81823 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2009-02-12 21:39 |
Memoryview objects provide a structured view over a memory area, meaning the length, indexing and slicing operations respect the itemsize:
>>> import array
>>> a = array.array('i', [1,2,3])
>>> m = memoryview(a)
>>> len(m)
3
>>> m.itemsize
4
>>> m.format
'i'
However, in some cases, you want the memoryview to behave as a chunk of pure bytes regardless of the original object *and without making a copy*. Therefore, it would be handy to be able to change the format of the memoryview, or ask for a new memoryview with another format. An example of use could be:
>>> a = array.array('i', [1,2,3])
>>> m = memoryview(a).with_format('B')
>>> len(m), m.itemsize, m.format
(12, 1, 'B') | ||
msg81824 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2009-02-12 21:47 |
(Another way to see it is as supplying a Python equivalent to the C buffer API, with access to the raw Py_buffer) | ||
msg81839 - (view) | Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2009-02-12 23:53 |
Agreed, this would be useful. See http://codereview.appspot.com/12470/show if anyone doesn't believe us. ;) | ||
msg128486 - (view) | Author: Xuanji Li (xuanji) * | Date: 2011-02-13 12:09 |
Is this issue from 2 years ago still open? I checked the docs and it seems to be. If it is, I would like to work on a patch and submit it soon. | ||
msg128488 - (view) | Author: Alyssa Coghlan (ncoghlan) * |
Date: 2011-02-13 13:07 |
It is, but keep issue 10181 in mind (since that may lead to some restructuring of the memoryview code, potentially leading to a need to update your patch). | ||
msg135600 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2011-05-09 15:32 |
In the meantime I had to resort to dirty hacks in 1ac03e071d65 (such as using io.BytesIO.write(), which I know is implemented in C and doesn't care about item size). At a minimum, a memoryview.getflatview() function would be nice (and probably easier to code than the generic version). Or a "flat" optional argument in the memoryview constructor. | ||
msg135601 - (view) | Author: STINNER Victor (vstinner) * |
Date: 2011-05-09 15:35 |
Reading an int32 array as a raw byte string is useful, but the opposite is also useful. | ||
msg135976 - (view) | Author: Mark Dickinson (mark.dickinson) * |
Date: 2011-05-14 15:47 |
Unassigning. Sorry; no time for this at the moment. | ||
msg142820 - (view) | Author: Stefan Krah (skrah) * |
Date: 2011-08-23 13:10 |
I think this would be useful and I'll try it out in features/pep-3118#memoryview.
Syntax options that I'd prefer:
  a = array.array('i', [1,2,3])
  m = memoryview(a, 'B')
Or go all the way and make memoryview take any flag:
  a = array.array('i', [1,2,3])
  m = memoryview(a, getbuf=PyBUF_SIMPLE)
This is what I currently do in _testbuffer.c:
>>> from _testbuffer import *
>>> import array
>>> a = array.array('i', [1,2,3])
>>> nd = ndarray(a, getbuf=PyBUF_SIMPLE)
>>> nd.format
''
>>> nd.len
12
>>> nd.shape
()
>>> nd.strides
()
>>> nd.itemsize # XXX array_getbuf should set this to 1.
4
We would need to fix various getbuffer() methods to adhere to strict rules that I've posed here: http://mail.scipy.org/pipermail/numpy-discussion/2011-August/058189.html | ||
msg142821 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-23 13:24 |
> Or go all the way and make memoryview take any flag:
>
>   a = array.array('i', [1,2,3])
>   m = memoryview(a, getbuf=PyBUF_SIMPLE)
This is good for testing, but Python developers shouldn't have to know about the low-level flags. | ||
msg142826 - (view) | Author: Stefan Krah (skrah) * |
Date: 2011-08-23 13:51 |
Antoine Pitrou <report@bugs.python.org> wrote:
> > Or go all the way and make memoryview take any flag:
> >
> >   a = array.array('i', [1,2,3])
> >   m = memoryview(a, getbuf=PyBUF_SIMPLE)
>
> This is good for testing, but Python developers shouldn't have to know
> about the low-level flags.
Hmm, indeed. How about:
1) memoryview(a, format='B')
   Shadows a builtin function; annoying syntax highlighting in current Vim.
2) memoryview(a, fmt='B')
   I'm fully expecting a comment about 'strpbrk' again, but I like it. :)
Also, we've to see about speed implications. My current version of memoryview (not pushed yet to the public repo) also solves #10227, but is pretty sensitive even to small changes. | ||
msg142828 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-23 14:06 |
> Hmm, indeed. How about:
>
> 1) memoryview(a, format='B')
>
>    Shadows a builtin function; annoying syntax highlighting in current Vim.
>
> 2) memoryview(a, fmt='B')
>
>    I'm fully expecting a comment about 'strpbrk' again, but I like it. :)
I really prefer "format", it's the natural word to use there. I don't think this is the only place where we shadow a builtin function. There are probably variables named "dict" in many places.
> Also, we've to see about speed implications. My current version of memoryview
> (not pushed yet to the public repo) also solves #10227, but is pretty sensitive
> even to small changes.
Well, solving #10227 would be nice, but I don't think it's critical either. | ||
msg142830 - (view) | Author: Stefan Krah (skrah) * |
Date: 2011-08-23 14:28 |
Good, I'll use 'format'. I was mainly worried about the shadowing issue. | ||
msg142832 - (view) | Author: Stefan Krah (skrah) * |
Date: 2011-08-23 15:15 |
Rethinking a bit: Casting to arbitrary formats might go a bit far. Currently, the combination (format=NULL, shape=NULL) can serve as a warning "This buffer has been cast to unsigned bytes". If we allow casts from bytes to int32, we'll have (format="i", shape=x) and consumers of that buffer have no indication that the original exporter had a different format. If you know what you are doing, fine. On the other hand following the buffer paths in #12817 quickly turned into a very complex maze of getbuffer requests. So, an option would be to try out the cast to bytes first and disallow other casts. | ||
msg142833 - (view) | Author: Alyssa Coghlan (ncoghlan) * |
Date: 2011-08-23 15:22 |
Casting to a flat 1-D array of bytes is reasonable (it's essentially saying 'look, just give me the raw data, it's on my own head if I stuff up the formatting'). However, requiring an explicit two-step process for any other casting (i.e. take a 1-D view, then a shaped view of that flat 1-D view) also sounds reasonable. So I agree with Victor that 1-D bytes -> any shape/format and any shape/format -> 1-D bytes should be allowed, but I think we should hold off on allowing arbitrary transformations in a single step. | ||
msg142834 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-23 15:31 |
> However, requiring an explicit two-step process for any other casting
> (i.e. take a 1-D view, then a shaped view of that flat 1-D view) also
> sounds reasonable.
>
> So I agree with Victor that 1-D bytes -> any shape/format and any
> shape/format -> 1-D bytes should be allowed, but I think we should
> hold off on allowing arbitrary transformations in a single step.
Converting to 1-D bytes is my main motivation for this feature request, so I'm fine with such a limitation. The point is to be able to do in Python what we can do in C: take an arbitrary buffer and handle it as pure bytes (for I/O or cryptography purposes, for example). | ||
msg142842 - (view) | Author: Stefan Krah (skrah) * |
Date: 2011-08-23 16:27 |
Nick Coghlan <report@bugs.python.org> wrote:
> So I agree with Victor that 1-D bytes -> any shape/format and any
> shape/format -> 1-D bytes should be allowed, but I think we should
> hold off on allowing arbitrary transformations in a single step.
1-D bytes -> any shape/format would work if everyone agrees on the Numpy mailing list post that I linked to in an earlier message. [Summary: PyBUF_SIMPLE may downcast any C-contiguous array to unsigned bytes.]
Otherwise a PyBUF_SIMPLE getbuffer call to the newly shaped memoryview would be required to fail, and these calls are almost certain to occur somewhere, e.g. in PyObject_AsWriteBuffer().
But then memoryview would also need a 'shape' parameter:
  m = memoryview(x, format='L', shape=[3, 4])
In that case, making it a method might indeed be more clear to underline that something extraordinary is going on:
  m = memoryview(x).cast(format='L', shape=[3, 4])
It also takes away a potential speed loss for regular uses.
1-D bytes would then be defined as 'b', 'B' and 'c', I presume? Being able to cast to 'c' would also solve certain memoryview index assignment problems that arise if we opt for strict typing as the struct module does. | ||
msg143729 - (view) | Author: Stefan Krah (skrah) * |
Date: 2011-09-08 14:56 |
The cast method is completely implemented over at #10181. | ||
msg152256 - (view) | Author: Antoine Pitrou (pitrou) * |
Date: 2012-01-29 20:13 |
Shouldn't this be closed in favour of #10181? | ||
msg152259 - (view) | Author: Stefan Krah (skrah) * |
Date: 2012-01-29 20:59 |
Yes, it's really superseded by #10181 now. I'm closing as 'duplicate', since technically it'll be fixed once the patch for #10181 is committed. | ||
msg154238 - (view) | Author: Roundup Robot (python-dev) |
Date: 2012-02-25 11:25 |
New changeset 3f9b3b6f7ff0 by Stefan Krah in branch 'default':
- Issue #10181: New memoryview implementation fixes multiple ownership
http://hg.python.org/cpython/rev/3f9b3b6f7ff0 |
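
For readers arriving at this closed issue: the thread settled on a `cast` method, delivered through #10181, with the rule that one side of every cast must be a byte format. The sketch below shows how that looks with `memoryview.cast()` as it exists in Python 3.3 and later. It is only an illustration of the behaviour discussed above, not code from the patch, and `as_bytes_view` is a hypothetical helper name introduced here.

```python
import array


def as_bytes_view(obj):
    """Return a flat unsigned-byte view over obj's buffer, without copying.

    Hypothetical helper for illustration; it only wraps memoryview.cast('B').
    """
    return memoryview(obj).cast('B')


a = array.array('i', [1, 2, 3, 4, 5, 6])

# Any format -> 1-D bytes: the length now counts bytes, not items.
raw = as_bytes_view(a)
assert (raw.format, raw.itemsize) == ('B', 1)
assert len(raw) == len(a) * a.itemsize

# 1-D bytes -> another format and shape (the second step of a two-step cast).
grid = raw.cast('i', shape=[2, 3])
assert grid.tolist() == [[1, 2, 3], [4, 5, 6]]

# The views share memory with the array, so no copy was made.
ints = raw.cast('i')
ints[0] = 42
assert a[0] == 42

# A direct cast between two non-byte formats is rejected with TypeError.
try:
    memoryview(a).cast('h')
except TypeError as exc:
    rejected = str(exc)
else:
    rejected = None
```

Going through `'B'` first mirrors the two-step process proposed in msg142833: take the flat byte view, then reinterpret it, rather than converting between arbitrary formats in one call.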
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:45 | admin | set | github: 49481 |
2012-02-25 11:25:29 | python-dev | set | nosy: + python-dev; messages: + |
2012-01-29 20:59:16 | skrah | set | status: open -> closed; superseder: Problems with Py_buffer management in memoryobject.c (and elsewhere?); messages: +; dependencies: - Problems with Py_buffer management in memoryobject.c (and elsewhere?); resolution: duplicate; stage: needs patch -> resolved |
2012-01-29 20:13:12 | pitrou | set | messages: + |
2011-09-08 14:56:21 | skrah | set | dependencies: + Problems with Py_buffer management in memoryobject.c (and elsewhere?); messages: + |
2011-08-23 16:27:03 | skrah | set | messages: + |
2011-08-23 15:31:53 | pitrou | set | messages: + |
2011-08-23 15:22:39 | ncoghlan | set | messages: + |
2011-08-23 15:15:01 | skrah | set | messages: + |
2011-08-23 14:28:06 | skrah | set | messages: + |
2011-08-23 14:06:34 | pitrou | set | messages: + |
2011-08-23 13:51:58 | skrah | set | messages: + |
2011-08-23 13:24:17 | pitrou | set | messages: + |
2011-08-23 13:10:40 | skrah | set | nosy: + skrah; messages: + |
2011-06-20 18:35:46 | jcon | set | nosy: + jcon |
2011-05-14 15:47:51 | mark.dickinson | set | messages: + |
2011-05-14 15:47:14 | mark.dickinson | set | assignee: mark.dickinson -> |
2011-05-09 15:35:32 | vstinner | set | nosy: + vstinner; messages: + |
2011-05-09 15:32:23 | pitrou | set | stage: patch review -> needs patch |
2011-05-09 15:32:17 | pitrou | set | stage: test needed -> patch review; messages: +; versions: + Python 3.3, - Python 3.2 |
2011-02-13 13:07:51 | ncoghlan | set | nosy: gregory.p.smith, teoliphant, mark.dickinson, ncoghlan, pitrou, xuanji; messages: + |
2011-02-13 12:09:25 | xuanji | set | nosy: gregory.p.smith, teoliphant, mark.dickinson, ncoghlan, pitrou, xuanji; messages: + |
2011-02-13 11:53:28 | xuanji | set | nosy: + xuanji |
2011-01-04 01:44:06 | pitrou | set | assignee: mark.dickinson; nosy: + mark.dickinson |
2010-08-09 03:19:09 | terry.reedy | set | stage: test needed; versions: + Python 3.2, - Python 3.1 |
2009-02-12 23:53:36 | gregory.p.smith | set | messages: + |
2009-02-12 21:47:20 | pitrou | set | messages: + |
2009-02-12 21:39:02 | pitrou | create |