Issue 5231: Change format of a memoryview

Created on 2009-02-12 21:39 by pitrou, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (21)
msg81823 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-12 21:39
Memoryview objects provide a structured view over a memory area, meaning the length, indexing and slicing operations respect the itemsize:

>>> import array
>>> a = array.array('i', [1,2,3])
>>> m = memoryview(a)
>>> len(a)
3
>>> m.itemsize
4
>>> m.format
'i'

However, in some cases, you want the memoryview to behave as a chunk of pure bytes regardless of the original object *and without making a copy*. Therefore, it would be handy to be able to change the format of the memoryview, or ask for a new memoryview with another format.

An example of use could be:

>>> a = array.array('i', [1,2,3])
>>> m = memoryview(a).with_format('B')
>>> len(m), m.itemsize, m.format
(12, 1, 'B')
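Note: with_format() above is only the proposed spelling. Later messages in this thread say the feature was implemented as a cast method under #10181, and memoryview.cast() is available from Python 3.3 on. A minimal sketch of the byte view requested here, assuming that method (variable names are illustrative):

```python
import array

a = array.array('i', [1, 2, 3])
m = memoryview(a)

# cast('B') returns a new view over the same memory, one item per byte.
flat = m.cast('B')

item_count = len(m)       # number of 'i' items
byte_count = len(flat)    # number of bytes: item_count * a.itemsize

# Writes through the byte view show up in the array, so no copy was made.
flat[:] = bytes(byte_count)
zeroed = a.tolist()
```

The byte count depends on the platform's size for 'i', which is why the sketch does not hard-code 12.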
msg81824 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-12 21:47
(Another way to see it is as supplying a Python equivalent to the C buffer API, with access to the raw Py_buffer)
msg81839 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-02-12 23:53
Agreed, this would be useful. See http://codereview.appspot.com/12470/show if anyone doesn't believe us. ;)
msg128486 - (view) Author: Xuanji Li (xuanji) * Date: 2011-02-13 12:09
Is this issue from 2 years ago still open? I checked the docs and it seems to be. If it is, I would like to work on a patch and submit it soon.
msg128488 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2011-02-13 13:07
It is, but keep issue 10181 in mind (since that may lead to some restructuring of the memoryview code, potentially leading to a need to update your patch).
msg135600 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-09 15:32
In the meantime I had to resort to dirty hacks in 1ac03e071d65 (such as using io.BytesIO.write(), which I know is implemented in C and doesn't care about item size). At the minimum, a memoryview.getflatview() function would be nice (and probably easier to code than the generic version). Or a "flat" optional argument in the memoryview constructor.
msg135601 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-09 15:35
Reading an int32 array as a raw byte string is useful, but the opposite is also useful.
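Note: a small sketch of the opposite direction (raw bytes viewed as native ints), assuming the memoryview.cast() method that this thread later says was implemented under #10181 (Python 3.3+). The buffer size is derived from struct.calcsize('i') rather than assumed to be 4 bytes per item:

```python
import struct

int_size = struct.calcsize('i')      # native size of a C int
raw = bytearray(3 * int_size)        # zero-filled byte buffer

# View the same bytes as native ints; no data is copied.
ints = memoryview(raw).cast('i')
ints[0], ints[1], ints[2] = 1, 2, 3

# The bytearray now holds the native encoding of the three ints.
expected = struct.pack('3i', 1, 2, 3)
```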
msg135976 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-05-14 15:47
Unassigning. Sorry; no time for this at the moment.
msg142820 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-08-23 13:10
I think this would be useful and I'll try it out in features/pep-3118#memoryview.

Syntax options that I'd prefer:

a = array.array('i', [1,2,3])
m = memoryview(a, 'B')

Or go all the way and make memoryview take any flag:

a = array.array('i', [1,2,3])
m = memoryview(a, getbuf=PyBUF_SIMPLE)

This is what I currently do in _testbuffer.c:

>>> from _testbuffer import *
>>> import array
>>> a = array.array('i', [1,2,3])
>>> nd = ndarray(a, getbuf=PyBUF_SIMPLE)
>>> nd.format
''
>>> nd.len
12
>>> nd.shape
()
>>> nd.strides
()
>>> nd.itemsize # XXX array_getbuf should set this to 1.
4

We would need to fix various getbuffer() methods to adhere to strict rules that I've posed here:

http://mail.scipy.org/pipermail/numpy-discussion/2011-August/058189.html
msg142821 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-23 13:24
> Or go all the way and make memoryview take any flag:
>
> a = array.array('i', [1,2,3])
> m = memoryview(a, getbuf=PyBUF_SIMPLE)

This is good for testing, but Python developers shouldn't have to know about the low-level flags.
msg142826 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-08-23 13:51
Antoine Pitrou <report@bugs.python.org> wrote:
> > Or go all the way and make memoryview take any flag:
> >
> > a = array.array('i', [1,2,3])
> > m = memoryview(a, getbuf=PyBUF_SIMPLE)
>
> This is good for testing, but Python developers shouldn't have to know
> about the low-level flags.

Hmm, indeed. How about:

1) memoryview(a, format='B')

   Shadows a builtin function; annoying syntax highlighting in current Vim.

2) memoryview(a, fmt='B')

   I'm fully expecting a comment about 'strpbrk' again, but I like it. :)

Also, we've to see about speed implications. My current version of memoryview (not pushed yet to the public repo) also solves #10227, but is pretty sensitive even to small changes.
msg142828 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-23 14:06
> Hmm, indeed. How about:
>
> 1) memoryview(a, format='B')
>
>    Shadows a builtin function; annoying syntax highlighting in current Vim.
>
> 2) memoryview(a, fmt='B')
>
>    I'm fully expecting a comment about 'strpbrk' again, but I like it. :)

I really prefer "format", it's the natural word to use there. I don't think this is the only place where we shadow a builtin function. There are probably variables named "dict" in many places.

> Also, we've to see about speed implications. My current version of memoryview
> (not pushed yet to the public repo) also solves #10227, but is pretty sensitive
> even to small changes.

Well, solving #10227 would be nice, but I don't think it's critical either.
msg142830 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-08-23 14:28
Good, I'll use 'format'. I was mainly worried about the shadowing issue.
msg142832 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-08-23 15:15
Rethinking a bit: Casting to arbitrary formats might go a bit far. Currently, the combination (format=NULL, shape=NULL) can serve as a warning "This buffer has been cast to unsigned bytes". If we allow casts from bytes to int32, we'll have (format="i", shape=x) and consumers of that buffer have no indication that the original exporter had a different format. If you know what you are doing, fine. On the other hand following the buffer paths in #12817 quickly turned into a very complex maze of getbuffer requests. So, an option would be to try out the cast to bytes first and disallow other casts.
msg142833 - (view) Author: Alyssa Coghlan (ncoghlan) * (Python committer) Date: 2011-08-23 15:22
Casting to a flat 1-D array of bytes is reasonable (it's essentially saying 'look, just give me the raw data, it's on my own head if I stuff up the formatting'). However, requiring an explicit two-step process for any other casting (i.e. take a 1-D view, then a shaped view of that flat 1-D view) also sounds reasonable. So I agree with Victor that 1-D bytes -> any shape/format and any shape/format -> 1-D bytes should be allowed, but I think we should hold off on allowing arbitrary transformations in a single step.
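Note: the two-step rule described above matches how memoryview.cast() behaves in Python 3.3 and later, where one side of every cast must be a byte format ('B', 'b' or 'c'). A minimal sketch, assuming that method; the choice of 'i' and 'h' is only for illustration:

```python
import array
import struct

a = array.array('i', [1, 2, 3])
m = memoryview(a)

# A direct cast between two non-byte formats is rejected.
try:
    m.cast('h')
    direct_cast_allowed = True
except TypeError:
    direct_cast_allowed = False

# The explicit route: flatten to bytes first, then reinterpret.
shorts = m.cast('B').cast('h')
short_count = len(shorts)
expected_count = len(a) * a.itemsize // struct.calcsize('h')
```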
msg142834 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-23 15:31
> However, requiring an explicit two-step process for any other casting
> (i.e. take a 1-D view, then a shaped view of that flat 1-D view) also
> sounds reasonable.
>
> So I agree with Victor that 1-D bytes -> any shape/format and any
> shape/format -> 1-D bytes should be allowed, but I think we should
> hold off on allowing arbitrary transformations in a single step.

Converting to 1-D bytes is my main motivation for this feature request, so I'm fine with such a limitation. The point is to be able to do in Python what we can do in C, take an arbitrary buffer and handle it as pure bytes (for I/O or cryptography purposes, for example).
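Note: a short sketch of the I/O and cryptography use case mentioned above, assuming the memoryview.cast() method from #10181 (Python 3.3+). io.BytesIO and hashlib.sha256 stand in for whatever byte consumer is actually used; they are illustrative choices, not something this thread prescribes:

```python
import array
import hashlib
import io

a = array.array('i', [1, 2, 3])
flat = memoryview(a).cast('B')     # byte view over the array, no copy

# I/O: the byte view is written like any other bytes-like object.
sink = io.BytesIO()
written = sink.write(flat)

# Hashing: the digest is computed over the raw bytes of the array.
digest = hashlib.sha256(flat).hexdigest()
```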
msg142842 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-08-23 16:27
Nick Coghlan <report@bugs.python.org> wrote:
> So I agree with Victor that 1-D bytes -> any shape/format and any
> shape/format -> 1-D bytes should be allowed, but I think we should
> hold off on allowing arbitrary transformations in a single step.

1-D bytes -> any shape/format would work if everyone agrees on the Numpy mailing list post that I linked to in an earlier message. [Summary: PyBUF_SIMPLE may downcast any C-contiguous array to unsigned bytes.]

Otherwise a PyBUF_SIMPLE getbuffer call to the newly shaped memoryview would be required to fail, and these calls are almost certain to occur somewhere, e.g. in PyObject_AsWriteBuffer().

But then memoryview would also need a 'shape' parameter:

m = memoryview(x, format='L', shape=[3, 4])

In that case, making it a method might indeed be more clear to underline that something extraordinary is going on:

m = memoryview(x).cast(format='L', shape=[3, 4])

It also takes away a potential speed loss for regular uses.

1-D bytes would then be defined as 'b', 'B' and 'c', I presume? Being able to cast to 'c' would also solve certain memoryview index assignment problems that arise if we opt for strict typing as the struct module does.
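Note: the method form sketched in this message is close to what Python 3.3 and later provide as memoryview.cast(format, shape). A minimal example under that assumption, using the same 'L' format and [3, 4] shape as above; the input buffer is built with struct.pack so that it has exactly twelve native unsigned longs:

```python
import struct

# Twelve native unsigned longs, packed into an immutable bytes object.
buf = struct.pack('12L', *range(12))
x = memoryview(buf)

# 1-D bytes -> 2-D view of unsigned longs in one cast.
y = x.cast('L', shape=[3, 4])
rows = y.tolist()

# Casting to 'c' yields items that are length-1 bytes objects.
chars = memoryview(b'abc').cast('c')
first_char = chars[0]
```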
msg143729 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-09-08 14:56
The cast method is completely implemented over at #10181.
msg152256 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-29 20:13
Shouldn't this be closed in favour of #10181?
msg152259 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-01-29 20:59
Yes, it's really superseded by #10181 now. I'm closing as 'duplicate', since technically it'll be fixed once the patch for #10181 is committed.
msg154238 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-02-25 11:25
New changeset 3f9b3b6f7ff0 by Stefan Krah in branch 'default':
- Issue #10181: New memoryview implementation fixes multiple ownership
http://hg.python.org/cpython/rev/3f9b3b6f7ff0
History
Date User Action Args
2022-04-11 14:56:45 admin set github: 49481
2012-02-25 11:25:29 python-dev set nosy: + python-dev; messages: +
2012-01-29 20:59:16 skrah set status: open -> closed; superseder: Problems with Py_buffer management in memoryobject.c (and elsewhere?); messages: +; dependencies: - Problems with Py_buffer management in memoryobject.c (and elsewhere?); resolution: duplicate; stage: needs patch -> resolved
2012-01-29 20:13:12 pitrou set messages: +
2011-09-08 14:56:21 skrah set dependencies: + Problems with Py_buffer management in memoryobject.c (and elsewhere?); messages: +
2011-08-23 16:27:03 skrah set messages: +
2011-08-23 15:31:53 pitrou set messages: +
2011-08-23 15:22:39 ncoghlan set messages: +
2011-08-23 15:15:01 skrah set messages: +
2011-08-23 14:28:06 skrah set messages: +
2011-08-23 14:06:34 pitrou set messages: +
2011-08-23 13:51:58 skrah set messages: +
2011-08-23 13:24:17 pitrou set messages: +
2011-08-23 13:10:40 skrah set nosy: + skrah; messages: +
2011-06-20 18:35:46 jcon set nosy: + jcon
2011-05-14 15:47:51 mark.dickinson set messages: +
2011-05-14 15:47:14 mark.dickinson set assignee: mark.dickinson ->
2011-05-09 15:35:32 vstinner set nosy: + vstinner; messages: +
2011-05-09 15:32:23 pitrou set stage: patch review -> needs patch
2011-05-09 15:32:17 pitrou set stage: test needed -> patch review; messages: +; versions: + Python 3.3, - Python 3.2
2011-02-13 13:07:51 ncoghlan set nosy: gregory.p.smith, teoliphant, mark.dickinson, ncoghlan, pitrou, xuanji; messages: +
2011-02-13 12:09:25 xuanji set nosy: gregory.p.smith, teoliphant, mark.dickinson, ncoghlan, pitrou, xuanji; messages: +
2011-02-13 11:53:28 xuanji set nosy: + xuanji
2011-01-04 01:44:06 pitrou set assignee: mark.dickinson; nosy: + mark.dickinson
2010-08-09 03:19:09 terry.reedy set stage: test needed; versions: + Python 3.2, - Python 3.1
2009-02-12 23:53:36 gregory.p.smith set messages: +
2009-02-12 21:47:20 pitrou set messages: +
2009-02-12 21:39:02 pitrou create