[issue4428] make io.BufferedWriter observe max_buffer_size limits - Code Review (original) (raw)

Can't Edit Can't Publish+Mail Start Review Created: 16 years, 3 months ago by gregory.p.smith Modified: 16 years, 2 months ago Reviewers: Antoine Pitrou Base URL: http://svn.python.org/view/\checkout\/python/branches/py3k/ Visibility: Public.	Description http://bugs.python.org/issue4428 - patch gps04. Patch Set 1# Total comments: 10 Patch Set 2 : updated, many more test cases. now handles buffering on array.arrays properly.# Total comments: 8 Downloaded from: http://bugs.python.org/file12914/issue4428-io-bufwrite-gps05.diff Created: 16 years, 3 months ago Download[raw] [tar.bz2] Unified diffs Side-by-side diffs Delta from patch set Stats (+238 lines, -10 lines) Patch Lib/io.py View 1 2 chunks +75 lines, -7 lines 4 comments Download Lib/test/test_io.py View 1 2 chunks +163 lines, -3 lines 4 comments Download Messages Total messages: 6 Expand All Messages \| Collapse All Messages Antoine Pitrou http://codereview.appspot.com/12470/diff/1/2 File Lib/io.py (right): http://codereview.appspot.com/12470/diff/1/2#newcode1055 Line 1055: # b is an iterable of ints, it ... 16 years, 3 months ago (2009-01-20 11:12:46 UTC)#1 http://codereview.appspot.com/12470/diff/1/2 File Lib/io.py (right): http://codereview.appspot.com/12470/diff/1/2#newcode1055 Line 1055: # b is an iterable of ints, it won't always support len(). There is no reason for write() to accept arbitrary iterable of ints, only bytes-like and buffer-like objects. It will make the code simpler. http://codereview.appspot.com/12470/diff/1/2#newcode1060 Line 1060: # No buffer API? Make intermediate slice copies instead. Objects without the buffer API shouldn't be supported at all. http://codereview.appspot.com/12470/diff/1/2#newcode1066 Line 1066: while chunk and len(self._write_buf) > self.buffer_size: What if buffer_size == max_buffer_size? Is everything still written ok? http://codereview.appspot.com/12470/diff/1/2#newcode1070 Line 1070: written += e.characters_written e.characters_written can include bytes which were already part of the buffer before write() was called, but the newly raised BlockingIOError should only count those bytes which were part of the object passed to write(). http://codereview.appspot.com/12470/diff/1/3 File Lib/test/test_io.py (right): http://codereview.appspot.com/12470/diff/1/3#newcode496 Line 496: def testWriteNoLengthIterable(self): This shouldn't work at all. If it works right now, it is only a side-effect of the implementation. (it won't work with FileIO, for example) Sign in to reply to this message. gregory.p.smith Just responding to your comments on the support for generators and non buffer api supporting ... 16 years, 3 months ago (2009-01-20 22:44:37 UTC)#2 Just responding to your comments on the support for generators and non buffer api supporting inputs. I'll get to the other comments in the code soon with new unit tests for those cases. http://codereview.appspot.com/12470/diff/1/2 File Lib/io.py (right): http://codereview.appspot.com/12470/diff/1/2#newcode1055 Line 1055: # b is an iterable of ints, it won't always support len(). On 2009/01/20 11:12:47, Antoine Pitrou wrote: > There is no reason for write() to accept arbitrary iterable of ints, only > bytes-like and buffer-like objects. It will make the code simpler. > Agreed. But since I want to merge this into release30-maint doing that sounds like a behavior change. I'd be fine with removing it for the 3.1/2.7 version of this code (though I hope people will be using the C implementation instead). http://codereview.appspot.com/12470/diff/1/2#newcode1060 Line 1060: # No buffer API? Make intermediate slice copies instead. On 2009/01/20 11:12:47, Antoine Pitrou wrote: > Objects without the buffer API shouldn't be supported at all. Same reason as above. http://codereview.appspot.com/12470/diff/1/3 File Lib/test/test_io.py (right): http://codereview.appspot.com/12470/diff/1/3#newcode496 Line 496: def testWriteNoLengthIterable(self): On 2009/01/20 11:12:47, Antoine Pitrou wrote: > This shouldn't work at all. If it works right now, it is only a side-effect of > the implementation. > (it won't work with FileIO, for example) > hmm in that case it might not be too large of a thing to break when merged into release30-maint. I'll leave that up to the release manager & bdfl. my gut feeling for a release branch change is to say we have to live with this being supported for now. Sign in to reply to this message. Antoine Pitrou Hi! > Agreed. But since I want to merge this into release30-maint doing that sounds ... 16 years, 3 months ago (2009-01-20 23:27:47 UTC)#3 Hi! > Agreed. But since I want to merge this into release30-maint doing that sounds > like a behavior change. I'd be fine with removing it for the 3.1/2.7 version of > this code (though I hope people will be using the C implementation instead). Well, either it's supported and it will have to go through a deprecation phase, or it's unsupported and it can be ripped out right now... I don't think it should be supported at all, given that the semantics of writing an iterable of ints are totally non-obvious. Reading both the PEP and the docstrings in io.py, I only see mentions of "bytes" and "buffer", not of an iterable of ints. Perhaps Guido should pronounce. (do you know of any code relying on this behaviour? the C version obviously does not support it and all regression tests pass fine, except for an SSL bug I filed) Sign in to reply to this message. gregory.p.smith updated, many more test cases. now handles buffering on array.arrays properly. 16 years, 3 months ago (2009-02-01 06:27:43 UTC)#4 updated, many more test cases. now handles buffering on array.arrays properly. Sign in to reply to this message. gregory.p.smith updated in the -gps05 patch. http://codereview.appspot.com/12470/diff/1/2 File Lib/io.py (right): http://codereview.appspot.com/12470/diff/1/2#newcode1066 Line 1066: while chunk and ... 16 years, 3 months ago (2009-02-01 06:28:03 UTC)#5 updated in the -gps05 patch. http://codereview.appspot.com/12470/diff/1/2 File Lib/io.py (right): http://codereview.appspot.com/12470/diff/1/2#newcode1066 Line 1066: while chunk and len(self._write_buf) > self.buffer_size: On 2009/01/20 11:12:47, Antoine Pitrou wrote: > What if buffer_size == max_buffer_size? Is everything still written ok? > testcase added, yes that needed to be >=. thanks. http://codereview.appspot.com/12470/diff/1/2#newcode1070 Line 1070: written += e.characters_written On 2009/01/20 11:12:47, Antoine Pitrou wrote: > e.characters_written can include bytes which were already part of the buffer > before write() was called, but the newly raised BlockingIOError should only > count those bytes which were part of the object passed to write(). > Fixed. I also added several test cases for BlockingIOError characters_written behavior. Sign in to reply to this message. Antoine Pitrou In general I think the patch is quite complicated for what it tries to solve. ... 16 years, 2 months ago (2009-02-12 21:30:44 UTC)#6 In general I think the patch is quite complicated for what it tries to solve. In particular I think the memoryview itemsize issue makes things trickier than they should be. It would be nice if memoryview could support changing the internal format, and for the most common case of switching to bytes ('B') it would be easy to implement. http://codereview.appspot.com/12470/diff/401/601 File Lib/io.py (right): http://codereview.appspot.com/12470/diff/401/601#newcode1080 Line 1080: data = memoryview(data.tostring()) memoryview doesn't have a tostring() method (it's called tobytes()). http://codereview.appspot.com/12470/diff/401/601#newcode1099 Line 1099: remaining_bytes = (len(data)-items_written)*item_size I think you must first update `items_written` by taking `e.characters_written` into account. (and the problem is that `e.characters_written` is not necessarily a multiple of `item_size`!... which makes me think that the memoryview approach, while algorithmically elegant, might not be a good choice here :-)) http://codereview.appspot.com/12470/diff/401/601#newcode1108 Line 1108: written += e.characters_written - before You could first try to stuff as much data as possible in `self._write_buf`. http://codereview.appspot.com/12470/diff/401/601#newcode1127 Line 1127: if len(self._write_buf) > self.max_buffer_size: Can it still happen? It seems you are taking all precautions to avoid it. http://codereview.appspot.com/12470/diff/401/602 File Lib/test/test_io.py (right): http://codereview.appspot.com/12470/diff/401/602#newcode535 Line 535: max_buffer_size=1) What about buffer_size=0? Should it also raise? http://codereview.appspot.com/12470/diff/401/602#newcode559 Line 559: self.assertEquals(first_two_writes, writer._write_stack[:2]) This is very much implementation-dependent (for instance another implementation could write in buffer-sized chunks), it would be better IMHO to just b''.join(writer._write_stack) and compare the outcome. http://codereview.appspot.com/12470/diff/401/602#newcode618 Line 618: bufio.write(b"ijklm") # rejected, already full Not sure I understand why it should raise. 15 bytes is smaller than the max_buffer_size. http://codereview.appspot.com/12470/diff/401/602#newcode648 Line 648: self.assertEquals(0, e.characters_written) Here I would have expected the data to be buffered (2 bytes have been written before blocking, then the remaining data is 15 bytes which is smaller than max_buffer_size). Sign in to reply to this message. Expand All Messages	Collapse All Messages

Issue 12470: [issue4428] make io.BufferedWriter observe max_buffer_size limits
Created 16 years, 3 months ago by gregory.p.smith
Modified 16 years, 2 months ago
Reviewers: Antoine Pitrou
Base URL: http://svn.python.org/view/\*checkout\*/python/branches/py3k/
Comments: 18