msg106630 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2010-05-28 00:12 |
Some of my tests use io.StringIO and assert that captured print output equals expected output. Until now, I reused one buffer by truncating between tests. I recently replaced 3.1.1 with 3.1.2 (WinXP) and the *second* test of each run started failing. The following minimal code shows the problem (and should suggest a new unit test):

from io import StringIO; s = StringIO(); print(repr(s.getvalue()))
print('abc', file=s); print(repr(s.getvalue()))
s.truncate(0); print(repr(s.getvalue()))
print('abc', file=s); print(repr(s.getvalue()))

prints (both command window and IDLE)

''
'abc\n'
''
'\x00\x00\x00\x00abc\n' # should be and previously would have been 'abc\n'

s.truncate(0) zeros the buffer and appears to set the length to 0, but a subsequent print sees the length as what it was before the truncate and appends after the zeroed characters. Ugh.

I presume the problem is StringIO-emulation specific but have not tested 'real' files to be sure.

---
also...

>>> help(s.truncate)
Help on built-in function truncate:

truncate(...)
    Truncate size to pos. ...

should be, for greater clarity, something like

truncate([pos])
    Truncate the size of the file or buffer to pos ... |
|
|
msg106631 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-05-28 00:16 |
This was an exceptional API change in 3.1.2: truncate() doesn't move the file pointer anymore, you have to do it yourself (with seek(0) in your case). I'm sorry for the inconvenience; the change was motivated by the desire of having an API more consistent with other file-handling APIs out there. |
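A minimal sketch of the workaround described above, for a test suite that reuses one buffer. The helper name `reset` is hypothetical and not part of the io module; it only combines the seek(0) and truncate() calls mentioned in this message.

```python
from io import StringIO


def reset(buffer):
    """Empty an in-memory text buffer so the next write starts at offset 0."""
    buffer.seek(0)      # move the stream position back to the start
    buffer.truncate()   # size defaults to the current position, which is now 0


s = StringIO()
print('abc', file=s)
first = s.getvalue()

reset(s)                # replaces the bare s.truncate(0) from the report
print('abc', file=s)
second = s.getvalue()

print(repr(first), repr(second))
```

Seeking first and then truncating without an argument leaves both the size and the position at 0, so it should not matter whether truncate() itself moves the position.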
|
|
msg106698 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2010-05-28 23:03 |
This should not have been closed yet. The announced policy is that bugfix releases should not add or change APIs. I think this hidden change (there is no What's New in 3.1.2 doc) should be reverted in 3.1.3. I will post on py-dev for other opinions.

That aside, I think both the current behavior and docs are buggy and should be changed for 3.2 (and 3.1.3 if not reverted).

1. If the file pointer is not moved, then it seems to me that line 3 of my example output should have been '\0\0\0\0' instead of ''. The current behavior is '' + 'abc\n' == '\0\0\0\0abc\n', which is not sane. Maybe .getvalue() needs to be changed. It is hard to further critique the observed behavior since the intent is, to me, essentially undocumented.

2. The current 3.1.2/3.2a0 manual entry

"truncate(size=None)
Truncate the file to at most size bytes. size defaults to the current file position, as returned by tell(). Note that the current file position isn’t changed; if you want to change it to the new end of file, you have to seek() explicitly."

has several problems.
a. 'file' should be changed to 'stream' to be consistent with other entries.
b. If "truncate the file to at most size bytes" does not mean 'change the stream position', then I have no idea what it is supposed to mean, or what .truncate is actually supposed to do.
c. There is no mention that what it does do is to replace existing chars with null chars. (And the effect of that is/should be different in Python than in C.)
d. There is no mention of the return value and what *it* is supposed to mean.

3. The current 3.1.2 (and I presume, 3.2a0) doc string (help entry)

"truncate(...)
Truncate size to pos. The pos argument defaults to the current file position, as returned by tell(). The current file position is unchanged. Returns the new absolute position."

also has problems.
a. Same change of 'file' to 'stream'.
b. I already commented on ... and 'truncate size to pos', but to be consistent with the manual, the arg should be called 'size' rather than 'pos', or vice versa.
c. 'truncate' does not define the meaning of 'truncate', especially when it no longer means what a native English speaker would expect it to mean.
d. To me, 'the *new* absolute position' contradicts 'The current file position is *unchanged*' [emphases added]. Is there some subtle, undocumented, distinction between 'absolute position' and 'file [stream] position'? In any case, .truncate(0) returns 0, which seems to become the new position for .getvalue() but not for appending chars with print. To me, having *two* stream positions for different functions is definitely a bug.

4. There is no mention of a change to .truncate in What's New in Python 3.2.

After searching more, I see that the change was discussed in #6939, by only two people. I see no justification there for changing 3.1 instead of waiting for 3.2. The OP suggested in his initial message, as I do here, that the doc say something about what .truncate does do with respect to padding, but that did not happen. |
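The apparent "two positions" in point 3d can be separated by looking at each value on its own. This is only a sketch against a CPython 3 StringIO, with variable names chosen for illustration: it records what truncate(0) returns, what tell() reports afterwards, and what a later write does.

```python
from io import StringIO

s = StringIO()
s.write('abc\n')

returned = s.truncate(0)   # value returned by truncate()
position = s.tell()        # stream position after the call
contents = s.getvalue()    # buffer contents after the call

s.write('abc\n')           # written at the old position, not at offset 0
padded = s.getvalue()

print(returned, position, repr(contents), repr(padded))
```

Read this way, the returned value tracks the size of the buffer while tell() still reports the old position, and the gap between the two is what gets filled with null characters on the next write.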
|
|
msg106706 - (view) |
Author: Alyssa Coghlan (ncoghlan) *  |
Date: 2010-05-29 03:43 |
For the record, Guido's decision to change 3.1: http://mail.python.org/pipermail/python-dev/2009-September/092247.html |
|
|
msg106711 - (view) |
Author: Pascal Chambon (pakal) * |
Date: 2010-05-29 07:36 |
The change was announced in http://docs.python.org/dev/whatsnew/2.7.html, but indeed it wasn't advertised in py3k changes - my apologies, I didn't check it was.

I agree that the doc should be clarified on several aspects.

* The returned value is the new file SIZE indeed (I guess we can still use "file" here, since imo other streams can't be truncated anyway).
* Truncate() simply changes the current end-of-file (the word is historical, resize() would have been better - as this has been discussed on mailing lists).
* Extending the file size with truncate() or with a write() after end-of-file (that's your sample's case) does, or does not (depending on the platform), fill the empty space with zeroes.

Proposal for doc update :

Resizes the file to the given size (or the current position), without moving the file pointer. This resizing can extend or reduce the current file size. In case of extension, the content of the new file area depends on the platform (on most systems, additional bytes are zero-filled, on win32 they're undetermined). Returns the new file size.

Would it be ok thus ? |
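The shrinking half of the proposed wording can be checked on a real file. The sketch below assumes a CPython 3 binary file object; the file name demo.bin and the variable names are made up for the example. The extension case is left out because, as noted above, its result depends on the platform.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.bin')   # hypothetical file name
    with open(path, 'w+b') as f:
        f.write(b'hello world')            # 11 bytes; position is now 11
        new_size = f.truncate(5)           # shrink the file to 5 bytes
        position_after = f.tell()          # position is left where it was
        f.seek(2)
        default_size = f.truncate()        # no argument: use the current position
    size_on_disk = os.path.getsize(path)   # measured after the file is closed

print(new_size, position_after, default_size, size_on_disk)
```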
|
|
msg106716 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-05-29 11:56 |
How about reusing the documentation of legacy file objects: “Truncate the file’s size. If the optional size argument is present, the file is truncated to (at most) that size. The size defaults to the current position. The current file position is not changed. Note that if a specified size exceeds the file’s current size, the result is platform-dependent: possibilities include that the file may remain unchanged, increase to the specified size as if zero-filled, or increase to the specified size with undefined new content.” http://docs.python.org/library/stdtypes.html#file.truncate |
|
|
msg106717 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-05-29 12:09 |
I've committed a doc update (a mix of the legacy truncate() doc and Pascal's proposal) in r81594. |
|
|
msg106725 - (view) |
Author: Pascal Chambon (pakal) * |
Date: 2010-05-29 16:47 |
Good B-) Would it be necessary to update the docstrings too ? |
|
|
msg219943 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2014-06-07 15:07 |
Is any more work needed here as asks about updating doc strings? |
|
|
msg226442 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2014-09-05 19:09 |
Can this be closed or what? |
|
|
msg226451 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2014-09-05 20:07 |
The docstring is unchanged from before the behavior change and to me still has problems a. to d. listed in . The manual entry seems too long to just copy, but I would not know how to condense it. |
|
|
msg280168 - (view) |
Author: A.M. Kuchling (akuchling) *  |
Date: 2016-11-06 20:28 |
"Why, this is a simple docstring change. How difficult can it be?", I thought. Ah ha ha ha. Here's a patch against the 3.5 branch. It should also apply cleanly to 3.6 or 3.7, except for a little Argument Clinic noise. The patch changes 3 occurrences of the truncate() docstring in Lib/_pyio.py, and 1 each in Modules/_io/{bytesio.c,fileio.c,iobase.c,stringio.c}. Whew! Do we want to change all of these occurrences, or just the one specific case of StringIO? It seemed to me that we want to change them all. |
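To see how many separate truncate() docstrings are involved, one can print the first line of each from a running interpreter. This is only an illustrative sketch: the list of classes is an assumption based on the files named above, and the text printed depends on the Python version in use.

```python
import io
import _pyio  # pure-Python implementation of io shipped with CPython

# Classes whose truncate() docstrings live in the files named in the patch
# (an assumed, non-exhaustive selection).
classes = [io.IOBase, io.FileIO, io.BytesIO, io.StringIO, _pyio.IOBase]

docstrings = {
    f'{cls.__module__}.{cls.__name__}': cls.truncate.__doc__
    for cls in classes
}

for name, doc in docstrings.items():
    # A method may lack a docstring, so fall back to an empty string.
    first_line = (doc or '').strip().splitlines()[:1]
    print(name, first_line)
```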
|
|
msg280210 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2016-11-07 16:54 |
The patch looks good to me, but perhaps we should make these docstrings shorter and refer people to the actual documentation for details? We recently did this in subprocess and venv modules. |
|
|
msg280726 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2016-11-14 01:45 |
In general I agree with the doc strings giving the main details, and leaving smaller details for the reference documentation. Maybe for concrete implementations like BytesIO, it is not worth saying the expanded contents are undefined. One other way to make them shorter is to drop “as reported by tell()”.

diff --git a/Lib/_pyio.py b/Lib/_pyio.py
@@ -344,10 +344,12 @@
def truncate(self, pos=None):
+ """Resize stream to at most size bytes.

Wouldn’t it be more correct to say “at most ‘pos’ bytes”?

You should avoid mentioning byte sizes for TextIOBase.truncate(); see Issue 25849. Also applies to the StringIO subclasses.

+ Position in the stream is left unchanged. Size defaults to

I think it should be “The position in the stream . . .”, to match the other full sentences in these paragraphs. |
|
|