Issue 1397960: File-iteration and read* method protection

Created on 2006-01-05 18:18 by twouters, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)

msg49268

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2006-01-05 18:18

This patch causes the readline, readlines, read and readinto methods of file objects (as well as PyFile_ReadLine() called on actual file objects) to raise an exception if (and only if) there is data in the file-iteration buffer. Currently, if any of these methods is called during file iteration (with a non-empty buffer), the read* methods return data following the buffer, causing the data to seem out of order. Or, if all of the file's remaining data is already in the buffer, the data seems truncated.

The patch is only supposed to raise an error in cases where, previously, the file's data would appear corrupted. It doesn't prevent mixing iteration and read*-methods in harmless situations (methods followed by iteration, iteration until EOF followed by methods, or iteration until an exact buffer boundary followed by methods). The exception currently raised is ValueError, because it seems least inappropriate. Also, read* on closed files raises ValueError (probably for the same reason). The test_file test has been amended to include tests for this behaviour.
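To make the failure mode concrete, here is a minimal pure-Python sketch of a file object with a separate read-ahead buffer for iteration. It is a toy model, not the code in fileobject.c: the class name `ReadaheadFile`, the `bufsize` and `protect` parameters, and the error message text are all assumptions made for illustration. With `protect=False` it mimics the behaviour described above (read returns data from after the buffer); with `protect=True` it mimics the patched behaviour.

```python
import io


class ReadaheadFile:
    """Toy model (hypothetical) of a file whose iterator uses its own buffer."""

    def __init__(self, raw, bufsize=8, protect=False):
        self._raw = raw          # underlying stream, stands in for the C FILE*
        self._bufsize = bufsize  # assumed read-ahead chunk size
        self._buf = b""          # iteration buffer; read() never consults it
        self._protect = protect  # True models the behaviour after the patch

    def __iter__(self):
        return self

    def __next__(self):
        # Read ahead until the buffer holds a complete line or we reach EOF.
        while b"\n" not in self._buf:
            chunk = self._raw.read(self._bufsize)
            if not chunk:
                break
            self._buf += chunk
        if not self._buf:
            raise StopIteration
        line, sep, rest = self._buf.partition(b"\n")
        self._buf = rest
        return line + sep

    def read(self, size=-1):
        # Only refuse when buffered data would otherwise be skipped.
        if self._protect and self._buf:
            raise ValueError("mixing iteration and read methods would lose data")
        return self._raw.read(size)


DATA = b"one\ntwo\nthree\nfour\n"

# Unprotected: the first next() pulls b"one\ntwo\n" into the buffer,
# so read(5) skips b"two\n" and returns data that follows the buffer.
f = ReadaheadFile(io.BytesIO(DATA), bufsize=8)
first_line = next(f)      # b"one\n"
out_of_order = f.read(5)  # b"three"
```

In this model, stopping iteration exactly at a buffer boundary (two calls to `next()` with `bufsize=8`) leaves the buffer empty, so a following `read()` is allowed even with `protect=True`, which matches the "harmless situations" listed in the message.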

msg49269

Author: Jim Jewett (jimjjewett)

Date: 2006-01-09 18:50

Since you're already adding the if-test (to be able to raise an error), why not just read the data from the buffer if there is one, instead of raising an error?
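For comparison, the alternative suggested here (serve buffered data first instead of raising) could look roughly like the sketch below. This is again a toy pure-Python model with hypothetical names (`DrainingFile`, `bufsize`), not a proposal for fileobject.c, and it deliberately ignores nonblocking files and writes.

```python
import io


class DrainingFile:
    """Toy model (hypothetical): read() drains the iteration buffer first."""

    def __init__(self, raw, bufsize=8):
        self._raw = raw          # underlying stream, stands in for the C FILE*
        self._bufsize = bufsize  # assumed read-ahead chunk size
        self._buf = b""          # buffer shared by iteration and read()

    def __iter__(self):
        return self

    def __next__(self):
        # Read ahead until the buffer holds a complete line or we reach EOF.
        while b"\n" not in self._buf:
            chunk = self._raw.read(self._bufsize)
            if not chunk:
                break
            self._buf += chunk
        if not self._buf:
            raise StopIteration
        line, sep, rest = self._buf.partition(b"\n")
        self._buf = rest
        return line + sep

    def read(self, size=-1):
        if size is None or size < 0:
            # Read everything: buffered bytes first, then the rest of the stream.
            data, self._buf = self._buf, b""
            return data + self._raw.read()
        # Take what the buffer can supply, then top up from the stream.
        data, self._buf = self._buf[:size], self._buf[size:]
        if len(data) < size:
            data += self._raw.read(size - len(data))
        return data


g = DrainingFile(io.BytesIO(b"one\ntwo\nthree\nfour\n"), bufsize=8)
line = next(g)     # b"one\n"
chunk = g.read(5)  # b"two\nt": 4 bytes from the buffer, 1 from the stream
rest = g.read()    # b"hree\nfour\n"
```

Even in this simplified form, `read()` needs a separate branch for the case where the buffer holds some data but not enough, and the other read* methods would each need similar handling.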

msg49270

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2006-01-10 10:46

It's more code. More to maintain, more to break. For instance, if the buffer has some data but not enough, you have to read more into the buffer, and I have a nagging suspicion the iteration buffer doesn't handle nonblocking files correctly. I'd rather not change all that unless normal (non-iteration) operation uses the buffer in the first place, and that is going to subtly change some of the file semantics (like mixing reading and writing). The current rule is "don't mix iteration and read*-methods"; all this patch does is enforce that rule when necessary, not change it.

I'd love to overhaul fileobject.c completely, changing it all to buffered mode unless explicitly asked not to. It would probably reduce fileobject.c in size by half :) But that's not a minor change, and Guido already said he wanted to replace most C stdio for Py3K, where possible (and I agree.) If we can achieve the same effect with reasonable backward- and forward-compatibility in 2.6, that would be great, but I'm not going to rush anything to get it in 2.5.

msg49271

Author: Neal Norwitz (nnorwitz) * (Python committer)

Date: 2006-02-12 06:17

Thomas, I'm fine with the patch going in, though I would prefer to see the bagofham.txt generated rather than checked in.

msg49272

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2006-02-12 12:17

Thanks, checked in a version that generates its testfile.