[Python-Dev] file.next() vs. file.readline()

Thomas Wouters thomas at xs4all.net
Wed Jan 4 17:34:19 CET 2006


Twice now, I've been bitten by the 'mixing file-iteration with readline' issue. (Yeah, twice. Good thing I don't write anything important in Python ;) I've also seen a number of newbies bitten by the same thing. The issue, for those who aren't aware, is that when mixing file-iteration and readline (or readlines or such), you run the risk of getting data in the wrong order. This is because file-iteration reads ahead into its own 8k buffer, while readline doesn't use that buffer. It's not really a big deal, once you understand it's not wise to mix iteration and the readline(s) methods.
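
To make the failure mode concrete, here is a small sketch. It is a toy model written for current Python, not the actual fileobject.c code: the class name ReadAheadFile and its attributes are made up for illustration, and the buffer is shrunk from 8k to 14 bytes so the effect shows up on a few short lines.

```python
import io


class ReadAheadFile:
    """Toy model (hypothetical): iteration reads ahead, readline() does not."""

    def __init__(self, raw, bufsize=8192):
        self._raw = raw          # underlying binary stream
        self._bufsize = bufsize  # size of each read-ahead chunk
        self._buf = b""          # read ahead by iteration, not yet returned

    def __iter__(self):
        return self

    def __next__(self):
        # Refill the private buffer until it holds a full line or we hit EOF.
        while b"\n" not in self._buf:
            chunk = self._raw.read(self._bufsize)
            if not chunk:
                break
            self._buf += chunk
        if not self._buf:
            raise StopIteration
        line, sep, rest = self._buf.partition(b"\n")
        self._buf = rest
        return line + sep

    def readline(self):
        # Goes straight to the underlying stream, ignoring self._buf.
        return self._raw.readline()


data = b"".join(b"line %d\n" % i for i in range(5))
f = ReadAheadFile(io.BytesIO(data), bufsize=14)  # tiny buffer, demo only

first = next(f)        # pulls "line 0" and "line 1" into the buffer
second = f.readline()  # skips past the buffered "line 1"
third = next(f)        # the buffered "line 1" only shows up now

print(first, second, third)
```

In this model the three calls yield line 0, line 2, line 1, in that order: nothing is lost outright, but the data arrives out of sequence.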

I do wonder, though, why Python doesn't take care to raise an exception when readline is called with 'stuff' in the iteration-buffer. A cursory look at the source doesn't reveal any glaring problems. It's a single check, possibly two, with good locality of reference. Also, raising an exception when there is stuff in the buffer would only make mixing iteration/readline an error when you would actually lose or mess up data. In other words, it would only raise exceptions for existing cases that are already broken.
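
Roughly, the check I have in mind would behave like the sketch below. Again this is a toy model for illustration, not a patch against fileobject.c: the class name CheckedFile, the choice of ValueError, and the error message are all assumptions, and the 14-byte buffer is only there to keep the demo small.

```python
import io


class CheckedFile:
    """Toy model (hypothetical): readline() refuses to run over buffered data."""

    def __init__(self, raw, bufsize=8192):
        self._raw = raw          # underlying binary stream
        self._bufsize = bufsize  # size of each read-ahead chunk
        self._buf = b""          # read ahead by iteration, not yet returned

    def __iter__(self):
        return self

    def __next__(self):
        # Refill the private buffer until it holds a full line or we hit EOF.
        while b"\n" not in self._buf:
            chunk = self._raw.read(self._bufsize)
            if not chunk:
                break
            self._buf += chunk
        if not self._buf:
            raise StopIteration
        line, sep, rest = self._buf.partition(b"\n")
        self._buf = rest
        return line + sep

    def readline(self):
        # The proposed check: only an error when data would be skipped.
        if self._buf:
            raise ValueError("readline() called with data in iteration buffer")
        return self._raw.readline()


data = b"".join(b"line %d\n" % i for i in range(5))
f = CheckedFile(io.BytesIO(data), bufsize=14)  # tiny buffer, demo only

before = f.readline()  # buffer empty: works as usual
one = next(f)          # buffers "line 1" and "line 2", returns "line 1"
try:
    f.readline()       # "line 2" is still buffered, so this raises
    raised = False
except ValueError:
    raised = True
two = next(f)          # drains the buffer
after = f.readline()   # buffer empty again: allowed
```

Note that readline keeps working both before iteration starts and after the buffer has been drained; only the call that would have skipped buffered data fails.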

Is there something I've missed that makes the check undesirable or infeasible? Or should I just assume no one has gotten around to it (or could be bothered), and just submit a patch? :)

(Making readline and friends use the same buffer may seem the better solution to some, but I'm sure there's a whole discussion behind that, about whether to buffer or not. All non-iteration routines in fileobject.c take pretty good care not to read too much, and I choose to see that as explicitly designed that way.)

Absent-ly y'rs,

Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


