Issue 14475: codecs.StreamReader.read behaves differently from regular files
Created on 2012-04-02 13:13 by tdb, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (4)
Author: Mikko Rasa (tdb)
Date: 2012-04-02 13:13
For regular files, a read() call without arguments will read until EOF. codecs.StreamReader does its own buffering, and if there are characters in the buffer, a read() call will be satisfied from the buffer without an attempt to read the rest of the file. This discrepancy causes certain code that worked with regular open() to fail if codecs.open() is substituted.
The easiest way to reproduce this is to first call readline() and then read(). Since readline() can't know how many characters are on the line, it will almost always leave some characters in the buffer, triggering the problem with read().
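A minimal sketch of the reproduction described above, assuming an in-memory byte stream that is longer than what readline() buffers on its first call. The helper names (`rest_after_readline`, `stream_reader`, `text_wrapper`) and the sample data are illustrative and not part of the report. On an affected version, the StreamReader result would be expected to stop short of EOF, while the TextIOWrapper result covers the rest of the stream.

```python
import codecs
import io

# Hypothetical sample data: enough lines that readline() buffers past the first one.
SAMPLE = "".join("line %d\n" % i for i in range(200)).encode("utf-8")


def rest_after_readline(make_reader):
    """Call readline() once, then read() with no arguments; return both results."""
    reader = make_reader(io.BytesIO(SAMPLE))
    first = reader.readline()
    rest = reader.read()
    return first, rest


def stream_reader(raw):
    # The reader class that codecs.open() relies on.
    return codecs.getreader("utf-8")(raw)


def text_wrapper(raw):
    # The class that the built-in open() uses in text mode.
    return io.TextIOWrapper(raw, encoding="utf-8")


if __name__ == "__main__":
    full = SAMPLE.decode("utf-8")
    for factory in (stream_reader, text_wrapper):
        first, rest = rest_after_readline(factory)
        # True when read() continued to EOF instead of returning only buffered text.
        print(factory.__name__, first + rest == full)
```

Using `codecs.getreader()` on a `BytesIO` keeps the sketch free of temporary files; the same sequence of calls applies to an object returned by codecs.open().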
Author: STINNER Victor (vstinner) *
Date: 2012-04-02 20:03
Oh, yet another bug in codecs.StreamReader. I should add it to the PEP :-) http://www.python.org/dev/peps/pep-0400/
Use io.TextIOWrapper (open) instead of codecs.StreamReader (codecs.open), it's bugfree :-)
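As a hedged illustration of that suggestion, the sketch below reads a header line and then the rest of a file through io.open(), which returns an io.TextIOWrapper. The function names, the temporary file, and its contents are assumptions made up for the example.

```python
import io
import os
import tempfile


def read_header_and_body(path, encoding="utf-8"):
    """Read the first line with readline(), then the rest with read()."""
    # io.open is the built-in open() on Python 3; it returns an io.TextIOWrapper.
    with io.open(path, "r", encoding=encoding) as f:
        header = f.readline()
        body = f.read()  # no size argument: reads until EOF
    return header, body


def demo():
    # Hypothetical file contents, written to a temporary file for illustration.
    text = "header\n" + "".join("row %d\n" % i for i in range(100))
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(text.encode("utf-8"))
        return read_header_and_body(path), text
    finally:
        os.remove(path)
```

The only change relative to a codecs.open() call site is the function used to open the file; the readline() and read() calls stay the same.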
Author: Andrew (A.S)
Date: 2012-05-16 23:37
Just got this behavior, with readlines(), which is unsurprising since it internally uses read() as described in the original bug report.
The break on line 468 of codecs.py seems to be the problem; removing that conditional locally fixes it.
http://hg.python.org/cpython/file/f6a207d86154/Lib/codecs.py#l466
I may be overlooking something, but I would assume this should be checking whether the character buffer extends to the EOF of the underlying stream at this point?
As stated before, this can be reproduced by: f = codecs.open(...); f.read(); f.readlines()
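For code that has to keep using a StreamReader, one possible caller-side workaround (not proposed in this thread, and only a sketch) is to keep calling read() until it returns an empty string, so the result does not depend on whether a single call stops at the buffer. The helper name `read_to_eof` is hypothetical.

```python
import codecs
import io


def read_to_eof(reader):
    """Call read() repeatedly until it returns an empty string."""
    chunks = []
    while True:
        chunk = reader.read()
        if not chunk:
            break
        chunks.append(chunk)
    return "".join(chunks)


if __name__ == "__main__":
    # Hypothetical data; any stream longer than readline()'s first chunk would do.
    data = "".join("line %d\n" % i for i in range(200)).encode("utf-8")
    reader = codecs.getreader("utf-8")(io.BytesIO(data))
    first = reader.readline()
    rest = read_to_eof(reader)
    print(first + rest == data.decode("utf-8"))
```

The loop costs one extra read() call at EOF when a single call already returns everything.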
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2012-12-07 20:03
This is obviously a duplicate of and .
History
Date | User | Action | Args
2022-04-11 14:57:28 | admin | set | github: 58680
2012-12-07 20:03:38 | serhiy.storchaka | set | status: open -> closed; superseder: When I use codecs.open(...) and f.readline() follow up by f.read() return bad result; nosy: + serhiy.storchaka; messages: +; resolution: duplicate; stage: resolved
2012-05-16 23:37:34 | A.S | set | nosy: + A.S; messages: +
2012-04-02 20:03:58 | vstinner | set | nosy: + vstinner; messages: +
2012-04-02 13:13:33 | tdb | create |