Issue 16636: codecs: readline() followed by readlines() returns truncated results

readlines() on a file opened with codecs.open() does not read to the end of the file if it is called after readline().

I skimmed through the issues with "codecs" in the title and could not find one that sounded identical.

Repro follows:

$ cat sample_text.txt
Subject: Incorrect email address

RATER EMAIL rejected an invitation from SUBJECT NAME in
the PROJECT TITLE project. Notification was sent to ,
but the email address was no longer valid.

$ python
Python 2.7.3 (default, Sep 26 2012, 21:53:58)
[GCC 4.7.2] on linux2
>>> import codecs

No problem if readlines() is run at the beginning:
>>> f_in = codecs.open('sample_text.txt', 'rb', 'utf-8')
>>> f_in.readlines()
[u'Subject: Incorrect email address\n', u'\n', u'RATER EMAIL rejected an invitation from SUBJECT NAME in\n', u'the PROJECT TITLE project. Notification was sent to ,\n', u'but the email address was no longer valid.']
>>> f_in.close()

Let us try to read the first line separately,
and then read the remainder of the file:
>>> f_in = codecs.open('sample_text.txt', 'rb', 'utf-8')
>>> f_in.readline()
u'Subject: Incorrect email address\n'
>>> f_in.readlines()
[u'\n', u'RATER EMAIL rejected an invitation fro']

The first readlines() call does not read to the end of the file. A subsequent readlines() call returns what is left to read.
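
For convenience, the same steps can be written as a standalone script. This is only a sketch of the repro, not a fix. The SAMPLE text is reconstructed from the readlines() output shown in this report, a temporary file stands in for the attached sample_text.txt, and the names compare_reads and main are made up for illustration. Whether it reports truncation depends on the Python version it is run with.

```python
import codecs
import os
import tempfile

# Assumption: file contents reconstructed from the readlines() output in the report.
SAMPLE = (
    u"Subject: Incorrect email address\n"
    u"\n"
    u"RATER EMAIL rejected an invitation from SUBJECT NAME in\n"
    u"the PROJECT TITLE project. Notification was sent to ,\n"
    u"but the email address was no longer valid."
)


def compare_reads(path):
    """Return (all_lines, first_line, first_rest, leftover) for the file at path."""
    # Baseline: readlines() on a freshly opened file.
    f_in = codecs.open(path, "rb", "utf-8")
    try:
        all_lines = f_in.readlines()
    finally:
        f_in.close()

    # Case from the report: readline() first, then readlines().
    f_in = codecs.open(path, "rb", "utf-8")
    try:
        first_line = f_in.readline()
        first_rest = f_in.readlines()
        # Whatever the first readlines() call did not consume.
        leftover = f_in.read()
    finally:
        f_in.close()
    return all_lines, first_line, first_rest, leftover


def main():
    # Write the sample text to a temporary file (stand-in for sample_text.txt).
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(SAMPLE.encode("utf-8"))
        all_lines, first_line, first_rest, leftover = compare_reads(path)
    finally:
        os.remove(path)

    # The reads agree only if readline() + readlines() covers the whole file.
    truncated = [first_line] + first_rest != all_lines
    print("truncated" if truncated else "complete")
    return all_lines, first_line, first_rest, leftover


if __name__ == "__main__":
    main()
```

If the script prints "truncated", the text missing from the first readlines() call ends up in leftover, which matches the observation that a subsequent read returns what is left.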

sample_text.txt attached.