ERR: Fail-fast with incompatible skipfooter combos by gfyoung · Pull Request #23711 · pandas-dev/pandas
Setup:
from pandas import read_csv
from pandas.compat import StringIO
data = "a\n1\n2\n3\n4\n5"
Case 1:
read_csv(StringIO(data), skipfooter=1, nrows=2)
Case 2:
read_csv(StringIO(data), skipfooter=1, chunksize=2)
Currently, we get:
Case 1:
... ValueError: skipfooter not supported for iteration
Case 2:
... <pandas.io.parsers.TextFileReader object at ...>
In Case 1, the error message is not correct. True, passing in nrows and skipfooter together is not supported (skipfooter should skip lines at the end of the file, NOT at the end of the nrows of data that we read in, which is what happens currently), but we should raise a better error message.
(BTW, the only way to make this combo work is to read in the entire file before cutting off the last skipfooter rows, but we can look into that subsequently...)
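To make the intended semantics concrete, here is a rough sketch of that "read everything first" approach using only the standard library. The helper name read_rows is hypothetical and this is not how pandas parses files; it only illustrates trimming the footer off the whole file before applying the row limit.

```python
import csv


def read_rows(text, skipfooter=0, nrows=None):
    """Hypothetical helper, not part of pandas: trim the footer, then limit rows."""
    lines = text.splitlines()
    if skipfooter:
        # Drop the last `skipfooter` lines of the *whole file* first.
        lines = lines[:-skipfooter]
    reader = csv.reader(lines)
    header = next(reader, [])
    rows = list(reader)
    if nrows is not None:
        # Only now keep the first `nrows` data rows.
        rows = rows[:nrows]
    return header, rows


data = "a\n1\n2\n3\n4\n5"
header, rows = read_rows(data, skipfooter=1, nrows=2)
```

The cost is that the entire input has to be held in memory before nrows can take effect, which is presumably why this is left for a follow-up.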
In Case 2, creating this reader is misleading. Any attempt to call .read() on it will raise the ValueError seen in Case 1. It is better to alert end-users immediately that this reader doesn't work.
After this PR, we get more useful error messages out of the gate:
Case 1:
... ValueError: 'skipfooter' not supported with 'nrows'
Case 2:
... ValueError: 'skipfooter' not supported for iteration
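As an illustration of the fail-fast idea, the check can be thought of as an up-front validation of the keyword combination, run before any reader is built. The sketch below is an assumption about the general shape, not the code in this PR: the function name check_skipfooter, its signature, and the order of the two checks are hypothetical; only the two error messages are taken from above.

```python
def check_skipfooter(skipfooter=0, nrows=None, chunksize=None, iterator=False):
    """Hypothetical validator: reject unsupported skipfooter combinations early."""
    if skipfooter > 0:
        if iterator or chunksize:
            raise ValueError("'skipfooter' not supported for iteration")
        if nrows is not None:
            raise ValueError("'skipfooter' not supported with 'nrows'")


# Mirror Case 1 (nrows) and Case 2 (chunksize) and collect the messages.
messages = []
for kwargs in ({"nrows": 2}, {"chunksize": 2}):
    try:
        check_skipfooter(skipfooter=1, **kwargs)
    except ValueError as err:
        messages.append(str(err))
```

Raising at call time means the user never gets back a reader object that can only fail later.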