csv_reader with limited number of columns should completely disregard the unused fields

xref #6710

I have a CSV whose lines may have 11 or 18 fields. I only need to read the first 6 fields, so I use "usecols=range(6)". Even though only those 6 columns are requested, I still get the exception:

ValueError: Expected 11 fields in line 776483, saw 18

The csv_reader should completely disregard the unused fields.

Small test case:

import io
import pandas as pd
csv = '19,29,39\n'*2 + '10,20,30,40\n'
df = pd.read_csv(io.StringIO(csv), engine='python', header=None, usecols=list(range(3)))

It also affects the C engine.
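
Until this is fixed, one possible workaround is to trim every row to the needed fields before the text reaches the parser. The following is only a minimal sketch using the standard csv module; keep_first_fields is a hypothetical helper name, not part of pandas, and it assumes the input fits in memory.

```python
import csv
import io

def keep_first_fields(text, n):
    """Return CSV text in which every row is cut down to its first n fields.

    Hypothetical helper for illustration only; not a pandas function.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        # Slicing keeps at most n fields; shorter rows pass through unchanged.
        writer.writerow(row[:n])
    return out.getvalue()

data = '19,29,39\n' * 2 + '10,20,30,40\n'
trimmed = keep_first_fields(data, 3)
```

The trimmed text has the same number of fields on every line, so it could then be passed to pd.read_csv(io.StringIO(trimmed), header=None) without usecols. This costs an extra pass over the data, which is why handling it inside the reader would be preferable.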

Discussed on the users mailing list at https://groups.google.com/d/topic/pydata/vjhFpHtgnvw/discussion