BUG: read_csv with trailing comma, specified header names and usecols gives confusing error · Issue #29042 · pandas-dev/pandas
I ran into a somewhat strange case: when a file is malformed (trailing commas), we typically set the first column as the index. But if you also pass custom names and use usecols, you get a cryptic error message:
import io
import pandas as pd
s = """a, b, c, d
1,2,3,4,
5,6,7,8,"""
>>> pd.read_csv(io.StringIO(s), header=0, names=['A', 'B', 'C', 'D'], usecols=[2,3])
...
ValueError: Passed header names mismatches usecols
I am not fully sure what the behaviour should be, but the current error message is not very helpful: the usecols argument contains only integers, so the selection is positional and should not need to match the header names.
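
For reference, the implicit-index behaviour for trailing commas can be reproduced in isolation. This is only a minimal sketch: the sample data and variable names are made up for illustration (the header has no spaces, unlike the report above), and `index_col=False` is the documented way to stop pandas from using the first column as the index. Whether it also avoids the error when `names` and `usecols` are combined is not checked here, and recent pandas versions may emit a ParserWarning for the second call.

```python
import io

import pandas as pd

# Illustrative data: 4 header fields, but every data row ends with a
# trailing comma and therefore has 5 fields.
data = "a,b,c,d\n1,2,3,4,\n5,6,7,8,"

# Default behaviour: the extra field makes pandas treat the first
# column as the index, so the values shift one column to the left.
shifted = pd.read_csv(io.StringIO(data))

# index_col=False tells pandas not to use the first column as the
# index; the empty trailing field is dropped instead.
aligned = pd.read_csv(io.StringIO(data), index_col=False)
```

In `shifted`, the index holds the values 1 and 5 and column `d` is entirely NaN; in `aligned`, column `a` holds 1 and 5 and the index is the default range.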