read_csv in combination with index_col and usecols · Issue #2654 · pandas-dev/pandas
Starting point:
http://pandas.pydata.org/pandas-docs/stable/io.html#index-columns-and-trailing-delimiters
If there is one more column of data than there are column names, usecols exhibits some (at least to me) unintuitive behavior:
    data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'

    pd.read_csv(StringIO(data))
            a    b     c
    4   apple  bat   5.7
    8  orange  cow  10.0

    pd.read_csv(StringIO(data), usecols=['a', 'b'])
       a       b
    0  4   apple
    1  8  orange
I was expecting it to be equal to
    pd.read_csv(StringIO(data))[['a', 'b']]
            a    b
    4   apple  bat
    8  orange  cow
I am not sure whether my expectation is unfounded, though, or whether this behavior is in fact intentional.
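For reference, here is a minimal, self-contained sketch of how the expected result can be obtained: parse every column first, so that the implicit index is inferred, and select the columns afterwards. It assumes a reasonably recent pandas on Python 3; the names `full` and `expected` are only illustrative. It does not say anything about what `usecols` itself should do.

```python
from io import StringIO

import pandas as pd

# Three header names, but four fields per data row.
data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'

# With one more data column than header names, read_csv uses the
# first column as the index and assigns a, b, c to the remaining ones.
full = pd.read_csv(StringIO(data))

# Selecting after parsing keeps that implicit index intact.
expected = full[['a', 'b']]
print(expected)
```

The cost of this approach is that all columns are parsed before the unwanted ones are dropped, which is exactly what `usecols` would normally avoid.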