Why is DataFrame.loc[[1]] 1,800x slower than df.ix[[1]] and 3,500x than df.loc[1]? · Issue #9126 · pandas-dev/pandas
import pandas as pd
s=pd.Series(xrange(5000000))
%timeit s.loc[[0]] # You need pandas 0.15.1 or newer for it to be that slow
1 loops, best of 3: 445 ms per loop
Also see the question and the answer here: http://stackoverflow.com/questions/27596832/why-is-dataframe-loc1-1-800x-slower-than-df-ix-1-and-3-500x-than-df-loc
Since .loc[] started to raise KeyError in pandas 0.15.1, calls such as .loc[[1]] (where the key is a list) have slowed down enormously. I think this is because an exhaustive search is done to decide whether a KeyError has to be raised, even when the list has only one element that can be located easily.
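To make the suspected cost concrete, here is a rough pure-Python model of the two ways such a "is any key present?" check could be done. It is only a sketch: the function names and the dict-based lookup are hypothetical and are not pandas' code. It assumes that a check like `idx.isin(ax).any()` (seen in the profile below) has to walk the whole axis on every call.

```python
# Hypothetical model of the two strategies; not pandas' actual implementation.

def any_key_present_scan(keys, axis_values):
    """Rebuild a membership set from the whole axis on every call.

    The work grows with len(axis_values), even for a single key.
    """
    axis_set = set(axis_values)  # O(len(axis_values)) work per call
    return any(k in axis_set for k in keys)


def any_key_present_lookup(keys, axis_lookup):
    """Probe a hash table that was built once and is reused across calls.

    The work grows with len(keys) only.
    """
    return any(k in axis_lookup for k in keys)


axis_values = list(range(100_000))
# Built once up front, analogous to an index caching its hash table.
axis_lookup = {label: pos for pos, label in enumerate(axis_values)}

found_scan = any_key_present_scan([0], axis_values)
found_lookup = any_key_present_lookup([0], axis_lookup)
missing_scan = any_key_present_scan([-1], axis_values)
missing_lookup = any_key_present_lookup([-1], axis_lookup)
```

Both functions give the same answer. The difference is that the first one repeats work proportional to the axis length on every call, which would explain why a one-element list key is so much slower than a scalar key on a 5,000,000-row Series.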
Profiling:
File: .../anaconda/lib/python2.7/site-packages/pandas/core/indexing.py

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
  1278                                           # require at least 1 element in the index
  1279         1          241    241.0      0.1  idx = _ensure_index(key)
  1280         1       391040 391040.0     99.9  if len(idx) and not idx.isin(ax).any():
  1281
  1282                                               raise KeyError("None of [%s] are in the [%s]" %