BUG: UnsortedIndexError is raised when slicing a dataframe with monotonic increasing multi-index. · Issue #44752 · pandas-dev/pandas

Reproducible Example

import numpy as np
import pandas as pd

index_0 = pd.date_range('20110101', periods=5)
index_1 = pd.Index(['a', 'b'])
multi_index = pd.MultiIndex.from_product([index_0, index_1])
ser1 = pd.Series(np.ones(10), index=multi_index)
ser2 = pd.Series(np.ones(5), index=multi_index[:5])

UnsortedIndexError is raised when slicing

df = pd.concat([ser1, ser2], axis=1, join='outer')
print(df.index.is_monotonic_increasing)
df.loc['2011-01-01':'2011-01-04']

UnsortedIndexError is raised when slicing

df = pd.DataFrame({0: ser1, 1: ser2})
print(df.index.is_monotonic_increasing)
df.loc['2011-01-01':'2011-01-04']

No error is raised when ser2 is first reindexed to ser1.index

df = pd.concat([ser1, ser2.reindex(ser1.index)], axis=1, join='outer')
print(df.index.is_monotonic_increasing)
df.loc['2011-01-01':'2011-01-04']
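For convenience, the three cases above can be collected into one script that records the outcome of each slice instead of stopping at the first exception. This is only a sketch: `try_slice`, `frames`, and `outcomes` are names introduced here for illustration, and which cases fail depends on the installed pandas version.

```python
import numpy as np
import pandas as pd
from pandas.errors import UnsortedIndexError

index_0 = pd.date_range("20110101", periods=5)
index_1 = pd.Index(["a", "b"])
multi_index = pd.MultiIndex.from_product([index_0, index_1])
ser1 = pd.Series(np.ones(10), index=multi_index)
ser2 = pd.Series(np.ones(5), index=multi_index[:5])


def try_slice(df):
    """Return the number of rows selected, or the exception name on failure."""
    try:
        return len(df.loc["2011-01-01":"2011-01-04"])
    except UnsortedIndexError as exc:
        return type(exc).__name__


# The three constructions from the report, keyed by a short label.
frames = {
    "concat": pd.concat([ser1, ser2], axis=1, join="outer"),
    "dict": pd.DataFrame({0: ser1, 1: ser2}),
    "reindexed": pd.concat([ser1, ser2.reindex(ser1.index)], axis=1, join="outer"),
}

outcomes = {name: try_slice(df) for name, df in frames.items()}
print(outcomes)
```

On an affected version, the "concat" and "dict" entries would be expected to report the exception name, while "reindexed" reports the row count.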

Issue Description

The error is

D:\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
923 with suppress(KeyError, IndexError):
924 return self.obj._get_value(*key, takeable=self._takeable)
--> 925 return self._getitem_tuple(key)
926 else:
927 # we by definition only have the 0th axis

D:\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_tuple(self, tup)
1098 def _getitem_tuple(self, tup: tuple):
1099 with suppress(IndexingError):
-> 1100 return self._getitem_lowerdim(tup)
1101
1102 # no multi-index, so validate all of the indexers

D:\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_lowerdim(self, tup)
820 # we may have a nested tuples indexer here
821 if self._is_nested_tuple_indexer(tup):
--> 822 return self._getitem_nested_tuple(tup)
823
824 # we maybe be using a tuple to represent multiple dimensions here

D:\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_nested_tuple(self, tup)
904 continue
905
--> 906 obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
907 axis -= 1
908

D:\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1155 # nested tuple slicing
1156 if is_nested_tuple(key, labels):
-> 1157 locs = labels.get_locs(key)
1158 indexer = [slice(None)] * self.ndim
1159 indexer[axis] = locs

D:\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in get_locs(self, seq)
3263 if true_slices and true_slices[-1] >= self._lexsort_depth:
3264 raise UnsortedIndexError(
-> 3265 "MultiIndex slicing requires the index to be lexsorted: slicing "
3266 f"on levels {true_slices}, lexsort depth {self._lexsort_depth}"
3267 )

UnsortedIndexError: 'MultiIndex slicing requires the index to be lexsorted: slicing on levels [0], lexsort depth 0'

When a DataFrame is created from two unaligned Series, using pd.concat or pd.DataFrame, the _lexsort_depth of the resulting MultiIndex is 0 even though the index is monotonic increasing.

df = pd.concat([ser1, ser2], axis=1, join='outer')
print(df.index.is_monotonic_increasing)

True

print(df.index._lexsort_depth)

0
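To compare the two flags side by side for the unaligned and aligned constructions, something like the following may help. It is a diagnostic sketch only: `describe_index` is a hypothetical helper, `_lexsort_depth` is a private attribute that may not exist in every pandas version (hence the `getattr` fallback), and the per-level monotonicity check is included as a guess at where to look, not as an established cause.

```python
import numpy as np
import pandas as pd

index_0 = pd.date_range("20110101", periods=5)
index_1 = pd.Index(["a", "b"])
multi_index = pd.MultiIndex.from_product([index_0, index_1])
ser1 = pd.Series(np.ones(10), index=multi_index)
ser2 = pd.Series(np.ones(5), index=multi_index[:5])


def describe_index(idx):
    """Collect the sortedness flags discussed in this report (hypothetical helper)."""
    return {
        "is_monotonic_increasing": bool(idx.is_monotonic_increasing),
        # Private attribute; fall back to None if this pandas version lacks it.
        "lexsort_depth": getattr(idx, "_lexsort_depth", None),
        # Whether each level's own values are stored in sorted order.
        "levels_monotonic": [bool(lvl.is_monotonic_increasing) for lvl in idx.levels],
    }


unaligned = pd.concat([ser1, ser2], axis=1, join="outer")
aligned = pd.concat([ser1, ser2.reindex(ser1.index)], axis=1, join="outer")

report = {
    "unaligned": describe_index(unaligned.index),
    "aligned": describe_index(aligned.index),
}
for name, info in report.items():
    print(name, info)
```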

Expected Behavior

The result should be the same as the output of the following code:

df = pd.concat([ser1, ser2.reindex(ser1.index)], axis=1, join='outer')
print(df.index.is_monotonic_increasing)

True

print(df.index._lexsort_depth)

2

df.loc['2011-01-01':'2011-01-04']


                0    1
2011-01-01 a  1.0  1.0
           b  1.0  1.0
2011-01-02 a  1.0  1.0
           b  1.0  1.0
2011-01-03 a  1.0  1.0
           b  1.0  NaN
2011-01-04 a  1.0  NaN
           b  1.0  NaN
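As a possible workaround on affected versions, calling sort_index() before slicing may be enough. This assumes that sort_index() rebuilds the MultiIndex in a lexsorted form; the report itself does not confirm that, so treat it as a sketch, not a verified fix. The names `df_sorted` and `result` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

index_0 = pd.date_range("20110101", periods=5)
index_1 = pd.Index(["a", "b"])
multi_index = pd.MultiIndex.from_product([index_0, index_1])
ser1 = pd.Series(np.ones(10), index=multi_index)
ser2 = pd.Series(np.ones(5), index=multi_index[:5])

# Same unaligned construction as in the report.
df = pd.concat([ser1, ser2], axis=1, join="outer")

# Assumption: sorting yields an index that MultiIndex slicing accepts.
df_sorted = df.sort_index()
result = df_sorted.loc["2011-01-01":"2011-01-04"]
print(result)
```

The extra sort costs a copy of the frame, so for large data reindexing ser2 to ser1.index up front, as in the working example, may be preferable.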

Installed Versions

pd.__version__ is 1.3.4