BUG: (iloc) Comparisons of Series have different indexing semantics from DataFrame comparisons

Consider the following:

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(np.arange(5), columns=["foo"])

In [4]: df["foo"].iloc[:-1] != df["foo"].iloc[1:] Out[4]: 0 True 1 True 2 True 3 True dtype: bool

In [5]: df["foo"].iloc[1:] != df["foo"].iloc[:-1] Out[5]: 1 True 2 True 3 True 4 True dtype: bool

Evidently, the != operator compares the values positionally and then applies the index of the first operand to the result. I don't think this is the expected behaviour.
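If a positional comparison is actually what's wanted, comparing the underlying arrays (or dropping the indices first) sidesteps alignment entirely. A minimal sketch of both workarounds; each yields all True here, since consecutive values always differ:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(5), columns=["foo"])

# Compare the underlying arrays for an unambiguous positional comparison.
positional = df["foo"].values[:-1] != df["foo"].values[1:]

# Or reset both indices, so that label alignment becomes a no-op.
a = df["foo"].iloc[:-1].reset_index(drop=True)
b = df["foo"].iloc[1:].reset_index(drop=True)
aligned = a != b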

Again, the semantics are different when performing addition:

In [6]: df["foo"].iloc[:-1] + df["foo"].iloc[1:] Out[6]: 0 NaN 1 2 2 4 3 6 4 NaN dtype: float64

In [7]: df["foo"].values[:-1] + df["foo"].values[1:] Out[7]: array([1, 3, 5, 7])

Here, everything works as expected: the Series addition aligns on the index (yielding NaN for the non-overlapping labels 0 and 4), and the NumPy addition is positional.
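For completeness, a short sketch of the alignment behaviour that produces the NaNs above, along with Series.add and its fill_value parameter, which substitutes 0 for labels missing from one operand (both are standard pandas API):

import numpy as np
import pandas as pd

s = pd.Series(np.arange(5))
left, right = s.iloc[:-1], s.iloc[1:]

# `+` aligns on labels: label 0 exists only in `left` and label 4 only
# in `right`, so those rows become NaN; labels 1..3 overlap and are
# summed (1+1, 2+2, 3+3).
print(left + right)

# fill_value treats a label missing from one operand as 0, so no NaNs.
print(left.add(right, fill_value=0))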

I also tried the DataFrame ne method:

In [8]: df.iloc[:-1].ne(df.iloc[1:])
Out[8]:
     foo
0   True
1  False
2  False
3  False
4   True

This works as expected, and differently from Series comparisons: the frames are aligned on their indices before comparing, so labels present in only one operand (0 and 4) compare as unequal, while the overlapping labels hold equal values. The corresponding method doesn't exist for Series objects, so I couldn't test that.
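For reference, more recent pandas releases do provide the flexible comparison methods (eq, ne, lt, ...) on Series as well, and they align on the index like the DataFrame variant. A quick sketch, assuming a reasonably current pandas:

import numpy as np
import pandas as pd

s = pd.Series(np.arange(5))

# Series.ne aligns on the index: labels 0 and 4 exist in only one
# operand, so they compare unequal; labels 1..3 hold equal values.
print(s.iloc[:-1].ne(s.iloc[1:]))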