Vectorised addition of MonthOffset(n=0) returns different values to item-by-item addition · Issue #11370 · pandas-dev/pandas

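The following code returns different values under pandas 0.17.0 and 0.15.2. Under 0.17.0, vectorised addition of MonthBegin(n=0) does not match item-by-item addition: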

import pandas as pd
from pandas.util.testing import assert_index_equal

pd.show_versions()

offsets = [
    pd.offsets.Day,
    pd.offsets.MonthBegin,
    pd.offsets.QuarterBegin,
    pd.offsets.YearBegin,
]

dates = pd.date_range('2011-01-01', '2011-01-05', freq='D')

for offset in offsets:
    # adding each item individually or vectorised should give same answer
    expected_vec = dates + offset(n=0)
    expected = pd.DatetimeIndex([d + offset(n=0) for d in dates])

    msg = "offset: {}, vectorised: {}, individual: {}".format(
        offset, expected_vec, expected
    )
    try:
        if pd.__version__ == '0.17.0':
            assert_index_equal(expected_vec, expected, check_names=False)
        else:
            assert_index_equal(expected_vec, expected)
    except AssertionError as er:
        raise Exception(msg + str(er))

0.17.0

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.17.0
nose: 1.3.7
pip: 7.1.0
setuptools: 18.0.1
Cython: 0.22
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.1
pytz: 2015.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None

Traceback (most recent call last):
  File "c:\dev\code\sandbox\pandas_17_vs_15_dateoffsets.py", line 24, in <module>
    raise Exception(msg + str(er))
Exception: offset: <class 'pandas.tseries.offsets.MonthBegin'>, vectorised: DatetimeIndex(['2010-12-01', '2011-01-01', '2011-01-01', '2011-01-01', '2011-01-01'], dtype='datetime64[ns]', freq=None), individual: DatetimeIndex(['2011-01-01', '2011-02-01', '2011-02-01', '2011-02-01', '2011-02-01'], dtype='datetime64[ns]', freq=None)Index are different

Index values are different (100.0 %)
[left]:  DatetimeIndex(['2010-12-01', '2011-01-01', '2011-01-01', '2011-01-01', '2011-01-01'], dtype='datetime64[ns]', freq=None)
[right]: DatetimeIndex(['2011-01-01', '2011-02-01', '2011-02-01', '2011-02-01', '2011-02-01'], dtype='datetime64[ns]', freq=None)

0.15.2

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_GB

pandas: 0.15.2
nose: 1.3.7
Cython: 0.22
numpy: 1.9.2
scipy: 0.15.1
statsmodels: None
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2015.4
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.4
xlwt: 0.7.5
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None
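
For reference, the item-by-item values reported under 0.17.0 above (2011-01-01 unchanged, every later date moved to 2011-02-01) are what rolling each date forward to the next month start would give. The snippet below is only a minimal sketch of that comparison, assuming the item-by-item result is the one to expect. It uses the same dates as the report and the existing rollforward method of the offset; the names individual and rolled are illustrative and not part of the original script.

```python
import pandas as pd

dates = pd.date_range('2011-01-01', '2011-01-05', freq='D')
offset = pd.offsets.MonthBegin(n=0)

# Item-by-item addition, as in the report.
individual = [d + offset for d in dates]

# rollforward leaves a date that is already a month start unchanged
# and moves any other date to the next month start.
rolled = [pd.offsets.MonthBegin().rollforward(d) for d in dates]

# Assumption under test: n=0 addition on a single Timestamp behaves
# like rollforward.
print(individual == rolled)
```

If the two lists agree, the item-by-item path is behaving as a roll-forward, which points at the vectorised path as the one returning the unexpected 2010-12-01 and 2011-01-01 values.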