Vectorised addition of MonthOffset(n=0) returns different values to item-by-item addition · Issue #11370 · pandas-dev/pandas
This code returns different values in pandas 0.17.0 and 0.15.2:
import pandas as pd
from pandas.util.testing import assert_index_equal

pd.show_versions()

offsets = [
    pd.offsets.Day,
    pd.offsets.MonthBegin,
    pd.offsets.QuarterBegin,
    pd.offsets.YearBegin,
]

dates = pd.date_range('2011-01-01', '2011-01-05', freq='D')

for offset in offsets:
    # adding each item individually or vectorised should give same answer
    expected_vec = dates + offset(n=0)
    expected = pd.DatetimeIndex([d + offset(n=0) for d in dates])

    msg = "offset: {}, vectorised: {}, individual: {}".format(
        offset, expected_vec, expected
    )
    try:
        if pd.__version__ == '0.17.0':
            assert_index_equal(expected_vec, expected, check_names=False)
        else:
            assert_index_equal(expected_vec, expected)
    except AssertionError as er:
        raise Exception(msg + str(er))
0.17.0
INSTALLED VERSIONS
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.17.0
nose: 1.3.7
pip: 7.1.0
setuptools: 18.0.1
Cython: 0.22
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.1
pytz: 2015.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None

Traceback (most recent call last):
  File "c:\dev\code\sandbox\pandas_17_vs_15_dateoffsets.py", line 24, in <module>
    raise Exception(msg + str(er))
Exception: offset: <class 'pandas.tseries.offsets.MonthBegin'>, vectorised: DatetimeIndex(['2010-12-01', '2011-01-01', '2011-01-01', '2011-01-01', '2011-01-01'], dtype='datetime64[ns]', freq=None), individual: DatetimeIndex(['2011-01-01', '2011-02-01', '2011-02-01', '2011-02-01', '2011-02-01'], dtype='datetime64[ns]', freq=None)Index are different

Index values are different (100.0 %)
[left]:  DatetimeIndex(['2010-12-01', '2011-01-01', '2011-01-01', '2011-01-01', '2011-01-01'], dtype='datetime64[ns]', freq=None)
[right]: DatetimeIndex(['2011-01-01', '2011-02-01', '2011-02-01', '2011-02-01', '2011-02-01'], dtype='datetime64[ns]', freq=None)
0.15.2
INSTALLED VERSIONS
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_GB

pandas: 0.15.2
nose: 1.3.7
Cython: 0.22
numpy: 1.9.2
scipy: 0.15.1
statsmodels: None
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2015.4
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.4
xlwt: 0.7.5
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None
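
For comparison, the item-by-item result in the 0.17.0 traceback is what a roll-forward reading of n=0 would give: a date already on a month start stays put, and any other date moves to the next month start. Below is a minimal stdlib-only sketch of that reading. `month_begin_n0` is a hypothetical helper written for illustration, not a pandas function, and it only assumes the roll-forward interpretation rather than reproducing pandas' implementation.

```python
from datetime import date, timedelta


def month_begin_n0(d):
    """Roll `d` forward to a month start; leave it alone if already on one.

    Hypothetical helper: it mimics what the item-by-item column of the
    traceback shows for MonthBegin(n=0), not pandas' actual code.
    """
    if d.day == 1:
        return d
    # Move to the first day of the following month, wrapping December to January.
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


# Same five dates as in the report: 2011-01-01 through 2011-01-05.
dates = [date(2011, 1, 1) + timedelta(days=i) for i in range(5)]
rolled = [month_begin_n0(d) for d in dates]

for before, after in zip(dates, rolled):
    print(before, "->", after)
```

Under that assumption the sketch agrees with the "individual" values in the traceback (2011-01-01 followed by four 2011-02-01 entries), whereas the vectorised values are a full month earlier for every element.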