pandas.DataFrame.resample behaves inconsistently when the rule is set to 24H/1D and 48H/2D · Issue #24127 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
>>> import pandas as pd
>>> pd.__version__
'0.23.4'
>>> index = pd.date_range('11/11/2000 06:00:00', periods=50, freq='H')
>>> series = pd.Series(range(50), index=index)
>>> series.head(10)
2000-11-11 06:00:00    0
2000-11-11 07:00:00    1
2000-11-11 08:00:00    2
2000-11-11 09:00:00    3
2000-11-11 10:00:00    4
2000-11-11 11:00:00    5
2000-11-11 12:00:00    6
2000-11-11 13:00:00    7
2000-11-11 14:00:00    8
2000-11-11 15:00:00    9
Freq: H, dtype: int64
>>> series.resample('24H').count()
2000-11-11    18
2000-11-12    24
2000-11-13     8
Freq: 24H, dtype: int64
>>> series.resample('1D').count()
2000-11-11    18
2000-11-12    24
2000-11-13     8
Freq: D, dtype: int64
>>> series.resample('48H').count()
2000-11-11    42
2000-11-13     8
Freq: 48H, dtype: int64
>>> series.resample('2D').count()
2000-11-11 06:00:00    48
2000-11-13 06:00:00     2
dtype: int64
Problem description
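With the rule set to '24H' or '1D', resample returns identical results: in both cases the first bin starts at midnight (00:00) on the first day. With '48H' and '2D' the results differ. '48H' still starts its bins at midnight on 2000-11-11 and counts 42 and 8, while '2D' starts its bins at the first timestamp, 2000-11-11 06:00:00, and counts 48 and 2.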
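The two sets of counts can be reproduced with plain fixed-width binning under two different bin anchors. The following is a minimal sketch using only the standard library, not pandas' actual implementation; `count_bins` is a hypothetical helper introduced here for illustration. It assumes left-closed bins of the form [anchor + k*width, anchor + (k+1)*width).

```python
from collections import Counter
from datetime import datetime, timedelta


def count_bins(timestamps, width, anchor):
    """Count timestamps per fixed-width, left-closed bin starting at `anchor`.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    counts = Counter()
    for ts in timestamps:
        k = (ts - anchor) // width      # index of the bin that contains ts
        counts[anchor + k * width] += 1  # key each bin by its left edge
    return sorted(counts.items())


# Same data as the code sample: 50 hourly points starting 2000-11-11 06:00.
start = datetime(2000, 11, 11, 6)
timestamps = [start + timedelta(hours=i) for i in range(50)]
midnight = start.replace(hour=0)

# Bins anchored at midnight of the first day (matches the '48H' counts).
midnight_48h = count_bins(timestamps, timedelta(hours=48), midnight)
# Bins anchored at the first timestamp (matches the '2D' counts).
first_ts_48h = count_bins(timestamps, timedelta(hours=48), start)
# 24-hour bins anchored at midnight (matches the '24H' and '1D' counts).
midnight_24h = count_bins(timestamps, timedelta(hours=24), midnight)

for label, bins in [
    ("48h, anchored at midnight", midnight_48h),
    ("48h, anchored at first timestamp", first_ts_48h),
    ("24h, anchored at midnight", midnight_24h),
]:
    print(label)
    for left_edge, n in bins:
        print("   ", left_edge, n)
```

Under these assumptions, the midnight anchor yields 42 and 8, and the first-timestamp anchor yields 48 and 2, which suggests the reported difference comes down to where the first bin starts rather than how wide the bins are.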
Expected Output
If the matching results for '24H' and '1D' are the intended behavior, then '48H' and '2D' should also produce the same bins and the same counts.
Output of pd.show_versions()
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: None
pip: 18.1
setuptools: 40.4.3
Cython: None
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None