Time Series Interpolation is wrong · Issue #21351 · pandas-dev/pandas (original) (raw)

Problem description

dates = pd.date_range('2016-08-28', periods=5, freq='21H') ts1 = pd.Series(np.arange(5), dates) ts1

2016-08-28 00:00:00    0
2016-08-28 21:00:00    1
2016-08-29 18:00:00    2
2016-08-30 15:00:00    3
2016-08-31 12:00:00    4
Freq: 21H, dtype: int64

ts1.resample('15H').interpolate(method='time')

2016-08-28 00:00:00    0.0
2016-08-28 15:00:00    0.0
2016-08-29 06:00:00    0.0
2016-08-29 21:00:00    0.0
2016-08-30 12:00:00    0.0
2016-08-31 03:00:00    0.0
Freq: 15H, dtype: float64

The answer whould not be always 0.

Note that without method='time' the result is the same.

It can look like ok

If I chose another frequency in the beginning, 20H instead of 21H, then it is ok except the last value:

dates = pd.date_range('2016-08-28', periods=5, freq='20H') ts1 = pd.Series(np.arange(5), dates) ts1.resample('15H').interpolate(method='time')

2016-08-28 00:00:00    0.00
2016-08-28 15:00:00    0.75
2016-08-29 06:00:00    1.50
2016-08-29 21:00:00    2.25
2016-08-30 12:00:00    3.00
2016-08-31 03:00:00    3.00
Freq: 15H, dtype: float64

My interpretation is that it puts NaN everywhere except on times it has data that is 2016-08-28 00:00:00 and 2016-08-30 12:00:00 and after it does a linear interpolation.

My example is bad because I used range(4) which is linear. If I set values to 0 9 9 3 9 then the interpolation gives the same result which is totaly wrong now.

Output of pd.show_versions()

[paste the output of pd.show_versions() here below this line]

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.10.0-38-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: fr_FR.UTF-8

pandas: 0.23.0
pytest: 3.5.0
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.3.1
sphinx: None
patsy: 0.5.0
dateutil: 2.7.2
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None