DataFrame[timedelta64] / timedelta64 or pydatetime has wrong dtype and wrong values · Issue #20088 · pandas-dev/pandas
Code Sample
In [1]: import pandas, numpy, datetime
In [2]: df = pandas.DataFrame({'x': numpy.timedelta64(1, 'ms') * numpy.arange(0, 5)})
In [3]: df
Out[3]:
x
0 00:00:00
1 00:00:00.001000
2 00:00:00.002000
3 00:00:00.003000
4 00:00:00.004000
In [4]: df.dtypes
Out[4]:
x timedelta64[ns]
dtype: object
In [5]: df['x'] / datetime.timedelta(milliseconds=1)
Out[5]:
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
Name: x, dtype: float64
In [6]: df[['x']] / datetime.timedelta(milliseconds=1)
Out[6]:
x
0 00:00:00
1 00:00:00.000000
2 00:00:00.000000
3 00:00:00.000000
4 00:00:00.000000
In [8]: df[['x']] / numpy.timedelta64(1, 'ms')
Out[8]:
x
0 00:00:00
1 00:00:00.000000
2 00:00:00.000000
3 00:00:00.000000
4 00:00:00.000000
Problem description
When performing true division on a dataframe containing timedelta64 values, and dividing by a datetime.timedelta object or a timedelta64, there are two problems:
- the resulting values are incorrect; and
- the dtype (timedelta64[ns]) in the resulting dataframe is not consistent with the results of the same operation on the pandas Series or the numpy array. (In those cases, the result is a float series or array.)
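The two problems can be checked side by side with a short script. This is only a sketch built from the session above: the names unit, series_result, array_result, frame_result and frame_matches are introduced here for illustration and are not part of the report. On the pandas version reported below (0.19.1), the last check would be expected to print False; on a version where the behaviour is consistent, it prints True.

```python
import datetime

import numpy as np
import pandas as pd

# Same frame as in the report: five timedeltas from 0 ms to 4 ms.
df = pd.DataFrame({"x": np.timedelta64(1, "ms") * np.arange(0, 5)})
unit = datetime.timedelta(milliseconds=1)

# The three paths compared in the problem description.
series_result = df["x"] / unit                           # Series path
array_result = df["x"].values / np.timedelta64(1, "ms")  # plain numpy path
frame_result = df[["x"]] / unit                          # DataFrame path (reported as broken)

# True only if the DataFrame path agrees with the Series path
# in both dtype and values.
frame_matches = bool(
    frame_result["x"].dtype == series_result.dtype
    and frame_result["x"].equals(series_result)
)

print("Series dtype:   ", series_result.dtype)
print("ndarray dtype:  ", array_result.dtype)
print("DataFrame dtype:", frame_result["x"].dtype)
print("DataFrame matches Series:", frame_matches)
```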
Expected Output
The dataframe should contain a float64 column, with values equal to df['x'] / numpy.timedelta64(1, 'ms'):
In [11]: df.apply(lambda s: s.values / numpy.timedelta64(1, 'ms'))
Out[11]:
x
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
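Until the DataFrame path behaves this way, one possible workaround is to route each column through the Series division, which the session above shows returning float64. The helper below is a sketch only: divide_timedelta_frame is a hypothetical name, not a pandas function, and it assumes every column holds timedelta64 values and that column labels are unique.

```python
import datetime

import numpy as np
import pandas as pd


def divide_timedelta_frame(frame, unit):
    """Hypothetical helper: divide each timedelta column by `unit`.

    Each column is divided as a Series, so the result takes whatever
    dtype Series division produces (float64 in the report above).
    """
    return pd.DataFrame(
        {col: frame[col] / unit for col in frame.columns},
        index=frame.index,
    )


df = pd.DataFrame({"x": np.timedelta64(1, "ms") * np.arange(0, 5)})
result = divide_timedelta_frame(df, datetime.timedelta(milliseconds=1))
print(result)
print(result.dtypes)
```

Building the result from a dict of Series keeps the original index and column labels, at the cost of one division per column instead of a single block operation.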
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.4.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: 0.8.2
IPython: 5.1.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.7
blosc: 1.5.0
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.3
html5lib: 0.999
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.1.3
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.43.0
pandas_datareader: None