DataFrame[timedelta64] / timedelta64 or pydatetime has wrong dtype and wrong values · Issue #20088 · pandas-dev/pandas

Code Sample

In [1]: import pandas, numpy, datetime

In [2]: df = pandas.DataFrame({'x': numpy.timedelta64(1, 'ms') * numpy.arange(0, 5)})

In [3]: df
Out[3]:
                x
0        00:00:00
1 00:00:00.001000
2 00:00:00.002000
3 00:00:00.003000
4 00:00:00.004000

In [4]: df.dtypes
Out[4]:
x    timedelta64[ns]
dtype: object

In [5]: df['x'] / datetime.timedelta(milliseconds=1)
Out[5]:
0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
Name: x, dtype: float64

In [6]: df[['x']] / datetime.timedelta(milliseconds=1)
Out[6]:
                x
0        00:00:00
1 00:00:00.000000
2 00:00:00.000000
3 00:00:00.000000
4 00:00:00.000000

In [8]: df[['x']] / numpy.timedelta64(1, 'ms')
Out[8]:
                x
0        00:00:00
1 00:00:00.000000
2 00:00:00.000000
3 00:00:00.000000
4 00:00:00.000000

Problem description

True division of a DataFrame containing timedelta64 values by a datetime.timedelta object or a numpy.timedelta64 scalar has two problems:

  1. the resulting values are incorrect; and
  2. the dtype (timedelta64[ns]) in the resulting dataframe is not consistent with the results of the same operation on the pandas Series or the numpy array. (In those cases, the result is a float series or array.)
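The inconsistency can be checked directly by comparing dtypes across the three code paths. The following is a minimal sketch, not part of the original report; the variable names (`unit`, `series_ratio`, `array_ratio`, `frame_ratio`) are illustrative, and the DataFrame result depends on the installed pandas version.

```python
import datetime

import numpy as np
import pandas as pd

# Same data as in the code sample: 0..4 milliseconds.
df = pd.DataFrame({"x": np.timedelta64(1, "ms") * np.arange(0, 5)})
unit = datetime.timedelta(milliseconds=1)

# Series path: timedelta / timedelta yields a dimensionless float64 Series.
series_ratio = df["x"] / unit

# NumPy path: the underlying array divided by a timedelta64 scalar is float64 too.
array_ratio = df["x"].values / np.timedelta64(1, "ms")

# DataFrame path: the issue reports timedelta64[ns] here on pandas 0.19.1;
# other versions may behave differently, so only the dtype is printed.
frame_ratio = df[["x"]] / unit

print("Series:   ", series_ratio.dtype)
print("ndarray:  ", array_ratio.dtype)
print("DataFrame:", frame_ratio["x"].dtype)
```

If the three printed dtypes differ, the installed version shows the behaviour described above.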

Expected Output

The dataframe should contain a float64 column, with values equal to df['x'] / numpy.timedelta64(1, 'ms'):

In [11]: df.apply(lambda s: s.values / numpy.timedelta64(1, 'ms'))
Out[11]:
     x
0  0.0
1  1.0
2  2.0
3  3.0
4  4.0
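For frames that mix timedelta columns with other dtypes, the same column-wise idea can be wrapped in a small function. This is a sketch of a possible workaround only: `timedelta_columns_to_float` is a hypothetical helper, not a pandas API, and it assumes unique column names. It routes each timedelta column through the Series division shown in In [5] and leaves other columns unchanged.

```python
import datetime

import numpy as np
import pandas as pd


def timedelta_columns_to_float(frame, unit):
    """Hypothetical helper: divide every timedelta64 column by `unit`.

    Each column is divided as a Series, which returns float64,
    and non-timedelta columns are copied through untouched.
    """
    out = frame.copy()
    for name in out.columns:
        col = out[name]
        # dtype.kind == "m" marks NumPy timedelta64 data.
        if col.dtype.kind == "m":
            out[name] = col / unit
    return out


df = pd.DataFrame({
    "x": np.timedelta64(1, "ms") * np.arange(0, 5),
    "label": list("abcde"),  # illustrative non-timedelta column
})
result = timedelta_columns_to_float(df, datetime.timedelta(milliseconds=1))
print(result.dtypes)
```

Working per column avoids the DataFrame-level division that the issue reports as incorrect, at the cost of a Python-level loop over the columns.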

Output of pd.show_versions()


INSTALLED VERSIONS

commit: None
python: 3.4.5.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: 0.8.2
IPython: 5.1.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.7
blosc: 1.5.0
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.3
html5lib: 0.999
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.1.3
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.43.0
pandas_datareader: None