Possible Bug - math on like-indexed datetime series doesn't work as expected · Issue #7500 · pandas-dev/pandas


I have two like-indexed series of datetimes. Simple math operations on them don't give the results I'd expect. Specifically, subtracting one series from the other doesn't always subtract across the aligned indices. Converting each series to a DataFrame with a dummy column gets closer, but the resulting type is still not correct.

print firstOrderNotEval.loc[site]
print firstEvalOrder.loc[site]
print type(firstOrderNotEval.loc[site])
print type(firstEvalOrder.loc[site])

output:

#2008-08-21 00:00:00

#2013-09-10 00:00:00

<class 'pandas.tslib.Timestamp'>

<class 'pandas.tslib.Timestamp'>

timeToFirstNonEvalPurchase_doesntWork = ((firstOrderNotEval - firstEvalOrder)/np.timedelta64(1,'D'))
timeToFirstNonEvalPurchase = ((firstOrderNotEval.to_frame('a') - firstEvalOrder.to_frame('a'))/np.timedelta64(1,'D'))['a']

print timeToFirstNonEvalPurchase_doesntWork.loc[2898717]
print timeToFirstNonEvalPurchase.loc[2898717]

output:

nan

-1846 nanoseconds # note: should be -1846 days
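For comparison, here is a minimal sketch of the result I would expect. The series are made-up stand-ins (only the site id and the first pair of dates come from the output above; the second row and the variable names are invented), so this illustrates the intended behavior rather than reproducing the problem with the attached data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: the site id and the first pair of dates are
# from the report; the second row is invented so each series has two entries.
index = pd.Index([2898717, 1234567], name="site")
first_order_not_eval = pd.Series(
    pd.to_datetime(["2008-08-21", "2010-01-01"]), index=index
)
first_eval_order = pd.Series(
    pd.to_datetime(["2013-09-10", "2009-12-01"]), index=index
)

# Index-aligned subtraction gives a timedelta64[ns] series; dividing by a
# one-day timedelta converts it to a float number of days.
days = (first_order_not_eval - first_eval_order) / np.timedelta64(1, "D")
print(days.loc[2898717])  # -1846.0
```

On a recent pandas release the result is a float64 series of day counts, which is what both expressions above were meant to produce.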

Subtracting individual elements gives the correct result, but as a datetime.timedelta. Subtracting the series directly gives NaT:

site = 2898717
print (firstOrderNotEval.loc[site] - firstEvalOrder.loc[site])
print type(firstOrderNotEval.loc[site] - firstEvalOrder.loc[site])
print (firstOrderNotEval - firstEvalOrder).loc[site]

output:

-1846 days, 0:00:00

<type 'datetime.timedelta'>

NaT
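One thing that may be worth ruling out is an index mismatch, since series arithmetic aligns on labels and fills unmatched labels with NaT. This is only an assumption, not a confirmed cause for the attached data. The sketch below uses invented labels and dates to show the generic checks and what a partial overlap looks like.

```python
import pandas as pd

# Hypothetical series whose indexes only partly overlap (labels are invented).
a = pd.Series(pd.to_datetime(["2008-08-21", "2010-01-01"]), index=[1, 2])
b = pd.Series(pd.to_datetime(["2013-09-10", "2009-12-01"]), index=[2, 3])

# Generic sanity checks before doing arithmetic on two series.
print(a.index.equals(b.index))                  # same labels in the same order?
print(a.index.is_unique and b.index.is_unique)  # no duplicated labels?
print(a.index.dtype, b.index.dtype)             # same label dtype on both sides?

# Subtraction aligns on labels; a label present in only one series gives NaT.
diff = a - b
print(diff)
```

If all three checks pass on the real series and the subtraction still returns NaT for a label that has a value on both sides, alignment alone would not explain the result.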

Perhaps this has to do with the timestamp type itself given the following example:

print (firstOrderNotEval.to_frame('a') - firstEvalOrder.to_frame('a')).loc[site]/np.timedelta64(1,'D')
print ((firstOrderNotEval.to_frame('a') - firstEvalOrder.to_frame('a'))/np.timedelta64(1,'D')).loc[site]

output:

a -1846

Name: 2898717.0, dtype: float64

a -00:00:00.000002

Name: 2898717.0, dtype: timedelta64[ns]

Note that the following give different results depending on how the division by timedelta64 is performed:

tmp = ((firstOrderNotEval.to_frame('a') - firstEvalOrder.to_frame('a')))
print (tmp/np.timedelta64(1,'D')).loc[site]
print tmp.apply(lambda x: x/np.timedelta64(1,'D')).loc[site]

output:

a -00:00:00.000002

Name: 2898717.0, dtype: timedelta64[ns]

what we'd expect:

a -1846

Name: 2898717.0, dtype: float64
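As a sketch of the behavior I would expect, the two division styles should agree. The frame below is built from made-up stand-in data (only the site id and the first pair of dates are from this report; the other names are hypothetical), and the `.dt.total_seconds()` route is included as a third way to get days that does not rely on dividing by a timedelta64 at all.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; only the site id and the first pair of dates
# are taken from the report.
idx = [2898717, 1234567]
not_eval = pd.Series(pd.to_datetime(["2008-08-21", "2010-01-01"]), index=idx)
eval_order = pd.Series(pd.to_datetime(["2013-09-10", "2009-12-01"]), index=idx)

# Frame-level subtraction produces a timedelta64[ns] column named "a".
tmp = not_eval.to_frame("a") - eval_order.to_frame("a")

one_day = np.timedelta64(1, "D")
direct = tmp / one_day                             # divide the whole frame
via_apply = tmp.apply(lambda col: col / one_day)   # divide column by column
via_seconds = tmp["a"].dt.total_seconds() / 86400.0

print(direct.loc[2898717])
print(via_apply.loc[2898717])
print(via_seconds.loc[2898717])
```

On a recent pandas release all three give float day counts for this stand-in data, so a difference between the first two on the real series points at the division path rather than at the data.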

The attached pickle files (saved with a .jpg extension) contain the series used in this example:

firstOrderNotEval.to_pickle('./firstOrderNotEval.jpg')
firstEvalOrder.to_pickle('./firstEvalOrder.jpg')