ENH: Add Series.dt.total_seconds GH #10817 by sjdenny · Pull Request #10939 · pandas-dev/pandas
Implements a Series.dt.total_seconds method for timedelta64 Series.
closes #10817
Does timedelta.total_seconds() provide fractional seconds as well?
Yes, fractional seconds are included:
In [1]: s = Series(pd.timedelta_range('1day',periods=5,freq='1ms'))
In [2]: s
Out[2]:
0 1 days 00:00:00
1 1 days 00:00:00.001000
2 1 days 00:00:00.002000
3 1 days 00:00:00.003000
4 1 days 00:00:00.004000
dtype: timedelta64[ns]
In [3]: s.dt.total_seconds
Out[3]:
0 86400.000
1 86400.001
2 86400.002
3 86400.003
4 86400.004
dtype: float64
That's not what I meant.
Does the actual Python timedelta.total_seconds() return fractions? (I think yes, just confirming.)
Ah, I understand. Yes, it does:
In [38]: t = timedelta(days=1,microseconds=40)
In [39]: t.total_seconds()
Out[39]: 86400.00004
Add a test in test_series/test_dt_namespace_accessor as well (it's like the one you have for test Series).
Indeed, didn't see that, but +1 making it a method instead of a property for consistency with datetime.timedelta/pd.Timedelta
Further, @sjdenny can you add a whatsnew notice (see doc/source/whatsnew/v0.17.0.txt, somewhere in 'other enhancements')
In extending the (nanosecond-precision) tests, I've come across this behaviour:
In [18]: td = pd.Timedelta(nanoseconds=50)
In [19]: td.nanoseconds
Out[19]: 50
In [20]: td.total_seconds()
Out[20]: 0.0
In [21]: td.total_seconds() == 0.0
Out[21]: True
In [22]: td2 = pd.Timedelta(microseconds=50)
In [23]: td2.total_seconds()
Out[23]: 5e-05
It appears (e.g. here) that timedelta.total_seconds() aims for microsecond accuracy only. Is this the behaviour we want to reproduce in Series.dt.total_seconds()? Nanosecond precision appears preferable, but then the scalar and Series functions would have slightly different behaviour.
It should be consistent between both in any case, but maybe we can also see the Timedelta behaviour as a bug?
I suspect that this method is just inherited from datetime.timedelta, so maybe we will have to override it ourselves in the subclass.
BTW, the reason for this difference is that datetime.timedelta does not support nanoseconds, while pd.Timedelta does.
the pandas functions should work with ns precision in all cases
so need to override the scalar function as well
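To illustrate the precision point above, here is a minimal standard-library sketch, not the pandas implementation. It assumes the duration is available as an integer nanosecond count; the helper name `total_seconds_from_ns` is hypothetical.

```python
from datetime import timedelta

NS_PER_SECOND = 1_000_000_000


def total_seconds_from_ns(nanoseconds: int) -> float:
    """Convert an integer nanosecond count to float seconds (hypothetical helper)."""
    return nanoseconds / NS_PER_SECOND


# datetime.timedelta has microsecond resolution, so 50 ns cannot be represented
# there; dividing an integer nanosecond count keeps the sub-microsecond part.
fifty_ns = total_seconds_from_ns(50)
fifty_us = total_seconds_from_ns(50_000)
one_day_40us = total_seconds_from_ns(86_400 * NS_PER_SECOND + 40_000)

# For durations that microseconds can express, both approaches agree.
print(fifty_ns)
print(fifty_us == timedelta(microseconds=50).total_seconds())
print(one_day_40us == timedelta(days=1, microseconds=40).total_seconds())
```

Working from the integer nanosecond count is one way the scalar and Series results could be kept consistent, since both would share the same arithmetic.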
Use double-backticks around timedelta64 and .dt.total_seconds().
I think you should also add a .total_seconds() method on the NaTType object (returning np.nan) for compat (and please add a test as well).
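The compatibility argument can be sketched with a stand-in object. The class `NaTLike` below is hypothetical and is not pandas' NaTType; it only shows why a missing-value scalar that answers `total_seconds()` with NaN lets calling code skip special cases.

```python
import math
from datetime import timedelta


class NaTLike:
    """Hypothetical stand-in for a not-a-time scalar (not pandas' NaTType)."""

    def total_seconds(self) -> float:
        # A missing duration has no length, so report NaN rather than raise.
        return math.nan


def all_total_seconds(values):
    """Call total_seconds() on every element without special-casing missing values."""
    return [v.total_seconds() for v in values]


mixed = [timedelta(days=1), NaTLike(), timedelta(milliseconds=1)]
seconds = all_total_seconds(mixed)
print(seconds)
```

Without the method on the missing-value object, the list comprehension would raise AttributeError on the second element.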
sjdenny changed the title from "Add Series.dt.total_seconds GH #10817" to "ENH: Add Series.dt.total_seconds GH #10817"
use result = self._maybe_mask_results(result) (you can remove the hasnans as well)
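The idea behind masking the result can be sketched in plain Python. This is an illustration of the technique only, not the body of `_maybe_mask_results`, whose internals are not shown in this thread. It assumes durations are stored as int64 nanosecond counts with the int64 minimum as the NaT sentinel; the function name `total_seconds_masked` is hypothetical.

```python
import math

# int64 minimum, used here as the NaT sentinel for nanosecond counts.
INAT = -(2 ** 63)


def total_seconds_masked(values_ns):
    """Convert nanosecond counts to seconds, replacing NaT slots with NaN."""
    # Step 1: convert every slot, including the sentinel, to seconds.
    seconds = [v / 1e9 for v in values_ns]
    # Step 2: build the mask of missing slots and overwrite them with NaN.
    mask = [v == INAT for v in values_ns]
    return [math.nan if is_nat else s for s, is_nat in zip(seconds, mask)]


result = total_seconds_masked([86_400_000_000_000, INAT, 86_400_001_000_000])
print(result)
```

Applying the mask after the conversion means the sentinel never leaks out as a large negative number of seconds.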
jreback added a commit that referenced this pull request
ENH: Add Series.dt.total_seconds GH #10817
sjdenny deleted the series_total_seconds branch