ENH: Add Series.dt.total_seconds GH #10817 by sjdenny · Pull Request #10939 · pandas-dev/pandas


sjdenny

Implements a Series.dt.total_seconds method for timedelta64 Series.

closes #10817

@jreback

Does timedelta.total_seconds() provide fractional seconds as well?

@sjdenny

Yes, fractional seconds are included:

In [1]: s = Series(pd.timedelta_range('1day',periods=5,freq='1ms'))

In [2]: s
Out[2]: 
0          1 days 00:00:00
1   1 days 00:00:00.001000
2   1 days 00:00:00.002000
3   1 days 00:00:00.003000
4   1 days 00:00:00.004000
dtype: timedelta64[ns]

In [3]: s.dt.total_seconds
Out[3]: 
0    86400.000
1    86400.001
2    86400.002
3    86400.003
4    86400.004
dtype: float64

@jreback

Not what I meant.

Does the actual Python timedelta.total_seconds() return fractions? (I think yes), just confirming.

@sjdenny

Ah, I understand. Yes, it does:

In [38]: t = timedelta(days=1,microseconds=40)

In [39]: t.total_seconds()
Out[39]: 86400.00004
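For reference, a minimal standard-library sketch of how that value relates to the three fields a `datetime.timedelta` stores (days, seconds, microseconds). The helper name `total_seconds_from_fields` is made up for illustration and is not part of this PR:

```python
from datetime import timedelta


def total_seconds_from_fields(td: timedelta) -> float:
    """Hypothetical helper: rebuild total seconds from the stored fields."""
    # Combine everything as an integer microsecond count first,
    # then divide once so the fractional part is not lost early.
    total_us = (td.days * 86400 + td.seconds) * 10**6 + td.microseconds
    return total_us / 10**6


t = timedelta(days=1, microseconds=40)
print(t.total_seconds())
print(total_seconds_from_fields(t))
```

Both calls should print the same float, with the 40 microseconds showing up as the fractional part.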

jreback

@@ -944,6 +945,27 @@ def test_fields(self):
tm.assert_series_equal(s.dt.days,Series([1,np.nan],index=[0,1]))
tm.assert_series_equal(s.dt.seconds,Series([10*3600+11*60+12,np.nan],index=[0,1]))
def test_total_seconds(self):

add a test in test_series/test_dt_namespace_accessor as well (it's like the one you have for test Series)

@jorisvandenbossche

Indeed, didn't see that, but +1 making it a method instead of a property for consistency with datetime.timedelta/pd.Timedelta
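A short sketch of the method-style call, assuming a pandas version that already includes this change. The variable names are illustrative only:

```python
import pandas as pd

# Same data as in the session earlier in this thread.
s = pd.Series(pd.timedelta_range("1day", periods=5, freq="1ms"))

# Called as a method, mirroring datetime.timedelta.total_seconds()
# and pd.Timedelta.total_seconds().
secs = s.dt.total_seconds()

# The scalar counterpart for the last element.
scalar = pd.Timedelta("1 days 00:00:00.004").total_seconds()

print(secs)
print(scalar)
```

With the method form, the Series accessor and the scalar are called the same way, which is the consistency argument made above.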

@jorisvandenbossche

Further, @sjdenny can you add a whatsnew notice (see doc/source/whatsnew/v0.17.0.txt, somewhere in 'other enhancements')

@sjdenny

In extending the (nanosecond-precision) tests, I've come across this behaviour:

In [18]: td = pd.Timedelta(nanoseconds=50)

In [19]: td.nanoseconds
Out[19]: 50

In [20]: td.total_seconds()
Out[20]: 0.0

In [21]: td.total_seconds() == 0.0
Out[21]: True

In [22]: td2 = pd.Timedelta(microseconds=50)

In [23]: td2.total_seconds()
Out[23]: 5e-05

It appears (e.g. here) that timedelta.total_seconds() aims for microsecond accuracy only. Is this the behaviour we want to reproduce in Series.dt.total_seconds()? Nanosecond precision appears preferable, but then the scalar and Series functions would have slightly different behaviour.

@jorisvandenbossche

It should be consistent between both in any case, but maybe we can also see the Timedelta behaviour as a bug?

I suspect that this method is just inherited from datetime.timedelta, so maybe we will have to overwrite it ourselves in the subclass.

BTW, the reason for this difference is that datetime.timedelta does not support nanoseconds, while pd.Timedelta does.

@jreback

the pandas functions should work with ns precision in all cases

so need to override the scalar function as well
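To make the precision gap concrete, here is a plain-Python sketch (not the pandas implementation). It assumes the duration is available as an integer nanosecond count; `total_seconds_from_ns` and `NS_PER_SECOND` are hypothetical names used only for this illustration:

```python
from datetime import timedelta

NS_PER_SECOND = 10**9


def total_seconds_from_ns(nanoseconds: int) -> float:
    """Hypothetical helper: seconds from an integer nanosecond count."""
    return nanoseconds / NS_PER_SECOND


# datetime.timedelta resolves to whole microseconds, so 50 ns
# (0.05 microseconds) is rounded away when the object is built.
stdlib_td = timedelta(microseconds=0.05)
print(timedelta.resolution)
print(stdlib_td.total_seconds())

# Working from the nanosecond count keeps the 50 ns.
print(total_seconds_from_ns(50))
```

Any method inherited from `datetime.timedelta` can only see the microsecond fields, which is why a nanosecond-aware override on the scalar is needed for the two code paths to agree.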

@jorisvandenbossche

jreback

@@ -176,6 +176,10 @@ Other enhancements
- ``pandas.tseries.offsets`` larger than the ``Day`` offset can now be used with with ``Series`` for addition/subtraction (:issue:`10699`). See the :ref:`Documentation <timeseries.offsetseries>` for more details.
- ``pd.Series`` of type timedelta64 has new method .dt.total_seconds() returning the duration of the timedelta in seconds (:issue: `10817`)

use double-backticks around timedelta64 and .dt.total_seconds()

@jreback

I think also add the method .total_seconds() on the NaTType object (returns np.nan) for compat. (and pls add a test as well).

@sjdenny sjdenny changed the title from "Add Series.dt.total_seconds GH #10817" to "ENH: Add Series.dt.total_seconds GH #10817"

Sep 1, 2015

jreback

values = self.asi8
hasnans = self.hasnans
result = 1e-9 * values
if hasnans:

use result = self._maybe_mask_results(result) (you can remove the hasnans as well)
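A rough NumPy-only sketch of the masking idea behind this suggestion. It does not call pandas' internal `_maybe_mask_results`; it assumes missing values are marked in the int64 nanosecond view by the minimum int64 value, and `INAT` and `total_seconds_masked` are hypothetical names:

```python
import numpy as np

# Assumption: NaT is stored as the smallest int64 in the integer view.
INAT = np.iinfo(np.int64).min


def total_seconds_masked(values_ns: np.ndarray) -> np.ndarray:
    """Hypothetical helper: int64 nanoseconds -> float seconds, NaT -> NaN."""
    result = 1e-9 * values_ns            # float64 array of seconds
    result[values_ns == INAT] = np.nan   # mask the sentinel positions
    return result


values = np.array([86_400_000_000_000, INAT, 50], dtype=np.int64)
out = total_seconds_masked(values)
print(out)
```

Masking unconditionally keeps the code path the same whether or not the data contains missing values, so the separate `hasnans` branch becomes unnecessary.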

@sjdenny

jreback added a commit that referenced this pull request

Sep 2, 2015

@jreback

ENH: Add Series.dt.total_seconds GH #10817

@jreback

@jorisvandenbossche

@sjdenny sjdenny deleted the series_total_seconds branch

September 2, 2015 20:13