FIX: Scalar timedelta NaT results raise by ischwabacher · Pull Request #7661 · pandas-dev/pandas



@ischwabacher

Currently, coercion of scalar results from float to timedelta64[ns] passes through int, which raises when attempting to coerce NaN to NaT.

To reproduce:


import pandas as pd
import numpy as np
pd.Series([np.timedelta64('NaT')]).sum()
# TypeError: reduction operation 'sum' not allowed for this dtype
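As a rough illustration of the failure mode described above (not the code from this pull request), the sketch below shows why a float result cannot be routed through `int` when it is NaN, and one way a coercion could special-case NaN as NaT. The helper name `float_to_timedelta_ns` is hypothetical and is not part of pandas.

```python
import math

import numpy as np


def float_to_timedelta_ns(value):
    """Hypothetical helper: coerce a float nanosecond count to timedelta64[ns].

    NaN is mapped to NaT instead of being passed through int().
    """
    if math.isnan(value):
        return np.timedelta64("NaT", "ns")
    return np.timedelta64(int(value), "ns")


# Passing NaN straight through int() fails, which is the step the
# description above identifies as the problem.
try:
    int(float("nan"))
except ValueError as exc:
    print("int(NaN) raises:", exc)

# With the NaN check in place, the coercion yields NaT instead of raising.
print(np.isnat(float_to_timedelta_ns(float("nan"))))
print(float_to_timedelta_ns(2.0) == np.timedelta64(2, "ns"))
```

The exception pandas surfaces to the user (the `TypeError` quoted above) differs from the plain-Python `ValueError` here; the sketch only isolates the NaN-to-int step.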

@jreback

Needs tests (put them in tests/test_series.py, after the timedelta operations section). These types of ops need to be tested systematically, both with and without NaT. There might also be some tests in tseries/tests/test_timeseries.py / test_timedeltas.py.

e.g. sum, mean, max, min, and check that it raises on things like var, std
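A minimal sketch of what such systematic tests might look like, assuming a recent pandas in which timedelta reductions skip NaT by default. The helper `make_series` is hypothetical, and the checks for raising reductions are left out because which of them raise can differ between pandas versions.

```python
import pandas as pd


def make_series(with_nat):
    """Hypothetical fixture: a timedelta64[ns] Series, optionally containing NaT."""
    values = ["1 days", "3 days"]
    if with_nat:
        values.append(pd.NaT)
    return pd.Series(pd.to_timedelta(values))


# Run the same reductions with and without a NaT entry; with the default
# skipna=True the NaT should not change the result.
for with_nat in (False, True):
    s = make_series(with_nat)
    assert s.sum() == pd.Timedelta("4 days")
    assert s.mean() == pd.Timedelta("2 days")
    assert s.min() == pd.Timedelta("1 days")
    assert s.max() == pd.Timedelta("3 days")
```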

@ischwabacher

Test coverage for ops on NaT values is now a little better. The SciPy functions for which operations on datetime64 and timedelta64 make sense could be added, but to my knowledge SciPy will currently choke on these anyway.

@ischwabacher

Oh, the problem is that test_nanops.py only covers NaN and ignores NaT. Hrm. Any bets on how much breakage this change is going to reveal?

@ischwabacher

It looks like np.timedelta64('NaT') creates an object of dtype m8 (generic units) instead of m8[ns], and isnull can't handle that. Instead of accepting m8, it would probably be better to coerce np.timedelta64('NaT') to np.timedelta64('NaT', 'ns') whenever it's being added to a pandas object, if that's feasible.
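The dtype difference can be checked with NumPy alone. The sketch below only inspects the units and shows one possible coercion via `astype`; it does not reflect how pandas itself handles the value.

```python
import numpy as np

generic_nat = np.timedelta64("NaT")    # no unit given: generic timedelta64
ns_nat = np.timedelta64("NaT", "ns")   # explicit nanosecond unit

# np.datetime_data reports the unit and step count of a datetime/timedelta dtype.
print(np.datetime_data(generic_nat.dtype))
print(np.datetime_data(ns_nat.dtype))

# One way to coerce the generic NaT to nanosecond resolution.
coerced = generic_nat.astype("m8[ns]")
print(coerced.dtype, np.isnat(coerced))
```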

@jreback

You don't handle m8 like that.
It's an integer array; use tslib.iNaT.

@ischwabacher

I don't understand what you mean by that. I will see if I can get back to it sometime over the weekend.

@ischwabacher

Note that print(np.array([pd.tslib.iNaT], dtype='m8[ns]')) does not display correctly, but its behavior is consistent with being Not-A-Time.
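The same observation can be reproduced without pandas, assuming the sentinel is the minimum int64 value, which is what NumPy uses internally to represent NaT. The constant name `INAT` below is a local stand-in, not an import from pandas.

```python
import numpy as np

# Assumed sentinel: the smallest int64, NumPy's internal representation of NaT.
INAT = np.iinfo(np.int64).min

# Build an int64 array and reinterpret the same bytes as timedelta64[ns].
arr = np.array([INAT, 5], dtype="int64").view("m8[ns]")

# The sentinel behaves as Not-A-Time; the ordinary value does not.
print(np.isnat(arr))

# Viewing back as int64 recovers the sentinel unchanged.
print(arr.view("i8")[0] == INAT)
```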

@ischwabacher

OK, clearly I need to wait until I have time to really read this and figure out what's going on.

@jreback

Have a look at tseries/test_timedeltas.py. Timedeltas are an odd duck: you generally work on an int64-based series, then take a timedelta64[ns] view. Note that almost all timedelta tests are skipped for numpy < 1.7, as it's broken there.
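The "work on int64, then take a view" pattern can be sketched with NumPy as follows. The function `timedelta_sum_skipna` is hypothetical, and returning NaT when every entry is missing is a choice made for this example rather than a statement about pandas behaviour.

```python
import numpy as np

# Assumed sentinel: the int64 value that NumPy interprets as NaT.
INAT = np.iinfo(np.int64).min


def timedelta_sum_skipna(values):
    """Hypothetical NaT-skipping sum over a timedelta64[ns] array."""
    ints = values.view("i8")          # operate on the underlying integers
    mask = ints == INAT               # locate NaT entries by their sentinel
    if mask.all():
        return np.timedelta64("NaT", "ns")
    total = ints[~mask].sum()
    return np.timedelta64(int(total), "ns")


values = np.array([1, INAT, 3], dtype="i8").view("m8[ns]")
print(timedelta_sum_skipna(values))
```

Doing the arithmetic on the integer view avoids any float round trip, so a missing result can be expressed as the sentinel directly rather than as NaN.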

@ischwabacher

Sorry, hectic time at home. I will try to make time for this soon.

@jreback

Closing in favor of #8268.

This is already fixed in master with the Timedelta refactor.
