BUG: Timedelta input string with only symbols and no digits failed to raise an error · Issue #39710 · pandas-dev/pandas


pd.Timedelta with non-digit string input, such as pd.Timedelta('-'), successfully processes and returns Timedelta('0 days 00:00:00'). From discussion with @jreback in #39497 it was concluded that this is unwanted and that it should raise an error instead.

I found that this happens for string input made up of '+', '-', ',' and ' ' in various combinations; see below:

In [1]: pd.Timedelta('+')
Out[1]: Timedelta('0 days 00:00:00')

In [2]: pd.Timedelta('-')
Out[2]: Timedelta('0 days 00:00:00')

In [3]: pd.Timedelta(' ')
Out[3]: Timedelta('0 days 00:00:00')

In [4]: pd.Timedelta(',')
Out[4]: Timedelta('0 days 00:00:00')

In [5]: pd.Timedelta('++')
Out[5]: Timedelta('0 days 00:00:00')

In [6]: pd.Timedelta('+-')
Out[6]: Timedelta('0 days 00:00:00')

Problem description

Invalid input should raise an error instead of silently resolving to a zero Timedelta.

Expected Output

Raise an error. I would propose generalizing the ValueError("unit abbreviation w/o a number") on L497 of timedeltas.pyx to ValueError("characters w/o a number"), since this error already shows up for input that contains no unit abbreviations, for example:

In [2]: pd.Timedelta('foo')

ValueError                                Traceback (most recent call last)
in
----> 1 pd.Timedelta('foo')

~/Documents/developer/pandas/pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.Timedelta.__new__()
   1192             value = parse_iso_format_string(value)
   1193         else:
-> 1194             value = parse_timedelta_string(value)
   1195         value = np.timedelta64(value)
   1196     elif PyDelta_Check(value):

~/Documents/developer/pandas/pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.parse_timedelta_string()
    429                 result += timedelta_as_neg(r, neg)
    430             else:
--> 431                 raise ValueError("unit abbreviation w/o a number")
    432
    433     # treat as nanoseconds

ValueError: unit abbreviation w/o a number
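To make the expected behaviour concrete, here is a minimal pure-Python sketch of the kind of check I have in mind. It is not the Cython code in timedeltas.pyx: the helper name `reject_symbol_only`, the `SYMBOLS_ONLY` set and the error message are all hypothetical and only illustrate that a string made up solely of the symbols listed above should raise instead of parsing to zero.

```python
# Hypothetical stand-alone check; NOT the pandas implementation.
# The symbol set mirrors the characters reported in this issue.
SYMBOLS_ONLY = frozenset("+-, ")


def reject_symbol_only(value: str) -> str:
    """Return `value` unchanged, or raise if it holds only '+', '-', ',' or ' '."""
    if value and all(ch in SYMBOLS_ONLY for ch in value):
        # Placeholder message; the final wording is up for discussion.
        raise ValueError("symbols w/o a number")
    return value


if __name__ == "__main__":
    # Every input from the report above should now be rejected.
    for candidate in ["+", "-", " ", ",", "++", "+-"]:
        try:
            reject_symbol_only(candidate)
        except ValueError as err:
            print(f"{candidate!r}: ValueError({err})")
```

Strings that contain anything else, such as '1 days' or 'foo', pass through this check untouched and would still be handled by the existing parsing logic.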

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 492c5e0
python : 3.8.6.final.0
python-bits : 64
OS : Darwin
OS-release : 20.2.0
Version : Darwin Kernel Version 20.2.0: Wed Dec 2 20:39:59 PST 2020; root:xnu-7195.60.75~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 1.2.0.dev0+1259.g492c5e00d1
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.4
setuptools : 49.6.0.post20201009
Cython : 0.29.21
pytest : 6.1.1
hypothesis : 5.37.3
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : 1.3.7
lxml.etree : 4.6.0
html5lib : 1.1
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 7.18.1
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.8.4
fastparquet : 0.4.1
gcsfs : 0.7.1
matplotlib : 3.3.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.5
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : 0.4.2
scipy : 1.5.2
sqlalchemy : 1.3.20
tables : 3.6.1
tabulate : 0.8.7
xarray : 0.16.1
xlrd : 1.2.0
xlwt : 1.3.0
numba : 0.51.2