Crash when pd.to_datetime gets a None for certain values of format · Issue #30011 · pandas-dev/pandas

Code Sample

import pandas as pd

# format '%Y-%m-%d', no None: works
print(pd.to_datetime(['19850212', '19890611'], format='%Y-%m-%d'))
# => DatetimeIndex(['1985-02-12', '1989-06-11'], dtype='datetime64[ns]', freq=None)

# format '%Y-%m-%d', with None: None becomes NaT
print(pd.to_datetime(['19850212', '19890611', None], format='%Y-%m-%d'))
# => DatetimeIndex(['1985-02-12', '1989-06-11', 'NaT'], dtype='datetime64[ns]', freq=None)

# format '%Y%m%d', no None: works
print(pd.to_datetime(['19850212', '19890611'], format='%Y%m%d'))
# => DatetimeIndex(['1985-02-12', '1989-06-11'], dtype='datetime64[ns]', freq=None)

# format '%Y%m%d', with None: raises
print(pd.to_datetime(['19850212', '19890611', None], format='%Y%m%d'))
# => TypeError, ValueError

Problem description

When format is '%Y%m%d', pd.to_datetime raises on the None instead of converting it to NaT, which is what I would expect and what a previous version of pandas did (I'm not sure which). The error is raised even if pd.to_datetime's errors argument is set to 'ignore' or 'coerce'. The cache argument doesn't make a difference, either.

Related: #23055
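
Possible workaround, as a minimal sketch rather than a fix: parse only the non-missing entries and let reindexing fill the gaps with NaT. The helper name to_datetime_skip_missing is hypothetical (it is not part of pandas), and the sketch assumes the input is a flat list of strings and None.

```python
import pandas as pd


def to_datetime_skip_missing(values, fmt="%Y%m%d"):
    """Hypothetical helper: parse non-missing entries, leave the rest as NaT."""
    s = pd.Series(values, dtype="object")
    # Only the non-missing entries are handed to pd.to_datetime, so the
    # '%Y%m%d' code path never sees a None.
    parsed = pd.to_datetime(s[s.notna()], format=fmt)
    # Reindexing back to the full index fills the skipped positions with NaT.
    return pd.DatetimeIndex(parsed.reindex(s.index))


print(to_datetime_skip_missing(["19850212", "19890611", None]))
```

This keeps the explicit format string, so the non-missing values are still parsed strictly against '%Y%m%d'.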

Tracebacks

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py", line 448, in _convert_listlike_datetimes
    values, tz = conversion.datetime_to_datetime64(arg)
  File "pandas/_libs/tslibs/conversion.pyx", line 200, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/example.py", line 6, in <module>
    print(pd.to_datetime(["19850212", "19890611", None], format = "%Y%m%d"))
  File "/usr/local/lib/python3.7/dist-packages/pandas/util/_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py", line 794, in to_datetime
    result = convert_listlike(arg, box, format)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py", line 451, in _convert_listlike_datetimes
    raise e
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py", line 409, in _convert_listlike_datetimes
    "cannot convert the input to " "'%Y%m%d' date format"
ValueError: cannot convert the input to '%Y%m%d' date format
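
Since I'm not sure which pandas version last handled this correctly, a small check like the one below could help narrow it down when run against different installs. It is only a sketch: the function name handles_none_with_compact_format is made up for this report, and it treats any TypeError or ValueError from the call as "affected".

```python
import pandas as pd


def handles_none_with_compact_format():
    """Return True if None comes back as NaT under format='%Y%m%d'.

    Returns False if pd.to_datetime raises TypeError or ValueError instead.
    """
    try:
        result = pd.to_datetime(["19850212", None], format="%Y%m%d")
    except (TypeError, ValueError):
        return False
    return bool(pd.isna(result[1]))


# Print the installed version next to the outcome so results from
# different environments can be compared.
print(pd.__version__, handles_none_with_compact_format())
```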

Output of pd.show_versions()

commit           : None
python           : 3.7.5.candidate.1
python-bits      : 64
OS               : Linux
OS-release       : 5.3.0-19-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.3
numpy            : 1.16.2
pytz             : 2019.2
dateutil         : 2.7.3
pip              : 18.1
setuptools       : 41.1.0
Cython           : None
pytest           : 4.5.0
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.2.5
html5lib         : 1.0.1
pymysql          : None
psycopg2         : 2.7.7 (dt dec pq3 ext lo64)
jinja2           : 2.10
IPython          : None
pandas_datareader: None
bs4              : 4.7.1
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.2.5
matplotlib       : 3.0.2
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
s3fs             : None
scipy            : 1.3.0
sqlalchemy       : None
tables           : None
xarray           : 0.13.0
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : None