ENH: Add support for more placeholders in guess_datetime_format · Issue #43901 · pandas-dev/pandas

We recently noticed that pandas.core.tools.datetimes.guess_datetime_format returns the incorrect format string "Tue %d %b %Y %H:%M:%S AM" for the datetime "Tue 24 Aug 2021 01:30:48 AM", even though dateutil.parser.parse can parse that string. The weekday name and the AM/PM marker are left as literal text instead of being converted to placeholders.

Describe the solution you'd like

Since "Tue 24 Aug 2021 01:30:48 AM" is parseable by dateutil.parser.parse and guess_datetime_format is based on that function, it seems reasonable that guess_datetime_format should produce a correct format string when given this example datetime.
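To illustrate why the literal output is a problem, here is a minimal sketch that uses only the standard library's datetime.strptime rather than pandas itself. The helper name `parses` and the second sample string are made up for this illustration, and the sketch assumes an English/C locale for the weekday, month, and AM/PM names. It shows that the format with literal "Tue" and "AM" parses only the exact original weekday and marker, while the placeholder format also parses other values.

```python
from datetime import datetime

# Format string reported in this issue as the current output.
LITERAL_FORMAT = "Tue %d %b %Y %H:%M:%S AM"
# Format string this issue expects instead.
PLACEHOLDER_FORMAT = "%a %d %b %Y %H:%M:%S %p"


def parses(value: str, fmt: str) -> bool:
    """Hypothetical helper: True if strptime accepts value with fmt."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


original = "Tue 24 Aug 2021 01:30:48 AM"
other = "Wed 25 Aug 2021 09:15:00 AM"  # made-up second sample

# Both formats accept the original string, because the literals happen to match.
print(parses(original, LITERAL_FORMAT), parses(original, PLACEHOLDER_FORMAT))

# Only the placeholder format accepts a different weekday.
print(parses(other, LITERAL_FORMAT), parses(other, PLACEHOLDER_FORMAT))
```

One caveat: in strptime, %p changes the parsed hour only when the hour is read with %I, so with %H the marker is matched but does not shift the hour. The samples above use AM times, where this makes no difference.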

API breaking implications

I believe modifying guess_datetime_format to support more format placeholders should be a backwards-compatible change.

Describe alternatives you've considered

Not sure that there are any.

Additional context

from pandas.core.tools.datetimes import guess_datetime_format

# actual
assert guess_datetime_format("Tue 24 Aug 2021 01:30:48 AM") == "Tue %d %b %Y %H:%M:%S AM"

# expected
assert guess_datetime_format("Tue 24 Aug 2021 01:30:48 AM") == "%a %d %b %Y %H:%M:%S %p"

I already have a PR up for this here: #43900