PERF: add exact kw to to_datetime to enable faster regex format parsing by jreback · Pull Request #8904 · pandas-dev/pandas

closes #8989
closes #8903

The default is exact=True, for back-compat.
With exact=False, the match must still start at the beginning of the string (this has always been the case),
but the string is not required to match ONLY the format; IOW, there can be extra stuff after the match.

Avoids having to do a regex replace first.
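
To make the exact vs. non-exact distinction concrete, here is a rough stdlib-only sketch of the semantics described above. It is not the pandas implementation: `parse`, `FORMAT` and the hand-written `FORMAT_RE` are hypothetical names made up for illustration, and the sketch assumes an English locale for the `%b` month abbreviation.

```python
import re
from datetime import datetime

# Hypothetical illustration only, NOT pandas internals.
# FORMAT_RE is a hand-written regex standing in for the format '%d%b%y'.
FORMAT = "%d%b%y"
FORMAT_RE = re.compile(r"\d{2}[A-Za-z]{3}\d{2}")


def parse(value, exact=True):
    """Parse `value` against FORMAT.

    exact=True  -> the whole string must match the format.
    exact=False -> the format must match at the beginning of the string;
                   anything after the match is ignored.
    """
    if exact:
        m = FORMAT_RE.fullmatch(value)
    else:
        m = FORMAT_RE.match(value)  # anchored at the start, trailing text allowed
    if m is None:
        raise ValueError(f"{value!r} does not match format {FORMAT!r}")
    # Only the matched prefix is handed to strptime, so no replace step is needed.
    return datetime.strptime(m.group(0), FORMAT)


print(parse("19MAY11"))
print(parse("19MAY11:00:00:00", exact=False))
```

With `exact=True` the second string raises `ValueError`, which is why the trailing `:00:00:00` previously had to be stripped with a regex replace before parsing.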

In [21]: s = Series(['19MAY11','19MAY11:00:00:00']*100000)

In [22]: %timeit pd.to_datetime(s.str.replace(':\S+$',''),format='%d%b%y')
1 loops, best of 3: 828 ms per loop

In [23]: %timeit pd.to_datetime(s,format='%d%b%y',exact=False)
1 loops, best of 3: 603 ms per loop