0.17 regression: convert_objects coercion with multiple dtypes

I upgraded from 0.16.2 to 0.17/master and it broke working code.

0.16.2:

In [2]: import pandas as pd
   ...: from StringIO import StringIO
   ...: x="""foo,bar
   ...: 2015-09-14,True
   ...: 2015-09-15,
   ...: """
   ...: df=pd.read_csv(StringIO(x),sep=',').convert_objects('coerce')
   ...: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 2 columns):
foo    2 non-null datetime64[ns]
bar    1 non-null object
dtypes: datetime64[ns](1), object(1)
memory usage: 48.0+ bytes

0.17/master:

import pandas as pd
from StringIO import StringIO
x="""foo,bar
2015-09-14,True
2015-09-15,
"""
df=pd.read_csv(StringIO(x),sep=',').convert_objects('coerce')
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 2 columns):
foo    2 non-null datetime64[ns]
bar    0 non-null datetime64[ns]
dtypes: datetime64[ns](2)
memory usage: 48.0 bytes
/home/ambk/work/pandas/pandas/core/generic.py:2584: FutureWarning: The use of 'coerce' as an input is deprecated. Instead set coerce=True.
  FutureWarning)

Booleans get cast to datetimes now?! Usually deprecation means "avoid this in new code", not that working code will break; otherwise it isn't a deprecation but a breaking change. So that's not good, but OK, let's follow the helpful hint:

In [3]: import pandas as pd
   ...: from StringIO import StringIO
   ...: x="""foo,bar
   ...: 2015-09-14,True
   ...: 2015-09-15,
   ...: """
   ...: df=pd.read_csv(StringIO(x),sep=',').convert_objects(coerce=True)
   ...: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 2 columns):
foo    2 non-null object
bar    1 non-null float64
dtypes: float64(1), object(1)
memory usage: 48.0+ bytes

Wut? Booleans become floats, and almost-datetimes aren't converted at all anymore?
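
This doesn't fix the regression, but for anyone hitting the same thing, here is a minimal sketch of a possible workaround: convert only the column that is known to hold dates, instead of letting convert_objects guess for the whole frame. It assumes Python 3 (hence io.StringIO rather than the StringIO module used above) and a pandas version where pd.to_datetime accepts errors='coerce'. I have not checked it against every release discussed here.

```python
import io

import pandas as pd

x = """foo,bar
2015-09-14,True
2015-09-15,
"""

df = pd.read_csv(io.StringIO(x), sep=',')

# Convert only the column known to hold dates; values that cannot be
# parsed become NaT instead of raising (assumes errors='coerce' is supported).
df['foo'] = pd.to_datetime(df['foo'], errors='coerce')

# 'bar' is deliberately left untouched, so it keeps whatever dtype
# read_csv inferred and is never reinterpreted as a datetime.
df.info()
```

Being explicit per column avoids depending on how a frame-wide coercion decides between datetime, numeric, and object, which is exactly the behaviour that changed between the versions above.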