Possible date parsing bug in read_table · Issue #2618 · pandas-dev/pandas

I recently updated to pandas 0.10.0 under Python 2.7.3 and have run into a problem with the read_table function. My program reads the following file:

ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/monthly.land_ocean.90S.90N.df_1901-2000mean.dat

Here's the head of the file:

1880 1 -0.0760
1880 2 -0.2099
1880 3 -0.2170
1880 4 -0.1180
1880 5 -0.1680
1880 6 -0.2055
1880 7 -0.1863
1880 8 -0.1128
1880 9 -0.1192
1880 10 -0.1951
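
Each record in the file holds three whitespace-separated fields: year, month, and the anomaly value. A minimal standard-library sketch of reading such records, assuming that three-field layout holds throughout the file (the names `SAMPLE`, `MISSING`, and `parse_rows` are made up for illustration, and `-999.0000` is the missing-value marker passed as `na_values` later in this report):

```python
# A few records copied from the head of the file shown above.
SAMPLE = """\
1880 1 -0.0760
1880 2 -0.2099
1880 3 -0.2170
"""

# Missing-value marker, taken from the na_values argument used in this report.
MISSING = "-999.0000"


def parse_rows(text):
    """Return (year, month, value) tuples; value is None when marked missing."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        year, month, value = line.split()  # split on any run of whitespace
        rows.append((int(year), int(month), None if value == MISSING else float(value)))
    return rows


for row in parse_rows(SAMPLE):
    print(row)
```

The first two fields together identify the month, which is why the loading code combines columns 0 and 1 into a single date.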

The code to load the data:

import pandas as pd
noaa_file = "monthly.land_ocean.90S.90N.df_1901-2000mean.dat"

noaa = pd.read_table(noaa_file, header=None, sep=r'\s*', parse_dates=[[0,1]], index_col=0, squeeze=True, na_values='-999.0000').to_period(freq='M')

This throws an exception:

AttributeError: 'Series' object has no attribute 'to_period'

I invoke to_period because the parsed dates were appearing as YYYY-MM-DD; the data is monthly, so there is no need for a day component. When I dropped the to_period call, the error disappeared, but I noticed something strange about the index:

         2
0_1
1880 1 -0.0760
1880 2 -0.2099
1880 3 -0.2170
1880 4 -0.1180
1880 5 -0.1680

The index is a pandas.core.index.Index object. Under the previous version of the library, the index was a pandas.tseries.period.PeriodIndex object. It looks like the dates aren't being parsed at all. If I drop the to_period call and follow up with

noaa = pd.Series(noaa.values, pd.PeriodIndex(noaa.index, freq='M'))

then I get exactly what I need and what the original one-liner at the top produced under the previous version. The only way I can accomplish this in "one" line is to define a parser:

from datetime import datetime
parse = lambda x: datetime.strptime(x, '%Y %m')

noaa = pd.read_table(noaa_file, header=None, delim_whitespace=True, parse_dates=[[0,1]], index_col=0, squeeze=True, na_values='-999.0000', date_parser=parse).to_period(freq='M')

Now everything works. I think this is a bug in the date parser.
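
For comparison, here is a minimal sketch of the same idea without `parse_dates`: read the three columns as plain values, apply the `'%Y %m'` format explicitly, and convert the result to monthly periods. It is written against a recent pandas release and has not been checked on 0.10.0. The in-memory sample and the column names `year`, `month`, and `anomaly` are assumptions made for illustration.

```python
import io
from datetime import datetime

import pandas as pd

# In-memory stand-in for the first rows of the NOAA file (assumption for this sketch).
sample = io.StringIO(
    "1880 1 -0.0760\n"
    "1880 2 -0.2099\n"
    "1880 3 -0.2170\n"
)

# Read the columns as plain values; the column names are hypothetical labels.
frame = pd.read_csv(
    sample,
    sep=r"\s+",
    header=None,
    names=["year", "month", "anomaly"],
    na_values=["-999.0000"],
)

# Join year and month with a space and parse with the explicit format,
# mirroring the custom parser defined above.
stamps = [
    datetime.strptime(f"{year} {month}", "%Y %m")
    for year, month in zip(frame["year"], frame["month"])
]

# Convert the first-of-month timestamps to monthly periods.
index = pd.DatetimeIndex(stamps).to_period("M")
noaa = pd.Series(frame["anomaly"].values, index=index)
print(noaa)
```

Because the format string is given explicitly, this does not rely on how the default date parser treats strings such as `1880 1`.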