first line comments on a read_csv · Issue #4623 · pandas-dev/pandas

related #4505

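Handling of a comment on the first line seems a little buggy (or perhaps not well-defined). The comment line is not skipped before the header is read, and a commented-out line further down comes back as a row of NaN: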

In [11]: s1 = '# notes\na,b,c\n# more notes\n1,2,3'

In [12]: s2 = 'a,b,c\n# more notes\n1,2,3'

In [13]: pd.read_csv(StringIO(s1), comment='#')
Out[13]: 
        Unnamed: 0
a   b            c
NaN NaN        NaN
1   2            3

In [14]: pd.read_csv(StringIO(s2), comment='#')
Out[14]: 
    a   b   c
0 NaN NaN NaN
1   1   2   3

If you ignore the header (header=None), parsing fails outright:

In [15]: pd.read_csv(StringIO(s1), comment='#', header=None)
CParserError: Error tokenizing data. C error: Expected 1 fields in line 2, saw 3

Related: #3001. This came up in a Stack Overflow question.
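
Until the behaviour is settled, one possible workaround is to drop whole-line comments before the text reaches the parser. The sketch below is only an illustration: strip_comment_lines is a hypothetical helper, not part of pandas, and it assumes a comment line is one whose first non-whitespace characters are the comment marker. Trailing comments after data on the same line are left untouched.

```python
def strip_comment_lines(text, comment="#"):
    """Return text with whole-line comments removed.

    Hypothetical helper (not a pandas API). A line counts as a comment
    when its first non-whitespace characters are the comment marker.
    """
    kept = [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith(comment)
    ]
    return "\n".join(kept)


# The two inputs from the report above.
s1 = "# notes\na,b,c\n# more notes\n1,2,3"
s2 = "a,b,c\n# more notes\n1,2,3"

cleaned1 = strip_comment_lines(s1)
cleaned2 = strip_comment_lines(s2)
```

Both inputs reduce to the same two-line text (header plus one data row), which can then be handed to the parser as pd.read_csv(StringIO(cleaned1)). Filtering up front sidesteps the question of how comment= should interact with the header row, at the cost of holding the whole text in memory.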