[Python-ideas] csv.DictReader could handle headers more intelligently.

Steven D'Aprano steve at pearwood.info
Fri Jan 25 00:15:14 CET 2013


On 25/01/13 03:08, J. Cliff Dyer wrote:

On Thu, 2013-01-24 at 07:28 -0800, Shane Green wrote:

Since every form of CSV file counts EOL as a line terminator, I think discarding empty lines preceding the headers is arguably acceptable, but do not think discarding lines of just delimiters would be. What about extending the DictReader API so it was easy to perform these actions explicitly, such as being able to discard() the field names to be re-evaluated on the next line?

I think I like this idea. There's something a little distasteful about making the user manually delve into the underlying reader, but this makes it more user-friendly and more obvious how to proceed.

I couldn't disagree more. I think:

For clarity's sake, what is your objection to discarding lines of delimiters? The reason I suggest doing it is that it is a common output situation when exporting Excel files or LibreCalc files that have a blank row at the top.

A row of delimiters should be treated by the reader object as a row with explicitly empty fields. If the caller wishes to discard them, they can. But the reader object shouldn't make that decision.
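To illustrate the point about delimiter-only rows, here is a minimal sketch using the standard csv.reader. The sample data and the names `sample`, `rows` and `kept` are made up for illustration; the filtering step is one possible way for a caller to discard such rows, not something the csv module does itself.

```python
import csv
import io

# Hypothetical sample: a header, a row of bare delimiters, then real data.
sample = "a,b,c\n,,\n1,2,3\n"

rows = list(csv.reader(io.StringIO(sample)))
# The delimiter-only line is returned as a row of explicitly empty fields,
# not silently dropped by the reader.

# The caller, not the reader, decides to discard rows whose fields are all empty.
kept = [row for row in rows if any(field != "" for field in row)]

print(rows)
print(kept)
```

Keeping the decision in the caller means a file where an all-empty row is meaningful data can still be read faithfully.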

An empty row, on the other hand, should be just ignored. DictReader already ignores empty rows, provided that they are not in the first row.
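The difference between a blank line after the header and a blank line in the first row can be seen in a short sketch. The sample strings and the helper `skip_leading_blank_lines` are hypothetical, written only for this illustration; the helper is one way a caller could work around the first-row case, assuming the input is an iterable of lines.

```python
import csv
import io

# A blank line after the header is skipped by DictReader.
blank_after_header = "a,b\n\n1,2\n"
records = list(csv.DictReader(io.StringIO(blank_after_header)))

# A blank first line is consumed as the header, giving no field names.
blank_first = "\na,b\n1,2\n"
header = csv.DictReader(io.StringIO(blank_first)).fieldnames


def skip_leading_blank_lines(lines):
    """Hypothetical helper: drop empty lines until the first non-empty one."""
    it = iter(lines)
    for line in it:
        if line.strip("\r\n"):
            yield line
            break
    # Everything after the first non-empty line is passed through unchanged.
    yield from it


fixed = csv.DictReader(skip_leading_blank_lines(io.StringIO(blank_first)))
fixed_header = fixed.fieldnames
fixed_records = list(fixed)
```

Only truly empty lines are skipped here; a line of bare delimiters would still reach the reader and be treated as a row.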

-- Steven


