[Python-ideas] csv.DictReader could handle headers more intelligently.

MRAB python at mrabarnett.plus.com
Thu Jan 24 17:12:09 CET 2013


On 2013-01-24 15:24, Chris Angelico wrote:

On Fri, Jan 25, 2013 at 2:11 AM, J. Cliff Dyer <jcd at sdf.lonestar.org> wrote:

On Thu, 2013-01-24 at 13:38 +0100, Antoine Pitrou wrote:

> 1. Do any data conditioning by ignoring empty lines and lines of just field delimiters before the header row (consensus seems to be "no")

Well, I wouldn't necessarily say we have a consensus on this one. This idea received a +1 from Bruce Leban and an "I don't see any reason not to" from Steven D'Aprano.

I've been lurking this thread, but fwiw, I'd put +1 on ignoring empty lines/just delimiter lines. For a row of column headers, a completely blank line makes no sense. It's a backward-incompatible change, yes, but I can't imagine any code actively relying on this. ISTM this would probably be safe for a minor release (Python 3.4), though of course not for Python 3.3.1.

Ignoring empty lines before a header seems OK to me, but ignoring just-delimiter lines doesn't.

To me, a just-delimiter line where it's expecting a header would mean that all of the columns are unnamed, unless we insist that it's not a header unless at least one column is named, and I don't think that that should be the default behaviour.
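As a rough sketch of the two readings above, and not how csv.DictReader is implemented: the helper name iter_dict_rows and its skip_delimiter_only flag are made up for illustration. With the default, empty lines before the header are skipped but a just-delimiter line is accepted as a header of unnamed columns; with the flag set, such a line is skipped as well.

```python
import csv
import io


def iter_dict_rows(lines, skip_delimiter_only=False, **fmtparams):
    """Yield one dict per data row (hypothetical helper, not a csv API).

    Empty lines before the header are always skipped. A line holding only
    delimiters is used as the header unless skip_delimiter_only is true.
    """
    reader = csv.reader(lines, **fmtparams)
    header = None
    for row in reader:
        if not row:
            continue  # csv.reader yields [] for an empty line
        if skip_delimiter_only and not any(row):
            continue  # every field is '', i.e. a just-delimiter line
        header = row
        break
    if header is None:
        return  # no header found, so there is nothing to yield
    for row in reader:
        if row:
            yield dict(zip(header, row))


SAMPLE = "\n\n,,\nname,age\nalice,30\n"

# Default: ",," is taken as the header, so every column name is ''.
unnamed = list(iter_dict_rows(io.StringIO(SAMPLE)))

# Opt-in: ",," is skipped and "name,age" becomes the header.
named = list(iter_dict_rows(io.StringIO(SAMPLE), skip_delimiter_only=True))
```

Making the more aggressive skipping opt-in keeps the default close to what the file literally says, which is the position argued above.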

As for duplicated column names, I think that it should probably raise an exception unless you've specified that duplicates should be put into a list.
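A minimal sketch of that rule, assuming a hypothetical function read_dicts with a duplicates argument (neither exists in the csv module): a repeated column name raises ValueError by default, and with duplicates="list" the values of the repeated columns are collected into a list.

```python
import csv
import io
from collections import Counter


def read_dicts(lines, duplicates="error", **fmtparams):
    """Return a list of dicts, one per data row (hypothetical helper).

    duplicates="error": raise ValueError if a column name is repeated.
    duplicates="list":  gather the values of repeated columns in a list.
    """
    reader = csv.reader(lines, **fmtparams)
    try:
        header = next(reader)
    except StopIteration:
        return []  # empty input
    repeated = {name for name, count in Counter(header).items() if count > 1}
    if repeated and duplicates == "error":
        raise ValueError("duplicated column names: %s" % sorted(repeated))
    records = []
    for row in reader:
        record = {}
        for name, value in zip(header, row):
            if name in repeated:
                record.setdefault(name, []).append(value)
            else:
                record[name] = value
        records.append(record)
    return records


# Opting in to lists: both 'a' columns end up under one key.
rows = read_dicts(io.StringIO("a,b,a\n1,2,3\n"), duplicates="list")
```

Only the repeated columns become lists here; unique columns keep plain string values, so code that never meets a duplicate sees the usual shape.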


