[Python-ideas] csv.DictReader could handle headers more intelligently. (original) (raw)

Shane Green shane at umbrellacode.com
Sat Jan 26 15:39:11 CET 2013


Okay, I like your point about DictReader having a place with a subset of CSV tables, and agree that, given that definition, it should throw an exception when its fed something that doesn't conform to this definition. I like that.

One thing, though, the new version would let you access column data by name as well:

Instead of row["timestamp"] == 1359210019.299478

It would be row["timestamp"] == [1359210019.299478]

And potentially row["timestamp"] == [1359210019.299478,1359210019.299478]

It could also be accessed as: row.headers[0] == "timestamp" row.headers[1] == "timestamp" row.values[0] == 1359210019.299478 row.values[1] == 1359210019.299478

Could still provide: for name,value in records.iterfirstitems(): # get the first value for each column with a given name. - or - for name,value in records.iterlasttitems(): # get the last value for each column with a given name.

And the exact functionality you have now: records.itervaluemaps() # or something… just a map(dict(records.iterlastitesm()))

Overkill, but really simple things to add…

The only thing this really adds to the "convenience" of the current DictReader for well-behaved tables, is the ability to access values sequentially or by name; other than that, the only difference would be iterating on a generator method's output instead of the instance itself.

Shane Green www.umbrellacode.com 408-692-4666 | shane at umbrellacode.com

On Jan 26, 2013, at 5:53 AM, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:

Shane Green writes:

And while it's true that a dictionary is a dictionary and it works the way it works, the real point that drives home is that it's an inappropriate mechanism for dealing ordered rows of sequential values. Right! So use csv.reader, or csv.DictReader with an explicit fieldnames argument. The point of csv.DictReader with default fieldnames is to take a "well-behaved" table and turn it into a sequence of "poor-man's" objects. The final point is a simple one: while that CSV file format was stupid, it was perfectly legal. Something that deals with CSV content should not be losing any of its content. That's a reasonable requirement. It also should [not] be barfing or throwing exceptions, by the way. That's not. As long as the module provides classes capable of handling any CSV format (it does), it may also provide convenience classes for special purposes with restricted formats. Those classes may throw exceptions on input that doesn't satisfy the restrictions. And what about fixing it by replacing implementing a class that does it correctly, [...]? Doesn't help users who want automatically detected access-by-name. They must have unique field names. (I don't have a use case. I assume the implementer of csv.DictReader did.)

-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-ideas/attachments/20130126/6964040a/attachment.html>



More information about the Python-ideas mailing list