[Python-ideas] csv dialect enhancement

Chris Angelico rosuav at gmail.com
Sun Jan 13 21:55:21 CET 2013


On Sat, Jan 12, 2013 at 4:16 AM, rurpy at yahoo.com <rurpy at yahoo.com> wrote:

> There is a common dialect of CSV, often used in database applications [*1], that distinguishes between an empty (quoted) string, e.g., the second field in "abc","",3, and an empty field, e.g., the second field in "abc",,3. This distinction is needed to specify or tell the difference between 0-length strings and NULLs when sending CSV data to or receiving it from a database application.
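For illustration, a minimal check of how the stdlib csv.reader treats the two example rows under its default dialect. Both second fields come back as the empty string, so the distinction described above is not visible to the caller:

```python
import csv

# The two example rows from the message: a quoted empty string
# versus an empty (unquoted) field.
lines = ['"abc","",3', '"abc",,3']

rows = list(csv.reader(lines))
print(rows)
# [['abc', '', '3'], ['abc', '', '3']]
```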

Ugh, this is exactly the sort of thing that my boss didn't believe happened. He thinks that CSV is the same the world over, except for a few really old or arcane programs that can be completely ignored. Took a lot of arguing before we agreed to disagree on that one...

As an explicitly-requestable dialect, looks good.

> Sniffer: Will set "nulls" to True when both adjacent delimiters and quoted empty strings are seen in the input text. (Perhaps this behaviour needs to be optional for backward compatibility reasons?)

Yes, and make it optional. I think the interpretation of ,,,, as empty strings is the more common, since CSV is often used in contexts that don't have a concept of NULL (spreadsheets mainly); empty strings ought, then, to remain the default, with one quick option to turn on recognition of NULLs.
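A rough sketch of what such an opt-in could look like, written as a standalone splitter rather than as a change to the csv module. The function name split_record, the helper _finish, and the nulls keyword are hypothetical, chosen here for illustration only. It assumes a simple dialect: one record per line, no embedded newlines, and doubled quote characters as the escape:

```python
def _finish(buf, quoted, nulls):
    """Turn the collected characters of one field into its value."""
    text = "".join(buf)
    # Only an unquoted, zero-length field is treated as NULL, and only on request.
    if nulls and not quoted and text == "":
        return None
    return text


def split_record(line, nulls=False, delimiter=",", quotechar='"'):
    """Split one CSV record (hypothetical helper, not part of the csv module).

    With nulls=False (the default) every empty field is the empty string.
    With nulls=True an unquoted empty field becomes None, while "" stays ''.
    """
    fields = []
    buf = []
    quoted = False      # did the current field contain a quoted section?
    in_quotes = False   # are we currently inside quotes?
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quotechar:
                if i + 1 < len(line) and line[i + 1] == quotechar:
                    buf.append(quotechar)  # doubled quote is a literal quote
                    i += 1
                else:
                    in_quotes = False
            else:
                buf.append(ch)
        elif ch == quotechar:
            in_quotes = True
            quoted = True
        elif ch == delimiter:
            fields.append(_finish(buf, quoted, nulls))
            buf, quoted = [], False
        else:
            buf.append(ch)
        i += 1
    fields.append(_finish(buf, quoted, nulls))
    return fields


print(split_record('"abc",,3'))                 # default: empty string
print(split_record('"abc",,3', nulls=True))     # opt-in: None
print(split_record('"abc","",3', nulls=True))   # quoted empty stays ''
```

Keeping nulls=False as the default means existing callers see no change, which is the backward-compatible behaviour argued for above.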

So, +1 on the whole idea.

ChrisA
