Issue 1532483: the csv module writes files that Excel sees as SYLK files (original) (raw)

Created on 2006-08-01 14:52 by madewokherd, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
csvtest.py madewokherd,2006-08-01 14:52 test case that writes a file Excel has trouble opening
Messages (6)
msg29409 - (view) Author: Vincent Povirk (madewokherd) Date: 2006-08-01 14:52
I'm using python version 2.4.3 Apparently, when Excel 2003 reads a file, it looks for the identifying string "ID" at the beginning of the file. If it finds this string, it assumes it's reading an SYLK file (see http://netghost.narod.ru/gff/graphics/summary/micsylk.htm for some information on SYLK). The csv module will generate a file that starts with ID if the first field it writes starts with ID and does not need to be quoted. When Excel tries to open the file, the following message pops up: "Excel has detected that 'test.csv' is a SYLK file, but cannot load it. Either the file has errors or it is not a SYLK file format. Click OK to try to open the file in a different format." Excel can read the file after clicking OK. Excel actually has the same problem with CSV files it has written. Even so, when using the 'excel' dialect, csv should write files that Excel can open without any problems. It could do this by quoting the first field in the file if it begins with "ID". Unfortunately, csv's Dialect class does not make this possible. I'm currently working around it by using QUOTE_NONNUMERIC.
msg29410 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2006-08-01 16:22
Logged In: YES user_id=44345 Seems like a shortcoming in Excel 2003 to me, not a problem with the csv module. Still, if you can suggest a change that won't break many other uses of the csv module's output, I'll consider it. Have you tried the same test with a later version of Excel? Skip
msg29411 - (view) Author: Vincent Povirk (madewokherd) Date: 2006-08-01 18:01
Logged In: YES user_id=553355 Thanks for your response. Yes, it's definitely a bug in Excel 2003 (as it also complains about files it saved). I do not have a later version of Excel to test. Microsoft has a page about this issue that seems to say 2003 is the last version with that problem: http://support.microsoft.com/kb/323626/ Their solution is worse than the problem, I'd be interested in seeing how a later version behaves. I know that if the first cell is quoted, Excel will open it without complaining. I think the best solution would be to quote the first cell if it starts with ID by introducing a new QUOTE_ constant. I don't know how that part of the code works (I'm too lazy to read things that aren't written in python); maybe it's more reasonable to quote any field starting with ID. I don't know of any other uses that would break, but I'm not in touch with many csv users.
msg29412 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2006-08-02 13:28
Logged In: YES user_id=44345 Vincent, A simple workaround would be to define a fully quoting dialect: class quoted_excel(csv.excel): quoting=csv.QUOTE_NONNUMERIC # or QUOTE_ALL That would cause your generated CSV files to start with a quote character, e.g.: "ID","FOO" 1,"bar" 2,"bAz" Try that and see if it makes Excel 2003 happy. Skip
msg29413 - (view) Author: Vincent Povirk (madewokherd) Date: 2006-08-03 05:17
Logged In: YES user_id=553355 It does, and that's what I've been doing. I can't believe I failed to mention it.
msg29414 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2006-08-03 11:17
Logged In: YES user_id=44345 > I can't believe I failed to mention it. Actually, you did: > I'm currently working around it by using QUOTE_NONNUMERIC. I didn't notice it because I was focused on Microsloth's "workaround"... My apologies. I'm closing this as "won't fix". Skip
History
Date User Action Args
2022-04-11 14:56:19 admin set github: 43756
2006-08-01 14:52:24 madewokherd create