Chokes on source files with non-utf-8 encoding · Issue #157 · nedbat/coveragepy
If you have Python source files that are encoded as, e.g., latin-1, the HTML reporter dies with this traceback:
    coverage.main()
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/cmdline.py", line 657, in main
    status = CoverageScript().command_line(argv)
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/cmdline.py", line 549, in command_line
    directory=options.directory, **report_args)
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/control.py", line 599, in html_report
    reporter.report(morfs, config=self.config)
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/html.py", line 83, in report
    self.report_files(self.html_file, morfs, config, config.html_dir)
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/report.py", line 86, in report_files
    report_fn(cu, self.coverage._analyze(cu))
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/html.py", line 198, in html_file
    self.write_html(html_path, html)
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/html.py", line 103, in write_html
    write_encoded(fname, html, 'ascii', 'xmlcharrefreplace')
  File "/var/cache/eggs/coverage-3.5.1-py2.6-linux-x86_64.egg/coverage/backward.py", line 137, in write_encoded
    f.write(text.decode('utf8'))
  File "/usr/local/python2.6/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe4 in position 14451: invalid continuation byte
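For what it's worth, the error in the last frame can be reproduced in isolation. This is only a minimal sketch, written in Python 3 syntax rather than the 2.6 from the traceback. The sample string and the helper name try_decode_utf8 are made up for illustration and are not part of coverage.py:

```python
# Hypothetical sample: "März" as it would be stored in a latin-1 encoded file.
latin1_bytes = "M\u00e4rz".encode("latin-1")  # b'M\xe4rz'


def try_decode_utf8(data):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# 0xe4 looks like the start of a multi-byte utf-8 sequence, but the byte
# after it ('r') is not a continuation byte, so decoding as utf-8 fails.
text, error = try_decode_utf8(latin1_bytes)

# Decoding with the encoding the file really uses works, and the result can
# then be written as ASCII with character references ('xmlcharrefreplace').
html_bytes = latin1_bytes.decode("latin-1").encode("ascii", "xmlcharrefreplace")
```

So the bytes themselves are fine. The failure comes from assuming utf-8 when decoding them.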
The workaround is simple, of course: change the file's encoding and its encoding declaration (if you declare one at all, it should be utf-8 anyway). Still, I wonder whether this could be handled more gracefully, with an error message that says what is going on.
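To illustrate what "more gracefully" could look like, here is a rough sketch of reading a source file according to its own encoding declaration and failing with a message that names the file. It assumes Python 3, where the standard library provides tokenize.detect_encoding. The names read_source_text and SourceEncodingError are hypothetical, and this is not how coverage.py actually reads source files:

```python
import tokenize


class SourceEncodingError(Exception):
    """Hypothetical error type that names the file and the failed encoding."""


def read_source_text(path):
    """Read a Python source file as text, honoring its BOM or coding line."""
    with open(path, "rb") as f:
        try:
            # Looks at the BOM and the first two lines; defaults to utf-8.
            encoding, _ = tokenize.detect_encoding(f.readline)
        except SyntaxError as exc:
            raise SourceEncodingError(
                "%s: cannot determine source encoding: %s" % (path, exc)
            )
        f.seek(0)
        data = f.read()
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        # Say which file is affected and what to do about it, instead of
        # letting a bare UnicodeDecodeError escape from deep in the reporter.
        raise SourceEncodingError(
            "%s: not valid %s (%s); add or fix the coding declaration"
            % (path, encoding, exc)
        )
```

With something along these lines, a latin-1 file that declares its encoding would be read correctly, and one that does not would produce an error pointing at the file rather than at the utf-8 codec.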