CodeParser not opening source files with proper decoder · Issue #107 · nedbat/coveragepy
Description
Originally reported by Brett Cannon (Bitbucket: brettcannon, GitHub: brettcannon)
In CodeParser.__init__() you will notice that it opens a source file and then reads it, relying on the default encoding for open(). On Python 3, this can trigger a UnicodeDecodeError if the source file declares an explicit encoding that is incompatible with that default (typically UTF-8).
For example, in Python's stdlib, Lib/sqlite3/test/dbapi.py declares an encoding of ISO-8859-1. Because CodeParser doesn't use something like tokenize.detect_encoding() (http://docs.python.org/py3k/library/tokenize.html#tokenize.detect_encoding), the read fails: the file contains some bytes that are valid under ISO-8859-1 but not allowed under UTF-8.
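To illustrate the idea, here is a minimal sketch of reading a source file with the encoding it declares, on Python 3.2 or later. It is not the attached patch and not coverage.py's actual code; the helper names `detect_source_encoding` and `read_python_source` are made up for this example. Only `tokenize.detect_encoding()` and `tokenize.open()` are real stdlib calls.

```python
import tokenize


def detect_source_encoding(filename):
    """Return the encoding declared by a Python source file (hypothetical helper).

    detect_encoding() looks at a BOM and at a PEP 263 coding cookie in the
    first two lines, and falls back to UTF-8 when neither is present.
    """
    # The file must be opened in binary mode: detect_encoding() expects a
    # readline callable that returns bytes.
    with open(filename, "rb") as source_file:
        encoding, _lines_read = tokenize.detect_encoding(source_file.readline)
    return encoding


def read_python_source(filename):
    """Return the decoded text of a Python source file (hypothetical helper).

    tokenize.open() runs detect_encoding() internally and opens the file in
    text mode with the detected encoding, so no default encoding is involved.
    """
    with tokenize.open(filename) as source_file:
        return source_file.read()
```

With something along these lines, a file such as Lib/sqlite3/test/dbapi.py would be decoded as ISO-8859-1 rather than with whatever default open() picks.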
- Bitbucket: https://bitbucket.org/ned/coveragepy/issue/107
- This issue had attachments: tokenize_open.diff. See the original issue for details.