[Python-Dev] Quick sum up about open() + BOM
Victor Stinner victor.stinner at haypocalc.com
Sat Jan 9 14:34:17 CET 2010
On Saturday 09 January 2010 02:12:28, MRAB wrote:
> What about listing the possible encodings? It would try each in turn until
> it found one where the BOM matched or had no BOM:
>
>     myfile = open(filename, 'r', encoding='UTF-8-sig|UTF-16|UTF-8')
>
> or is that taking it too far?
Yes, you're taking it too far :-) Checking for a BOM is reliable, whereas guessing the charset from the byte stream alone can only be a heuristic. Guessing a charset is a complex problem; there are third-party libraries for that, such as the chardet project.
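
To illustrate why the BOM check is the reliable part, here is a minimal sketch, assuming Python 3. The helper name `encoding_from_bom` and its `default` parameter are hypothetical, made up for this example; they are not part of open() or of any proposal in this thread. It only compares the first bytes against the BOM constants from the standard `codecs` module and does no guessing at all:

```python
import codecs

# Known BOMs and the codec that consumes each one.
# The UTF-32 entries come first because the UTF-32-LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16-LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def encoding_from_bom(head, default=None):
    """Return a codec name if `head` (bytes) starts with a known BOM.

    Hypothetical helper: returns `default` when no BOM is present,
    because without a BOM the encoding can only be guessed.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

One way to use it would be to read the first four bytes of the file in binary mode, call `encoding_from_bom(head, default='utf-8')`, and pass the result as the `encoding` argument when reopening the file in text mode. The no-BOM case is exactly where a heuristic detector such as chardet would have to take over.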
-- Victor Stinner http://www.haypocalc.com/