Tools/scripts/h2py.py fails with UnicodeDecodeError when a header file contains characters that cannot be decoded in the current locale. I suggest opening the files in binary mode. I'm attaching a patch.
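For illustration only (this is not the attached patch): a minimal sketch of how scanning a header as bytes avoids the decode step entirely. The pattern `DEFINE_RE` and the helper `iter_defines` are hypothetical and much simpler than what h2py.py really uses.

```python
import re

# Hypothetical, simplified pattern; the real h2py.py patterns are more involved.
DEFINE_RE = re.compile(rb'^[\t ]*#[\t ]*define[\t ]+([a-zA-Z0-9_]+)[\t ]+(.*)')

def iter_defines(data):
    """Yield (name, body) byte pairs for simple #define lines in header bytes."""
    for line in data.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            yield m.group(1), m.group(2).strip()

# The comment contains a Latin-1 byte that is not valid UTF-8; since nothing
# is decoded, it cannot raise UnicodeDecodeError.
sample = b'/* caf\xe9 */\n#define FOO 1\n#define BAR "x"\n'
defines = list(iter_defines(sample))
```

With a real file, the bytes would come from `open(path, 'rb').read()` rather than from a literal.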
Using your patch, h2py.py skips all statements that cannot be decoded from UTF-8, whereas the unpatched h2py.py accepts all statements that can be decoded from the locale encoding. I don't know whether accepting non-ASCII statements is intentional. It may be safer to ensure that a statement is encodable to ASCII before executing it:

    try:
        stmt.encode('ASCII')
        exec(stmt, env)
    except:
        ...

Anyway, I would prefer to just drop this script along with all the Lib/plat-*/ directories. I reopened this topic on python-dev (I had already raised it when I was working on the sys.platform=="linux3" issue: #12326).
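A slightly more explicit sketch of that ASCII check, assuming `stmt` is a str and `env` is the dict used as the exec namespace. The helper name `exec_if_ascii` is made up for this example and does not exist in h2py.py.

```python
def exec_if_ascii(stmt, env):
    """Execute stmt in env only if it is pure ASCII; return True if it ran."""
    try:
        stmt.encode('ascii')
    except UnicodeEncodeError:
        # Non-ASCII statement: skip it instead of executing it.
        return False
    try:
        exec(stmt, env)
    except Exception:
        # The statement is ASCII but is not valid or runnable Python: skip it.
        return False
    return True

env = {}
ok = exec_if_ascii('FOO = 1', env)
skipped = exec_if_ascii('BAR = "caf\u00e9"', env)
```

Separating the two try blocks makes it visible which statements were rejected for their encoding and which failed during exec; the bare `except:` in the one-liner hides that difference.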
UTF-8 is the default encoding in Python 3, so statements with UTF-8 characters could be accepted. Strings of any kind are very rare in these statements. On my system, only the generated TYPES.py contains strings, two of them:

    # Included from bits/select.h
    __FD_ZERO_STOS = "stosq"
    __FD_ZERO_STOS = "stosl"

/usr/include/bits/select.h contains:

    # if __WORDSIZE == 64
    # define __FD_ZERO_STOS "stosq"
    # else
    # define __FD_ZERO_STOS "stosl"
    # endif