Issue 8495: test_gdb: use utf8+surrogateescape charset?

Because of a strange bug, gdb writes random bytes to stdout. test_gdb decodes output as utf8, but these random bytes cause a UnicodeDecodeError:

ERROR: test_int (__main__.PrettyPrintTests)
Verify the pretty-printing of various "int"/long values

Traceback (most recent call last):
  File "Lib/test/test_gdb.py", line 188, in test_int
    self.assertGdbRepr(1000000000000)
  File "Lib/test/test_gdb.py", line 176, in assertGdbRepr
    cmds_after_breakpoint)
  File "Lib/test/test_gdb.py", line 144, in get_gdb_repr
    import_site=import_site)
  File "Lib/test/test_gdb.py", line 120, in get_stack_trace
    out, err = self.run_gdb(*args)
  File "Lib/test/test_gdb.py", line 62, in run_gdb
    return out.decode('utf-8'), err.decode('utf-8')
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1882-1887: unsupported Unicode code range

The surrogateescape error handler should be used to escape the invalid byte sequences using surrogates.
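A minimal sketch of what that would look like, not the actual patch. The byte string below is a made-up stand-in for gdb's output (valid text followed by a few stray bytes), and the variable names are only illustrative:

```python
# Hypothetical stand-in for gdb's stdout: valid text followed by stray bytes.
raw = b"$1 = 1000000000000\n" + b"\xff\xfe\x80"

# Strict decoding, as run_gdb does today, fails on the stray bytes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, each undecodable byte 0xNN becomes the lone
# surrogate U+DCNN instead of raising an error.
text = raw.decode("utf-8", "surrogateescape")

# The mapping is reversible: encoding with the same handler restores the bytes.
roundtrip = text.encode("utf-8", "surrogateescape")
```

The valid part of the output is still decoded normally, so the test can still search it for the expected repr, and the original bytes can be recovered if needed.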

See attached file for the strange gdb bug.

command is the byte string "id(1000000000000)\n\0" (19 bytes, strlen=18), but gdb prints bytes after the \0. Stranger still: print (*command)@15 also prints these random bytes, whereas print (*command)@14 doesn't.