PyArg_ParseTupleAndKeywords() and non-ASCII keyword names · Issue #110815 · python/cpython
Most C strings in the C API are implied to be UTF-8 encoded. PyArg_ParseTupleAndKeywords() mostly works with non-ASCII keyword names, because they are UTF-8 encoded. There is one exception: passing an argument by keyword with an invalid non-ASCII name to a function that has an optional parameter with a non-ASCII UTF-8 encoded name. In this case you get a crash in the debug build.
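As a rough illustration of why UTF-8 encoded names normally work, the pure-Python sketch below models a keyword-list entry as the UTF-8 bytes a C string would hold and compares it with the `str` key that arrives in the keyword dictionary. The helper name `matches_parameter` is hypothetical, and this is not how CPython implements the comparison. It only shows the encoding relationship.

```python
def matches_parameter(c_name: bytes, keyword: str) -> bool:
    """Return True if a str keyword equals a C parameter name given as UTF-8 bytes.

    Hypothetical helper for illustration only; not CPython's actual code.
    """
    try:
        return c_name.decode("utf-8") == keyword
    except UnicodeDecodeError:
        # A name that is not valid UTF-8 cannot match any str keyword.
        return False


# "\u00e4" is "ä"; it occupies two bytes in UTF-8, as it would in a C keyword list.
param = "\u00e4".encode("utf-8")
print(param)                               # b'\xc3\xa4'
print(matches_parameter(param, "\u00e4"))  # True  (keyword "ä")
print(matches_parameter(param, "\u00eb"))  # False (keyword "ë")
```

The escapes are used instead of literal characters so that the code points are unambiguous regardless of how the source file is normalized.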
It was caused by the combination of f4934ea and a83a6a3 (bpo-28701, #72887). Before these changes you simply got an inaccurate or even wrong error message.
Examples:
- Parameters: "ä"
  Keyword arguments: "ë"
  Old behavior: TypeError "'ë' is an invalid keyword argument for this function"
  Current behavior: crash
  Expected behavior: TypeError "'ë' is an invalid keyword argument for this function"
- Parameters: "ä"
  Keyword arguments: "ä"
  Old behavior: TypeError "invalid keyword argument for this function"
  Current behavior: crash
  Expected behavior: TypeError "'ä' is an invalid keyword argument for this function"
- Parameters: "ä", "ë"
  Keyword arguments: "ä", "ë"
  Old behavior: TypeError "'ë' is an invalid keyword argument for this function"
  Current behavior: crash
  Expected behavior: TypeError "'ä' is an invalid keyword argument for this function"
In case 1 the pre-bpo-28701 behavior was correct; in case 2 it was correct but imprecise (it failed to find the name of the invalid keyword argument); in case 3 it was wrong (it reported the wrong name). Currently, all three cases crash.
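The expected behavior for case 1 can be modeled in pure Python as follows. This is a minimal sketch, not the CPython implementation: `check_keywords` is a hypothetical name, parameter names are assumed to be valid UTF-8 bytes, and the first keyword that matches no parameter is the one reported. Only case 1 is reproduced here.

```python
def check_keywords(param_names, kwargs):
    """Raise TypeError naming the first keyword that matches no parameter.

    Hypothetical model of the expected behavior; param_names are UTF-8 bytes.
    """
    known = {name.decode("utf-8") for name in param_names}
    for key in kwargs:
        if key not in known:
            raise TypeError(
                f"'{key}' is an invalid keyword argument for this function"
            )


# Case 1: parameter "ä" (U+00E4), keyword argument "ë" (U+00EB).
try:
    check_keywords(["\u00e4".encode("utf-8")], {"\u00eb": 1})
except TypeError as exc:
    message = str(exc)

print(message)  # 'ë' is an invalid keyword argument for this function
```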