Issue 13209: Refactor code using unicode_encode_call_errorhandler() in unicodeobject.c
It's difficult to use unicode_encode_call_errorhandler() because the caller has to:
- resize the output buffer (and check for integer overflow on the new size)
- handle bytes and str for the replacement string: PyUnicode_EncodeDecimal() doesn't support bytes for example
- encode the replacement str: some encoders use ASCII, unicode_encode_ucs1() uses Latin1, PyUnicode_EncodeCharmap() uses a recursive call (without a check for an infinite loop!), ...; and raise a UnicodeEncodeError if the encoding fails
It would be nice to factor this code out.
I plan to implement this refactoring, it's just a reminder for me :-)
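The replacement-string cases listed above can be observed from Python, without reading the C code. The following is only a minimal sketch, assuming CPython 3: the handler names (`demo.str`, `demo.bytes`, `demo.unencodable`) are made up for illustration, and the script shows the visible behaviour of the ASCII encoder, not the internals of unicode_encode_call_errorhandler().

```python
import codecs

# Hypothetical handlers, registered only for this illustration.
def str_handler(exc):
    # Return a str replacement; the encoder has to encode it itself.
    return ("?", exc.end)

def bytes_handler(exc):
    # Return a bytes replacement; the encoder copies it to the output as is.
    return (b"?", exc.end)

def unencodable_handler(exc):
    # Return a str replacement that the ASCII encoder cannot encode either.
    return ("\u20ac", exc.end)

codecs.register_error("demo.str", str_handler)
codecs.register_error("demo.bytes", bytes_handler)
codecs.register_error("demo.unencodable", unencodable_handler)

text = "a\u20acb"

as_str = text.encode("ascii", "demo.str")
as_bytes = text.encode("ascii", "demo.bytes")

# The replacement itself is not ASCII, so the encoder gives up.
try:
    text.encode("ascii", "demo.unencodable")
    failed = False
except UnicodeEncodeError:
    failed = True

print(as_str, as_bytes, failed)
```

Each encoder has to implement all three paths (str, bytes, failing replacement) on its own, which is the duplication this issue is about.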
I tried to factor out the code, but it is too complex. Each encoder handles errors differently. The trickiest one is charmap: it re-encodes the result of the error handler for non-ASCII characters. I'm not happy with the current situation, but I don't see how to factor out the code easily, so I prefer to leave it unchanged. I'm closing the issue.
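To illustrate how the encoders differ, here is a small sketch, assuming CPython 3 and a hypothetical handler named `demo.euro`. The same handler result is accepted by a charmap-based codec (cp1252, which maps the replacement through its own table) and rejected by the Latin1 encoder.

```python
import codecs

# Hypothetical handler: always answers with the euro sign (non-ASCII).
def euro_handler(exc):
    return ("\u20ac", exc.end)

codecs.register_error("demo.euro", euro_handler)

# U+0100 cannot be encoded by either cp1252 or Latin1.
text = "x\u0100y"

# cp1252 is a charmap codec: the replacement is re-encoded through the
# charmap, where the euro sign maps to byte 0x80.
charmap_result = text.encode("cp1252", "demo.euro")

# Latin1 cannot represent the euro sign, so the replacement is rejected.
try:
    text.encode("latin-1", "demo.euro")
    latin1_failed = False
except UnicodeEncodeError:
    latin1_failed = True

print(charmap_result, latin1_failed)
```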