Issue 13209: Refactor code using unicode_encode_call_errorhandler() in unicodeobject.c

It's difficult to use unicode_encode_call_errorhandler() because the caller has to:

- resize the output buffer (and check for integer overflow on the new size)
- handle both bytes and str for the replacement string: PyUnicode_EncodeDecimal() doesn't support bytes, for example
- encode a str replacement: some encoders use ASCII, unicode_encode_ucs1() uses Latin1, PyUnicode_EncodeCharmap() uses a recursive call (without a check for an infinite loop!), ...; and raise a UnicodeEncodeError if the encoding fails
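
To make the three duties above concrete, here is a rough pure-Python model of what a Latin1 encoder has to do around its error handler. It is only a sketch, not the code in unicodeobject.c: encode_latin1_sketch() is a hypothetical name, and a bytearray stands in for the output buffer that the C code has to resize by hand. Only codecs.lookup_error() and the UnicodeEncodeError constructor are real APIs here.

```python
import codecs


def encode_latin1_sketch(text, errors="strict"):
    """Hypothetical model of a Latin1 encoder's error path (not CPython code)."""
    handler = codecs.lookup_error(errors)
    # A bytearray grows on its own; the C caller must resize its buffer
    # explicitly and check the new size for integer overflow.
    out = bytearray()
    pos = 0
    while pos < len(text):
        if ord(text[pos]) < 256:
            out.append(ord(text[pos]))
            pos += 1
            continue

        # Collect the whole run of unencodable characters.
        end = pos + 1
        while end < len(text) and ord(text[end]) >= 256:
            end += 1
        exc = UnicodeEncodeError(
            "latin-1", text, pos, end, "ordinal not in range(256)")
        replacement, newpos = handler(exc)

        if isinstance(replacement, bytes):
            # bytes replacement: copied to the output as-is.
            out += replacement
        else:
            # str replacement: it has to be encoded too, and may fail.
            for ch in replacement:
                if ord(ch) >= 256:
                    raise UnicodeEncodeError(
                        "latin-1", text, pos, end,
                        "replacement is not encodable")
                out.append(ord(ch))

        # The handler chooses where encoding resumes; negative positions
        # are relative to the end of the input.
        if newpos < 0:
            newpos += len(text)
        if not 0 <= newpos <= len(text):
            raise IndexError("position %d from error handler out of bounds"
                             % newpos)
        pos = newpos
    return bytes(out)
```

Every encoder repeats some variant of this logic with a different rule in the str branch (ASCII only, Latin1, or a second pass through the charmap), which is the part that would be nice to share.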

It would be nice to factorize this code.

I plan to implement this refactoring; this is just a reminder for me :-)

I tried to factorize the code, but it is too complex. Each encoder handles errors differently. The trickiest is charmap: it re-encodes the result of the error handler for non-ASCII characters. I'm not happy with the current situation, but I don't see how to factorize the code easily, so I prefer to leave it unchanged. I'm closing the issue.
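
The charmap behaviour can be seen from Python with a custom error handler. This is a small illustration, assuming the cp437 codec goes through the charmap encoder; the handler name "sketch_e_acute" is made up for the example. The handler returns a non-ASCII str, and the encoder sends that replacement through the character map again instead of copying it out as ASCII or Latin1.

```python
import codecs


def replace_with_e_acute(exc):
    # Replace the unencodable run with U+00E9 and resume after it.
    return ("\u00e9", exc.end)


# "sketch_e_acute" is a hypothetical handler name used only for this example.
codecs.register_error("sketch_e_acute", replace_with_e_acute)

# U+0100 is not in cp437, so the error handler is called; its U+00E9
# replacement is then looked up in the cp437 map like any other character.
result = "a\u0100b".encode("cp437", "sketch_e_acute")
print(result)
```

If the handler returned a character that is itself missing from the map, the encoder would raise UnicodeEncodeError rather than call the handler again.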