[Python-Dev] PyUnicode_EncodeDecimal

Victor Stinner victor.stinner at haypocalc.com
Tue Nov 22 02:02:05 CET 2011


On Monday 21 November 2011 21:39:53, Victor Stinner wrote:

I'm trying to rewrite PyUnicode_EncodeDecimal() to upgrade it to the new Unicode API. The problem is that the function is neither accessible from Python nor tested.

I added tests for this function in Python 2.7, 3.2 and 3.3.

PyUnicode_EncodeDecimal() goes into an infinite loop if there is more than one unencodable character. It's a known bug and there is a patch: http://bugs.python.org/issue13093

I fixed this issue. I was wrong: it was not possible to DoS Python, and the bug was not an infinite loop (but there was a bug in the error handling).

PyUnicode_EncodeDecimal() requires an output buffer but has no argument for the size of that buffer. It is unsafe: it leads to a buffer overflow if the buffer is too small.

This function is broken by design if an error handler is specified: the caller cannot know the size of the output buffer, whereas the caller has to allocate this buffer.
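To illustrate why the caller cannot size the buffer in advance, here is a small Python sketch. It is not the C function itself: the built-in ASCII codec stands in for the decimal encoder, and the handler names are the standard codec ones. With a non-strict handler, one unencodable character may produce one byte, no byte, or several bytes.

```python
# Illustration only: the ASCII codec stands in for the decimal encoder.
text = "12\u20ac3"  # four characters; the euro sign is unencodable

replaced = text.encode("ascii", "replace")            # 1 char -> 1 byte
ignored = text.encode("ascii", "ignore")              # 1 char -> 0 bytes
charref = text.encode("ascii", "xmlcharrefreplace")   # 1 char -> 7 bytes

for name, data in [("replace", replaced),
                   ("ignore", ignored),
                   ("xmlcharrefreplace", charref)]:
    print(name, ":", len(text), "chars ->", len(data), "bytes", data)
```

A custom error handler can return a replacement of any length, so no fixed multiple of the input length is a safe upper bound.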

I propose to raise an error if an error handler other than "strict" is specified, and to make this change in Python 2.7, 3.2 and 3.3.
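As a rough model of what a strict-only version would guarantee, here is a Python sketch. The function name encode_decimal_strict, the choice of ValueError, and the simplified ASCII-only pass-through are assumptions made for illustration, not the actual C code. The point is that with "strict" every input character yields exactly one output byte, so a buffer as long as the input is always enough.

```python
import unicodedata

def encode_decimal_strict(text, errors=None):
    """Hypothetical model of a strict-only decimal encoder."""
    # Assumption: any handler other than strict is rejected up front.
    if errors not in (None, "strict"):
        raise ValueError("only the 'strict' error handler is supported")
    out = bytearray()
    for index, ch in enumerate(text):
        digit = unicodedata.decimal(ch, None)
        if digit is not None:
            # Any Unicode decimal digit becomes the matching ASCII digit.
            out.append(ord("0") + digit)
        elif ord(ch) < 128:
            # Simplification: other ASCII characters are copied unchanged.
            out.append(ord(ch))
        else:
            raise UnicodeEncodeError("decimal", text, index, index + 1,
                                     "invalid decimal Unicode string")
    return bytes(out)

arabic_indic = "\u0661\u0662\u0663"  # ARABIC-INDIC DIGITS ONE, TWO, THREE
encoded = encode_decimal_strict(arabic_indic)
print(encoded, len(encoded) == len(arabic_indic))
```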

In the Python 2.7 code base, PyUnicode_EncodeDecimal() is always called with errors=NULL. In Python 3.x, the function is no longer called.

Should we document and test it, leave it unchanged and deprecate it, or simply remove it?

If we change PyUnicode_EncodeDecimal() to reject error handlers other than "strict", we can keep this function for a few releases and deprecate it. The function is already deprecated because it uses the deprecated Py_UNICODE type.

Victor


