[Python-Dev] Replace useless %.100s by %s in PyErr_Format()

M.-A. Lemburg mal at egenix.com
Thu Mar 24 13:22:47 CET 2011


Victor Stinner wrote:

Hi,

I plan to replace all %.100s (or any other size, %.[0-9]+s regex) by %s in the whole source code, in all calls to PyErr_Format(). And I would like your opinion.

When Guido added the function PyErr_Format(), 13 years ago, the function was implemented using a buffer of 500 bytes (allocated on the stack). The developer was responsible to limit the argument fit into a total of 500 bytes. But 3 years later (2000), PyErr_Format() was patched to use a dynamic buffer (allocated on the heap).

But since this change, PyErr_Format() doesn't support %.100s anymore (the 100 bytes limitation is just ignored), and it becomes useless and so no, it's no more (since 10 years) a "protection" against segmentation fault.

But I would like to know if I have to do in all branches (3.1-3.3, or worse: 2.5-3.3), or just in 3.3? Because it may make merge harder (like any change only done in default).

I would like to replace %.100s because there are no more reason to truncate strings to an arbitrary length.

=> http://bugs.python.org/issue10833

---

... at the same time, Ray Allen wrote a patch to implement %.100s in PyUnicode_FromFormatV() (so PyErr_Format() will support it too). I would like to replace %.100s in PyErr_Format(), and then commit its patch.

http://bugs.python.org/issue7330

I think it's better to add the #7330 fix and leave the length limitations in place.

Note that the length limitation did more than protect against segfaults, first when PyErr_Format() was using a fixed size buffer and later on against miscalculations in creating the variable sized buffer. It also keeps the error message text from becoming too long to be of any use, or from causing problems further down the line in error processing.

BTW: Why do you think that %.100s is not supported in PyErr_Format() in Python 2.x? PyString_FromFormatV() does support this. The change to use Unicode error strings introduced the problem, since PyUnicode_FromFormatV() for some reason ignores the precision (which it shouldn't).
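For readers unfamiliar with the precision syntax, here is a minimal sketch in Python, whose % operator follows the same printf-style convention: a precision on %s caps the number of characters taken from the argument. It only illustrates what %.100s is meant to do. It does not call PyErr_Format() or PyUnicode_FromFormatV(), and the oversized name is made up for the example.

```python
# Illustration only: printf-style precision on %s, using Python's % operator.
# long_name is a hypothetical, deliberately oversized argument.
long_name = "X" * 1000

truncated = "%.100s" % long_name   # precision honoured: at most 100 characters
unlimited = "%s" % long_name       # no precision: the whole string is copied

print(len(truncated))  # 100
print(len(unlimited))  # 1000
```

A formatter that ignores the precision behaves like the second case even when the format string asks for the first.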

That said, it's a good idea to add the #7330 fix to at least Python 2.7 as well, since ignoring the precision is definitely a bug. It may even be security relevant, since it could be used for DoS attacks on servers (e.g. causing them to write huge strings to log files instead of just a few hundred bytes per message), so it may even need to go into Python 2.6.
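The log-size concern can be sketched as follows, again in Python rather than against the C API. The format string, the helper format_error() and the hostile input are all hypothetical and not taken from the CPython sources. The point is only that an honoured precision gives each message a fixed upper bound, while an ignored one lets the caller's input set the size.

```python
# Sketch under stated assumptions: FORMAT, format_error() and the hostile
# input are hypothetical, not taken from the CPython sources.
FORMAT = "cannot convert '%.100s' object to str"

def format_error(type_name, honour_precision=True):
    """Build the message with the %.100s limit, or with it ignored."""
    fmt = FORMAT if honour_precision else FORMAT.replace("%.100s", "%s")
    return fmt % type_name

hostile = "A" * 1_000_000  # e.g. a name supplied by a remote client

bounded = format_error(hostile)
unbounded = format_error(hostile, honour_precision=False)

# Upper bound: the fixed text plus at most 100 characters from the argument.
max_len = len(FORMAT.replace("%.100s", "")) + 100

print(len(bounded) <= max_len)   # True
print(len(unbounded) > max_len)  # True
```

With the limit in place every message stays within max_len characters regardless of input. Without it, one request can push a megabyte into the log.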

Thanks,

Marc-Andre Lemburg eGenix.com



