bpo-15988: make various overflow messages more consistent and helpful by orenmn · Pull Request #668 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation87 Commits21 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

according to http://bugs.python.org/issue15988:

in Python/getargs.c, patch seterror so that it would be easier to do 2 in various places that use PyArg_* functions.
make various OverflowError and ValueError messages more consistent and helpful
add various overflow tests

(I ran the test module, and on my Windows 10, the same tests failed with
and without my patches. However, on my Ubuntu 16.04 VM, none of the tests
failed.)

…i added to csv, as the inconsistent message in csv was already fixed, and so my patch isn't related to it anymore

…oryio, and optimized an if in seterror

…Arg_* funcs in mmapmodule.c

… remove redundant code in _PyLong_AsInt

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great patch! Thanks Oren.

I made the first turn of reviewing. In general LGTM, but I left a number of comments. I think some changes could be extracted to separate issues. Some OverflowErrors could be avoided at all. I didn't tested all changes. I am not sure that reraising OverflowError with new error messages is appropriate in all cases. In some cases current error message looks not obviously worse.

if (shiftby < 0) {
PyErr_SetString(PyExc_ValueError, "negative shift count");
PyErr_SetString(PyExc_ValueError, "shift count can't be negative");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to raise the same ValueError for all negative values, even if they don't fit in Py_ssize_t.

This changes the behavior so maybe it worths the separate issue (or be resolved as a part of bpo-29816).

/* PyLong_Check(b) is true, so it must be that
PyLong_AsSsize_t raised an OverflowError. */
assert(PyErr_ExceptionMatches(PyExc_OverflowError));
PyErr_SetString(PyExc_OverflowError,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See bpo-29816. Right shift can be changed to never raise an OverflowError.

@@ -586,7 +597,8 @@ PyLong_AsUnsignedLong(PyObject *vv)
x = 0;
if (i < 0) {
PyErr_SetString(PyExc_OverflowError,
"can't convert negative value to unsigned int");
"can't convert negative Python int to C "

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ValueError is more appropriate here.

@@ -630,7 +642,7 @@ PyLong_AsSize_t(PyObject *vv)
x = 0;
if (i < 0) {
PyErr_SetString(PyExc_OverflowError,
"can't convert negative value to size_t");
"can't convert negative Python int to C size_t");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ValueError is more appropriate here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we open a sub-issue of bpo-29833 for this?
IMHO, as this is very similar to bpo-29834, maybe we should just mention in bpo-29834 that in case bpo-29834 is accepted and fixed, then also PyLong_AsSize_t should also be changed in a similar way.

| if (overflow || result > INT_MAX | | result < INT_MIN) { | | ------------------------------------------- | ---------------------- | | /* XXX: could be cute and give a different | | | message for overflow == -1 */ | | | if (overflow == 1 || result > INT_MAX) { | |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In common case overflow == 0 and INT_MIN <= result <= INT_MAX. Checking three false conditions in overflow || result > INT_MAX || result < INT_MIN is faster than checking four false conditions in overflow == 1 || result > INT_MAX and overflow == -1 || result < INT_MIN (and comparing with zero can be faster than comparing with non-zero). It is better to keep the current check and adds more checks only for qualifying exception message.

@@ -2184,7 +2181,7 @@ delta_new(PyTypeObject type, PyObject args, PyObject *kw)
"minutes", "hours", "weeks", NULL
};

if (PyArg_ParseTupleAndKeywords(args, kw, "\|OOOOOOO:__new__",
if (PyArg_ParseTupleAndKeywords(args, kw, "\|OOOOOOO:timedelta.__new__",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unrelated to this issue. After converting this module to Argument Clinic it will be outdated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I changed it only because I changed other similar *_new functions in Modules/_datetimemodule.c, and this :__new__ seemed like a mistake that I should correct while I am there.
(anyway, I will remove this change.)

@@ -1217,8 +1221,9 @@ PyCursesWindow_GetStr(PyCursesWindowObject self, PyObject args)
Py_END_ALLOW_THREADS
break;
case 1:
if (!PyArg_ParseTuple(args,"i;n", &n))
if (!PyArg_ParseTuple(args, "i:getstr", &n)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unrelated to this issue. It will be outdated after converting the module to Argument Clinic.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it not related to this issue?
do you mean it is more related to bpo-29833?

Anyway, should I remove this change and all other similar changes I made to format arguments of PyArg_ParseTuple in _cursesmodule.c?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't directly related to error messages in case of integer overflow. Remove all similar changes. They will be made by bpo-20171.

@@ -1351,7 +1351,7 @@ PyCArrayType_new(PyTypeObject type, PyObject args, PyObject *kwds)
length = PyLong_AsLongAndOverflow(length_attr, &overflow);
if (overflow) {
PyErr_SetString(PyExc_OverflowError,
"The '_length_' attribute is too large");
"The '_length_' attribute does not fit in C long");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Negative length is not valid. Raise ValueError in that case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all right. I would also add a test_bad_length test to Lib/ctypes/test/test_arrays.py.

by the way, according to https://docs.python.org/3.7/library/ctypes.html#ctypes.Array._length_,
_length_ must be positive (ISTM that it makes sense, as an array of length 0 is useless), but currently no error is raised in case _length_ == 0.
do you think this is OK?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on second thought, this changes the behavior from raising OverflowError for small negative _length_ to raising ValueError.
should I open another sub-issue of bpo-29833 for that?
or maybe should I just raise OverflowError with a better error message?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether the length 0 is acceptable. Yes, since changing OverflowError to ValueError changes the behavior, this worths a sub-issue.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -159,6 +159,10 @@ py_blake2s_new_impl(PyTypeObject type, PyObject data, int digest_size,
if (leaf_size_obj != NULL) {
leaf_size = PyLong_AsUnsignedLong(leaf_size_obj);
if (leaf_size == (unsigned long) -1 && PyErr_Occurred()) {
if (PyErr_ExceptionMatches(PyExc_OverflowError)) {
PyErr_SetString(PyExc_OverflowError,
"leaf_size must be between 0 and 2**32-1");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message and error message at line 169 should be same.

It may be worth to raise ValueError for negative leaf_size. Are there other restrictions for upper limit? Since this is specific to BLAKE2b it is worth to open a separate issue about this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so do you suggest to remove the changes in py_blake2b_new_impl and py_blake2s_new_impl, and patch these functions as part of the issue of replacing OverflowError with ValueError for negative values?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't well known with this part of code and suggest to open a separate issue for changes in this file.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -223,6 +224,25 @@ class TestFilemodeCStat(TestFilemode, unittest.TestCase):
format_funcs = TestFilemode.format_funcs \| {'S_ISDOOR', 'S_ISPORT',
'S_ISWHT'}

@cpython_only
def test_mode_t_converter(self):
from _testcapi import USHRT_MAX

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed only on Windows. It is worth to split the test on two parts: the general one and the Windows-specific one.

@@ -197,20 +197,22 @@ b_setitem(arrayobject ap, Py_ssize_t i, PyObject v)
/* PyArg_Parse's 'b' formatter is for an unsigned char, therefore
must use the next size up that is signed ('h') and manually do
the overflow checking */
if (!PyArg_Parse(v, "h;array item must be integer", &x))
if (!PyArg_Parse(v, "h:array('b').__setitem__", &x)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is enough to keep OverflowError raised in PyArg_Parse(). It should be good enough. No change is needed.

@@ -1414,12 +1414,16 @@ _sre_compile_impl(PyObject module, PyObject pattern, int flags,
self->code[i] = (SRE_CODE) value;
if ((unsigned long) self->code[i] != value) {
PyErr_SetString(PyExc_OverflowError,
"regular expression code size limit exceeded");
"regular expression code out of range");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to add Py_DECREF(self); and return NULL; here.

Other option is to left just break and rewrite the condition after the loop:

if (i < n || PyErr_Occurred() && PyErr_ExceptionMatches(PyExc_OverflowError)) {
    PyErr_SetString(PyExc_OverflowError, ...);
}
if (PyErr_Occurred()) {
    Py_DECREF(self);
    return NULL;
}

Third option is to add a label before PyErr_SetString(PyExc_OverflowError, ...); and jump to it from here. This looks the simplest option.

else if (x < -128) {
PyErr_SetString(PyExc_OverflowError,
"signed char is less than minimum");
"array item too small to convert to C char");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it consistent with error messages raised by PyArg_Parse().

if (map_size < 0) {
PyErr_SetString(PyExc_OverflowError,
"memory mapped length must be postiive");
"memory mapped length must be positive");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But check that 0 is actually accepted as the length. That there is not a bug in the condition.

@@ -1244,8 +1253,8 @@ new_mmap_object(PyTypeObject type, PyObject args, PyObject *kwdict)
"tagname",
"access", "offset", NULL };

if (!PyArg_ParseTupleAndKeywords(args, kwdict, "in\|ziL", keywords,
&fileno, &map_size,
if (!PyArg_ParseTupleAndKeywords(args, kwdict, "in\|ziL:mmap.__new__",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At best, it would be the same message.

@orenmn, could you please merge with master and resolve conflicts? This will help further reviewing this giant PR.

Some issues may be already solved, perhaps in slightly different way.

I still have some small issues that i wish to take care of before diving into this one, but i hope to get to it sometime soon.

To try and help move older pull requests forward, we are going through and backfilling 'awaiting' labels on pull requests that are lacking the label. Based on the current reviews, the best we can tell in an automated fashion is that a core developer requested changes to be made to this pull request.

If/when the requested changes have been made, please leave a comment that says, I have made the requested changes; please review again. That will trigger a bot to flag this pull request as ready for a follow-up review.

Based on @orenmn's last message and the merge conflicts, I'm going to close this pull request. I believe @serhiy-storchaka was suggesting for this to be split into smaller pull requests that would be easier to review. If that's not the case, feel free to re-open this PR.