Issue 3720: segfault in for loop with evil iterator (original) (raw)

process

Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ajaksu2, amaury.forgeotdarc, arigo, benjamin.peterson, donmez, gideon, gregory.p.smith, gvanrossum, jcea, rhettinger, terry.reedy, vstinner
Priority: normal Keywords: 26backport

Created on 2008-08-29 04:19 by gideon, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
baditerator.py gideon,2008-08-29 04:19 example of iterator crashing in a for loop
baditer.patch amaury.forgeotdarc,2008-08-29 13:58
baditer-2.6.patch amaury.forgeotdarc,2008-08-29 13:59
itercrashers.diff ajaksu2,2008-08-29 23:17 Fix Armin's examples
next-nevernull-2.patch amaury.forgeotdarc,2008-09-02 16:54
next-nevernull-26.patch amaury.forgeotdarc,2009-01-13 00:47 binary-compatible patch for 2.6
Messages (29)
msg72119 - (view) Author: Gideon Smeding (gideon) Date: 2008-08-29 04:19
The attached example crashes python. The crash is caused by an evil iterator that removes its own next function.
msg72130 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-29 12:34
Good catch! Here is a patch for 3.0.
msg72133 - (view) Author: Ismail Donmez (donmez) * Date: 2008-08-29 13:33
Maybe do a s/"object is no more an iterator"/"is no longer an iterator"
msg72135 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-29 13:59
New patches, with the suggested spelling. For 3.0 and 2.6.
msg72148 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-29 18:33
It would be a damned shame to slow down the entire language to save code that is intentionally shooting itself in the foot. EVERYONE will pay a cost -- NO ONE will benefit. My two cents.
msg72155 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-29 20:18
Here are some timings, on winXP, vs2008 release build: # t.py def f(l=range(5000)): for x in l: pass # before the patch > python -m timeit -s "from t import f" "f()" 10000 loops, best of 3: 159 usec per loop # after the patch > python -m timeit -s "from t import f" "f()" 10000 loops, best of 3: 160 usec per loop and these figures are pretty stable on my machine. Is it too much to pay? Some may consider that potential crashes are more expensive.
msg72158 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-29 20:34
Go for it. There's really no question about fixing possible segfaulters. Was just voicing my displeasure at this sort of exercise. The code has been around since at least 2.2 and hasn't caused the slightest problem. It bugs me to put cruft in the middle of otherwise tight loops. Fortunately, the check is cheap and branch predictable. BTW, your timings are domained by the function call and the range(5000) step. To cleanly time for-loop overhead, use: python -m timeit "pass"
msg72159 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2008-08-29 21:23
The same approach can be used to segfault many more places. See http://svn.python.org/projects/python/trunk/Lib/test/crashers/iter.py .
msg72171 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2008-08-29 23:11
This patch fixes Armin's list of crashers for trunk. Looking for others like them.
msg72172 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2008-08-29 23:17
Hopefully the right patch this time :/
msg72173 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-29 23:38
Amaury, if you decide to go forward with this, please clean-up the patches to eliminate common subexpressions. Wonder if there is an alternate solution that keeps the next slot filled with something that raises an Attribute error.
msg72175 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2008-08-30 00:03
I share Raymond's annoyance. The ultimate solution for segfaults is for bad pointer references to be catchable (trappable) the same as math errors are are now. I remember when 1/0 was fatal, not EDOM. Then the interpreter could print a traceback and SegfaultException message ;-) For this problem, I think there are alternatives to the current patch, which as Armin points out, would need to be extended to every built-in use of iterators. Is there any (sensible) way to make .__next__ (or the underlying tp_iternext slot) undelete-able? We already cannot change methods of built-ins. Or replace with def __next__(self): raise StopIteration. Or could iter() temporarily lock iterators in some way similar to what happens to dict during iteration? (Modifying actually does the action, and then raises RuntimeError). Modifying an active iterator strikes me as even less sane than modifying an underlying dict. The basic problem is that C code (unnecessarily?) does the same pointer tracing each iteration. Why not calculate np = *v->ob_type->tp_iternext just once? and just (more quickly) call np(v) each iteration? One could claim this is a semantic change, but so was the change to dicts. The equivalent in Python would be class BadIterator() : def __iter__(self) : return self def __next__(self) : # should be __next__ for python 3.0 del BadIterator.__next__ return 1 itnext = BadIterator().__next__ print(itnext()) print(itnext()) The second itnext call only fails because the del statement fails with AttributeError: __next__. I presume it would do the same if called from C. (My annoyance with Py3 changing .next to .__next__ is that it makes it harder to do the 'right thing' when iterating explicitly, which to me it to use a bound method. I wish next(it) returned it.__next__ instead of calling it.)
msg72177 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2008-08-30 00:09
Raymond, I think a different solution would be great, as the performance penalty might become nasty in tight loops if we miss some detail. Regarding the possible impact, I hope we can get a better estimate since the other examples of segfaulting that look like Armin's I've found are in itertools. I imagine you have the right tests to check the effect of any changes there.
msg72180 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-30 00:21
It's not just performance -- the patch code is grotesque. Lots of code would need to be changed just before the release candidate (usually a bad idea). And the underlying problem doesn't seem to be one that has *ever* affected a real user since 2.2. I have a hard time caring much about this problem. The minor performance hit only bugs me because it affects the inner- loops of just about every piece of real python code -- everyone will pay a cost for this change. Since the "problem" has existed so long with no ill effect, am unmarking the "release blocker" priority. Would rather have a thoughtful discussion on alternatives along with a careful, thorough, efficient patch for a bug release version.
msg72202 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-30 22:55
> Amaury, if you decide to go forward with this, please clean-up the > patches to eliminate common subexpressions. I already considered this, but generated machine code was identical, so I chose the more readable code. > Wonder if there is an alternate solution that keeps the next slot > filled with something that raises an Attribute error. That's an interesting idea, and not difficult to implement, see attached patch. The penalty here is an extra comparison in PyIter_Check(). And no change in the event loop... Armin's crashers now correctly raise a TypeError (and the final patch should convert them as unit tests)
msg72203 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-31 03:09
It was the ajaksu2 patches that needed clean-up.
msg72204 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-31 03:20
The next-nevernull patch is much cleaner than I expected. Nice work. The assertion in abstract.c can be changed to: assert(iter==PyObject_NextNotImplemented | PyIter_Check(iter)); Armin, are you happy with the new approach? Though "del obj.next" isn't doing exactly what you would expect, that seems harmless to me.
msg72205 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2008-08-31 09:21
Hacking at typeobject.c should always be done extremely carefully. I don't have time to review this patch as thouroughly as I think it needs to be. (A quick issue is that it seems to break PyIter_Check() which will always return true now, but I'd recommend a much more thourough review.)
msg72301 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-01 22:01
Did you notice that the definition of PyIter_Check() also changed? >>> class T(object): ... def __iter__(self): return self ... >>> iter(T()) Traceback (most recent call last): File "", line 1, in TypeError: iter() returned non-iterator of type 'T' Or did you refer to .so extensions modules that are not recompiled between 2.5 and 2.6? (I don't know if this still works)
msg72349 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2008-09-02 16:07
Maybe the file 'next-nevernull.patch' is not complete? I don't see any change in the definition of PyIter_Check().
msg72351 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-02 16:54
Oops, sorry. I have too many files opened at the same time. Here is an updated patch. I removed all the "assert(PyIter_Check(it))", our evil iterator used to make them fail anyway in debug mode; and in the case of a null function pointer, the C stack gives the same information.
msg79693 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-01-12 18:46
I think PyObject_NextNotImplemented should be renamed to _PyObject_NextNotImplemented. Aside from that, I think the patch is ready for committing.
msg79709 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-13 00:00
Any crash is a potential security problem. This needs to be fixed in 2.6.x without breaking 2.6.0 extension compatibility. (requiring extensions to be recompiled to fix the problem for an extension is fine, but they still need to load and work normally even if they have not been) As trunk r68560 can't be backported? what will? Is there an appropriate pre-existing hackish substitute for PyObj_NextNotImplemented or will it require adding a test anytime the tp_nextiter slot is used?
msg79710 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-13 00:03
Fixed in r68560 (trunk) and r68561 (py3k). Difficult to backport: extensions compiled with 2.6.x would not run on 2.6.0.
msg79711 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-13 00:47
Yes, a hack: What about setting tp_iternext to PyObject_GetIter? they happen to have the same signature. Yes, calling next() will call iter() instead; but an iterator is often its own iterator, and more importantly, PyIter_Check() is also called. And the error message is surprisingly almost correct: "TypeError: iter() returned non-iterator of type 'BadIterator'" It's not completely correct since the error occurred while calling the __next__() method of the iterator. See attached patch for 2.6. I can see one cause of incompatibility: if someone designed an extension type in C where tp_iternext is already PyObject_GetIter. It's is insane but valid, and the patch would break it. It's not worth the trouble I suppose...
msg79717 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-13 03:49
nice hack! :) I'm going to guess that existing code in the wild setting tp_iternext = &PyObject_GetIter is rare. I certainly can not rule it entirely out but I don't see anything in the open source world using http://www.google.com/codesearch I'd be okay with this hack. Let the release manager and/or BDFL make this call.
msg86204 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2009-04-20 19:51
I'm okay with that hack for 2.6 and 2.5.
msg104864 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-05-03 18:05
The fix works for 3.1.2, giving TypeError: iter() returned non-iterator of type 'BadIterator' Too late to patch 2.5. Is there any intention of patching 2.6 before its final bug-fix release, in a couple of months?
msg121020 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-11-12 05:26
Too late for 2.6.6 ;-)
History
Date User Action Args
2022-04-11 14:56:38 admin set github: 47970
2010-11-12 05:26:19 terry.reedy set status: open -> closedmessages: +
2010-05-03 18:05:08 terry.reedy set keywords: - patch, needs reviewmessages: + versions: - Python 2.5, Python 3.0
2010-05-03 06:06:20 belopolsky set keywords: + 26backport
2009-04-20 19:51:11 gvanrossum set nosy: + gvanrossummessages: +
2009-01-13 03:49:49 gregory.p.smith set messages: +
2009-01-13 00:47:52 amaury.forgeotdarc set status: pending -> openfiles: + next-nevernull-26.patchmessages: + keywords: + needs review
2009-01-13 00:03:34 amaury.forgeotdarc set status: open -> pendingkeywords: - needs reviewmessages: + resolution: fixed
2009-01-13 00:00:25 gregory.p.smith set nosy: + gregory.p.smithmessages: + components: + Interpreter Core
2009-01-12 18:46:23 benjamin.peterson set nosy: + benjamin.petersonmessages: +
2009-01-09 17:38:08 benjamin.peterson link issue4897 superseder
2008-11-10 11🔞59 vstinner set nosy: + vstinner
2008-09-16 18:30:51 jcea set nosy: + jcea
2008-09-02 16:54:45 amaury.forgeotdarc set files: - next-nevernull.patch
2008-09-02 16:54:38 amaury.forgeotdarc set files: + next-nevernull-2.patchmessages: +
2008-09-02 16:07:03 arigo set messages: +
2008-09-01 22:01:40 amaury.forgeotdarc set messages: +
2008-08-31 09:21:33 arigo set messages: +
2008-08-31 03:20:24 rhettinger set messages: +
2008-08-31 03:09:45 rhettinger set messages: +
2008-08-30 22:55:39 amaury.forgeotdarc set files: + next-nevernull.patchmessages: +
2008-08-30 00:21:38 rhettinger set priority: release blocker -> normalmessages: +
2008-08-30 00:09:11 ajaksu2 set messages: +
2008-08-30 00:03:40 terry.reedy set nosy: + terry.reedymessages: +
2008-08-29 23:38:51 rhettinger set messages: +
2008-08-29 23:17:47 ajaksu2 set files: + itercrashers.diffmessages: +
2008-08-29 23:16:20 ajaksu2 set files: - itercrashers.diff
2008-08-29 23:11:18 ajaksu2 set files: + itercrashers.diffnosy: + ajaksu2messages: + versions: + Python 2.6
2008-08-29 21:23:13 arigo set nosy: + arigomessages: +
2008-08-29 20:34:42 rhettinger set messages: +
2008-08-29 20🔞05 amaury.forgeotdarc set messages: +
2008-08-29 18:33:20 rhettinger set nosy: + rhettingermessages: +
2008-08-29 13:59:38 amaury.forgeotdarc set files: + baditer-2.6.patchmessages: +
2008-08-29 13:58:39 amaury.forgeotdarc set files: + baditer.patch
2008-08-29 13:58:22 amaury.forgeotdarc set files: - baditer.patch
2008-08-29 13:33:07 donmez set nosy: + donmezmessages: +
2008-08-29 12:34:30 amaury.forgeotdarc set files: + baditer.patchpriority: release blockermessages: + keywords: + needs review, patchnosy: + amaury.forgeotdarc
2008-08-29 04:19:31 gideon create