Issue 26551: Regex.finditer infinite loops with certain input (original) (raw)

Issue26551

Created on 2016-03-13 10:15 by Alan Mislove, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
bug.py Alan Mislove,2016-03-13 10:15 Example demonstrating bug
Messages (4)
msg261692 - (view) Author: Alan Mislove (Alan Mislove) Date: 2016-03-13 10:15
I found a regex and input that causes re.finditer() to appear to get into an infinite loop. Please see the attached minimal python script that triggers the behavior. I've verified the bug exists on 2.7.6, 3.4.0, and 3.5.1; I haven't yet be able to test whether it also exists in the latest 3.6.
msg261693 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-13 10:36
This is finite loop. Your script just needs too much time to finish.
msg261694 - (view) Author: Alan Mislove (Alan Mislove) Date: 2016-03-13 10:45
Thanks for the quick reply, Serhiy! You're right -- after letting it run for even longer, it does complete. Sorry for the trouble. I have two quick followup questions: 1. Would this be considered a performance bug? On my machine, it runs for over 20 seconds before completing. 2. This regex was a much-simplified version of a larger one (I was trying to isolate the problem for the bug report). When running the larger regex on the same input, it ran for multiple days without completing. Should I send the larger regex along, or is this issue out of scope?
msg261697 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-13 11:51
This is known issue. Some regular expressions has quadratic or even exponential complexity. Existing implementation can't be easy fixed. You can try an regular expressions engine that use completely different algorithms, i.e. re2 (https://pypi.python.org/pypi/re2). Or rewrite your regular expression. You can ask for help on the comp.lang.python newsgroup (news:comp.lang.python). It is also accessible as a mailing list (https://mail.python.org/mailman/listinfo/python-list). There is a Web-interface (http://dir.gmane.org/gmane.comp.python.general).
History
Date User Action Args
2022-04-11 14:58:28 admin set github: 70738
2016-03-13 11:51:35 serhiy.storchaka set messages: +
2016-03-13 10:45:52 Alan Mislove set messages: +
2016-03-13 10:36:51 serhiy.storchaka set status: open -> closednosy: + serhiy.storchakamessages: + resolution: not a bugstage: resolved
2016-03-13 10:15:14 Alan Mislove create