When I use the following regular expression to extract all objects from a PDF file, I get a "maximum recursion limit exceeded" error. The attached PDF file reproduces the error. If I do "import pre as re" instead, it works fine. Platform is Win2k, Python 2.2.1 build #34.

#######
import re
GETOBJECT = re.compile(r'\d+\s+\d+\s+obj.+?endobj', re.I|re.S|re.M)
pdf = open('userguide.pdf', 'rb').read()
all = GETOBJECT.findall(pdf)
print len(all)
Logged In: YES
user_id=7887

As Gary Herron correctly pointed out to me, this was fixed in 2.3 with the introduction of a new opcode to handle single-character non-greedy matching. This won't be fixed in 2.2.3, but it will hopefully be backported to 2.2.4 together with other regular expression fixes.
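
For anyone stuck on an affected release, one possible workaround is to keep the regular expression away from the long non-greedy span altogether: match only the short "N G obj" header with re, then locate the closing "endobj" with a plain substring search. The sketch below is an illustration only, not the fix that went into 2.3. The names OBJ_HEADER and find_objects are hypothetical, and the code is written in present-day Python 3 bytes syntax, so it would need adapting (for example, dropping the rb'' prefix) before it could run on 2.2.

```python
import re

# Matches only the short object header, e.g. b"12 0 obj".
# No ".+?" is involved, so the engine never walks the object body.
OBJ_HEADER = re.compile(rb'\d+\s+\d+\s+obj', re.I)


def find_objects(data):
    """Return every 'N G obj ... endobj' span found in data (bytes)."""
    lowered = data.lower()  # emulates re.I for the literal end marker
    objects = []
    pos = 0
    while True:
        header = OBJ_HEADER.search(data, pos)
        if header is None:
            break
        # The original pattern uses ".+?", which needs at least one
        # character between the header and "endobj", hence the + 1.
        end = lowered.find(b'endobj', header.end() + 1)
        if end == -1:
            break
        end += len(b'endobj')
        objects.append(data[header.start():end])
        pos = end  # resume after this object, as findall would
    return objects
```

With the report's file this would be called as find_objects(open('userguide.pdf', 'rb').read()). Because the body of each object is skipped with a substring search rather than matched one character at a time, the size of an object should not affect how deep the regular expression engine has to go.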