Python 2.4.1, Red Hat Linux 7.3. This patch speeds up message parsing on files with large attachments by approximately 4x, mostly by replacing regular expressions with direct string processing.
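To illustrate the kind of change described, here is a minimal sketch (not the patch itself) of splitting text into lines with a capturing regular expression versus the built-in string method. The names `NEWLINE_RE`, `split_with_regex` and `split_with_str` are hypothetical, and the sketch is written for current Python rather than 2.4. It assumes the only line endings in the data are `\r\n`, `\r` and `\n`; on Python 3, `str.splitlines` also breaks on a few other characters.

```python
import re

# Hypothetical names for illustration; this is not the code from the patch.
NEWLINE_RE = re.compile(r'(\r\n|\r|\n)')


def split_with_regex(data):
    """Split data into lines, keeping line endings, using a capturing regex."""
    parts = NEWLINE_RE.split(data)
    # parts alternates: text, separator, text, separator, ..., trailing text
    lines = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
    if parts[-1]:
        # Final fragment with no line ending.
        lines.append(parts[-1])
    return lines


def split_with_str(data):
    """Same result for \\r\\n, \\r and \\n endings, without the regex engine."""
    return data.splitlines(True)
```

Both functions return the same list for such input, so any timing difference between them comes from the splitting strategy alone.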
Logged In: YES user_id=88157 A first examination reveals no particular speedup on an email with approximately 30 MB of attachments. Can the OP perhaps provide some code and test data I could time to verify the assertions of speedup? Otherwise I can't see much point in applying the patch.
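A timing harness along these lines could be used to check the claim. This is only a sketch under stated assumptions: the message is synthetic (random bytes, base64-encoded, inside a two-part multipart message), the helper names `build_message` and `time_parse` are hypothetical, and it targets current Python rather than 2.4. Real mail may behave differently.

```python
import base64
import email
import os
import timeit


def build_message(attachment_bytes):
    """Build a synthetic multipart message with one base64 attachment.

    Hypothetical test data; the addresses and boundary are placeholders.
    """
    payload = base64.encodebytes(os.urandom(attachment_bytes)).decode('ascii')
    return (
        'From: a@example.com\n'
        'To: b@example.com\n'
        'Subject: timing test\n'
        'MIME-Version: 1.0\n'
        'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
        '\n'
        '--BOUNDARY\n'
        'Content-Type: text/plain\n'
        '\n'
        'body\n'
        '--BOUNDARY\n'
        'Content-Type: application/octet-stream\n'
        'Content-Transfer-Encoding: base64\n'
        '\n'
        + payload +
        '--BOUNDARY--\n'
    )


def time_parse(text, repeat=3):
    """Return the best wall-clock time (seconds) for one full parse."""
    timings = timeit.repeat(
        lambda: email.message_from_string(text), number=1, repeat=repeat)
    return min(timings)
```

Running `time_parse(build_message(30 * 1024 * 1024))` before and after applying the patch would give comparable numbers for an attachment of roughly the size mentioned here.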
Logged In: YES user_id=12800 Here's a slightly better version, cleaned up for style and applicable to Python 2.5 (which is the only place I'd feel comfortable applying it). I've verified that this provides about a 3x speedup, at least for some messages with really big attachments.
Test fails with stack overflow:

======================================================================
ERROR: test_pushCR_LF (email.test.test_email.TestIterators)
FeedParser BufferedSubFile.push() assumed it received complete
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython2.7/Lib/email/test/test_email.py", line 2585, in test_pushCR_LF
    bsf.push(il)
  File "/home/serhiy/py/cpython2.7/Lib/email/feedparser.py", line 140, in push
    parts = _splitlines(data)
  File "/home/serhiy/py/cpython2.7/Lib/email/feedparser.py", line 170, in _splitlines
    lines.extend(_splitlines(part))
...
  File "/home/serhiy/py/cpython2.7/Lib/email/feedparser.py", line 170, in _splitlines
    lines.extend(_splitlines(part))
RuntimeError: maximum recursion depth exceeded
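The traceback shows `_splitlines` calling itself once per fragment, so input with very many line endings exhausts the recursion limit. One way to avoid that, sketched here under assumptions, is to scan the data in a loop instead. The function name `split_lines_iteratively` is hypothetical, this is not the patch's code, and the character-by-character scan is chosen for clarity rather than speed.

```python
def split_lines_iteratively(data):
    """Split data into lines, keeping \\r\\n, \\r and \\n endings.

    Hypothetical illustration: a plain loop, so stack depth does not
    grow with the number of lines in the input.
    """
    lines = []
    start = 0
    pos = 0
    n = len(data)
    while pos < n:
        ch = data[pos]
        if ch == '\n':
            pos += 1
        elif ch == '\r':
            # Treat \r\n as a single line ending.
            pos += 2 if data[pos + 1:pos + 2] == '\n' else 1
        else:
            pos += 1
            continue
        lines.append(data[start:pos])
        start = pos
    if start < n:
        # Trailing fragment without a line ending.
        lines.append(data[start:])
    return lines
```

With this shape, an input such as `'\r\n' * 100000` is handled without hitting the interpreter's recursion limit.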