Issue 1243730: Big speedup in email message parsing

Created on 2005-07-23 22:07 by lpd, last changed 2022-04-11 14:56 by admin.

Files
File name Uploaded Description
t.dif lpd, 2005-07-23 22:07 Patches for email/FeedParser.py
1243730.diff barry, 2006-05-28 01:12
Messages (5)
msg48615 - Author: L. Peter Deutsch (lpd) Date: 2005-07-23 22:07
Python 2.4.1, Red Hat Linux 7.3. Speeds up message parsing on files with large attachments by approximately 4x, mostly by replacing REs by direct string processing.
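The patch itself is in t.dif and is not reproduced here. As a rough illustration of what "replacing REs by direct string processing" can look like, the sketch below splits text into lines once with a regular expression and once with a plain string method. It is written for current Python 3, not 2.4.1, and the names `NEWLINE_RE`, `split_with_regex`, and `split_with_str` are hypothetical and not taken from the patch.

```python
import re

# Illustrative pattern: match any of the three common line endings and
# capture it so that re.split() keeps the terminator in its result.
NEWLINE_RE = re.compile(r'(\r\n|\r|\n)')


def split_with_regex(data):
    """Split data into lines (terminators kept) using a regular expression."""
    parts = NEWLINE_RE.split(data)
    # parts alternates: text, terminator, text, terminator, ..., trailing text
    lines = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
    if parts[-1]:
        # Trailing text with no terminator is an incomplete last line.
        lines.append(parts[-1])
    return lines


def split_with_str(data):
    """Split data into lines (terminators kept) using a str method only."""
    return data.splitlines(keepends=True)


sample = "Subject: hi\r\nbody line 1\rbody line 2\ntrailing"
assert split_with_regex(sample) == split_with_str(sample)
```

Note that `str.splitlines()` also treats a few other characters (for example form feed and `\u2028`) as line boundaries, so the two functions agree only on input whose line breaks are limited to `\r\n`, `\r`, and `\n`.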
msg48616 - Author: Steve Holden (holdenweb) * (Python committer) Date: 2006-05-25 22:55
A first examination reveals no particular speedup on an email with approximately 30 MB of attachments. Can the OP perhaps provide some code and test data I could time to verify the assertions of speedup? Otherwise I can't see much point in applying the patch.
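A timing harness of the kind requested here might look like the sketch below. It is an assumption-laden illustration, not code from this thread: it targets current Python 3, builds a synthetic message with one large base64 attachment instead of using real test data, and the helper names `build_message` and `time_parse` as well as the default sizes are hypothetical.

```python
import time
from email.feedparser import FeedParser
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_message(payload_bytes=5 * 1024 * 1024):
    """Return the text of a multipart message with one binary attachment."""
    msg = MIMEMultipart()
    msg['Subject'] = 'timing test'
    # MIMEApplication base64-encodes the payload by default.
    msg.attach(MIMEApplication(b'\0' * payload_bytes))
    return msg.as_string()


def time_parse(text, chunk_size=8192):
    """Feed text to FeedParser in chunks; return (seconds, parsed message)."""
    parser = FeedParser()
    start = time.perf_counter()
    for i in range(0, len(text), chunk_size):
        parser.feed(text[i:i + chunk_size])
    message = parser.close()
    return time.perf_counter() - start, message


if __name__ == '__main__':
    elapsed, _ = time_parse(build_message())
    print('parsed in %.3f s' % elapsed)
```

Running the same script against a patched and an unpatched `feedparser.py` would give a first comparison; results will depend heavily on attachment size and chunk size.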
msg48617 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2006-05-28 01:12
Here's a slightly better version, cleaned up for style and applicable to Python 2.5 (which is the only place I'd feel comfortable applying it). I've verified that this provides about a 3x speedup, at least for some messages with really big attachments.
msg124717 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-27 17:17
Since this is a performance hack and is considerably invasive of the feedparser code (and needs updating), I'm deferring it to 3.3.
msg184191 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-03-14 20:46
Test fails with stack overflow:

======================================================================
ERROR: test_pushCR_LF (email.test.test_email.TestIterators)
FeedParser BufferedSubFile.push() assumed it received complete
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython2.7/Lib/email/test/test_email.py", line 2585, in test_pushCR_LF
    bsf.push(il)
  File "/home/serhiy/py/cpython2.7/Lib/email/feedparser.py", line 140, in push
    parts = _splitlines(data)
  File "/home/serhiy/py/cpython2.7/Lib/email/feedparser.py", line 170, in _splitlines
    lines.extend(_splitlines(part))
...
  File "/home/serhiy/py/cpython2.7/Lib/email/feedparser.py", line 170, in _splitlines
    lines.extend(_splitlines(part))
RuntimeError: maximum recursion depth exceeded
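The traceback shows a recursive `_splitlines` helper exhausting the interpreter stack. One way to avoid recursion altogether is to split iteratively and hold back any trailing piece that might still be incomplete, including a bare `\r` whose `\n` could arrive in the next chunk. The sketch below illustrates that idea in current Python 3; the class `ChunkedLineSplitter` and its attributes are hypothetical and are not the patch's code or the actual `BufferedSubFile` implementation.

```python
class ChunkedLineSplitter:
    """Collect complete lines from data pushed in arbitrary chunks."""

    def __init__(self):
        self._partial = ''   # held-back text that may not be a full line yet
        self.lines = []      # complete lines, terminators kept

    def push(self, data):
        # Prepend whatever was held back from the previous push.
        data = self._partial + data
        self._partial = ''
        parts = data.splitlines(keepends=True)
        # Hold back the last piece unless it clearly ends a line with '\n'.
        # This covers both unterminated text and a trailing bare '\r' that
        # may turn out to be the first half of a '\r\n' pair.
        if parts and not parts[-1].endswith('\n'):
            self._partial = parts.pop()
        self.lines.extend(parts)

    def close(self):
        # No more data is coming, so flush the held-back remainder as is.
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ''


splitter = ChunkedLineSplitter()
for chunk in ('a\r', '\nb\n', 'c'):
    splitter.push(chunk)
splitter.close()
assert splitter.lines == ['a\r\n', 'b\n', 'c']
```

Because each `push()` does a single `splitlines()` call and a constant amount of bookkeeping, the call depth does not grow with the number of line terminators in the input.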
History
Date User Action Args
2022-04-11 14:56:12 admin set status: pending -> open; github: 42213
2017-10-28 12:23:16 serhiy.storchaka set status: open -> pending
2014-01-23 21:32:49 serhiy.storchaka set versions: + Python 3.5, - Python 3.4
2013-04-13 17:08:13 serhiy.storchaka set stage: patch review -> needs patch
2013-03-14 20:46:00 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +
2013-03-14 08:00:07 ezio.melotti set versions: + Python 3.4, - Python 3.3
2012-05-16 01:23:15 r.david.murray set keywords: - easy; assignee: r.david.murray -> ; components: + email
2010-12-27 17:17:26 r.david.murray set nosy: barry, holdenweb, lpd, ajaksu2, r.david.murray; versions: + Python 3.3, - Python 3.1, Python 2.7, Python 3.2; messages: +; stage: test needed -> patch review
2010-07-20 03:18:04 BreamoreBoy set versions: + Python 3.2
2010-05-05 13:45:09 barry set assignee: barry -> r.david.murray; nosy: + r.david.murray
2009-04-22 14:36:47 ajaksu2 set keywords: + easy
2009-03-20 22:12:45 ajaksu2 set versions: + Python 3.1, Python 2.7, - Python 2.5; nosy: + ajaksu2; components: + Library (Lib), - Interpreter Core; type: performance; stage: test needed
2005-07-23 22:07:37 lpd create