Issue 711632: htmllib.HTMLParser.anchorlist problem - Python tracker
Created on 2003-03-29 00:26 by cpgray, last changed 2022-04-10 16:07 by admin. This issue is now closed.
Messages (3) | ||
---|---|---|
msg15290 | Author: Chris Gray (cpgray) | Date: 2003-03-29 00:26 |
htmllib.HTMLParser.anchorlist is cleared when __init__() is called but not when reset() is called. Processing more than one document with the same instance therefore accumulates the anchors from every document in one list. Arguably a feature, not a bug, but it makes sense for reset() to clear whatever __init__() initializes. Here is an illustrative IDLE session: Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. IDLE 0.8 -- press F1 for help >>> import htmllib >>> import formatter >>> p = htmllib.HTMLParser(formatter.NullFormatter()) >>> p.feed('<a href="http://www.python.org">Python</a>') >>> p.anchorlist ['http://www.python.org'] >>> p.reset() >>> p.feed('<a href="http://sourceforge.net/">Sourceforge</a>') >>> p.anchorlist ['http://www.python.org', 'http://sourceforge.net/'] | ||
msg15291 | Author: Andrew Gaul (gaul) | Date: 2003-08-22 08:44 |
See patch 793021 for a fix. | ||
msg15292 | Author: Martin v. Löwis (loewis) | Date: 2003-09-12 16:38 |
Fixed with #793021. |
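
The behaviour requested in this report can be sketched on a current interpreter. htmllib no longer exists in Python 3, so the following is only an illustration built on html.parser.HTMLParser, not the code from patch 793021. The class name AnchorCollector and the helper collect() are hypothetical. The sketch relies on HTMLParser.__init__() calling reset(), so initializing anchorlist inside reset() covers both construction and reuse of the instance.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Hypothetical stand-in for htmllib.HTMLParser's anchor tracking."""

    def reset(self):
        # HTMLParser.__init__() calls reset(), so state created here is
        # set up at construction time and cleared again on every reset().
        super().reset()
        self.anchorlist = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every <a> tag that has one.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.anchorlist.append(value)


def collect(parser, document):
    """Parse one document with a reused parser and return its anchors."""
    parser.reset()
    parser.feed(document)
    parser.close()
    return list(parser.anchorlist)


p = AnchorCollector()
first = collect(p, '<a href="http://www.python.org">Python</a>')
second = collect(p, '<a href="http://sourceforge.net/">Sourceforge</a>')
print(first)
print(second)
```

With this arrangement the second list contains only the anchors of the second document, which is the behaviour the reporter expected from reset().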
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:07:56 | admin | set | github: 38229 |
2003-03-29 00:26:11 | cpgray | create |