[Python-Dev] is htmllib broken in 2.2a4?
Skip Montanaro skip@pobox.com (Skip Montanaro)
Mon, 1 Oct 2001 21:52:58 -0500
Responding to a question in python-help about extracting links from web pages, I wrote a simple href printer:
    import htmllib, formatter

    class MyParser(htmllib.HTMLParser):
        def anchor_bgn(self, href, name, type):
            print href

    fmt = formatter.NullFormatter()
    parser = MyParser(fmt, verbose=1)
    parser.feed(open("tour01.html").read())
    parser.close()
When run using 2.2a4, it never prints anything. It outputs a list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere (in my code possibly, though it's pretty simple) or some semantics changed that I missed.
I thought maybe the method resolution order change affected things, but htmllib.HTMLParser only uses single inheritance. When displaying help about htmllib.HTMLParser, pydoc.help does emit the method resolution order, which it doesn't generally seem to do:
    class HTMLParser(sgmllib.SGMLParser)
     |  Method resolution order:
     |      HTMLParser
     |      sgmllib.SGMLParser
     |      markupbase.ParserBase
     ...
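One way to sanity-check the method-resolution-order theory is sketched below. It assumes hypothetical stand-in classes that only mirror the shape of the chain pydoc reports; they are not the real htmllib, sgmllib, or markupbase classes, and the sketch says nothing about what 2.2a4 actually does internally. It only illustrates that with single inheritance the MRO is a straight line, so an override in the subclass should be the first definition found.

```python
# Hypothetical stand-ins mirroring the single-inheritance chain from the
# pydoc output. These are NOT the real htmllib/sgmllib/markupbase classes.
class ParserBase(object):
    pass


class SGMLParser(ParserBase):
    def anchor_bgn(self, href, name, type):
        return "base"


class HTMLParser(SGMLParser):
    pass


class MyParser(HTMLParser):
    def anchor_bgn(self, href, name, type):
        return "override"


# With single inheritance the MRO is just the chain from subclass to object.
mro_names = [cls.__name__ for cls in MyParser.__mro__]
print(mro_names)

# Attribute lookup walks that chain, so the subclass override is found first.
result = MyParser().anchor_bgn("http://example.invalid/", "", "")
print(result)
```

If the same check against the real classes also shows the override first in line, the MRO change is probably not the culprit, and the question becomes whether anchor_bgn is being called at all.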
Skip