[Python-bugs-list] [ python-Bugs-467059 ] htmllib broken
noreply@sourceforge.net
Wed, 10 Oct 2001 08:48:10 -0700
Bugs item #467059, was opened at 2001-10-01 20:25
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=467059&group_id=5470

Category: Python Library
Group: Irreproducible
Status: Closed
Resolution: None
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: htmllib broken
Initial Comment: Responding to a question in python-help about extracting links from web pages, I wrote a simple href printer (see attached file). When run using 2.2a4, it never prints anything. It outputs a list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere (in my code possibly, though it's pretty simple) or some semantics changed that I missed.
I thought maybe the method resolution order change affected things, but htmllib.HTMLParser only uses single inheritance. When displaying help about htmllib.HTMLParser, pydoc.help does emit the method resolution order, which it doesn't generally seem to do:
class HTMLParser(sgmllib.SGMLParser)
| Method resolution order:
| HTMLParser
| sgmllib.SGMLParser
| markupbase.ParserBase
...

Comment By: Skip Montanaro (montanaro)
Date: 2001-10-10 08:47

Message:
Logged In: YES
user_id=44345
I tried it again just now with the same input that was failing when I first submitted this bug. It worked this time (though the output was slightly different than when running against 2.1: the href parameter is a tuple instead of a string), so I went ahead and closed the bug instead of just leaving it pending. Something apparently changed in the past 9 days.
(Sorry for the delay responding. My procmail filters classed the message as spam...)
Skip
Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2001-10-04 13:05

Message:
Logged In: YES
user_id=3066
Please attach the input for which this fails. A trivial test case does not fail (see Lib/test/test_htmllib.py).
Set status to "pending".
Comment By: Skip Montanaro (montanaro)
Date: 2001-10-01 20:30

Message:
Logged In: YES
user_id=44345
SF apparently doesn't like file uploads from Opera, so it's pasted here...
import htmllib, formatter

class MyParser(htmllib.HTMLParser):
    def anchor_bgn(self, href, name, type):
        print href

fmt = formatter.NullFormatter()
parser = MyParser(fmt, verbose=1)
parser.feed(open("tour01.html").read())
parser.close()
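
For readers trying this on Python 3, where htmllib no longer exists, roughly the same href printer can be sketched with the standard html.parser module. This is only an illustration and not part of the original report: the class name HrefCollector, the helper extract_hrefs, and the sample markup are all made up for the example, and the script above reads tour01.html instead of an inline string.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names lowercased;
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html_text):
    """Hypothetical helper: return all anchor hrefs found in html_text."""
    parser = HrefCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


# Made-up sample input standing in for the page used in the report.
sample = (
    '<p><a href="a.html">A</a> and <a name="x">no link</a> '
    '<A HREF="b.html">B</A></p>'
)
print(extract_hrefs(sample))  # ['a.html', 'b.html']
```

Here each href arrives as a plain string inside the attrs list, so the string-versus-tuple difference mentioned in the later comment does not come up in this sketch.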