[Python-Dev] sgmllib Comments
Terry Reedy tjreedy at udel.edu
Mon Jun 12 04:06:16 CEST 2006
"Fred L. Drake, Jr." <fdrake at acm.org> wrote in message news:200606112039.37834.fdrake at acm.org...
On Sunday 11 June 2006 16:26, Sam Ruby wrote:
> Planet is a feed aggregator written in Python. It depends heavily on
> SGMLLib. A recent bug report turned out to be a deficiency in sgmllib,
> and I've submitted a test case and a patch[1] (use or discard the patch,
> it is the test that I care about).
...
> and which are original. (Note: feeds often contain such abominations as
> © which the new code will treat indistinguishably from ©)
It really sounds like sgmllib is the wrong foundation for this. ... Have you looked at HTMLParser as an alternate to sgmllib? It has better support for XHTML constructs.
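For readers who want to see what the HTMLParser suggestion looks like in practice, here is a minimal sketch. It assumes a modern Python 3, where the module lives at html.parser (in 2006 it was the top-level HTMLParser module). The names EntityLogger and parse_events are hypothetical and belong neither to Planet nor to the patch under discussion. The sketch only shows that the parser can report named references, numeric references, and XHTML-style empty tags as separate events.

```python
from html.parser import HTMLParser


class EntityLogger(HTMLParser):
    """Hypothetical helper: record parser events in the order they occur."""

    def __init__(self):
        # convert_charrefs=False keeps &name; and &#NNN; as separate events
        # instead of folding them into the surrounding text data.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style empty elements such as <br/>.
        self.events.append(("startend", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Named reference, e.g. &copy; arrives here as "copy".
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Numeric reference, e.g. &#169; arrives here as "169".
        self.events.append(("charref", name))


def parse_events(markup):
    """Feed markup to a fresh EntityLogger and return the recorded events."""
    parser = EntityLogger()
    parser.feed(markup)
    parser.close()
    return parser.events
```

Calling parse_events("<p>&copy;&#169;<br/></p>") returns a list of tuples in which the named and the numeric reference show up under different event kinds, so a caller can tell them apart and decide how to re-serialize each one.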
Have you (the OP) checked how related Python projects, such as Mark Pilgrim's feed parser (http://www.feedparser.org/), handle the same sort of input? (I have only looked at the docs and tests, not the code.)
tjr