msg304782 - (view) |
Author: Sebastian Rittau (srittau) * |
Date: 2017-10-23 08:27 |
HTMLParser derives from _markupbase.ParserBase, which has the following method:

    class HTMLParser:
        ...
        def error(self, message):
            raise NotImplementedError(
                "subclasses of ParserBase must override error()")

HTMLParser does not implement this method and the documentation for HTMLParser (https://docs.python.org/3.6/library/html.parser.html) does not mention that its sub-classes need to override it. I am not sure whether this is a documentation omission, whether HTMLParser should provide an (empty?) implementation, or whether ParserBase should not raise a NotImplementedError (to make linters happy). |
|
|
msg304783 - (view) |
Author: Sebastian Rittau (srittau) * |
Date: 2017-10-23 08:29 |
The quoted code above should have used ParserBase:

    class ParserBase:
        ...
        def error(self, message):
            raise NotImplementedError(
                "subclasses of ParserBase must override error()") |
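To make the reported behaviour concrete, here is a minimal sketch of the pattern. The classes `ParserBaseSketch` and `HTMLParserSketch`, the method `parse_declaration`, and the helper `try_parse` are hypothetical stand-ins written for illustration; they are not the real `_markupbase` or `html.parser` code, and the parsing logic is deliberately simplified.

```python
# Hypothetical stand-ins; NOT the real standard-library classes.
class ParserBaseSketch:
    def error(self, message):
        # Same shape as the method quoted in the report.
        raise NotImplementedError(
            "subclasses of ParserBase must override error()")

    def parse_declaration(self, text):
        # Simplified assumption: anything not shaped like "<!...>" is an error.
        if not (text.startswith("<!") and text.endswith(">")):
            self.error("malformed declaration: %r" % text)
        return text[2:-1]


class HTMLParserSketch(ParserBaseSketch):
    # Like the HTMLParser described in the report, this subclass
    # does not override error().
    pass


def try_parse(text):
    """Return the parsed declaration, or a description of the failure."""
    parser = HTMLParserSketch()
    try:
        return parser.parse_declaration(text)
    except NotImplementedError as exc:
        return "NotImplementedError: %s" % exc
```

With these stand-ins, well-formed input parses normally, while malformed input surfaces a `NotImplementedError` whose message talks about subclassing rather than about the input, which is the confusing part of the report.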
|
|
msg305303 - (view) |
Author: William Ayd (William Ayd) |
Date: 2017-10-31 14:24 |
Would we be open to setting the metaclass of ParserBase to ABCMeta and making error an abstract method? That would at the very least make the expectation clearer for subclasses. I haven’t contributed to Python before but am open to this as a first attempt if the direction makes sense. |
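A rough sketch of what the ABCMeta suggestion could look like. The class names `AbstractParserBase`, `IncompleteParser`, `CompleteParser`, and the helper `can_instantiate` are hypothetical and exist only for this illustration; this is not how `_markupbase` is or was implemented.

```python
from abc import ABCMeta, abstractmethod


class AbstractParserBase(metaclass=ABCMeta):
    # Hypothetical base class: error() is declared abstract, so the
    # requirement is enforced when a subclass is instantiated.
    @abstractmethod
    def error(self, message):
        """Subclasses decide how parse errors are reported."""


class IncompleteParser(AbstractParserBase):
    # Does not override error(); cannot be instantiated.
    pass


class CompleteParser(AbstractParserBase):
    def error(self, message):
        raise ValueError(message)


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

The trade-off in this sketch is that the failure moves from the first call to `error()` to instantiation time, which would break any existing subclass that never overrides `error()`.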
|
|
msg305306 - (view) |
Author: William Ayd (William Ayd) |
Date: 2017-10-31 14:38 |
And assuming that the subclass requirement is intentional, we could add an optional keyword argument to HTMLParser that indicates what to do with errors, much like how encoding issues are handled within codecs. For backwards compatibility it can default to ignore, but fail and warn could be two alternative policies that the error method could account for. |
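One possible reading of the keyword-argument idea, as a sketch only. The class `PolicyParser`, its `errors` parameter, and the choice of `ValueError` for the fail policy are assumptions made for illustration; HTMLParser has no such argument.

```python
import warnings


class PolicyParser:
    # Hypothetical class; the policy names follow the comment above.
    _POLICIES = ("ignore", "warn", "fail")

    def __init__(self, errors="ignore"):
        if errors not in self._POLICIES:
            raise ValueError("unknown error policy: %r" % (errors,))
        self.errors = errors

    def error(self, message):
        if self.errors == "fail":
            # Assumption: ValueError stands in for a parser-specific exception.
            raise ValueError(message)
        if self.errors == "warn":
            warnings.warn(message, stacklevel=2)
        # "ignore": silently continue, which is the default in this sketch.
```

Under this sketch, `PolicyParser().error("bad markup")` does nothing, `PolicyParser(errors="warn")` emits a `UserWarning`, and `PolicyParser(errors="fail")` raises.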
|
|
msg322662 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2018-07-30 09:40 |
The HTMLParser.error() method was deprecated in Python 3.4 (https://github.com/python/cpython/commit/88ebfb129b59dc8a2b855fc93fcf32457128d64d#diff-1a7486df8279dbac7f20abd487947845R157) and removed in Python 3.5 (https://github.com/python/cpython/commit/73a4359eb0eb624c588c5d52083ea4944f9787ea#diff-1a7486df8279dbac7f20abd487947845L171).

_markupbase is a private and undocumented module, and its only user is HTMLParser (sgmllib was removed from the stdlib in 2008).

Since we have already removed HTMLParser.error(), I think we can just remove _markupbase.ParserBase.error() without a deprecation period. |
|
|
msg322672 - (view) |
Author: Sebastian Rittau (srittau) * |
Date: 2018-07-30 13:01 |
Good call. Maybe it's actually time to retire _markupbase and merge ParserBase into HTMLParser. |
|
|
msg323968 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2018-08-23 18:16 |
After triaging issue 34480, I realized that we can't simply remove the error() method because the _markupbase.ParserBase() class still uses it. I've just closed PR 8562. |
|
|
msg371500 - (view) |
Author: Cheryl Sabella (cheryl.sabella) *  |
Date: 2020-06-14 12:06 |
@berker.peksag's last comment said that he had closed the PR on 23 August 2018. However, he reopened it on 6 January 2020 after @ezio.melotti mentioned that they are both needed. The PR for this issue is waiting to be re-reviewed by Ezio. |
|
|
msg373745 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2020-07-16 06:13 |
New changeset e34bbfd61f405eef89e8aa50672b0b25022de320 by Berker Peksag in branch 'master': bpo-31844: Remove _markupbase.ParserBase.error() (GH-8562) https://github.com/python/cpython/commit/e34bbfd61f405eef89e8aa50672b0b25022de320 |
|
|
msg373746 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2020-07-16 06:39 |
New changeset d4d127f1c6e586036104e4101f5af239fe7dc156 by Berker Peksag in branch 'master': bpo-31844: Move whatsnew note to 3.10.rst (GH-21504) https://github.com/python/cpython/commit/d4d127f1c6e586036104e4101f5af239fe7dc156 |
|
|