Issue 25017: htmllib deprecated: Which library to use? Missing sane default in docs

Created on 2015-09-07 09:01 by guettli, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded
htmllib_deprecation_warning.patch Nan Wu, 2015-10-16 20:52
htmllib_deprecation_warning_2.patch Nan Wu, 2015-10-21 12:56
htmllib_deprecation_warning_3.patch martin.panter, 2015-11-13 02:44
Messages (17)
msg250088 - (view) Author: Thomas Guettler (guettli) * Date: 2015-09-07 09:01
At the top of the htmllib module documentation:
> Deprecated since version 2.6: The htmllib module has been removed in Python 3.
Source: https://docs.python.org/2/library/htmllib.html#module-htmllib
Newcomers need more advice: which library should be used? I know there are many HTML parsing libraries, but there should be a sane default for newcomers. Is there already agreement on a sane default HTML parsing library?
msg250092 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-09-07 09:50
PEP 3108 says “Superseded by HTMLParser”. I presume this means Python 3’s “html.parser” module (called “HTMLParser” in Python 2). I guess a lot of work would be involved in changing existing code over, but it shouldn’t be much of a problem for someone writing new code.
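As a rough illustration of what new code against the replacement module might look like, here is a minimal Python 3 sketch using html.parser. The class name LinkCollector and the sample markup are made up for this example; only HTMLParser, feed(), close(), and handle_starttag() come from the standard library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical example class: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


parser = LinkCollector()
parser.feed('<p>See <a href="https://docs.python.org/3/">the docs</a>.</p>')
parser.close()
print(parser.links)
```

Unlike htmllib, this API is callback-based: the parser does no formatting itself, and the subclass decides what to do with each tag.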
msg250123 - (view) Author: Thomas Guettler (guettli) * Date: 2015-09-07 19:54
This issue is just about documentation. No code change is required for it. How should the docs be updated to point to html.parser?
msg250125 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2015-09-07 20:07
If you want to create a patch, you have to edit the file Doc/library/htmllib.rst in the 2.7 branch. You can find information about cloning the CPython repository and switching branch in the devguide. The warning should suggest :mod:`HTMLParser` for Python 2 and the equivalent :mod:`html.parser` for Python 3.
msg253098 - (view) Author: Nan Wu (Nan Wu) * Date: 2015-10-16 20:52
Added a small patch for this change.
msg253274 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-10-21 03:17
Thanks for the patch. I think we can move the Python 3 part of the patch to a new note directive (similar to the example in the httplib documentation: https://docs.python.org/2/library/httplib.html). For example:
    .. deprecated:: 2.6
       Use 📳`HTMLParser` instead.
    .. note::
       The :mod:`htmllib` module has been removed in Python 3. Use :mod:`html.parser` (equivalent of 📳`HTMLParser`) instead.
msg253279 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-21 08:02
Also beware it should be :mod: not 📳 :)
msg253285 - (view) Author: Nan Wu (Nan Wu) * Date: 2015-10-21 12:56
Updated the patch. The typo was fixed too. Thanks for catching it.
msg253533 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-27 12:35
This looks good enough to me. I would have probably avoided littering the page with too many Deprecated and Note boxes, but I can respect your and Berker’s preference to add the separate box.
msg253541 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-27 14:24
The note should actually be parallel to the http one (assuming 2to3 does do the translation), rather than say "use instead", which would be incorrect advice for a python2 user :)
msg253562 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-27 20:41
Not quite. This is a two-step deprecation:
1. “htmllib” is removed in favour of HTMLParser. The API is different, so no automatic 2to3 change would be practical.
2. HTMLParser is renamed to “html.parser”, and 2to3 handles this. This is already documented at <https://docs.python.org/2/library/htmlparser.html>.
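To make the second step concrete, here is a small sketch of how code that must run on both Python 2 and Python 3 could cope with the rename without relying on 2to3. The TagCounter class and the sample markup are hypothetical; the try/except import is a common idiom, not something prescribed in this issue.

```python
try:
    # Python 3 location of the module.
    from html.parser import HTMLParser
except ImportError:
    # Python 2 name of the same module.
    from HTMLParser import HTMLParser


class TagCounter(HTMLParser):
    """Hypothetical example class: counts the start tags it sees."""

    def __init__(self):
        # Explicit base-class call, since HTMLParser is an old-style
        # class on Python 2 and super() would not work there.
        HTMLParser.__init__(self)
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


counter = TagCounter()
counter.feed("<ul><li>one</li><li>two</li></ul>")
counter.close()
print(counter.count)
```

Only the import line differs between the two versions. Code written against htmllib, by contrast, has to be ported by hand because the API is different.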
msg253565 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-27 21:40
OK, then the note should be dropped.
msg254256 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-07 05:59
David: are you saying you like the first patch better (ignoring the markup mistakes)?
msg254313 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-11-07 23:21
Yes, though I hadn't looked at it before this :)
msg254582 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-13 02:44
Here is a cleaned-up version of Nan’s first patch.
msg254586 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-11-13 03:11
htmllib_deprecation_warning_3.patch looks good to me.
msg254639 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-11-14 00:45
New changeset 7bc8f56ef1f3 by Martin Panter in branch '2.7':
Issue #25017: Document that htmllib is superseded by module HTMLParser
https://hg.python.org/cpython/rev/7bc8f56ef1f3
History
Date User Action Args
2022-04-11 14:58:20 admin set github: 69205
2015-11-14 00:48:43 martin.panter set status: open -> closed; resolution: fixed; stage: commit review -> resolved
2015-11-14 00:45:13 python-dev set nosy: + python-dev; messages: +
2015-11-13 03:11:07 berker.peksag set messages: +; stage: patch review -> commit review
2015-11-13 02:44:24 martin.panter set files: + htmllib_deprecation_warning_3.patch; messages: +
2015-11-07 23:21:51 r.david.murray set messages: +
2015-11-07 05:59:03 martin.panter set messages: +
2015-10-27 21:40:28 r.david.murray set messages: +
2015-10-27 20:41:58 martin.panter set messages: +
2015-10-27 14:24:32 r.david.murray set nosy: + r.david.murray; messages: +
2015-10-27 12:35:01 martin.panter set messages: +
2015-10-21 12:56:55 Nan Wu set files: + htmllib_deprecation_warning_2.patch; messages: +
2015-10-21 08:02:37 martin.panter set messages: +
2015-10-21 03:17:45 berker.peksag set nosy: + berker.peksag; messages: +; stage: needs patch -> patch review
2015-10-16 20:52:11 Nan Wu set files: + htmllib_deprecation_warning.patch; nosy: + Nan Wu; messages: +; keywords: + patch
2015-09-08 11:50:25 berker.peksag set keywords: + easy; stage: needs patch
2015-09-07 20:07:55 ezio.melotti set messages: +
2015-09-07 19:58:32 berker.peksag set nosy: + ezio.melotti
2015-09-07 19:54:19 guettli set messages: +
2015-09-07 09:50:08 martin.panter set nosy: + martin.panter; messages: +; versions: + Python 2.7
2015-09-07 09:01:37 guettli create