Issue 16766: small disadvantage of htmlentitydefs

Created on 2012-12-24 14:39 by WitcherGeralt, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (2)
msg178060 - Author: Al Korgun (WitcherGeralt) Date: 2012-12-24 14:39
>>> import htmlentitydefs
>>> htmlentitydefs.name2codepoint.get("quot")  # ok
34
>>> htmlentitydefs.name2codepoint.get("apos", "null")  # &apos; -> chr(39)
'null'
msg178148 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-12-25 16:42
That's because &apos; is not a valid character reference in HTML 4, only in HTML5/XML/XHTML. A mapping containing the HTML5 entities was added in Python 3.3. Modules like HTMLParser also include &apos; among the entities they recognize while parsing.
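
To illustrate the reply above, here is a minimal Python 3 sketch (not part of the original thread). It assumes Python 3.4 or later, where the Python 2 module htmlentitydefs is available as html.entities; the variable names are made up for illustration. It compares the HTML 4 table with the html5 mapping added in 3.3, whose keys carry the trailing semicolon.

```python
# Python 3 sketch; in Python 2 the module was named htmlentitydefs.
from html import unescape
from html.entities import html5, name2codepoint

# HTML 4 table: "quot" is defined, "apos" is not.
quot_codepoint = name2codepoint.get("quot")
apos_codepoint = name2codepoint.get("apos")

# HTML5 table (Python 3.3+): keys include the trailing semicolon.
apos_html5 = html5.get("apos;")

# html.unescape (Python 3.4+) resolves HTML5 named references.
apos_unescaped = unescape("&apos;")

print(quot_codepoint, apos_codepoint, repr(apos_html5), repr(apos_unescaped))
```

If the goal is only to decode text, html.unescape is likely the simpler choice, since it avoids looking entity names up by hand.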
History
Date User Action Args
2022-04-11 14:57:39 admin set github: 60970
2012-12-25 16:42:03 ezio.melotti set status: open -> closed; type: behavior; messages: +; assignee: ezio.melotti; resolution: not a bug; stage: resolved
2012-12-24 15:06:13 serhiy.storchaka set nosy: + ezio.melotti; versions: + Python 3.2, Python 3.3, Python 3.4, - Python 2.6
2012-12-24 14:44:50 WitcherGeralt set versions: + Python 2.6
2012-12-24 14:39:58 WitcherGeralt create