Issue 24197: minidom parses comments wrongly
from xml.dom import minidom
html = """
"""
minidom.parseString(html)
Result:
Traceback (most recent call last):
  File "minidom.py", line 10, in <module>
    minidom.parseString(html)
  File "/usr/lib/python2.7/xml/dom/minidom.py", line 1928, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 940, in parseString
    return builder.parseString(string)
  File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 3, column 34
Tested versions: 2.7.6, 2.7.3
Reason: the "--" between "obraz" and "super" inside the comment.
Thanks for your report. Alas, according to the W3C XML 1.0 specification:
"For compatibility, the string "--" (double-hyphen) MUST NOT occur within comments."
So, it appears minidom (and other XML parsers) are correct in rejecting your example as not well-formed XML.
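To illustrate the rule, here is a minimal sketch. The two documents are hypothetical stand-ins, not the reporter's original input, and is_well_formed is a helper name made up for this example. It only shows that a comment containing "--" is rejected by the expat-based parser, while the same comment with a single hyphen is accepted.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text):
    """Return True if minidom can parse xml_text, False if expat rejects it."""
    try:
        minidom.parseString(xml_text)
    except ExpatError:
        return False
    return True


# Hypothetical inputs (assumed for illustration only).
bad = "<root><!-- obraz--super --></root>"   # "--" inside the comment
good = "<root><!-- obraz-super --></root>"   # single hyphen only

print(is_well_formed(bad))
print(is_well_formed(good))
```

If the input is under your control, one option is to avoid "--" inside comments before handing the text to minidom. Input that is really HTML rather than XML is probably better served by an HTML parser.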