Issue 13273: HTMLParser improperly handling open tags when strict is False

This is encountered when extending html.parser.HTMLParser and running with strict set to False.

Expected behavior: When '''

The rain
in Spain
''' is passed to the feed method, div, b, a, br, and span should all be passed to the handle_starttag method.

Actual behavior: The handle_data method receives the values

,,,
, in addition to the regular text.
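
One way to check the behavior is a small subclass that records what each handler receives. This is only a sketch: the markup below is a hypothetical stand-in (the snippet from the report did not survive), and the RecordingParser class and its attribute names are made up for illustration. The strict argument is left out because it was removed in Python 3.5, where the parser always runs in tolerant mode.

```python
from html.parser import HTMLParser


class RecordingParser(HTMLParser):
    """Hypothetical helper that records start tags and text data."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.data = []

    def handle_starttag(self, tag, attrs):
        # Every opening tag should arrive here, not in handle_data.
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.data.append(data)


# Hypothetical sample input; not the exact markup from the report.
sample = '<div><b><a href="#">The rain</a><br>in <span>Spain</span></b></div>'

parser = RecordingParser()
parser.feed(sample)
parser.close()

print(parser.start_tags)
print("".join(parser.data))
```

If any tag text such as "<b>" shows up in the recorded data, the parser is treating open tags as plain text, which is the symptom described above.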

This can be fixed by changing this (inside the parse_starttag method):

m = hparse.attrfind_tolerant.search(rawdata, k)

to

m = hparse.attrfind_tolerant.match(rawdata, k)
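
The difference between the two calls can be seen in isolation. With a compiled pattern, search(string, pos) scans forward from pos until it finds a match anywhere later in the string, while match(string, pos) only succeeds if the pattern matches exactly at pos. The sketch below uses a simplified, hypothetical name pattern, not the real attrfind_tolerant, and a made-up input string.

```python
import re

# Simplified, hypothetical pattern for a tag or attribute name.
# This is NOT the real attrfind_tolerant from html.parser.
name_pattern = re.compile(r'[a-zA-Z_][-.:a-zA-Z0-9_]*')

rawdata = '<b><a href="x">'
k = 2  # index just after "<b", pointing at the closing ">"

# search() skips ahead and finds a name inside the NEXT tag.
searched = name_pattern.search(rawdata, k)

# match() is anchored at k, so it finds nothing at the ">".
matched = name_pattern.match(rawdata, k)

print(searched.start(), searched.group())
print(matched)
```

Here search runs past the end of the first tag and picks up text from the following one, whereas match returns None at the closing ">", which appears to be the behavior the proposed change relies on.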