Non-strict mode seems to eat too far into the data and run past the endpos of the chunk being processed; the parser then gets confused and treats everything that follows as data. I didn't work out how to fix the regexp itself, but instead limited its span to :endpos so it does not eat too much. This seems to happen with unquoted attributes.
I was bitten by this bug today. I hope it will be fixed in the next release of Python 3. It is also possible to pass endpos as the third argument of search in line 285: m = attrfind_tolerant.search(rawdata, k, endpos). This seems to me to be a more 'natural' solution.
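For anyone reading along, here is a minimal sketch of why the endpos argument helps. The pattern below is a simplified, hypothetical stand-in, not the real attrfind_tolerant from html.parser, and the sample markup is made up for illustration. It only shows that a compiled pattern's search(string, pos, endpos) refuses to match past endpos.

```python
import re

# Hypothetical, simplified "tolerant" attribute pattern (an assumption for
# illustration; this is NOT the regexp used by html.parser).
attr_tolerant = re.compile(r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)\s*=\s*([^\s]*)')

rawdata = '<a href=foo>bar baz</a>'
k = 2                         # position just after the tag name "a"
endpos = rawdata.index('>')   # end of the start tag being processed

# Without endpos the unquoted value runs past the closing '>' of the tag.
unbounded = attr_tolerant.search(rawdata, k)

# With endpos the match is confined to the start tag.
bounded = attr_tolerant.search(rawdata, k, endpos)

print(repr(unbounded.group(2)), unbounded.end() > endpos)
print(repr(bounded.group(2)), bounded.end() <= endpos)
```

Passing endpos behaves as if the string ended at that index, so it has the same effect as slicing the data to :endpos without building a copy.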
This seems to be already fixed in 3.2/3.3, so I extracted the test from your script and added it to the test suite. If you can find a way to break the parser, let me know.
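A rough sketch of the kind of check that can be used to probe this, assuming a current Python 3 where HTMLParser is always tolerant. The EventCollector class and the sample input are hypothetical and are not the test that was added to the suite.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper that records parser callbacks in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(('starttag', tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(('endtag', tag))

    def handle_data(self, data):
        self.events.append(('data', data))


collector = EventCollector()
# Unquoted attribute value followed by text and an end tag.
collector.feed('<a href=foo>bar</a>')
collector.close()
events = collector.events
print(events)
```

If the attribute match ran past the end of the start tag, the end tag would be swallowed into the attribute value or reported as data instead of showing up as its own event.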
History
Date | User | Action | Args
2022-04-11 14:57:16 | admin | set | github: 56217
2011-11-01 12:46:40 | ezio.melotti | set | status: open -> closed; assignee: ezio.melotti; nosy: + ezio.melotti; messages: +; resolution: out of date; stage: resolved