Message 102077 - Python tracker
Ezio Melotti wrote:
> Ezio Melotti <ezio.melotti@gmail.com> added the comment:
>
> Here is an incomplete patch. It seems to solve the problem but I still have to add more tests and check it better.
Thanks. Please also check whether it's worthwhile unrolling those loops by hand.
> I also wonder if the sequences with the first byte in range F5-FD (start of 4/5/6-byte sequences, restricted by RFC 3629) should behave in the same way. Right now they just "eat" the following 4/5/6 chars without checking.
I think we need to do this all the way, even though 5 and 6 byte sequences are not used at the moment.
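
To make the byte ranges under discussion concrete, here is a minimal sketch in Python. It is not the patch itself: lead_byte_info is a hypothetical helper written for illustration, and it only classifies a lead byte by the sequence length it announces under the original UTF-8 scheme and by whether RFC 3629 still permits it.

```python
def lead_byte_info(lead):
    """Classify a UTF-8 lead byte (hypothetical helper, for illustration only).

    Returns (length, allowed):
      length  - sequence length announced by the byte under the original
                1- to 6-byte UTF-8 scheme; 0 if it cannot start a sequence
      allowed - True if RFC 3629 permits the byte as a start byte
    """
    if lead <= 0x7F:
        return 1, True      # ASCII
    if lead <= 0xBF:
        return 0, False     # continuation byte, never a start byte
    if lead <= 0xC1:
        return 2, False     # could only encode overlong forms
    if lead <= 0xDF:
        return 2, True
    if lead <= 0xEF:
        return 3, True
    if lead <= 0xF4:
        return 4, True      # up to U+10FFFF
    if lead <= 0xF7:
        return 4, False     # F5-F7: beyond U+10FFFF
    if lead <= 0xFB:
        return 5, False     # F8-FB: 5-byte form, removed by RFC 3629
    if lead <= 0xFD:
        return 6, False     # FC-FD: 6-byte form, removed by RFC 3629
    return 0, False         # FE and FF never appear in UTF-8


for lead in (0xF4, 0xF5, 0xF8, 0xFC):
    print(hex(lead), lead_byte_info(lead))
```

Every byte in F5-FD announces a 4-, 5- or 6-byte sequence but is not a permitted start byte, which is why the question is how many of the following bytes the decoder should consume when it reports the error.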