[Python-Dev] Decoding incomplete unicode
"Martin v. Löwis" martin at v.loewis.de
Wed Aug 18 06:57:02 CEST 2004
- Previous message: [Python-Dev] Decoding incomplete unicode
- Next message: [Python-Dev] Decoding incomplete unicode
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
M.-A. Lemburg wrote:
I've thought about this some more. Perhaps I'm still missing something, but wouldn't it be possible to add a feeding mode to the existing stream codecs by creating a new queue data type (much like the queue you have in the test cases of your patch) and using the stream codecs on these?
Here is the problem. In UTF-8, how does the decoding algorithm tell the application that the bytes it received yield three fully decoded characters, that two bytes are left undecoded, and that those two bytes are not inherently ill-formed but merely lack a third byte to complete the multi-byte sequence?
On top of that, you can implement whatever queuing or streaming APIs you want, but you need an efficient way to communicate incompleteness.
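To make the question concrete, here is a minimal sketch of one way a decoder can report incompleteness to its caller. It uses the incremental decoder API from the codecs module in current Python, which is not necessarily the interface under discussion in this thread; the helper name decode_available and its three-part return value are hypothetical, chosen only for illustration.

```python
import codecs


def decode_available(data: bytes):
    """Decode as much of `data` as possible (hypothetical helper).

    Returns (text, consumed, pending):
      text     - the characters that could be fully decoded
      consumed - how many input bytes those characters used
      pending  - trailing bytes that form a valid but unfinished sequence
    Ill-formed input still raises UnicodeDecodeError.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    # final=False means "more input may follow", so a truncated
    # multi-byte sequence at the end is buffered instead of rejected.
    text = decoder.decode(data, final=False)
    # For the buffered UTF-8 decoder, the first element of the state
    # is the not-yet-decoded input.
    pending, _flags = decoder.getstate()
    return text, len(data) - len(pending), pending


# Three complete characters followed by the first two bytes of a
# three-byte sequence (U+20AC is E2 82 AC in UTF-8).
text, consumed, pending = decode_available(b"abc\xe2\x82")

# Once the missing byte arrives, the pending bytes decode normally.
rest, _, leftover = decode_available(pending + b"\xac")
```

Here the first call reports three decoded characters, three consumed bytes, and two pending bytes, while a genuinely ill-formed input such as b"abc\xff" raises UnicodeDecodeError instead. A queue or stream layer built on top only has to hold on to the pending bytes until more data arrives.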
Regards, Martin