[Python-Dev] Decoding incomplete unicode
Walter Dörwald walter at livinglogic.de
Thu Aug 19 17:45:26 CEST 2004
Martin v. Löwis wrote:
Walter Dörwald wrote:
They will not, because StreamReader.decode() is already a feed-style API (but with state amnesia).
Any stream decoder that I can think of can be (and most are) implemented by overriding decode().
I consider that an unfortunate implementation artefact. You either use the stateless encode/decode that you get from codecs.getencoder()/codecs.getdecoder(), or you use the file API on the streams. You never ever use encode/decode on streams.
That is exactly the problem with the current API. StreamReader mixes two concepts:
- The stateful API, which allows decoding byte input in chunks, with the state of the decoder kept between calls.
- A file API where the chunks to be decoded are read from a byte stream.
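The two concepts can be illustrated with a minimal sketch. It assumes a current Python 3 and uses codecs.getincrementaldecoder(), which was added to the standard library after this thread and is not the API under discussion here; the sample text and the chunk boundary are made up for illustration.

```python
import codecs
import io

data = "D\u00f6rwald".encode("utf-8")   # the second character takes two bytes
chunks = [data[:2], data[2:]]           # the split falls inside that character

# Concept 1: stateful decoding. The caller supplies the chunks and the
# decoder remembers the incomplete byte sequence between calls.
decoder = codecs.getincrementaldecoder("utf-8")()
parts = [decoder.decode(chunk) for chunk in chunks]
parts.append(decoder.decode(b"", final=True))  # flush; raises if bytes are left over
stateful_result = "".join(parts)

# Concept 2: file API. The reader pulls its own chunks from a byte stream.
reader = codecs.getreader("utf-8")(io.BytesIO(data))
file_result = reader.read()
```

Both paths produce the same text; the difference is only who drives the input. In the second path, in-memory bytes have to be wrapped in a stream object first.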
I would have preferred it if the default .write implementation had called self.internalencode, and if the Writer contained a Codec rather than inheriting from Codec.
This would separate the two concepts from above.
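A rough sketch of the "contain rather than inherit" idea, assuming a current Python 3. The class name ContainingWriter is hypothetical, and codecs.getincrementalencoder() stands in for the contained codec object; it is a later standard-library addition, not something proposed in this thread.

```python
import codecs
import io


class ContainingWriter:
    """Hypothetical writer that contains an encoder instead of inheriting from a codec."""

    def __init__(self, stream, encoding):
        self.stream = stream
        # All encoding state lives in the contained encoder object.
        self._encoder = codecs.getincrementalencoder(encoding)()

    def write(self, text):
        # The file API only forwards to the stateful encoder.
        self.stream.write(self._encoder.encode(text))


buffer = io.BytesIO()
writer = ContainingWriter(buffer, "utf-16")
writer.write("a")
writer.write("b")
written = buffer.getvalue()
```

UTF-16 is used because the encoder has visible state: the byte order mark is emitted on the first write only, so the two writes together match a single "ab".encode("utf-16").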
Alas, for (I guess) simplicity, a more direct (and more confusing) approach was taken.
1) Having feed() as part of the StreamReader API:
---
s = u"???".encode("utf-8")
r = codecs.getreader("utf-8")()
for c in s:
    print r.feed(c)
Isn't that a totally unrelated issue? Aren't we talking about short reads on sockets etc?
We're talking about two problems:
- The current implementation does not really support the stateful API, because trailing incomplete byte sequences lead to errors.
- The current file API is not really convenient for decoding when the input is not read from a stream.
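The first problem can be reproduced with the stateless decoder alone. This is a small sketch assuming a current Python 3; the helper decode_bytewise is hypothetical, and the sample character is chosen only because it needs two bytes in UTF-8.

```python
import codecs

data = "\u00e4".encode("utf-8")   # one character, two bytes
stateless_decode = codecs.getdecoder("utf-8")


def decode_bytewise(raw):
    """Decode one byte at a time with the stateless decoder (no state is kept)."""
    out = []
    for i in range(len(raw)):
        text, _consumed = stateless_decode(raw[i:i + 1])
        out.append(text)
    return "".join(out)


try:
    decode_bytewise(data)
    failed = False
except UnicodeDecodeError:
    # The first byte alone is an incomplete sequence, so decoding fails.
    failed = True
```

Decoding the complete byte string in one call works; only the split input fails, which is why a chunk-wise API needs to keep the trailing bytes between calls.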
I would very much prefer to solve one problem at a time.
Bye, Walter Dörwald