[Python-Dev] Stateful codecs [Was: str object going in Py3K]
Walter Dörwald walter at livinglogic.de
Fri Feb 17 15:38:24 CET 2006
- Previous message: [Python-Dev] str object going in Py3K
- Next message: [Python-Dev] Stateful codecs [Was: str object going in Py3K]
M.-A. Lemburg wrote:
> Walter Dörwald wrote:
>> Guido van Rossum wrote:
>>> [...]
>>> Years ago I wrote a prototype; checkout sandbox/sio/.
>>
>> However sio.DecodingInputFilter and sio.EncodingOutputFilter don't work for encodings that need state (e.g. when reading/writing UTF-16). Switching to stateful encoders/decoders isn't so easy, because the stateful codecs require a stream-API, which brings in a whole bunch of other functionality (readline() etc.), which we'd probably like to keep separate.
>>
>> I have a patch (http://bugs.python.org/1101097) that should fix this problem (at least for all codecs derived from codecs.StreamReader/codecs.StreamWriter). Additionally it would make stateful codecs more useful in the context for iterators/generators. I'd like this patch to go into 2.5.
>
> The patch as-is won't go into 2.5. It's simply the wrong approach: StreamReaders and -Writers work on streams (hence the name). It doesn't make sense adding functionality to side-step this behavior, since it undermines the design.

I agree that using a StreamWriter without a stream somehow feels wrong.
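
To illustrate why an encoding such as UTF-16 needs state, here is a minimal sketch that feeds a decoder input that has been split in the middle of a 2-byte code unit. It uses codecs.getincrementaldecoder() from current Python releases as a stand-in for a stateful decoder; that function is an assumption of this example and is not the API of the patch discussed above.

```python
import codecs

data = "hi".encode("utf-16-le")    # b'h\x00i\x00'
chunks = [data[:1], data[1:]]      # split inside the first 2-byte code unit

# Stateless decoding treats every chunk as a complete input and fails
# on the dangling half of a code unit.
try:
    stateless = "".join(chunk.decode("utf-16-le") for chunk in chunks)
except UnicodeDecodeError:
    stateless = None

# A stateful decoder buffers the incomplete byte until the rest arrives.
decoder = codecs.getincrementaldecoder("utf-16-le")()
pieces = [decoder.decode(chunk) for chunk in chunks]
stateful = "".join(pieces) + decoder.decode(b"", final=True)
```

Note that no stream object is involved: the decoder only needs to remember the bytes it could not yet decode, which is the part that would be useful for iterators and generators.
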
> Like I suggested in the patch discussion, such functionality could be factored out of the implementations of StreamReaders/Writers and put into new StatefulEncoder/Decoder classes, the objects of which then get used by StreamReader/Writer.
>
> In addition to that we could extend the codec registry to also maintain slots for the stateful encoders and decoders, if needed.

We have to do it like this; otherwise there would be no way to get a StatefulEncoder/Decoder from an encoding name.
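
The factoring suggested above could look roughly like the following sketch. All names in it (StatefulUTF16LEDecoder, SketchReader, STATEFUL_DECODERS, get_stateful_decoder) are hypothetical and chosen only for illustration; the sketch handles a single encoding and is not the design that any patch actually implements.

```python
import io


class StatefulUTF16LEDecoder:
    """Hypothetical stateful decoder: keeps undecoded bytes between calls."""

    def __init__(self, errors="strict"):
        self.errors = errors
        self.pending = b""

    def decode(self, data, final=False):
        data = self.pending + data
        # Decode only whole 2-byte code units unless this is the last call.
        usable = len(data) if final else len(data) - len(data) % 2
        # Hold back a trailing high surrogate until its partner arrives.
        if not final and usable >= 2 and 0xD8 <= data[usable - 1] <= 0xDB:
            usable -= 2
        self.pending = data[usable:]
        return data[:usable].decode("utf-16-le", self.errors)


class SketchReader:
    """Hypothetical stream reader that delegates all state to a decoder."""

    def __init__(self, stream, decoder):
        self.stream = stream
        self.decoder = decoder

    def read(self, size=-1):
        raw = self.stream.read(size)
        # Reading everything, or hitting EOF, flushes the decoder.
        return self.decoder.decode(raw, final=(size < 0 or not raw))


# Hypothetical registry slot: encoding name -> stateful decoder factory.
STATEFUL_DECODERS = {"utf-16-le": StatefulUTF16LEDecoder}


def get_stateful_decoder(encoding):
    return STATEFUL_DECODERS[encoding]


# Used without a stream, e.g. from an iterator of byte chunks:
decoder = get_stateful_decoder("utf-16-le")()
iter_result = [decoder.decode(chunk) for chunk in (b"h", b"\x00i", b"\x00")]

# Used by a stream reader:
reader = SketchReader(io.BytesIO("héllo".encode("utf-16-le")),
                      get_stateful_decoder("utf-16-le")())
stream_result = reader.read(3) + reader.read()
```

The reader contains no decoding logic of its own, and the lookup by name is what makes the decoder reachable without going through a stream class.
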
Does this mean that codecs.lookup() would have to return a 6-tuple? But this would break if someone uses codecs.lookup("foo")[-1] to get at the StreamWriter. So maybe codecs.lookup() should return an instance of a subclass of tuple which has the StatefulEncoder/Decoder as attributes. But then codecs.lookup() must be able to handle old 4-tuples returned by old search functions and update those to the new 6-tuples. (But we could drop this again after several releases, once all third-party codecs are updated.)
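
A minimal sketch of the tuple-subclass idea, assuming hypothetical names CodecEntry and normalize(): the object still has length 4, so indexing such as [-1] keeps working, while the stateful classes are reachable as attributes. The example builds a plain 4-tuple from codecs.lookup("utf-8") in a current Python release purely to have realistic test data.

```python
import codecs


class CodecEntry(tuple):
    """Hypothetical lookup result: a 4-tuple plus extra attributes."""

    def __new__(cls, encode, decode, streamreader, streamwriter,
                statefulencoder=None, statefuldecoder=None):
        self = super().__new__(
            cls, (encode, decode, streamreader, streamwriter))
        # Stored as attributes, so len() and indexing are unchanged.
        self.statefulencoder = statefulencoder
        self.statefuldecoder = statefuldecoder
        return self


def normalize(entry):
    """Upgrade a plain 4-tuple returned by an old search function."""
    if isinstance(entry, CodecEntry):
        return entry
    if len(entry) != 4:
        raise TypeError("codec search functions must return 4-tuples")
    return CodecEntry(*entry)


# Simulate an old-style search function result as a plain 4-tuple.
old_style = tuple(codecs.lookup("utf-8"))
entry = normalize(old_style)
```

Old search functions keep working because normalize() fills the new attributes with None, and callers can test for None before relying on a stateful encoder or decoder.
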
Bye, Walter Dörwald