Issue 20420: BufferedIncrementalEncoder violates IncrementalEncoder interface

I dug up an ancient email about that subject:

However, I've discovered that BufferedIncrementalEncoder.getstate() doesn't match the specification (i.e. it returns the buffer, not an int). However this class is unused (and probably useless, because it doesn't make sense to delay encoding the input). The simplest solution would be to simply drop the class.

Sounds like a plan; go right ahead!

Oops, there is one codec that uses it: The idna encoder. It buffers the input until a '.' is encountered (or encode() is called with final==True) and then encodes this part.
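The buffering described above can be observed directly on the incremental encoder, without going through io.TextIOWrapper. This is only a small sketch; the variable names are illustrative, and the exact value returned by getstate() may differ between Python versions, so it is printed rather than relied upon.

```python
import codecs

# Look up the incremental encoder class for the idna codec and instantiate it.
encoder = codecs.getincrementalencoder("idna")()

# The label "x" is not terminated yet, so nothing is emitted.
before_dot = encoder.encode("x")

# Per the report above, this exposes the buffered text rather than an int.
pending_state = encoder.getstate()

# The '.' completes the label, so the buffered part is encoded now.
after_dot = encoder.encode(".")

# final=True forces the last, unterminated label to be encoded as well.
tail = encoder.encode("org", final=True)

print(before_dot, repr(pending_state), after_dot, tail)
```

The delay between the first encode() call and the first emitted bytes is what makes the encoder's state matter for seek/tell.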

Either the idna encoder encodes the unencoded input as an int, or we drop the specification that encoder.getstate() must return an int, or we change it to mirror the decoder specification (i.e. return a (buffered_input, additional_state_info) tuple).

(A more radical solution would be to completely drop the incremental codecs for idna).
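For the first option, one conceivable way to represent buffered text as an int is to pack its UTF-8 bytes into an integer behind a marker byte. The helpers below (buffer_to_state, state_to_buffer) are hypothetical names for illustration only, not part of the codecs module, and the sketch assumes the buffer contains no lone surrogates.

```python
def buffer_to_state(buffer: str) -> int:
    """Pack buffered text into an int; 0 means 'nothing buffered'."""
    if not buffer:
        return 0
    # The leading 0x01 marker keeps leading zero bytes from being lost
    # when the bytes are turned into an integer.
    data = b"\x01" + buffer.encode("utf-8")
    return int.from_bytes(data, "big")


def state_to_buffer(state: int) -> str:
    """Inverse of buffer_to_state()."""
    if state == 0:
        return ""
    data = state.to_bytes((state.bit_length() + 7) // 8, "big")
    # Drop the marker byte and decode the remaining bytes.
    return data[1:].decode("utf-8")


packed = buffer_to_state("x")
restored = state_to_buffer(packed)
```

This keeps getstate() within the "must return an int" wording, at the cost of an opaque state value.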

Maybe we should wait and see how the implementation of writing turns out?

And indeed the incremental encoder for idna behaves strangely:

>>> import io
>>> b = io.BytesIO()
>>> s = io.TextIOWrapper(b, 'idna')
>>> s.write('x')
1
>>> s.tell()
0
>>> b.getvalue()
b''
>>> s.write('.')
1
>>> s.tell()
2
>>> b.getvalue()
b'x.'
>>> b = io.BytesIO()
>>> s = io.TextIOWrapper(b, 'idna')
>>> s.write('x')
1
>>> s.seek(s.tell())
0
>>> s.write('.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/walter/.local/lib/python3.3/codecs.py", line 218, in encode
    (result, consumed) = self._buffer_encode(data, self.errors, final)
  File "/Users/walter/.local/lib/python3.3/encodings/idna.py", line 246, in _buffer_encode
    result.extend(ToASCII(label))
  File "/Users/walter/.local/lib/python3.3/encodings/idna.py", line 73, in ToASCII
    raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long

The cleanest solution would probably be to switch to a (buffered_input, additional_state_info) state.

However, I don't know what changes this would require in the seek/tell implementations.
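A (buffered_input, additional_state_info) state could look roughly like the sketch below. TupleStateEncoder is a hypothetical toy class, not the idna encoder: it encodes to ASCII and only holds back text after the last '.', purely to show the getstate()/setstate() round trip. Whether io.TextIOWrapper could work with such a state is the open question and is not addressed here.

```python
import codecs


class TupleStateEncoder(codecs.BufferedIncrementalEncoder):
    """Toy encoder that buffers everything after the last '.'."""

    def _buffer_encode(self, input, errors, final):
        if final:
            consumed = len(input)
        else:
            # rfind() returns -1 when there is no dot, so consumed is 0.
            consumed = input.rfind(".") + 1
        return input[:consumed].encode("ascii", errors), consumed

    def getstate(self):
        # Mirror the decoder convention: (buffered_input, extra_state).
        return (self.buffer, 0)

    def setstate(self, state):
        buffered_input, _extra = state
        self.buffer = buffered_input


first = TupleStateEncoder()
emitted = first.encode("ab")        # no dot yet, everything is buffered
saved = first.getstate()

second = TupleStateEncoder()
second.setstate(saved)              # resume from the saved state
resumed = second.encode(".c")
flushed = second.encode("", final=True)
```

Because the saved state carries the buffered text itself, a second encoder instance can resume where the first one stopped without losing the pending label.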