Issue 27799: Fix base64-codec and bz2-codec incremental decoders
This is split off from a large patch I posted at Issue 20132. My new patch here fixes the following two flaws.
There is special code in the bz2 decoder that returns an empty text str object at EOF, even though bz2-codec is a bytes-to-bytes codec:
>>> import codecs
>>> decoder = codecs.getincrementaldecoder("bz2")()
>>> decoder.decode(codecs.encode(b"data", "bz2"))
b'data'
>>> decoder.decode(b"", final=True)  # Should return bytes object
''
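For illustration, here is a minimal sketch of how a bytes-to-bytes incremental decoder could return an empty bytes object at EOF. It is not the code from the patch: the class name SketchBz2IncrementalDecoder is hypothetical, and the sketch assumes that bz2.BZ2Decompressor.decompress() raises EOFError when called after the end of the stream.

```python
import bz2
import codecs


class SketchBz2IncrementalDecoder(codecs.IncrementalDecoder):
    """Hypothetical bytes-to-bytes decoder; not the code from the patch."""

    def __init__(self, errors="strict"):
        super().__init__(errors)
        self.decompressobj = bz2.BZ2Decompressor()

    def decode(self, input, final=False):
        try:
            return self.decompressobj.decompress(input)
        except EOFError:
            # The stream has already ended: return empty bytes, not a str.
            return b""

    def reset(self):
        self.decompressobj = bz2.BZ2Decompressor()


decoder = SketchBz2IncrementalDecoder()
first = decoder.decode(bz2.compress(b"data"))
last = decoder.decode(b"", final=True)
print(first, last)
```

With this shape, both calls return bytes, so a caller can concatenate the results without special-casing the final chunk.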
The base64 decoder does not handle partial sets of four codes, because it treats each input chunk as a stand-alone base64 encoding:
>>> tuple(codecs.iterdecode((b"AA", b"AA\r\n"), "base64"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/codecs.py", line 1039, in iterdecode
    output = decoder.decode(input)
  File "/usr/lib/python3.5/encodings/base64_codec.py", line 35, in decode
    return base64.decodebytes(input)
  File "/usr/lib/python3.5/base64.py", line 554, in decodebytes
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
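One possible way to handle partial sets of four codes is to buffer the leftover codes between calls and decode only complete groups. The sketch below is an illustration, not the code from the patch. The class name SketchBase64IncrementalDecoder and the attribute pending are hypothetical, and it assumes the input contains only base64 codes and ASCII whitespace, with padding only at the very end. It also omits getstate() and setstate().

```python
import binascii
import codecs


class SketchBase64IncrementalDecoder(codecs.IncrementalDecoder):
    """Hypothetical decoder that buffers incomplete groups of four codes."""

    def __init__(self, errors="strict"):
        super().__init__(errors)
        self.pending = b""

    def decode(self, input, final=False):
        # Drop ASCII whitespace (such as line breaks) so groups are easy to count.
        codes = self.pending + b"".join(bytes(input).split())
        if final:
            # Nothing more will arrive: decode everything that is left.
            usable = len(codes)
        else:
            # Decode whole groups of four only; keep the rest for the next call.
            usable = len(codes) - len(codes) % 4
        self.pending = codes[usable:]
        return binascii.a2b_base64(codes[:usable])

    def reset(self):
        self.pending = b""


decoder = SketchBase64IncrementalDecoder()
chunks = [decoder.decode(b"AA"), decoder.decode(b"AA\r\n"), decoder.decode(b"", final=True)]
print(b"".join(chunks))  # b'\x00\x00\x00'
```

If codes are still pending when final is true and they do not form a valid group, binascii.a2b_base64() raises binascii.Error, which seems a reasonable outcome for truncated input.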