Issue 32268: quopri.decode(): string argument expected, got 'bytes'

Created on 2017-12-10 10:45 by luch, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (9)
msg307957 - Author: Alexey Luchko (luch) Date: 2017-12-10 10:45
$ python3 -c 'import io, quopri; quopri.decode(io.StringIO("some initial text data"), io.StringIO())'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/quopri.py", line 125, in decode
    output.write(odata)
TypeError: string argument expected, got 'bytes'
msg307959 - Author: Christoph Reiter (lazka) Date: 2017-12-10 13:41
The documentation [0] states: "input and output must be binary file objects" and you are not passing a binary file object.
[0] https://docs.python.org/3.6/library/quopri.html#quopri.decode
msg307993 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-12-10 21:40
Right, this is not a bug, it is working as documented. You could submit an enhancement request, but I'm surprised to find that anyone is actually using that module :) We unfortunately have multiple implementations of quoted printable handling in the stdlib. You might be interested in the functions in the binascii module, which do accept unicode input when doing decoding.
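A minimal sketch of the binascii route mentioned above. It assumes the quoted-printable payload is ASCII-only on input and that the decoded bytes are UTF-8; the sample string is made up for illustration.

```python
import binascii

# a2b_qp accepts bytes or an ASCII-only str; the result is always bytes.
encoded = "caf=C3=A9 au lait"  # illustrative sample, not from the report
raw = binascii.a2b_qp(encoded)

# Assumption: the original text was UTF-8 before it was quoted-printable encoded.
text = raw.decode("utf-8")
print(text)
```

Turning the resulting bytes back into text remains the caller's job, since the charset is not part of the quoted-printable encoding itself.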
msg308134 - Author: Alexey Luchko (luch) Date: 2017-12-12 14:41
Yes. With io.BytesIO() output, it works. However, this kind of error message is quite confusing. It had better be possible to distinguish binary and text streams, so that one (including the quopri module) could tell in advance that it won't work %) Thanks for referring to the binascii module as well!
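For reference, a small sketch of the working variant, with both streams binary as the documentation requires. The input bytes are an illustrative sample.

```python
import io
import quopri

# Both input and output are binary file objects, as quopri.decode() documents.
source = io.BytesIO(b"some initial text data =3D ok")
sink = io.BytesIO()

quopri.decode(source, sink)

decoded = sink.getvalue()  # bytes, with "=3D" decoded back to "="
print(decoded)
```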
msg308148 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-12-12 18:22
We generally don't do advance type checking (look before you leap) in Python. This allows a type the programmer hadn't planned for to be used as long as it "quacks like" the expected type (this is called duck typing). So the error was produced exactly where it should be and exactly as it should be: it was produced by the StringIO object you provided, when quopri tried to write the binary output that it produces to the object you handed it.
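The point about where the error originates can be reproduced without quopri at all. This sketch only assumes the standard io module: the TypeError is raised by StringIO.write() itself when it is handed bytes.

```python
import io

text_sink = io.StringIO()

try:
    # The same kind of call quopri.decode() makes on its output object.
    text_sink.write(b"decoded bytes")
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```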
msg308692 - Author: Alexey Luchko (luch) Date: 2017-12-19 23:30
I didn't mean type checking. The point is that since string and bytes are different types, binary and text files are actually much more different than they were before Python 3. Therefore they had better have different protocols. Then inside quopri, with StringIO in place of BytesIO, the error would be much clearer and this issue would not have appeared. This would help those who lack the intuition, such as newcomers.
msg308694 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-12-19 23:50
That's type checking. Not type checking is to call the method that writes the data, expecting the object to handle the bytes it is passed, and then that object raises an error to indicate that it cannot. There is no protocol that can be checked for.
msg308699 - Author: Alexey Luchko (luch) Date: 2017-12-20 01:06
1. On quopri. It is counter-intuitive despite the clear statement in the docstring. Quoted-printable is used mostly for text encoding. (It would be quite awkward and inefficient to use it for binary data.) My case: I have text and I mean to get text. Why on earth is StringIO not suitable for this goal... It is just crazy!
2. On duck typing and StringIO. It should make life easy, not crazy. But with great power... You know. Taking a wider look than just quopri, the case shows a problem: writing text to StringIO produces a type error stating *string argument expected*. That is crazy and as counter-duck-typing as type checking. And even more ... crazy in the case of 7-bit ASCII! There could be a solution on the StringIO side, such as an encoding it should expect on input if it gets binary data. At least 7-bit ASCII would be totally OK as a default in this case. Then it would either have worked or have produced a clear text-encoding-related error that would be *meaningful* and *instructive*.
3. On protocol. A protocol: write_bytes() and write_string() methods that would raise a type error in the case of text_file.write_bytes() or binary_file.write_string(). Then one having a bytes-*like* object (like quopri.decode) would call write_bytes() and StringIO would raise something like 'StringIO used for binary output, consider BytesIO' or 'Text file requires strings, not bytes', which would be meaningful as well. Another one having a string-*like* object would knowingly call write_string(), with a corresponding error from BytesIO or a binary file. *There is no checking in the protocol, only a clear statement of intention.*
Disclaimer: It is just an example and an opinion. There is no reason for a holy war. I don't pretend it would be any better...
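A purely hypothetical sketch of the protocol proposed in point 3. Neither write_bytes() nor write_string() exists on the standard io classes; the subclass names and error messages below are invented for illustration only.

```python
import io


class IntentTextIO(io.StringIO):
    """Hypothetical text stream with intent-specific write methods."""

    def write_string(self, s):
        return self.write(s)

    def write_bytes(self, b):
        # The caller stated binary intent, so the error can say so directly.
        raise TypeError("text stream used for binary output, consider io.BytesIO")


class IntentBytesIO(io.BytesIO):
    """Hypothetical binary stream with intent-specific write methods."""

    def write_bytes(self, b):
        return self.write(b)

    def write_string(self, s):
        raise TypeError("binary stream requires bytes, not str")


# A producer of bytes (such as a decoder) would call write_bytes().
try:
    IntentTextIO().write_bytes(b"decoded bytes")
except TypeError as exc:
    protocol_error = str(exc)

print(protocol_error)
```

The trade-off is that every writer would have to know which of the two methods to call, which is the kind of change to the file protocol that the existing write() convention does not require.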
msg308702 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-12-20 03:13
Yes, if that protocol existed the errors would be clearer. But it doesn't, for historical reasons, and that is unlikely to change.
You are welcome to submit an enhancement request to make quopri accept string as an argument when decoding. But when encoding, it must produce bytes, because "ASCII" is a *byte* encoding, not a unicode encoding (unicode is ascii compatible, but it is neither ascii nor bytes). You might have to write the PR yourself, I'm not sure if anyone else will be interested (but some of the people on the core-mentorship mailing list might be).
StringIO is specifically designed to only operate on strings. If you want to decode the bytes you feed it, you have to do that. This is an intentional design.
Further discussion of ways to improve the situation should move to the python-ideas mailing list. There's really nothing to do here from a bug tracker perspective, unless you want to open an enhancement request as mentioned above.
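A short sketch of doing the decoding yourself before writing to a StringIO. It assumes the payload is UTF-8 text; the sample bytes are illustrative.

```python
import io
import quopri

encoded = b"na=C3=AFve text"  # illustrative quoted-printable sample

# quopri works on bytes in both directions.
raw = quopri.decodestring(encoded)

# Assumption: the decoded bytes are UTF-8. StringIO only accepts str,
# so the bytes-to-text step is explicit.
text_sink = io.StringIO()
text_sink.write(raw.decode("utf-8"))

result = text_sink.getvalue()
print(result)
```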
History
Date User Action Args
2022-04-11 14:58:55 admin set github: 76449
2017-12-20 08:33:23 lazka set nosy: - lazka
2017-12-20 03:13:09 r.david.murray set messages: +
2017-12-20 01:06:45 luch set messages: +
2017-12-19 23:50:51 r.david.murray set messages: +
2017-12-19 23:30:34 luch set messages: +
2017-12-12 18:22:58 r.david.murray set messages: +
2017-12-12 14:41:03 luch set messages: +
2017-12-10 21:40:18 r.david.murray set status: open -> closed; type: behavior; nosy: + r.david.murray; messages: +; resolution: not a bug; stage: resolved
2017-12-10 13:41:42 lazka set nosy: + lazka; messages: +
2017-12-10 10:45:53 luch create