Issue 547537: cStringIO should provide a binary option

Created on 2002-04-23 12:52 by gvanrossum, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Messages (9)
msg53535 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-23 12:52
The last few comments added to bug 216388 indicate a new problem in cStringIO. Rather than abusing that bug report, I'm opening a new one here. The problem is that cStringIO now accepts Unicode strings to write(), but when you use this, getvalue() returns binary garbage. The cause is apparently MAL's checkin for cStringIO 2.30, which enabled read buffers.
msg53536 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-23 12:59
I wonder if perhaps the fix is as simple as using "t#" instead of "s#" in the PyArg_... format string in P_write(). That accepts Unicode strings as args to write() only when they are ASCII (actually, it uses the default encoding). Marc-Andre, can you explain the reason for the change in the first place (other than fixing a dubious dependency on PyString_GetSize() raising an exception for a non-string object)?
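Editorial illustration (not part of the original thread): a minimal Python 3 sketch of the two behaviours discussed so far, raw binary data from getvalue() versus accepting Unicode only when it can be encoded with the default encoding. It does not use cStringIO. The helper names are hypothetical, ASCII is assumed as the default encoding, and UTF-16-LE is only an assumed stand-in for a Unicode object's internal storage.

```python
# Illustrative model only; this is not the cStringIO implementation.
DEFAULT_ENCODING = "ascii"  # assumption: the default encoding is ASCII


def raw_buffer(text):
    """Hypothetical stand-in for exposing a Unicode object's internal storage.

    Assumption: UTF-16-LE approximates that storage on a little-endian machine.
    """
    return text.encode("utf-16-le")


def default_encoded(text):
    """Hypothetical stand-in for the "t#" behaviour: default-encode or fail."""
    return text.encode(DEFAULT_ENCODING)


print(raw_buffer("abc"))       # b'a\x00b\x00c\x00'
print(default_encoded("abc"))  # b'abc'

try:
    default_encoded("caf\u00e9")
except UnicodeEncodeError as exc:
    # Non-ASCII text is rejected instead of being stored as raw bytes.
    print("rejected:", exc.reason)
```

Under these assumptions, the first result is the kind of output that reads as garbage to a caller expecting text, while the second path either yields plain ASCII bytes or raises an error.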
msg53537 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-26 21:08
Should I just check this in? It looks pretty safe to me...
msg53538 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2002-04-27 15:02
The idea behind ripping out the old string-only approach was to make cStringIO more compatible with the file object implementation. Rather than switching from s# to t#, the cStringIO object should maintain a binary switch just like the file object does, and then use s# for pseudo files opened in binary mode (the default) and t# for ones opened in text mode. Note that in any case, Unicode should be explicitly encoded before writing it to a file. Simply switching to t# would cause compatibility problems, since a different buffer API would be used for all input objects.
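Editorial illustration (not part of the original thread): the proposed binary switch could be sketched in Python 3 roughly as follows. `ModeBuffer`, its `binary` flag, and its `encoding` parameter are hypothetical names. The class only models the idea of choosing the accepted input by mode and is not how cStringIO is or was implemented. Binary mode is the default here because that is what this message proposes.

```python
class ModeBuffer:
    """Hypothetical in-memory file with a binary/text switch (not cStringIO)."""

    def __init__(self, binary=True, encoding="ascii"):
        self.binary = binary
        self.encoding = encoding  # assumption: ASCII plays the default encoding
        self._chunks = []

    def write(self, data):
        if not self.binary and isinstance(data, str):
            # Text mode: accept text only if the default encoding can encode it.
            self._chunks.append(data.encode(self.encoding))
        else:
            # Otherwise require a bytes-like object and store its bytes as-is;
            # memoryview() raises TypeError for str.
            self._chunks.append(bytes(memoryview(data)))

    def getvalue(self):
        return b"".join(self._chunks)


text_buf = ModeBuffer(binary=False)
text_buf.write("abc")

bin_buf = ModeBuffer()
# Encode explicitly before writing, as the message above recommends.
bin_buf.write("caf\u00e9".encode("utf-8"))

print(text_buf.getvalue(), bin_buf.getvalue())  # b'abc' b'caf\xc3\xa9'
```

In this sketch, passing unencoded text to a binary-mode buffer raises TypeError, which is one way to enforce explicit encoding. A real implementation might make a different choice.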
msg53539 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2002-04-27 15:13
Another note: the bug title is wrong: cStringIO doesn't mangle Unicode, it just returns the raw binary data. Not that this is of much use, but it's in sync with what the file object does.
msg53540 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2002-04-28 00:02
I think that adding a binary mode to cStringIO is okay, but the default should be text, and until we have the binary mode option, the format should be t#. Another solution would be to let cStringIO act more like StringIO; after all that was its original intention. But since that would require a major overhaul, I'm not seriously proposing that.
msg53541 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2002-05-29 10:36
Guido already fixed this in CVS, so I'll turn the bug into a feature request: cStringIO should provide a way to "open" a file in binary mode.
msg55193 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-08-23 19:33
Unassigning: I've never had a need for this in the past years.
msg55203 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2007-08-23 20:30
I think this can be closed; cStringIO won't change, and Py3k won't have StringIO unicode problems anyway.
History
Date User Action Args
2022-04-10 16:05:15 admin set github: 36487
2007-08-23 20:30:21 georg.brandl set status: open -> closed; resolution: wont fix; messages: +; nosy: + georg.brandl
2007-08-23 19:33:20 lemburg set assignee: lemburg -> (none); messages: +
2002-04-23 12:52:36 gvanrossum create