msg53535
Author: Guido van Rossum (gvanrossum)
Date: 2002-04-23 12:52
The last few comments added to bug 216388 indicate a new problem in cStringIO. Rather than abusing that bug report, I'm opening a new one here. The problem is that cStringIO now accepts Unicode strings as arguments to write(), but when you do this, getvalue() returns binary garbage. The cause is apparently MAL's checkin for cStringIO 2.30, which enabled read buffers.
|
|
msg53536
Author: Guido van Rossum (gvanrossum)
Date: 2002-04-23 12:59
I wonder if perhaps the fix is as simple as using "t#" instead of "s#" in the PyArg_... format string in P_write(). That accepts Unicode strings as arguments to write() only when they are ASCII (actually, it uses the default encoding). Marc-Andre, can you explain the reason for the change in the first place (other than fixing a dubious dependency on PyString_GetSize() raising an exception for a non-string object)?
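
To illustrate the behavior described for "t#" (text is accepted only if it can be encoded with the default encoding, assumed here to be ASCII), here is a minimal Python 3 sketch. It does not use cStringIO, which does not exist in Python 3; the class name `AsciiCheckedBuffer` is hypothetical and `io.BytesIO` stands in for the underlying buffer.

```python
import io


class AsciiCheckedBuffer:
    """Hypothetical stand-in: text is accepted only if it is pure ASCII."""

    def __init__(self):
        self._buf = io.BytesIO()

    def write(self, data):
        if isinstance(data, str):
            # Emulates the default-encoding conversion described above;
            # raises UnicodeEncodeError for non-ASCII text.
            data = data.encode("ascii")
        self._buf.write(data)

    def getvalue(self):
        return self._buf.getvalue()


buf = AsciiCheckedBuffer()
buf.write("abc")    # ASCII text is accepted and stored as bytes
buf.write(b"def")   # byte strings pass through unchanged

try:
    buf.write("\u00e9")  # non-ASCII text is rejected
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

Under this approach a write either stores readable bytes or fails loudly, instead of silently storing an internal representation.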
|
|
msg53537
Author: Guido van Rossum (gvanrossum)
Date: 2002-04-26 21:08
Should I just check this in? It looks pretty safe to me...
|
|
msg53538
Author: Marc-Andre Lemburg (lemburg)
Date: 2002-04-27 15:02
The idea behind ripping out the old string-only approach was to make cStringIO more compatible with the file object implementation. Rather than switching from s# to t#, the cStringIO object should maintain a binary switch, just like the file object does, and then use s# for pseudo-files opened in binary mode (the default) and t# for those opened in text mode. Note that in any case, Unicode should be explicitly encoded before writing it to a file. Simply switching to t# would cause compatibility problems, since a different buffer API would be used for all input objects.
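
The advice to encode Unicode explicitly before writing can be sketched as follows. This is a Python 3 illustration only: `io.BytesIO` stands in for a binary-mode pseudo-file, and the helper name `write_text` and the choice of UTF-8 are assumptions, not part of cStringIO.

```python
import io


def write_text(buf, text, encoding="utf-8"):
    """Encode text explicitly, then write the resulting bytes to a binary buffer."""
    data = text.encode(encoding)
    buf.write(data)
    return len(data)


buf = io.BytesIO()
written = write_text(buf, "caf\u00e9")

# The stored bytes are the chosen encoding, so they decode back cleanly.
roundtrip = buf.getvalue().decode("utf-8")
```

Because the caller picks the encoding, the buffer contents are well defined regardless of how the interpreter stores text internally.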
|
|
msg53539
Author: Marc-Andre Lemburg (lemburg)
Date: 2002-04-27 15:13
Another note: the bug title is wrong. cStringIO doesn't mangle Unicode; it just returns the raw binary data. Not that this is of much use, but it's in sync with what the file object does.
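
As a rough picture of what "raw binary data" could look like, the sketch below approximates the internal buffer of a Unicode string on a narrow (UCS-2) build by encoding to UTF-16 in native byte order. This is an assumption made for illustration in Python 3; the helper name `raw_ucs2_buffer` is hypothetical, and the actual bytes would depend on the build and platform.

```python
import sys


def raw_ucs2_buffer(text):
    """Approximate a narrow-build internal buffer: two bytes per code unit."""
    codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return text.encode(codec)


raw = raw_ucs2_buffer("abc")

# Three characters occupy six bytes, with a zero byte paired with each
# ASCII character, which looks like garbage if read as a byte string.
print(repr(raw))
```

The data is intact in the sense that it can be decoded again with the matching codec, but it is not the text a caller of getvalue() would expect.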
|
|
msg53540
Author: Guido van Rossum (gvanrossum)
Date: 2002-04-28 00:02
I think that adding a binary mode to cStringIO is okay, but the default should be text, and until we have the binary mode option, the format should be t#. Another solution would be to let cStringIO act more like StringIO; after all, that was its original intention. But since that would require a major overhaul, I'm not seriously proposing that.
|
|
msg53541
Author: Marc-Andre Lemburg (lemburg)
Date: 2002-05-29 10:36
Guido already fixed this in CVS, so I'll turn the bug into a feature request: cStringIO should provide a way to "open" a file in binary mode.
|
|
msg55193
Author: Marc-Andre Lemburg (lemburg)
Date: 2007-08-23 19:33
Unassigning: I've never had a need for this over the past years.
|
|
msg55203
Author: Georg Brandl (georg.brandl)
Date: 2007-08-23 20:30
I think this can be closed: cStringIO won't change, and Py3k won't have StringIO Unicode problems anyway.
|
|