msg176732 - (view) |
Author: Markus Kettunen (makegho) |
Date: 2012-12-01 00:23 |
In a C application on Windows, at least on MSVC 2010 and Windows 7, do this:

    wprintf(L"Test\n");
    Py_Initialize();
    wprintf(L"Test\n");

Output is:

    Test
    T e s t

I was able to track the issue to fileio.c to the following code block by searching where wprintf breaks:

    if (dircheck(self, nameobj) < 0)
        goto error;

    #if defined(MS_WINDOWS) || defined(__CYGWIN__)
        /* don't translate newlines (\r\n <=> \n) */
        _setmode(self->fd, O_BINARY);   <----- breaks it
    #endif

    if (PyObject_SetAttrString((PyObject *)self, "name", nameobj) < 0)
        goto error;

This can be easily confirmed by adding wprintfs on both sides of _setmode.

This issue was also raised at http://mail.python.org/pipermail/python-list/2012-February/620528.html but no solution was provided back then.
|
msg176734 - (view) |
Author: Markus Kettunen (makegho) |
Date: 2012-12-01 00:47 |
If the standard streams are not used through Python, this hack can be used to work around the bug on C side:

    #ifdef WIN32
    #include <fcntl.h>
    #endif

    ...

    Py_Initialize();

    #ifdef WIN32
    _setmode(stdin->_file, O_TEXT);
    _setmode(stdout->_file, O_TEXT);
    _setmode(stderr->_file, O_TEXT);
    #endif
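The same workaround can be sketched as a small helper that uses the documented `_fileno()` call instead of reaching into the `_file` field, which is an internal member of the runtime's FILE structure. The name `restore_std_text_mode` is hypothetical, the guard uses `_WIN32` rather than `WIN32`, and the sketch has not been tested against this issue; it only shows the shape of the call sequence.

```c
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h>  /* _O_TEXT */
#include <io.h>     /* _setmode, _fileno */
#endif

/* Hypothetical helper: puts stdin, stdout and stderr back into text mode.
 * Returns 0 on success, -1 if any _setmode() call failed.
 * On non-Windows builds it does nothing and returns 0. */
int restore_std_text_mode(void)
{
#ifdef _WIN32
    int status = 0;

    /* Flush pending output before changing the translation mode. */
    fflush(stdout);
    fflush(stderr);

    if (_setmode(_fileno(stdin), _O_TEXT) == -1)
        status = -1;
    if (_setmode(_fileno(stdout), _O_TEXT) == -1)
        status = -1;
    if (_setmode(_fileno(stderr), _O_TEXT) == -1)
        status = -1;
    return status;
#else
    return 0;
#endif
}
```

The intended use is to call `restore_std_text_mode()` once, right after `Py_Initialize()`, and only when the standard streams are not used through Python afterwards.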
|
|
msg176828 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-12-03 07:23 |
_setmode(self->fd, O_BINARY) change was done in Python 3.2: see the issue #10841. This change introduced regressions:

- #11272: "input() has trailing carriage return on windows", fixed in Python 3.2.1
- #11395: "print(s) fails on Windows with long strings", fixed in Python 3.2.1
- #13119: "Newline for print() is \n on Windows, and not \r\n as expected", fixed in Python 3.3 (and will be fixed in Python 3.2.4)

In Python 3.1, _setmode(self->fd, O_BINARY) was already used when Python is called with the -u command line option.

_setmode() supports different options:

- _O_BINARY: no conversion
- _O_TEXT: translate "\n" with "\r\n"
- _O_U8TEXT: UTF-8 without BOM
- _O_U16TEXT: UTF-16 without BOM
- _O_WTEXT: UTF-16 with BOM

I didn't try wprintf(). This function is not used in the Python source code (except in the Windows launcher, which is not part of the main interpreter). I don't know how to fix wprintf().
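To illustrate how the mode options listed above interact with wide output, here is a minimal sketch that switches a stream to `_O_U16TEXT` for a single wide write and then puts the previous mode back. The helper name `print_wide_line` is hypothetical, and whether this is an acceptable fix for an embedding application is an assumption that has not been tested against this issue.

```c
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <fcntl.h>  /* _O_U16TEXT */
#include <io.h>     /* _setmode, _fileno */
#endif

/* Hypothetical helper: writes one wide-character line to `stream`.
 * On Windows the stream's descriptor is switched to _O_U16TEXT for the
 * duration of the call and then restored to the mode it had before.
 * Returns 0 on success, -1 on failure. */
int print_wide_line(FILE *stream, const wchar_t *text)
{
    int written;
#ifdef _WIN32
    int previous_mode;

    fflush(stream);  /* flush before changing the translation mode */
    previous_mode = _setmode(_fileno(stream), _O_U16TEXT);
    if (previous_mode == -1)
        return -1;
#endif

    written = fwprintf(stream, L"%ls\n", text);
    fflush(stream);

#ifdef _WIN32
    /* _setmode() returned the old mode, so it can be handed straight back. */
    if (_setmode(_fileno(stream), previous_mode) == -1)
        return -1;
#endif
    return written < 0 ? -1 : 0;
}
```

A call such as `print_wide_line(stdout, L"Test")` after `Py_Initialize()` would leave the stream in binary mode for Python while the wide write itself goes through a Unicode mode. Restoring the mode matters because narrow functions such as printf() are not meant to be used on a stream while it is in a Unicode mode.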
|
|
msg176837 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-12-03 11:49 |
> _setmode(self->fd, O_BINARY) change was done in Python 3.2: see the issue #10841

The main reason was to be able to read binary files from sys.stdin using the CGI module: see the issue #4953. In _O_TEXT mode, a 0x0A byte is replaced with 0x0D 0x0A on output (and the reverse on input), which corrupts binary files.

Articles about _setmode() and wprintf():

- "A confluence of circumstances leaves a stone unturned..." http://blogs.msdn.com/b/michkap/archive/2010/09/23/10066660.aspx
- "Conventional wisdom is retarded, aka What the @#%&* is _O_U16TEXT?" http://blogs.msdn.com/b/michkap/archive/2008/03/18/8306597.aspx

See also issue #1602 (Windows console doesn't print or input Unicode).
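The corruption described above can be shown without touching a real stream. The function below is a plain simulation of the newline translation applied to text-mode output, not the C runtime's own code, and its name `simulate_text_mode_write` is hypothetical.

```c
#include <stddef.h>

/* Simulates a text-mode write: every 0x0A (LF) byte in the input is
 * emitted as 0x0D 0x0A (CR LF). Returns the number of bytes the
 * translated data needs; stores at most out_cap of them in `out`. */
size_t simulate_text_mode_write(const unsigned char *in, size_t in_len,
                                unsigned char *out, size_t out_cap)
{
    size_t needed = 0;
    size_t i;

    for (i = 0; i < in_len; i++) {
        if (in[i] == 0x0A) {
            /* Insert the carriage return in front of the line feed. */
            if (needed < out_cap)
                out[needed] = 0x0D;
            needed++;
        }
        if (needed < out_cap)
            out[needed] = in[i];
        needed++;
    }
    return needed;
}
```

Feeding it the three bytes 0x89 0x0A 0x00 yields the four bytes 0x89 0x0D 0x0A 0x00, so any binary format that happens to contain 0x0A grows by one byte per occurrence and no longer matches its original layout.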
|
|
msg186852 - (view) |
Author: John Ehresman (jpe) * |
Date: 2013-04-13 21:16 |
One way to fix this is to use the ReadFile & WriteFile API functions directly, as proposed in issue 17723. I would regard this as a change in behavior and not a simple bug fix, because there is probably code written for 3.3 that assumes the C-level stdout is in binary mode after Python is initialized, so I would target 3.4 for the change.
|
|
msg220642 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2014-06-15 14:40 |
I'll let our Windows gurus fight over who gets this one :) |
|
|
msg220762 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2014-06-16 20:57 |
If I understood correctly, supporting the "wide mode" for wprintf() requires modifying all calls to functions like printf() in Python, and so requires changing a lot of code. Since this issue was the first time that I heard about wprintf(), I don't think that we should change Python. I'm not going to fix this issue unless many more users ask for it.
|
|
msg221346 - (view) |
Author: Markus Kettunen (makegho) |
Date: 2014-06-23 09:38 |
It's quite common to use wide character strings to support Unicode in C and C++. In C++ this often means using std::wstring and std::wcout. Maybe these are more common than wprintf? In any case the console output breaks as Py_Initialize hijacks the host application's standard output streams which sounds quite illegitimate to me. I understand that Python isn't designed for embedding and it would be a lot of work to fix it, but I would still encourage everyone to take a look at this bug. For me, this was one of the reasons I ultimately had to decide against using Python as my application's scripting language, which is a shame. |
|
|
msg221347 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2014-06-23 09:48 |
"In C++ this often means using std::wstring and std::wcout. Maybe these are more common than wprintf? In any case the console output breaks as Py_Initialize hijacks the host application's standard output streams which sounds quite illegitimate to me."

On Linux, std::wcout doesn't use wprintf(). Do you mean that std::wcout also depends on the "mode" of stdout (_setmode)?
|
|
msg221348 - (view) |
Author: Markus Kettunen (makegho) |
Date: 2014-06-23 09:59 |
> On Linux, std::wcout doesn't use wprintf(). Do you mean that std::wcout also depends on the "mode" of stdout (_setmode)?

Yes, exactly. I originally noticed this bug by using std::wcout on Windows.
|
|