Issue 22999: Copying emoji to Windows clipboard corrupts string in Python 3.3 and up

Created on 2014-12-05 09:58 by Cees.Timmerman, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name | Uploaded | Description
test_clipboard_win.py | Cees.Timmerman, 2014-12-05 10:11 |
Messages (4)
msg232188 - Author: Cees Timmerman (Cees.Timmerman) Date: 2014-12-05 09:58
# http://stackoverflow.com/a/25678113/819417
def copy(data):
    if not isinstance(data, unicode):
        data = data.decode('mbcs')
    OpenClipboard(None)
    EmptyClipboard()
    hCd = GlobalAlloc(GMEM_DDESHARE, 2 * (len(data) + 1))
    pchData = GlobalLock(hCd)
    wcscpy(ctypes.c_wchar_p(pchData), data)
    GlobalUnlock(hCd)
    SetClipboardData(CF_UNICODETEXT, hCd)
    CloseClipboard()

Emoji "📋" (\U0001f400) is copied as "🐀" (\U0001f4cb), or "📋." turns to "📋" (note the period). It works fine in Python 3.2.5.
msg232189 - Author: Cees Timmerman (Cees.Timmerman) Date: 2014-12-05 10:11
A copy of my test program at https://gist.github.com/CTimmerman/133cb80100357dde92d8
msg232190 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) (Python committer) Date: 2014-12-05 10:32
(You swapped the Unicode values: \U0001f4cb is copied as \U0001f400.)

On Windows, strings changed in 3.3. See https://docs.python.org/3/whatsnew/3.3.html: "len() now always returns 1 for non-BMP characters".

The call to GlobalAlloc should use the number of wchar_t units, something like:
    len(data.encode('utf-16')) + 2
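To make the size mismatch concrete, here is a small sketch (not part of the original report) that compares what len() counts on Python 3.3+ with the number of UTF-16 code units that wchar_t-based Windows APIs work in. The helper names reported_size and needed_size are made up for illustration, and "utf-16-le" is used only so that no BOM is included in the count.

```python
# Illustrative only: compare code points with UTF-16 code units (Python 3.3+).
text = "\U0001f4cb."  # clipboard emoji (non-BMP) followed by a period

code_points = len(text)                           # what len() counts on 3.3+
utf16_units = len(text.encode("utf-16-le")) // 2  # 2-byte units, as wchar_t on Windows


def reported_size(s):
    """Byte count requested by the reported code: 2 * (len(s) + 1)."""
    return 2 * (len(s) + 1)


def needed_size(s):
    """Bytes for the UTF-16 text plus a 2-byte NUL terminator."""
    return len(s.encode("utf-16-le")) + 2


print(code_points, utf16_units)                # 2 3
print(reported_size(text), needed_size(text))  # 6 8
```

For this string the reported code asks for 6 bytes, while the UTF-16 text with its terminator needs 8, so the allocation is too small for any string containing non-BMP characters.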
msg232191 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) (Python committer) Date: 2014-12-05 10:36
Better to use the utf-16-le encoding:
    len(data.encode('utf-16-le')) + 2
otherwise the encoded bytes start with the BOM (the bytes \xff\xfe).
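The difference between the two codecs can be checked directly. The sketch below is an illustration, not code from the thread: it shows that "utf-16" prepends a 2-byte BOM while "utf-16-le" does not, and it builds the NUL-terminated byte string whose length is the size suggested above. clipboard_payload is a hypothetical helper name.

```python
import codecs

with_bom = "A".encode("utf-16")        # BOM in native byte order, then the text
without_bom = "A".encode("utf-16-le")  # text only, little-endian, no BOM

has_bom = with_bom.startswith(codecs.BOM_UTF16)
extra_bytes = len(with_bom) - len(without_bom)


def clipboard_payload(text):
    """Hypothetical helper: UTF-16-LE bytes plus a 2-byte NUL terminator."""
    return text.encode("utf-16-le") + b"\x00\x00"


payload = clipboard_payload("\U0001f4cb.")
print(has_bom, extra_bytes)  # True 2
print(len(payload))          # 8, i.e. len(data.encode('utf-16-le')) + 2
```

With "utf-16" the size expression would come out 2 bytes larger than necessary, which is harmless for an allocation but would be wrong if the encoded bytes themselves were copied into the buffer.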
History
Date User Action Args
2022-04-11 14:58:10 admin set github: 67188
2014-12-05 10:36:01 amaury.forgeotdarc set messages: +
2014-12-05 10:32:39 amaury.forgeotdarc set status: open -> closed; nosy: + amaury.forgeotdarc; messages: + ; resolution: not a bug
2014-12-05 10:11:03 Cees.Timmerman set files: + test_clipboard_win.py; messages: +
2014-12-05 09:58:05 Cees.Timmerman create