msg198464 - (view) |
Author: Musashi Tamura (miwa) |
Date: 2013-09-27 08:28 |
C:\>python Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import struct >>> struct.pack(u'B',1) Traceback (most recent call last): File "", line 1, in TypeError: Struct() argument 1 must be string, not unicode >>> struct.pack(u'B',1) Traceback (most recent call last): File "", line 1, in TypeError: Struct() argument 1 must be string, not unicode >>> struct.pack('B',1) # this is ok '\x01' >>> struct.pack(u'B',1) '\x01' |
|
|
msg198708 - (view) |
Author: Vajrasky Kok (vajrasky) * |
Date: 2013-09-30 15:29 |
Here's the preliminary patch. I am assuming that we should accept unicode argument not reject it straight away. Python3 does that. >>> import struct >>> struct.pack('b', 3) b'\x03' >>> struct.pack(b'b', 3) b'\x03' >>> struct.pack(b'\xff', 3) Traceback (most recent call last): File "", line 1, in struct.error: bad char in struct format |
|
|
msg198714 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-09-30 18:23 |
Struct constructor accepts only str and not unicode. But struct.pack() uses caching and it found Struct('B') in the cache (because u'B' and 'B' are equal and have same hash). I doubt we should fix this. Adding support of Unicode argument is new feature. |
|
|
msg198716 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2013-09-30 18:40 |
Either way, the runtime inconsistency is a bug. Since we shouldn't break existing code, I would vote for always allowing unicode format strings, rather than always disallowing them. Another argument is that str and unicode are generally substituible in 2.x when they are pure ASCII (which they are here). |
|
|
msg198747 - (view) |
Author: Musashi Tamura (miwa) |
Date: 2013-09-30 23:29 |
Thanks for feedback. I think it should be fixed with allowing unicode. "from __future__ import unicode_literals" may break existing code. |
|
|
msg198749 - (view) |
Author: Vajrasky Kok (vajrasky) * |
Date: 2013-10-01 03:52 |
Refactor test to clear the cache before using unicode format. |
|
|
msg198754 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-10-01 08:27 |
struct.Struct() should be changed instead of struct.pack(). Here is a patch. |
|
|
msg198758 - (view) |
Author: Vajrasky Kok (vajrasky) * |
Date: 2013-10-01 08:57 |
Serhiy, you don't want to clear the cache in the test to simulate the bug? struct._clearcache() Other than that, should we raise struct.error instead of ValueError? Right now, the current behaviour in python 2.7 is: >>> struct.pack('\x80', 3) Traceback (most recent call last): File "", line 1, in struct.error: bad char in struct format But you are right. Struct() should have been changed instead of struct.pack. |
|
|
msg198759 - (view) |
Author: Vajrasky Kok (vajrasky) * |
Date: 2013-10-01 09:03 |
Nevermind about my comment about clearing cache. It only happens if we use struct.pack not struct.Struct. |
|
|
msg198769 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-10-01 12:55 |
> Other than that, should we raise struct.error instead of ValueError? Python 3 raises UnicodeEncodeError. And Python 2 raises UnicodeEncodeError when coerce non-ASCII unicode to str. |
|
|
msg203063 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-11-16 16:38 |
Any comments? |
|
|
msg205571 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-12-08 15:45 |
New changeset 42d3afd29460 by Serhiy Storchaka in branch '2.7': Issue #19099: The struct module now supports Unicode format strings. http://hg.python.org/cpython/rev/42d3afd29460 |
|
|
msg205572 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-12-08 15:48 |
Fixed. Thank you Musashi for your report. |
|
|
msg206165 - (view) |
Author: Vajrasky Kok (vajrasky) * |
Date: 2013-12-14 04:13 |
Okay, I think the error message can be improved because in Python 2.7 we differentiate very clearly the string from the unicode. >>> import struct >>> struct.Struct(1) Traceback (most recent call last): File "", line 1, in TypeError: Struct() argument 1 must be string, not int But you can give unicode, right? >>> struct.Struct(u'b') <Struct object at 0x1f484b8> This is consistent with other example: >>> " cutecat ".strip(1) Traceback (most recent call last): File "", line 1, in TypeError: strip arg must be None, str or unicode What do you say, Serhiy? Here is the patch. |
|
|
msg206220 - (view) |
Author: Vajrasky Kok (vajrasky) * |
Date: 2013-12-15 09:31 |
Nevermind, I already created this issue. http://bugs.python.org/issue19985 |
|
|