msg261739 - (view) |
Author: Marco Sulla (marco.sulla) |
Date: 2016-03-14 10:29 |
Steps to reproduce:

1. Create a format_bytes.py with:

    "Hello {}".format(b"World")

2. Launch it with:

    python3 -bb format_bytes.py

Result:

    Traceback (most recent call last):
      File "format_bytes.py", line 1, in <module>
        "Hello {}".format(b"World")
    BytesWarning: str() on a bytes instance

Expected: No warning
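The same behaviour can be checked without creating a file. This is only a minimal sketch: it assumes the current interpreter is re-launched through sys.executable and that the -c option stands in for format_bytes.py. The helper name run_with_bb is hypothetical and not part of the report.

```python
import subprocess
import sys


def run_with_bb(source):
    """Run `source` in a child interpreter started with -bb.

    -bb turns BytesWarning into an error, as in the steps above.
    """
    return subprocess.run(
        [sys.executable, "-bb", "-c", source],
        capture_output=True,
        text=True,
    )


result = run_with_bb('"Hello {}".format(b"World")')

# A non-zero return code means the child exited on an uncaught exception.
print(result.returncode)
print(result.stderr)
```

The child's stderr should carry the same BytesWarning traceback as the report, with "<string>" in place of the file name.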
|
|
msg261740 - (view) |
Author: Marco Sulla (marco.sulla) |
Date: 2016-03-14 10:31 |
To clarify: I do not want to suppress the warning. I would like the format mini-language to convert bytes to string properly.
|
|
msg261742 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-14 10:35 |
> I would like the format mini-language to convert bytes to string properly.

Sorry, nope, Python 3 doesn't guess the encoding of byte strings anymore. You have to decode manually. Example:

    "Hello {}".format(b"World".decode('ascii'))

Or format to bytes (bytes has no format() method; %-formatting works on Python 3.5+):

    b"Hello %s" % b"World"

It's not a bug. It's a feature.
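The two options, decoding explicitly or staying in bytes, can be sketched as follows. The variable names are illustrative only. The bytes-side variant uses %-formatting, which was added for bytes in Python 3.5 (PEP 461); bytes objects do not have a format() method in Python 3.

```python
name = b"World"

# Option 1: decode explicitly, naming the encoding, then format as text.
text_greeting = "Hello {}".format(name.decode("ascii"))

# Option 2: stay in bytes and use %-formatting (Python 3.5+, PEP 461).
bytes_greeting = b"Hello %s" % name

print(text_greeting)
print(bytes_greeting)
```

In both cases the caller, not the format machinery, decides whether the result is text or bytes.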
|
|
msg261743 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-14 10:38 |
More about Unicode:

* https://docs.python.org/dev/howto/unicode.html
* http://unicodebook.readthedocs.org/
* etc.
|
|
msg261751 - (view) |
Author: Marco Sulla (marco.sulla) |
Date: 2016-03-14 13:19 |
> Python 3 doesn't guess the encoding of byte strings anymore

And I agree, but I think the format mini-language could decode bytes as UTF-8 by default, and raise an error (or fall back to str()) if something goes wrong. That would be simpler to use and robust at the same time. My 2 cents.
|
|
msg261752 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-14 13:20 |
>> Python 3 doesn't guess the encoding of byte strings anymore
> And I agree, but I think the format mini-language could decode bytes as UTF-8 by default, ...

Using UTF-8 means guessing the encoding of a byte string. Python 3 doesn't do that anymore; there are no exceptions to that rule.
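A small sketch of why a default encoding is still a guess: the bytes themselves do not say how they were encoded. The sample text and variable names below are only illustrative.

```python
text = "h\xe9llo"  # 'héllo'

latin1_bytes = text.encode("latin-1")  # b'h\xe9llo'
utf8_bytes = text.encode("utf-8")      # b'h\xc3\xa9llo'

# Wrong guess, loud failure: these Latin-1 bytes are not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    utf8_guess_failed = False
except UnicodeDecodeError:
    utf8_guess_failed = True

# Wrong guess, silent failure: UTF-8 bytes decode as Latin-1 without any
# error, but the result is not the original text.
mojibake = utf8_bytes.decode("latin-1")

print(utf8_guess_failed)
print(mojibake == text)
```

Only the code that produced or received the bytes knows which encoding applies, which is why the decode step is left to the caller.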
|
|
msg261753 - (view) |
Author: Marco Sulla (marco.sulla) |
Date: 2016-03-14 13:31 |
> Using UTF-8 means guessing the encoding

Well, isn't that what format() is doing now, by using str()? :)
|
|
msg261755 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-03-14 14:33 |
> Well, isn't that what format() is doing now, by using str()? :)

Hum, are you sure that you tried Python 3, and not Python 2? str(bytes) on Python 3 is well defined:

    >>> print(str(b'hello'))
    b'hello'
    >>> print(str('h\xe9llo'.encode('utf8')))
    b'h\xc3\xa9llo'

I'm not sure that you expect the b'...' format. Non-ASCII bytes are escaped using the \xHH format.
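For comparison, a short sketch of what str() does with bytes when no encoding is given, next to the forms that decode or request the repr explicitly. It is meant to be run without -b or -bb, since the bare str(data) call is exactly what triggers the BytesWarning; the variable names are illustrative.

```python
data = "h\xe9llo".encode("utf8")

# No encoding given: str() does not decode, it returns the repr text.
as_repr = str(data)

# Encoding given: str() decodes, equivalent to data.decode("utf8").
as_text = str(data, "utf8")

# Asking for the repr explicitly with !r does not go through str(bytes).
formatted = "Hello {!r}".format(b"World")

print(as_repr)
print(as_text)
print(formatted)
```

If the b'...' form really is the wanted output, the !r conversion states that intent in the format string itself.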
|
|