Issue 26555: string.format(bytes) raise warning

Created on 2016-03-14 10:29 by marco.sulla, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (8)
msg261739 - (view) Author: Marco Sulla (marco.sulla) Date: 2016-03-14 10:29
Steps to reproduce:

1. Create a format_bytes.py containing:

       "Hello {}".format(b"World")

2. Run it with:

       python3 -bb format_bytes.py

Result:

    Traceback (most recent call last):
      File "format_bytes.py", line 1, in <module>
        "Hello {}".format(b"World")
    BytesWarning: str() on a bytes instance

Expected: no warning.
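The report can also be reproduced without creating a file. The following is a minimal sketch, assuming Python 3.7 or later (for `capture_output`); it runs the same one-liner in a child interpreter started with -bb. The variable names are illustrative and not part of the original report.

```python
import subprocess
import sys

# The expression from the report, passed to a child interpreter via -c.
snippet = '"Hello {}".format(b"World")'

# -bb turns BytesWarning into an error, so the child exits with a failure
# status and prints a traceback on stderr.
result = subprocess.run(
    [sys.executable, "-bb", "-c", snippet],
    capture_output=True,
    text=True,
)

print("exit status:", result.returncode)
print(result.stderr)
```

A child process is used because the check is controlled by the interpreter's -b/-bb start-up flag rather than by the warnings filters alone.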
msg261740 - (view) Author: Marco Sulla (marco.sulla) Date: 2016-03-14 10:31
To clarify: I do not want to suppress the warning. I would like the format mini-language to convert bytes to a string properly.
msg261742 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-03-14 10:35
> I would like the format mini-language to convert bytes to a string properly.

Sorry, nope, Python 3 doesn't guess the encoding of byte strings anymore. You have to decode manually. Example:

    "Hello {}".format(b"World".decode('ascii'))

Or format to bytes (bytes has no format() method; %-formatting on bytes works on Python 3.5 and later):

    b"Hello %s" % b"World"

It's not a bug. It's a feature.
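The two options above can be sketched as follows. This is an illustration only: the variable names are hypothetical, and the bytes variant assumes Python 3.5 or later, where %-formatting on bytes is available (bytes objects have no format() method).

```python
payload = b"World"

# Option 1: decode explicitly, naming the encoding, then format as text.
text = "Hello {}".format(payload.decode("ascii"))

# Option 2: stay in bytes and use %-formatting (Python 3.5+).
data = b"Hello %s" % payload

print(text)
print(data)
```

Neither variant calls str() on a bytes object, so neither should trigger BytesWarning under -b or -bb.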
msg261743 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-03-14 10:38
More about Unicode:

* https://docs.python.org/dev/howto/unicode.html
* http://unicodebook.readthedocs.org/
* etc.
msg261751 - (view) Author: Marco Sulla (marco.sulla) Date: 2016-03-14 13:19
> Python 3 doesn't guess the encoding of byte strings anymore

And I agree, but I think the format mini-language could decode it as UTF-8 by default, and raise an error (or fall back to str()) if something goes wrong. That would be simpler to use and robust at the same time. My 2 cents.
msg261752 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-03-14 13:20
>> Python 3 doesn't guess the encoding of byte strings anymore
> And I agree, but I think the format mini-language could decode it as UTF-8 by default, ..

Using UTF-8 means guessing the encoding of a byte string. Python 3 doesn't do that anymore, and there are no exceptions to that rule.
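A small sketch of why a default encoding amounts to a guess. The sample strings are made up for illustration: the same bytes decode to different text, or fail to decode at all, depending on which encoding is assumed.

```python
# Bytes produced by a Latin-1 writer.
raw = "h\xe9llo".encode("latin-1")

# Decoding with the encoding that was actually used recovers the text.
as_latin1 = raw.decode("latin-1")

# Assuming UTF-8 for the same bytes fails outright.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# The opposite mistake fails silently: UTF-8 bytes read as Latin-1
# decode without error but yield the wrong characters.
mojibake = "\xe9".encode("utf-8").decode("latin-1")

print(as_latin1, utf8_failed, mojibake)
```

Only the producer of the bytes knows which case applies, which is why the decode step is left to the caller.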
msg261753 - (view) Author: Marco Sulla (marco.sulla) Date: 2016-03-14 13:31
> Using UTF-8 means guessing the encoding

Well, isn't that what format() is doing now, by using str()? :)
msg261755 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-03-14 14:33
> Well, isn't that what format() is doing now, by using str()? :)

Hm, are you sure that you tried Python 3, and not Python 2? str(bytes) on Python 3 is well defined:

    >>> print(str(b'hello'))
    b'hello'
    >>> print(str('h\xe9llo'.encode('utf8')))
    b'h\xc3\xa9llo'

I'm not sure that you expect the b'...' format. Non-ASCII bytes are escaped in \xHH form.
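To make the consequence for format() concrete, here is a short sketch, assuming the interpreter is started without -b or -bb. The variable names are illustrative.

```python
# Implicit conversion: format() falls back to str(), which for bytes
# yields the b'...' representation rather than decoded text.
implicit = "Hello {}".format(b"World")

# Asking for the repr explicitly gives the same text without calling
# str() on the bytes object.
explicit = "Hello {!r}".format(b"World")

# Decoding first is the way to get the plain text.
decoded = "Hello {}".format(b"World".decode("ascii"))

print(implicit)
print(explicit)
print(decoded)
```

If the b'...' form really is what is wanted, the `!r` conversion states that intent directly.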
History
Date User Action Args
2022-04-11 14:58:28 admin set github: 70742
2016-03-14 14:33:28 vstinner set messages: +
2016-03-14 13:31:51 marco.sulla set messages: +
2016-03-14 13:20:39 vstinner set messages: +
2016-03-14 13:19:30 marco.sulla set messages: +
2016-03-14 10:38:23 vstinner set messages: +
2016-03-14 10:35:51 vstinner set status: open -> closed; nosy: + vstinner; messages: +; resolution: not a bug
2016-03-14 10:31:13 marco.sulla set messages: +
2016-03-14 10:29:53 marco.sulla create