[Python-Dev] PEP 460: allowing %d and %f and NOT ALLOWING mojibake :)

Georg Brandl g.brandl at gmx.net
Sat Jan 11 19:29:27 CET 2014


On 11.01.2014 18:41, Victor Stinner wrote:

Hi,

I'm in favor of adding support for formatting integers and floating-point numbers to PEP 460: %d, %u, %o, %x, %f with padding and precision (%10d, %010d, %1.5f) and sign (%-i, %+i), but without the alternate format ("{:#x}"). %s would also accept int and float for convenience.

int and float subclasses would not be handled differently: their __str__ and __format__ would be ignored. Other int-like and float-like types (e.g. those defining __int__ or __index__) are not supported; an explicit cast would be required.

For %s, the choice between string and number is made using "(PyLong_Check() || PyFloat_Check())".

If you agree, I will modify the PEP. If Antoine disagrees, I will fork PEP 460 ;-)

---

%s should not support precision (e.g. %.100s); use Unicode for that.

---

PEP 460 should not reintroduce bytes+unicode, implicit decoding or implement encoding. b'x=%s' % 10 is well defined: it's pure bytes.

If you consider that bytes should not contain text, why does the bytes type have methods like isalpha() or upper()? And why do binary files have a readline() method? A "line" doesn't mean anything in pure bytes. It's an example of "practicality beats purity". Python 3 should not enforce Unicode if developers choose to use bytes to handle mixed binary/text protocols like HTTP.

But I'm against adding "%r" and "%a", because they use Unicode and would require an implicit encoding. type(ascii(obj)) is str, not bytes. If you really want to use repr() and ascii(), encode the result explicitly.

I agree. For non-ASCII characters what ascii() gives you is almost always not what you want anyway.

Georg
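
A minimal sketch of the two points discussed above, not taken from the PEP itself. The names `name` and `to_ascii_bytes` are hypothetical, and the %-formatting lines assume an interpreter in which bytes objects support the % operator (Python 3.5 or later); the exact rules that shipped may differ from the proposal in this thread.

```python
# Illustrative only: `name` and `to_ascii_bytes` are made-up names.
name = "café"

# ascii() returns str, not bytes, so nothing here is "pure bytes" yet.
text = ascii(name)
assert type(text) is str

def to_ascii_bytes(obj):
    """Encode ascii(obj) explicitly; no implicit encoding takes place."""
    return ascii(obj).encode("ascii")

# For non-ASCII input the result holds the escape sequence spelled out
# as text (backslash, 'x', 'e', '9'), plus the surrounding quotes ...
encoded = to_ascii_bytes(name)

# ... which is different from the encoded character itself.
utf8 = name.encode("utf-8")

# Numeric formatting needs no encoding decision at all.
# Assumption: bytes % formatting is available (Python 3.5+).
line = b"Content-Length: %d\r\n" % 10
padded = b"%010d" % 42
```

Here `encoded` and `utf8` differ, which is why ascii() output is rarely what a protocol wants for non-ASCII data, while `line` and `padded` are built from bytes and numbers only.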


