[Python-Dev] Migration from Python 2.7 and bytes formatting

Neil Schemenauer nas at arctrix.com
Fri Jan 17 21:09:45 CET 2014


I've refined this idea a little in my latest PEP 461 patch (issue 20284). Continuing to use %s instead of introducing %b seems better. I've called the command-line option -2; it could also be used to enable other similar porting aids.

I'd like to try porting code making use of the -2 feature to see how helpful it is. The behavior is partway between Python 2.x laziness and Python 3.x strictness in terms of specifying encodings.

Python 2.x:

- coerce byte strings to unicode strings to avoid making a
  decision about encoding

- when writing a unicode string to a bytes stream without
  a specified encoding, encode with ASCII.  Blow up with an
  exception if a non-ASCII character is encountered, often far
  from where the real bug is.
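
The second point can be illustrated on Python 3 by making the ASCII encode explicit. This is only a sketch: write_text_py2_style is a hypothetical helper name, and it mimics the implicit encode step rather than reproducing Python 2 itself.

```python
import io

def write_text_py2_style(stream, text):
    """Hypothetical helper: mimic Python 2's implicit ASCII encode on write."""
    stream.write(text.encode("ascii"))  # strict codec: raises on non-ASCII

buf = io.BytesIO()
error = None

# Pure-ASCII text goes through, so the missing encoding decision stays hidden.
write_text_py2_style(buf, "plain ascii")

try:
    # The failure only appears once non-ASCII data reaches the write call.
    write_text_py2_style(buf, "caf\u00e9")
except UnicodeEncodeError as exc:
    error = exc
```

The exception is raised at the write, not at the place where the text was created without an encoding in mind.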

Python 3.x:

- refuse to accept unicode strings where bytes are expected,
  require explicit encoding to be performed
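
A small illustration of that strictness, using io.BytesIO as a stand-in for any bytes stream (the choice of UTF-8 here is just an example encoding):

```python
import io

buf = io.BytesIO()
rejected = None

try:
    buf.write("text")  # str passed where bytes are expected
except TypeError as exc:
    rejected = exc     # Python 3 refuses instead of guessing an encoding

# The caller has to pick an encoding explicitly.
buf.write("text".encode("utf-8"))
```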

Python 3.x with -2 command-line option:

- when objects are formatted into bytes, immediately
  encode them using strict ASCII encoding.
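
A rough emulation of that behavior in pure Python, assuming an interpreter where bytes support %-formatting (Python 3.5 or later). format_bytes_ascii is a hypothetical helper, not the code from the patch, and it only handles str arguments; how the patch treats other object types is not covered here.

```python
def format_bytes_ascii(fmt, *args):
    """Hypothetical emulation: encode str arguments as strict ASCII, then format."""
    encoded = tuple(
        arg.encode("ascii") if isinstance(arg, str) else arg
        for arg in args
    )
    return fmt % encoded

# ASCII-only text is accepted without an explicit encode call.
header = format_bytes_ascii(b"Content-Type: %s\r\n", "text/plain")

failure = None
try:
    # Non-ASCII text fails right at the formatting step.
    format_bytes_ascii(b"Name: %s\r\n", "caf\u00e9")
except UnicodeEncodeError as exc:
    failure = exc
```

Unlike the Python 2.x case, the error surfaces where the text is formatted into bytes rather than later at the write.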

No code would be considered fully ported to Python 3 unless it can run without the -2 command-line option.

Neil


