[Python-Dev] bytes / unicode

P.J. Eby pje at telecommunity.com
Mon Jun 21 19:46:56 CEST 2010


At 12:56 PM 6/21/2010 -0400, Toshio Kuratomi wrote:

> One comment here -- you can also have URIs that aren't decodable into their true textual meaning using a single encoding.
>
> Apache will happily serve URIs that have UTF-8, Shift-JIS, and EUC-JP components inside their path, but the textual representation that was intended will be garbled (or be represented by escaped byte sequences). For that matter, Apache will serve requests that have no true textual representation at all, since it works at the byte level rather than the character level. So a complete solution really should allow the programmer to pass in URIs as bytes when the programmer knows that is needed.
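
To make the mixed-encoding case concrete, here is a minimal sketch using only the standard library. The path and the sample text are made up for illustration and do not come from the thread. It builds a path whose two segments use different encodings, shows that the bytes survive percent-encoding unchanged, and shows that decoding the whole path as UTF-8 fails even though each segment decodes with its own encoding.

```python
from urllib.parse import quote, unquote_to_bytes

TEXT = "日本"  # arbitrary sample text (an assumption, not from the thread)

# Hypothetical path whose two segments were written with different encodings.
raw_path = b"/" + TEXT.encode("utf-8") + b"/" + TEXT.encode("shift_jis")

# On the wire, the non-ASCII bytes travel percent-encoded.
wire_path = quote(raw_path)

# At the byte level the round trip is lossless.
round_tripped = unquote_to_bytes(wire_path)


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoding the whole path with a single encoding does not work here...
whole_as_utf8 = try_decode(round_tripped, "utf-8")

# ...but each segment decodes if its own encoding is already known.
segments = round_tripped.split(b"/")[1:]
per_segment = [
    try_decode(segments[0], "utf-8"),
    try_decode(segments[1], "shift_jis"),
]
```

In practice the per-segment encodings are usually not known, which is why keeping the path as bytes is the only lossless option.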

ebytes(somebytes, 'garbage'), perhaps, which would be like 'ascii', but where combining with non-garbage would result in another 'garbage' ebytes?
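
A rough sketch of how that propagation might behave follows. `ebytes` is only an idea floated in this thread, not an existing type, so everything below is hypothetical. Two rules in it are assumptions added for illustration: that 'ascii' yields to the other operand's encoding, and that mixing two different non-garbage encodings raises an error.

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that carries an encoding label."""

    def __new__(cls, data, encoding="ascii"):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        # Assumption: plain bytes without a label are treated as 'ascii'.
        other_enc = getattr(other, "encoding", "ascii")
        if "garbage" in (self.encoding, other_enc):
            # 'garbage' is contagious: the result is always 'garbage'.
            result_enc = "garbage"
        elif self.encoding == "ascii":
            # Assumption: 'ascii' yields to the other operand's encoding.
            result_enc = other_enc
        elif other_enc in ("ascii", self.encoding):
            result_enc = self.encoding
        else:
            # Assumption: refuse to mix two different real encodings.
            raise ValueError(
                "cannot combine %r with %r" % (self.encoding, other_enc)
            )
        return ebytes(bytes(self) + bytes(other), result_enc)


prefix = ebytes(b"/docs/", "ascii")
unknown = ebytes(b"\xff\xfe", "garbage")
accented = ebytes("é".encode("utf-8"), "utf-8")

combined = prefix + unknown      # labeled 'garbage'
still_bad = unknown + accented   # 'garbage' wins over 'utf-8' as well
fine = prefix + accented         # labeled 'utf-8'
```

The bytes themselves are never altered; only the label records that no textual interpretation can be trusted.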


