[Python-Dev] Re: [Python-checkins] python/dist/src/Lib textwrap.py,1.18,1.19
Guido van Rossum guido@python.org
Wed, 11 Dec 2002 10:39:56 -0500
[/F proves beyond a shadow of a doubt that string.whitespace is locale-sensitive]
> Thanks, Fredrik! That clarifies the behaviour Just is seeing.
> Hey: I just realized that making textwrap trust string.whitespace is wrong in at least one case: 0xa0 is non-breaking space in ISO-8859-1, and converting it to 0x20 (regular ol' space) is clearly wrong -- the "non-break" request will be ignored. So Unicode or not, textwrap should probably just hard-code the US-ASCII whitespace chars.
+1
> My attitude is that textwrap should work on European languages, whether they are encoded in 8-bit "ASCII" or Unicode. I suspect that passing an arbitrary Unicode string to it is meaningless -- what the heck does it even mean to wrap a string of Chinese or Hebrew or Devanagari characters? Beats me, and I think they're out of scope for textwrap.
Correct -- you can't trust the width of characters to be all the same. (I'm not even sure if that's true for Latin-1, Cyrillic or Greek, but it seems likely.)
> So: do I even need to worry about the cornucopia of Unicode whitespace characters at all? Or can I sweep that can of worms under the rug? (Pardon the horribly mixed metaphor.)
Please shove them under the garage.
--Guido van Rossum (home page: http://www.python.org/~guido/)
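The 0xa0 point above can be illustrated with a minimal sketch. It assumes modern Python 3 (where str is Unicode, so "\xa0" is U+00A0 NO-BREAK SPACE) rather than the 2002-era 8-bit strings under discussion. The names ASCII_WHITESPACE and munge_whitespace are hypothetical, chosen for illustration; this is not textwrap's actual code.

```python
# Hypothetical constant: the six US-ASCII whitespace characters, hard-coded
# so the set does not depend on the locale or on Unicode properties.
ASCII_WHITESPACE = "\t\n\x0b\x0c\r "

# Translation table mapping each ASCII whitespace character to a plain space.
_TO_SPACE = str.maketrans(ASCII_WHITESPACE, " " * len(ASCII_WHITESPACE))


def munge_whitespace(text):
    """Replace ASCII whitespace with 0x20, leaving every other character alone."""
    return text.translate(_TO_SPACE)


sample = "10\xa0km\tper\nhour"
result = munge_whitespace(sample)

# The tab and newline become spaces, but the non-breaking space survives,
# so a later split on " " cannot separate "10" from "km".
print(repr(result))
print(result.split(" "))

# A broader notion of whitespace would lose the "non-break" request:
# str.isspace() reports True for U+00A0 in Python 3.
print("\xa0".isspace())
```

With a hard-coded set, the non-breaking space is treated as an ordinary character, which is the behaviour the message argues for. A whitespace test that is locale- or Unicode-sensitive would treat it as a place to break.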