[Python-Dev] len(chr(i)) = 2?

Antoine Pitrou solipsis at pitrou.net
Wed Nov 24 11:27:30 CET 2010

On Wed, 24 Nov 2010 18:51:49 +0900 "Stephen J. Turnbull" <stephen at xemacs.org> wrote:

> James Y Knight writes:
>
>  > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly
>  > superior [...] because it is an ASCII superset, and thus more
>  > easily compatible with other software. That also makes it most
>  > commonly used for internet communication.
>
> Sure, UTF-8 is very nice as a protocol for communicating text. So
> what? If your application involves shoveling octets real fast, don't
> convert and shovel those octets. If your application involves
> significant text processing, well, conversion can almost always be
> done as fast as you can do I/O so it doesn't cost wallclock time, and
> generally doesn't require a huge percentage of CPU time compared to
> the actual text processing. It's just a specialization of
> serialization, that we do all the time for more complex data
> structures.
>
> So wire protocols are not a killer argument for or against any
> particular internal representation of text.

Agreed. Decoding and encoding UTF-8 is so fast that it should be dwarfed by any actual processing done on the text.
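
One rough way to check this on a given machine is to time a UTF-8 round trip against some stand-in for text processing. The sketch below is only an illustration: the sample text, the helper names (codec_round_trip, process_text, measure) and the choice of "processing" (case-folding, splitting and counting distinct words) are all assumptions, and the numbers will vary with the interpreter, the build and the input.

```python
import timeit

# Hypothetical sample: mixed ASCII and non-ASCII text, roughly 1 MB once encoded.
sample = "The quick brown fox, naïve café, 日本語のテキスト. " * 20000
encoded = sample.encode("utf-8")


def codec_round_trip(data: bytes) -> bytes:
    """Decode UTF-8 bytes to str and encode the result back to UTF-8."""
    return data.decode("utf-8").encode("utf-8")


def process_text(data: bytes) -> int:
    """Stand-in for 'actual processing' (an assumption, not a benchmark
    standard): decode, case-fold, split on whitespace, count distinct words."""
    text = data.decode("utf-8")
    return len(set(text.casefold().split()))


def measure(func, data, repeat=5):
    """Return the best-of-`repeat` wall-clock time, in seconds, of one call."""
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeat))


if __name__ == "__main__":
    print("round trip : %.6f s" % measure(codec_round_trip, encoded))
    print("processing : %.6f s" % measure(process_text, encoded))
```

Taking the minimum over several runs reduces noise from other activity on the machine; it says nothing about how a different workload would compare.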

Regards

Antoine.
