[Python-Dev] PEP 383 update: utf8b is now the error handler

"Martin v. Löwis" martin at v.loewis.de
Wed May 6 10:03:47 CEST 2009


Yeah, yeah, this is the same old same old from PEP 3131. Anything that handles the various attacks based on ASCII-alike characters should at least rule out invalid Unicode, too!

And where is this U+DC2F supposed to be coming from, anyway? The user's local environment or the user's local filesystem!

Why is that not a threat? Suppose you have a setuid application, and you pass some string on the command line that decodes to /../. Then the setuid application will be tricked into modifying files it didn't mean to modify.
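As a rough illustration of the decoding question, here is a minimal sketch of what the handler can and cannot produce. It assumes the name `surrogateescape`, under which the utf8b handler appears in released Python 3 (this thread still calls it utf8b), and a UTF-8 locale; `decode_arg`, `encode_arg` and `accepts_dc2f` are hypothetical helper names, not part of the PEP.

```python
# Sketch only: assumes the handler is registered as "surrogateescape"
# (its name in released Python 3) and that the locale encoding is UTF-8.

def decode_arg(raw: bytes) -> str:
    """Decode bytes roughly the way PEP 383 decodes argv and file names."""
    return raw.decode("utf-8", "surrogateescape")


def encode_arg(text: str) -> bytes:
    """Reverse of decode_arg: turn escaped surrogates back into raw bytes."""
    return text.encode("utf-8", "surrogateescape")


# An overlong (invalid) UTF-8 encoding of "/" is not decoded to "/".
# Each undecodable byte becomes one lone surrogate in U+DC80..U+DCFF.
overlong_slash = decode_arg(b"\xc0\xaf")
assert overlong_slash == "\udcc0\udcaf"
assert "/" not in overlong_slash

# Encoding restores exactly the original bytes.
assert encode_arg(overlong_slash) == b"\xc0\xaf"


def accepts_dc2f() -> bool:
    """Check whether the handler will encode U+DC2F as the byte 0x2F."""
    try:
        encode_arg("\udc2f")
    except UnicodeEncodeError:
        return False
    return True
```

In this sketch `accepts_dc2f()` returns False, because the handler only maps surrogates in the range U+DC80..U+DCFF back to bytes. Whether an application compares or normalizes paths before or after this decoding step is a separate question that the sketch does not address.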

Likewise, it might come from a relational database. Use a relational database that supports Unicode code units, or lone surrogates through UTF-8, and fill in some bogus data. Then have the Python application (running as root) read it.

Of course I can't prove that there's no vector for an exploit here (in fact, I'm sure there is one with sufficiently careless handling of input), but I think "consenting adults" covers the Shift JIS use case. Make it an option, but it should be explicitly part of the PEP.

Nothing is lost at the moment. If users complain, we can still think of ways to enhance the experience.

In any case, Python 3.1b1 may get released today, so it's way too late for new features in the PEP. They can wait for Python 3.2.

Regards, Martin


