[Python-Dev] PEP 383 (again)

Glenn Linderman v+python at g.nevcal.com
Wed Apr 29 09:49:37 CEST 2009


On approximately 4/29/2009 12:17 AM, came the following characters from the keyboard of Martin v. Löwis:

>> OK, so you are saying that under PEP 383, utf-8b wouldn't be used anywhere on Windows by default. That's not clear from your proposal.
>
> You didn't read it carefully enough. The first three paragraphs of the "Specification" section make that clear.

Sorry, rereading those paragraphs, even with this declaration in mind, does not make that clear. It is not enough to have a solution that works; it is necessary to communicate that solution clearly enough that people understand it. Judging by the huge amount of feedback you have received, it is clear that either the solution doesn't work or it wasn't communicated clearly.

The following comments are an attempt to help you make the PEP clear, based on your above declaration that UTF-8b wouldn't be used on Windows. I may still be unclear about what you mean, but if you can accept these enhancements to the PEP, then maybe we are approaching a common understanding; if not, you should be aware that the PEP still needs clarification.

In the first paragraph, you should make it clear that Python 3.0 does not use the Windows bytes interfaces, if it doesn't. "Python uses only the wide character APIs..." would suffice. As stated, it seems like Python does use the wide character APIs, but leaves open the possibility that it might use byte APIs also. A short description of what happens on Windows when Python code uses bytes APIs would also be helpful.

The second paragraph speaks of "currently" but then speaks of using the half-surrogates. I don't believe that happens "currently". You did change tense, but the tense change makes that paragraph quite confusing as it stands. You should describe there the action that Python currently takes for non-decodable bytes, and then in the next paragraph talk about what the PEP changes.
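To make the "currently" versus "proposed" distinction concrete, here is a minimal sketch of the two decoding behaviours. It assumes a Python release in which the PEP's handler is available under the name `surrogateescape` (Python 3.1 and later); `decode_filename` and the sample bytes are hypothetical names chosen for illustration and are not taken from the PEP text.

```python
def decode_filename(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes, mapping each undecodable byte 0xNN to U+DCNN.

    Hypothetical helper for illustration; it only wraps bytes.decode.
    """
    return raw.decode(encoding, errors="surrogateescape")


# Sample file name: 0xE9 is "e acute" in Latin-1, but invalid UTF-8 here.
raw = b"caf\xe9.txt"

# Strict decoding: the non-decodable byte raises an error.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Decoding with the PEP's handler: the byte becomes a half-surrogate.
name = decode_filename(raw)
```

Here the byte 0xE9 ends up as the lone low surrogate U+DCE9, while the decodable bytes around it are decoded normally.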

The 4th paragraph is now confusing too... would it not be the decode error handler that returns the byte strings, in addition to the Unicode strings?

The 5th paragraph has apparently confused some people into thinking this PEP only applies to locales using UTF-8 encodings; you should have an "else clause" to clear that up, pointing out that the reverse encoding of half-surrogates by other encodings already produces errors, and that UTF-8 is a special case, not the only case.
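The reverse direction can be sketched the same way. This again assumes a Python release that exposes the handler as `surrogateescape`; `encode_filename` and `strict_encode_fails` are hypothetical helpers. Note that in a current Python 3 interpreter every strict codec tried below, UTF-8 included, rejects half-surrogates; the sketch does not show how the interpreter under discussion in this thread behaved.

```python
def encode_filename(name: str, encoding: str = "utf-8") -> bytes:
    """Encode a name, turning each U+DCNN half-surrogate back into byte 0xNN.

    Hypothetical helper for illustration; it only wraps str.encode.
    """
    return name.encode(encoding, errors="surrogateescape")


def strict_encode_fails(name: str, encoding: str) -> bool:
    """Return True if strict encoding of name raises UnicodeEncodeError."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


# A name carrying one escaped byte (0xE9) as the half-surrogate U+DCE9.
name = "caf\udce9.txt"

# With the handler, the original byte is recovered.
roundtrip = encode_filename(name)

# Without the handler, strict encoders refuse the half-surrogate.
failures = {
    codec: strict_encode_fails(name, codec)
    for codec in ("ascii", "latin-1", "utf-8")
}
```

The point for the PEP text is that the handler is not tied to UTF-8: it is the error handler, not the codec, that restores the original bytes.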

The code added to the discussion has mismatched parentheses, making me wonder if it is complete. There is a reasonable possibility that only the final ")" is missing.

-- Glenn -- http://nevcal.com/

A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
