[Python-Dev] PEP 383 (again)

"Martin v. Löwis" martin at v.loewis.de
Wed Apr 29 22:06:33 CEST 2009


In the first paragraph, you should make it clear that Python 3.0 does not use the Windows bytes interfaces, if it doesn't. "Python uses only the wide character APIs..." would suffice.

That's not quite exact. It uses both ANSI and Wide APIs - depending on whether you pass bytes as input or strings. Please see the Python source code to find out how this works, and what that means.
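A minimal sketch of the distinction, assuming nothing beyond the standard `os` module: the type of the argument selects the type of the result. The claim that the bytes form goes through the ANSI API on Windows and the str form through the wide API comes from the reply above; the snippet itself cannot observe which API is called, only the bytes-in/bytes-out and str-in/str-out behaviour. The helper name `listing_types` is made up for illustration, and `os.fsencode` is a convenience from later Python 3 releases, not something 3.0 offered.

```python
import os
import tempfile


def listing_types(directory):
    """Return the sets of element types from a str call and a bytes call."""
    str_names = os.listdir(directory)                  # str in -> str out
    bytes_names = os.listdir(os.fsencode(directory))   # bytes in -> bytes out
    return {type(n) for n in str_names}, {type(n) for n in bytes_names}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that both listings are non-empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    str_types, bytes_types = listing_types(tmp)
```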

As stated, it seems like Python does use the wide character APIs, but leaves open the possibility that it might use byte APIs also. A short description of what happens on Windows when Python code uses bytes APIs would also be helpful.

I'm at a loss how to make the text more clear than it already is. I'm really not good at writing long essays, with a lot of explanatory-but-non-normative text. I also think that explanations do not belong in the section titled specification, nor does a full description of the status quo belong in the PEP at all. The reader should consult the current Python source code if in doubt about what the status quo is.

In the second paragraph, it speaks of "currently" but then speaks of using the half-surrogates. I don't believe that happens "currently". You did change tense, but that paragraph is quite confusing, currently, because of the tense change. You should describe there the action that is currently taken by Python for non-decodable bytes, and then in the next paragraph talk about what the PEP changes.

Thanks, fixed.

The 4th paragraph is now confusing too... would it not be the decode error handler that returns the byte strings, in addition to the Unicode strings?

No, why do you think so? That's intended as stated.

The 5th paragraph has apparently confused some people into thinking this PEP only applies to locales using UTF-8 encodings; you should have an "else clause" to clear that up, pointing out that the reverse encoding of half-surrogates by other encodings already produces errors, and that UTF-8 is a special case, not the only case.

I have fixed that by extending the third paragraph.
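The round trip under discussion can be sketched as follows. This assumes the error handler name `surrogateescape`, under which the PEP's handler is available in released Python 3 versions (the name is not stated in this message), and it shows the behaviour of current Python 3 releases, where strict encoders reject half-surrogates for UTF-8 as well as for other encodings. The byte string and the helper `strict_encode_fails` are made up for illustration.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes; 0xE9 followed by "." is not valid UTF-8

# Decoding: the undecodable byte 0xE9 is mapped to the half-surrogate U+DCE9.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", "surrogateescape")


def strict_encode_fails(text, encoding):
    """Return True if encoding text with the default strict handler raises."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False
```

With the strict handler, `strict_encode_fails(name, "utf-8")` and `strict_encode_fails(name, "latin-1")` both report an error, so a half-surrogate cannot leak into encoded output unnoticed.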

The code added to the discussion has mismatched (), making me wonder if it is complete. There is a reasonable possibility that only the final ) is missing.

Indeed; this is now also fixed.

Regards, Martin


