[Python-3000] pep 3131 again

James Y Knight foom at fuhm.net
Thu May 17 05:14:21 CEST 2007


On May 16, 2007, at 9:06 PM, tomer filiba wrote:

> === RTL/LTR ===
> i pointed out already that no existing editor can handle LTR-RTL
> representation correctly, which essentially renders all RTL languages
> out of the scope of this PEP. that doesn't bother me personally so much,
> as i'm not going to use this feature anyway, but that still leaves us
> with the "european imposed colonialism" :)
>
> the only practical way to use RTL languages in code is to have an RTL
> programming language, where "if" is spelled "אם", "for" as "עבור",
> "in" as "בתוך", and so on, and the entire program is RTL. having code
> like --
>
>     for קקי in פיפי(1,2,3)
>
> is only unreadable by all means (since the parentheses are LTR, while
> the name is RTL, etc.)

It is interesting to contrast the rendering of that (ABC being
substitutes for Hebrew characters):

    for ABB in 1,2,3)ACAC)

with the rendering of:

    for קקי in פיפי(a,b,c)

as:

    for ABB in ACAC(a,b,c)

This is, I suppose, due to numbers and punctuation having weak
directionality in the bidi algorithm, which isn't really appropriate
for tokens in a programming language. So yes, clearly, an editor that
takes into account the special needs of programming languages is
necessary to effectively write bidi code. But it's certainly not
inconceivable, and I don't see that the non-existence of an effective
bidi editor should influence the decision to allow Unicode characters
in Python at all. For the majority of languages, which are LTR, it is
not an issue, and I have every confidence that the bidi programming
editor problem will be solved at some point in the future. The only
thing Python can possibly do to help with this is to ignore any
RLO/LRO/LRE/RLE/PDF/RLM/LRM characters it sees during tokenization
(it probably ought to ignore anything with the
"Default_Ignorable_Code_Point" Unicode property).
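
The suggestion above can be sketched as a preprocessing step in front of
the standard tokenizer. This is only an illustration under stated
assumptions, not how Python behaves: the names BIDI_CONTROLS,
strip_bidi_controls and token_strings are hypothetical, and the set
covers just the seven characters listed above, since as far as I know
the standard unicodedata module does not expose the
Default_Ignorable_Code_Point property.

```python
import io
import tokenize

# Explicit bidi formatting characters named in the message (assumed set;
# the full Default_Ignorable_Code_Point property is not covered here).
BIDI_CONTROLS = {
    "\u202a",  # LRE, LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RLE, RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # PDF, POP DIRECTIONAL FORMATTING
    "\u202d",  # LRO, LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RLO, RIGHT-TO-LEFT OVERRIDE
    "\u200e",  # LRM, LEFT-TO-RIGHT MARK
    "\u200f",  # RLM, RIGHT-TO-LEFT MARK
}


def strip_bidi_controls(source):
    """Return source with the bidi formatting characters removed."""
    return "".join(ch for ch in source if ch not in BIDI_CONTROLS)


def token_strings(source):
    """Tokenize the stripped source and return the non-blank token texts."""
    cleaned = strip_bidi_controls(source)
    tokens = tokenize.generate_tokens(io.StringIO(cleaned).readline)
    return [tok.string for tok in tokens if tok.string.strip()]


# A line saved by a "smart" editor, with formatting characters embedded.
marked = "for \u202aabc\u202c in f\u200e(1,2,3)"
plain = "for abc in f(1,2,3)"
print(token_strings(marked) == token_strings(plain))
```

With the controls removed first, the marked and unmarked lines produce
the same token stream, which is the property the proposal is after.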

This would allow a smart editor to save the text with such formatting
characters in it, so that other "dumb" viewers would not be confused.
For example, with explicit formatting added, rendering can be made
correct: for ‪קקי‬ in ‪פיפי‭‌(1,2,3)
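
As a rough sketch of what the editor side might do on save, the
following inserts an LRM after every run of right-to-left characters.
Note that this swaps in a simpler technique than the embedding
characters used in the example above: LRM is a strong left-to-right
character, so punctuation that follows it should no longer attach to
the preceding right-to-left run. The function name
add_lrm_after_rtl_runs is hypothetical, and a real editor would need to
handle many more cases (comments, string literals, Arabic digits).

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK


def add_lrm_after_rtl_runs(source):
    """Insert an LRM after each run of strong right-to-left characters."""
    out = []
    prev_rtl = False
    for ch in source:
        # "R" is Hebrew-style RTL, "AL" is Arabic-letter RTL.
        is_rtl = unicodedata.bidirectional(ch) in ("R", "AL")
        # Close the run unless an LRM is already in place.
        if prev_rtl and not is_rtl and ch != LRM:
            out.append(LRM)
        out.append(ch)
        prev_rtl = is_rtl
    if prev_rtl:
        out.append(LRM)  # run extends to the end of the text
    return "".join(out)


line = "for \u05e7\u05e7\u05d9 in \u05e4\u05d9\u05e4\u05d9(1,2,3)"
saved = add_lrm_after_rtl_runs(line)
# Removing the marks again recovers the original text exactly.
print(saved.replace(LRM, "") == line)
```

Because the marks carry no meaning of their own, a tokenizer that
ignores them would see the same program the author typed.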

http://imagic.weizmann.ac.il/~dov/Hebrew/logicUI24.htm#h1-25 shows
someone has thought about this at least a little from the editor
perspective...

James


