[Python-Dev] Security implications of pep 383
"Martin v. Löwis" martin at v.loewis.de
Tue Mar 29 23:17:32 CEST 2011
- Previous message: [Python-Dev] Security implications of pep 383
- Next message: [Python-Dev] Security implications of pep 383
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
> '\N{LATIN SMALL LETTER O}\N{COMBINING DIAERESIS}' != '\N{LATIN SMALL LETTER O WITH DIAERESIS}'
> I guess the filesystem shouldn't treat these as the same (even though they are), but what if some webservice does? I suspect you should normalize both strings before comparing them in any blacklist, and what happens with surrogates when you normalize?
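The normalization question can be checked directly. What follows is only a minimal sketch, assuming CPython 3 with the `surrogateescape` error handler from PEP 383; the byte string `raw` is a made-up file name for illustration, not one taken from this thread.

```python
import unicodedata

decomposed = '\N{LATIN SMALL LETTER O}\N{COMBINING DIAERESIS}'
precomposed = '\N{LATIN SMALL LETTER O WITH DIAERESIS}'

# The two spellings differ as code point sequences ...
assert decomposed != precomposed
# ... but agree once both are brought to the same normalization form.
assert unicodedata.normalize('NFC', decomposed) == precomposed
assert unicodedata.normalize('NFD', precomposed) == decomposed

# Hypothetical file name: "o" + UTF-8 for U+0308, followed by a stray 0xFF
# byte that is not valid UTF-8. surrogateescape maps it to U+DCFF.
raw = b'o\xcc\x88\xff'
name = raw.decode('utf-8', 'surrogateescape')

# Normalization composes the "o" and the diaeresis; the lone surrogate has
# no decomposition and is passed through unchanged.
normalized = unicodedata.normalize('NFC', name)

# Encoding the normalized string gives different bytes than the original.
roundtrip = normalized.encode('utf-8', 'surrogateescape')
```

In this sketch the surrogate survives normalization, but the bytes around it change, so the normalized string is suitable for comparison only and should not be assumed to name the same file.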
I think the whole blacklist example is artificial. The string in the blacklist is actually a Chinese "hello" greeting, so it surely isn't the string being blacklisted. For proper blacklisting, you would likely use substring searches, case-insensitivity, transliterations, and perhaps even regular expressions and word stemming. If you consider all these things, proper or alternative encodings of the same text are just another issue to consider.
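To illustrate how normalization becomes just one step among several, here is a rough sketch of a blacklist check that combines normalization, case-insensitivity, and substring search. The helper names `canonical` and `is_blacklisted` and the blacklist entry are hypothetical, `str.casefold` assumes Python 3.3 or later, and transliteration, regular expressions, and stemming are left out.

```python
import unicodedata

def canonical(text):
    """Bring text to a comparable form: NFKC, case-fold, then NFKC again."""
    folded = unicodedata.normalize('NFKC', text).casefold()
    # Case folding can occasionally undo normalization, so normalize once more.
    return unicodedata.normalize('NFKC', folded)

def is_blacklisted(candidate, blacklist):
    """Return True if any canonicalized entry occurs inside the candidate."""
    haystack = canonical(candidate)
    return any(canonical(entry) in haystack for entry in blacklist)

# Hypothetical entry, stored in precomposed form.
blacklist = ['\N{LATIN SMALL LETTER O WITH DIAERESIS}l']

# Upper case and a decomposed diaeresis still match the entry.
hit = is_blacklisted('Motor O\N{COMBINING DIAERESIS}L', blacklist)
miss = is_blacklisted('motor oil', blacklist)
```

Whether NFKC is the right form depends on the application; it also folds compatibility characters such as fullwidth letters, which may or may not be wanted.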
Regards, Martin