[Python-Dev] Python and the Unicode Character Database
Vlastimil Brom vlastimil.brom at gmail.com
Tue Dec 7 14:02:47 CET 2010
2010/12/7 Alexander Belopolsky <alexander.belopolsky at gmail.com>:
On Sat, Dec 4, 2010 at 5:58 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
I actually wonder if Python's re module can claim to provide even Basic Unicode Support.
Do you really wonder? Most definitely it does not. Were you more optimistic four years ago? http://bugs.python.org/issue1528154#msg54864 I was hoping that regex syntax would be useful in explaining/documenting Python text processing routines (including string to number conversions) without a heavy dose of Unicode terminology.
The new regex version http://bugs.python.org/issue2636 supports many more features, including Unicode properties and the POSIX classes mentioned, but definitely not all of the requirements of that rather "generous" list: http://www.unicode.org/reports/tr18/ It seems that other engines, e.g. Perl's, have some omissions too: http://perldoc.perl.org/perlunicode.html#Unicode-Regular-Expression-Support-Level
Do you know of any re engine that fully complies with TR18, even at the first level, "Basic Unicode Support"?
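To illustrate the gap being discussed, here is a minimal sketch using only the standard library. It probes whether the stdlib re module accepts a \p{...} property escape (on the CPython versions I am aware of it does not), and shows how a general-category test can be approximated with unicodedata instead. The helper names supports_property_escapes and chars_in_category are made up for this example; they are not part of re or of the new regex module.

```python
import re
import unicodedata


def supports_property_escapes(pattern=r"\p{L}"):
    """Return True if the stdlib re module compiles a \\p{...} escape.

    Hypothetical helper: it only probes the running interpreter and
    makes no claim about any particular Python version.
    """
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def chars_in_category(text, prefix):
    """Return the characters of text whose Unicode general category
    starts with prefix (e.g. "L" for letters, "N" for numbers).

    This is a rough stand-in for a \\p{L}-style character class,
    built on the Unicode Character Database shipped with Python.
    """
    return [ch for ch in text if unicodedata.category(ch).startswith(prefix)]


if __name__ == "__main__":
    sample = "a\u00e91\u0663 "  # 'a', 'e' with acute, '1', Arabic-Indic digit three, space
    print("re accepts \\p{L}:", supports_property_escapes())
    print("letters:", chars_in_category(sample, "L"))
    print("numbers:", chars_in_category(sample, "N"))
```

This covers only general categories; it says nothing about the other TR18 level 1 requirements, such as set subtraction and intersection or simple word boundaries.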
vbr