Issue 14045: In regex pattern long unicode character isn't recognized by repetition characters +, * and {}
Created on 2012-02-18 03:13 by py.user, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (3) | ||
---|---|---|
msg153629 - (view) | Author: py.user (py.user) * | Date: 2012-02-18 03:13 |
    >>> import re
    >>> '\U00000061'
    'a'
    >>> '\U00100061'
    '\U00100061'
    >>> re.search('\U00100061', '\U00100061' * 10).group()
    '\U00100061'
    >>> re.search('\U00100061+', '\U00100061' * 10).group()
    '\U00100061'
    >>> re.search('(\U00100061)+', '\U00100061' * 10).group()
    '\U00100061\U00100061\U00100061\U00100061\U00100061\U00100061\U00100061\U00100061\U00100061\U00100061'
    >>>
    >>>
    >>> re.search('\U00100061{3}', '\U00100061' * 10)
    >>> re.search('(\U00100061){3}', '\U00100061' * 10).group()
    '\U00100061\U00100061\U00100061'
    >>>
msg153630 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-02-18 03:26 |
The re module doesn't support non-BMP characters in Python 3.2 compiled in narrow mode (sys.maxunicode == 65535). This issue is already fixed in Python 3.3, which doesn't have narrow or wide mode anymore thanks to PEP 393!
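The behaviour in the report can be reproduced on a current Python by spelling out the two UTF-16 code units that a narrow build stored for one non-BMP character. This is only a sketch of the likely mechanism, not the code path of the 3.2 `re` module: the helper `utf16_units` is a hypothetical name introduced here, and the assumption is that the quantifier binds to the last code unit (the low surrogate) rather than to the whole character.

```python
import re

def utf16_units(ch):
    """Return the UTF-16 surrogate pair for one non-BMP character as a 2-char str.

    Hypothetical helper: it imitates how a narrow build stored the character.
    """
    offset = ord(ch) - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return chr(high) + chr(low)

pair = utf16_units('\U00100061')      # two code units standing in for one character
text = pair * 10

# 'pair+' is really <high><low>+, so only a single pair can match.
plus_match = re.search(pair + '+', text).group()

# 'pair{3}' is really <high><low>{3}, which never occurs in the text.
exact_match = re.search(pair + '{3}', text)

# Grouping makes the quantifier apply to both code units together.
grouped_match = re.search('(' + pair + '){3}', text).group()
```

Under that assumption the three results line up with the session above: one pair for `+`, no match for `{3}`, and three pairs once the character is wrapped in a group.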
msg153631 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2012-02-18 03:53 |
As Victor says, this issue is fixed in Python 3.3. |
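A quick way to confirm the fix on an interpreter at hand is to rerun the failing patterns and compare the results with the grouped forms. The snippet below is a minimal check, assuming Python 3.3 or later, where `sys.maxunicode` is always 0x10FFFF; the variable names are illustrative only.

```python
import re
import sys

ch = '\U00100061'   # one non-BMP character
text = ch * 10

# On Python 3.3+ there is no narrow build, so this holds on every platform.
full_range = sys.maxunicode == 0x10FFFF

# The quantifiers now apply to the whole character, with no group needed.
plus_match = re.search(ch + '+', text).group()
exact_match = re.search(ch + '{3}', text).group()
```

Here `plus_match` should cover all ten characters and `exact_match` exactly three, the same results the grouped patterns gave in the original report.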
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:26 | admin | set | github: 58253 |
2012-02-18 03:53:35 | loewis | set | status: open -> closed; resolution: fixed; messages: + |
2012-02-18 03:26:19 | vstinner | set | nosy: + loewis, vstinner; messages: + |
2012-02-18 03:13:37 | py.user | create |