[Python-Dev] Make re.compile faster
Serhiy Storchaka storchaka at gmail.com
Tue Oct 3 01:35:52 EDT 2017
03.10.17 06:29, INADA Naoki wrote:
> Before deferring re.compile, can we make it faster?
>
> I profiled `import string` and a small optimization can make it 2x faster!
> (but it's not backward compatible)
Please open an issue for this.
> I found:
>
> * RegexFlag.__and__ and __new__ are called very often.
> * _optimize_charset is slow, because of re.UNICODE | re.IGNORECASE
>
> diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
> index 144620c6d1..7c662247d4 100644
> --- a/Lib/sre_compile.py
> +++ b/Lib/sre_compile.py
> @@ -582,7 +582,7 @@ def isstring(obj):
>
>  def _code(p, flags):
>
> -    flags = p.pattern.flags | flags
> +    flags = int(p.pattern.flags) | int(flags)
>      code = []
>
>      # compile info block
Maybe cast flags to int earlier, in sre_compile.compile()?
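A rough way to see the cost being discussed is to time the same bitwise OR on RegexFlag members and on plain ints. This is only a sketch: the helper names combine_enum and combine_int are made up for illustration, and the absolute numbers depend heavily on the Python version and machine, so no particular ratio is claimed here.

```python
import re
import timeit

# re.IGNORECASE and re.UNICODE are members of re.RegexFlag (an enum.IntFlag).
enum_a, enum_b = re.IGNORECASE, re.UNICODE
# The same bit values as plain ints, converted once up front.
int_a, int_b = int(enum_a), int(enum_b)


def combine_enum():
    # Goes through the RegexFlag operator methods and returns a RegexFlag.
    return enum_a | enum_b


def combine_int():
    # Plain int arithmetic, no enum machinery involved.
    return int_a | int_b


# Both forms carry the same bits, so casting does not change the flag value.
assert int(combine_enum()) == combine_int()

if __name__ == "__main__":
    n = 200_000
    print("RegexFlag | RegexFlag:", timeit.timeit(combine_enum, number=n))
    print("int | int            :", timeit.timeit(combine_int, number=n))
```

Converting once near the entry point, as suggested above, would mean every later flag test in the compiler works on plain ints.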
> diff --git a/Lib/string.py b/Lib/string.py
> index b46e60c38f..fedd92246d 100644
> --- a/Lib/string.py
> +++ b/Lib/string.py
> @@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
>      delimiter = '$'
>      idpattern = r'[_a-z][_a-z0-9]*'
>      braceidpattern = None
> -    flags = re.IGNORECASE
> +    flags = re.IGNORECASE | re.ASCII
>
>      def __init__(self, template):
>          self.template = template
>
> patched:
> import time:      1191 |       8479 | string
>
> Of course, this patch is not backward compatible. [a-z] no longer
> matches 'ı' or 'ſ'. But who cares?
This looks like a bug fix. I'm wondering whether it is worth backporting to 3.6. But the change itself can break user code that changes idpattern without touching flags. There is another way, but it should be discussed on the bug tracker.
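To make the compatibility concern concrete, here is a small sketch using the re module directly rather than string.Template. The pattern `[a-z]+` is a simplified stand-in for idpattern, and the Cyrillic pattern is a hypothetical user override; is_identifier is an illustrative helper, not part of any library.

```python
import re

UNICODE_FLAGS = re.IGNORECASE           # flags before the patch
ASCII_FLAGS = re.IGNORECASE | re.ASCII  # flags after the patch

SIMPLE_PATTERN = r'[a-z]+'  # simplified stand-in for idpattern


def is_identifier(name, flags, pattern=SIMPLE_PATTERN):
    """Return True if the whole of `name` matches `pattern` under `flags`."""
    return re.fullmatch(pattern, name, flags) is not None


# Ordinary ASCII names behave the same with either set of flags.
assert is_identifier('Name', UNICODE_FLAGS)
assert is_identifier('Name', ASCII_FLAGS)

# 'ı' (U+0131) and 'ſ' (U+017F) case-fold to ASCII letters, so they match
# [a-z] under Unicode IGNORECASE but not once re.ASCII is added.
for ch in ('\u0131', '\u017f'):
    assert is_identifier(ch, UNICODE_FLAGS)
    assert not is_identifier(ch, ASCII_FLAGS)

# Hypothetical user override: a lowercase Cyrillic range that relied on
# IGNORECASE to accept uppercase letters. Adding re.ASCII restricts case
# folding to ASCII, so the uppercase name stops matching.
CUSTOM_PATTERN = r'[а-я]+'
assert is_identifier('ПРИВЕТ', UNICODE_FLAGS, CUSTOM_PATTERN)
assert not is_identifier('ПРИВЕТ', ASCII_FLAGS, CUSTOM_PATTERN)
```

The last pair of assertions is the case mentioned above: code that changes only the pattern and inherits the flags would see different matching after the change.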