Issue 35072: re.sub does not play nice with chr(92)
Bug with regex substitutions. Calling re.sub() directly with chr(92), the backslash character (written '\\' in source), as the replacement throws an exception, whereas compiling a regex object and then calling its own .sub() method works completely fine. I did a quick search of the bug tracker for similar issues and found none reported.
Breaks:
re.sub(r'\\\\', chr(92), stringy_thingy)

vs.

Works:
parser = re.compile(r'\\\\')
parser.sub(chr(92), stringy_thingy)

where stringy_thingy is the string being substituted.
I'm assuming you want to replace double backslashes with single backslashes in stringy_thingy, so I defined stringy_thingy and tried both of your snippets, but they both fail:
>>> stringy_thingy = r'foo\\bar\\baz'
>>> print(stringy_thingy)  # stringy_thingy contains double backslashes
foo\\bar\\baz
>>> re.sub(r'\\\\', chr(92), stringy_thingy)  # fails
Traceback (most recent call last):
  ...
  File "/usr/lib/python3.6/sre_parse.py", line 245, in __next
    self.string, len(self.string) - 1) from None
sre_constants.error: bad escape (end of pattern) at position 0

>>> parser = re.compile(r'\\\\')
>>> parser.sub(chr(92), stringy_thingy)  # also fails
Traceback (most recent call last):
  ...
  File "/usr/lib/python3.6/sre_parse.py", line 245, in __next
    self.string, len(self.string) - 1) from None
sre_constants.error: bad escape (end of pattern) at position 0
Replacing chr(92) with r'\\' works for both:

>>> print(re.sub(r'\\\\', r'\\', stringy_thingy))
foo\bar\baz
>>> print(parser.sub(r'\\', stringy_thingy))
foo\bar\baz
The docs [0] say: "repl can be a string or a function; if it is a string, any backslash escapes in it are processed." So passing chr(92) (or '\\', which is equivalent) results in the above error ("bad escape (end of pattern)") because it is seen as an incomplete escape sequence. Passing r'\\' seems to work as intended.
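For anyone who wants to check this locally, here is a minimal sketch of the behavior described above. The input string and the pattern matching two literal backslashes are assumptions taken from my test, and the try_sub helper and the callable replacement are illustrative additions, not part of the original report. It should run on any recent Python 3.

```python
import re

# Assumed test data: a string containing two double backslashes.
stringy_thingy = r'foo\\bar\\baz'
pattern = r'\\\\'  # regex that matches two literal backslashes
parser = re.compile(pattern)


def try_sub(sub_func, repl):
    """Run a substitution and return its result, or the re.error it raised."""
    try:
        return sub_func(repl, stringy_thingy)
    except re.error as exc:
        return exc


def module_sub(repl, text):
    """Module-level call style: re.sub(pattern, repl, text)."""
    return re.sub(pattern, repl, text)


# chr(92) is a lone backslash: an incomplete escape in a replacement
# template, so both call styles are expected to raise re.error.
errors = [try_sub(module_sub, chr(92)), try_sub(parser.sub, chr(92))]

# r'\\' is the template escape for one literal backslash.
raw_result = try_sub(module_sub, r'\\')
compiled_result = try_sub(parser.sub, r'\\')

# A callable replacement is not escape-processed; its return value is
# inserted verbatim, so returning chr(92) is fine here.
func_result = re.sub(pattern, lambda match: chr(92), stringy_thingy)

print(errors)
print(raw_result, compiled_result, func_result)
```

If the replacement text comes from elsewhere and may contain backslashes, passing a function as repl (as in the last call) sidesteps template escape processing entirely.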
ISTM there is no bug and re.sub works as documented. Can you provide a stringy_thingy for which the first of your snippets fails but the second succeeds?