msg292079 - (view) |
Author: Patrick Foley (Patrick Foley) * |
Date: 2017-04-21 21:18 |
The following code demonstrates the problem:

    import re
    text = 'ab\\'
    exp = re.compile('a')
    print(re.sub(exp, text, ''))

If you remove the backslash(es), the code runs fine. This appears to be specific to the re module, and only to strings that end in backslashes (even properly escaped ones). You could easily receive raw data like this from freehand input sources, so it would be nice not to have to remove trailing backslashes before running a regular expression.
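A minimal sketch of the behaviour described in this report, assuming a recent CPython 3 release. The helper name `try_sub` and the sample string 'banana' are illustrative and not part of the original report; the helper only captures the exception so that the failing and working cases can be compared side by side.

```python
import re

text = 'ab\\'          # three characters: 'a', 'b', and a single backslash
pattern = re.compile('a')


def try_sub(repl, string):
    """Return the substituted string, or the re.error raised by the template.

    Hypothetical helper, used here only to show both outcomes.
    """
    try:
        return pattern.sub(repl, string)
    except re.error as exc:
        return exc


# The trailing backslash is read as the start of an escape sequence in the
# replacement template, so the template is rejected.
outcome = try_sub(text, 'banana')
print(type(outcome).__name__, outcome)

# Without the backslash the same call succeeds.
print(try_sub('ab', 'banana'))
```

The exact wording of the error message may differ between Python versions.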
|
|
msg292080 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-04-21 21:51 |
I think you are missing a re.escape around text. Text is otherwise not a valid replacement pattern. |
|
|
msg292093 - (view) |
Author: Matthew Barnett (mrabarnett) *  |
Date: 2017-04-22 01:17 |
Yes, the second argument is a replacement template, not a literal. This issue does point out a different problem, though: re.escape will add backslashes that will then be treated as literals in the template, for example:

    >>> re.sub(r'a', re.escape('(A)'), 'a')
    '\\(A\\)'

re.escape doesn't always help. The solution here is to pass a replacement function instead:

    >>> re.sub(r'a', lambda m: '(A)', 'a')
    '(A)'
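Applied to the string from the original report, the replacement-function approach might look like the sketch below. The wrapper `literal_sub` is a hypothetical name introduced for illustration; it relies only on the documented behaviour that the return value of a replacement function is inserted as-is, without template processing.

```python
import re

text = 'ab\\'  # the string from the original report, ending in a backslash


def literal_sub(pattern, replacement, string):
    """Substitute `replacement` verbatim for every match (hypothetical helper).

    The string returned by a replacement function is not parsed as a
    template, so backslashes in `replacement` need no escaping.
    """
    return re.sub(pattern, lambda match: replacement, string)


result = literal_sub('a', text, 'xax')
print(repr(result))
```

The same wrapper also leaves sequences such as `\1` untouched, which a plain string template would interpret as a group reference.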
|
|
msg292094 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-04-22 02:18 |
Good point, re.escape is for literal text you want to insert into a matching pattern, but the replacement template isn't a matching pattern. Do we need a different escape function? I guess the function solution is enough? |
|
|
msg292095 - (view) |
Author: Matthew Barnett (mrabarnett) *  |
Date: 2017-04-22 02:54 |
The function solution does have a larger overhead than a literal. Could the template be made more accepting of backslashes without breaking anything? (There's also "re.escape() escapes too much", which might help.) |
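One way to gauge the overhead mentioned here is a rough timing comparison with `timeit`. This is only a sketch: the subject string, the repeat count, and the function names `with_template` and `with_function` are arbitrary choices made for illustration, and the measured numbers will vary by machine and Python version.

```python
import re
import timeit

pattern = re.compile('a')
subject = 'banana' * 100  # arbitrary sample input with 300 matches


def with_template():
    # Replacement given as a string template.
    return pattern.sub('(A)', subject)


def with_function():
    # Replacement given as a function, called once per match.
    return pattern.sub(lambda match: '(A)', subject)


# Both forms must produce the same text; only the cost differs.
assert with_template() == with_function()

for fn in (with_template, with_function):
    seconds = timeit.timeit(fn, number=1000)
    print(f'{fn.__name__}: {seconds:.4f}s for 1000 calls')
```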
|
|
msg292102 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-04-22 04:49 |
re.escape() shouldn't be used for a replacement template. You just need to double the backslashes when escaping a literal string for a replacement template: s.replace('\\', '\\\\'). This should be documented if it is not already.
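A short sketch of the backslash-doubling approach, applied to the string from the original report. The function name `escape_template` is hypothetical and is not part of the re module; it only wraps the `s.replace('\\', '\\\\')` call given above.

```python
import re


def escape_template(s):
    """Double every backslash so `s` is taken literally as a replacement template.

    Hypothetical helper wrapping s.replace('\\', '\\\\').
    """
    return s.replace('\\', '\\\\')


text = 'ab\\'  # ends in a single backslash

# With the backslash doubled, the template yields the original text.
result = re.sub('a', escape_template(text), 'xax')
print(repr(result))

# Group references are neutralised too: the text \1 is inserted literally.
print(repr(re.sub('(a)', escape_template(r'\1'), 'a')))
```

Because the backslash is the only character with special meaning in a replacement template, doubling it is sufficient, and the replacement stays a plain string rather than a function call per match.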
|
|
msg306358 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-11-16 13:25 |
The proper way of escaping the replacement string has been documented by . |
|
|