msg226570 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-09-08 11:07 |
Currently the re module accepts octal escapes from \400 to \777, but ignores the highest bit.

>>> re.search(r'\542', 'abc')
<_sre.SRE_Match object; span=(1, 2), match='b'>

This behavior is surprising and is inconsistent with the regex module, which preserves the highest bit. Such escapes are not portable across different regular expression engines. I propose adding a warning when an octal escape value is larger than 0o377. Here is a preliminary patch which adds a UserWarning. Or maybe it would be better to emit a DeprecationWarning and then replace it with a ValueError in future releases? |
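A minimal sketch of the arithmetic behind the surprising match, assuming the old parser simply kept the low 8 bits of the parsed value (the "& 0xff" masking is mentioned later in this thread). The helper name `masked_octal_escape` is hypothetical and is not part of the re module.

```python
def masked_octal_escape(digits: str) -> str:
    """Return the character an octal escape would denote if only the
    low 8 bits of its value were kept (hypothetical helper)."""
    value = int(digits, 8)    # '542' -> 0o542 == 354
    return chr(value & 0xFF)  # 354 & 0xFF == 98 == 0o142 == ord('b')


print(masked_octal_escape("542"))  # b
```

Under that assumption, \542 and \142 denote the same character, which is why the pattern above matched 'b'.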
|
|
msg226798 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2014-09-11 19:20 |
I think we should simply raise ValueError in 3.5. There's no reason to accept such invalid escapes. |
|
|
msg226801 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-09-11 20:34 |
Well, here is a patch which makes re raise an exception on suspicious octals. |
|
|
msg226809 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2014-09-12 07:36 |
re_octal_escape_overflow_raise.patch: you should write a subfunction to avoid repeating the error message 3 times.

+ if c > 0o377:

Hum, I never use octal. 255 instead of 0o377 would be less surprising :-p

By the way, you should also check for negative numbers.

>>> -3 & 0xff
253

Before, "& 0xff" also converted negative numbers to positive numbers in the range 0..255. |
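The masking effect mentioned above can be checked directly. This is only an illustration of how Python's `&` operator treats integers of either sign; the function name `low_byte` is made up for the example.

```python
def low_byte(n: int) -> int:
    """Keep only the low 8 bits of n; the result is always in 0..255."""
    return n & 0xFF


# Negative and oversized inputs are all folded into 0..255.
for n in (-3, -1, 255, 256, 0o542):
    print(n, "->", low_byte(n))
```

A plain `c > 0o377` comparison, unlike the mask, leaves a negative value untouched, which is the concern raised here.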
|
|
msg226826 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-09-12 16:29 |
> By the way, you should also check for negative numbers.

Not in this case. You can't construct a negative number from three octal digits. |
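A quick enumeration supports this point. It is a sketch that assumes the escape consists of exactly three octal digits, as in the \400 to \777 range discussed in this issue; the variable names are illustrative.

```python
import itertools

OCTDIGITS = "01234567"

# Every value that three octal digits can spell.
values = [int("".join(digits), 8)
          for digits in itertools.product(OCTDIGITS, repeat=3)]

print(min(values), max(values))  # 0 511

# Values that do not fit in one byte, i.e. the escapes \400 .. \777.
out_of_range = [v for v in values if v > 0o377]
print(len(out_of_range))  # 256
```

The smallest possible value is 0, so only the upper bound needs checking.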
|
|
msg227036 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-09-18 10:03 |
Warning or exception? This is a question. |
|
|
msg227039 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2014-09-18 12:44 |
> Warning or exception? This is a question.

Using -Werror, warnings raise exceptions :-) |
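The effect of -Werror can be approximated inside a program with the warnings module. The sketch below uses a stand-in function and a made-up warning message, not the actual patch.

```python
import warnings


def risky():
    # Stand-in for code that would warn about an out-of-range escape
    # (hypothetical message, for illustration only).
    warnings.warn("octal escape value outside of range 0-0o377", UserWarning)
    return "continued"


# With warnings merely recorded, execution continues past the warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = risky()


def run_strict() -> str:
    """Run risky() with warnings turned into exceptions, like -Werror."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            risky()
        except UserWarning as exc:
            return f"raised: {exc}"
    return "no exception"


print(result, len(caught))
print(run_strict())
```

The difference is that -Werror is opt-in for whoever runs the program, whereas an exception raised by re itself applies to everyone.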
|
|
msg227040 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2014-09-18 13:17 |
This is an error, so it should really be an exception. There's no use case for being lenient, IMO. |
|
|
msg227238 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-09-21 20:50 |
If this is an error, should the patch be applied to the maintained releases? |
|
|
msg227386 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-09-23 20:26 |
New changeset 3b32f495fb38 by Serhiy Storchaka in branch 'default': Issue #22362: Forbidden ambiguous octal escapes out of range 0-0o377 in https://hg.python.org/cpython/rev/3b32f495fb38 |
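A small check of the behavior after this change, assuming an interpreter that includes it (the discussion above targets 3.5). The helper name `rejects_octal_overflow` is hypothetical; `re.compile` and `re.error` are the standard re module names.

```python
import re


def rejects_octal_overflow(pattern: str) -> bool:
    """Return True if compiling the pattern raises re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return True
    return False


# An escape above 0o377 is expected to be rejected on such interpreters.
print(rejects_octal_overflow(r"\542"))
# An in-range escape still compiles and matches 'b'.
print(rejects_octal_overflow(r"\142"))
print(re.search(r"\142", "abc").span())
```

On older interpreters the first call would presumably report False, since the pattern was still accepted there.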
|
|
msg227387 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2014-09-23 20:28 |
Thanks Antoine and Victor for the review. |
|
|