[Python-Dev] Misc re.match() complaint

Terry Reedy tjreedy at udel.edu
Wed Jul 17 11:05:50 CEST 2013


On 7/17/2013 12:15 AM, Stephen J. Turnbull wrote:

Terry Reedy writes:
 > On 7/15/2013 10:20 PM, Guido van Rossum wrote:
 >
 > >> Or is this something deeper, that a group is a new object in
 > >> principle?
 > >
 > > No, I just think of it as returning "a string"
 >
 > That is exactly what the doc says it does. See my other post.

The problem is that IIUC '"a string"' is intentionally not referring to the usual "str or bytes objects" (at least that's one of the standard uses for scare quotes, to indicate an unusual usage).

There are no 'scare quotes' in the doc. I put quote marks on things to indicate that I was quoting. I do not know how Guido regarded his marks.

Either the docstring is using "string" in a similarly ambiguous way, or else it's incorrect under the interpretation that buffer objects are not "strings", so they should be inadmissible as targets.

Saying that input arguments can be "Unicode strings as well as 8-bit strings" (the wording is from 2.x, carried over to 3.x) does not necessarily exclude other inputs. CPython is sometimes more permissive than the doc requires. If the doc said str, bytes, bytearray, or memoryview, then other implementations would have to do the same to be conforming. I do not know if that is intended or not.

The question is whether CPython should be just as permissive as to the output types of .group(). (And what, if any requirement should be imposed on other implementations.)
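One way to see what is at stake is to match the same pattern against several kinds of target and look at the type that .group() hands back. This is only a rough sketch: the helper name group_type and the sample targets are made up for illustration, and the answer depends on the interpreter version being run.

```python
import re


def group_type(target):
    """Return the type of match.group() for a simple match on target."""
    # A str target needs a str pattern; bytes-like targets need a bytes pattern.
    pattern = r"[a-z]+" if isinstance(target, str) else rb"[a-z]+"
    match = re.match(pattern, target)
    return type(match.group())


# One str target and three bytes-like targets holding the same data.
targets = ["abc", b"abc", bytearray(b"abc"), memoryview(b"abc")]
for target in targets:
    print(type(target).__name__, "->", group_type(target).__name__)
```

On 3.4 and later I would expect the three bytes-like targets to all report bytes; on earlier 3.x releases the bytearray case, at least, could report the input's own type, which is the permissiveness being questioned here.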

Something should be fixed, and I suppose it should be the return type of group().

BTW, I suggest that Terry's usage of "string" (to mean "str or bytes" in 3.x, "unicode or str" in 2.x) be adopted, and Guido's "stringish"

This word is an adjective, not a noun.

be given expanded meaning, including buffer objects. Then we can say informally that in searching and matching a target is a stringish, the pattern is a stringish (?) or compiled re, but the group method returns a string.

Guido's idea to fix (tighten up) the output in 3.4 is fine with me.

-- Terry Jan Reedy


