Issue 36158: Regex search behaves differently in list comprehension

Created on 2019-03-01 17:21 by Matthew Drago, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg336936 - Author: Matthew Drago (Matthew Drago) Date: 2019-03-01 17:21
Say, for example, I want to apply a regex to a list of strings. Using a list comprehension as shown here results in the group method not being found.
```
name_regex = compile(r'\[\"([a-zA-Z\s]*)\"{1}')
named_entities = [name_regex.match(entity.trigger).group(1) for entity in entities[0]]
```
This unexpected behavior can also be observed when implementing this with map.
```
list(map(lambda x: name_regex.search(x.trigger).group(), entities[0]))
```
However, with a traditional for loop the group method is resolved.
```
named_entities = []
for entity in entities[0]:
    named_entities.append(name_regex.match(entity.trigger).group(1))
```
msg336943 - Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-03-01 17:55
Can you please add a short script with data for entities to try reproducing this?

>>> from re import compile
>>> name_regex = compile(r'\[\"([a-zA-Z\s]*)\"{1}')
>>> [name_regex.match(a).group(1) for a in ['["a"a]']]
['a']
>>> list(map(lambda a: name_regex.match(a).group(1), ['["a"a]']))
['a']
msg336990 - Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-03-02 04:51
> i want to apply a regex on a list of strings.

The example you give doesn't include a list of strings; it has some unknown "entity" object with an unknown "trigger" attribute. Please refactor the code to remove the use of a class we don't have access to. You may find that entity.trigger does not contain what you think it contains.
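A self-contained reproduction along these lines might look like the sketch below. The `SimpleNamespace` objects are hypothetical stand-ins for the reporter's entity class (only a `trigger` attribute is modelled), and the trigger strings are invented for illustration; the pattern is the one from the report.

```python
import re
from types import SimpleNamespace

# Pattern taken from the original report.
name_regex = re.compile(r'\[\"([a-zA-Z\s]*)\"{1}')

# Hypothetical stand-ins for the reporter's entity objects; only the
# `trigger` attribute is modelled, and the strings are made up.
entities = [[
    SimpleNamespace(trigger='["Alice"]'),
    SimpleNamespace(trigger='no quoted name here'),
]]

# Inspect what each trigger holds and whether the pattern matches it,
# without calling .group() on a possibly failed match.
report = [(e.trigger, name_regex.match(e.trigger) is not None)
          for e in entities[0]]
for trigger, matched in report:
    print(repr(trigger), matched)
```

Printing the raw trigger values next to the match result shows which inputs the pattern rejects.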
msg336999 - Author: Ma Lin (malin) * Date: 2019-03-02 10:36
Just a reminder: the pattern r'"{1}' is the same as r'"'. The {1} only says that the preceding " is repeated exactly once.
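A quick way to check this equivalence is to compile the pattern from the report with and without the `{1}` quantifier and compare the results. This is only a sketch; the sample string is an assumption and does not come from the report.

```python
import re

# Pattern from the report, and the same pattern with the redundant {1} removed.
with_quantifier = re.compile(r'\[\"([a-zA-Z\s]*)\"{1}')
without_quantifier = re.compile(r'\[\"([a-zA-Z\s]*)\"')

sample = '["John Smith"]'  # hypothetical input, not from the report

m1 = with_quantifier.match(sample)
m2 = without_quantifier.match(sample)

# Both patterns capture the same group over the same span.
print(m1.group(1), m2.group(1))
print(m1.span() == m2.span())
```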
msg337141 - Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-03-04 17:28
Sounds like at least one such entity's trigger attribute doesn't match the regex. In the spelled-out loop, you'd still get the exception on a failed match, but you'd store the results for however many entities matched before then (so catching the exception and continuing on would work). List comprehensions are all or nothing; if an exception is raised before it finishes, the list in progress is thrown away.

While wasteful, this should work just fine:

named_entities = [name_regex.match(entity.trigger).group(1) for entity in entities[0] if name_regex.match(entity.trigger)]

or in 3.8 with an assignment expression to avoid repetitive work:

named_entities = [match.group(1) for entity in entities[0] if (match := name_regex.match(entity.trigger))]

The former is wasteful, but works in any Python version; the latter is directly equivalent to:

named_entities = []
for entity in entities[0]:
    match = name_regex.match(entity.trigger)
    if match:
        named_entities.append(match.group(1))

The ultimate problem is that your regex isn't always matching; list comprehensions just change whether or not you store the partial results.
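The difference described here can be reproduced with plain strings. The sketch below uses the pattern from the report; the trigger strings are invented, and the second one is deliberately chosen so that it does not match.

```python
import re

name_regex = re.compile(r'\[\"([a-zA-Z\s]*)\"{1}')

# Hypothetical trigger strings; the second one does not match the pattern.
triggers = ['["Alice"]', 'unmatched', '["Bob"]']

# Explicit loop: raises on the failed match, but earlier results survive.
loop_results = []
try:
    for trigger in triggers:
        loop_results.append(name_regex.match(trigger).group(1))
except AttributeError:
    pass  # match() returned None, which has no .group()

# Comprehension: the same exception discards the list being built.
try:
    comp_results = [name_regex.match(t).group(1) for t in triggers]
except AttributeError:
    comp_results = None

# Skipping failed matches (Python 3.8+; the := expression is parenthesized).
filtered = [m.group(1) for t in triggers if (m := name_regex.match(t))]

print(loop_results, comp_results, filtered)
```

The loop keeps the one result collected before the failure, the comprehension yields nothing, and the filtered version skips the unmatched string.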
History
Date User Action Args
2022-04-11 14:59:11 admin set github: 80339
2019-04-02 17:01:44 josh.r set status: pending -> closedresolution: not a bugstage: resolved
2019-03-06 03:58:31 josh.r set status: open -> pending
2019-03-04 17:28:53 josh.r set status: pending -> opennosy: + josh.rmessages: +
2019-03-04 14:08:42 serhiy.storchaka set status: open -> pending
2019-03-02 10:36:31 malin set nosy: + malinmessages: +
2019-03-02 04:51:07 steven.daprano set nosy: + steven.dapranomessages: +
2019-03-01 17:55:17 xtreak set nosy: + xtreakmessages: +
2019-03-01 17:21:57 Matthew Drago create