Letting RegExp method return something iterable?
At the moment, the following two methods abuse regular expressions as iterators (if the /g flag is set):
RegExp.prototype.test()
RegExp.prototype.exec()
Would it make sense to create similar methods that return something iterable, so that for-of can iterate over the result?
Axel
-- Dr. Axel Rauschmayer axel at rauschma.de
home: rauschma.de twitter: twitter.com/rauschma blog: 2ality.com
Well, obviously it doesn’t make much sense to do that for test(), but it would be great to have for exec().
An example to make things clearer (thanks for the suggestion, Domenic):
// If exec() is invoked on a regular expression whose flag /g is set
// then the regular expression is abused as an iterator:
// Its property `lastIndex` tracks how far along the iteration is
// and must be reset. It also means that the regular expression can’t be frozen.
var regexES5 = /<(.*?)>/g;
function extractTagNamesES5(str) {
    regexES5.lastIndex = 0; // to be safe
    var results = [];
    while (true) {
        var match = regexES5.exec(str);
        if (!match) break;
        results.push(match[1]);
    }
    return results;
}
// Called after the definitions, so that regexES5 is already initialized.
console.log(extractTagNamesES5('<a> and <b> or <c>')); // [ 'a', 'b', 'c' ]

// If we had a method `execMultiple()` that returns an iterable,
// the above code would become simpler in ES6.
const REGEX_ES6 = /<(.*?)>/; // no need to set flag /g
function extractTagNamesES6a(str) {
    let results = [];
    for (let match of REGEX_ES6.execMultiple(str)) {
        results.push(match[1]);
    }
    return results;
}
// Even shorter:
function extractTagNamesES6b(str) {
    return Array.from(REGEX_ES6.execMultiple(str), x => x[1]);
}
// Shorter yet:
function extractTagNamesES6c(str) {
    return [ for (x of REGEX_ES6.execMultiple(str)) x[1] ];
}
gist.github.com/rauschma/6330265
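Editorial sketch: the `execMultiple()` method in the example above is hypothetical. One way to approximate it is a free-standing generator function built on `exec()`. The name `execMultiple`, its argument order, and the decision to clone the regular expression are assumptions made for illustration, not part of any proposal text.

```javascript
// Hypothetical helper (not a built-in): yields every match of `regex` in `str`.
function* execMultiple(regex, str) {
  // Work on a private global copy so the caller's lastIndex is never touched.
  const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
  const copy = new RegExp(regex.source, flags);
  let match;
  while ((match = copy.exec(str)) !== null) {
    yield match;
    // Step past zero-length matches to avoid looping forever.
    if (match[0] === '') copy.lastIndex++;
  }
}

function extractTagNames(str) {
  return Array.from(execMultiple(/<(.*?)>/, str), m => m[1]);
}

console.log(extractTagNames('<a> and <b> or <c>')); // [ 'a', 'b', 'c' ]
```

Because the helper iterates over its own copy, the regular expression passed in needs neither the /g flag nor a `lastIndex` reset.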
This is really nice. lastIndex silliness in ES5 has bitten me quite a few times, and the example code shows how much better this would be. I hope someone on TC39 wants to champion this!
You don't need to reset lastIndex to zero if you don't break out of the loop early, unless you are sharing that regexp with some other part of the code that you don't control.
What I am saying is that the example is very wrong as it is, since there's no way to get unsafe regexES5 behavior in there.
Moreover, if anyone uses the g flag with test(), something is wrong, unless those properties you are complaining about, or that somebody might find silly, are actually used in a clever way that does not require the creation of Arrays and garbage, and that tracks the position of each complex operation, allowing incremental parsers based on substrings to keep going and do a fast job without bothering RAM and/or the GC too much.
Long story short, I don't see any real use case or any concrete advantage in those examples, so please make it clearer what problem you are trying to solve and how these methods will concretely make our life easier ^_^
So far, and from what I can tell, if you really need that Array you can probably go faster by simply abusing replace.
var re = /<(.*?)>/g; // also bad regexp for tags
                     // it grabs attributes too
function addMatch($0, $1) {
    this.push($1);
}
function createMatches(str) {
    var matches = [];
    str.replace(re, addMatch.bind(matches));
    return matches;
}
The above trick also scales better if you need to push more than one match from the RegExp.
We might need a way to do a similar thing without abusing other methods, but I keep wondering whether this is needed so much that it should be in core (a one-line operation like your last one is ... anyone who needs that could implement it without problems ;-))
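Editorial sketch: the caveat above ("if you don't break out of the loop early") can be seen with a shared /g regular expression that is used for a single `exec()` call. The function name `firstTag` is hypothetical and only serves the illustration.

```javascript
var re = /<(.*?)>/g;

// Hypothetical helper: returns only the first tag name, so the implicit
// iteration over `re` is abandoned with lastIndex left pointing mid-string.
function firstTag(str) {
  var match = re.exec(str);
  return match && match[1];
}

console.log(firstTag('<a> and <b>')); // a
console.log(re.lastIndex);            // 3
console.log(firstTag('<c>'));         // null: the search started at index 3
console.log(re.lastIndex);            // 0: a failed match resets lastIndex
```

The second call misses `<c>` only because of state left behind by the first one, which is the situation the "to be safe" reset in the ES5 example guards against.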
Regards
On Mon, Aug 26, 2013 at 1:05 PM, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
Long story short, I don't see any real use/case or any concrete advantage with those examples so please make it more clear what's the problem you are trying to solve and how these methods will concretely make our life easier ^_^
Having a way to get an iterable over all the matches is highly useful. It is common to use such a construct in Python.
for m in re.finditer(r"\w+", "aa b ccc"):
    print(m.group(0))
for (let m of 'aa b ccc'.matchAll(/(\w+)/)) {
    print(m[0]);
}
-- erik
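Editorial sketch: a method with this name, `String.prototype.matchAll`, was standardized later (ES2020). Unlike the call as written in the message above, the standardized method throws a TypeError when the regular expression lacks the /g flag. The wrapper name `words` is hypothetical, and the snippet assumes an engine that implements `matchAll`.

```javascript
// Hypothetical wrapper around the standardized String.prototype.matchAll.
function words(str) {
  const result = [];
  // The /g flag is mandatory here; matchAll rejects non-global regexes.
  for (const m of str.matchAll(/\w+/g)) {
    result.push(m[0]);
  }
  return result;
}

console.log(words('aa b ccc')); // [ 'aa', 'b', 'ccc' ]
```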
Is it very useful because you wrote for instead of while?
while (m = re.exec(str))
    console.log(m[0])
;
I don't really see any concrete advantage, sorry, but maybe it's just me not liking this new "iterable all the things" trend at all.
Andrea Giammarchi wrote:
Is it very useful because you wrote for instead of while ?
while (m = re.exec(str)) console.log(m[0]) ;
It is, for two reasons:
- in JS only for can have a let or var binding in the head.
- the utility extends to all for-of variations: array comprehensions, generator expressions.
/be
{let m; while(m = re.exec(str)) {
    // ... no, really
}}
I don't get the need for this, but if this is the trend then String#split needs an iterable too (no!)
String#split already is iterable because it returns an array. What it isn't is lazy.
To be equivalent to the for code, the let needs to go inside the body of the while, not outside. This neatly demonstrates the key point:
- as it stands, writing this kind of code tends to be bug prone (i.e. people get it wrong in confusing ways)
- it would be less bug prone if there was just a method that returned an iterable. That could be an Array, rather than a lazy collection.
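Editorial sketch: the difference between declaring the variable outside the while and inside its body shows up as soon as a closure captures it. The function names below are hypothetical. The snippet relies on the per-iteration binding semantics that ES6 eventually specified for `let`.

```javascript
function sharedBinding(str) {
  const re = /\w+/g;
  const getters = [];
  let m; // one binding for the whole loop
  while ((m = re.exec(str)) !== null) {
    getters.push(() => m);
  }
  // The loop ended by assigning null to the single shared `m`.
  return getters.map(get => get());
}

function perIterationBinding(str) {
  const re = /\w+/g;
  const getters = [];
  let next;
  while ((next = re.exec(str)) !== null) {
    let m = next; // fresh binding on every pass, as with for (let m of ...)
    getters.push(() => m[0]);
  }
  return getters.map(get => get());
}

console.log(sharedBinding('aa b ccc'));       // [ null, null, null ]
console.log(perIterationBinding('aa b ccc')); // [ 'aa', 'b', 'ccc' ]
```

Even the second version still needs a helper variable outside the loop, which is part of why a method returning an iterable reads more simply.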
Forbes Lindesay wrote:
String#split already is iterable because it returns an array. What it isn't is lazy.
To be equivalent to the for code, the let needs to go inside the body of the while, not outside. This neatly demonstrates the key point:
- as it stands, writing this kind of code tends to be bug prone (i.e. people get it wrong in confusing ways)
- it would be less bug prone if there was just a method that returned an iterable. That could be an Array, rather than a lazy collection.
Spot on.
Note the Algol family includes languages that allow bindings in if (), while (), etc. heads as well as for () -- C++ for example. We talked about extending JS this way but for unclear reasons deferred.
/be
To be really honest, most people will get it wrong regardless, since thanks to JSLint and friends they are used to declaring everything at the top, and they have probably forgotten that for accepts var declarations.
I've never ever needed this syntax and I don't see the miracle. I am sure somebody will use it one day, but I am just worried that resources will be wasted adding methods nobody has needed that much until now, instead of focusing on things that we cannot write in one line of code as Axel did.
Just my 2 cents on this topic, no hard feelings on the proposal itself.
one lazy hilarious thought on that though ...
On Mon, Aug 26, 2013 at 5:30 PM, Forbes Lindesay <forbes at lindesay.co.uk> wrote:
String#split already is iterable because it returns an array. What it isn't is lazy.
It's straightforward to make String#split(re) lazy using lastIndex, indeed:
function* lazySplit(str, re) {
    // re needs the /g flag so that test() advances re.lastIndex
    for (var i = 0; re.test(str); i = re.lastIndex) {
        yield str.slice(i, re.lastIndex - RegExp.lastMatch.length);
    }
    yield str.slice(i);
}
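Editorial sketch: the same idea can be written without the legacy static `RegExp.lastMatch` by using the `index` of each `exec()` result. The name `lazySplitByIndex` is hypothetical. Cloning the regular expression and skipping empty matches are assumptions of this sketch, so its results differ from `String#split` for patterns that can match the empty string.

```javascript
// Hypothetical variant of lazySplit that reads match.index instead of RegExp.lastMatch.
function* lazySplitByIndex(str, re) {
  // Use a private global copy so the caller's lastIndex is left alone.
  const flags = re.flags.includes('g') ? re.flags : re.flags + 'g';
  const copy = new RegExp(re.source, flags);
  let start = 0;
  let match;
  while ((match = copy.exec(str)) !== null) {
    if (match[0] === '') {
      copy.lastIndex++; // skip empty matches instead of looping forever
      continue;
    }
    yield str.slice(start, match.index);
    start = copy.lastIndex;
  }
  yield str.slice(start);
}

console.log([...lazySplitByIndex('a, b,c', /,\s*/)]); // [ 'a', 'b', 'c' ]

// Laziness: only the first separator is searched for here.
console.log(lazySplitByIndex('a, b,c', /,\s*/).next().value); // a
```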
There, Best Regards
On 27 Aug 2013, at 01:23, Brendan Eich <brendan at mozilla.com> wrote:
It is, for two reasons:
- in JS only for can have a let or var binding in the head.
- the utility extends to all for-of variations: array comprehensions, generator expressions.
There is a third reason. The syntax:
for (let m of re.execAll(str)) {
    // ...
}
has the clear advantage of expressing the intention of the programmer, and nothing more. It does not require good knowledge of the details of the language to understand what happens.
Indeed, when I read while(m = re.exec(str)), I really have to analyse the following additional points:
- = is not a typo for == (here, some annotation would be useful);
- RegExp#exec returns a falsy value if and only if there is no more match;
- re has its global flag set, and its .lastIndex property has not been disturbed.
All these tricks are unrelated to the intention of the programmer, and are just distracting points, especially for any reader who only occasionally uses RegExp#exec with the global flag set.
In summary, citing [1]: "Don’t be clever, don’t make me think."
—Claude
[1] http://www.2ality.com/2013/07/meta-style-guide.html
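Editorial sketch: the last point in the list above (global flag set, `lastIndex` undisturbed) can be checked with a bounded loop. The helper `countMatches` and its `limit` parameter are hypothetical. The limit exists only so that the non-global case terminates.

```javascript
// Hypothetical helper: counts loop iterations, giving up after `limit` rounds.
function countMatches(re, str, limit) {
  let count = 0;
  let m;
  while ((m = re.exec(str)) !== null && count < limit) {
    count++;
  }
  return count;
}

console.log(countMatches(/b/g, 'abcb', 100)); // 2

// Without /g, exec() always restarts at index 0 and finds the same match,
// so only the artificial limit ends the loop.
console.log(countMatches(/b/, 'abcb', 100)); // 100

// A disturbed lastIndex silently skips earlier matches.
const disturbed = /b/g;
disturbed.lastIndex = 2;
console.log(countMatches(disturbed, 'abcb', 100)); // 1
```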
Sure, you know everything as soon as you read of ... right? How objective are your points? If you know JS, that while looks very simple, IMO.
On Aug 27, 2013, at 9:42 AM, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
sure you know everything as soon as you read of ... right ?
Wrong. The nested assignment is idiomatic in C but not good for everyone (see gcc's warning when not parenthesized in such contexts) due to == and = being so close as to make typo and n00b hazards.
Furthermore, the exogenous binding / hoisting problem is objectively greater cognitive load and bug habitat.
How objectives are your points ? If you know JS that while looks very simple, IMO
Please learn when to fold a losing argument :-|.
/be
Let me rephrase ... I have no idea what this code does, if it is not a syntax error (and for different reasons):
for (let m of re.execAll(str) {
What is of ... will let mark that variable as local? What is returned, and what will m be?
I need to know these things ... this has nothing to do with "Don’t be clever, don’t make me think.", a point which I also don't understand (I have to think about such a statement ... I don't demand that the OCaml language be C-like just because I don't get it).
Anyway, I've already stated my point of view.
Regards
losing argument ... as if assignment within condition has been a real problem except for JSLint ... uhm, I don't think so but I am off this conversation. Already said my point, feel free to (as usual) disagree ^_^
losing argument ... as if assignment within condition has been a real problem except for JSLint ... uhm, I don't think so but I am off this conversation. Already said my point, feel free to (as usual) disagree ^_^
On Tue, Aug 27, 2013 at 9:48 AM, Brendan Eich wrote:
On Aug 27, 2013, at 9:42 AM, Andrea Giammarchi < andrea.giammarchi at gmail.com> wrote:
sure you know everything as soon as you read
of
... right ?Wrong. The nested assignment is idiomatic in C but not good for everyone (see gcc's warning when not parenthesized in such contexts) due to == and = being so close as to make typo and n00b hazards.
Furthermore, the exogenous binding / hoisting problem is objectively greater cognitive load and bug habitat.
How objectives are your points ? If you know JS that while looks very simple, IMO
Please learn when to fold a losing argument :-|.
/be
On Tue, Aug 27, 2013 at 5:24 AM, Claude Pache <claude.pache at gmail.com> wrote:
On 27 August 2013 at 01:23, Brendan Eich wrote:
Andrea Giammarchi wrote:
Is it very useful because you wrote for instead of while ?
while (m = re.exec(str)) console.log(m[0]) ;
It is, for two reasons:
- in JS only for can have a let or var binding in the head.
- the utility extends to all for-of variations: array comprehensions, generator expressions.
/be
On 27 August 2013 at 18:48, Andrea Giammarchi <andrea.giammarchi at gmail.com> wrote:
let me rephrase ... I've no idea what this code does if not a syntax error (and for different reasons)
for (let m of re.execAll(str) {
what is of ... will let mark that variable as local ? what is returned and what will be m ?
I need to know these things ... this has nothing to do with "Don’t be clever, don’t make me think." a point which also I don't understand (I have to think about such statement ... I don't demand Ocaml language to be C like 'cause I don't get it)
Trying to reexplain: for (let m of re.execAll(str)) is a direct, one-to-one translation of the meaning of the programmer into (the expected) EcmaScript 6. But with while (m = re.exec(str)), you exploit a secondary fact about the value of re.exec(str) (it is falsy iff the iteration is over) which is unrelated to the object of the code. (And yes, you have to learn the complete syntax of for/of in order to understand the code, but that is unrelated to the point.)
—Claude
Anyway, I've already commented my point of view.
Regards
Right, my impression is that most of us are in agreement that it would be extremely useful to have a simple way to loop over the list of matches for a regular expression and do something with each one. I don't see why @andrea doesn't see this need (maybe it's not something he's found need to do recently).
I think to move on, it would be useful to consider whether the method should return an array (which would be iterable and also have methods like .map built in) or a custom, lazy iterable (which might be better for efficiency if that laziness were useful, but will have disadvantages like lacking the array prototype methods and presumably failing if you try to loop over it twice).
I'm guessing that code like:
var matches = /foo/.execMultipleLazy('str')
for (let match of matches) {
//do something
}
for (let match of matches) {
//do something
}
for (let match of matches) {
//do something
}
Would go wrong somehow whereas:
var matches = /foo/.execMultipleGreedy('str')
for (let match of matches) {
//do something
}
for (let match of matches) {
//do something
}
for (let match of matches) {
//do something
}
Would work fine?
Yes. This is a standard Python issue - if you want to make sure you can loop over something twice, regardless of whether it's an array or an iterator, just pass it through list() first.
Similarly, in JS you'd just pass it through Array.from() first.
~TJ
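The single-pass behaviour discussed above can be sketched with a generator. Neither execMultipleLazy nor execMultipleGreedy exists; the two free functions below are hypothetical stand-ins, assuming the lazy variant would behave like an ordinary generator object. The helper countLoops is likewise only for illustration.

```javascript
// Hypothetical stand-ins for the proposed methods; neither exists on RegExp.
function* execMultipleLazy(re, str) {
  // Work on a private global copy so the caller's regex is left untouched.
  const flags = re.flags.includes('g') ? re.flags : re.flags + 'g';
  const copy = new RegExp(re.source, flags);
  let m;
  while ((m = copy.exec(str)) !== null) {
    if (m[0] === '') copy.lastIndex++; // step past empty matches
    yield m;
  }
}

function execMultipleGreedy(re, str) {
  return Array.from(execMultipleLazy(re, str));
}

// Loop over the same value three times and record how many items each pass saw.
function countLoops(matches) {
  const counts = [];
  for (let pass = 0; pass < 3; pass++) {
    let n = 0;
    for (const match of matches) n++;
    counts.push(n);
  }
  return counts;
}

const lazyCounts = countLoops(execMultipleLazy(/foo/, 'foo foo foo'));
const greedyCounts = countLoops(execMultipleGreedy(/foo/, 'foo foo foo'));
console.log(lazyCounts);   // [ 3, 0, 0 ]
console.log(greedyCounts); // [ 3, 3, 3 ]
```

With a generator the second and third loops do not throw; they simply see nothing, which is arguably the more surprising failure mode.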
So you're in favor of returning the Iterable and then having people use Array.from if they need an array?
Similarly, in JS you'd just pass it through Array.from() first.
Additional option:
var arrayOfMatches = [ .../foo/.execMultipleLazy('str') ]
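Both ways of materialising a lazy result can be tried in current engines. As an assumption for this sketch, String.prototype.matchAll (standardised in ES2020, well after this thread) stands in for the hypothetical execMultipleLazy, since it returns a single-pass iterator of match arrays.

```javascript
// String.prototype.matchAll stands in for the hypothetical execMultipleLazy:
// it returns a single-pass iterator of exec-style match arrays.
const viaFrom = Array.from('foo bar foo'.matchAll(/foo/g));
const viaSpread = [...'foo bar foo'.matchAll(/foo/g)];

// Both results are ordinary arrays, so they can be looped over repeatedly
// and have Array.prototype methods such as map.
const indices = viaFrom.map(m => m.index);
console.log(indices);          // [ 0, 8 ]
console.log(viaSpread.length); // 2
```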
-- Dr. Axel Rauschmayer axel at rauschma.de
On Wed, Aug 28, 2013 at 2:12 AM, Forbes Lindesay <forbes at lindesay.co.uk> wrote:
a simple way to loop over the list of matches for a regular expression
it's about 10 years or more we have that .. so to make my very personal statement clear:
I've got 99 problems in JS, make everything an iterator ain't one
Specially when the need comes out of an example and code that does not fully understand or use what's available since ever in JS, specially for something that has never been a real-world problem (but please tell me how better your app would have been otherwise)
for(let m of lazySplit('a.b.c', /\./g))
taking my example and prioritize something else since generators give already us the ability to do that? ^_^
Or maybe not, fine for me.
Right, I don't care whether it's lazy. I only care that it exists. Nobody's crying out for a lazy version of string split (at least not yet anyway). I have had the issue of needing to loop over all the matches that a regular expression has. It is a common, recurring issue that many developers face.
Let's move on from whether it should exist (clearly it should) and stick to whether it should be an array, or lazy. Does anyone have a strong opinion either way? The fact that all our regular expression iteration thus far has been lazy to me suggests this probably should be too, but maybe it would be simpler if it returned an array. I really hope someone will chime in on this.
Forbes Lindesay wrote:
Let’s move on from whether it should exist (clearly it should)
Does String.prototype.match not count?
and stick to whether it should be an array, or lazy. Does anyone have a strong opinion either way? The fact that all our regular expression iteration thus far has been lazy to me suggests this probably should be too, but maybe it would be simpler if it returned an array. I really hope someone will chime in on this.
The fact that s.match(/re/g) returns the array of all matches (with captures) sucks some of the oxygen away from any /re/g.execAll(s) proposal.
But String.prototype.match has perlish hair (e.g., those capture groups showing up in the result array). Perhaps we do want execAll (with a better name) just to break down the composite perl4-era legacy into compositional methods.
/be
The advantage of a lazy execAll is the ability to break out of the for..of loop without the need to continue to traverse the input string looking for matches. This is the same advantage that the while(m = re.exec()) has going for it. You can always be greedy by using Array.from or an array comprehension if execAll is lazy, but you are back to using a while loop if execAll is greedy and you want lazy matching, which limits its usefulness in some scenarios.
Ron
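The early-exit advantage can be made visible by counting how often the input is scanned. This is only a sketch: execAllLazy is a hypothetical free function, and its onExec callback exists purely to instrument the example.

```javascript
// Hypothetical lazy execAll, written as a free function for illustration.
function* execAllLazy(re, str, onExec) {
  const flags = re.flags.includes('g') ? re.flags : re.flags + 'g';
  const copy = new RegExp(re.source, flags);
  while (true) {
    onExec();                          // record each scan of the input
    const m = copy.exec(str);
    if (m === null) return;
    if (m[0] === '') copy.lastIndex++; // step past empty matches
    yield m;
  }
}

const input = 'a1 b2 c3 d4 e5';

// Lazy: stop at the first match whose digit is 2.
let lazyCalls = 0;
let found = null;
for (const m of execAllLazy(/[a-z](\d)/, input, () => lazyCalls++)) {
  if (m[1] === '2') { found = m[0]; break; }
}

// Eager: materialise everything first, then search the array.
let eagerCalls = 0;
const all = Array.from(execAllLazy(/[a-z](\d)/, input, () => eagerCalls++));
const foundEager = all.find(m => m[1] === '2')[0];

console.log(found, lazyCalls);       // b2 2
console.log(foundEager, eagerCalls); // b2 6
```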
Good point -- on top of the quasi-redundancy of an eager execAll viz. String.prototype.match, I think this makes a good case for a lazy execAll -- with a much better name.
Candidates: r.iterate(s), r.iterateOver(s), r.execIterator(s) (blech!). Suggest some!
/be
The fact that s.match(/re/g) returns the array of all matches (with captures) sucks some of the oxygen away from any /re/g.execAll(s) proposal.
But String.prototype.match has perlish hair (e.g., those capture groups showing up in the result array).
Really? AFAICT, only the complete matches (group 0) are returned.
-- Dr. Axel Rauschmayer axel at rauschma.de
[...] I think this makes a good case for a lazy execAll -- with a much better name.
Candidates: r.iterate(s), r.iterateOver(s), r.execIterator(s) (blech!). Suggest some!
I think “exec” should be in the name, to indicate that the new method is a version of exec().
Ideas:
- execMulti()
- execIter()
execAll() may not be that bad. It’s not pretty, but it’s fairly easy to guess what it does (if one knows what the normal exec() does).
-- Dr. Axel Rauschmayer axel at rauschma.de
Axel Rauschmayer wrote:
Really? AFAICT, only the complete matches (group 0) are returned.
Sorry, of course you are right -- how soon I forget -- the subgroups show up only in each exec result array, but are dropped from the match result.
So hair on the other side of the coin, if you will. A naive iterator that calls exec would return, e.g., ["ab", "b"] for the first iteration given r and s as follows:
js> r = /.(.)/g
/.(.)/g
js> s = 'abcdefgh'
"abcdefgh"
js> a = s.match(r)
["ab", "cd", "ef", "gh"]
js> b = r.exec(s)
["ab", "b"]
Is this what the programmer wants? If not, String.prototype.match stands ready, and again takes away motivation for an eager execAll.
But programmers wanting exec with submatches could use a lazy form:
js> r.lastIndex = 0
0
js> RegExp.prototype.execAll = function (s) { let m; while (m = this.exec(s)) yield m; }
(function (s) { let m; while (m = this.exec(s)) yield m; })
js> c = [m for (m of r.execAll(s))]
[["ab", "b"], ["cd", "d"], ["ef", "f"], ["gh", "h"]]
/be
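The session above uses the legacy SpiderMonkey generator and array comprehension syntax of the time. A rough equivalent in the syntax that ES2015 later standardised might look like the following. Here execAll is a free function rather than a patch to RegExp.prototype, and like the original it is deliberately naive: it assumes a global regex whose lastIndex is 0 when iteration starts.

```javascript
// Naive lazy iterator over exec results. It relies on re being global and
// on re.lastIndex being 0 when iteration starts, as in the session above.
function* execAll(re, s) {
  let m;
  while ((m = re.exec(s)) !== null) yield m;
}

const r = /.(.)/g;
const s = 'abcdefgh';

const a = s.match(r);         // whole matches only, capture groups dropped
const c = [...execAll(r, s)]; // each entry keeps its capture group

console.log(a);                        // [ 'ab', 'cd', 'ef', 'gh' ]
console.log(c.map(m => [m[0], m[1]])); // [ [ 'ab', 'b' ], [ 'cd', 'd' ], [ 'ef', 'f' ], [ 'gh', 'h' ] ]
```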
Axel Rauschmayer wrote:
– execIter()
Not bad, I think better than execAll, which does not connote return of an iterator, but which does perversely suggest returning an array of all exec results.
execAll() may not be that bad. It’s not pretty, but it’s fairly easy to guess what it does (if one know what the normal exec() does).
(If only!)
/be
I agree that execAll() is not a 100% winner, more like a clean-up of a quirky corner. But exec() in “multi” mode has a surprising amount of pitfalls:
- /g flag must be set
- lastIndex must be 0
- can’t inline the regex, because it is needed as a pseudo-iterator (more of an anti-pattern, anyway, but still)
- side effects via lastIndex may be a problem
All of these would go away with an execAll(). The thing I’m not sure about is how frequently exec() is used that way. String.prototype.match() does indeed cover a lot of use cases. So does String.prototype.replace().
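Some of the pitfalls listed above are easy to reproduce. The helper collect below is a hypothetical name for the classic exec() loop; the missing-/g case is shown with single calls only, because the loop itself would never terminate.

```javascript
// Collect whole matches with the classic exec() loop.
function collect(re, str) {
  const out = [];
  let m;
  while ((m = re.exec(str)) !== null) out.push(m[0]);
  return out;
}

const re = /\d/g;

// Pitfall: lastIndex must be 0. A leftover position silently skips matches.
re.lastIndex = 3;
const skipped = collect(re, '1a2b3c'); // scanning starts at index 3
console.log(skipped);                  // [ '3' ]

// Pitfall: side effects via lastIndex. test() advances the same counter.
re.lastIndex = 0;
re.test('1a2b3c');                     // consumes the first match
const afterTest = collect(re, '1a2b3c');
console.log(afterTest);                // [ '2', '3' ]

// Pitfall: without /g, exec() always restarts at index 0, so the loop in
// collect() would run forever. Two single calls return the same match.
const noG = /\d/;
const repeated = [noG.exec('1a2b3c')[0], noG.exec('1a2b3c')[0]];
console.log(repeated);                 // [ '1', '1' ]
```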
-- Dr. Axel Rauschmayer axel at rauschma.de
Axel Rauschmayer wrote:
- /g flag must be set
- lastIndex must be 0
- can’t inline the regex, because it is needed as a pseudo-iterator (more of an anti-pattern, anyway, but still)
- side effects via lastIndex may be a problem
Anything we do of the execAll/execIter kind had better be immune to the awful Perl4-infused "mutable lastIndex state but only if global" kind. Compositionality required.
The design decision to face is what to do when a global regexp is used. Throw, or ignore its lastIndex?
/be
On Thu, Aug 29, 2013 at 4:13 AM, Brendan Eich <brendan at mozilla.com> wrote:
The design decision to face is what to do when a global regexp is used. Throw, or ignore its lastIndex?
I'd hate to see it throw. Ignoring lastIndex seems friendlier, especially if it were called execAll. It probably shouldn't be called execIter considering exec is already an iterator (even if a bit crazy).
I'd love to be able to send a specific index to the generator, which would be completely equivalent to RegExp.prototype.exec without the lastIndex smell.
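One way the "send an index" idea could look with standard generators, where a value passed to next() becomes the result of the pending yield. This is only a sketch under that assumption; execFrom is a hypothetical name, not a proposed API.

```javascript
// Hypothetical helper: a generator whose next(index) call repositions the
// search. The position lives in a private copy, not in the caller's regex.
function* execFrom(re, str) {
  const flags = re.flags.includes('g') ? re.flags : re.flags + 'g';
  const copy = new RegExp(re.source, flags);
  let m;
  while ((m = copy.exec(str)) !== null) {
    if (m[0] === '') copy.lastIndex++; // step past empty matches
    const position = yield m;          // value passed to next(), if any
    if (typeof position === 'number') copy.lastIndex = position;
  }
}

const it = execFrom(/\d/, 'a1b2c3');
const firstMatch = it.next().value[0]; // search from the start
const again = it.next(0).value[0];     // restart from index 0
const jumped = it.next(4).value[0];    // continue from index 4
const done = it.next().done;           // no digits left
console.log(firstMatch, again, jumped, done); // 1 1 3 true
```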
On Aug 29, 2013, at 1:13 AM, Brendan Eich <brendan at mozilla.com> wrote:
The design decision to face is what to do when a global regexp is used. Throw, or ignore its lastIndex?
I would favor ignoring lastIndex rather than throwing, but to be sure can you clarify what you mean by global regexp?
If we're talking /.../g, then my feeling is that the /g should be ignored -- if you're wanting a regexp iterator for a string (or whatever) I would think that the API would imply that all regexps were intended to be "global".
If we're talking about multiple concurrent iterators with the same regexp/string then it should definitely be ignored :D
Erm.
I'm not sure if that's coherent, but the TLDR is that I favor ignoring all the old side state warts (i would not have iterators update the magic $ properties, etc)
--Oliver
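The behaviour Oliver argues for can be sketched roughly as follows. This is only an illustration, not a proposed specification: the `execAll` name comes from the thread, but the generator-based implementation is an assumption, and it presumes a runtime with ES2015 generators and `RegExp.prototype.flags`. The helper copies the regexp and forces /g on the copy, so it neither reads nor writes the caller's lastIndex.

```javascript
// Hypothetical helper: lazily yields every match of `re` in `str`.
// The caller's regexp is never mutated; its /g flag and lastIndex are ignored.
function* execAll(re, str) {
  // Copy the regexp and force the global flag on the copy only.
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const copy = new RegExp(re.source, flags);
  let match;
  while ((match = copy.exec(str)) !== null) {
    yield match;
    // Empty matches do not advance lastIndex; step past them manually.
    if (match[0] === "") copy.lastIndex++;
  }
}

// Two concurrent iterators over the same regexp do not interfere.
const re = /<(\w+)>/;
re.lastIndex = 7; // stale state that the helper ignores
const first = execAll(re, "<a> and <b>");
const second = execAll(re, "<a> and <b>");
const names = [
  first.next().value[1],
  second.next().value[1],
  first.next().value[1],
];
console.log(names, re.lastIndex);
```

Because each call owns a private copy, the iterators compose freely, which is the property Brendan asks for when he says "Compositionality required."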
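Then you are probably looking for something like this?
String.prototype.matchAll = function (re) {
  // Clone the regexp with /g forced on, so the caller's lastIndex is never touched.
  var copy = new RegExp(
    re.source,
    "g" +
    (re.ignoreCase ? "i" : "") +
    (re.multiline ? "m" : "")
  );
  var a = [];
  var m;
  while ((m = copy.exec(this))) {
    a.push(m);
    // An empty match does not advance lastIndex; step past it to avoid looping forever.
    if (m[0] === "") copy.lastIndex++;
  }
  return a;
};
// example
'abcdefgh'.matchAll(/.(.)/g);
// [
//   ["ab", "b"],
//   ["cd", "d"],
//   ["ef", "f"],
//   ["gh", "h"]
// ]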
Dean Landolt wrote:
I'd hate to see it throw. Ignoring lastIndex seems friendlier, especially if it were called execAll. It probably shouldn't be called execIter considering exec is already an iterator (even if a bit crazy).
'exec' is not an iterator in any well-defined ES6 sense.
Yes, ok -- I blew it by trying to emulate Perl 4. The jargon there was "list" vs "scalar" context, not "iterator".
I'd love to be able to send a specific index to the generator, which would be completely equivalent to RegExp.prototype.exec without the lastIndex smell.
Why an index? Rarely have I seen anyone assign other than constant 0 to lastIndex.
/be
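For readers unsure what "sending an index to the generator" would look like, here is a rough sketch. It relies on the ES2015 rule that a value passed to `next()` becomes the result of the pending `yield` expression. The `execFrom` name and its exact semantics are assumptions made for illustration; nothing like it was specified in the thread.

```javascript
// Hypothetical generator: yields matches of `re` in `str`. A number passed to
// next() restarts the search at that index, replacing manual lastIndex writes.
function* execFrom(re, str) {
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const copy = new RegExp(re.source, flags); // private state; caller's regexp untouched
  let match;
  while ((match = copy.exec(str)) !== null) {
    if (match[0] === "") copy.lastIndex++; // avoid stalling on empty matches
    const sent = yield match;              // value passed to the following next()
    if (typeof sent === "number") copy.lastIndex = sent;
  }
}

const it = execFrom(/\d+/, "10 20 30");
const a = it.next().value[0];  // first match, searching from index 0
const b = it.next(0).value[0]; // rewind: search again from index 0
const c = it.next(6).value[0]; // jump ahead: search from index 6
console.log(a, b, c);          // 10 10 30
```

Whether this is worth the extra API surface is exactly Brendan's question: if callers almost always start from 0, a plain iterator covers the common case.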
Oliver Hunt wrote:
I would favor ignoring lastIndex rather than throwing, but to be sure can you clarify what you mean by global regexp?
One created with the 'g' flag, either literally (/re/g) or via the constructor (new RegExp(src, 'g')).
If we're talking /.../g, then my feeling is that the /g should be ignored -- if you're wanting a regexp iterator for a string (or whatever) I would think that the API would imply that all regexps were intended to be "global".
Agreed, if we don't just throw from execAll on a global regexp ;-).
If we're talking about multiple concurrent iterators with the same regexp/string then it should definitely be ignored :D
IOW, in general, 'g' should be ignored by new APIs.
Erm.
I'm not sure if that's coherent, but the TLDR is that I favor ignoring all the old side state warts (i would not have iterators update the magic $ properties, etc)
Yes, agreed. The devil is in the details.
/be
It seems that some people agreed on adding RegExp#execAll to ECMAScript four years ago. What happened to it after that?
topic: https://esdiscuss.org/topic/letting-regexp-method-return-something-iterable
implementation: https://www.npmjs.com/package/regexp.execall
There is a String#matchAll proposal in stage 1.
https://github.com/tc39/String.prototype.matchAll
Oriol
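For reference, the proposal Oriol mentions was later standardized (ES2020). The sketch below assumes a runtime that ships the built-in `String.prototype.matchAll` and reuses the tag-name example from the start of the thread; the variable names are illustrative only.

```javascript
// Uses the built-in String.prototype.matchAll (ES2020); no helper is defined here.
const tagRe = /<(\w+)>/g;
const tagNames = [];
for (const match of "<a> and <b> or <c>".matchAll(tagRe)) {
  tagNames.push(match[1]); // first capture group
}
console.log(tagNames);        // [ 'a', 'b', 'c' ]
console.log(tagRe.lastIndex); // 0
```

The standardized method iterates over an internal copy of the regexp, so the original's lastIndex is left alone. On the question debated above it took the stricter route: passing a regexp without the /g flag throws a TypeError rather than being silently treated as global.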
@Oriol Thanks for your reply!
On 2017/06/18 21:33, Oriol _ wrote:
There is a String#matchAll proposal in stage 1. https://github.com/tc39/String.prototype.matchAll
Oriol