s/dotAll flag for regular expressions
Stage 4 Draft / January 19, 2018
The algorithm listed in 21.2.3.2.2 is modified as follows.
1 Runtime Semantics: RegExpInitialize ( obj, pattern, flags )
When the abstract operation RegExpInitialize with arguments obj, pattern, and flags is called, the following steps are taken:
- If pattern is undefined, let P be the empty String.
- Else, let P be ? ToString(pattern).
- If flags is undefined, let F be the empty String.
- Else, let F be ? ToString(flags).
- If F contains any code unit other than "g", "i", "m", "s", "u", or "y" or if it contains the same code unit more than once, throw a SyntaxError exception.
- If F contains "u", let BMP be false; else let BMP be true.
- If BMP is true, then
  - Parse P using the grammars in 21.2.1 and interpreting each of its 16-bit elements as a Unicode BMP code point. UTF-16 decoding is not applied to the elements. The goal symbol for the parse is Pattern[~U]. Throw a SyntaxError exception if P did not conform to the grammar, if any elements of P were not matched by the parse, or if any Early Error conditions exist.
  - Let patternCharacters be a List whose elements are the code unit elements of P.
- Else,
  - Parse P using the grammars in 21.2.1 and interpreting P as UTF-16 encoded Unicode code points (6.1.4). The goal symbol for the parse is Pattern[+U]. Throw a SyntaxError exception if P did not conform to the grammar, if any elements of P were not matched by the parse, or if any Early Error conditions exist.
  - Let patternCharacters be a List whose elements are the code points resulting from applying UTF-16 decoding to P's sequence of elements.
- Set obj.[[OriginalSource]] to P.
- Set obj.[[OriginalFlags]] to F.
- Set obj.[[RegExpMatcher]] to the internal procedure that evaluates the above parse of P by applying the semantics provided in 21.2.2 using patternCharacters as the pattern's List of SourceCharacter values and F as the flag parameters.
- Perform ? Set(obj, "lastIndex", 0, true).
- Return obj.
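The flag validation step above can be observed from script. The following is a minimal sketch, assuming an engine that implements this proposal; the helper name acceptsFlags is hypothetical and not part of the specification.

```javascript
// Hypothetical helper: reports whether the RegExp constructor accepts a
// given flags string, treating a SyntaxError as rejection.
function acceptsFlags(flags) {
  try {
    new RegExp("", flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected, so surface it
  }
}

const acceptsS = acceptsFlags("s");         // "s" is a permitted code unit
const acceptsDuplicate = acceptsFlags("ss"); // same code unit twice
const acceptsUnknown = acceptsFlags("x");    // code unit outside the permitted set

// The last steps of RegExpInitialize set lastIndex to 0.
const initialLastIndex = new RegExp("a.b", "s").lastIndex;

console.log(acceptsS, acceptsDuplicate, acceptsUnknown, initialLastIndex);
```

In an engine without support for the proposal, acceptsFlags("s") would be expected to return false, which makes the helper usable as a feature test.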
The section 21.2.2.1 Notation is modified as follows.
2 Notation
The descriptions below use the following variables:
- Input is a List consisting of all of the characters, in order, of the String being matched by the regular expression pattern. Each character is either a code unit or a code point, depending upon the kind of pattern involved. The notation Input[n] means the nth character of Input, where n can range between 0 (inclusive) and InputLength (exclusive).
- InputLength is the number of characters in Input.
- NcapturingParens is the total number of left capturing parentheses (i.e. the total number of times the Atom::(Disjunction) production is expanded) in the pattern. A left capturing parenthesis is any ( pattern character that is matched by the ( terminal of the Atom::(Disjunction) production.
- DotAll is true if the RegExp object's [[OriginalFlags]] internal slot contains "s" and otherwise is false.
- IgnoreCase is true if the RegExp object's [[OriginalFlags]] internal slot contains "i" and otherwise is false.
- Multiline is true if the RegExp object's [[OriginalFlags]] internal slot contains "m" and otherwise is false.
- Unicode is true if the RegExp object's [[OriginalFlags]] internal slot contains "u" and otherwise is false.
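As an illustration of the statement that each character of Input is either a code unit or a code point, the sketch below matches a single astral code point with and without the "u" flag. The variable names are hypothetical; the behaviour shown assumes an engine that supports the "u" and "s" flags.

```javascript
// U+1F4A9 is one code point but two UTF-16 code units.
const astral = "\u{1F4A9}";

// Without "u", Input is a List of code units, so one "." cannot span the string.
const oneDotCodeUnits = /^.$/.test(astral);
const twoDotsCodeUnits = /^..$/.test(astral);

// With "u", Input is a List of code points, so one "." matches the whole string.
const oneDotCodePoints = /^.$/u.test(astral);

// The script-visible accessors mirror the variables defined above.
const sample = new RegExp(".", "ims");
const visibleFlags = {
  dotAll: sample.dotAll,
  ignoreCase: sample.ignoreCase,
  multiline: sample.multiline,
  unicode: sample.unicode,
};

console.log(oneDotCodeUnits, twoDotsCodeUnits, oneDotCodePoints, visibleFlags);
```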
The algorithm listed in 21.2.2.8 Atom is modified as follows.
3 Atom
The production Atom::. evaluates as follows:
- If DotAll is true, then
  - Let A be the set of all characters.
- Otherwise, let A be the set of all characters except LineTerminator.
- Call CharacterSetMatcher(A, false) and return its Matcher result.
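The effect of the modified Atom semantics can be sketched as follows, assuming an engine that parses the "s" flag in regular expression literals. The variable names are illustrative only.

```javascript
const input = "foo\nbar";

// Without "s", "." excludes LineTerminator, so the match fails at "\n".
const matchesWithoutS = /foo.bar/.test(input);

// With "s", "." ranges over all characters, including "\n".
const matchesWithS = /foo.bar/s.test(input);

// The four LineTerminator code points: LF, CR, LS, PS.
const lineTerminators = ["\n", "\r", "\u2028", "\u2029"];
const allMatchWithS = lineTerminators.every((ch) => /^.$/s.test(ch));
const anyMatchWithoutS = lineTerminators.some((ch) => /^.$/.test(ch));

console.log(matchesWithoutS, matchesWithS, allMatchWithS, anyMatchWithoutS);
// prints: false true true false
```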
The algorithm listed in 21.2.5.3 is modified as follows.
4 get RegExp.prototype.flags
RegExp.prototype.flags is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
- Let R be the this value.
- If Type(R) is not Object, throw a TypeError exception.
- Let result be the empty String.
- Let global be ToBoolean(? Get(R, "global")).
- If global is true, append "g" as the last code unit of result.
- Let ignoreCase be ToBoolean(? Get(R, "ignoreCase")).
- If ignoreCase is true, append "i" as the last code unit of result.
- Let multiline be ToBoolean(? Get(R, "multiline")).
- If multiline is true, append "m" as the last code unit of result.
- Let dotAll be ToBoolean(? Get(R, "dotAll")).
- If dotAll is true, append "s" as the last code unit of result.
- Let unicode be ToBoolean(? Get(R, "unicode")).
- If unicode is true, append "u" as the last code unit of result.
- Let sticky be ToBoolean(? Get(R, "sticky")).
- If sticky is true, append "y" as the last code unit of result.
- Return result.
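Two consequences of these steps are visible from script: the result has a fixed order regardless of how the flags were supplied, and the getter is generic because it only performs Get on the this value. The sketch below assumes an engine that implements this proposal; the variable names are hypothetical.

```javascript
// Flags supplied in reverse order still come back in the order the steps append them.
const shuffled = new RegExp("a", "yusmig");
const canonicalFlags = shuffled.flags;

// The getter only reads properties, so it also works on a plain object.
// Property values go through ToBoolean, so any truthy value counts.
const flagsGetter = Object.getOwnPropertyDescriptor(RegExp.prototype, "flags").get;
const flagsFromPlainObject = flagsGetter.call({ global: 1, dotAll: "yes" });

// A non-Object this value is rejected with a TypeError.
let nonObjectThrows = false;
try {
  flagsGetter.call(1);
} catch (err) {
  nonObjectThrows = err instanceof TypeError;
}

console.log(canonicalFlags, flagsFromPlainObject, nonObjectThrows);
// prints: gimsuy gs true
```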
The following new section is added before 21.2.5.10 get RegExp.prototype.source.
5 get RegExp.prototype.dotAll
RegExp.prototype.dotAll is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
- Let R be the this value.
- If Type(R) is not Object, throw a TypeError exception.
- If R does not have an [[OriginalFlags]] internal slot, then
  - If SameValue(R, %RegExpPrototype%) is true, return undefined.
  - Otherwise, throw a TypeError exception.
- Let flags be R.[[OriginalFlags]].
- If flags contains the code unit "s", return true.
- Return false.
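Each branch of the getter can be exercised from script. This is a minimal sketch, assuming an engine that implements this proposal; describeDotAll is a hypothetical helper that turns the outcome into a string for easy comparison.

```javascript
const dotAllGetter = Object.getOwnPropertyDescriptor(RegExp.prototype, "dotAll").get;

// Hypothetical helper: calls the getter with an arbitrary this value and
// reports either the stringified result or the fact that a TypeError was thrown.
function describeDotAll(thisValue) {
  try {
    return String(dotAllGetter.call(thisValue));
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

const outcomes = [
  describeDotAll(/a/s),             // [[OriginalFlags]] contains "s"
  describeDotAll(/a/),              // [[OriginalFlags]] lacks "s"
  describeDotAll(RegExp.prototype), // no internal slot, but is %RegExpPrototype%
  describeDotAll({}),               // Object without the internal slot
  describeDotAll("s"),              // not an Object
];

console.log(outcomes.join(" "));
// prints: true false undefined TypeError TypeError
```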