Supported Modifier Flags · Issue #1 · tc39/proposal-regexp-modifiers
In the October 2021 plenary, @michaelficarra asked that we outline each flag we are considering as a supported modifier and provide motivating examples for it.
The flags currently under consideration are:
`i` — ignore-case
- Rationale — Toggling ignore-case is useful when matching patterns with varying case sensitivity, or when parsing patterns provided via JSON configuration. It is especially useful when working with complex Unicode character ranges.
- Example — Match an uppercase ASCII letter followed by uppercase or lowercase ASCII letters or `'`
    const re = /^[A-Z](?i)[a-z']+$/;
    re.test("O'Neill"); // true
    re.test("o'neill"); // false
    // alternatively (defaulting to ignore-case):
    const re2 = /^(?-i:[A-Z])[a-z']+$/i;
- Example — Match a word starting with `D` followed by a word starting with `D` or `d` (from the .NET documentation, see [1])
    const re = /\b(D\w+)(?ix)\s(d\w+)\b/g;
    const input = "double dare double Double a Drooling dog The Dreaded Deep";
    re.exec(input); // ["Drooling dog", "Drooling", "dog"]
    re.exec(input); // ["Dreaded Deep", "Dreaded", "Deep"]
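The first example can be tried today with a feature test. This is a minimal sketch, assuming an engine that may or may not parse the bounded `(?i:...)` form; `supportsModifiers`, `nameSource`, and `nameRe` are hypothetical names introduced here, and the hand-written fallback is an assumption of the sketch rather than part of the proposal.

```javascript
// Hypothetical feature test: building the pattern from a string keeps the
// SyntaxError catchable on engines that do not parse modifiers.
function supportsModifiers() {
  try {
    new RegExp("(?i:a)");
    return true;
  } catch {
    return false;
  }
}

// Uppercase ASCII first letter, then letters of either case or apostrophes.
// Without modifier support the case variants are spelled out by hand.
const nameSource = supportsModifiers()
  ? "^[A-Z](?i:[a-z']+)$"
  : "^[A-Z][A-Za-z']+$";
const nameRe = new RegExp(nameSource);

console.log(nameRe.test("O'Neill")); // true
console.log(nameRe.test("o'neill")); // false
```

Because the pattern is a plain string, it could equally come from a JSON configuration file, which has no place to carry per-group flags.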
`m` — multiline
- Rationale — Flexibility in matching beginning-of-buffer vs. beginning-of-line, or end-of-buffer vs. end-of-line, within a single complex pattern.
- Example — Match a frontmatter block at the start of a file
    const re = /^---(?m)$((?:^(?!---$).*$)*)^---$/;
    re.test("---a"); // false
    re.test("---\n---"); // true
    re.test("---\na: b\n---"); // true
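Why a pattern-wide `m` flag falls short for this case can be checked without modifier support. A minimal sketch: `anchored` and `multiline` are illustrative patterns written for this comparison. They consume line breaks explicitly and are not the pattern from the example above.

```javascript
// No m flag: ^ only matches at the start of the buffer, so line ends are
// spelled out as \n or as a lookahead for end of input.
const anchored = /^---\n((?:(?!---(?:\n|$)).*\n)*)---(?=\n|$)/;

// Pattern-wide m flag: every ^ and $ becomes a line anchor, including the
// leading ^ that was meant to pin the block to the start of the file.
const multiline = /^---\n((?:(?!---$).*\n)*)---$/m;

const atStart = "---\na: b\n---";
const later = "intro\n---\na: b\n---";

console.log(anchored.test(atStart));   // true
console.log(anchored.test(later));     // false
console.log(multiline.test(atStart));  // true
console.log(multiline.test(later));    // true, the block is found mid-file
console.log(JSON.stringify(anchored.exec(atStart)[1])); // "a: b\n"
```

A scoped modifier would let a single pattern keep the buffer-level anchor at the front while still using line-level anchors inside the block.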
`s` — dot-all (i.e., "single line")
- Rationale — Control over `.` matching semantics within a pattern.
- Example
    const re = /a.c(?s:.)*x.z/;
    re.test("a\ncx\nz"); // false
    re.test("abcdxyz"); // true
    re.test("aBc\nxYz"); // true
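On engines without modifier support, the same results can be approximated by swapping the scoped dot for a class that matches any character. A minimal sketch, assuming `[\s\S]` as the stand-in for `(?s:.)`; the names `approx` and `dotAll` are illustrative.

```javascript
// [\s\S] matches any character, line breaks included; the other dots keep
// their default meaning and still refuse to match "\n".
const approx = /a.c[\s\S]*x.z/;
console.log(approx.test("a\ncx\nz"));  // false
console.log(approx.test("abcdxyz"));   // true
console.log(approx.test("aBc\nxYz"));  // true

// A pattern-wide s flag is too broad: every dot now matches a line break.
const dotAll = /a.c.*x.z/s;
console.log(dotAll.test("a\ncx\nz"));  // true
```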
`x` — Extended Mode. This flag is proposed by https://github.com/tc39/proposal-regexp-x-mode
- Rationale — Would allow control over significant whitespace handling in a pattern.
- Example — Disabling `x` mode when composing a complex pattern:
    const idPattern = String.raw`[a-z]{2} \d{4}`; // space required
    const re = new RegExp(String.raw`
      # match the id
      (?<id>(?-x:${idPattern}))
      # match a separator
      :\s
      # match the value
      (?<value>\w+)
    `, "x");
    re.exec("aa0123: foo")?.groups; // undefined
    re.exec("aa 0123: foo")?.groups; // { id: "aa 0123", value: "foo" }
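Since no current engine accepts an `x` flag, the intent of the composition example can only be emulated. A rough sketch, assuming a hypothetical tag function `rx` (not part of either proposal) that strips `#` comments and whitespace from the literal chunks and passes interpolated values through untouched, which is roughly the effect `(?-x:...)` is meant to provide. It does not handle an escaped `#` or a `#` inside a character class.

```javascript
// Hypothetical helper: emulates x-mode for the literal parts of a template only.
function rx(strings, ...values) {
  // Drop "# ..." to end of line, then drop all remaining whitespace.
  const strip = (chunk) => chunk.replace(/#.*$/gm, "").replace(/\s+/g, "");
  let source = strip(strings.raw[0]);
  values.forEach((value, i) => {
    // Interpolated text is appended verbatim, so its whitespace stays significant.
    source += String(value) + strip(strings.raw[i + 1]);
  });
  return new RegExp(source);
}

const idPattern = String.raw`[a-z]{2} \d{4}`; // the space is significant
const idRe = rx`
  # match the id
  (?<id>${idPattern})
  # match a separator
  :\s
  # match the value
  (?<value>\w+)
`;

console.log(idRe.source);               // (?<id>[a-z]{2} \d{4}):\s(?<value>\w+)
console.log(idRe.exec("aa0123: foo"));  // null
const { id, value } = idRe.exec("aa 0123: foo").groups;
console.log(id, value);                 // aa 0123 foo
```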
Flags likely too complex to support:
`u` — Unicode. This flag affects how a pattern is parsed, not how it is matched. Supporting it would likely require a cover grammar and additional static semantics.
`v` — Extended Unicode. This flag is proposed by https://github.com/tc39/proposal-regexp-set-notation as an extension of the `u` flag and would have the same difficulties.
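That `u` changes parsing rather than matching can be seen by compiling the same source text with and without the flag. A small sketch; the helper `parses` is a hypothetical name introduced for illustration.

```javascript
// \u{61}: a code point escape for "a" under u, but the letter "u" repeated
// 61 times without it.
const legacy = new RegExp("^\\u{61}$");
const unicode = new RegExp("^\\u{61}$", "u");
console.log(unicode.test("a"));            // true
console.log(legacy.test("a"));             // false
console.log(legacy.test("u".repeat(61)));  // true

// An identity escape such as \- is tolerated in legacy mode and rejected under u.
function parses(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch {
    return false;
  }
}
console.log(parses("\\-", ""));   // true
console.log(parses("\\-", "u"));  // false
```

Switching `u` on or off partway through a pattern would mean switching grammars mid-parse, which is the difficulty the entry above describes.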
Flags that will never be supported:
`g` — Global. This flag affects the index at which matching starts and not the matching behavior itself. Changing it mid-pattern would have no effect.
`y` — Sticky. This flag affects the index at which matching starts and not the matching behavior itself. Changing it mid-pattern would have no effect.
`d` — Indices. This flag affects the match result. Changing it mid-pattern would have no effect.
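The distinction drawn for these three flags can be observed directly: `g` and `y` change where a search begins via `lastIndex`, and `d` changes the shape of the result. A short sketch with illustrative variable names:

```javascript
const text = "ab ab";

// Without g or y, lastIndex is ignored and the search starts at index 0.
const plain = /ab/;
plain.lastIndex = 1;
console.log(plain.exec(text).index);     // 0

// g: the search starts at lastIndex and may scan forward.
const globalRe = /ab/g;
globalRe.lastIndex = 1;
console.log(globalRe.exec(text).index);  // 3

// y: the match must begin exactly at lastIndex.
const sticky = /ab/y;
sticky.lastIndex = 1;
console.log(sticky.exec(text));          // null
sticky.lastIndex = 3;
console.log(sticky.exec(text).index);    // 3

// d: same match, but the result also carries start/end offsets per group.
const withIndices = new RegExp("a(b)", "d").exec(text);
console.log(withIndices.indices[1]);     // [ 1, 2 ]
```

In each case the flag is consulted before matching starts or after it ends, so there is no point inside the pattern where toggling it could matter.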