require-unicode-regexp - ESLint - Pluggable JavaScript Linter
Enforce the use of the u or v flag on regular expressions
💡 hasSuggestions
Some problems reported by this rule are manually fixable by editor suggestions
The RegExp u flag has two effects:
- Makes the regular expression handle UTF-16 surrogate pairs correctly.
In particular, character range syntax behaves correctly.
/^[👍]$/.test("👍") //→ false
/^[👍]$/u.test("👍") //→ true
- Makes the regular expression throw syntax errors early by disabling Annex B extensions.
For historical reasons, JavaScript regular expressions are tolerant of syntax errors. For example, /\w{1, 2/ is a syntax error, but JavaScript doesn't throw an error. Instead, it matches strings such as "a{1, 2". This recovery logic is defined in Annex B.
The u flag disables the recovery logic defined in Annex B. As a result, you can find errors early. This is similar to strict mode.
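The Annex B behavior described above can be observed at runtime. The following is a minimal sketch, not part of the rule itself; the helper name compiles is hypothetical, and the pattern is built with the RegExp constructor so that the invalid case surfaces as a catchable SyntaxError instead of a parse error.

```javascript
// Hypothetical helper: returns true if the pattern compiles with the given
// flags, false if the engine rejects it with a SyntaxError.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error; // anything else is unexpected
  }
}

const source = "\\w{1, 2"; // same pattern as the literal /\w{1, 2/

console.log(compiles(source, ""));  // true: Annex B recovery accepts the pattern
console.log(compiles(source, "u")); // false: the u flag rejects it
console.log(new RegExp(source).test("a{1, 2")); // true: "{1, 2" is matched literally
```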
The RegExp v flag, introduced in ECMAScript 2024, is a superset of the u flag and offers two more features:
- Unicode properties of strings
With the Unicode property escape, you can use properties of strings.
const re = /^\p{RGI_Emoji}$/v;
// Match an emoji that consists of just 1 code point:
re.test('⚽'); // '\u26BD'
// → true ✅
// Match an emoji that consists of multiple code points:
re.test('👨🏾‍⚕️'); // '\u{1F468}\u{1F3FE}\u200D\u2695\uFE0F'
// → true ✅
- Set notation
It allows for set operations between character classes.
const re = /[\p{White_Space}&&\p{ASCII}]/v;
re.test('\n'); // → true
re.test('\u2028'); // → false
Therefore, the u and v flags let us work better with regular expressions.
Rule Details
This rule aims to enforce the use of the u or v flag on regular expressions.
Examples of incorrect code for this rule:
/*eslint require-unicode-regexp: error */
const a = /aaa/
const b = /bbb/gi
const c = new RegExp("ccc")
const d = new RegExp("ddd", "gi")
Examples of correct code for this rule:
/*eslint require-unicode-regexp: error */
const a = /aaa/u
const b = /bbb/giu
const c = new RegExp("ccc", "u")
const d = new RegExp("ddd", "giu")
const e = /aaa/v
const f = /bbb/giv
const g = new RegExp("ccc", "v")
const h = new RegExp("ddd", "giv")
// This rule ignores RegExp calls if the flags could not be evaluated to a static value.
function i(flags) {
return new RegExp("eee", flags)
}
Options
This rule has one object option:
- "requireFlag": "u"|"v" requires a particular Unicode regex flag
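As a rough illustration of setting this option project-wide instead of per file, a flat config entry might look like the following. This is a sketch that assumes an eslint.config.js loaded as CommonJS; the variable name config is illustrative, and "v" can be swapped for "u".

```javascript
// eslint.config.js (assumed to be loaded as CommonJS)
const config = [
  {
    rules: {
      // Report any regular expression that does not use the v flag.
      "require-unicode-regexp": ["error", { requireFlag: "v" }],
    },
  },
];

module.exports = config;
```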
requireFlag: "u"
The u flag may be preferred in environments that do not support the v flag.
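One way to check whether a given runtime supports the v flag is to try constructing a regular expression with it. The sketch below is an assumption-laden illustration, not part of ESLint; the names supportsVFlag and preferredFlag are hypothetical.

```javascript
// Hypothetical helper: returns true if this runtime accepts the v flag.
function supportsVFlag() {
  try {
    new RegExp("", "v"); // throws a SyntaxError where "v" is not a known flag
    return true;
  } catch {
    return false;
  }
}

// Pick the flag value to use for the requireFlag option accordingly.
const preferredFlag = supportsVFlag() ? "v" : "u";
console.log(preferredFlag);
```

The result only describes the runtime where the check is executed, so it is most useful when run in the oldest environment the code has to support.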
Examples of incorrect code for this rule with the { "requireFlag": "u" } option:
/*eslint require-unicode-regexp: ["error", { "requireFlag": "u" }] */
const fooEmpty = /foo/;
const fooEmptyRegexp = new RegExp('foo');
const foo = /foo/v;
const fooRegexp = new RegExp('foo', 'v');
Examples of correct code for this rule with the { "requireFlag": "u" } option:
/*eslint require-unicode-regexp: ["error", { "requireFlag": "u" }] */
const foo = /foo/u;
const fooRegexp = new RegExp('foo', 'u');
requireFlag: "v"
The v flag may be a better choice when it is supported because it has more features than the u flag (e.g., the ability to test Unicode properties of strings). It does have a stricter syntax, however (e.g., the need to escape certain characters within character classes).
Examples of incorrect code for this rule with the { "requireFlag": "v" } option:
/*eslint require-unicode-regexp: ["error", { "requireFlag": "v" }] */
const fooEmpty = /foo/;
const fooEmptyRegexp = new RegExp('foo');
const foo = /foo/u;
const fooRegexp = new RegExp('foo', 'u');
Examples of correct code for this rule with the { "requireFlag": "v" } option:
/*eslint require-unicode-regexp: ["error", { "requireFlag": "v" }] */
const foo = /foo/v;
const fooRegexp = new RegExp('foo', 'v');
When Not To Use It
If you don't want to warn on regular expressions without either a u or a v flag, then it's safe to disable this rule.
Note on the i flag and \w
In some cases, adding the u flag to a regular expression using both the i flag and the \w character class can change its behavior due to Unicode case folding.
For example:
const regexWithoutU = /^\w+$/i;
const regexWithU = /^\w+$/iu;
const str = "\u017f\u212a"; // Example Unicode characters
console.log(regexWithoutU.test(str)); // false
console.log(regexWithU.test(str)); // true
If you prefer to use a non-Unicode-aware regex in this specific case, you can disable this rule using an eslint-disable comment:
/* eslint-disable require-unicode-regexp */
const regex = /^\w+$/i;
/* eslint-enable require-unicode-regexp */
Version
This rule was introduced in ESLint v5.3.0.
Further Reading