PHP: Hypertext Preprocessor

mb_regex_encoding

(PHP 4 >= 4.2.0, PHP 5, PHP 7, PHP 8)

mb_regex_encoding — Set/Get character encoding for multibyte regex

Description

mb_regex_encoding(?string $encoding = null): string|bool

Parameters

encoding

The encoding parameter is the character encoding. If it is omitted or [null](reserved.constants.php#constant.null), the internal character encoding value will be used.

Return Values

If encoding is set, returns [true](reserved.constants.php#constant.true) on success or [false](reserved.constants.php#constant.false) on failure. In this case, the internal character encoding is NOT changed. If encoding is omitted, the current character encoding name for multibyte regex is returned.
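
The two modes described above can be illustrated with a minimal sketch. It assumes the mbstring extension was built with multibyte regex support; the variable names are illustrative, and 'EUC-JP' is used only because it appears in the list of supported names in the notes further down.

```php
<?php
// Remember the current settings so they can be compared and restored later.
$internalBefore = mb_internal_encoding();
$regexBefore    = mb_regex_encoding();      // getter: returns the encoding name

// Setter: returns true when the name is accepted.
$result     = mb_regex_encoding('EUC-JP');
$regexAfter = mb_regex_encoding();

var_dump($result);
var_dump($regexAfter);

// Setting the regex encoding leaves the internal encoding untouched.
var_dump(mb_internal_encoding() === $internalBefore);

if (PHP_VERSION_ID >= 80000) {
    // Since PHP 8.0.0 an explicit null behaves like omitting the argument.
    var_dump(mb_regex_encoding(null) === $regexAfter);
}

// Restore the original regex encoding.
mb_regex_encoding($regexBefore);
```

Restoring the previous value at the end keeps the change from leaking into other code that relies on the default regex encoding.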

Changelog

| Version | Description |
| --- | --- |
| 8.0.0 | encoding is nullable now. |

See Also


GerryH

7 years ago

`mb_ereg functionality is provided by the Oniguruma regex library, not by PCRE. mb_regex_encoding() supports only a subset of the encoding names known to mb_list_encodings() and mb_encoding_aliases().
Currently the following names are supported (case-insensitive):

UCS-4
UCS-4LE
UTF-32
UTF-32BE
UTF-32LE
UTF-16
UTF-16BE
UTF-16LE
UTF-8
utf8
ASCII
US-ASCII
EUC-JP
eucJP
x-euc-jp
SJIS
eucJP-win
SJIS-win
CP932
MS932
Windows-31J
ISO-8859-1
ISO-8859-2
ISO-8859-3
ISO-8859-4
ISO-8859-5
ISO-8859-6
ISO-8859-7
ISO-8859-8
ISO-8859-9
ISO-8859-10
ISO-8859-13
ISO-8859-14
ISO-8859-15
ISO-8859-16
EUC-CN
EUC_CN
eucCN
gb2312
EUC-TW
EUC_TW
eucTW
BIG-5
CN-BIG5
BIG-FIVE
BIGFIVE
EUC-KR
EUC_KR
eucKR
KOI8-R
KOI8R

The list is a mixture of base names and aliases and applies to PHP 5.4.45 (Oniguruma lib v4.7.1), PHP 5.6.31 (v5.9.5), PHP 7.0.22 (v5.9.6) and PHP 7.1.8 (v5.9.6). Be aware of the inconsistency: mb_regex_encoding() accepts for example the base name 'UTF-8' and its only alias 'utf8', but it does not accept aliases 'utf16', 'utf32' or 'latin1'.

Additionally, note that the informal name/alias 'latin9' for ISO/IEC 8859-15:1999 (which includes the Euro sign at 0xA4) is also not known to mb_list_encodings(). It can only be addressed as 'ISO-8859-15' or 'ISO_8859-15', and for mb_regex_encoding() solely as 'ISO-8859-15'.

`
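
One way to cope with the alias inconsistency described in the note above is to retry with the canonical name that mbstring reports for an alias. The following is only a sketch: `set_regex_encoding_lenient()` is a hypothetical helper, not part of PHP, and it assumes that unsupported names raise a warning and return false before PHP 8 and throw a ValueError from PHP 8 on.

```php
<?php
/**
 * Hypothetical helper: try $name as given, then fall back to the MIME name
 * that mbstring reports for it. Returns the name that was accepted, or null
 * if neither candidate worked.
 */
function set_regex_encoding_lenient(string $name): ?string
{
    $candidates = [$name];
    try {
        // Resolves an alias such as 'latin1' to its canonical MIME name.
        $mime = @mb_preferred_mime_name($name);
        if (is_string($mime) && $mime !== '') {
            $candidates[] = $mime;
        }
    } catch (\ValueError $e) {
        // PHP >= 8: mbstring itself does not know this name.
    }

    foreach ($candidates as $candidate) {
        try {
            // PHP < 8 warns and returns false for unsupported names.
            if (@mb_regex_encoding($candidate) === true) {
                return $candidate;
            }
        } catch (\ValueError $e) {
            // PHP >= 8 throws instead; try the next candidate.
        }
    }
    return null;
}

$previous = mb_regex_encoding();
foreach (['utf8', 'utf16', 'latin1'] as $name) {
    // Prints the name that was finally accepted, or NULL.
    var_dump(set_regex_encoding_lenient($name));
}
mb_regex_encoding($previous);
```

The fallback only helps for aliases whose base name is in the supported list; names such as 'Windows-1252' are still rejected and yield null.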

php dot net at phor dot net

14 years ago

`Beware: mb_regex_encoding() does not support the same set of encodings as listed by mb_list_encodings().

Example:

`
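
The mismatch described in the note above can be checked on a given installation with a short probe. This is a sketch, not an official test: `regex_encoding_supported()` is a hypothetical helper, and the result depends on the PHP and Oniguruma versions in use.

```php
<?php
/**
 * Hypothetical helper: report whether mb_regex_encoding() accepts $name,
 * without leaving the regex encoding changed.
 */
function regex_encoding_supported(string $name): bool
{
    $previous = mb_regex_encoding();
    try {
        // PHP < 8 warns and returns false for unsupported names.
        $ok = @mb_regex_encoding($name);
    } catch (\ValueError $e) {
        // PHP >= 8 throws a ValueError instead.
        $ok = false;
    }
    mb_regex_encoding($previous);
    return $ok === true;
}

// Collect every name from mb_list_encodings() that the regex engine rejects.
$all = mb_list_encodings();
$rejected = array_values(array_filter($all, function ($name) {
    return !regex_encoding_supported($name);
}));

printf(
    "%d of %d encodings from mb_list_encodings() are rejected by mb_regex_encoding():\n",
    count($rejected),
    count($all)
);
echo implode(', ', $rejected), "\n";
```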

code at roberthairgrove dot com

7 years ago

`mb_regex_encoding does not recognize CP1252 or Windows-1252 as valid encodings, although they are in the list generated by mb_list_encodings.

ISO-8859-1 (AKA "Latin-1") is supported, but it's not the same as the Windows variety of Latin-1.

`
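
A possible workaround for the limitation in the note above, assuming it is acceptable to transcode the input, is to convert Windows-1252 text to UTF-8 first and run the multibyte regex in UTF-8. The sample string and variable names below are illustrative.

```php
<?php
// 0x80 is the Euro sign in Windows-1252 but a C1 control character in ISO-8859-1.
$cp1252 = "Price: 10 \x80";

// mb_convert_encoding() does know Windows-1252, so transcode to UTF-8 first.
$utf8 = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');

// Match with the regex engine set to UTF-8, then restore the old setting.
$previous = mb_regex_encoding();
mb_regex_encoding('UTF-8');
// "\xE2\x82\xAC" is the UTF-8 byte sequence for the Euro sign (U+20AC).
$hasEuro = mb_ereg("\xE2\x82\xAC", $utf8) !== false;
mb_regex_encoding($previous);

var_dump($hasEuro);
```

Comparing the result of mb_ereg() against false keeps the check valid both before PHP 8, where a match returns an integer, and from PHP 8 on, where it returns a boolean.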

Anonymous

15 years ago

`To also change the regex encoding:
<?php
// Show the internal encoding, then switch it to UTF-8.
echo "current mb_internal_encoding: " . mb_internal_encoding() . "\n";
echo "changing mb_internal_encoding to UTF-8\n";
mb_internal_encoding("UTF-8");
echo "new mb_internal_encoding: " . mb_internal_encoding() . "\n";

// Set the regex encoding explicitly as well.
echo "current mb_regex_encoding: " . mb_regex_encoding() . "\n";
echo "changing mb_regex_encoding to UTF-8\n";
mb_regex_encoding("UTF-8");
echo "new mb_regex_encoding: " . mb_regex_encoding() . "\n";
?>`