Unicode Utilities: Unicode Language Identifiers and BCP47
Status
Source: fr-CA
Type | Code | Name | Replacement |
---|---|---|---|
Language | fr | French | |
Region | CA | Canada | |
Source: gsw-Arab-AQ
Type | Code | Name | Replacement |
---|---|---|---|
Language | gsw | Swiss German | |
Script | Arab | Arabic | |
Region | AQ | Antarctica | |
Source: eng-Latn-840
Canonical Form: en-Latn-US
Minimal Form: en
Type | Code | Name | Replacement |
---|---|---|---|
Language | eng | invalid code | en |
Script | Latn | Latin | |
Region | 840 | invalid code | US |
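The step from the canonical form (en-Latn-US) to the minimal form (en) can be sketched as dropping subtags that likely-subtags data would add back anyway. This is a rough illustration, not the tool's implementation: `LIKELY` is a tiny hand-written stand-in for CLDR likely-subtags data, and `maximize` and `minimize` are hypothetical helper names.

```python
# Hypothetical, hand-picked stand-in for CLDR likely-subtags data.
# Maps a language code to its assumed (script, region) defaults.
LIKELY = {
    "en": ("Latn", "US"),
    "fr": ("Latn", "FR"),
}

def maximize(lang, script=None, region=None):
    """Fill in a missing script or region from the LIKELY table (sketch)."""
    likely_script, likely_region = LIKELY[lang]
    return (lang, script or likely_script, region or likely_region)

def minimize(lang, script, region):
    """Return the shortest tag that maximizes back to (lang, script, region)."""
    full = (lang, script, region)
    # Try the shortest candidates first: language only, then
    # language + region, then language + script.
    candidates = [
        (lang, None, None),
        (lang, None, region),
        (lang, script, None),
    ]
    for cand in candidates:
        if maximize(*cand) == full:
            return "-".join(part for part in cand if part)
    # Nothing shorter round-trips, so keep all three subtags.
    return "-".join(full)

print(minimize("en", "Latn", "US"))
print(minimize("fr", "Latn", "CA"))
```

Under these assumed defaults, en-Latn-US collapses to en, while fr-CA keeps its region because CA is not the assumed default region for fr.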
Samples
- en
- eng-840
- pt_PT
- AZ-arab-Ir
- zh-Hant-HK
- en-cmn-Hant-HK
- sl-Cyrl-YU-rozaj-solba-1994-b-1234-a-Foobar-x-b-1234-a-Foobar
- Other Samples
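Samples such as pt_PT and AZ-arab-Ir differ from the conventional form only in separator and letter case. A minimal sketch of that normalization follows, assuming the usual casing convention (lowercase language, title-case script, uppercase region, lowercase everything after). `normalize_case` is a hypothetical helper that classifies subtags by shape only; it does not validate them against any registry.

```python
import re

def normalize_case(tag):
    """Normalize separators and letter case of a language tag (sketch only)."""
    parts = re.split(r"[-_]", tag)
    out = [parts[0].lower()]  # the first subtag is the language
    seen_script = seen_region = in_tail = False
    for sub in parts[1:]:
        is_region_shape = (len(sub) == 2 and sub.isalpha()) or (
            len(sub) == 3 and sub.isdigit()
        )
        if in_tail:
            # Variants, extensions, and private use are all lowercased.
            out.append(sub.lower())
        elif not seen_script and not seen_region and len(sub) == 4 and sub.isalpha():
            out.append(sub.title())  # script, e.g. arab -> Arab
            seen_script = True
        elif not seen_region and is_region_shape:
            out.append(sub.upper())  # region, e.g. Ir -> IR
            seen_region = True
        elif not seen_script and not seen_region and len(sub) == 3 and sub.isalpha():
            out.append(sub.lower())  # three-letter extended language, e.g. cmn
        else:
            out.append(sub.lower())
            in_tail = True
    return "-".join(out)

for sample in ["pt_PT", "AZ-arab-Ir", "zh-Hant-HK", "en-cmn-Hant-HK"]:
    print(sample, "->", normalize_case(sample))
```

Case normalization alone leaves eng-840 unchanged; mapping it to en-US needs the replacement data described in the notes.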
Notes
- Unicode language ids are based on BCP 47, but differ in a few ways.
- The names are localized with Unicode CLDR data: names with '*' are fallbacks to English; names with '**' are fallbacks to the latest draft registry names.
- Replacements are given for invalid subtags (zho → zh, 248 → AX), for subtags with preferred replacements (iw → he), and for predominant languages (arb → ar).
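The three kinds of replacement in the notes can be pictured as a lookup keyed by subtag type and code. The sketch below contains only the four examples listed above; the table layout, the reason labels, and the `replace_subtag` name are assumptions for illustration and do not reflect how the tool or CLDR stores alias data.

```python
# Hypothetical excerpt built only from the examples in the notes.
# Key: (subtag type, code); value: (replacement, reason).
REPLACEMENTS = {
    ("language", "zho"): ("zh", "invalid"),
    ("region", "248"): ("AX", "invalid"),
    ("language", "iw"): ("he", "preferred"),
    ("language", "arb"): ("ar", "predominant"),
}

def replace_subtag(kind, code):
    """Return (replacement, reason), or (code, None) if no replacement is known."""
    return REPLACEMENTS.get((kind, code.lower()), (code, None))

print(replace_subtag("language", "iw"))
print(replace_subtag("region", "248"))
print(replace_subtag("language", "fr"))
```

Keeping the reason next to the replacement lets a caller tell an invalid code apart from a valid one that merely has a preferred form.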
Fonts and Display. If you don't have a good set of Unicode fonts (and a modern browser), you may not be able to read some of the characters. Some suggested fonts that you can add for coverage are: Noto Fonts site, Unicode Fonts for Ancient Scripts, Large, multi-script Unicode fonts. See also: Unicode Display Problems.
Version 3.9; ICU version: 74.1; Unicode/Emoji version: 15.1.0;