languages, character sets, names etc
Data on languages
- About languages using Latin script
- About languages using Cyrillic script
Data on codepages
Data on character set repertoires
If only the first character set is selected, its contents are displayed. If both are selected, the contents of the two sets are compared with each other.
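The display-or-compare behaviour can be sketched in a few lines of Python. This is only an illustration, not the code behind the form: the names `repertoire` and `compare` are made up here, and the sketch assumes single-byte code pages that Python's codec registry knows about.

```python
def repertoire(codec: str) -> set:
    """Return the set of characters a single-byte codec can decode."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            # This byte is unassigned in the code page; skip it.
            pass
    return chars


def compare(first: str, second: str) -> dict:
    """Split two repertoires into the parts unique to each and the shared part."""
    a, b = repertoire(first), repertoire(second)
    return {"only_first": a - b, "only_second": b - a, "common": a & b}


result = compare("cp1252", "iso8859_15")
for label, chars in result.items():
    print(label, len(chars))
```

With these two code pages, the euro sign ends up in the common part, while the generic currency sign (U+00A4) is found only in cp1252.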
Search in Unicode character names
The search is case-insensitive but requires an exact substring match. It generally assumes that you know how character names are constructed. Some things are counterintuitive and hard to find: AE is a letter, OE is a ligature; there is no reversed apostrophe but a reversed comma instead, and so on. Still, it's a useful tool.
Be sane! Searching by only one or two letters will probably drown your browser.
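For those who want to try the same kind of search offline, here is a rough sketch using Python's standard `unicodedata` module. It is not the script behind this page. The function name `search_names`, the `limit` parameter and the three-character minimum are assumptions of this sketch, and the names it finds depend on the Unicode version shipped with your Python.

```python
import sys
import unicodedata


def search_names(query: str, limit: int = 50) -> list:
    """Case-insensitive exact-substring search over Unicode character names."""
    needle = query.upper()  # official character names are upper case
    if len(needle) < 3:
        # Hypothetical guard: very short queries match far too many names.
        raise ValueError("query too short; use at least three characters")
    hits = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (controls, surrogates, unassigned) yield "".
        name = unicodedata.name(chr(cp), "")
        if needle in name:
            hits.append((f"U+{cp:04X}", name))
            if len(hits) >= limit:
                break
    return hits


for code, name in search_names("reversed comma"):
    print(code, name)
```

Searching for `letter ae` and `ligature oe` shows the naming quirk mentioned above: the first hit is LATIN CAPITAL LETTER AE for one and LATIN CAPITAL LIGATURE OE for the other.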
Search by Unicode number
- The numbers here must be given in hexadecimal notation, i.e. they can contain digits 0-9 and letters A-F in either upper or lower case. Leading zeroes may be omitted (e.g. the range query af-102).
- Many programs (especially those working with HTML) use decimal numbers. To make life easier, here is the same Unicode number query that expects decimal input.
- For my own convenience, this input form accepts (hex) UTF-8 encodings. It does not accept ranges and returns ? if the input is not valid.
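The three input forms above could be handled roughly as follows. This is a sketch under assumptions, not the actual form handler: `parse_query` and `decode_utf8_hex` are hypothetical names, and rejecting anything other than exactly one character in the UTF-8 form is this sketch's reading of "does not accept ranges".

```python
def parse_query(query: str, base: int) -> range:
    """Parse 'af' or 'af-102' into a range of code points.

    Use base=16 for the hexadecimal form and base=10 for the decimal one.
    Upper or lower case is accepted and leading zeroes are optional.
    """
    first, _, last = query.strip().partition("-")
    start = int(first, base)
    end = int(last, base) if last else start
    if start > end or end > 0x10FFFF:
        raise ValueError(f"bad query: {query!r}")
    return range(start, end + 1)


def decode_utf8_hex(hex_bytes: str) -> str:
    """Decode a hex-encoded UTF-8 sequence such as 'c5 a1'; return '?' if invalid."""
    try:
        text = bytes.fromhex(hex_bytes).decode("utf-8")
    except ValueError:
        # Covers both malformed hex and malformed UTF-8
        # (UnicodeDecodeError is a subclass of ValueError).
        return "?"
    # Assumption: exactly one character per query, no ranges.
    return text if len(text) == 1 else "?"


for cp in parse_query("af-b2", 16):
    print(f"U+{cp:04X}", chr(cp))
print(decode_utf8_hex("c5 a1"))
```

The hexadecimal query `af-102` and the decimal query `175-258` select the same code points.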
UCS collections comprising Multilingual European Subset No. 3
No. | Collection | Range | Contents |
---|---|---|---|
1 | Basic Latin | 0020--007E | view |
2 | Latin-1 Supplement | 00A0--00FF | view |
3 | Latin Extended-A | 0100--017F | view |
4 | Latin Extended-B | 0180--024F | view |
5 | IPA Extensions | 0250--02AF | view |
6 | Spacing Modifier Letters | 02B0--02FF | view |
7 | Combining Diacritical Marks | 0300--036F | view |
8 | Basic Greek | 0370--03CF | view |
9 | Greek Symbols and Coptic | 03D0--03FF | view |
10 | Cyrillic | 0400--04FF | view |
11 | Armenian | 0530--058F | view |
27 | Basic Georgian | 10D0--10FF | view |
30 | Latin Extended Additional | 1E00--1EFF | view |
31 | Greek Extended | 1F00--1FFF | |
32 | General Punctuation | 2000--206F | view |
33 | Superscripts and Subscripts | 2070--209F | view |
34 | Currency Symbols | 20A0--20CF | view |
35 | Combining Diacritical Marks for Symbols | 20D0--20FF | view |
36 | Letterlike Symbols | 2100--214F | view |
37 | Number Forms | 2150--218F | |
38 | Arrows | 2190--21FF | view |
39 | Mathematical Operators | 2200--22FF | view |
40 | Miscellaneous Technical | 2300--23FF | view |
42 | Optical Character Recognition | 2440--245F | |
44 | Box Drawing | 2500--257F | view |
45 | Block Elements | 2580--259F | view |
46 | Geometric Shapes | 25A0--25FF | |
47 | Miscellaneous Symbols | 2600--26FF | |
48 | Dingbats (not in MES) | 2700--27BF | view |
XX | Letter Database private range | E000--F8FF | view |
63 | Alphabetic Presentation Forms | FB00--FB4F | view |
65 | Combining Half Marks | FE20--FE2F | |
70 | Specials | FFF0--FFFD | |
- Old Italic: U+10300--U+1032F
- Gothic: U+10330--U+1034F
- Deseret: U+10400--U+1044F
- Byzantine Musical Symbols: U+1D000--U+1D0FF
- Musical Symbols: U+1D100--U+1D1FF
- Mathematical Alphanumeric Symbols: U+1D400--U+1D7FF
- CJK Unified Ideographs Extension B: U+20000--U+2A6D6
- CJK Compatibility Ideographs Supplement: U+2F800--U+2FA1D
- Tag Characters: U+E0000--U+E007F
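Ranges like the ones listed above lend themselves to a simple lookup from a code point to its collection. The sketch below copies only a handful of the ranges for brevity. The names `COLLECTIONS` and `collection_of` are invented for this example, and the binary search assumes the ranges are sorted and do not overlap.

```python
from bisect import bisect_right
from typing import Optional

# (first, last, name), sorted by first code point; a small sample of the ranges.
COLLECTIONS = [
    (0x0020, 0x007E, "Basic Latin"),
    (0x00A0, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x2000, 0x206F, "General Punctuation"),
    (0xE000, 0xF8FF, "Letter Database private range"),
    (0x10330, 0x1034F, "Gothic"),
    (0x1D100, 0x1D1FF, "Musical Symbols"),
]
STARTS = [first for first, _, _ in COLLECTIONS]


def collection_of(code_point: int) -> Optional[str]:
    """Return the collection containing code_point, or None if not listed."""
    # Find the last range that starts at or before the code point.
    i = bisect_right(STARTS, code_point) - 1
    if i >= 0 and code_point <= COLLECTIONS[i][1]:
        return COLLECTIONS[i][2]
    return None


print(collection_of(ord("š")))
```

Code points that fall in a gap between ranges, such as U+007F, return `None`.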