Character Sets - Win32 apps (original) (raw)

A "character set" is a mapping of characters to their identifying code values. The character set most commonly used in computers today is Unicode, a global standard for character encoding. Internally, Windows applications use the UTF-16 implementation of Unicode. In UTF-16, most characters are identified by two-byte codes. The less commonly used supplementary characters are each represented by a surrogate pair, which is a pair of two-byte codes. For more information, see Surrogates and Supplementary Characters.

Some Windows applications must work with the older character sets that are native to Windows Me/98/95. Windows code pages allow your application to work with these character sets. These character sets can be divided into:

For more information, see the following topics:

About Unicode and Character Sets