ISO-IR-111

From Wikipedia, the free encyclopedia

A former ECMA-standard multilingual version of the KOI-8 character encoding.

KOI8-E (1986)

Alias(es): ISO-IR-111
Language(s): Russian, Belarusian, Macedonian, Serbian, Ukrainian (partial)
Standard: ECMA-113:1986
Classification: Extended ASCII, KOI
Extends: KOI8-B
Succeeded by: ECMA-113:1988 (ISO-8859-5)
Other related encoding(s): KOI8-F

ISO-IR-111[1] or KOI8-E[2] is an 8-bit character set. It is a multinational extension of KOI-8 covering Belarusian, Macedonian, Serbian, and Ukrainian (except for Ґґ, which is added in KOI8-F). The name "ISO-IR-111" refers to its registration number in the ISO-IR registry, and denotes it as a set usable with ISO/IEC 2022.

It was defined by the first (1986) edition of ECMA-113,[3] which is the Ecma International standard corresponding to ISO/IEC 8859-5, and as such also corresponds to a 1987 draft version of ISO-8859-5.[4] The published editions of ISO/IEC 8859-5 instead correspond to subsequent editions of ECMA-113, which define a different encoding.[5]

ISO-IR-111, registered in 1985 and published as the first edition of ECMA-113 (also called "ECMA-Cyrillic" or "KOI8-E"), was based on the 1974 edition of GOST 19768 (i.e. KOI-8). In 1987 ECMA-113 was redesigned.[5] The newer editions of ECMA-113 are equivalent to ISO-8859-5,[5][6] and do not follow the KOI layout. Because the same standard number thus covers two different encodings, it is commonly but mistakenly believed that ISO-8859-5 was defined in or based on GOST 19768-74.[6]

Possibly as another consequence of this, RFC 1345 erroneously lists a different codepage under the names "ISO-IR-111" and "ECMA-Cyrillic", resembling ISO-8859-5 with re-ordered rows, and partially compatible with Windows-1251.[7][6] Due to concerns that existing implementations might use the RFC 1345 definition for those two labels, it was proposed that the IANA additionally recognise KOI8-E as a label for ECMA-113:1985 content,[7] and the IANA presently lists that label as an alias.[2]

The following table shows the ISO-IR-111 encoding. Each character is shown with its equivalent Unicode code point.

ISO-IR-111

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0x | | | | | | | | | | | | | | | | |
| 1x | | | | | | | | | | | | | | | | |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
| 8x | | | | | | | | | | | | | | | | |
| 9x | | | | | | | | | | | | | | | | |
| Ax | NBSP | ђ 0452 | ѓ 0453 | ё 0451 | є 0454 | ѕ 0455 | і 0456 | ї 0457 | ј 0458 | љ 0459 | њ 045A | ћ 045B | ќ 045C | SHY | ў 045E | џ 045F |
| Bx | № 2116 | Ђ 0402 | Ѓ 0403 | Ё 0401 | Є 0404 | Ѕ 0405 | І 0406 | Ї 0407 | Ј 0408 | Љ 0409 | Њ 040A | Ћ 040B | Ќ 040C | ¤ 00A4 | Ў 040E | Џ 040F |
| Cx | ю 044E | а 0430 | б 0431 | ц 0446 | д 0434 | е 0435 | ф 0444 | г 0433 | х 0445 | и 0438 | й 0439 | к 043A | л 043B | м 043C | н 043D | о 043E |
| Dx | п 043F | я 044F | р 0440 | с 0441 | т 0442 | у 0443 | ж 0436 | в 0432 | ь 044C | ы 044B | з 0437 | ш 0448 | э 044D | щ 0449 | ч 0447 | ъ 044A |
| Ex | Ю 042E | А 0410 | Б 0411 | Ц 0426 | Д 0414 | Е 0415 | Ф 0424 | Г 0413 | Х 0425 | И 0418 | Й 0419 | К 041A | Л 041B | М 041C | Н 041D | О 041E |
| Fx | П 041F | Я 042F | Р 0420 | С 0421 | Т 0422 | У 0423 | Ж 0416 | В 0412 | Ь 042C | Ы 042B | З 0417 | Ш 0428 | Э 042D | Щ 0429 | Ч 0427 | Ъ 042A |

Extended and modified versions

A modified version named KOI8 Unified or KOI8-F was used in software produced by Fingertip Software, adding the Ґ in its KOI8-U location (replacing the soft hyphen and displacing the universal currency sign), and adding some graphical characters in the C1 control codes area, mainly from KOI8-R and Windows-1251.[4][6][8][9]

Incorrect RFC 1345 code page

RFC 1345's "ECMA-Cyrillic"

Language(s): Russian, Belarusian, Macedonian, Serbian
Standard: RFC 1345
Classification: Extended ASCII
Transforms / Encodes: ISO-IR-111
Other related encoding(s): ISO-8859-5, Windows-1251

RFC 1345 erroneously lists a different code page under the name ISO-IR-111, encoding the same Cyrillic characters but with a different layout. It resembles a mixture of Windows-1251 and ISO-8859-5.[7] Specifically, line A_ corresponds to ISO-8859-5, lines C_ through F_ correspond to Windows-1251[6] (equivalent to lines B_ through E_ of ISO-8859-5), and line B_ nearly corresponds to line F_ of ISO-8859-5, with the exception of the § being replaced with a ¤.
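The row-by-row description above can be sketched with Python's built-in `iso8859_5` and `cp1251` codecs. This is only an illustration of the stated correspondences, not an implementation of RFC 1345 itself; the function name `rfc1345_high_half` is hypothetical.

```python
def rfc1345_high_half() -> str:
    """Assemble bytes 0xA0-0xFF of the RFC 1345 "ECMA-Cyrillic" layout
    from the correspondences described in the text (illustrative only)."""
    # Bytes 0xA0-0xFF of ISO-8859-5, so index 0 is byte 0xA0.
    iso = bytes(range(0xA0, 0x100)).decode("iso8859_5")

    # Line A_ corresponds to line A_ of ISO-8859-5.
    row_a = iso[0x00:0x10]

    # Line B_ corresponds to line F_ of ISO-8859-5, except that the
    # section sign (U+00A7) is replaced with the currency sign (U+00A4).
    row_b = iso[0x50:0x60].replace("\u00a7", "\u00a4")

    # Lines C_ through F_ correspond to the same lines of Windows-1251.
    rows_c_to_f = bytes(range(0xC0, 0x100)).decode("cp1251")

    return row_a + row_b + rows_c_to_f


if __name__ == "__main__":
    layout = rfc1345_high_half()
    for start in range(0, len(layout), 16):
        # Print one 16-character table row per line, prefixed Ax..Fx.
        print(f"{0xA + start // 16:X}x", " ".join(layout[start:start + 16]))
```

The equivalence noted in the text can be checked the same way: decoding bytes 0xB0–0xEF as ISO-8859-5 gives the same 64 characters as decoding bytes 0xC0–0xFF as Windows-1251.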

Certain codes resemble ISO-IR-111 with flipped letter case, which may have contributed to the confusion. The majority differ and are shown below.

Code page erroneously labelled "ISO-IR-111" or "ECMA-Cyrillic" in RFC 1345

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ax | NBSP | Ё | Ђ | Ѓ | Є | Ѕ | І | Ї | Ј | Љ | Њ | Ћ | Ќ | SHY | Ў | Џ |
| Bx | № | ё | ђ | ѓ | є | ѕ | і | ї | ј | љ | њ | ћ | ќ | ¤ | ў | џ |
| Cx | А | Б | В | Г | Д | Е | Ж | З | И | Й | К | Л | М | Н | О | П |
| Dx | Р | С | Т | У | Ф | Х | Ц | Ч | Ш | Щ | Ъ | Ы | Ь | Э | Ю | Я |
| Ex | а | б | в | г | д | е | ж | з | и | й | к | л | м | н | о | п |
| Fx | р | с | т | у | ф | х | ц | ч | ш | щ | ъ | ы | ь | э | ю | я |
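To see which byte values are affected by the mislabelling, the two layouts can be compared position by position. The sketch below is illustrative only: the names `ISO_IR_111`, `RFC_1345` and `classify` are hypothetical, both layouts are transcribed from the tables in this article, and position B0 of the RFC 1345 layout is taken to be № (U+2116), following the stated correspondence with line F_ of ISO-8859-5.

```python
from collections import Counter


def _row(hex_codes: str) -> str:
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(code, 16)) for code in hex_codes.split())


# Lowercase rows Cx and Dx of ISO-IR-111; rows Ex and Fx are their uppercase forms.
_LOWER = _row(
    "044E 0430 0431 0446 0434 0435 0444 0433 0445 0438 0439 043A 043B 043C 043D 043E "
    "043F 044F 0440 0441 0442 0443 0436 0432 044C 044B 0437 0448 044D 0449 0447 044A"
)

# Bytes 0xA0-0xFF of ISO-IR-111.
ISO_IR_111 = (
    _row("00A0 0452 0453 0451 0454 0455 0456 0457 0458 0459 045A 045B 045C 00AD 045E 045F")
    + _row("2116 0402 0403 0401 0404 0405 0406 0407 0408 0409 040A 040B 040C 00A4 040E 040F")
    + _LOWER
    + _LOWER.upper()
)

# Bytes 0xA0-0xFF of the code page listed in RFC 1345.
RFC_1345 = (
    "\u00a0" + "".join(map(chr, range(0x0401, 0x040D))) + "\u00ad\u040e\u040f"    # Ax
    + "\u2116" + "".join(map(chr, range(0x0451, 0x045D))) + "\u00a4\u045e\u045f"  # Bx
    + "".join(map(chr, range(0x0410, 0x0450)))                                    # Cx-Fx
)


def classify(byte: int) -> str:
    """Compare the two layouts at one byte value in 0xA0-0xFF."""
    a = ISO_IR_111[byte - 0xA0]
    b = RFC_1345[byte - 0xA0]
    if a == b:
        return "same"
    if a.swapcase() == b:
        return "case-only"
    return "different"


if __name__ == "__main__":
    print(Counter(classify(byte) for byte in range(0xA0, 0x100)))
```

Text decoded with the wrong one of the two tables comes out with flipped letter case at the "case-only" positions and with unrelated letters at the "different" positions.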


References

  1. ECMA (1 August 1985). Right-hand Part of the Cyrillic Alphabet (PDF). ITSCJ/IPSJ. ISO-IR-111.
  2. "Character Sets". IANA.
  3. ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (1st ed., June 1986).
  4. Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
  5. ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (2nd ed., June 1988).
  6. Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
  7. Sokolov, Michael (2003-04-05). "ECMA-cyrillic alias iso-ir-111 sore". IETF Charsets Mailing List.
  8. "KOI8 Unified". Fingertip Software. Archived from the original on 1998-01-09. Retrieved 2020-02-11.
  9. Leisher, Mark (2008) [1998-03-05]. "KOI8 Unified Cyrillic to Unicode 2.1 mapping table". Department of Mathematical Sciences, New Mexico State University. Retrieved 2020-05-02.