ISO/IEC 8859-5
Alias(es) | ISO-IR-144, Cyrillic, csISOLatinCyrillic[1] |
---|---|
Language(s) | Russian, Bulgarian, Belarusian, Macedonian, Serbian, Ukrainian (partial) |
Standard | ISO/IEC 8859-5, ECMA-113 (since 1988 edition) |
Classification | Extended ASCII, ISO 8859 |
Extends | US-ASCII, ISO-IR-153 |
Based on | Main code page[2] |
Extensions | IBM-915 |
Preceded by | ECMA-113:1986 (ISO-IR-111) |
Other related encoding(s) | IBM-1124 |
ISO/IEC 8859-5:1999, Information technology - 8-bit single-byte coded graphic character sets - Part 5: Latin/Cyrillic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. Its first edition was published in 1988. It is informally referred to as Latin/Cyrillic.
It was designed to cover languages using a Cyrillic alphabet, such as Bulgarian, Belarusian, Russian, Serbian and Macedonian, but was never widely used. The 8-bit encodings KOI8-R and KOI8-U, IBM-866, and Windows-1251 are far more commonly used. In contrast to the relationship between Windows-1252 and ISO 8859-1, Windows-1251 is not closely related to ISO 8859-5. However, the main Cyrillic block in Unicode uses a layout based on ISO-8859-5.
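The relationship to the Unicode Cyrillic block can be checked with a short sketch. It assumes Python's built-in `iso8859_5` and `cp1251` codecs are faithful to the respective code pages; the names `OFFSET`, `EXCEPTIONS` and `follows_offset` are illustrative only. Under that assumption, every byte in the upper half except four sits at a constant distance of 0x0360 from its Unicode code point.

```python
# Sketch: compare ISO-8859-5 byte values with Unicode code points.
# Assumes Python's built-in "iso8859_5" codec matches the standard.

OFFSET = 0x0360  # U+0401 minus byte 0xA1

# Bytes expected NOT to follow the offset: NBSP, SHY, numero sign, section sign.
EXCEPTIONS = {0xA0, 0xAD, 0xF0, 0xFD}


def to_unicode(byte: int) -> str:
    """Decode a single ISO-8859-5 byte to its Unicode character."""
    return bytes([byte]).decode("iso8859_5")


def follows_offset(byte: int) -> bool:
    """True if the byte maps to the code point `byte + OFFSET`."""
    return ord(to_unicode(byte)) == byte + OFFSET


offset_bytes = [b for b in range(0xA0, 0x100) if follows_offset(b)]
irregular = {b for b in range(0xA0, 0x100) if not follows_offset(b)}

print(len(offset_bytes), sorted(f"{b:02X}" for b in irregular))

# Windows-1251 arranges the same letters differently, so the bytes disagree.
sample = "Ёж"
print(sample.encode("iso8859_5").hex(), sample.encode("cp1251").hex())
```

The four irregular bytes are the positions where ISO-8859-5 carries a non-Cyrillic character, which Unicode encodes outside the Cyrillic block.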
ISO 8859-5 would also have been usable for Ukrainian in the Soviet Union from 1933 to 1990, but it is missing the Ukrainian letter ge, ґ, which was required in Ukrainian orthography before and after that period, and throughout it outside Soviet Ukraine. As a result, IBM created Code page 1124.
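A minimal sketch of the gap, again assuming Python's built-in `iso8859_5` codec: the helper `unencodable` (a hypothetical name, not part of any library) collects the characters of a string that the codec rejects. The other Ukrainian-specific letters є, і and ї are present, while ґ is not.

```python
def unencodable(text: str, encoding: str = "iso8859_5") -> set:
    """Return the characters of `text` that the given codec cannot represent."""
    missing = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.add(ch)
    return missing


# "ґанок" starts with the letter ge; the second sample uses є, і and ї.
print(unencodable("ґанок"))
print(unencodable("єдність і їжа"))
```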
ISO-8859-5 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The Windows code page for ISO-8859-5 is code page 28595, a.k.a. Windows-28595.[3] IBM assigned code page 915 to ISO-8859-5 until that code page was extended.
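As an illustration, the names listed in the infobox can be looked up in Python's codec registry. This sketch describes Python's alias table rather than the IANA registry itself, and the Windows and IBM code page numbers are deliberately left out because they are not assumed to be registered there. It also shows that Python's codec passes the C1 range through as control characters.

```python
import codecs

# Names taken from the infobox; resolution relies on Python's alias table.
ALIASES = ["ISO-8859-5", "ISO-IR-144", "Cyrillic", "csISOLatinCyrillic"]

# All names are expected to resolve to one and the same codec.
resolved = {codecs.lookup(name).name for name in ALIASES}
print(resolved)

# Bytes 0x80..0x9F decode to the C1 control characters U+0080..U+009F.
c1 = bytes(range(0x80, 0xA0)).decode("iso-8859-5")
print([f"U+{ord(ch):04X}" for ch in c1[:4]])
```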
Codepage layout
Differences from ISO 8859-1 are shown with their Unicode equivalent code points.
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | |
8x | ||||||||||||||||
9x | ||||||||||||||||
Ax | NBSP | Ё 0401 | Ђ 0402 | Ѓ 0403 | Є 0404 | Ѕ 0405 | І 0406 | Ї 0407 | Ј 0408 | Љ 0409 | Њ 040A | Ћ 040B | Ќ 040C | SHY | Ў 040E | Џ 040F |
Bx | А 0410 | Б 0411 | В 0412 | Г 0413 | Д 0414 | Е 0415 | Ж 0416 | З 0417 | И 0418 | Й 0419 | К 041A | Л 041B | М 041C | Н 041D | О 041E | П 041F |
Cx | Р 0420 | С 0421 | Т 0422 | У 0423 | Ф 0424 | Х 0425 | Ц 0426 | Ч 0427 | Ш 0428 | Щ 0429 | Ъ 042A | Ы 042B | Ь 042C | Э 042D | Ю 042E | Я 042F |
Dx | а 0430 | б 0431 | в 0432 | г 0433 | д 0434 | е 0435 | ж 0436 | з 0437 | и 0438 | й 0439 | к 043A | л 043B | м 043C | н 043D | о 043E | п 043F |
Ex | р 0440 | с 0441 | т 0442 | у 0443 | ф 0444 | х 0445 | ц 0446 | ч 0447 | ш 0448 | щ 0449 | ъ 044A | ы 044B | ь 044C | э 044D | ю 044E | я 044F |
Fx | № 2116 | ё 0451 | ђ 0452 | ѓ 0453 | є 0454 | ѕ 0455 | і 0456 | ї 0457 | ј 0458 | љ 0459 | њ 045A | ћ 045B | ќ 045C | § 00A7 | ў 045E | џ 045F |
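The upper half of the chart can be regenerated from a codec, which is a convenient way to cross-check it. The sketch below assumes Python's built-in `iso8859_5` codec; `layout_row` is a hypothetical helper that formats each cell as the character followed by its four-digit hexadecimal code point, with NBSP and SHY spelled out as in the chart.

```python
def layout_row(high_nibble: int) -> list:
    """Return the 16 cells of one chart row, e.g. high_nibble=0xB for row 'Bx'."""
    cells = []
    for low in range(16):
        byte = (high_nibble << 4) | low
        ch = bytes([byte]).decode("iso8859_5")
        if ch == "\u00a0":
            cells.append("NBSP")  # non-breaking space has no visible glyph
        elif ch == "\u00ad":
            cells.append("SHY")   # soft hyphen
        else:
            cells.append(f"{ch} {ord(ch):04X}")
    return cells


# Print rows Ax through Fx in the same pipe-delimited form as the chart.
for high in range(0xA, 0x10):
    print(f"{high:X}x | " + " | ".join(layout_row(high)) + " |")
```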
History and related code pages
The ECMA-113 standard has been equivalent to ISO-8859-5 since its second edition.[4] Its first edition (ISO-IR-111) was an extension of the earlier KOI-8 (defined by GOST 19768-74), which lays out the Russian letters in the same way as their ASCII Roman equivalents where possible. The initial draft of ISO-8859-5 (DIS-8859-5:1987) followed ISO-IR-111, but was revised[4] after GOST 19768-74 was replaced[5] by the new ISO-IR-153 in 1987, which re-arranged the Russian letters into alphabetical order (except for Ё).[5][6] ISO-IR-153 contains the Russian letters, including Ё, and the non-breaking space and soft hyphen, whereas the full Cyrillic set of ISO-8859-5 is also called ISO-IR-144.[7]
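The two ordering principles can be contrasted in a rough sketch. GOST 19768-74 itself is not available as a Python codec, so `koi8_r` is used here as a stand-in for the KOI-8 family; that substitution is an assumption, as are the helper names. In the KOI-8 arrangement, clearing the high bit of each byte yields a Latin letter resembling the Cyrillic one, while ISO-8859-5 keeps the letters in alphabetical byte order.

```python
def is_alphabetical(encoding: str, letters: str = "абвгдежз") -> bool:
    """True if the byte values of `letters` increase in alphabetical order."""
    codes = list(letters.encode(encoding))
    return codes == sorted(codes)


def strip_high_bit(text: str, encoding: str = "koi8_r") -> str:
    """Clear bit 7 of every byte and read the result as ASCII."""
    return bytes(b & 0x7F for b in text.encode(encoding)).decode("ascii")


print(is_alphabetical("iso8859_5"), is_alphabetical("koi8_r"))

# Case comes out inverted, but the text stays readable as a transliteration.
print(strip_high_bit("Русский"))
```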
Possibly as a consequence of this confusion, RFC 1345 erroneously lists yet another code page as "ISO-IR-111", combining the letter order and case order of ISO-8859-5 with the row order of ISO-IR-111. It is consequently compatible with neither, although in practice it is partially compatible[2] with Windows-1251.[8][2]
IBM Code page 915 is an extension of ISO/IEC 8859-5, adding some semigraphic and other symbols in the C1 area. IBM Code page 1124 is mostly identical to ISO-8859-5, but replaces ѓ with ґ for Ukrainian use.
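Python does not appear to ship a codec for code page 1124, so the substitution can be sketched by patching the ISO-8859-5 table. The byte positions below are assumptions: 0xF3 is where ISO-8859-5 places ѓ, and the sketch additionally assumes that the capital Ѓ at 0xA3 is replaced by Ґ. The names `CP1124_SKETCH`, `decode_1124` and `encode_1124` are illustrative, not an official implementation.

```python
import codecs

# Start from Python's ISO-8859-5 decoding table (256 characters).
BASE = bytes(range(256)).decode("iso8859_5")

# Assumed substitutions: 0xF3 (ѓ) -> ґ U+0491, and 0xA3 (Ѓ) -> Ґ U+0490.
PATCHES = {0xA3: "\u0490", 0xF3: "\u0491"}

CP1124_SKETCH = "".join(PATCHES.get(i, ch) for i, ch in enumerate(BASE))
CP1124_ENCODE = codecs.charmap_build(CP1124_SKETCH)


def decode_1124(data: bytes) -> str:
    """Decode bytes using the patched table."""
    return codecs.charmap_decode(data, "strict", CP1124_SKETCH)[0]


def encode_1124(text: str) -> bytes:
    """Encode text using the patched table; ѓ is no longer representable."""
    return codecs.charmap_encode(text, "strict", CP1124_ENCODE)[0]


print(encode_1124("ґанок").hex())
print(decode_1124(b"\xf3\xd0\xdd\xde\xda"))
```

The trade-off is visible in the encoder: gaining ґ costs the Macedonian letter ѓ, which now raises `UnicodeEncodeError`.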
ISO-IR-200, "Uralic Supplementary Cyrillic Set",[9] was registered in 1998 by Everson Gunn Teoranta (of which Michael Everson was a director prior to the founding of Evertype in 2001).[10] It changes several of the non-Russian letters in order to support the Kildin Sami, Komi and Nenets languages, which are not supported by ISO-8859-5 itself. Michael Everson also introduced Mac OS Barents Cyrillic for the same languages on classic Mac OS. FreeDOS calls ISO-IR-200 code page 59283.[11]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
Ax | NBSP | Ё | Ӈ 04C7 | Ӓ 04D2 | Ӭ 04EC | Ҍ 048C | І | Ӧ 04E6 | Ҋ 048A | Ӆ 04C5 | Ӊ 04C9 | « 00AB | Ӎ 04CD | SHY | Ҏ 048E | ʼ 02BC |
Fx | № | ё | ӈ 04C8 | ӓ 04D3 | ӭ 04ED | ҍ 048D | і | ӧ 04E7 | ҋ 048B | ӆ 04C6 | ӊ 04CA | » 00BB | ӎ 04CE | § | ҏ 048F | ˮ 02EE |
ISO-IR-201, "Volgaic Supplementary Cyrillic Set",[12] was similarly introduced by Everson Gunn Teoranta in order to support the Chuvash, Komi, Mari and Udmurt languages, spoken in the titular republics of Russia. FreeDOS calls it code page 58259.[13]
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
Ax | NBSP | Ё | Ӑ 04D0 | Ӓ 04D2 | Ӗ 04D6 | Ҫ 04AA | І | Ӧ 04E6 | Ӥ 04E4 | Ӝ 04DC | Ҥ 04A4 | Ӹ 04F8 | Ӟ 04DE | SHY | Ӱ 04F0 | Ӵ 04F4 |
Fx | № | ё | ӑ 04D1 | ӓ 04D3 | ӗ 04D7 | ҫ 04AB | і | ӧ 04E7 | ӥ 04E5 | ӝ 04DD | ҥ 04A5 | ӹ 04F9 | ӟ 04DF | § | ӱ 04F1 | ӵ 04F5 |
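Both supplementary sets differ from ISO-8859-5 only in rows Ax and Fx, so they can be sketched as variants of the base table. The row strings below are transcribed from the two charts in this section; everything else (the helper names and the use of Python's `iso8859_5` codec as the base) is an assumption made for illustration.

```python
# Base table: Python's ISO-8859-5 decoding of all 256 byte values.
BASE = bytes(range(256)).decode("iso8859_5")

# Rows Ax and Fx of each set, 16 characters per row, as Unicode escapes.
IR200_AX = ("\u00A0\u0401\u04C7\u04D2\u04EC\u048C\u0406\u04E6"
            "\u048A\u04C5\u04C9\u00AB\u04CD\u00AD\u048E\u02BC")
IR200_FX = ("\u2116\u0451\u04C8\u04D3\u04ED\u048D\u0456\u04E7"
            "\u048B\u04C6\u04CA\u00BB\u04CE\u00A7\u048F\u02EE")
IR201_AX = ("\u00A0\u0401\u04D0\u04D2\u04D6\u04AA\u0406\u04E6"
            "\u04E4\u04DC\u04A4\u04F8\u04DE\u00AD\u04F0\u04F4")
IR201_FX = ("\u2116\u0451\u04D1\u04D3\u04D7\u04AB\u0456\u04E7"
            "\u04E5\u04DD\u04A5\u04F9\u04DF\u00A7\u04F1\u04F5")


def make_variant(ax: str, fx: str) -> str:
    """Replace rows Ax and Fx of the base table, keeping everything else."""
    if len(ax) != 16 or len(fx) != 16:
        raise ValueError("each row must contain exactly 16 characters")
    return BASE[:0xA0] + ax + BASE[0xB0:0xF0] + fx


ISO_IR_200 = make_variant(IR200_AX, IR200_FX)
ISO_IR_201 = make_variant(IR201_AX, IR201_FX)


def decode(data: bytes, table: str) -> str:
    """Decode bytes by indexing into a 256-character table."""
    return "".join(table[b] for b in data)


def changed_positions(table: str) -> list:
    """Byte values whose character differs from ISO-8859-5."""
    return [i for i in range(256) if table[i] != BASE[i]]


print([f"{i:02X}" for i in changed_positions(ISO_IR_200)])
print(decode(b"\xa2\xf2", ISO_IR_200), decode(b"\xa2\xf2", ISO_IR_201))
```

Listing the changed positions makes it easy to see which cells both sets leave untouched: NBSP, Ё, І, SHY, №, ё, і and §.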
References
- [1] Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12.
- [2] Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
- [3] "Code Page Identifiers". 7 January 2021.
- [4] "ECMA-113 - Ecma International" (PDF).
- [5] Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
- [6] "gost19768-87 TXT.GZ file".
- [7] European Computer Manufacturers Association (1 May 1988). Cyrillic part of the Latin/Cyrillic Alphabet (PDF). ITSCJ/IPSJ. ISO-IR-144.
- [8] Sokolov, Michael (2003-04-05). "ECMA-cyrillic alias iso-ir-111 sore". IETF Charsets Mailing List (Mailing list).
- [9] National Standards Authority of Ireland. Uralic Supplementary Cyrillic Set (PDF). ITSCJ/IPSJ. ISO-IR-200.
- [10] Gunn, Marion; Everson, Michael (2001-09-20). "Everson Gunn Teoranta (EGT) & Everson Typography". Unicode Mail List (Mailing list).
- [11] "Cpi/CPIISO/Codepage.TXT at master · FDOS/Cpi". GitHub.
- [12] National Standards Authority of Ireland. Volgaic Supplementary Cyrillic Set (PDF). ITSCJ/IPSJ. ISO-IR-201.
- [13] "Cpi/CPIISO/Codepage.TXT at master · FDOS/Cpi". GitHub.
External links
- ISO-IR 144, Cyrillic part of the Latin/Cyrillic Alphabet (May 1, 1988, from ISO 8859-5 2nd version)
- ISO/IEC 8859-5:1999
- Standard ECMA-113: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet, 3rd edition (December 1999)