ISO/IEC 8859-5

From Wikipedia, the free encyclopedia
ISO-8859-5
Alias(es): ISO-IR-144, Cyrillic, csISOLatinCyrillic[1]
Language(s): Russian, Bulgarian, Belarusian, Macedonian, Serbian, Ukrainian (partial)
Standard: ISO/IEC 8859-5, ECMA-113 (since 1988 edition)
Classification: Extended ASCII, ISO 8859
Extends: US-ASCII, ISO-IR-153
Based on: Main code page[2]
Extensions: IBM-915
Preceded by: ECMA-113:1986 (ISO-IR-111)
Other related encoding(s): IBM-1124

ISO/IEC 8859-5:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. Its first edition was published in 1988. It is informally referred to as Latin/Cyrillic.

It was designed to cover languages using a Cyrillic alphabet, such as Bulgarian, Belarusian, Russian, Serbian and Macedonian, but was never widely used. The 8-bit encodings KOI8-R and KOI8-U, IBM-866, and Windows-1251 are far more commonly used. In contrast to the relationship between Windows-1252 and ISO 8859-1, Windows-1251 is not closely related to ISO 8859-5. However, the main Cyrillic block in Unicode uses a layout based on ISO-8859-5.
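As a rough illustration of how unrelated these layouts are, the following Python sketch encodes one Russian word with each of the encodings named above. It assumes the codec names bundled with CPython (`iso8859_5`, `koi8_r`, `cp866`, `cp1251`); the sample word and variable names are arbitrary choices made for the example.

```python
# Encode the same Russian word with four 8-bit Cyrillic encodings.
# Codec names are the ones shipped with CPython; the word is arbitrary.
word = "Привет"
codecs_to_try = ["iso8859_5", "koi8_r", "cp866", "cp1251"]

encoded = {name: word.encode(name) for name in codecs_to_try}
for name, raw in encoded.items():
    # bytes.hex(sep) needs Python 3.8 or later
    print(f"{name:10} {raw.hex(' ')}")

# Reading ISO-8859-5 bytes with the Windows-1251 table yields mojibake,
# because the two encodings place the letters at different positions.
misread = encoded["iso8859_5"].decode("cp1251")
print(misread)
```

Each of the four byte sequences is different, which is why text labelled with the wrong Cyrillic charset is unreadable rather than only slightly damaged.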

ISO 8859-5 would also have been usable for Ukrainian in the Soviet Union from 1933 to 1990, but it is missing the Ukrainian letter ge, ґ, which is required in Ukrainian orthography before and since, and during that period outside Soviet Ukraine. As a result, IBM created Code page 1124.
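The gap can be checked with a short Python sketch, assuming CPython's bundled `iso8859_5` codec. The helper name `can_encode` is hypothetical and exists only for this example.

```python
def can_encode(letter: str, encoding: str = "iso8859_5") -> bool:
    """Return True if the letter has a code point in the given encoding."""
    try:
        letter.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# є, і and ї are present in ISO-8859-5; ґ (U+0491) is not.
for letter in "єіїґ":
    status = "representable" if can_encode(letter) else "missing"
    print(f"{letter} U+{ord(letter):04X}: {status}")
```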

ISO-8859-5 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The Windows code page for ISO-8859-5 is code page 28595, a.k.a. Windows-28595.[3] IBM assigned code page 915 to ISO-8859-5 until that code page was extended.
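A minimal Python sketch of how the charset names resolve in practice follows. It assumes CPython's `codecs` registry; not every registered alias is guaranteed to be known to a given runtime, so unknown names are reported rather than assumed. It also checks that the C1 range decodes to control code points of the same value.

```python
import codecs

# Names taken from the alias list of this article. A runtime may not know
# every registered alias, so lookup failures are recorded as None.
names = ["ISO-8859-5", "ISO-IR-144", "cyrillic", "csISOLatinCyrillic"]
resolved = {}
for name in names:
    try:
        resolved[name] = codecs.lookup(name).name
    except LookupError:
        resolved[name] = None
print(resolved)

# Bytes 0x80-0x9F decode to the C1 control code points U+0080-U+009F.
c1_text = bytes(range(0x80, 0xA0)).decode("iso8859_5")
print(all(ord(ch) == 0x80 + i for i, ch in enumerate(c1_text)))
```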

Codepage layout

Characters that differ from ISO 8859-1 are shown with their equivalent Unicode code point.

ISO/IEC 8859-5
    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0x
1x
2x  SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8x
9x
Ax  NBSP Ё    Ђ    Ѓ    Є    Ѕ    І    Ї    Ј    Љ    Њ    Ћ    Ќ    SHY  Ў    Џ
         0401 0402 0403 0404 0405 0406 0407 0408 0409 040A 040B 040C      040E 040F
Bx  А    Б    В    Г    Д    Е    Ж    З    И    Й    К    Л    М    Н    О    П
    0410 0411 0412 0413 0414 0415 0416 0417 0418 0419 041A 041B 041C 041D 041E 041F
Cx  Р    С    Т    У    Ф    Х    Ц    Ч    Ш    Щ    Ъ    Ы    Ь    Э    Ю    Я
    0420 0421 0422 0423 0424 0425 0426 0427 0428 0429 042A 042B 042C 042D 042E 042F
Dx  а    б    в    г    д    е    ж    з    и    й    к    л    м    н    о    п
    0430 0431 0432 0433 0434 0435 0436 0437 0438 0439 043A 043B 043C 043D 043E 043F
Ex  р    с    т    у    ф    х    ц    ч    ш    щ    ъ    ы    ь    э    ю    я
    0440 0441 0442 0443 0444 0445 0446 0447 0448 0449 044A 044B 044C 044D 044E 044F
Fx  №    ё    ђ    ѓ    є    ѕ    і    ї    ј    љ    њ    ћ    ќ    §    ў    џ
    2116 0451 0452 0453 0454 0455 0456 0457 0458 0459 045A 045B 045C 00A7 045E 045F
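The relationship between this layout and the Unicode Cyrillic block can be sketched in Python: for most bytes in rows Ax to Fx, the Unicode code point is the byte value plus a constant. The sketch assumes CPython's `iso8859_5` codec; the names `OFFSET` and `exceptions` are illustrative.

```python
# Distance between byte 0xA0 and U+0400, the start of the Cyrillic block.
OFFSET = 0x0400 - 0xA0  # 0x360

# Collect the positions in 0xA0-0xFF that do NOT follow byte + OFFSET.
exceptions = {}
for byte in range(0xA0, 0x100):
    code_point = ord(bytes([byte]).decode("iso8859_5"))
    if code_point != byte + OFFSET:
        exceptions[byte] = code_point

for byte, code_point in exceptions.items():
    print(f"0x{byte:02X} -> U+{code_point:04X}")
```

Only four positions fall outside the pattern: the non-breaking space (0xA0), the soft hyphen (0xAD), the numero sign (0xF0) and the section sign (0xFD).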

History and related code pages

The ECMA-113 standard has been equivalent to ISO-8859-5 since its second edition.[4] Its first edition (ISO-IR-111) was an extension of the earlier KOI-8 (defined by GOST 19768-74), which lays out the Russian letters in the same way as their ASCII Roman equivalents where possible. The initial draft of ISO-8859-5 (DIS-8859-5:1987) followed ISO-IR-111, but was revised[4] after GOST 19768-74 was replaced[5] by the new ISO-IR-153 in 1987, which rearranged the Russian letters into alphabetical order (except for Ё).[5][6] ISO-IR-153 contains the Russian letters, including Ё, and the non-breaking space and soft hyphen, whereas the full Cyrillic set of ISO-8859-5 is also called ISO-IR-144.[7]
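The two ordering principles can be contrasted with a small Python sketch. It uses CPython's `koi8_r` codec as a stand-in for the KOI-8 letter arrangement (an assumption made because the original GOST 19768-74 table is not bundled with Python) and `iso8859_5` for the alphabetical arrangement. The helper `strip_high_bit` is hypothetical.

```python
def strip_high_bit(data: bytes) -> str:
    """Clear bit 7 of every byte and read the result as ASCII."""
    return bytes(b & 0x7F for b in data).decode("ascii")

word = "Привет"

# KOI8 family: dropping the high bit leaves similar-sounding Latin letters.
print(strip_high_bit(word.encode("koi8_r")))

# ISO-8859-5: dropping the high bit leaves unrelated ASCII characters.
print(strip_high_bit(word.encode("iso8859_5")))

# In ISO-8859-5 the letters а to я occupy consecutive bytes 0xD0-0xEF.
alphabet = "абвгдежзийклмнопрстуфхцчшщъыьэюя"
raw = alphabet.encode("iso8859_5")
print(list(raw) == list(range(0xD0, 0xF0)))
```

The alphabetical arrangement means that a plain byte-wise sort of ISO-8859-5 text orders these letters alphabetically, apart from Ё and ё, which sit outside the contiguous run.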

Possibly as a consequence of this confusion, RFC 1345 erroneously lists yet another code page as "ISO-IR-111", combining the letter order and case order of ISO-8859-5 with the row order of ISO-IR-111. The result is compatible with neither, although in practice it is partially compatible[2] with Windows-1251.[8][2]

IBM Code page 915 is an extension of ISO/IEC 8859-5, adding some semigraphic and other symbols in the C1 area. IBM Code page 1124 is mostly identical to ISO-8859-5, but replaces ѓ with ґ for Ukrainian use.

ISO-IR-200, "Uralic Supplementary Cyrillic Set",[9] was registered in 1998 by Everson Gunn Teoranta (of which Michael Everson was a director prior to the founding of Evertype in 2001).[10] It changes several of the non-Russian letters in order to support the Kildin Sami, Komi and Nenets languages, which are not supported by ISO-8859-5 itself. Michael Everson also introduced Mac OS Barents Cyrillic for the same languages on classic Mac OS. FreeDOS calls ISO-IR-200 code page 59283.[11]

ISO-IR 200[9] (differences from ISO-8859-5)
    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Ax  NBSP Ё    Ӈ    Ӓ    Ӭ    Ҍ    І    Ӧ    Ҋ    Ӆ    Ӊ    «    Ӎ    SHY  Ҏ    ʼ
              04C7 04D2 04EC 048C      04E6 048A 04C5 04C9 00AB 04CD      048E 02BC
Fx  №    ё    ӈ    ӓ    ӭ    ҍ    і    ӧ    ҋ    ӆ    ӊ    »    ӎ    §    ҏ    ˮ
              04C8 04D3 04ED 048D      04E7 048B 04C6 04CA 00BB 04CE      048F 02EE
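Because ISO-IR 200 is defined here only as a set of differences, one way to work with it is to start from the ISO-8859-5 table and overlay the changed positions. The Python sketch below does that under stated assumptions: the byte positions are read off the differences table above, CPython's `iso8859_5` codec supplies the base table, and the function and constant names are hypothetical (Python ships no ISO-IR 200 codec).

```python
# Byte -> code point overrides, read from the ISO-IR 200 differences table.
ISO_IR_200_OVERRIDES = {
    0xA2: 0x04C7, 0xA3: 0x04D2, 0xA4: 0x04EC, 0xA5: 0x048C,
    0xA7: 0x04E6, 0xA8: 0x048A, 0xA9: 0x04C5, 0xAA: 0x04C9,
    0xAB: 0x00AB, 0xAC: 0x04CD, 0xAE: 0x048E, 0xAF: 0x02BC,
    0xF2: 0x04C8, 0xF3: 0x04D3, 0xF4: 0x04ED, 0xF5: 0x048D,
    0xF7: 0x04E7, 0xF8: 0x048B, 0xF9: 0x04C6, 0xFA: 0x04CA,
    0xFB: 0x00BB, 0xFC: 0x04CE, 0xFE: 0x048F, 0xFF: 0x02EE,
}

def build_table(overrides):
    """Start from ISO-8859-5 and replace the overridden positions."""
    table = [bytes([b]).decode("iso8859_5") for b in range(256)]
    for byte, code_point in overrides.items():
        table[byte] = chr(code_point)
    return table

DECODE_TABLE = build_table(ISO_IR_200_OVERRIDES)
ENCODE_TABLE = {ch: b for b, ch in enumerate(DECODE_TABLE)}

def decode_iso_ir_200(data: bytes) -> str:
    return "".join(DECODE_TABLE[b] for b in data)

def encode_iso_ir_200(text: str) -> bytes:
    # Raises KeyError for characters outside the set; a real codec
    # would offer error handlers instead.
    return bytes(ENCODE_TABLE[ch] for ch in text)

print(decode_iso_ir_200(bytes([0xA2, 0xD0, 0xF2])))
```

The same overlay approach would apply to any other variant that is specified as a list of changed positions.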

ISO-IR-201, "Volgaic Supplementary Cyrillic Set",[12] was similarly introduced by Everson Gunn Teoranta in order to support the Chuvash, Komi, Mari and Udmurt languages, spoken in the titular republics of Russia. FreeDOS calls it code page 58259.[13]

ISO-IR 201[12] (differences from ISO-8859-5)
    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Ax  NBSP Ё    Ӑ    Ӓ    Ӗ    Ҫ    І    Ӧ    Ӥ    Ӝ    Ҥ    Ӹ    Ӟ    SHY  Ӱ    Ӵ
              04D0 04D2 04D6 04AA      04E6 04E4 04DC 04A4 04F8 04DE      04F0 04F4
Fx  №    ё    ӑ    ӓ    ӗ    ҫ    і    ӧ    ӥ    ӝ    ҥ    ӹ    ӟ    §    ӱ    ӵ
              04D1 04D3 04D7 04AB      04E7 04E5 04DD 04A5 04F9 04DF      04F1 04F5

References

  1. Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12.
  2. Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
  3. "Code Page Identifiers". 7 January 2021.
  4. "ECMA-113 - Ecma International" (PDF).
  5. Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
  6. "gost19768-87 TXT.GZ file".
  7. European Computer Manufacturers Association (1 May 1988). Cyrillic part of the Latin/Cyrillic Alphabet (PDF). ITSCJ/IPSJ. ISO-IR-144.
  8. Sokolov, Michael (2003-04-05). "ECMA-cyrillic alias iso-ir-111 sore". IETF Charsets Mailing List (Mailing list).
  9. National Standards Authority of Ireland. Uralic Supplementary Cyrillic Set (PDF). ITSCJ/IPSJ. ISO-IR-200.
  10. Gunn, Marion; Everson, Michael (2001-09-20). "Everson Gunn Teoranta (EGT) & Everson Typography". Unicode Mail List (Mailing list).
  11. "Cpi/CPIISO/Codepage.TXT at master · FDOS/Cpi". GitHub.
  12. National Standards Authority of Ireland. Volgaic Supplementary Cyrillic Set (PDF). ITSCJ/IPSJ. ISO-IR-201.
  13. "Cpi/CPIISO/Codepage.TXT at master · FDOS/Cpi". GitHub.

External links

  • ISO-IR 144, Cyrillic part of the Latin/Cyrillic Alphabet (May 1, 1988, from ISO 8859-5 2nd version)
  • ISO/IEC 8859-5:1999
  • Standard ECMA-113: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet, 3rd edition (December 1999)