BCD (character encoding)

BCD (binary-coded decimal), also called alphanumeric BCD, alphameric BCD, BCD Interchange Code,[1] or BCDIC,[1] is a family of representations of numerals, uppercase Latin letters, and some special and control characters as six-bit character codes.

BCD Interchange Codes
Classification: 6-bit alphanumeric basic Latin encodings
Succeeded by: EBCDIC

Unlike later encodings such as ASCII, BCD codes were not standardized. Different computer manufacturers, and even different product lines from the same manufacturer, often had their own variants, and sometimes included unique characters. Other six-bit encodings with completely different mappings, such as some FIELDATA[1] variants or Transcode, are sometimes incorrectly termed BCD.

Many variants of BCD encode the characters '0' through '9' as the corresponding binary values.

History

Technically, binary-coded decimal describes the encoding of decimal numbers where each decimal digit is represented by a fixed number of bits, usually four.

With the introduction of the IBM card in 1928, IBM created a code[a] capable of representing alphanumeric information,[2] later adopted by other manufacturers. This code represents the numbers 0–9 by a single punch, and uses multiple punches for upper-case letters and special characters.[3] A letter has two punches (zone [12, 11, 0] + digit [1–9]); most special characters have two or three punches (zone [12, 11, 0, or none] + digit [2–7] + 8).

The BCD code is the adaptation of the punched card code to a six-bit binary code by encoding the digit rows (nine rows, plus unpunched) into the low four bits, and the zone rows (three rows, plus unpunched) into the high two bits.[4] The digit zero (a single punch in row 0) is usually handled specially in some way, and the digit code was extended to values 10 through 15 by combining a digit in the range 2–7 with a punch in row 8. IBM applied the terms binary-coded decimal and BCD to the variations of BCD alphamerics used in most early IBM computers, including the IBM 1620, IBM 1400 series, and non-Decimal Architecture members of the IBM 700/7000 series.
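The row-to-bit mapping described above can be sketched in Python. This is a minimal illustration, not any manufacturer's actual logic: the function name card_to_bcd and the ZONE_BITS table are hypothetical, and the zone assignment shown (zone 12 as binary 11) is only one of the orderings used by BCD variants. The special handling of the digit zero is deliberately left out, so a lone row-0 punch comes out as the bare zone code.

```python
# Hypothetical zone-to-bits table; real BCD variants differ in this ordering.
ZONE_BITS = {None: 0b00, 0: 0b01, 11: 0b10, 12: 0b11}


def card_to_bcd(zone, digit, eight=False):
    """Combine card punches into a 6-bit code (illustrative sketch).

    zone  -- 12, 11, 0, or None when no zone row is punched
    digit -- 1..9, or None when no digit row is punched
    eight -- True when a row-8 punch accompanies a digit in 2..7
    """
    if eight:
        if digit is None or not 2 <= digit <= 7:
            raise ValueError("row 8 combines only with digits 2-7")
        low = 8 + digit      # extends the digit code to 10..15
    else:
        low = digit or 0     # unpunched digit rows give 0
    # Zone goes into the high two bits, digit into the low four bits.
    return (ZONE_BITS[zone] << 4) | low


print(hex(card_to_bcd(12, 1)))                 # zone 12 + digit 1
print(hex(card_to_bcd(None, 3, eight=True)))   # digit 3 + row 8
```

Under these assumptions, the 12+1 combination used for the letter A yields hexadecimal 31, and an 8+3 combination with no zone punch yields hexadecimal 0B.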

Among the vendors using BCD were Burroughs,[5] Bull, CDC,[6] IBM, General Electric (the computer division was purchased by Honeywell in 1969), NCR, Siemens, and Sperry-UNIVAC.

IBM announced the 8-bit Extended Binary Coded Decimal Interchange Code (EBCDIC), based on BCDIC, in 1964 with the introduction of its System/360 line.

Special characters

The Recordmark or Record mark character (represented as ‡) is a character used to mark the end of a record.[7] The BCD code for this character is 32₈ (octal) in some BCD variants. The closest Unicode equivalent is U+29E7 THERMODYNAMIC, but that is not found in many fonts, so U+2021 DOUBLE DAGGER is often used instead. Functionally this corresponds to the EBCDIC IRS character (ASCII RS), X'1E'.

The Groupmark or Group mark character (represented as ⯒) is a character used to indicate the start or finish of a group of related fields.[8] The BCD code for this character is 77₈ (octal) in some BCD variants. The group mark was proposed for Unicode standardization in 2015[9] and was assigned to U+2BD2 GROUP MARK in Unicode 10.0, but only the Symbola and Unifont fonts support it. Functionally this corresponds to the EBCDIC IGS character (ASCII GS), X'1D'.
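As a small worked example, the two octal codes quoted above can be converted to hexadecimal and split into their zone and digit fields. The names split_bcd, RECORD_MARK and GROUP_MARK are hypothetical and chosen only for this sketch; the values apply to those BCD variants that use these codes.

```python
RECORD_MARK = 0o32   # octal 32, as quoted for some BCD variants
GROUP_MARK = 0o77    # octal 77, as quoted for some BCD variants


def split_bcd(code):
    """Return (zone_bits, digit_bits) of a 6-bit code."""
    if not 0 <= code <= 0o77:
        raise ValueError("not a 6-bit value")
    return code >> 4, code & 0x0F


for name, code in (("record mark", RECORD_MARK), ("group mark", GROUP_MARK)):
    zone, digit = split_bcd(code)
    print(f"{name}: hex {code:02X}, zone {zone:02b}, digit {digit:04b}")
```

The hexadecimal forms are 1A and 3F, which is the notation used in the code charts later in the article.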

The Wordmark, by contrast, is not a BCD character. Rather, it is a flag bit used to mark the end of a word on some variable word length computers such as the IBM 1401.

BCD code variations

There are many different versions of the six-bit BCD code. There are three major categories of difference:

  1. The mapping from zone punches to high-order bits. All codes translate no zone punches to a bit pattern of 00, but some encode the zone punches in 12-11-0 order, preserving alphabetical order, while others use 0-11-12 order, resulting in a partially reversed alphabet.
  2. The handling of the digit 0. The straightforward translation from punched form would place the blank before digits 1–9, and encode 0 at the start of the row containing 'S'. All codes have some special-case handling which either translates the digit 0 to the all-zero binary code (and moves the blank elsewhere), or gives it binary code 001010 (decimal 10) and moves the 8+2 punch elsewhere.
  3. The assignment of special characters. The characters assigned to codes beyond the basic alphanumeric set varied widely, even within one model of computer. For example, some computers[b] had the percent and lozenge (U+2311 SQUARE LOZENGE) at the same codes as left and right parentheses in other[c] encodings.
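The effect of the zone ordering in the first category can be sketched by sorting the letters by their codes under each ordering. This is an illustration only: the function names and the two zone tables are hypothetical, and the letter punches assumed here are zone 12 with digits 1–9 for A–I, zone 11 with digits 1–9 for J–R, and zone 0 with digits 2–9 for S–Z.

```python
import string

# Two hypothetical zone-to-bits assignments (no zone is 00 in both).
ORDER_12_11_0 = {12: 0b01, 11: 0b10, 0: 0b11}
ORDER_0_11_12 = {0: 0b01, 11: 0b10, 12: 0b11}


def letter_code(letter, zone_bits):
    """6-bit code of an uppercase letter under a given zone table."""
    index = string.ascii_uppercase.index(letter)
    if index < 9:        # A-I: zone 12, digits 1-9
        zone, digit = 12, index + 1
    elif index < 18:     # J-R: zone 11, digits 1-9
        zone, digit = 11, index - 8
    else:                # S-Z: zone 0, digits 2-9
        zone, digit = 0, index - 16
    return (zone_bits[zone] << 4) | digit


def collate(zone_bits):
    """Letters sorted by their numeric code."""
    return "".join(sorted(string.ascii_uppercase,
                          key=lambda c: letter_code(c, zone_bits)))


print(collate(ORDER_12_11_0))
print(collate(ORDER_0_11_12))
```

With the 12-11-0 table the letters sort in alphabetical order; with the 0-11-12 table the three groups appear in reverse (S–Z, then J–R, then A–I), while each group stays in order internally.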

In Spanish-speaking countries, the character "Ñ" did not exist in the original system, so most manufacturers (Bull, NCR, and Control Data) chose "@" to represent it. This caused an inconsistency when databases were merged into 7-bit ASCII code, because in that coding system the "/" character was chosen, resulting in two different codes for the same character.

Examples of BCD codes

The following charts show the numeric values of BCD characters in hexadecimal (base-16) notation, as that most clearly reflects the structure of 4-bit binary coded decimal, plus two extra bits. For example, the code for 'A', in row 3x and column x1, is hexadecimal 31, or binary '11 0001'.
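The chart notation can be reproduced with a few lines of Python. The helper names chart_code and as_bits are hypothetical; the sketch only shows how a row and column combine into a code and how that code splits into two zone bits and four digit bits.

```python
def chart_code(row, column):
    """Code for the cell in chart row `row`x and column x`column`."""
    if not (0 <= row <= 3 and 0 <= column <= 15):
        raise ValueError("row must be 0-3 and column 0-15")
    return (row << 4) | column


def as_bits(code):
    """Format a 6-bit code as two zone bits, a space, and four digit bits."""
    return f"{code >> 4:02b} {code & 0x0F:04b}"


code = chart_code(3, 1)            # row 3x, column x1
print(f"{code:02X}", as_bits(code))
```

For row 3x and column x1 this gives hexadecimal 31 and the bit pattern 11 0001, matching the example for 'A'.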

Tape style

48-character BCD code

The first versions of BCDIC had 48 characters, as they were based on card punch patterns and the character sets of printers, neither of which encouraged having a power-of-two number of characters.

IBM 48-character BCDIC code[1]:68
    x0    x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x  space 1  2  3  4  5  6  7  8  9  0  #  @
1x        /  S  T  U  V  W  X  Y  Z     ,  %
2x  -     J  K  L  M  N  O  P  Q  R     $  *
3x  &     A  B  C  D  E  F  G  H  I     .  ⌑

This was based on a 40-character punched card code: the original 37 (10 digits, 26 letters, and blank), plus three commercially important characters added around 1932:[1]:67 the hyphen-minus, used for printing credit balances and hyphenated names; the ampersand, also used in many names and addresses (Procter & Gamble, Mr. & Mrs. Smith); and the asterisk, used to overprint unused fields when printing cheques.

IBM 1401 BCD code

Rather than following the IBM 704's storage representation, the IBM 1401 followed the tape representation (descended from the 48-character BCD), thus using the all-zero code for blank and the code 10 (0x0A) for the digit zero. It had defined character forms for all possible values, for documentation purposes,[10] but only 48 of the 63 non-blank characters were printable, and there was considerable variation in how the other code values (shaded in the table below) were depicted in practice. Even the other characters varied between different available print chains for the IBM 1403 printer.
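A minimal decoding sketch for the tape-style layout is given here, restricted to the blank, the digits and the letters, whose positions follow directly from the description above (blank at the all-zero code, zero at 0x0A). The names build_tape_table and decode are hypothetical, special characters are left out because they vary between variants, and unknown codes are replaced by U+FFFD as an arbitrary choice.

```python
import string


def build_tape_table():
    """Map tape-style 6-bit codes to characters (blank, digits, letters only)."""
    table = {0x00: " ", 0x0A: "0"}           # blank is all zeros, zero is code 10
    for d in range(1, 10):
        table[d] = str(d)                    # digits 1-9 at 0x01..0x09
    letters = string.ascii_uppercase
    for i, ch in enumerate(letters[0:9]):    # A-I at 0x31..0x39
        table[0x31 + i] = ch
    for i, ch in enumerate(letters[9:18]):   # J-R at 0x21..0x29
        table[0x21 + i] = ch
    for i, ch in enumerate(letters[18:26]):  # S-Z at 0x12..0x19
        table[0x12 + i] = ch
    return table


TAPE = build_tape_table()


def decode(codes, unknown="\ufffd"):
    """Decode an iterable of 6-bit codes; unmapped codes become `unknown`."""
    return "".join(TAPE.get(code, unknown) for code in codes)


print(decode([0x31, 0x00, 0x01, 0x0A]))
```

The sample input decodes to the letter A, a blank, and the digits 1 and 0, showing that the digit zero is stored as code 0x0A rather than as the all-zero code.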

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 # @ : >
1x ¢ / S T U V W X Y Z , % = ' "
2x - J K L M N O P Q R ! $ * ) ; Δ
3x & A B C D E F G H I ? . ( <

Code page 353

The BCDIC-A code page was assigned as Code page 353, also known as CP353. Some of the characters in this code page are not in Unicode. (The duplication of '#' can be found in IBM's own documentation and is not a mistake here.[11])

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 # @ : >
1x / S T U V W X Y Z , % γ \
2x - J K L M N O P Q R ! # * ] ; Δ
3x & A B C D E F G H I ? . [ <

At 0x1A is the record mark. At 0x3F is the group mark.

Code page 354

The BCDIC-B code page was assigned as Code page 354, also known as CP354.[12] Some of the characters in this code page are not in Unicode.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 ' : >
1x / S T U V W X Y Z , ( γ \
2x - J K L M N O P Q R ! # * ] ; Δ
3x + A B C D E F G H I ? . ) [ <

At 0x1A is the record mark. At 0x3F is the group mark.

PTTC/BCD code pages

PTTC/BCD had five options, each with its own code page, shown below. The PTTC/BCD Standard Option was assigned as Code page 355, or CP355.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 #
1x @ / S T U V W X Y Z , γ
2x - J K L M N O P Q R < $
3x & A B C D E F G H I ) .

The PTTC/BCD H Option was assigned as Code page 357, or CP357.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 =
1x ' / S T U V W X Y Z ,
2x - J K L M N O P Q R ! $
3x + A B C D E F G H I ? .

The PTTC/BCD Correspondence Option was assigned as Code page 358, or CP358.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 '
1x ! / S T U V W X Y Z ,
2x - J K L M N O P Q R < ;
3x = A B C D E F G H I > .

The PTTC/BCD Monocase Option was assigned as Code page 359, or CP359.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 #
1x @ / S T U V W X Y Z ,
2x - J K L M N O P Q R $
3x & A B C D E F G H I .

The PTTC/BCD Duocase Option was assigned as Code page 360, or CP360.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x space 1 2 3 4 5 6 7 8 9 0 #
1x @ / S T U V W X Y Z ,
2x - J K L M N O P Q R $
3x & A B C D E F G H I .

IBM 704 storage style

IBM 704 BCD code

The IBM 704 reordered the BCDIC code to allow a normal alphabetic collating order internally, with 0 before 1 and A before Z. It could automatically translate between this internal form and the earlier BCDIC when reading and writing magnetic tapes.[13]:35
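The relationship between the two forms can be sketched as a lookup table built from the character positions in the tape-style and IBM 704 charts of this article. This is not a description of how the 704 hardware performed the translation: the names are hypothetical, and the sketch covers only the blank, the digits and the letters.

```python
import string


def alnum_table(zero, blank, zone_a, zone_j, zone_s):
    """Map 6-bit codes to characters for one layout (blank, digits, letters)."""
    table = {zero: "0", blank: " "}
    for d in range(1, 10):
        table[d] = str(d)                            # digits 1-9 at 0x01..0x09
    up = string.ascii_uppercase
    for i in range(9):
        table[(zone_a << 4) | (i + 1)] = up[i]       # A-I, digits 1-9
        table[(zone_j << 4) | (i + 1)] = up[9 + i]   # J-R, digits 1-9
    for i in range(8):
        table[(zone_s << 4) | (i + 2)] = up[18 + i]  # S-Z, digits 2-9
    return table


# Positions assumed from the charts: tape style vs. IBM 704 storage style.
TAPE = alnum_table(zero=0x0A, blank=0x00, zone_a=3, zone_j=2, zone_s=1)
IBM704 = alnum_table(zero=0x00, blank=0x30, zone_a=1, zone_j=2, zone_s=3)

TAPE_CODE = {ch: code for code, ch in TAPE.items()}
IBM704_CODE = {ch: code for code, ch in IBM704.items()}
TAPE_TO_704 = {code: IBM704_CODE[ch] for code, ch in TAPE.items()}


def sort_by(text, codes):
    """Sort characters by their numeric code in the given layout."""
    return "".join(sorted(text, key=lambda ch: codes[ch]))


print(sort_by("ZA10", TAPE_CODE))     # order by tape-style codes
print(sort_by("ZA10", IBM704_CODE))   # order by IBM 704 codes
```

Sorting the sample by tape-style codes puts 1 before 0 and Z before A, while sorting by the 704 codes gives 0, 1, A, Z, which is the collating order the reordering was meant to achieve.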

The following table shows the code assignments for the IBM 704 computer. Unassigned code positions appear as blanks.[13]:35

IBM 704 character set
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 0 1 2 3 4 5 6 7 8 9 # @
1x & A B C D E F G H I +0 .
2x - J K L M N O P Q R -0 $ *
3x space / S T U V W X Y Z , %

(+0 and -0 were rarely used characters that corresponded to the punched-card convention of a digit 0 with an overpunched sign in rows 12 or 11.)

The following table shows the code assignments for the type 716 printer used starting with the IBM 704 computer and through the 7094.[13]:58 The 704 interface[d] sent virtual punched-card rows to this printer, two words (72 bits) at a time, so the mapping from 6-bit BCD characters was done by software, and was not built into the printer.

IBM 716 printer character set G
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x * 1 2 3 4 5 6 7 8 9 + -
1x + A B C D E F G H I .
2x - J K L M N O P Q R $ *
3x 0 / S T U V W X Y Z , %

This is a repertoire of 45 characters (not counting blank, which is handled specially by the printer), as the characters +, - and * are duplicated.

Fortran character set

There was some variation; IBM 704 Fortran had a different set of special characters (preserving only the duplicated minus sign).[14]

IBM 716 printer Fortran character set
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x * 1 2 3 4 5 6 7 8 9 = -
1x + A B C D E F G H I . )
2x - J K L M N O P Q R $ *
3x 0 / S T U V W X Y Z , (

A similar code was used for the IBM 709, 7090 and 7094 successors,[15] but with some of the special characters reassigned:

IBM 7090/7094 character set
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 0 1 2 3 4 5 6 7 8 9 = "
1x & A B C D E F G H I +0 . )
2x - J K L M N O P Q R -0 $ *
3x space / S T U V W X Y Z ± , (

GBCD code

Below is the table of GE/Honeywell's GBCD code, a variant of BCD.[16]

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 0 1 2 3 4 5 6 7 8 9 [ # @ : > ?
1x space A B C D E F G H I & . ] ( < \
2x ^ J K L M N O P Q R - $ * ) ; '
3x + / S T U V W X Y Z _ , % = " !

Burroughs B5500 BCD code

The following table shows the code assignments for the Burroughs B5500 computer, sometimes referred to as BIC (Burroughs Interchange Code).[17]

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 0 1 2 3 4 5 6 7 8 9 # @ ? : >
1x + A B C D E F G H I . [ & ( <
2x × J K L M N O P Q R $ * - ) ;
3x space / S T U V W X Y Z , % = ] "

Notes

  a. There are actually multiple card codes; e.g., by 1964 there were ten versions of the IBM 026 with slightly different character sets.
  b. E.g., IBM 702, IBM 705.
  c. E.g., IBM 701, IBM 704.
  d. The interface on, e.g., the 7090 is different, although the software still must do the mapping.

References

  1. Mackenzie, Charles E. (1980). Coded Character Sets, History and Development (PDF). The Systems Programming Series (1 ed.). Addison-Wesley Publishing Company, Inc. ISBN 0-201-14460-3. LCCN 77-90165. Archived (PDF) from the original on 2016-05-26. Retrieved 2017-04-22.
  2. Pugh, Emerson W.; Heide, Lars. "STARS: Punched Card Equipment". IEEE Global History Network. Archived from the original on 2012-05-11. Retrieved 2012-06-09.
  3. Pugh, Emerson W. (1995). Building IBM: Shaping an Industry and Its Technology. MIT Press. pp. 50–51. ISBN 978-0-262-16147-3.
  4. Jones, Douglas W. "Punched Card Codes". Retrieved 2014-01-01.
  5. Burroughs B5500 Information Processing Systems: Reference Manual (PDF). Burroughs Corporation. 1964. Archived from the original (PDF) on 2020-07-29. Retrieved 2012-06-08.
  6. Control Data Corporation (1965). Codes/Control Data 6600 Computer System (PDF).
  7. "Record-mark". Encyclopedia. PC Magazine. Retrieved 2016-04-09.
  8. "group mark". Encyclopedia. Retrieved 2016-04-09.
  9. Shirriff, Ken. "Proposal for addition of Group Mark symbol" (PDF). unicode.org. Retrieved 2016-04-09.
  10. IBM 1401 Data Processing System: Reference Manual (PDF). IBM. April 1962. p. 170. A24-1403-5. Archived from the original (PDF) on 2012-03-14.
  11. "Systems i Software Globalization cp00353z" (PDF). www-03.ibm. Archived from the original (PDF) on 2013-01-21. Retrieved 2022-06-30.
  12. https://ccsids.net/ccsids.html#ccsid-354.
  13. IBM 704 electronic data-processing machine manual of operation (PDF). IBM. 1955. pp. 35, 58. Form 24-6661-2. Retrieved 2017-04-22.
  14. "Fortran Automatic Coding System for the IBM 704" (PDF). IBM. 1956-10-15. p. 49. Retrieved 2015-09-15.
  15. Harper, Jack (2001-08-21). "IBM 7090/94 Character Representation". Retrieved 2017-04-22.
  16. "Section: Tables of characters in BULL computers" (PDF). Archived from the original (PDF) on 2011-07-08. Retrieved 2010-11-15.
  17. Burroughs B 5500 Information Processing Systems Extended Algol Reference Manual (PDF). 1966. p. B-1.
