In computer science, an escape sequence is a combination of characters that has a meaning other than the literal characters contained therein;[1] it is marked by one or more preceding (and possibly terminating) characters.[2]

Examples

  • In C and many derivative programming languages, a string escape sequence is a series of two or more characters, starting with a backslash \.[3]
    • Note that in C a backslash immediately followed by a newline does not constitute an escape sequence, but splices physical source lines into logical ones in the second translation phase, whereas string escape sequences are converted in the fifth translation phase.[4]
    • To represent the backslash character itself, \\ can be used, whereby the first backslash indicates an escape and the second specifies that a backslash is being escaped.[5]
    • A character may be escaped in multiple different ways. Assuming ASCII encoding, the escape sequences \x5c (hexadecimal), \\, and \134 (octal) all encode the same character: the backslash \.
  • For devices that respond to ANSI escape sequences, the combination of three or more characters beginning with the ASCII "escape" character (decimal character code 27) followed by the left-bracket character [ (decimal character code 91) defines an escape sequence.
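
The equivalence of these spellings can be checked in any language that borrows C's string escapes. The following is a minimal sketch in Python 3, used here only for illustration; the variable names are hypothetical.

```python
# Three spellings of the same one-character string (a single backslash).
hex_form = "\x5c"      # hexadecimal escape
doubled = "\\"         # escaped backslash
octal_form = "\134"    # octal escape

same = hex_form == doubled == octal_form
print(same, len(doubled), ord(doubled))   # True 1 92

# An ANSI-style sequence starts with ESC (27) followed by "[" (91).
csi = chr(27) + chr(91)
print(csi == "\x1b[")                     # True
```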

Control sequences


When directed, this series of characters is used to change the state of computers and their attached peripheral devices, rather than to be displayed or printed as regular data bytes would be. These are also known as control sequences, reflecting their use in device control, beginning with the Control Sequence Introducer, originally the "escape character" (ASCII code 27 decimal), often written "Esc" on keycaps.

With the introduction of ANSI terminals, most escape sequences began with the two characters "ESC" then "[", or a specially allocated CSI character with code 155 (decimal).

Not all control sequences used an escape character; for example:

  • Data General terminal control sequences did not,[8][9][10] but they were often still called escape sequences. The very common practice of "escaping" special characters in programming languages and command-line parameters today often uses the backslash character to begin the sequence.

Escape sequences in communications are commonly used when a computer and a peripheral have only a single channel through which to send information back and forth (so escape sequences are an example of in-band signaling).[11][12] They were common when most dumb terminals used ASCII with 7 data bits for communication, and sometimes would be used to switch to a different character set for "foreign" or graphics characters that would otherwise have been restricted by the 128 codes available in 7 data bits. Even relatively "dumb" terminals responded to some escape sequences: the original mechanical Teletype printers (on which "glass Teletypes" or VDUs were based) responded to characters 27 and 31 to alternate between letters and figures modes.

Keyboard


An escape character is usually assigned to the Esc key on a computer keyboard, and can be sent in other ways than as part of an escape sequence. For example, the Esc key may be used as an input character in editors such as vi,[13] or for backing up one level in a menu in some applications.[14] The Hewlett-Packard HP 2640 terminals had a key for a "display functions" mode which would display graphics for all control characters, including Esc, to aid in debugging applications.

If the Esc key and other keys that send escape sequences are both supposed to be meaningful to an application, an ambiguity arises if a character terminal is in use. When the application receives the ASCII escape character, it is not clear whether that character is the result of the user pressing the Esc key or whether it is the initial character of an escape sequence (e.g., resulting from an arrow key press). The traditional method of resolving the ambiguity is to observe whether or not another character quickly follows the escape character. If not, it is assumed not to be part of an escape sequence. This heuristic can fail under some circumstances, especially without fast modern communication speeds.
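
The timing heuristic can be sketched as a small function over timestamped input. This is an illustrative sketch only: the function name, the event format, and the 50 ms threshold are assumptions, not values taken from any particular terminal library.

```python
ESC = 0x1B

def classify_escape(events, timeout=0.05):
    """Label each ESC byte as a lone key press or the start of a sequence.

    `events` is a list of (arrival_time_seconds, byte) pairs in arrival order.
    An ESC counts as a sequence start only if another byte arrives within
    `timeout` seconds; the 50 ms default is an illustrative assumption.
    """
    labels = []
    for i, (t, byte) in enumerate(events):
        if byte != ESC:
            continue
        has_follower = i + 1 < len(events) and events[i + 1][0] - t <= timeout
        labels.append("sequence" if has_follower else "esc-key")
    return labels

# ESC [ A arriving in one burst (an arrow key), then a lone ESC later on.
events = [(0.000, 0x1B), (0.001, ord("[")), (0.002, ord("A")), (1.000, 0x1B)]
print(classify_escape(events))   # ['sequence', 'esc-key']
```

On a slow link the bytes of an arrow-key sequence may arrive further apart than the threshold, in which case this sketch would mislabel the leading ESC as a key press, which is the failure mode described above.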

Escape sequences date back at least to the 1874 Baudot code.[15][16][17]

Modem control


The Hayes command set, for instance, defines a single escape sequence, +++. (In order to interpret +++, which may be a part of data, as the escape sequence, the sender stops communication for one second before and after the +++.) When the modem encounters this in a stream of data, it switches from its normal mode of operation, which simply sends any characters to the phone, to a command mode in which the following data is assumed to be a part of the command language. You can switch back to the online mode by sending the O command.
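
The role of the one-second pauses can be illustrated with a simplified detector. This is a sketch under stated assumptions: data arrives as timestamped chunks, each chunk is treated as instantaneous, the start of the stream counts as silence, and the names `find_escape` and `GUARD_TIME` are hypothetical.

```python
GUARD_TIME = 1.0  # seconds of silence required before and after "+++"

def find_escape(chunks, end_time, guard=GUARD_TIME):
    """Return the index of the chunk that is a guarded "+++", or None.

    `chunks` is a list of (time_seconds, data) pairs in order; `end_time`
    is when observation stopped, used to check the trailing silence.
    """
    for i, (t, data) in enumerate(chunks):
        if data != b"+++":
            continue
        prev_t = chunks[i - 1][0] if i > 0 else None
        next_t = chunks[i + 1][0] if i + 1 < len(chunks) else end_time
        quiet_before = prev_t is None or t - prev_t >= guard
        quiet_after = next_t - t >= guard
        if quiet_before and quiet_after:
            return i
    return None

chunks = [
    (0.0, b"1+1+++2"),    # "+++" embedded in data: not an escape
    (0.1, b"+++"),        # no silence beforehand: still data
    (2.0, b"+++"),        # silence on both sides: escape sequence
    (3.5, b"more data"),
]
print(find_escape(chunks, end_time=5.0))   # 2
```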

The Hayes command set is modal, switching from command mode to online mode.[18][19] This is not appropriate in the case where the commands and data will switch back and forth rapidly. An example of a non-modal escape sequence control language is the VT100, which used a series of commands prefixed by a Control Sequence Introducer.

Comparison with control characters


A control character is a character that, in isolation, has some control function, such as carriage return (CR). Escape sequences, by contrast, consist of one or more escape characters which change the interpretation of subsequent characters.

ASCII video data terminals


The VT52 terminal used simple digraph commands like escape-A: in isolation, "A" simply meant the letter "A", but as part of the escape sequence "escape-A", it had a different meaning. The VT52 also supported parameters: it was not a straightforward control language encoded as substitution.
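
A small tokenizer shows how the same letter is read differently after an escape character. This is a sketch, not a complete VT52 emulator: it assumes the commonly documented cursor commands (ESC A, B, C, D, H) and the parameterized ESC Y, whose two parameter characters are offset by 32. The function name and action labels are hypothetical.

```python
ESC = "\x1b"

# Subset of VT52 commands; the action names are illustrative labels.
SIMPLE = {"A": "cursor_up", "B": "cursor_down", "C": "cursor_right",
          "D": "cursor_left", "H": "cursor_home"}

def parse_vt52(stream):
    """Split a string into ('text', s) and ('cmd', name, *params) tokens."""
    tokens, i, text = [], 0, []
    while i < len(stream):
        ch = stream[i]
        if ch == ESC and i + 1 < len(stream):
            if text:
                tokens.append(("text", "".join(text)))
                text = []
            code = stream[i + 1]
            if code == "Y" and i + 3 < len(stream):
                # ESC Y takes two parameter characters: row and column,
                # each encoded as the value plus 32.
                row = ord(stream[i + 2]) - 32
                col = ord(stream[i + 3]) - 32
                tokens.append(("cmd", "cursor_to", row, col))
                i += 4
            else:
                tokens.append(("cmd", SIMPLE.get(code, "unknown:" + code)))
                i += 2
        else:
            text.append(ch)
            i += 1
    if text:
        tokens.append(("text", "".join(text)))
    return tokens

# A plain "A", then ESC A (a command), then "B", then ESC Y with parameters.
print(parse_vt52("A\x1bAB\x1bY!#"))
# [('text', 'A'), ('cmd', 'cursor_up'), ('text', 'B'), ('cmd', 'cursor_to', 1, 3)]
```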

The later VT100 terminal implemented the more sophisticated ANSI escape sequences standard (now ECMA-48) for functions such as controlling cursor movement, character set, and display enhancements. The Hewlett-Packard HP 2640 series had perhaps the most elaborate escape sequences for block and character modes, programming keys and their soft labels, graphics vectors, and even saving data to tape or disk files.

Use in DOS and Windows


A utility, ANSI.SYS,[20] can be used to enable the interpreting of the ANSI (ECMA-48) terminal escape sequences under DOS (by using $e in the PROMPT command) or in command windows in 16-bit Windows. The rise of GUI applications, which directly write to display cards, has greatly reduced the usage of escape sequences on Microsoft platforms, but they can still be used to create interactive random-access character-based screen interfaces with the character-based library routines such as printf without resorting to a GUI program.

Use in Linux and Unix displays


The default text terminal and text windows (such as xterm) respond to ANSI escape sequences.
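
As an illustration, the following sketch builds two widely supported ECMA-48 sequences as plain strings: Select Graphic Rendition (final character "m") and cursor position (final character "H"). The helper names `sgr` and `cursor_to` are hypothetical, and the visible effect depends on the terminal in use.

```python
ESC = "\x1b"
CSI = ESC + "["          # two-character Control Sequence Introducer

def sgr(*codes):
    """Build a Select Graphic Rendition sequence such as ESC [ 1 ; 31 m."""
    return CSI + ";".join(str(c) for c in codes) + "m"

def cursor_to(row, col):
    """Build a cursor-position sequence (1-based row and column)."""
    return f"{CSI}{row};{col}H"

bold_red = sgr(1, 31)
reset = sgr(0)
print(repr(bold_red))                  # '\x1b[1;31m'
print(bold_red + "warning" + reset)    # rendered in bold red on ANSI terminals
```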

Quoting escape


Overview


When an escape character is needed within the quoted/escaped string, there are two strategies used within programming and scripting languages:

  • doubled delimiter (e.g. 'He didn''t do it.')[21]
  • secondary escape sequence

An example of the latter is the use of the caret (^). For example, the following outputs "You can do so via Cut&Paste" in CMD (otherwise, the ampersand has a restricted use):[22]

echo You can do so via Cut^&Paste
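
The first strategy, the doubled delimiter, can be sketched as a pair of helper functions. The names `quote_doubled` and `unquote_doubled` are hypothetical and are not taken from any library.

```python
def quote_doubled(text, delim="'"):
    """Wrap text in delimiters, doubling any delimiter found inside it."""
    return delim + text.replace(delim, delim * 2) + delim

def unquote_doubled(quoted, delim="'"):
    """Reverse quote_doubled: strip the outer delimiters, undouble the rest."""
    inner = quoted[len(delim):-len(delim)]
    return inner.replace(delim * 2, delim)

quoted = quote_doubled("He didn't do it.")
print(quoted)                     # 'He didn''t do it.'
print(unquote_doubled(quoted))    # He didn't do it.
```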

In detail


A common use of escape sequences is in fact to remove control characters found in a binary data stream so that they will not cause their control function by mistake. In this case, the control character is replaced by a defined "escape character" (which need not be the US-ASCII escape character) and one or more other characters; after exiting the context where the control character would have caused an action, the sequence is recognized and replaced by the removed character.[22] To transmit the "escape character" itself, two copies are sent.[21]
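
This scheme can be sketched as a pair of functions over bytes. The specific choices here are assumptions made for illustration and do not follow any named protocol: the escape byte is 0x10, the protected control bytes are 0x02 and 0x03, and each is replaced by the escape byte followed by a stand-in letter.

```python
ESCAPE = 0x10                                 # arbitrary choice of escape byte
CONTROL = {0x02: ord("B"), 0x03: ord("E")}    # control byte -> stand-in byte
REVERSE = {v: k for k, v in CONTROL.items()}

def stuff(data: bytes) -> bytes:
    """Replace control bytes with two-byte escapes; double the escape byte."""
    out = bytearray()
    for b in data:
        if b == ESCAPE:
            out += bytes([ESCAPE, ESCAPE])
        elif b in CONTROL:
            out += bytes([ESCAPE, CONTROL[b]])
        else:
            out.append(b)
    return bytes(out)

def unstuff(data: bytes) -> bytes:
    """Recognize each escape pair and restore the removed byte."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b != ESCAPE:
            out.append(b)
            continue
        nxt = next(it, None)
        if nxt == ESCAPE:
            out.append(ESCAPE)
        elif nxt in REVERSE:
            out.append(REVERSE[nxt])
        else:
            raise ValueError("malformed escape pair")
    return bytes(out)

payload = bytes([0x41, 0x02, 0x10, 0x03])
stuffed = stuff(payload)
print(stuffed)                        # b'A\x10B\x10\x10\x10E'
print(unstuff(stuffed) == payload)    # True
```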

In many programming languages and command line interfaces, escape sequences are used in character literals and string literals to express characters which are not printable or clash with the syntax of characters or strings. For example, control characters themselves might not be allowed to be placed in the program coded by the editor program, or may have undesirable side-effects if typed into a command. The end-of-quote character is also a problem for programmers that can be solved by escaping it. In most contexts the escape character is the backslash ("\").

Samples


For example, the single quotation mark character might be expressed as '\'' since writing ''' is not acceptable.

Many modern programming languages specify the doublequote character (") as a delimiter for a string literal. The backslash escape character typically provides ways to include doublequotes inside a string literal, such as by modifying the meaning of the doublequote character embedded in the string (\"), or by modifying the meaning of a sequence of characters including the hexadecimal value of a doublequote character (\x22). Both sequences encode a literal doublequote (").

In Perl or Python 2

print "Nancy said "Hello World!" to the crowd.";

produces a syntax error, whereas:

print "Nancy said \"Hello World!\" to the crowd.";   ### example of \"

produces the intended output. Another alternative:

print "Nancy said \x22Hello World!\x22 to the crowd.";   ### example of \x22

uses "\x" to indicate the following two characters are hexadecimal digits, "22" being the ASCII value for a doublequote in hexadecimal.
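
Both forms carry over to Python 3, where print is a function rather than a statement. A minimal sketch, with hypothetical variable names:

```python
escaped = "Nancy said \"Hello World!\" to the crowd."
hexed = "Nancy said \x22Hello World!\x22 to the crowd."

print(escaped)            # Nancy said "Hello World!" to the crowd.
print(escaped == hexed)   # True
```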

C, C++, and Ruby all allow exactly the same two backslash escape styles. Java allows \" but has no \x form; it offers octal escapes such as \42 instead. The PostScript language and Microsoft Rich Text Format also use backslash escapes. The quoted-printable encoding uses the equals sign as an escape character.

URLs and URIs use percent-encoding to quote characters with a special meaning, as well as non-ASCII characters.
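
Percent-encoding can be demonstrated with the `quote` and `unquote` functions from Python's standard `urllib.parse` module. The sample string is arbitrary, and `safe=""` is passed so that no reserved character is left unencoded.

```python
from urllib.parse import quote, unquote

raw = "a b&c=é"
encoded = quote(raw, safe="")     # non-ASCII text is encoded as UTF-8 bytes
print(encoded)                    # a%20b%26c%3D%C3%A9
print(unquote(encoded) == raw)    # True
```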

Another similar (and partially overlapping) syntactic trick is stropping.

Some programming languages also provide other ways to represent special characters in literals, without requiring an escape character (see e.g. delimiter collision).


References

  1. "Escape Sequence".
  2. "Characters". The Java Tutorials.
  3. "Escape Sequences". 3 August 2021. Character combinations consisting of a backslash \ followed by a letter or by a combination of digits are called escape sequences.
  4. "ISO/IEC 9899:201x Committee Draft N1570" (PDF). 5.1.1.2 Translation phases, 2.: Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. [...]
  5. "Escape sequences". IBM.
  6. "Chapter 5 – AT Commands" (PDF).
  7. "AT Command Set and Register Summary for Analog Modem Modules".
  8. "Data General terminals: discussion of".
  9. "What's a Terminal?".
  10. "Data General DG210 DG211 Terminal Emulation Software".
  11. "Escape sequence".
  12. "Terminals & Printers Handbook Glossary".
  13. "Twelve Useful "vi" Commands". vi commands […] Pressing the Esc (Escape) key is how you […]
  14. "Five Unexpected Uses for the Esc Key". PCworld. 2009-10-29.
  15. "What is ASCII? The Economist explains". The Economist. 2013-06-09.
  16. "Baudot and CCITT code". The Baudot code, invented in 1870 and patented in 1874 by J. Baudot is […]
  17. "Guide to the use of Character Sets in Europe". elements C0 and C1 of control characters […] a 5-bit code patented by Jean-Maurice-Emile Baudot (1845-1903) in 1874
  18. "Basic Hayes AT Command Set". 2011-02-05. +++ - "Escape Sequence" - This command initiates an escape sequence to return the modem to the on-line command mode
  19. "Modem Programming Basics". When a modem is in command mode, the modem can accept commands from you
  20. 17. Understanding ANSI.SYS - Special Edition Using MS-DOS 6.22.
  21. "Apostrophe Editing ('aaa') (FORTRAN 77 Language Reference)". Within the field, two consecutive apostrophes […]
  22. "The Windows NT Command Shell". 20 February 2014.