In a computer language, a reserved word (also known as a reserved identifier) is a word that cannot be used as an identifier, such as the name of a variable, function, or label – it is "reserved from use". This is a syntactic definition, and a reserved word may have no user-defined meaning.

A closely related and often conflated notion is a keyword, which is a word with special meaning in a particular context. This is a semantic definition. By contrast, names in a standard library but not built into a language are not considered reserved words or keywords. The terms "reserved word" and "keyword" are often used interchangeably – one may say that a reserved word is "reserved for use as a keyword" – and formal use varies from language to language. For this article, we distinguish as above.

In general, reserved words and keywords need not coincide, but in most modern languages keywords are a subset of reserved words, as this makes parsing easier, since keywords cannot be confused with identifiers. In some languages, like C or Python, reserved words and keywords coincide, while in other languages, like Java, all keywords are reserved words, but some reserved words are not keywords, being reserved for future use. In yet other languages, such as ALGOL, FORTRAN, ooRexx, PL/I, and REXX, there are keywords but no reserved words, with keywords being distinguished from identifiers by other means.

Distinction

The sets of reserved words and keywords in a language often coincide or are almost equal, and the distinction is subtle, so the terms are often used interchangeably. However, in careful use they are distinguished.

Making keywords be reserved words makes lexing easier, as a string of characters will unambiguously be either a keyword or an identifier, without depending on context; thus keywords are usually a subset of reserved words. However, reserved words need not be keywords. For example, in Java, goto is a reserved word, but has no meaning and does not appear in any production rules in the grammar. This is usually done for forward compatibility, so a reserved word may become a keyword in a future version without breaking existing programs.

Conversely, keywords need not be reserved words, with their role understood from context, or they may be distinguished in another manner, such as by stropping. For example, the phrase if = 1 is unambiguous in most grammars, since a control statement of an if clause cannot start with an =, and thus is allowed in some languages, such as FORTRAN. Alternatively, in ALGOL 68, keywords must be stropped – marked in some way to distinguish them – in the strict language by listing in bold, and thus are not reserved words. Thus in the strict language the following expression is legal, as the bold keyword if does not conflict with the ordinary identifier if:

if if eq 0 then 1 fi

However, in ALGOL 68 there is also a stropping regime in which keywords are reserved words, an example of how these distinct concepts often coincide; this is followed in many modern languages.
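Python gives a small modern illustration of a keyword that is not reserved: since version 3.10 the word match introduces a statement only in certain positions, and it remains usable as an ordinary identifier. The sketch below relies only on the built-in compile function and the standard keyword module; the helper name compiles is an illustrative assumption, not part of any library.

```python
import keyword

def compiles(source: str) -> bool:
    """Return True if `source` parses as a Python module."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# `match` is not in the reserved list, so it can be assigned like any name.
match_is_reserved = keyword.iskeyword("match")
match_assignable = compiles("match = 1")

# `if` is reserved, so the same assignment is rejected by the parser.
if_is_reserved = keyword.iskeyword("if")
if_assignable = compiles("if = 1")

print(match_is_reserved, match_assignable, if_is_reserved, if_assignable)
```

Here the parser decides from context whether match starts a statement, which is the same idea as FORTRAN accepting if = 1.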

Syntax

A reserved word is one that "looks like" a normal word, but is not allowed to be used as a normal word. Formally this means that it satisfies the usual lexical syntax (syntax of words) of identifiers – for example, being a sequence of letters – but cannot be used where identifiers are used. For example, the word if is commonly a reserved word, while x generally is not, so x = 1 is a valid assignment, but if = 1 is not.
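As one concrete instance, Python exposes both halves of this definition separately: str.isidentifier tests only the lexical shape of a word, while keyword.iskeyword tests whether it is set aside. The function name classify and its three labels are assumptions made for this sketch.

```python
import keyword

def classify(word: str) -> str:
    """Classify a word as 'reserved', 'identifier', or 'not a word'."""
    if not word.isidentifier():
        return "not a word"    # fails the lexical syntax of identifiers
    if keyword.iskeyword(word):
        return "reserved"      # has the shape of an identifier, but is set aside
    return "identifier"

for w in ("if", "x", "1x"):
    print(w, "->", classify(w))
```

The word if passes the lexical test and is rejected only by the second check, which is what makes it reserved rather than simply malformed.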

Keywords have varied uses, but mainly fall into a few classes: part of the phrase grammar (specifically a production rule with nonterminal symbols), with various meanings, often being used for control flow, such as the word if in most procedural languages, which indicates a conditional and takes clauses (the nonterminal symbols); names of primitive types in a language that supports a type system, such as int; primitive literal values such as true for Boolean true; or sometimes special commands like exit. Other uses of keywords in phrases are for input/output, such as print.

The distinct definitions are clear when a language is analyzed by a combination of a lexer and a parser, and the syntax of the language is generated by a lexical grammar for the words, and a context-free grammar of production rules for the phrases. This is common in analyzing modern languages, and in this case keywords are a subset of reserved words, as they must be distinguished from identifiers at the word level (hence reserved words) to be syntactically analyzed differently at the phrase level (as keywords).

In this case reserved words are defined as part of the lexical grammar, and are each tokenized as a separate type, distinct from identifiers. In conventional notation, the reserved words if and then, for example, are tokenized as types IF and THEN, respectively, while x and y are both tokenized as type Identifier.
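The tokenization just described can be sketched as a small lexer for a hypothetical toy language. The RESERVED table, the token type names Number and Operator, and the regular expression are assumptions chosen for illustration; only IF, THEN, and Identifier come from the text.

```python
import re

# Hypothetical toy language: each reserved word has its own token type.
RESERVED = {"if": "IF", "then": "THEN"}

TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=<>+\-*/]))"
)

def tokenize(source: str) -> list[tuple[str, str]]:
    """Split `source` into (token_type, text) pairs."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        pos = m.end()
        if m.group("word"):
            text = m.group("word")
            # A word is looked up in the reserved table first; anything
            # else falls through to the generic Identifier type.
            tokens.append((RESERVED.get(text, "Identifier"), text))
        elif m.group("number"):
            tokens.append(("Number", m.group("number")))
        else:
            tokens.append(("Operator", m.group("op")))
    return tokens

print(tokenize("if x then y"))
```

Because the lookup happens at the word level, later stages never see if as an Identifier, whatever the surrounding context.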

Keywords, by contrast, syntactically appear in the phrase grammar, as terminal symbols. For example, the production rule for a conditional expression may be IF Expression THEN Expression. In this case IF and THEN are terminal symbols, meaning "a token of type IF or THEN, respectively" – and due to the lexical grammar, this means the string if or then in the original source. As an example of a primitive constant value, true may be a keyword representing the Boolean value "true", in which case it should appear in the grammar as a possible expansion of the production BinaryExpression, for instance.
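A minimal recursive-descent sketch shows how such terminal symbols are consumed at the phrase level. The grammar used here (Expression expands to IF Expression THEN Expression, to TRUE, or to an Identifier), the Parser class, and the tuple-shaped tree are all assumptions made for illustration; the input is a list of already-tokenized (type, text) pairs.

```python
# Tokens are (type, text) pairs, as a lexer might produce them.
Token = tuple[str, str]

class Parser:
    def __init__(self, tokens: list[Token]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        """Type of the next token, or 'EOF' when input is exhausted."""
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def expect(self, token_type: str) -> str:
        """Consume a token of the given type and return its text."""
        if self.peek() != token_type:
            raise SyntaxError(f"expected {token_type}, found {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def expression(self):
        # Expression -> IF Expression THEN Expression | TRUE | Identifier
        if self.peek() == "IF":
            self.expect("IF")                 # terminal symbol: keyword `if`
            condition = self.expression()
            self.expect("THEN")               # terminal symbol: keyword `then`
            return ("if", condition, self.expression())
        if self.peek() == "TRUE":
            self.expect("TRUE")
            return ("const", True)            # keyword used as a literal value
        return ("name", self.expect("Identifier"))

tokens = [("IF", "if"), ("TRUE", "true"), ("THEN", "then"), ("Identifier", "y")]
tree = Parser(tokens).expression()
print(tree)
```

The parser matches on token types only, so it never needs to compare source strings against the words if or then.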

Reserved ranges

Beyond reserving specific lists of words, some languages reserve entire ranges of words, for use as private spaces for future language versions, different dialects, compiler vendor-specific extensions, or for internal use by a compiler, notably in name mangling.

This is most often done by using a prefix, often one or more underscores. C and C++ are notable in this respect: C99 reserves identifiers that start with two underscores or an underscore followed by an uppercase letter, and further reserves identifiers that start with a single underscore (in the ordinary and tag spaces) for use in file scope;[1] C++03 further reserves identifiers that contain a double underscore anywhere[2] – this allows the use of a double underscore as a separator (to connect user identifiers), for instance.
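The prefix rules described above can be expressed as two small predicates. This is a sketch of the always-reserved ranges only; it does not model the file-scope rule for a single leading underscore, and the function names and sample identifiers are assumptions made for illustration.

```python
import re

def is_reserved_c(identifier: str) -> bool:
    """True if the name starts with `__` or with `_` plus an uppercase letter."""
    return re.match(r"_[_A-Z]", identifier) is not None

def is_reserved_cpp(identifier: str) -> bool:
    """C++ rule as described above: the C prefixes, plus `__` anywhere."""
    return is_reserved_c(identifier) or "__" in identifier

for name in ("_Example", "__internal", "my__name", "_local", "counter"):
    print(name, is_reserved_c(name), is_reserved_cpp(name))
```

Only my__name differs between the two checks, since the double underscore in the middle falls inside the C++ range but outside the C one.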

The frequent use of double underscores in internal identifiers in Python gave rise to the abbreviation dunder; this was coined by Mark Jackson[3] and independently by Tim Hochberg,[4] within minutes of each other, both in reply to the same question in 2002.[5][6]

Specification

The lists of reserved words and keywords in a language are defined when the language is developed, and both form part of the language's formal specification. Generally one wishes to minimize the number of reserved words, to avoid restricting valid identifier names. Further, introducing new reserved words breaks existing programs that use that word (it is not backwards compatible), so this is avoided. To prevent this and provide forward compatibility, sometimes words are reserved without having a current use (a reserved word that is not a keyword), as this allows the word to be used in future without breaking existing programs. Alternatively, new language features can be implemented as predefined names, which can be overridden, thus not breaking existing programs.

Reasons for flexibility include allowing compiler vendors to extend the specification by including non-standard features, different standard dialects of language to extend it, or future versions of the language to include additional features. For example, a procedural language may anticipate adding object-oriented capabilities in a future version or some dialect, at which point one might add keywords like class or object. To accommodate this possibility, the current specification may make these reserved words, even if they are not currently used.

A notable example is in Java, where const and goto are reserved words; they have no meaning in Java but they also cannot be used as identifiers. By reserving the terms, they can be implemented in future versions of Java, if desired, without breaking older Java source code. For example, there was a proposal in 1999 to add C++-like const to the language, which was possible using the const word, since it was reserved but currently unused; however, this proposal was rejected – notably because even though adding the feature would not break any existing programs, using it in the standard library (notably in collections) would break compatibility.[7] JavaScript also contains a number of reserved words without special functionality; the exact list varies by version and mode.[8]

Languages differ significantly in how frequently they introduce new reserved words or keywords and how they name them, with some languages being very conservative and introducing new keywords rarely or never, to avoid breaking existing programs, while other languages introduce new keywords more freely, requiring existing programs to change existing identifiers that conflict. A case study is given by new keywords in C11 compared with C++11, both from 2011 – recall that in C and C++, identifiers that begin with an underscore followed by an uppercase letter are reserved:[9]

The C committee prefers not to create new keywords in the user name space, as it is generally expected that each revision of C will avoid breaking older C programs. By comparison, the C++ committee (WG21) prefers to make new keywords as normal-looking as the old keywords. For example, C++11 defines a new thread_local keyword to designate static storage local to one thread. C11 defines the new keyword as _Thread_local. In the new C11 header <threads.h>, there is a macro definition to provide the normal-looking name:[10]

#define thread_local _Thread_local

That is, C11 introduced the keyword _Thread_local within an existing set of reserved words (those with a certain prefix), and then used a separate facility (macro processing) to allow its use as if it were a new keyword without any prefixing, while C++11 introduced the keyword thread_local despite this not being an existing reserved word, breaking any programs that used this, but without requiring macro processing.

Predefined names

A notion related to reserved words is that of predefined functions, methods, subroutines, types, or variables, particularly library routines from the standard library. These are similar in that they are part of the basic language, and may be used for similar purposes. However, these differ in that the name of one of these entities is typically categorized as an identifier instead of a reserved word, and is not treated specially in the syntactic analysis. Further, reserved words may not be redefined by the programmer, but predefined names can often be overridden for the extent of some scope.

Languages vary as to what is provided as a keyword and what is a predefined name. Some languages, for instance, provide keywords for input/output operations whereas in others these are library routines. In Python (versions earlier than 3.0) and many BASIC dialects, print is a keyword. In contrast, the C, Lisp, and Python 3.0 equivalents printf, format, and print are functions in the standard library. Similarly, in Python prior to 3.0, None, True, and False were predefined variables, but not reserved words, but in Python 3.0 they were made into reserved words.[11]
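The difference between a predefined name and a reserved word can be observed directly in Python 3, where print may be rebound within a scope but None may not. In this sketch the functions shout and demo are illustrative stand-ins, not library features.

```python
def shout(*args):
    """Stand-in replacement for the built-in print; returns instead of writing."""
    return " ".join(str(a).upper() for a in args)

def demo():
    print = shout            # legal: `print` is a predefined name, not reserved
    return print("hello")    # calls the local override

result = demo()

# Outside demo(), the built-in print is untouched.
print(result)

# Reserved words cannot be rebound; the parser rejects the attempt outright.
try:
    compile("None = 1", "<example>", "exec")
    none_rebindable = True
except SyntaxError:
    none_rebindable = False
print(none_rebindable)
```

The override lasts only for the extent of demo, whereas the assignment to None never gets past syntactic analysis.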

Definition

Some use the terms "keyword" and "reserved word" interchangeably, while others distinguish usage, say by using "keyword" to mean a word that is special only in certain contexts but "reserved word" to mean a special word that cannot be used as a user-defined name. The meaning of keywords, and the meaning of the notion of keyword, differs widely from language to language. Concretely, in ALGOL 68, keywords are stropped (in the strict language, written in bold) and are not reserved words – the unstropped word can be used as an ordinary identifier.

The "Java Language Specification" uses the term "keyword".[12] The ISO 9899 standard for the C language uses the term "keyword".[13]

In many languages, such as C and similar environments like C++, a keyword is a reserved word which identifies a syntactic form. Words used in control flow constructs, such as if, then, and else, are keywords. In these languages, keywords cannot also be used as the names of variables or functions.

In some languages, such as ALGOL and ALGOL 68, keywords cannot be written verbatim, but must be stropped. This means that keywords must be marked somehow, e.g. by quoting them or by prefixing them with a special character. As a consequence, keywords are not reserved words, and thus the same word can be used as a normal identifier. However, one stropping regime was to not strop the keywords, and instead have them simply be reserved words.

Some languages, such as PostScript, are extremely liberal in this approach, allowing core keywords to be redefined for specific purposes.

In Common Lisp, the term "keyword" (or "keyword symbol") is used for a special sort of symbol, or identifier. Unlike other symbols, which usually stand for variables or functions, keywords are self-quoting and self-evaluating[14]:98 and are interned in the KEYWORD package.[15] Keywords are usually used to label named arguments to functions, and to represent symbolic values.

The symbols which name functions, variables, special forms and macros in the package named COMMON-LISP are basically reserved words. The effect of redefining them is undefined in ANSI Common Lisp.[16] Binding them is possible. For instance, the expression (if if case or) is possible when if is a local variable. The leftmost if refers to the if operator; the remaining symbols are interpreted as variable names. Since there is a separate namespace for functions and variables, if could be a local variable. In Common Lisp, however, there are two special symbols which are not in the keyword package: the symbols t and nil. When evaluated as expressions, they evaluate to themselves. They cannot be used as the names of functions or variables, so are de facto reserved. (let ((t 42))) is a well-formed expression, but the let operator will not permit the usage.

Typically, when a programmer attempts to use a keyword for a variable or function name, a compilation error will be triggered. In most modern editors, the keywords are automatically set to have a particular text colour to remind or inform the programmers that they are keywords.

In languages with macros or lazy evaluation, control flow constructs such as if can be implemented as macros or functions. In languages without these expressive features, they are generally keywords.
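Python evaluates arguments eagerly, but laziness can be emulated by passing zero-argument functions (thunks), which is enough to sketch a conditional written as an ordinary function. The names if_ and safe_div are hypothetical, and the tuple-indexing trick is just one way to avoid using the built-in conditional inside the definition.

```python
def if_(condition, then_branch, else_branch):
    """Conditional as a plain function; both branches are zero-argument callables."""
    # Indexing a pair by the truth value selects a branch without using
    # the built-in `if`; only the selected thunk is then called.
    chosen = (else_branch, then_branch)[bool(condition)]
    return chosen()

def safe_div(a, b):
    # If the branches were evaluated eagerly, a / b would raise when b == 0.
    return if_(b != 0, lambda: a / b, lambda: float("inf"))

print(safe_div(6, 3))
print(safe_div(1, 0))
```

Without the lambdas both branches would be computed before if_ is entered, which is why languages lacking macros or lazy evaluation generally need the construct built in.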

Comparison by languages

Different languages often have widely varying numbers of reserved words. For example, COBOL has about 400. Java, and other C derivatives, have a rather sparse set, about 50. Pure Prolog and PL/I have none.

Disadvantages

Definition of reserved words in a language raises problems. The language may be difficult for new users to learn because of a long list of reserved words to memorize which can't be used as identifiers. It may be difficult to extend the language because addition of reserved words for new features might invalidate existing programs or, conversely, "overloading" of existing reserved words with new meanings can be confusing. Porting programs can be problematic because a word not reserved by one system or compiler might be reserved by another.

Because reserved words cannot be used as identifiers, users may choose deliberate misspellings of reserved words as identifiers instead, such as clazz for Java variables of type Class.[17]

Reserved words and language independence

Microsoft's .NET Common Language Infrastructure (CLI) specification allows code written in 40+ different programming languages to be combined into a final product. Because of this, identifier/reserved word collisions can occur when code implemented in one language tries to execute code written in another language. For example, a Visual Basic (.NET) library may contain a class definition such as:

' Class Definition of This in Visual Basic.NET:

Public Class this
    ' This class does something...
End Class

If this is compiled and distributed as part of a toolbox, a C# programmer wishing to define a variable of type "this" would encounter a problem: 'this' is a reserved word in C#. Thus, the following will not compile in C#:

// Using This Class in C#:

this x = new this(); // Won't compile!

A similar issue arises when accessing members, overriding virtual methods, and identifying namespaces.

This is resolved by stropping. To work around this issue, the specification allows placing (in C#) the at-sign before the identifier, which forces it to be considered an identifier rather than a reserved word by the compiler:

// Using This Class in C#:

@this x = new @this(); // Will compile!

For consistency, this use is also permitted in non-public settings such as local variables, parameter names, and private members.
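A tool that generates C# source from names defined in another language could apply the same at-sign stropping automatically. The following is a sketch under stated assumptions: the reserved-word set is a small illustrative subset rather than the full C# list, and the functions escape_for_csharp and declare are hypothetical helpers, not part of any real toolchain.

```python
# Illustrative subset only; the full C# reserved-word list is much longer.
CSHARP_RESERVED_SUBSET = {"this", "class", "new", "namespace", "virtual"}

def escape_for_csharp(name: str) -> str:
    """Prefix `name` with @ if it collides with a listed C# reserved word."""
    return "@" + name if name in CSHARP_RESERVED_SUBSET else name

def declare(type_name: str, variable: str) -> str:
    """Emit a C# declaration that instantiates `type_name`."""
    t = escape_for_csharp(type_name)
    v = escape_for_csharp(variable)
    return f"{t} {v} = new {t}();"

print(declare("this", "x"))
print(declare("Widget", "class"))
```

Names that do not collide are emitted unchanged, so the stropping appears only where a reserved word would otherwise be produced.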

References

  1. C99 specification, 7.1.3 Reserved identifiers.
  2. C++03 specification, 17.4.3.2.1 Global names [lib.global.names].
  3. Jackson, Mark (September 26, 2002). "How do you pronounce "__" (double underscore)?". python-list (Mailing list). Retrieved November 9, 2014.
  4. Hochberg, Tim (September 26, 2002). "How do you pronounce "__" (double underscore)?". python-list (Mailing list). Retrieved November 9, 2014.
  5. "DunderAlias - Python Wiki". wiki.python.org.
  6. Notz, Pat (September 26, 2002). "How do you pronounce "__" (double underscore)?". python-list (Mailing list). Retrieved November 9, 2014.
  7. "Bug ID: JDK-4211070 Java should support const parameters (like C++) for code maintainence [sic]". Bugs.sun.com. Retrieved 2014-11-04.
  8. "Lexical grammar - JavaScript | MDN". developer.mozilla.org. 8 November 2023.
  9. C99 specification, 7.1.3 Reserved identifiers: "All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use."
  10. C11: The New C Standard, Thomas Plum, "A Note on Keywords".
  11. "The story of None, True and False (and an explanation of literals, keywords and builtins thrown in)", The History of Python, November 10, 2013, Guido van Rossum.
  12. "The Java Language Specification, 3rd Edition, Section 3.9: Keywords". Sun Microsystems. 2000. Retrieved 2009-06-17. The following character sequences, formed from ASCII letters, are reserved for use as keywords and cannot be used as identifiers[...]
  13. "ISO/IEC 9899:TC3, Section 6.4.1: Keywords" (PDF). International Organization for Standardization JTC1/SC22/WG14. 2007-09-07. The above tokens (case sensitive) are reserved (in translation phases 7 and 8) for use as keywords, and shall not be used otherwise.
  14. Peter Norvig: Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp, Morgan Kaufmann, 1991, ISBN 1-55860-191-0, Web.
  15. Type KEYWORD from the Common Lisp HyperSpec.
  16. "CLHS: Section 11.1.2.1.2". www.lispworks.com.
  17. Zammetti, Frank (2007). Practical JavaScript, DOM Scripting and Ajax Projects. Apress. ISBN 9781430201977.