Document type declaration

From Wikipedia, the free encyclopedia

A document type declaration, or DOCTYPE, is an instruction that associates a particular XML or SGML document (for example, a web page) with a document type definition (DTD) (for example, the formal definition of a particular version of HTML 2.0 - 4.0).[1] In the serialized form of the document, it manifests as a short string of markup that conforms to a particular syntax.

The HTML layout engines in modern web browsers perform DOCTYPE "sniffing" or "switching", wherein the DOCTYPE in a document served as text/html determines a layout mode, such as "quirks mode" or "standards mode". The text/html serialization of HTML5, which is not SGML-based, uses the DOCTYPE only for mode selection. Since web browsers are implemented with special-purpose HTML parsers, rather than general-purpose DTD-based parsers, they do not use DTDs and never access them even if a URL is provided. The DOCTYPE is retained in HTML5 as a "mostly useless, but required" header only to trigger "standards mode" in common browsers.[2]

Syntax

The general syntax for a document type declaration is:

<!DOCTYPE root-element PUBLIC "/quotedFPI/" "/quotedURI/" [
<!-- internal subset declarations -->
]>

or

<!DOCTYPE root-element SYSTEM "/quotedURI/" [
<!-- internal subset declarations -->
]>
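
As a rough illustration of how the parts of this syntax map onto a parsed document, the following Python sketch reads the document type name, public identifier and system identifier back out using the standard library's xml.dom.minidom. The helper name doctype_parts and the sample strings are assumptions made for this example, and the sketch only covers XML input, not SGML.

```python
from xml.dom import minidom

# Sample XML document with a PUBLIC external identifier and no internal subset.
SAMPLE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml"></html>'
)


def doctype_parts(xml_text):
    """Return (name, public_id, system_id) for the DOCTYPE, or None if absent.

    doctype_parts is a hypothetical helper written for this sketch.
    """
    doctype = minidom.parseString(xml_text).doctype
    if doctype is None:
        return None
    # publicId is None when the declaration uses the SYSTEM keyword.
    return (doctype.name, doctype.publicId, doctype.systemId)


if __name__ == "__main__":
    print(doctype_parts(SAMPLE))
    print(doctype_parts('<!DOCTYPE note SYSTEM "note.dtd"><note/>'))
```

The second call uses a made-up SYSTEM declaration (root element note, file note.dtd) to show the form without a public identifier.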

Document type name

The opening <!DOCTYPE syntax is followed by separating syntax[3]: 403–404 (such as spaces,[3]: 297–298, 372 or (except in XML) comments opened and closed by a doubled ASCII hyphen),[3]: 372, 391 followed by a document type name[3]: 403–404 (i.e. the name of the root element that the DTD applies to trees descending from). In XML, the root element that represents the document is the first element in the document. For example, in XHTML, the root element is <html>, being the first element opened (after the doctype declaration) and last closed.

Since the syntax for the external identifier and internal subset are both optional,[3]: 403–404 the document type name is the only information which it is mandatory to give in a DOCTYPE declaration.

External identifier

The DOCTYPE declaration can optionally contain an external identifier, following the root element name (and separating syntax such as spaces), but before any internal subset.[3]: 403–404 This begins with either the keyword SYSTEM or the keyword PUBLIC,[3]: 379 specifying whether the DTD is specified using a public identifier identifying it as a public text, i.e. one shared between multiple computer systems (regardless of whether it is an available public text available to the general public, or an unavailable public text shared only within an organisation).[3]: 180–182 If the PUBLIC keyword is used, it is followed by the public identifier enclosed in double or single ASCII quotation marks. The public identifier does not point to a storage location, but is rather a unique fixed string intended to be looked up in a table (such as an SGML catalog);[3]: 180 however, in some (but not all) SGML profiles, the public identifier must be constructed using a particular syntax called Formal Public Identifier (FPI), which specifies the owner as well as whether it is available to the general public.[3]: 182–183

The public identifier (if present) or SYSTEM keyword (otherwise) may (and, in XML, must)[4] be followed by a "system identifier" that is likewise enclosed in quotation marks. Although the interpretation of system identifiers in general SGML is entirely system-dependent (and might be a filename, database key, offset, or something else),[3]: 378 XML requires that they be URIs.[5] For example, the FPI for XHTML 1.1 is "-//W3C//DTD XHTML 1.1//EN", and there are three possible system identifiers available for XHTML 1.1, depending on the needs. One of them is the URL reference "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd". This means that the XML parser must locate the DTD in a system-specific fashion, in this case by means of a URL reference to the DTD, enclosed in double quotation marks.

In XHTML documents, the doctype declaration must always explicitly specify a system identifier. In SGML-based documents like HTML, on the other hand, the appropriate system identifier may automatically be inferred from the given public identifier. This association might, for example, be performed by means of a catalog file resolving the FPI to a system identifier.[6] The SYSTEM keyword can (except in XML) also be used without a system identifier following, indicating that a DTD exists but should be inferred from the document type name.[3]: 378
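
The catalog lookup described above can be sketched in Python as a simple table lookup. This is only an illustration under stated assumptions: a plain dictionary stands in for a real catalog file, the names CATALOG and resolve_dtd are invented for the example, and preferring the catalog entry over the declared system identifier is a choice made here rather than a rule taken from any catalog specification.

```python
# Hypothetical in-memory catalog: public identifier -> system identifier.
# A real SGML or XML catalog is a file with its own syntax; a dict stands in for it.
CATALOG = {
    "-//W3C//DTD XHTML 1.0 Strict//EN":
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    "-//W3C//DTD HTML 4.01//EN":
        "http://www.w3.org/TR/html4/strict.dtd",
}


def resolve_dtd(public_id=None, system_id=None, catalog=CATALOG):
    """Choose a system identifier for a DOCTYPE's external identifier.

    Assumption of this sketch: a catalog entry for the public identifier
    wins; otherwise the declared system identifier (possibly None) is used.
    """
    if public_id is not None and public_id in catalog:
        return catalog[public_id]
    return system_id


if __name__ == "__main__":
    # An HTML 4.01 DOCTYPE that gives only the public identifier.
    print(resolve_dtd(public_id="-//W3C//DTD HTML 4.01//EN"))
```

An identifier missing from the table falls back to whatever system identifier the declaration carried, which mirrors the case where no catalog entry exists.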

Internal subset

The last, optional, part of a DOCTYPE declaration is surrounded by literal square brackets ([]) and is called an internal subset. It can be used to add or override entity declarations, or to add or override element and attribute declarations.[7] It is possible, but uncommon, to include the entire DTD in-line in the document, within the internal subset, rather than referencing it from an external file.[3]: 402 Conversely, the internal subset is sometimes forbidden within simple SGML profiles, notably those for basic HTML parsers that don't implement a full SGML parser.

If both an internal DTD subset and an external identifier are included in a DOCTYPE declaration, the internal subset is processed first, and the external DTD subset is treated as if it were transcluded at the end of the internal subset. Since earlier definitions take precedence over later definitions in a DTD, this allows the internal subset to override definitions in the external subset.[3]: 402–403 
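
The "earlier definition wins" rule can be observed with a small Python sketch using the standard library's xml.etree.ElementTree. The document, the note element and the author entity are made up for the example. Both declarations sit in the internal subset here, because the standard-library parser is not expected to fetch an external subset; the sketch therefore shows only the precedence rule, not the interaction with an external DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the same entity is declared twice in the internal subset.
DOC = """<!DOCTYPE note [
  <!ENTITY author "first declaration">
  <!ENTITY author "second declaration">
]>
<note>&author;</note>"""


def expanded_text(xml_text):
    """Parse the document and return the root element's text, entities expanded."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    # The first binding of the entity is the one that takes effect.
    print(expanded_text(DOC))
```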

Example

The first line of a World Wide Web page may read as follows:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html lang="ar" dir="ltr" xmlns="http://www.w3.org/1999/xhtml">

This document type declaration for XHTML includes by reference a DTD, whose public identifier is -//W3C//DTD XHTML 1.0 Transitional//EN and whose system identifier is http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd. An entity resolver may use either identifier for locating the referenced external entity. No internal subset has been indicated in this example or the next ones. The root element is declared to be html and, therefore, it is the first tag to be opened after the end of the doctype declaration in this example and the next ones, too. The HTML tag is not part of the doctype declaration but has been included in the examples for orientation purposes.

Common DTDs

Some common DTDs have been put into lists. The W3C has produced a list of DTDs commonly used on the web, which contains the "bare" HTML5 DTD, older XHTML/HTML DTDs, DTDs of common embedded XML-based formats like MathML and SVG, as well as "compound" documents that combine those formats.[8] Both W3C HTML5 and its corresponding WHATWG version recommend that browsers accept only XHTML DTDs with certain FPIs and prefer internal logic over fetching external DTD files. The specification further defines an "internal DTD" for XHTML, which is merely a list of HTML entity names.[9]: §13.2

HTML 4.01 DTDs

The Strict DTD does not allow presentational markup, on the argument that Cascading Style Sheets should be used for that instead. A DOCTYPE referencing the Strict DTD looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd" >
<html>

The Transitional DTD allows some older elements and attributes that have been deprecated:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd" >
<html>

If frames are used, the Frameset DTD must be used instead, like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd" >
<html>

XHTML 1.0 DTDs

XHTML's DTDs are also Strict, Transitional and Frameset.

The XHTML Strict DTD supports no deprecated tags, and the code must be well-formed according to the XML specification.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

XHTML Transitional DTD is like the XHTML Strict DTD, but deprecated tags are allowed.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

The XHTML Frameset DTD is the only XHTML DTD that supports framesets. A DOCTYPE referencing it is shown below.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

XHTML 1.1 DTD

XHTML 1.1 is the most current finalized revision of XHTML, introducing support for XHTML Modularization. XHTML 1.1 has the stringency of XHTML 1.0 Strict.

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" >

XHTML Basic DTDs

XHTML Basic 1.0

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML Basic 1.0//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd" >

XHTML Basic 1.1

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML Basic 1.1//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd" >

HTML5 DTD-less DOCTYPE

HTML5 uses a DOCTYPE declaration which is very short, due to its lack of references to a DTD in the form of a URL or FPI. All it contains is the tag name of the root element of the document, HTML.[10] In the words of the specification draft itself:

<!DOCTYPE html>, case-insensitively.

With the exception of the lack of a URI or the FPI string (the FPI string is treated case-sensitively by validators), this format (a case-insensitive match of the string !DOCTYPE HTML) is the same as found in the syntax of the SGML-based HTML 4.01 DOCTYPE. Both in HTML4 and in HTML5, the formal syntax is defined in upper case letters, even if both lower case and mixes of lower and upper case are also treated as valid.

In XHTML5 the DOCTYPE must be a case-sensitive match of the string "<!DOCTYPE html>". This is because in XHTML syntax all HTML element names are required to be in lower case, including the root element referenced inside the HTML5 DOCTYPE.
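
The difference between the two matching rules can be sketched in Python as follows. This is a simplified check, not a conformance test: the function name is_html5_doctype is invented for the example, the whitespace handling in the HTML pattern is an assumption of the sketch, and longer DOCTYPE strings that the specification may also permit are deliberately left out.

```python
import re

# HTML syntax: case-insensitive match; the whitespace tolerance is an
# assumption of this sketch rather than a quotation from the specification.
HTML_SYNTAX = re.compile(r"<!DOCTYPE\s+html\s*>", re.IGNORECASE)

# XHTML syntax: the exact, case-sensitive string given in the text above.
XHTML_SYNTAX = "<!DOCTYPE html>"


def is_html5_doctype(text, xml_syntax=False):
    """Return True if text is the short HTML5 DOCTYPE under the chosen syntax.

    is_html5_doctype is a hypothetical helper written for this sketch.
    """
    candidate = text.strip()
    if xml_syntax:
        return candidate == XHTML_SYNTAX
    return HTML_SYNTAX.fullmatch(candidate) is not None


if __name__ == "__main__":
    for sample in ("<!DOCTYPE html>", "<!doctype HTML>"):
        print(sample, is_html5_doctype(sample), is_html5_doctype(sample, xml_syntax=True))
```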

The DOCTYPE is optional in XHTML5 and may simply be omitted.[11] However, if the markup is to be processed as both XML and HTML, a DOCTYPE should be used.[12]

References

  1. ^ HTML2 HTML3 HTML4
  2. ^ "The HTML syntax ― HTML5". Retrieved 2011-06-05.
  3. ^ a b c d e f g h i j k l m n Goldfarb, Charles F. (1990). The SGML Handbook. Oxford: Clarendon Press. ISBN 0-19-853737-9.
  4. ^ Walsh, Norman (2001-08-06). "XML Catalogs". The Organization for the Advancement of Structured Information Standards (OASIS).
  5. ^ Clark, James (1997-12-15). "Comparison of SGML and XML". W3C. NOTE-sgml-xml-971215.
  6. ^ "The DOCTYPE Declaration". Archived from the original on 2011-08-14. Retrieved 2011-09-09.
  7. ^ "DOCTYPE Declaration". msdn.microsoft.com.
  8. ^ "W3C QA - Recommended list of Doctype declarations you can use in your Web document". www.w3.org. Retrieved 22 March 2019.
  9. ^ "HTML Standard". html.spec.whatwg.org. Retrieved 22 March 2019.
  10. ^ "The HTML syntax ― HTML5". Web Hypertext Application Technology Working Group. Retrieved 2011-06-05. 3. A string that is an ASCII case-insensitive match for the string "DOCTYPE". 5. A string that is an ASCII case-insensitive match for the string "HTML".
  11. ^ "The XHTML syntax ― HTML5". Web Hypertext Application Technology Working Group. Archived from the original on 2012-06-18. Retrieved 2009-09-01.
  12. ^ "Polyglot Markup: HTML-Compatible XHTML Documents". World Wide Web Consortium. Retrieved 2012-01-17.