DOCTYPE tag

Internal DTD

If you do not have a specification of what elements are allowed in your document, any tags can be used. The only requirement is that they be arranged into a well-formed XML document, following all the XML rules for names, closing tags, and so on.

The sample shows an XML document, without a Document Type Definition.
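
A minimal sketch of such a document is shown here. The element names and content are made up for illustration and are not taken from the course sample:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- No DOCTYPE: any element names may be used, as long as the document is well formed -->
<name-list>
  <name>Ann</name>
  <name>Bob</name>
</name-list>
```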

Usually you want to specify what elements and attributes are allowed. For example, you may require the root element to be:   name

You can specify what elements and attributes are valid in your document by writing a Document Type Definition (DTD).

You can put the DTD within your XML document. Then it works for this one document, but cannot be shared. Let's look at how a DTD can be written within your document.
The first line could be:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
          

Notice that we specified standalone="yes" because the DTD is within your document, so no external file is used.

The second line in your source document specifies the DOCTYPE. For a DTD inside your document it would be something like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE   root-element-name   [
      put the DTD here within your XML document
      ]>

Note that:

  • there is an exclamation point ! before the word DOCTYPE
  • DOCTYPE tag does NOT have a closing tag
  • DOCTYPE tag does NOT have a slash / at the end
  • DOCTYPE is in upper case letters
  • the root element is specified immediately after the word DOCTYPE
  • the DTD itself will be coded inside the [square brackets]

This tag looks different. It does not look like HTML, and it does not look like the rest of your XML. Its syntax is inherited from SGML, an older and more complex language than XML that is used to specify markup languages. We will use only a limited part of that syntax to write our DTD declarations.

To specify only a root element called name:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE   name   [
      ]>

Look at the page source code for the example on the left. It has the root element, and nothing else. There is an ELEMENT tag to specify the name element. We will look at ELEMENT tags later.
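
Putting the pieces together, the complete document can be sketched as follows. This assumes, as in the course example, that the name element is declared EMPTY. The comment inside the brackets is added only for illustration:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE name [
  <!-- the name element is declared to have no content -->
  <!ELEMENT name EMPTY>
]>
<name/>
```

The root element after the DOCTYPE must have the same name that follows the word DOCTYPE, here name.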

SYSTEM Identifiers

Often you may wish to put your DTD in a separate file. Then you can share the same DTD with many similar documents. The definition file is found using a SYSTEM Identifier.

The contents of the separate file would be the same as what you previously had inside the [square brackets] within your DOCTYPE tag. Let's look at how a DTD can be referenced from your document.
The first line could be:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
          

Notice that we changed to   standalone="no"   because the DTD is not within your document. Your document does not stand alone, but uses an external file.

The second line in your source document specifies where to find the DTD. For a DTD elsewhere in your system, it would be something like:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE   root-element-name   SYSTEM   "your-DTD-file-name">

To specify a DTD file called name_list.dtd, in the same directory as your name list document:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE   name-list   SYSTEM   "name_list.dtd">

Look at the source of the example at the left. It contains:
<!DOCTYPE name SYSTEM "one-element-with-DTD-file.dtd" >

The file one-element-with-DTD-file.dtd contains:
<!ELEMENT name EMPTY >
which is the same thing we had in the [square brackets] in the previous example.

You might want to put all your DTD files in another directory, so instead of just specifying the file name, you would give the path also. The following is an example:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE   name-list   SYSTEM   "dtd_directory/name_list.dtd">

You could even specify the URL on another server on the Internet as the location of your file. You might do this to share it with a few other people you are working with.
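
As a rough sketch, a SYSTEM identifier that points at another server might look like this. The host example.com and the path are placeholders chosen for illustration, not a real DTD location:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- example.com/dtds/ is a hypothetical location for the shared DTD file -->
<!DOCTYPE name-list SYSTEM "https://example.com/dtds/name_list.dtd">
<name-list>
  <!-- what may appear here depends on the declarations in name_list.dtd -->
</name-list>
```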

PUBLIC Identifiers

Many definitions are PUBLIC. A PUBLIC definition is registered and available to all users. This is a common and valuable use of definitions. For example, it allows a single definition to be used by all pharmacy businesses to exchange information about wholesale pharmacy orders. XBRL is a definition used in accounting for business reports.

You are very likely to be using PUBLIC definitions. One commonly used definition is for XHTML. Often these PUBLIC definitions use a more complex definition language than DTD. We will see a couple more definition languages later in the course.

The first line in your file would be the same as with SYSTEM Identifiers:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
          

The second line in your source document gives the PUBLIC identifier and where to find the definition. It would be something like:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE   root-element-name   PUBLIC   "identifier"   "location and file name">

Look at the first two lines of the source code for this web page that you are reading. They are:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
          

Notice I did not specify standalone="no". That is acceptable: when the standalone declaration is omitted and the document refers to an external DTD, the value "no" is assumed.
The root element name is:   html
The identifier is:   -//W3C//DTD XHTML 1.0 Transitional//EN
The location and file name is:   http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
Note that the file extension is .dtd.   It is a DTD.
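
To show where those two lines sit in a whole page, here is a minimal sketch of an XHTML 1.0 Transitional document. The title and paragraph text are made up for illustration and are not the source of this page:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- the root element is html, matching the name after DOCTYPE -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example page</title>
  </head>
  <body>
    <p>Hello</p>
  </body>
</html>
```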