<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><link
rel="SHORTCUT ICON" href="/favicon.ico" /><style type="text/css">
TD {font-family: Verdana,Arial,Helvetica} BODY {font-family:
Verdana,Arial,Helvetica; margin-top: 2em; margin-left: 0em; margin-right:
0em} H1 {font-family: Verdana,Arial,Helvetica} H2 {font-family:
Verdana,Arial,Helvetica} H3 {font-family: Verdana,Arial,Helvetica} A:link,
A:visited, A:active { text-decoration: underline }
</style><title>Validation &amp;
DTDs</title></head><body bgcolor="#8b7765" text="#000000"
link="#a06060" vlink="#000000"><table border="0" width="100%"
cellpadding="5" cellspacing="0" align="center"><tr><td
width="120"><a href="http://swpat.ffii.org/"><img
src="epatents.png" alt="Action against software patents"
/></a></td><td width="180"><a
href="http://www.gnome.org/"><img
src="gnome2.png" alt="Gnome2 Logo" /></a><a
href="http://www.w3.org/Status"><img
src="w3c.png" alt="W3C Logo" /></a><a
href="http://www.redhat.com/"><img
src="redhat.gif" alt="Red Hat Logo" /></a><div
align="left"><a
href="http://xmlsoft.org/"><img
src="Libxml2-Logo-180x168.gif" alt="Made with Libxml2 Logo"
/></a></div></td><td><table border="0"
width="90%" cellpadding="2" cellspacing="0" align="center"
bgcolor="#000000"><tr><td><table width="100%" border="0"
cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td
align="center"><h1>The XML C parser and toolkit of
Gnome</h1><h2>Validation &amp;
DTDs</h2></td></tr></table></td></tr></table></td></tr></table><table
border="0" cellpadding="4" cellspacing="0" width="100%"
align="center"><tr><td bgcolor="#8b7765"><table
border="0" cellspacing="0" cellpadding="2" width="100%"><tr><td
valign="top" width="200" bgcolor="#8b7765"><table border="0"
cellspacing="0" cellpadding="1" width="100%"
bgcolor="#000000"><tr><td><table width="100%" border="0"
cellspacing="1" cellpadding="3"><tr><td colspan="1"
bgcolor="#eecfa1" align="center"><center>Main
Menu</center></td></tr><tr><td
bgcolor="#fffacd"><form action="search.php"
enctype="application/x-www-form-urlencoded" method="get"><input
name="query" type="text" size="20" value="" /><input name="submit"
type="submit" value="Search …" /></form><ul><li><a
href="index.html">Home</a></li><li><a
href="html/index.html">Reference
Manual</a></li><li><a
href="intro.html">Introduction</a></li><li><a
href="FAQ.html">FAQ</a></li><li><a href="docs.html"
style="font-weight:bold">Developer
Menu</a></li><li><a href="bugs.html">Reporting bugs
and getting help</a></li><li><a
href="help.html">How to help</a></li><li><a
href="downloads.html">Downloads</a></li><li><a
href="news.html">Releases</a></li><li><a
href="XMLinfo.html">XML</a></li><li><a
href="XSLT.html">XSLT</a></li><li><a
href="xmldtd.html">Validation &amp;
DTDs</a></li><li><a href="encoding.html">Encodings
support</a></li><li><a href="catalog.html">Catalog
support</a></li><li><a
href="namespaces.html">Namespaces</a></li><li><a
href="contribs.html">Contributions</a></li><li><a
href="examples/index.html" style="font-weight:bold">Code
Examples</a></li><li><a href="html/index.html"
style="font-weight:bold">API Menu</a></li><li><a
href="guidelines.html">XML
Guidelines</a></li><li><a
href="ChangeLog.html">Recent
Changes</a></li></ul></td></tr></table><table
width="100%" border="0" cellspacing="1" cellpadding="3"><tr><td
colspan="1" bgcolor="#eecfa1"
align="center"><center>Related
links</center></td></tr><tr><td
bgcolor="#fffacd"><ul><li><a href="http://mail.gnome.org/archives/xml/">Mail
archive</a></li><li><a href="http://xmlsoft.org/XSLT/">XSLT
libxslt</a></li><li><a href="http://phd.cs.unibo.it/gdome2/">DOM
gdome2</a></li><li><a href="http://www.aleksey.com/xmlsec/">XML-DSig
xmlsec</a></li><li><a href=“FTP
<ol> <li><a href="#General5">General overview</a></li> <li><a href="#definition1">The definition</a></li> <li><a href="#Simple1">Simple rules</a> <ol> <li><a href="#reference1">How to reference a DTD from a document</a></li> <li><a href="#Declaring2">Declaring elements</a></li> <li><a href="#Declaring1">Declaring attributes</a></li> </ol> </li> <li><a href="#Some1">Some examples</a></li> <li><a href="#validate1">How to validate</a></li> <li><a href="#Other1">Other resources</a></li>
</ol><h3><a name="General5" id="General5">General overview</a></h3><p>So what is validation, and what is a DTD?</p>
<p>DTD is the acronym for Document Type Definition: a description of the content of a family of XML files. DTDs are part of the XML 1.0 specification. They let one describe the set of rules governing the structure and content of a document, and verify that a given document instance conforms to those rules.</p>
<p>Validation is the process of checking a document against a DTD (or, more generally, against a set of construction rules).</p>
<p>Validation and DTD building are the two most difficult parts of the XML life cycle. Briefly, a DTD defines all the elements that may appear in your document and the formal shape of your document tree, by defining the allowed content of each element: either text, a regular expression over the allowed list of children, or mixed content (both text and children). The DTD also defines the valid attributes of every element and the types of those attributes.</p>
<h3><a name="definition1" id="definition1">The definition</a></h3><p>The <a href="http://www.w3.org/TR/REC-xml">W3C XML Recommendation</a> (<a href="http://www.xml.com/axml/axml.html">Tim Bray's annotated version of Rev1</a>):</p><ul>
<li><a href="http://www.w3.org/TR/REC-xml#elemdecls">Declaring elements</a></li> <li><a href="http://www.w3.org/TR/REC-xml#attdecls">Declaring attributes</a></li>
</ul><p>(Unfortunately) all of this is inherited from the SGML
world, so the syntax is ancient…</p><h3><a name="Simple1"
id="Simple1">Simple rules</a></h3><p>DTDs can
be written in many ways. The rules for building one differ radically
depending on whether you need something permanent or something that can
evolve over time. Really complex DTDs, like the DocBook ones, are flexible
but considerably harder to design. I will focus only on DTDs for formats
with a fixed, simple structure. What follows is just a set of basic rules;
it is by no means exhaustive, nor sufficient for complex DTD
design.</p><h4><a
name="reference1" id="reference1">How to reference a DTD from a
document</a>:</h4><p>Assuming the top element of the
document is <code>spec</code> and the DTD is placed in the file
<code>mydtd</code> in the subdirectory <code>dtds</code> of the directory
from which the document was
loaded:</p><p><code>&lt;!DOCTYPE spec SYSTEM
"dtds/mydtd"&gt;</code></p><p>Notes:</p><ul>
<li>The system string is actually a URI-Reference (as defined in <a href="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396</a>), so you can use a full URL indicating the location of your DTD on the Web. This is a really good thing to do if you want others to be able to validate your document.</li> <li>It is also possible to associate a <code>PUBLIC</code> identifier (a magic string) with the DTD, so that it is looked up in catalogs on the client side without having to be fetched from the Web.</li> <li>A DTD contains a set of element and attribute declarations, but these do not define what the root of the document should be. The root is stated explicitly to the parser/validator as the first name in the <code>DOCTYPE</code> declaration.</li>
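<li>As an illustration of the first two notes, the same declaration could take either of the following forms. This is only a sketch: the URL <code>http://example.org/dtds/mydtd</code> and the public identifier <code>-//Example//DTD Spec 1.0//EN</code> are hypothetical placeholders, not real resources. <pre>&lt;!-- SYSTEM identifier given as a full URL (hypothetical location) --&gt;
&lt;!DOCTYPE spec SYSTEM "http://example.org/dtds/mydtd"&gt;

&lt;!-- PUBLIC identifier (hypothetical), followed by a SYSTEM identifier --&gt;
&lt;!DOCTYPE spec PUBLIC "-//Example//DTD Spec 1.0//EN"
               "http://example.org/dtds/mydtd"&gt;</pre> <p>In both forms the name right after <code>DOCTYPE</code> is the root element, <code>spec</code>. A document carries only one <code>DOCTYPE</code> declaration, so these are alternatives, not lines to be combined.</p> </li>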
</ul><h4><a name="Declaring2" id="Declaring2">Declaring
elements</a>:</h4><p>The following declares an element
<code>spec</code>:</p><p><code>&lt;!ELEMENT spec (front,
body, back?)&gt;</code></p><p>It also states that the
<code>spec</code> element contains one <code>front</code>, one
<code>body</code> and one optional <code>back</code> child element, in
that order. An element and its content model are always declared together,
in a single declaration. Similarly, the following declares
<code>div1</code> elements:</p><p><code>&lt;!ELEMENT div1 (head, (p | list |
note)*, div2?)&gt;</code></p><p>which means that <code>div1</code> contains
one <code>head</code>, then any number (possibly zero) of <code>p</code>,
<code>list</code> and <code>note</code> elements in any order, and then an
optional <code>div2</code>. Last but not least, an element can contain
text:</p><p><code>&lt;!ELEMENT b
(#PCDATA)&gt;</code></p><p><code>b</code> contains only text. An element
can also have mixed content (text and elements in no particular
order):</p><p><code>&lt;!ELEMENT p
(#PCDATA|a|ul|b|i|em)*&gt;</code></p><p><code>p</code> can
contain text or <code>a</code>, <code>ul</code>, <code>b</code>,
<code>i</code> or <code>em</code> elements in no particular
order.</p><h4><a name="Declaring1"
id="Declaring1">Declaring attributes</a>:</h4><p>Again,
an attribute declaration includes the definition of the attribute's
content:</p><p><code>&lt;!ATTLIST termdef name CDATA
#IMPLIED&gt;</code></p><p>means that the element
<code>termdef</code> can have a <code>name</code> attribute containing text
(<code>CDATA</code>) and that this attribute is optional
(<code>#IMPLIED</code>). The
attribute value can also be restricted to a
set:</p><p><code>&lt;!ATTLIST list type
(bullets|ordered|glossary)
"ordered"&gt;</code></p><p>means that the
<code>list</code> element has a <code>type</code> attribute with three allowed
values, "bullets", "ordered" or "glossary", and that it defaults to "ordered" if
the attribute is not explicitly specified.</p><p>The content
type of an attribute can be text (<code>CDATA</code>),
anchor/reference/references
(<code>ID</code>/<code>IDREF</code>/<code>IDREFS</code>), entity(ies)
(<code>ENTITY</code>/<code>ENTITIES</code>) or name(s)
(<code>NMTOKEN</code>/<code>NMTOKENS</code>). The following declares that a
<code>chapter</code> element can have an optional <code>id</code> attribute
of type <code>ID</code>, which attributes of type
<code>IDREF</code> can then reference:</p><p><code>&lt;!ATTLIST chapter id ID
#IMPLIED&gt;</code></p><p>The last part of an attribute
definition can be <code>#REQUIRED</code>, meaning that the attribute has to
be given, <code>#IMPLIED</code>, meaning that it is optional, or the default
value (possibly prefixed by <code>#FIXED</code> if it is the only value
allowed).</p><p>Notes:</p><ul>
<li>Usually the attributes pertaining to a given element are declared in a single expression, but it is just a convention adopted by a lot of DTD writers: <pre>&lt;!ATTLIST termdef
          id      ID      #REQUIRED
          name    CDATA   #IMPLIED&gt;</pre> <p>The previous construct defines both <code>id</code> and <code>name</code> attributes for the element <code>termdef</code>.</p> </li>
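<li>To see how the declarations above fit together, here is a minimal sketch of a complete document carrying its DTD in the internal subset. The content models chosen for <code>front</code>, <code>body</code>, <code>back</code> and <code>item</code> are assumptions made for this illustration only; they do not come from any real DTD. <pre>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE spec [
  &lt;!-- root content: front, then body, then an optional back --&gt;
  &lt;!ELEMENT spec (front, body, back?)&gt;
  &lt;!ELEMENT front (#PCDATA)&gt;
  &lt;!ELEMENT body (termdef | list)*&gt;
  &lt;!ELEMENT back (#PCDATA)&gt;

  &lt;!ELEMENT termdef (#PCDATA)&gt;
  &lt;!ATTLIST termdef
            id      ID      #REQUIRED
            name    CDATA   #IMPLIED&gt;

  &lt;!ELEMENT list (item+)&gt;
  &lt;!ATTLIST list type (bullets|ordered|glossary) "ordered"&gt;
  &lt;!ELEMENT item (#PCDATA)&gt;
]&gt;
&lt;spec&gt;
  &lt;front&gt;A small specification&lt;/front&gt;
  &lt;body&gt;
    &lt;termdef id="t1" name="DTD"&gt;Document Type Definition&lt;/termdef&gt;
    &lt;list type="bullets"&gt;
      &lt;item&gt;first&lt;/item&gt;
      &lt;item&gt;second&lt;/item&gt;
    &lt;/list&gt;
    &lt;!-- no type attribute given: the default "ordered" applies --&gt;
    &lt;list&gt;
      &lt;item&gt;third&lt;/item&gt;
    &lt;/list&gt;
  &lt;/body&gt;
&lt;/spec&gt;</pre> <p>Removing the <code>id</code> attribute from <code>termdef</code>, or placing <code>body</code> before <code>front</code>, should make the document invalid while leaving it well-formed.</p> </li>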
</ul><h3><a name="Some1" id="Some1">Some
examples</a></h3><p>The directory
<code>test/valid/dtds/</code> in the libxml2 distribution contains some
complex DTD examples. The example in the file
<code>test/valid/dia.xml</code> shows an XML file where a simple DTD is
included directly within the document.</p><h3><a
name="validate1" id="validate1">How to
validate</a></h3><p>The simplest way is to use the
<code>xmllint</code> program included with libxml2. The <code>--valid</code>
option turns on validation of the files given as input. For example, the
following validates a copy of the first revision of the XML 1.0
specification:</p><p><code>xmllint --valid --noout
test/valid/REC-xml-19980210.xml</code></p><p>The <code>--noout</code> option is
used to suppress output of the resulting tree.</p><p>The
<code>--dtdvalid dtd</code> option validates the document(s) against a
given DTD.</p><p>Libxml2 exports an API to handle DTDs and
validation; see the <a href="http://xmlsoft.org/html/libxml-valid.html">associated
description</a>.</p><h3><a name="Other1"
id="Other1">Other resources</a></h3><p>DTDs are as old
as SGML, so a number of examples should be available online. I will list just one
for now; other pointers are welcome:</p><ul>
<li><a href="http://www.xml101.com:8081/dtd/">XML-101 DTD</a></li>
</ul><p>I suggest looking at the examples found under <code>test/valid/dtds</code> and at any of the large number of books available on XML. The dia example in <code>test/valid</code> should be both simple and complete enough to allow you to build your own.</p><p><a href="bugs.html">Daniel Veillard</a></p></td></tr></table></td></tr></table></td></tr></table></td></tr></table></td></tr></table></body></html>