We now know how to author an XML document.
Let's learn how to structure an XML document.
A DTD (Document Type Definition) is a mechanism used to describe the structure of a document.
A DTD lays down the rules for what an XML document should look like. Describing a document with XML tags and a DTD is sometimes called document modeling.
Use the <!ELEMENT ...> declaration to declare an element and its content:
<!ELEMENT elementName keyWord>
Example:
<!ELEMENT Name (FirstName, MiddleName, LastName)>
An element of type Name must contain three subelements in the order of FirstName, MiddleName and LastName.
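As a minimal sketch, the document below places this declaration in an internal DTD subset and supplies one instance that satisfies it. The #PCDATA declarations for the three child elements and the sample values are assumptions added so that the example is complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE Name [
  <!ELEMENT Name (FirstName, MiddleName, LastName)>
  <!-- Assumed leaf declarations so the example is self-contained -->
  <!ELEMENT FirstName (#PCDATA)>
  <!ELEMENT MiddleName (#PCDATA)>
  <!ELEMENT LastName (#PCDATA)>
]>
<Name>
  <FirstName>Richard</FirstName>
  <MiddleName>Pong Nam</MiddleName>
  <LastName>Sinn</LastName>
</Name>
```

Swapping two children, or leaving one out, would make the document invalid against this declaration.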
-----
<!ELEMENT Name (FirstName, MiddleName?, LastName)>
The ? makes MiddleName optional: it may appear once or not at all.
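A short sketch of an instance that takes advantage of the optional element. As before, the #PCDATA leaf declarations and the sample values are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Name [
  <!ELEMENT Name (FirstName, MiddleName?, LastName)>
  <!-- Assumed leaf declarations -->
  <!ELEMENT FirstName (#PCDATA)>
  <!ELEMENT MiddleName (#PCDATA)>
  <!ELEMENT LastName (#PCDATA)>
]>
<!-- MiddleName is omitted; the order of the remaining children is still fixed -->
<Name>
  <FirstName>Richard</FirstName>
  <LastName>Sinn</LastName>
</Name>
```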
-----
<!ELEMENT language (English | Chinese)>
An element of type language contains either a single English element or a single Chinese element.
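The choice connector can be sketched as follows. Declaring English and Chinese as EMPTY is an assumption for this example; the original does not say what they contain.

```xml
<?xml version="1.0"?>
<!DOCTYPE language [
  <!ELEMENT language (English | Chinese)>
  <!-- Assumed: both alternatives are empty elements -->
  <!ELEMENT English EMPTY>
  <!ELEMENT Chinese EMPTY>
]>
<!-- Exactly one of the two alternatives may appear -->
<language>
  <Chinese/>
</language>
```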
-----
<!ELEMENT people (male | female)+>
<people>
<male> ... </male>
<male> ... </male>
<female> ... </female>
</people>
-----
<!ELEMENT printer (deskjet | laser)*>
An element of type printer contains zero or more deskjet or laser elements, in any order.
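A possible instance, assuming that deskjet and laser are declared as empty elements (the original leaves their content unspecified):

```xml
<?xml version="1.0"?>
<!DOCTYPE printer [
  <!ELEMENT printer (deskjet | laser)*>
  <!-- Assumed: both kinds are empty elements -->
  <!ELEMENT deskjet EMPTY>
  <!ELEMENT laser EMPTY>
]>
<!-- Any number of children, mixed freely -->
<printer>
  <laser/>
  <deskjet/>
  <laser/>
</printer>
```

Because * also allows zero occurrences, a bare <printer/> would be valid too.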
-----
<!ELEMENT Fax EMPTY>
Fax does not contain anything.
<Fax/>
-----
<!ELEMENT text (#PCDATA | picture)*>
PCDATA - parsed character data. An element of type text may mix character data with picture elements, in any order.
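Mixed content can be sketched like this. Declaring picture as EMPTY and the sample sentence are assumptions for illustration only.

```xml
<?xml version="1.0"?>
<!DOCTYPE text [
  <!ELEMENT text (#PCDATA | picture)*>
  <!-- Assumed: picture is an empty element -->
  <!ELEMENT picture EMPTY>
]>
<text>See the first diagram <picture/> and compare it with the second <picture/> before reading on.</text>
```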
-----
<!element printer (laser | deskjet)>
Error: the keyword has to be upper case (ELEMENT). XML is case sensitive.
-----
<!ELEMENT address (street, aptNum?, city, state, zip, country?)>
aptNum and country are optional elements.
-----
<!ELEMENT name (#PCDATA | firstName | LastName)*>
Components of mixed content must always be separated by |. #PCDATA must come first, and the group must end with *.
-----
Occurrence Indicators
+ * ?
Connectors
, |
PCDATA - parsed character data: text that the parser scans for markup and entity references. It is usually used for leaf elements (elements with no child elements).
EMPTY - indicates that an element has no content at all (for example <Fax/>).
Regular expression: + (One or more of a kind)
Regular expression: * (Zero or more of a kind)
Regular expression: ? (Optional element)
<!ATTLIST Address type (HOUSE|APT) "APT">
An element of type Address has an attribute called type whose value is either HOUSE or APT. APT is the default value.
-----
<!ATTLIST owner type (STUDENT|PROFESSIONAL) "STUDENT">
Element owner has attribute type with default value STUDENT.
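A small sketch of the default in action. The owners wrapper element and the EMPTY content of owner are assumptions made to keep the example short.

```xml
<?xml version="1.0"?>
<!DOCTYPE owners [
  <!-- owners is a hypothetical wrapper element -->
  <!ELEMENT owners (owner)+>
  <!-- Assumed: owner has no content in this sketch -->
  <!ELEMENT owner EMPTY>
  <!ATTLIST owner type (STUDENT|PROFESSIONAL) "STUDENT">
]>
<owners>
  <!-- No type given: a parser that reads the DTD reports type="STUDENT" -->
  <owner/>
  <!-- An explicit value must be one of the enumerated choices -->
  <owner type="PROFESSIONAL"/>
</owners>
```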
-----
<!ATTLIST owner age CDATA #REQUIRED>
-----
<!ATTLIST MiddleName init (A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z) #IMPLIED>
-----
<!ATTLIST workplace location CDATA #FIXED "Cupertino">
CDATA - Any string of characters. Inside the value, < and & must be written as &lt; and &amp;.
ID - an identifier. It is a name that is unique in the document.
IDREF - a reference to an ID value that appears elsewhere in the same document.
IDREFS - a list of IDREF values separated by spaces.
ENTITY - the name of an unparsed external entity declared in the DTD.
ENTITIES - a list of ENTITY names separated by spaces.
NMTOKEN - a name token: a single word without spaces.
NMTOKENS - a list of NMTOKEN values separated by spaces.
#REQUIRED - A value must be supplied.
#IMPLIED - The attribute is optional and has no default. If a value is not supplied, the XML application decides what to do.
#FIXED - The value is fixed. If the document supplies a different value, an error is raised.
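The attribute types and defaults above can be sketched together in one document. The office and employee elements and the id, manager and skills attributes are illustrative names that do not come from the original; only the workplace declaration is taken from it.

```xml
<?xml version="1.0"?>
<!DOCTYPE office [
  <!-- office, employee, id, manager and skills are hypothetical names -->
  <!ELEMENT office (workplace, employee+)>
  <!ELEMENT workplace EMPTY>
  <!ELEMENT employee EMPTY>
  <!ATTLIST workplace location CDATA #FIXED "Cupertino">
  <!ATTLIST employee
    id      ID       #REQUIRED
    manager IDREF    #IMPLIED
    skills  NMTOKENS #IMPLIED>
]>
<office>
  <!-- location may be omitted; if written, it must equal "Cupertino" -->
  <workplace/>
  <!-- id is required and must be unique within the document -->
  <employee id="e1" skills="java xml"/>
  <!-- manager must match an id used somewhere in the document -->
  <employee id="e2" manager="e1"/>
</office>
```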
An entity is used like a C macro:
<!ENTITY rsinn "Richard Sinn">
The entity is called rsinn. When referenced in an XML document, the parser will insert
the replacement text "Richard Sinn". Example:
<author>
&rsinn;
</author>
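Put together as a complete document, the entity example might look like the sketch below. Declaring author as (#PCDATA) is an assumption. Note that an entity reference is written as an ampersand, the entity name, and a closing semicolon.

```xml
<?xml version="1.0"?>
<!DOCTYPE author [
  <!-- Assumed content model for author -->
  <!ELEMENT author (#PCDATA)>
  <!ENTITY rsinn "Richard Sinn">
]>
<!-- After parsing, the content of author is the text "Richard Sinn" -->
<author>&rsinn;</author>
```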
-----
<!ENTITY chapter2 SYSTEM "http://www.openloop.com/xml/chapter2.xml">
<toc>
&chapter2;
</toc>
Internal - the DTD is inserted into the document itself.
<!DOCTYPE address [
<!ELEMENT address (street, aptNum?, city, state, zip, country?)>
<!ATTLIST address primary (yes | no) "yes">
<!ELEMENT street (#PCDATA)>
<!ELEMENT aptNum (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT country (#PCDATA)>
]>
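For illustration, here is the same internal DTD followed by a document instance that is valid against it. The element values are sample data, and the two optional elements (aptNum and country) are left out.

```xml
<?xml version="1.0"?>
<!DOCTYPE address [
  <!ELEMENT address (street, aptNum?, city, state, zip, country?)>
  <!ATTLIST address primary (yes | no) "yes">
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT aptNum (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT zip (#PCDATA)>
  <!ELEMENT country (#PCDATA)>
]>
<!-- primary overrides the default value "yes" -->
<address primary="no">
  <street>555 Bailey Avenue</street>
  <city>San Jose</city>
  <state>CA</state>
  <zip>95141</zip>
</address>
```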
External - the DTD is not stored in the document; the document refers to it instead.
<!DOCTYPE address-format SYSTEM "http://www.openloop.com/dtd/address-format.dtd">
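SYSTEM - system identifier, a Uniform Resource Identifier (URI) pointing to the DTD. URI is a superset of URL.
PUBLIC - public identifier, a location-independent name for the DTD, conventionally formed by the rules of ISO 9070. In XML it is followed by a system identifier that the parser can fall back on.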
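As an example of the PUBLIC form, the sketch below uses the well-known XHTML 1.0 Strict DTD rather than one of the DTDs from this text. The public identifier comes first and the system identifier (a URI) second.

```xml
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example</title>
  </head>
  <body>
    <p>Hello</p>
  </body>
</html>
```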
sinn.xml
<?xml version="1.0"?>
<!DOCTYPE profile SYSTEM "profile.dtd">
<profile>
  <owner type="STUDENT" age="20">
    <Name>
      <FirstName>Richard</FirstName>
      <MiddleName init="P">Pong Nam</MiddleName>
      <LastName>Sinn</LastName>
    </Name>
    <Phone>
      <Home>(000)000-0000</Home>
      <Work>(000)000-0000</Work>
      <Fax/>
      <Pager/>
      <Cell/>
    </Phone>
    <Address type="HOUSE">
      <StreetAddr>555 Bailey Avenue</StreetAddr>
      <City>San Jose</City>
      <State>Ca</State>
      <ZipCode>95141</ZipCode>
    </Address>
    <Email>
      <ul>
        <li>sinn@us.ibm.com</li>
        <li>sinn@mathcs.sjsu.edu</li>
        <li>webmaster@openloop.com</li>
      </ul>
    </Email>
    <Education>
      <Institution>
        <GraduationDate>1998</GraduationDate>
        <schoolName>University of Minnesota-Twin Cities</schoolName>
        <degree type="MS" major="CS" gpa="3.97"/>
      </Institution>
      <Institution>
        <GraduationDate>1994</GraduationDate>
        <schoolName>University of Wisconsin-Madison</schoolName>
        <degree type="BS" major="CS" gpa="3.80"/>
      </Institution>
    </Education>
    <techSkills>
      <Languages>Java</Languages>
      <Languages>C++</Languages>
      <Languages>C</Languages>
      <Languages>JavaScript</Languages>
      <Languages>XML</Languages>
      <Languages>HTML</Languages>
      <Languages>SQL</Languages>
      <System>Windows</System>
    </techSkills>
  </owner>
</profile>
profile.dtd
<!-- ====================================================
  ==
  == Document Type Definition for the Profile Application
  ==
  ==================================================== -->
<!-- A profile document contains one or more owners -->
<!ELEMENT profile (owner)+>
<!-- An owner contains these six sections in this sequence -->
<!ELEMENT owner (Name, Phone, Address, Email, Education, techSkills)>
<!-- ====================================================
  == Every owner is either a STUDENT or PROFESSIONAL.
  == This is indicated by its type attribute.
  == If a value is not supplied for this attribute,
  == it defaults to STUDENT.
  ==================================================== -->
<!ATTLIST owner type (STUDENT|PROFESSIONAL) "STUDENT">
<!-- Every owner must also have an age attribute. -->
<!ATTLIST owner age CDATA #REQUIRED>
<!ELEMENT FirstName ANY>
<!ELEMENT LastName ANY>
<!ELEMENT Name (FirstName, MiddleName, LastName)>
<!ELEMENT MiddleName ANY>
<!ATTLIST MiddleName init (A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z) #IMPLIED>
<!ELEMENT Home ANY>
<!ELEMENT Work ANY>
<!ELEMENT Fax ANY>
<!ELEMENT Pager ANY>
<!ELEMENT Cell ANY>
<!ELEMENT Phone (Home, Work, Fax, Pager, Cell)>
<!ELEMENT StreetAddr ANY>
<!ELEMENT City ANY>
<!ELEMENT State ANY>
<!ELEMENT ZipCode ANY>
<!ELEMENT Address (StreetAddr, City, State, ZipCode)>
<!ATTLIST Address type (HOUSE|APT) "APT">
<!ELEMENT Email (ul)+>
<!ELEMENT li ANY>
<!ELEMENT ul (li)+>
<!ELEMENT Education (Institution)+>
<!ELEMENT GraduationDate ANY>
<!ELEMENT schoolName ANY>
<!ELEMENT degree ANY>
<!ELEMENT Institution (GraduationDate, schoolName, degree)>
<!ATTLIST degree
type (BS|MS|PhD) "BS"
major (CS|Math|Other) "CS"
gpa CDATA #REQUIRED>
<!ELEMENT System ANY>
<!ELEMENT Languages ANY>
<!ELEMENT techSkills (System|Languages)+>
xmlint in action
C:\sinn\book\xml\programs\resume>xmlint sinn.xml
sinn.xml
The element 'FirstName' is used but not declared in the DTD/Schema.
URL: file:///C:/sinn/book/xml/programs/resume/sinn.xml
Line 00008: <FirstName>Richard</FirstName>
Pos 00014: -------------^
C:\sinn\book\xml\programs\resume>xmlint sinn.xml
sinn.xml

Copyright 1996-2001 OpenLoop Computing. All rights reserved.