XML is a semi-structured data model used by many applications
XML
In XML Trees,
Examples:
Object-oriented model of these trees:
The following are the serialized forms of these trees:
<r> <name>Alan</name> <tel>32190</tel> <email>alan@aol.ru</email> </r>
<r> <name> <first>Alan</first> <last>Black</last> </name> <tel>32190</tel> <email>alan@aol.ru</email> </r>
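The link between a serialized form and its tree can be checked by parsing the text back into nodes. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `describe` is hypothetical and only serves the illustration.

```python
import xml.etree.ElementTree as ET

# Serialized form of the second tree above.
serialized = (
    "<r> <name> <first>Alan</first> <last>Black</last> </name> "
    "<tel>32190</tel> <email>alan@aol.ru</email> </r>"
)

def describe(node, depth=0):
    """Return one line per element; indentation shows the depth in the tree."""
    text = (node.text or "").strip()
    lines = ["  " * depth + node.tag + (": " + text if text else "")]
    for child in node:
        lines.extend(describe(child, depth + 1))
    return lines

root = ET.fromstring(serialized)
tree_lines = describe(root)
print("\n".join(tree_lines))
```

Inner nodes (`r`, `name`) carry only whitespace, while the leaves carry the data values.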
So the serialized form is:
<document />
<document> Hello World! </document>
<document> <salutation> Hello World! </salutation> </document>
<?xml version="1.0" encoding="utf-8" ?> <document> <salutation color="blue"> Hello World! </salutation> </document>
Bigger Example:
<solar_system>
  <star>
    <name>Sun</name>
    <spectral_type>G2</spectral_type>
    <age unit="billion years">5</age>
  </star>
  <planet type="telluric">
    <name>Earth</name>
    <distance unit="km">149600000</distance>
    <mass unit="kg">5.98e24</mass>
    <diameter unit="km">12756</diameter>
    <satellite number="1"/>
  </planet>
  <planet ring="yes" type="gaseous">
    <name>Saturn</name>
    <distance unit="AU">5.2</distance>
    <mass unit="Earth mass">95</mass>
    <diameter unit="Earth diameter">9.4</diameter>
    <satellite number="18"/>
  </planet>
  <planet ring="yes" type="gaseous">
    <name>Uranus</name>
    <distance unit="AU">19.2</distance>
    <mass unit="Earth mass">14.5</mass>
    <diameter unit="Earth diameter">4</diameter>
    <satellite number="15"/>
  </planet>
</solar_system>
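An application can query such a document through the tree structure and the attributes. The sketch below uses an abridged copy of the example (only names, types and satellites are kept) and the limited XPath subset that `xml.etree.ElementTree` supports; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the solar system document above.
doc = """
<solar_system>
  <star> <name>Sun</name> </star>
  <planet type="telluric"> <name>Earth</name> <satellite number="1"/> </planet>
  <planet ring="yes" type="gaseous"> <name>Saturn</name> <satellite number="18"/> </planet>
  <planet ring="yes" type="gaseous"> <name>Uranus</name> <satellite number="15"/> </planet>
</solar_system>
"""

root = ET.fromstring(doc)

# Select child elements by tag and attribute value.
gaseous = [p.findtext("name") for p in root.findall("planet[@type='gaseous']")]

# Attribute values are always strings; convert them explicitly.
satellites = {
    p.findtext("name"): int(p.find("satellite").get("number"))
    for p in root.iter("planet")
}

print(gaseous)
print(satellites)
```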
For applications
Parsers
Schemas are used to
There are 3 ways of doing it:
Schemas are built on top of tree automata and regular expression theory
Consider the following XML documents:
<university>
  <teacher subject="math" students="180">M. Durant</teacher>
  <teacher subject="CS" students="130">M. Smith</teacher>
  <teacher subject="CS" students="150">Mme. Martin</teacher>
</university>
<university>
  <teacher>
    <name>M. Durant</name>
    <subject>Math</subject>
    <students>180</students>
  </teacher>
  <teacher>
    <name>M. Smith</name>
    <subject>CS</subject>
    <students>130</students>
  </teacher>
  <teacher>
    <name>Mme. Martin</name>
    <subject>Math</subject>
    <students>150</students>
  </teacher>
</university>
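One way to compare the two representations is to look at how a program reads them. The sketch below extracts the same `(name, subject, students)` rows from each document with `xml.etree.ElementTree`; the function names are hypothetical, and the data is copied from the two documents as given (their subject values differ slightly).

```python
import xml.etree.ElementTree as ET

with_attributes = """<university>
  <teacher subject="math" students="180">M. Durant</teacher>
  <teacher subject="CS" students="130">M. Smith</teacher>
  <teacher subject="CS" students="150">Mme. Martin</teacher>
</university>"""

with_elements = """<university>
  <teacher> <name>M. Durant</name> <subject>Math</subject> <students>180</students> </teacher>
  <teacher> <name>M. Smith</name> <subject>CS</subject> <students>130</students> </teacher>
  <teacher> <name>Mme. Martin</name> <subject>Math</subject> <students>150</students> </teacher>
</university>"""

def rows_from_attributes(xml_text):
    # Data lives in attributes; the teacher name is the element's text.
    return [
        (t.text, t.get("subject"), int(t.get("students")))
        for t in ET.fromstring(xml_text).iter("teacher")
    ]

def rows_from_elements(xml_text):
    # Data lives in child elements, which could later grow sub-structure.
    return [
        (t.findtext("name"), t.findtext("subject"), int(t.findtext("students")))
        for t in ET.fromstring(xml_text).iter("teacher")
    ]

rows_a = rows_from_attributes(with_attributes)
rows_b = rows_from_elements(with_elements)
print(rows_a)
print(rows_b)
```

Both forms are equally easy to read back. The difference is that an attribute can hold only a flat string, whereas a child element can be extended with further structure.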
Which representation is better?
<note date="10/01/2008" />
Better:
<note> <date> <day>10</day> <month>01</month> <year>2008</year> </date> </note>
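The advantage of the structured date shows up as soon as a program reads it. In this sketch, the attribute version forces the reader to assume a date format, and the two plausible assumptions give different dates; the element version needs no such guess. Variable names are illustrative.

```python
from datetime import date, datetime
import xml.etree.ElementTree as ET

# Attribute version: the format of the string is not stated anywhere.
flat = ET.fromstring('<note date="10/01/2008" />')
raw = flat.get("date")
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # assumes DD/MM/YYYY
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # assumes MM/DD/YYYY

# Element version: each component is named, so nothing has to be assumed.
structured = ET.fromstring(
    "<note> <date> <day>10</day> <month>01</month> <year>2008</year> </date> </note>"
)
d = structured.find("date")
explicit = date(int(d.findtext("year")), int(d.findtext("month")), int(d.findtext("day")))

print(as_day_first, as_month_first, explicit)
```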
Suppose we have two tags with the same name, but different meaning
For example, consider a tag <title>
How to avoid naming conflicts?
<h:table>
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>
<t:table>
  <t:name>African Coffee</t:name>
  <t:width>80</t:width>
  <t:length>120</t:length>
</t:table>
<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <t:table xmlns:t="http://www.foo.fr/furniture">
    <t:name>African Coffee</t:name>
    <t:width>80</t:width>
    <t:length>120</t:length>
  </t:table>
</root>
Here the prefix h is declared on the very element that carries the h: prefix; this is allowed.
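A namespace-aware parser resolves each prefix to its URI, so the two `table` elements end up with different names. The sketch below shows this with `xml.etree.ElementTree`, which writes a qualified name as `{uri}local`. The prefixes `html` and `furn` in the lookup dictionary are arbitrary choices made for this example; only the URIs have to match the document.

```python
import xml.etree.ElementTree as ET

doc = """<root>
  <h:table xmlns:h="http://www.w3.org/TR/html4/">
    <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr>
  </h:table>
  <t:table xmlns:t="http://www.foo.fr/furniture">
    <t:name>African Coffee</t:name> <t:width>80</t:width>
  </t:table>
</root>"""

root = ET.fromstring(doc)

# The document's prefixes are gone after parsing; each tag carries its URI.
table_tags = [child.tag for child in root]

# Queries may use any prefixes, as long as they map to the same URIs.
ns = {"html": "http://www.w3.org/TR/html4/", "furn": "http://www.foo.fr/furniture"}
cells = [td.text for td in root.findall("html:table/html:tr/html:td", ns)]
furniture_name = root.findtext("furn:table/furn:name", namespaces=ns)

print(table_tags)
print(cells, furniture_name)
```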
Default namespace
<chapter xmlns="http://www.mydescription.com"> <paragraph> ... </paragraph> </chapter>
<chapter xmlns="http://www.mydescription.com/"> <paragraph xmlns="http://www.foo.fr/"> ... </paragraph> </chapter>
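The scope of a default namespace can be observed by parsing the two documents above. In this sketch (using `xml.etree.ElementTree`, with illustrative variable names), the unprefixed `paragraph` inherits the default namespace of `chapter` in the first document and overrides it in the second.

```python
import xml.etree.ElementTree as ET

inherited = ET.fromstring(
    '<chapter xmlns="http://www.mydescription.com"> <paragraph> ... </paragraph> </chapter>'
)
overridden = ET.fromstring(
    '<chapter xmlns="http://www.mydescription.com/">'
    ' <paragraph xmlns="http://www.foo.fr/"> ... </paragraph> </chapter>'
)

# iter() walks the tree in document order, starting with the root itself.
inherited_tags = [el.tag for el in inherited.iter()]
overridden_tags = [el.tag for el in overridden.iter()]

print(inherited_tags)
print(overridden_tags)
```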
Several namespaces
<root xmlns:h="http://www.w3.org/TR/html4/" xmlns:t="http://www.foo.fr/furniture">
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <t:table>
    <t:name>African Coffee</t:name>
    <t:width>80</t:width>
    <t:length>120</t:length>
  </t:table>
</root>