
XML - 2. Basic XML Concepts - 3. Defining XML Data Formats - 4. Visualization

The document provides an overview of XML including basic concepts such as elements, attributes, and XML documents as ordered trees. It also discusses defining XML data formats using document type definitions (DTDs) and XML schema.

Uploaded by

Nikhil

07/14/16

1. XML
2. Basic XML Concepts
3. Defining XML Data Formats
4. Visualization
XML

XML???

X: Extensible
M: Markup
L: Language

XML is not
A replacement for HTML
(but HTML can be generated from XML)
A presentation format
(but XML can be converted into one)
A programming language
(but it can be used with almost any language)
A network transfer protocol
(but XML may be transferred over a network)
A database
(but XML may be stored into a database)


But then what is it?


XML is a meta markup language for text documents / textual data.
XML allows you to define languages (applications) to represent text documents / textual data.


XML by Example
<article>
<author>Gerhard Weikum</author>
<title>The Web in 10 Years</title>
</article>

Easy to understand for human users


Very expressive (semantics along with the data)
Well structured, easy to read and write from programs
This looks nice, but


XML by Example
this is XML, too:
<t108>
<x87>Gerhard Weikum</x87>
<g10>The Web in 10 Years</g10>
</t108>

Hard to understand for human users


Not expressive (no semantics along with the data)
Well structured, easy to read and write from programs


XML by Example
and what about this XML document:
<data>
ch37fhgks73j5mv9d63h5mgfkds8d984lgnsmcns983
</data>

Impossible to understand for human users


Not expressive (no semantics along with the data)
Unstructured, read and write only with special programs

The actual benefit of using XML highly depends on the design of the application.

Possible Advantages of Using XML

Truly Portable Data


Easily readable by human users
Very expressive (semantics near data)
Very flexible and customizable (no finite tag set)
Easy to use from programs (libs available)
Easy to convert into other representations
(XML transformation languages)
Many additional standards and tools
Widely used and supported

App. Scenario 1: Content Mgt.


[Diagram: a database with XML documents feeds converters (XML2HTML, XML2WML, XML2PDF) that deliver the content to clients.]


App. Scenario 2: Data Exchange


[Diagram: a buyer and a supplier exchange an order as XML (BMECat, ebXML, RosettaNet, BizTalk, ...). On each side an XML adapter connects to a legacy system (e.g., SAP R/2 on one side, Cobol on the other).]


App. Scenario 3: XML for Metadata


<rdf:RDF>
<rdf:Description rdf:about="[Link]">
<dc:title>A Framework for</dc:title>
<dc:creator>Ralf Schenkel</dc:creator>
<dc:description>While there are...</dc:description>
<dc:publisher>Saarland University</dc:publisher>
<dc:subject>XML Indexing</dc:subject>
<dc:rights>Copyright ...</dc:rights>
<dc:type>Electronic Document</dc:type>
<dc:format>text/pdf</dc:format>
<dc:language>en</dc:language>
</rdf:Description>
</rdf:RDF>


App. Scenario 4: Document Markup


<?xml version="1.0" ?>
<!DOCTYPE Book SYSTEM "[Link]">
<Book Author="Anonymous">
<Title>Sample Book</Title>
<Chapter id="1">
This is chapter 1. It is not very long or
interesting.
</Chapter>
<Chapter id="2">
This is chapter 2. Although it is longer than
chapter 1,
it is not any more interesting.
</Chapter>
</Book>


App. Scenario 4: Document Markup


Document markup adds structural and semantic information to documents, e.g.
Sections, Subsections, Theorems, ...
Cross References
Literature Citations
Index Entries
Named Entities

This allows queries like
Which articles cite Weikum's XML paper from 2001?
Which articles talk about (the named entity) Weikum?


XML for Beginners


Part 2 Basic XML Concepts
2.1 XML Standards by the W3C
2.2 XML Documents


2.1 XML Standards: an Overview


XML Core Working Group:
XML 1.0 (Feb 1998), 1.1 (candidate for recommendation)
XML Namespaces (Jan 1999)
XML Inclusion (candidate for recommendation)

XSLT Working Group:


XSL Transformations 1.0 (Nov 1999), 2.0 planned
XPath 1.0 (Nov 1999), 2.0 planned
eXtensible Stylesheet Language XSL(-FO) 1.0 (Oct 2001)

XML Linking Working Group:


XLink 1.0 (Jun 2001)
XPointer 1.0 (March 2003, 3 substandards)

XQuery 1.0 (Nov 2002) plus many substandards


XML Schema 1.0 (May 2001)


2.2 XML Documents


What's in an XML document?
Elements
Attributes
plus some other details, such as the XML declaration:
<?xml version="1.0" encoding="utf-8"?>


A Simple XML Document


<article>
<author>Shivang</author>
<title>XML BASICS</title>
<text>
<abstract>In order to evolve...</abstract>
<section number="1" title="Introduction">
The <index>Web</index> provides the universal...
</section>
</text>
</article>


A Simple XML Document


Freely definable tags:
<article>
<author>Gerhard Weikum</author>
<title>The Web in Ten Years</title>
<text>
<abstract>In order to evolve...</abstract>
<section number="1" title="Introduction">
The <index>Web</index> provides the universal...
</section>
</text>
</article>


A Simple XML Document


<article>
<author>Gerhard Weikum</author>
<title>The Web in Ten Years</title>
<text>
<abstract>In order to evolve...</abstract>
<section number="1" title="Introduction">
The <index>Web</index> provides the universal...
</section>
</text>
</article>

Start tag: e.g. <article>
End tag: e.g. </article>
Element: everything from a start tag to its matching end tag
Content of the element: the subelements and/or text between the two tags

A Simple XML Document


<article>
<author>Shivang Popat</author>
<title>XML BASICS</title>
<text>
<abstract>In order to evolve...</abstract>
<section number="1" title="Introduction">
The <index>Web</index> provides the universal...
</section>
</text>
</article>

Attributes with name and value: number="1" and title="Introduction"


Elements in XML Documents


(Freely definable) tags: article, title, author
with start tag: <article> etc.
and end tag: </article> etc.

Elements: <article> ... </article>


Elements have a name (article) and a content (...)
Elements may be nested.
Elements may be empty: <this_is_empty/>
Element content is typically parsed character data (PCDATA), i.e., strings with special characters, and/or nested elements (mixed content if both).
Each XML document has exactly one root element and forms a tree.
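
The points above can be checked with a small sketch using Python's standard xml.etree.ElementTree module. The sample document reuses the article example from the earlier slides; the extra <this_is_empty/> child is added here only for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text>
    <section number="1" title="Introduction">The <index>Web</index> provides the universal...</section>
  </text>
  <this_is_empty/>
</article>"""

root = ET.fromstring(doc)          # the single root element
print(root.tag)

# Nested elements: the direct children of the root, in document order.
child_names = [child.tag for child in root]
print(child_names)

# Mixed content: text and a nested element side by side inside <section>.
section = root.find("text/section")
index = section.find("index")
print(repr(section.text))          # text before <index>
print(repr(index.text))            # content of <index>
print(repr(index.tail))            # text after </index>

# An empty element has neither text nor children.
empty = root.find("this_is_empty")
print(empty.text, len(empty))
```

ElementTree stores the text that follows a nested element in that element's tail attribute, which is how the order of text and subelements in mixed content is preserved.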


Elements vs. Attributes


Elements may have attributes (in the start tag) that have a name and a value, e.g. <section number="1">.
What is the difference between elements and attributes?
Only one attribute with a given name per element (but an arbitrary number of subelements)
Attributes have no structure, simply strings (while elements can have subelements)
As a rule of thumb:
Content into elements
Metadata into attributes
Example: attributes/useofattributes
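
A minimal sketch of the difference, again with Python's standard xml.etree.ElementTree. The <para> subelements are hypothetical names chosen for this illustration; they do not appear in the slides.

```python
import xml.etree.ElementTree as ET

section = ET.fromstring(
    '<section number="1" title="Introduction">'
    "<para>First</para><para>Second</para>"
    "</section>"
)

# Attributes: one flat string value per name, exposed as a dict.
print(section.attrib)
print(type(section.get("number")))   # a plain string, not a number

# Subelements: any number may share a name, and each could nest further.
paras = [p.text for p in section.findall("para")]
print(paras)

# Two attributes with the same name are rejected by the parser.
try:
    ET.fromstring('<section number="1" number="2"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```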

XML Documents as Ordered Trees


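[Tree diagram of the simple XML document, children listed in document order:]
article
  author
    "Shivang"
  title
    "XML BASICS"
  text
    abstract
      "In order to evolve..."
    section (number=1, title="Introduction")
      "The"
      index
        "Web"
      "provides the universal..."
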

More on XML Syntax


Some special characters must be escaped using entities:
< → &lt;
& → &amp;
(will be converted back when reading the XML doc)
Some other characters may be escaped, too:
> → &gt;
" → &quot;
' → &apos;
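
As a sketch of how a program might apply these rules, Python's standard library offers escape, unescape and quoteattr in xml.sax.saxutils. The sample strings and the <code> and <e> element names are made up for this illustration.

```python
from xml.sax.saxutils import escape, unescape, quoteattr
import xml.etree.ElementTree as ET

raw = "if a < b & b > c"
escaped = escape(raw)              # escapes &, < and >
print(escaped)

# The parser converts the entities back when reading the document.
elem = ET.fromstring("<code>" + escaped + "</code>")
print(elem.text == raw)
print(unescape(escaped) == raw)

# quoteattr() also adds the surrounding quotes and handles quotes inside
# attribute values.
attr = quoteattr('say "hi"')
round_trip = ET.fromstring("<e a=" + attr + "/>").get("a")
print(round_trip)
```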


Well-Formed XML Documents


A well-formed document must adhere to, among others, the following rules:
Every start tag has a matching end tag.
Elements may nest, but must not overlap.
There must be exactly one root element.
Attribute values must be quoted.
An element may not have two attributes with the same name.
Comments and processing instructions may not appear inside tags.
No unescaped < or & signs may occur inside character data.

Well-Formed XML Documents


Only well-formed documents can be processed by XML parsers.
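
One way to see the well-formedness rules in action is to hand small violating documents to a parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample documents are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "ok":                  '<a x="1"><b/></a>',
    "missing end tag":     "<a><b></a>",
    "overlapping":         "<a><b></a></b>",
    "two roots":           "<a/><b/>",
    "unquoted attribute":  "<a x=1/>",
    "duplicate attribute": '<a x="1" x="2"/>',
    "unescaped &":         "<a>R & D</a>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Only the first sample is accepted; every other sample breaks one of the rules listed above and makes the parser raise an error.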

XML for Beginners


Part 3 Defining XML Data Formats
3.1 Document Type Definitions
3.2 XML Schema


3.1 Document Type Definitions


Sometimes XML is too flexible:
Most programs can only process a subset of all possible XML applications.
For exchanging data, the format (i.e., elements, attributes and their semantics) must be fixed.
Document Type Definitions (DTDs) establish the vocabulary for one XML application (in some sense comparable to schemas in databases).
A document is valid with respect to a DTD if it conforms to the rules specified in that DTD.
Most XML parsers can be configured to validate.

Types of DTD
There are two types of DTD:
Internal DTD (Example)
External DTD (Example)


DTD Example
<?xml version="1.0"?>
<page>
<title>Hello friend</title>
<content>Here is some content :)</content>
<comment>Written by Shivang Popat</comment>
</page>


Element Declarations in DTDs


One element declaration for each element type:
<!ELEMENT element_name content_specification>

where content_specification can be
(#PCDATA)      parsed character data
(child)        one child element
(c1,...,cn)    a sequence of child elements c1...cn
(c1|...|cn)    one of the elements c1...cn

For each component c, possible counts can be specified:
c     exactly one such element
c+    one or more
c*    zero or more
c?    zero or one

Plus arbitrary combinations using parentheses:
<!ELEMENT f ((a|b)*,c+,(d|e))*>
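
A content specification over child elements behaves like a regular expression over the sequence of child names. The sketch below makes that concrete in Python; the function names content_model_to_regex and children_match are made up for this illustration, and the translation handles only element-only models (not #PCDATA, EMPTY or ANY). It illustrates the idea and is not how a validating parser is implemented.

```python
import re

def content_model_to_regex(model: str) -> re.Pattern:
    """Translate an element-only DTD content model into a regex.

    Each element name becomes a group matching that name plus one space.
    The sequence operator ',' becomes plain concatenation, while | * + ?
    and the parentheses mean the same in both notations.
    """
    model = re.sub(r"\s+", "", model)
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        model,
    )
    return re.compile(pattern.replace(",", ""))

def children_match(model: str, children: list) -> bool:
    """Check a list of child element names against a content model."""
    encoded = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(encoded) is not None

model_f = "((a|b)*,c+,(d|e))*"     # the declaration of element f above
print(children_match(model_f, []))                    # outer * allows empty
print(children_match(model_f, ["a", "b", "c", "d"]))
print(children_match(model_f, ["c", "d", "c", "e"]))  # two repetitions
print(children_match(model_f, ["a", "d"]))            # c+ is missing
```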

More on Element Declarations


Elements with mixed content:
<!ELEMENT text (#PCDATA|index|cite|glossary)*>

Elements with empty content:


<!ELEMENT image EMPTY>

Elements with arbitrary content (not suitable for production-level DTDs):
<!ELEMENT thesis ANY>


Attribute Declarations in DTDs


Attributes are declared per element:
<!ATTLIST section number CDATA #REQUIRED
title CDATA #REQUIRED>

declares two required attributes for element section.

In this declaration:
section is the element name
number and title are the attribute names
CDATA is the attribute type
#REQUIRED is the attribute default

Example (with attribute)

Attribute Declarations in DTDs


Attributes are declared per element:
<!ATTLIST section number CDATA #REQUIRED
title CDATA #REQUIRED>

declares two required attributes for element section.


Possible attribute defaults:
#REQUIRED         is required in each element instance
#IMPLIED          is optional
#FIXED default    always has this default value
default           has this default value if the attribute is omitted from the element instance
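
The effect of attribute defaults can be observed with Python's standard xml.parsers.expat module, which reads the internal DTD and reports defaulted attributes together with the specified ones. This is a sketch under assumptions: the attributes lang, status and format are invented for this example, and expat is a non-validating parser, so it fills in defaults but would not complain about a missing #REQUIRED attribute.

```python
import xml.parsers.expat

doc = """<?xml version="1.0"?>
<!DOCTYPE article [
  <!ELEMENT article (section*)>
  <!ELEMENT section (#PCDATA)>
  <!ATTLIST section number CDATA #REQUIRED
                    lang   CDATA "en"
                    status CDATA #IMPLIED
                    format CDATA #FIXED "plain">
]>
<article>
  <section number="1">Introduction</section>
  <section number="2" lang="de" status="draft">Einleitung</section>
</article>"""

seen = []
parser = xml.parsers.expat.ParserCreate()
# Collect (element name, attribute dict) for every start tag.
parser.StartElementHandler = lambda name, attrs: seen.append((name, attrs))
parser.Parse(doc, True)

sections = [attrs for name, attrs in seen if name == "section"]
for attrs in sections:
    print(attrs)
```

The first section receives lang="en" and format="plain" from the DTD although neither is written in the document, while the #IMPLIED attribute status is simply absent.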

Attribute Types in DTDs


CDATA          string data
(A1|...|An)    enumeration of all possible values of the attribute (each is an XML name token)
ID             unique XML name to identify the element
IDREF          refers to ID attribute of some other element (intra-document link)
IDREFS         list of IDREF, separated by white space

plus some more
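
The ID and IDREF constraints can be sketched as two simple checks in Python. This assumes, for illustration only, that a DTD declares the attribute id as type ID and the attribute target as type IDREF; the sample document and the <ref> element are made up, and a validating parser would perform these checks itself.

```python
import xml.etree.ElementTree as ET

doc = """<article>
  <section id="s1">Intro</section>
  <section id="s2">See <ref target="s1"/> and <ref target="s9"/>.</section>
</article>"""

root = ET.fromstring(doc)

# ID: every value must be unique within the document.
ids = [e.get("id") for e in root.iter() if e.get("id") is not None]
duplicates = sorted({i for i in ids if ids.count(i) > 1})

# IDREF: every value must match the ID of some element in the document.
dangling = sorted({r.get("target") for r in root.iter("ref")} - set(ids))

print("duplicate IDs:", duplicates)
print("dangling references:", dangling)
```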


Flaws of DTDs
No support for basic data types like integers, doubles, dates, times, ...
No type derivation
Can't express unordered contents conveniently
→ XML Schema


3.2 XML Schema Basics


XML Schema is an XML application
Provides simple types (string, integer, dateTime, duration, language, ...)
Allows defining possible values for elements
Allows defining types derived from existing types
Allows defining complex types
Allows posing constraints on the occurrence of elements
Allows forcing uniqueness and foreign keys
Examples


XML for Beginners


Part 4 Visualization
4.1 XSLT (Extensible Stylesheet Language Transformations)


XSLT essentials and goals


XSLT is a transformation language for XML. That means, using XSLT, you can generate any sort of other document from an XML document. For example, you could turn XML data output from a database into graphics.
XSLT is a W3C XML language (the usual XML well-formedness criteria apply).
XSLT can translate XML into almost anything, e.g.:
well-formed HTML (closed tags)
any XML, e.g. yours or other XML languages like SVG, X3D
non-XML, e.g. RTF (this is a bit more complicated)
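
The idea of turning one XML tree into another document can be sketched in Python. Note that this is not XSLT: Python's standard library has no XSLT processor, so the transformation below is written by hand with xml.etree.ElementTree, only to show the kind of mapping (XML in, well-formed HTML out) that an XSLT stylesheet would describe declaratively. The chosen HTML layout is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

source = ET.fromstring(
    "<article>"
    "<author>Gerhard Weikum</author>"
    "<title>The Web in Ten Years</title>"
    "</article>"
)

# Build the output tree from values selected in the input tree, much as an
# XSLT template would do with <xsl:value-of>.
html = ET.Element("html")
body = ET.SubElement(html, "body")
ET.SubElement(body, "h1").text = source.findtext("title")
ET.SubElement(body, "p").text = "by " + source.findtext("author")

result = ET.tostring(html, encoding="unicode")
print(result)
```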


XSLT Elements

The <xsl:template> Element


The <xsl:value-of> Element
The <xsl:for-each> Element
The <xsl:sort> Element
The <xsl:if> Element
The <xsl:choose> Element
The <xsl:apply-templates> Element


A complete XSLT example


Summary and Outlook

You should give one, I won't.
