XML documents (and HTML documents) are made up of the following building blocks:
Elements, Tags, Attributes, Entities, PCDATA, and CDATA
This is a brief explanation of each of the building blocks:
Elements
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements could be
"note" and "message". Elements can contain text, other elements, or be
empty. Examples of empty HTML elements are "hr", "br" and "img".
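To make the three kinds of element content concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The "attachment" element and the message text are made up for illustration; only "note" and "message" come from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "note" contains other elements,
# "message" contains text, and "attachment" is empty.
doc = "<note><message>Remember the meeting</message><attachment /></note>"

root = ET.fromstring(doc)

# Names of the elements nested directly inside "note"
child_names = [child.tag for child in root]

# Text content of the "message" element
message_text = root.find("message").text

# An empty element has no text and no child elements
attachment = root.find("attachment")
attachment_is_empty = attachment.text is None and len(attachment) == 0

print(root.tag, child_names, message_text, attachment_is_empty)
```

Any XML parser would show the same structure; ElementTree is used here only because it ships with Python.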
Tags
Tags are used to mark up elements.
A start tag like <element_name> marks the beginning of an
element, and an end tag like </element_name> marks the end of
an element.
Examples:
A body element: <body>body text in between</body>.
A message element: <message>some message in between</message>
Attributes
Attributes provide extra information about elements.
Attributes are placed inside the start tag of an element. Attributes come in
name/value pairs. The following "img" element has
additional information about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is
"src". The value of the attribute is "computer.gif".
Since the element itself is empty, it is closed by a " /".
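As a small sketch of how a program sees this name/value pair, the "img" element above can be parsed with Python's standard xml.etree.ElementTree module. The variable names are chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# The empty "img" element from the example above
img = ET.fromstring('<img src="computer.gif" />')

element_name = img.tag        # name of the element
attributes = dict(img.attrib) # all name/value pairs from the start tag
src_value = img.get("src")    # value of the "src" attribute

print(element_name, attributes, src_value)
```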
PCDATA
PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end
tag of an XML element.
PCDATA is text that will be parsed by a
parser. Tags inside the text will be treated as markup and entities will be expanded.
CDATA
CDATA also means character data.
CDATA is text that will NOT be parsed by a parser.
Tags inside the text will NOT be treated as markup and entities will not be expanded.
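The difference between the two can be sketched with Python's standard xml.etree.ElementTree module. This example assumes the unparsed text is written as a CDATA section (<![CDATA[ ... ]]>) inside the document; the "message" content is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Same characters twice: once as PCDATA, once inside a CDATA section
pcdata_doc = "<message>Use <b>bold</b> &amp; italics</message>"
cdata_doc = "<message><![CDATA[Use <b>bold</b> &amp; italics]]></message>"

pcdata = ET.fromstring(pcdata_doc)
cdata = ET.fromstring(cdata_doc)

# PCDATA: <b> is treated as markup and becomes a child element,
# and the entity reference &amp; is expanded to "&"
print(pcdata.text, [child.tag for child in pcdata], pcdata[0].tail)

# CDATA: the tags and the entity reference are kept as literal text
print(cdata.text, len(cdata))
```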
Entities
Entities are variables used to define common
text. Entity references are references to entities.
Most of you will know the HTML entity reference "&nbsp;",
which is used to insert an extra space in an HTML document.
Entities are expanded when a document is parsed by an XML parser.
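Entity expansion can be sketched with Python's standard xml.etree.ElementTree module. The "writer" entity and its value are hypothetical and are declared in an internal DTD only for this illustration; &lt; and &gt; are entities predefined by XML.

```python
import xml.etree.ElementTree as ET

# "writer" is a hypothetical entity declared in the internal DTD subset
doc = """<!DOCTYPE note [
  <!ENTITY writer "Donald Duck">
]>
<note>Written by &writer; &lt;in a hurry&gt;</note>"""

note = ET.fromstring(doc)

# The parser has replaced each entity reference with its text
expanded_text = note.text
print(expanded_text)
```

Note that parsers can be configured differently; some refuse to expand entities from untrusted documents for security reasons.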