XML FAQ

XML FAQ: Table of Contents

  1. What is XML?
  2. What is meant by markup and markup languages?
  3. How is XML related to SGML?
  4. Who needs XML?
  5. Is XML difficult?
  6. What are the rules for writing an XML document?
  7. What is the difference between a tag and an element, and what is an empty tag?
  8. What are the rules as to how a tag must be written?
  9. What are the rules as to how an attribute must be written?
  10. How do you write comments in XML?
  11. What is a CDATA section in XML?
  12. What is a Processing Instruction in XML?
  13. For More Information



What is XML?

XML is short for eXtensible Markup Language, and it is really a set of rules for writing a markup language. Any markup language that conforms to the rules of XML is known as an 'application' of XML. Here is an example of an XML document.

<greeting>Hello XML</greeting>

XML uses angled brackets to designate opening tags and closing tags that contain content. The tags may contain attributes and their values.

These tags are of the form:

<[tagname] [attribute name]="[attribute value]">

Here is an example of an opening tag with an attribute and its value:

<greeting manner="cordial">

Every tag must have a closing tag of the form:

</[tagname]>

Here is an example of the closing tag:

</greeting>

Unlike HTML, ALL attribute values must be quoted with either single or double quotes. The quotes must match.

Also unlike HTML, the tags are case sensitive, i.e. <Atag>, <atag>, and <ATAG> are all different.

If a tag does not have a closing tag, i.e. if it is an empty tag similar to the <IMG> tag in HTML, then it must take the special form:

<[empty tag name]/>

Note the forward slash in the penultimate position. Here is an example of an empty tag in XML:

<emptytag/>

Like HTML, XML can contain comments, and the syntax for comments is the same as in HTML.

<!--this is a comment in both html and xml-->
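As a rough illustration of the points above, here is a small Python sketch that reads a document with the standard library's xml.etree.ElementTree module. The element names (note, separator) and the document itself are made up for this example; only the syntax rules come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are invented for illustration.
document = """<note>
  <!-- comments use the same syntax as in HTML -->
  <greeting manner="cordial">Hello XML</greeting>
  <separator/>
</note>"""

root = ET.fromstring(document)
greeting = root.find("greeting")

print(root.tag)                      # name of the outermost element
print(greeting.get("manner"))        # value of the quoted attribute
print(greeting.text)                 # content between opening and closing tag
print(root.find("Greeting"))         # None, because tag names are case sensitive
print(root.find("separator").text)   # None, because an empty tag has no content
```

Looking up "Greeting" with a capital G finds nothing, which is the case sensitivity rule at work.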

XML really describes a 'grammar' in which we can write our own markup language. It is similar to SGML and is 100% compatible with SGML.

HTML is a mark-up language written according to the rules of SGML. It is an application of SGML.

What is meant by markup and markup languages?

A markup language is the set of rules, the grammar, and syntax that tells how a language which marks up documents should be "spoken". SGML is a markup language, and HTML is the vocabulary of a particular dialect of that language, albeit a very widely spoken dialect. HTML follows the rules of SGML.

XML is also a markup language, with a grammar that is based on that of SGML but substantially simpler.

Markup is the set of symbolic tags used to indicate that something needs to be done to the text. The <b></b> pair is markup in HTML. In XML and SGML, markup likewise takes the form of tags.

Markup can take one of three forms: semantic, stylistic, or structural.

Semantic markup gives information about the text it is marking up. For example, in the element

<hamlet> To be or not to be...</hamlet>

the hamlet tag tells us that the words are being spoken by Hamlet. In HTML, the element

<CODE>For i= 0 to ubound(chapterArray)</CODE>

tells us that the enclosed text is code.

Stylistic markup tells us about the style that should be used to display a document item.

In HTML the element

<I>This is italic text</I>

tells us that the style of the text should change.

Structural markup tells us something about the structure of a document. Again in HTML:

<P> The text that occurs until one comes across another similar tag is a paragraph and should be treated as such.

The XML equivalent of this could be:

<para>the text that occurs......</para>

<P> is structural markup.

The old editors' notations "dele" and "stet", beloved of crossword fans, are structural markup.



How is XML related to SGML?

If it wasn't for HTML, hardly anyone would have heard of SGML (Standard Generalized Markup Language), although it has been an international standard since 1986. It is really a document that lays down rules on how to describe a set of markup tags. HTML is its best-known product. SGML has, however, been used with great success to manipulate large bodies of documents, and relies on the fact that a document marked up according to the rules of SGML can be widely understood on a variety of platforms.

Its great strength is that it allows the use of semantic tagging, which can accurately describe a document's content.

Its chief drawback is its complexity, which makes it difficult for the occasional user, and also makes it difficult to write SGML-compatible software.

XML (extensible markup language) is a recent language that is 100% compatible with SGML. It has been designed by the W3C as a version of SGML suitable for use over the Internet. It is still very much in the developmental stage, although the specification for the language proper is quite established. There is still much work to be done on the form of linking (XLL) and on the form of style sheet to use with it. Originally a simplified version of DSSSL, called DSSSL-o or XS (extensible styling), was to be used, but both of these are horribly complicated. Currently it appears that CSS will be used for everyday declarative styling and XSL will be used when more powerful document manipulation is required.

Most people who have been exposed to this language are wildly enthusiastic about it. It has nearly all the power of SGML with none of the difficulty.



Who needs XML?

Everyone who needs to send documents over the Internet containing information that needs to be manipulated in various ways. (You still make your cool display pages using HTML.)

XML allows us to mark up a document with a set of tags of our own devising.

Markup can be of three sorts:-

Stylistic Markup:-

Tells how the document is to be styled. The <I>, <b>, and <U> tags are all stylistic markup in HTML.

Structural Markup:-

Tells how the document is to be structured. The <H*>, <P>, and <DIV> tags are examples of structural markup.

Semantic Markup:-

Tells us something about the content of the text. <title> and <CODE> are examples of semantic markup in HTML.

HTML has proven very adept at preparing documents for display over the web, but a document marked up in HTML tells us very little about the content of the document, and it so happens that for most documents to be useful in a business situation there is a need to know about the document's content.

As an example, if a patient's medical records were marked up in HTML and I, as a doctor, wanted to find out about the patient's allergies, at present I would have to download the whole record (several K) and then do a manual search through that document.

If, however, the patient's records were marked up in XML and one of the tags was <allergies>, I could just send a request to the server for that part of the document, and receive a few bytes of information instead of the whole record.

Using the same example of patients' records, what if we wanted someone to have access to some parts of the records but not others? (Would you really want everyone at the insurance office reading the notes that your shrink may have written about you?) You could then instruct the server to withhold certain parts of the document, i.e. in the above example, anything marked up <psych.-note> or <confidential>.

Thus the ability for individuals, groups of individuals, and institutions to write their own markup language will expedite information transfer and provide other benefits, such as confidentiality.
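To make the patient record example a little more concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It works on a record held in memory rather than talking to a real server, and the record, its contents, and the helper names extract and withhold are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the data and every tag name other than <allergies>
# and <confidential> are invented for this sketch.
record_xml = """<patient>
  <name>J. Doe</name>
  <allergies>penicillin</allergies>
  <confidential>notes from the psychiatrist</confidential>
</patient>"""

record = ET.fromstring(record_xml)


def extract(record, tag):
    """Return the text of every element with the given tag name."""
    return [element.text for element in record.iter(tag)]


def withhold(record, *tags):
    """Return a copy of the record without the direct children named in tags."""
    copy = ET.fromstring(ET.tostring(record))
    for child in list(copy):
        if child.tag in tags:
            copy.remove(child)
    return copy


allergies = extract(record, "allergies")
public = withhold(record, "confidential")

print(allergies)                          # just the part the doctor asked for
print([child.tag for child in public])    # what the insurance office would see
```

A real server would do the same kind of selection before sending anything over the wire, so that only the requested or permitted parts ever leave it.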

More recently it has become obvious that XML can replace proprietary binary codes in databases, and thus make the old dream of the true interchangeability of data across applications and platforms a reality. XML is also being used to write many of the new language specs. It has become the de facto language of the World Wide Web Consortium (the body that 'governs' HTML).



Is XML difficult?

No. XML was designed to be easy. The official specification is a mere 40 pages (download it from http://www.w3.org/TR/) and is written in (almost) readable language. (It uses EBNF notation to describe the grammar. Read section 6, the last section of the specification, first.)

Anyone with a basic understanding of HTML can be writing XML documents in no time at all.



What are the rules for writing an XML document?

XML documents come in two flavors, the valid document and the well formed document. Every valid document is well formed, but not every well formed document is valid.

A Well-Formed Document

A well-formed document must follow three very simple rules.

  1. It must contain at least one element.
  2. There must be a unique opening and closing tag, which contains the whole document. This forms the ROOT element.
  3. All the tags must be correctly nested and must match. (Note that XML is case sensitive: <tag> is not the same as <Tag>.)

In addition all the tags and attributes must conform to the rules for writing tags, and all the values of the attributes must be quoted.

Here are some examples of some well formed documents.

<greeting>Hello World!</greeting>

The above example follows all the rules for a well formed document.

<greeting manner="cordial">Hello World!</greeting>

We have given the 'greeting' element an attribute. Note how the value is quoted. Single quotes could also be used, but the quotes must match.

<xdoc>
	<greeting>Hello World!</greeting>
	<greeting>Hello XML!</greeting>
</xdoc>

Note that for there to be a unique opening and closing tag we have had to add the xdoc tag.

<xdoc>
	<greeting>Hello World!</greeting>
	<greeting><emphasis>Hello XML!</emphasis></greeting>
</xdoc>

Note how the 'emphasis' tag is nested (i.e. completely enclosed within) in the greeting tag.

The following examples are NOT well formed documents. See if you can figure out why. The answers are given at the end of this question.

Bad example #1

<greeting>Hello World!</Greeting>

Bad example #2

<greeting manner=cordial>Hello World!</greeting>

Bad example #3

<greeting manner="cordial'>Hello World!</greeting>

Bad example #4

<greeting>Hello World!</greeting>
<greeting>Hello XML!</greeting>

Bad example #5

<xdoc>
	<greeting>Hello World!</greeting>
	<emphasis><greeting>Hello XML!</emphasis></greeting>
</xdoc>

A Valid Document

A valid document must be well-formed, and it must also conform to its DTD (Document Type Definition). This is a set of rules describing how the document must be laid out. The DTD (if present) is either written or referenced in the prolog of the XML document.

Answers to Bad examples

  • 1. The tags don't match. The closing greeting tag begins with an upper case G. XML is case sensitive.
  • 2. The value of the attribute 'manner', cordial, is not quoted.
  • 3. The value is now quoted, but the quotes don't match.
  • 4. There is no unique opening and closing tag for the document.
  • 5. The elements do not nest. 'emphasis' and 'greeting' overlap.
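If you want to check your answers by machine, one way is to hand each example to a parser and see whether it complains. The sketch below does this with Python's standard xml.etree.ElementTree. The helper name is_well_formed is my own, and the parser only tests well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = (
    "<xdoc><greeting>Hello World!</greeting>"
    "<greeting><emphasis>Hello XML!</emphasis></greeting></xdoc>"
)

bad_examples = [
    "<greeting>Hello World!</Greeting>",                     # 1: case mismatch
    "<greeting manner=cordial>Hello World!</greeting>",      # 2: unquoted value
    '<greeting manner="cordial\'>Hello World!</greeting>',   # 3: mismatched quotes
    "<greeting>Hello World!</greeting>"
    "<greeting>Hello XML!</greeting>",                       # 4: no single root
    "<xdoc><greeting>Hello World!</greeting>"
    "<emphasis><greeting>Hello XML!</emphasis></greeting></xdoc>",  # 5: overlap
]

results = [is_well_formed(example) for example in bad_examples]

print(is_well_formed(good))   # the nested example is accepted
print(results)                # every bad example is rejected
```

An empty string is rejected too, since a well-formed document must contain at least one element.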


What is the difference between a tag and an element, and what is an empty tag?

These two words are NOT interchangeable. In XML, a tag is what is written between angled brackets e.g. <atag>. This is an example of an opening tag. In XML all opening tags must have closing tags of the form </atag>. The way the <P> tag is used in HTML is illegal in XML. In XML an opening <P> tag requires a closing tag </P>.

An element is an opening and a closing tag and what comes in between.

<greeting>Hello XML!!</greeting >

is an element.

Empty tags must be in a special format, namely <emptytag/> (note where the forward slash is), or else you are allowed to write <emptytag></emptytag>. The HTML <IMG> tag, which has no closing tag, is illegal in XML.

Use a convention. I put HTML tags in uppercase and XML tags in lower case. (This convention is becoming quite widespread.)

XML is case sensitive, i.e. <Atag>, <atag>, and <ATAG> are three different kinds of tags.
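The two ways of writing an empty tag really are interchangeable as far as a parser is concerned. Here is a small Python sketch, again using the standard xml.etree.ElementTree, that parses both spellings and compares the results. The wrapping doc element is invented for the example.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<doc><emptytag/></doc>")
long_form = ET.fromstring("<doc><emptytag></emptytag></doc>")

# Both spellings produce an element with no text and no children.
for doc in (short_form, long_form):
    element = doc.find("emptytag")
    print(element.tag, element.text, len(element))

# Writing either tree back out gives identical output.
same_output = ET.tostring(short_form) == ET.tostring(long_form)
print(same_output)
```

An HTML-style tag left open, such as <IMG> with no slash and no closing tag, makes the same parser raise an error.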



What are the rules as to how a tag must be written?

XML is case-sensitive, i.e. <Atag>, <atag>, and <ATAG> are three different kinds of tags.

A tag name must start with a letter (a-z, A-Z) or an underscore (_) and can contain letters, digits 0-9, the period (.), the underscore (_) or the hyphen (-). White space is not allowed, nor other markup.

The colon (:) is reserved for experimental use, and although it is legal at present, it may acquire special meaning in the future, so don't use it. (For those interested, its main use is in namespaces and reserved keywords.)

No name can begin with the sequence "xml", in any combination of upper and lower case. This sequence is reserved for use by the standardization forum.

Your tags should have semantic meaning, otherwise why bother to use XML!!

With these few simple rules and conventions in mind go ahead and make tags that describe your document!!
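The naming rules above are simple enough to check with a regular expression. The following Python sketch is only an approximation: it follows the rules as stated in this FAQ (ASCII letters only, no colon, no "xml" prefix), whereas the XML specification itself permits a wider range of characters. The function name is_acceptable_tag_name is my own.

```python
import re

# ASCII-only approximation of the naming rules described in this section.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_acceptable_tag_name(name):
    """Check a tag name against the rules and conventions listed above."""
    if not NAME_PATTERN.fullmatch(name):
        return False   # bad first character, white space, colon, and so on
    # Names beginning with "xml" in any mix of case are reserved.
    return not name.lower().startswith("xml")


candidates = ["greeting", "_note", "psych.-note", "2ndtry", "my tag", "ns:tag", "XMLdata"]
for candidate in candidates:
    print(candidate, is_acceptable_tag_name(candidate))
```

The first three candidates pass. The others fail because of a leading digit, white space, a colon, and the reserved prefix respectively.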



What are the rules as to how an attribute must be written?

Tags can contain attributes. An example you may be most familiar with is the <IMG> tag, e.g. <IMG ALT="smileyface" SRC="smiley.gif" VSPACE=75>

In XML an attribute takes the following general form:

attribute="value"

Note that there must be an equals sign and a value, and the value must be quoted, so the VSPACE attribute above would have to be VSPACE="75" to be legal in XML. Also, in HTML some tags can take an attribute without a value, such as <UL COMPACT>. This too would be illegal in XML: you must give an attribute a quoted value such as <UL COMPACT="anything">; even <UL COMPACT=""> would do.

Attributes have to follow these rules.

  • Attribute names follow the same rules as tag names (see above). Attribute values cannot contain "<" or "&" directly; these must be written as the entities &lt; and &amp;.
  • As already mentioned, all values must be quoted.
  • Attributes can only appear in start tags and empty element tags.
  • No attribute may appear more than once in the same start tag.
  • All attributes must be declared in the DTD if present, and their values must be of the correct type.
  • An attribute value cannot contain a reference to an external entity.
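Several of these attribute rules can be tried out directly. The Python sketch below feeds a handful of small tags to the standard xml.etree.ElementTree parser and records which ones it accepts. The helper name parses and the labels are my own, and since this parser does not validate, the DTD rule is not exercised.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if the standard-library parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "quoted value": '<IMG VSPACE="75"/>',
    "single quotes": "<IMG VSPACE='75'/>",
    "empty value": '<UL COMPACT=""></UL>',
    "unquoted value": "<IMG VSPACE=75/>",
    "no value": "<UL COMPACT></UL>",
    "repeated attribute": '<IMG ALT="a" ALT="b"/>',
    "raw < in value": '<IMG ALT="a<b"/>',
    "escaped < in value": '<IMG ALT="a&lt;b"/>',
}

outcome = {label: parses(text) for label, text in checks.items()}
for label, accepted in outcome.items():
    print(label, accepted)
```

Only the quoted, single-quoted, empty, and escaped forms are accepted; the rest are rejected as not well formed.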


How do you write comments in XML?

XML Comments.

XML comments are written the same way as HTML comments, i.e.

<!--this is a comment-->

The XML processor is not required to pass this information on to the user agent, i.e. the piece of software that is converting the document into something useful. XML also has CDATA sections, which are used to escape blocks of text containing markup.
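You can see the "not required to pass this on" point with two parsers from the Python standard library. In this sketch, ElementTree's default parser quietly discards the comment, while minidom keeps it as a node of its own. This shows the behavior of these two particular libraries only, not of XML processors in general.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

text = "<greeting><!--this is a comment-->Hello XML!</greeting>"

# ElementTree's default parser drops the comment and keeps only the text.
element = ET.fromstring(text)
print(element.text, len(element))

# minidom keeps the comment as the first child node of the element.
dom = minidom.parseString(text)
first = dom.documentElement.firstChild
print(first.nodeType == first.COMMENT_NODE, first.data)
```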



What is a CDATA section in XML?

CDATA is short for Character DATA. CDATA sections allow us to escape blocks of text containing markup.

CDATA sections take the general form:

<![CDATA[....put text containing markup here...]]>

For example, suppose I wanted to print out the following line of text, as would be quite common if I was writing a book on HTML or XML:

"The left angled bracket '<' and the ampersand '&' must be replaced by their entities &lt; and &amp; respectively".

If I was writing this in HTML I would have to put:

"The left angled bracket '&lt;' 
and the ampersand '&amp;' 
must be replaced by their 
entities &amp;lt; and &amp;amp; respectively".

By escaping the text using CDATA, I could simply write

<![CDATA["The left angled bracket '<'
and the ampersand '&' must be replaced
by their entities &lt; and &amp; respectively".]]>

The text, including the markup, would be displayed. Obviously, the CDATA escape section could not include the sequence ']]>', but it could include any other kind of markup, which would not be interpreted by the browser.
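To confirm that the CDATA form and the entity form say the same thing, you can parse both and compare the text that comes out. The Python sketch below does this with the standard xml.etree.ElementTree. The para element and the shortened sentence are invented for the example.

```python
import xml.etree.ElementTree as ET

# The same sentence, once inside a CDATA section and once written with entities.
with_cdata = "<para><![CDATA[Replace '<' and '&' by &lt; and &amp;]]></para>"
with_entities = "<para>Replace '&lt;' and '&amp;' by &amp;lt; and &amp;amp;</para>"

cdata_text = ET.fromstring(with_cdata).text
entity_text = ET.fromstring(with_entities).text

print(cdata_text)                  # the characters exactly as typed in the CDATA section
print(cdata_text == entity_text)   # both forms yield the same text
```

Inside the CDATA section nothing is interpreted, so &lt; stays as the four characters you typed.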

What is a Processing Instruction in XML?

Processing instructions

Processing instructions take the form

     <?this is a processing instruction?>
     

The name at the start of a processing instruction cannot be any form of the string "xml"; this is reserved for the XML version declaration, which is written like a processing instruction and must come at the very start of the document.

     <?xml version="1.0" encoding="UTF-8"?>

Processing instructions can occur anywhere in the document and contain information that the processor must pass on to the user agent.
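Here is a short Python sketch showing how a processing instruction reaches the application, using the standard library's xml.dom.minidom. The instruction name formatter and its content page-break are invented for this example; a real application would define its own.

```python
from xml.dom import minidom

# "formatter" and "page-break" are hypothetical names used only for illustration.
text = b'<?xml version="1.0" encoding="UTF-8"?><doc><?formatter page-break?>Hello</doc>'

dom = minidom.parseString(text)
instruction = dom.documentElement.firstChild

print(instruction.nodeType == instruction.PROCESSING_INSTRUCTION_NODE)
print(instruction.target)   # the name that follows "<?"
print(instruction.data)     # everything between the name and "?>"
```

The parser does nothing with the instruction itself. It simply hands the name and the content over for the application to act on.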



For More Information


