Keeping Your Space with xml:space
Aside from xml:lang, XML predefines one more important attribute, xml:space, which can help maintain the layout of source data that is being transported in XML. For example, the original format of the third quote in the quotelist in Listing 2-1 is:
Is this a dagger which I see before me,
The handle toward my hand? Come, let me clutch thee:--
I have thee not, and yet I see thee still.
Art thou not, fatal vision, sensible
To feeling as to sight? or art thou but
A dagger of the mind, a false creation,
Proceeding from the heat-oppressed brain?
However, the XML document that I am using has stripped away the line formatting and looks like this:
Is this a dagger which I see before me, the handle toward my
hand? Come, let me clutch thee: I have thee not, and yet I see
thee still. Art thou not, fatal vision, sensible to feeling as
to sight? or art thou but a dagger of the mind, a false
creation, proceeding from the heat-oppressed brain?
Because Shakespeare's text is often formatted in a very particular way, the loss of the original line breaks is a problem: once they are gone from the document, nothing in XML can restore them. To keep the spacing intact through later document manipulation and reformatting, store the text with its original line breaks and add the xml:space="preserve" attribute to the element, which signals that the spacing and the line breaks are significant:
xml:space="preserve">
Is this a dagger which I see before me,
The handle toward my hand? Come, let me clutch thee:--
I have thee not, and yet I see thee still.
Art thou not, fatal vision, sensible
To feeling as to sight? or art thou but
A dagger of the mind, a false creation,
Proceeding from the heat-oppressed brain?
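As a quick check of what reaches a program when such an element is parsed, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element name quote is an assumption made for illustration (it is not taken from Listing 2-1), and the passage is shortened to two lines.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the predefined "xml:" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical element name "quote"; the text is shortened to two lines.
doc = (
    '<quote xml:space="preserve">Is this a dagger which I see before me,\n'
    'The handle toward my hand?</quote>'
)

quote = ET.fromstring(doc)

# ElementTree exposes xml:space under its namespace-qualified name.
space_mode = quote.get(f"{{{XML_NS}}}space")

# The character data arrives with its line break intact, so the
# original lines can be recovered by splitting on the newline.
lines = quote.text.split("\n")
```

In this sketch the attribute is simply available for the program to inspect; what is done with the line breaks afterward is the program's decision.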
The xml:space="default" attribute can also be set. On its own it asks for nothing beyond the application's normal whitespace handling, so its practical use is to switch off a "preserve" setting inherited from an enclosing element: xml:space applies to everything inside the element that carries it until another xml:space attribute overrides it. Unfortunately, even when the attribute is set to "preserve", retention of the text formatting is not guaranteed. The W3C XML Recommendation requires a parser to pass all character data, whitespace included, on to the application, but it treats xml:space only as a signal of intent and does not oblige the software that processes the document to respect it. This means that some tools may ignore xml:space, but most are good XML citizens and keep the text formatting if "preserve" is set. One more item of note: the spaces, tabs, and line breaks that surround or separate text are referred to as "whitespace" in XSL and parsing lingo, which I will be covering later in this book.
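To illustrate how software that consumes a parsed document might honor xml:space, including a "default" value that overrides an inherited "preserve", here is a minimal Python sketch, not a definitive implementation. The function name render_text, the quote element, and the id attributes are hypothetical, and collapsing runs of whitespace in default mode is only one possible policy, since the choice of default handling is left to the application.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified name under which ElementTree reports xml:space.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def render_text(elem, inherited="default", out=None):
    """Collect each element's text, honoring the xml:space value in effect.

    Hypothetical helper: in "default" mode it collapses runs of
    whitespace into single spaces (an assumed policy); in "preserve"
    mode it leaves the text exactly as parsed.
    """
    if out is None:
        out = {}
    # An element's own xml:space wins; otherwise the ancestor's applies.
    mode = elem.get(XML_SPACE, inherited)
    text = elem.text or ""
    if mode != "preserve":
        text = " ".join(text.split())
    out[elem.get("id", elem.tag)] = text
    for child in elem:
        render_text(child, mode, out)
    return out


doc = (
    '<quotelist xml:space="preserve">'
    '<quote id="kept">I have thee not,\n  and yet I see thee still.</quote>'
    '<quote id="collapsed" xml:space="default">'
    'I have thee not,\n  and yet I see thee still.</quote>'
    '</quotelist>'
)

result = render_text(ET.fromstring(doc))
```

Here the first quote inherits "preserve" from quotelist and keeps its line break and indentation, while the second opts back into default handling and is collapsed onto one line.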