
Section 3: Introducing the eXtensible Markup Language and Related Technologies

3.5 Displaying, styling and transforming XML documents

As identified in Section 3.1, XML encoding by itself does nothing to display a text; other technologies are required to create presentation and output. Options include XML transformers such as Cascading Stylesheets (CSS) and the Extensible Stylesheet Language (XSL). Both produce output by converting XML documents into other formats through the use of an XML processor (Ross et al. 2004, 46).

3.5.1 Cascading Stylesheets (CSS)

XML supports the Cascading Stylesheet mechanism to add style for webpage display. The stylesheet is called in the XML document prologue and is used in the same way as for (X)HTML documents. It comprises a text file consisting of a list of style rules following the syntax 'selector {property: new value}' (Deane and Henderson 2004). An example of this approach is presented in Section 4 as part of the practical case-study (see Section 4.4.2).
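As a minimal sketch of the mechanism described above, the following Python fragment holds a small XML document whose prologue calls a stylesheet, together with the style rules that stylesheet might contain. The element names, the file name report.css and the helper stylesheet_links are assumptions made for illustration and are not taken from the case-study.

```python
from xml.dom import minidom

# Hypothetical report; the second line of the prologue calls the stylesheet.
XML_DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="report.css"?>
<report>
  <title>Trial trench evaluation</title>
  <summary>Two ditches were recorded.</summary>
</report>"""

# Hypothetical contents of report.css: one 'selector {property: value}' rule
# per element. XML elements carry no default styling, so display is set explicitly.
CSS_RULES = """report  {display: block; font-family: serif}
title   {display: block; font-size: 150%; font-weight: bold}
summary {display: block; margin-top: 1em}"""


def stylesheet_links(xml_text):
    """Return the data of each document-level xml-stylesheet processing instruction."""
    doc = minidom.parseString(xml_text)
    return [node.data for node in doc.childNodes
            if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
            and node.target == "xml-stylesheet"]


print(stylesheet_links(XML_DOC))
```

The Python code only locates the stylesheet reference; applying the rules to the elements is the job of the Web browser.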

Web browser support for Cascading Stylesheets has been patchy until recently, and even the latest version of Internet Explorer, 6.0, provides only partial support. For example, the display of tables using the {display: table} property is unsupported in CSS1. No Web browsers currently support the full CSS recommendations in conjunction with XML. The W3Schools website stresses that formatting XML with CSS is not the future of the Web, and that XSL will become more commonly used as Web browsers provide greater support for the XSL specification.

There are presently three levels of CSS defined by the W3C. CSS level 1 is the current W3C Recommendation, issued in 1996 and revised in 1999. CSS level 2, revision 1 is a W3C Candidate Recommendation of 25 February 2004. CSS level 3 is under development at the time of writing. CSS1 can only render an element's data content, whereas a CSS2 rule can render both element attributes and data content. However, neither CSS1 nor CSS2 can be used to modify the XML document structure.

3.5.2 The eXtensible Stylesheet Language (XSL)

XSL is the preferred stylesheet language for XML according to the W3Schools website, as it is far more sophisticated than CSS. XSL is a family of recommendations for defining XML document transformation and presentation. It consists of three parts: XSL Transformations (XSLT), a language for transforming XML; the XML Path Language (XPath), an expression language used by XSLT to access or refer to parts of an XML document; and XSL Formatting Objects (XSL-FO), an XML vocabulary for specifying formatting semantics (Harold and Means 2002). To perform a transformation, an XSL processor is required. This analyses the XML document and converts it into a node tree, each node being an individual branch of the tree, that is, a specific piece of the document. Some Web browsers, such as Microsoft Internet Explorer 6.0, have such a processor built in (see 3.4). The processor looks to the stylesheet for instructions on what to do with the nodes. These instructions are contained within templates identifying the nodes to which they apply (Castro 2001, 136; Harold and Means 2002). The XSL stylesheet is called in the XML document prologue in the same way as for a CSS stylesheet.
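The processing model described above (build a node tree, then apply the template that matches each node) can be imitated in a few lines of Python. This is a toy stand-in and not an XSL processor: real templates match XPath patterns, whereas the sketch below simply matches on element name. The document, the TEMPLATES table and the function apply_templates are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical source document.
XML_DOC = """<report>
  <title>Trial trench evaluation</title>
  <summary>Two ditches were recorded.</summary>
</report>"""

# Toy "templates": each maps an element name to the HTML tag wrapping its output.
TEMPLATES = {"report": "div", "title": "h1", "summary": "p"}


def apply_templates(node):
    """Render a node, then recurse into its child nodes in document order."""
    html_tag = TEMPLATES.get(node.tag)
    text = escape((node.text or "").strip())
    children = "".join(apply_templates(child) for child in node)
    body = text + children  # text following a child element is ignored for brevity
    return f"<{html_tag}>{body}</{html_tag}>" if html_tag else body


root = ET.fromstring(XML_DOC)  # the parser converts the document into a node tree
print(apply_templates(root))
# <div><h1>Trial trench evaluation</h1><p>Two ditches were recorded.</p></div>
```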

Figure 6: XSL transformation model

XSLT version 1.0 became a recommendation of the World Wide Web Consortium in November 1999. It specifies rules by which one XML document is transformed either into another XML document or into HTML and other formats prior to display in the browser. Different stylesheets or scripts can be written to query the same data, so that different clients can receive content in different formats, e.g. new XML, HTML, plain text, WML, PDF or SMIL, from the same XML source document (see Fig. 6; Fitzgerald 2004, 1). In contrast to CSS, XSLT can transform the XML document structure; for instance, elements can be sorted, an element's data content can be transformed into attributes, and attributes can be transformed into data content. Examples of client-side XSL transformations are presented in Section 4 as part of the practical case-study (see 4.4.3).
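The kind of restructuring described above can be sketched in Python to show what the result looks like. The sketch does not use XSLT itself (in a stylesheet the sorting would be done with an xsl:sort instruction); it performs the equivalent steps by hand on a hypothetical list of finds, sorting the elements and turning each period attribute into data content. The element names and the function restructure are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source: the period is held in an attribute, in no particular order.
SOURCE = """<finds>
  <find period="Roman">Samian ware sherd</find>
  <find period="Iron Age">Loom weight</find>
  <find period="Medieval">Glazed floor tile</find>
</finds>"""


def restructure(xml_text):
    """Sort finds by description and move the period attribute into a child element."""
    src = ET.fromstring(xml_text)
    out = ET.Element("finds")
    for find in sorted(src.findall("find"), key=lambda f: f.text):
        item = ET.SubElement(out, "find")
        ET.SubElement(item, "period").text = find.get("period")   # attribute -> content
        ET.SubElement(item, "description").text = find.text
    return ET.tostring(out, encoding="unicode")


print(restructure(SOURCE))
```

The source document is left untouched; a new result tree is built from it, which is also how an XSLT transformation behaves.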

XSLT Version 2.0 is presently a W3C Working Draft (November 2003), designed for use in conjunction with XPath 2.0, which is a W3C Working Draft of October 2004. XPath is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer. The official specification is XPath Version 1.0, a recommendation of the World Wide Web Consortium from November 1999. XPath syntax is a system for describing node sets by specifying their location within an XML document, thus allowing elements to be retrieved from a hierarchical XML document structure, similar to the way in which SQL retrieves information from a database (Ross et al. 2004, 45). XSLT and XPath are the two most widely used XML-related specifications from the W3C since XML 1.0 (Fitzgerald 2004, ix).
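A few location paths illustrate how XPath describes node sets by position in the hierarchy. The sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath 1.0; the document, its element names and the helper select are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical site record: finds grouped by trench.
SITE = ET.fromstring("""<site>
  <trench id="T1">
    <find period="Roman">Samian ware sherd</find>
    <find period="Medieval">Glazed floor tile</find>
  </trench>
  <trench id="T2">
    <find period="Roman">Bronze brooch</find>
  </trench>
</site>""")


def select(path):
    """Return the text of every element matched by the location path."""
    return [node.text for node in SITE.findall(path)]


# All Roman finds, at any depth in the document.
print(select(".//find[@period='Roman']"))   # ['Samian ware sherd', 'Bronze brooch']
# Every find within trench T2.
print(select("./trench[@id='T2']/find"))    # ['Bronze brooch']
# The first find recorded in each trench.
print(select("./trench/find[1]"))           # ['Samian ware sherd', 'Bronze brooch']
```

The first query is loosely comparable to an SQL SELECT with a WHERE clause: the path names where to look and the predicate in square brackets filters the result.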

3.5.3 The XML Linking Language (XLink)

The XML Linking Language (XLink) allows elements to be inserted into XML documents in order to create and describe links between resources. It uses XML syntax to create structures that can describe links similar to hyperlinks within HTML, as well as other more sophisticated links. It can also be used to retrieve images for display. At present, however, XLink is not supported in any of the current Web browsers and cannot, therefore, be used in client-side transformations (Castro 2001, 15). To work around this problem, <a href> HTML tags have been used within XSL stylesheets to retrieve images for the practical case-study in Section 4 (see 4.4.3.1).
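To give an impression of what XLink markup looks like, the sketch below embeds two simple links in a hypothetical report and lists them with Python. The attributes xlink:type, xlink:href and xlink:show belong to the XLink namespace; the element names, file names and the function simple_links are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"  # the XLink namespace URI

# Hypothetical report: one link embeds an image, the other behaves like a hyperlink.
DOC = """<report xmlns:xlink="http://www.w3.org/1999/xlink">
  <figure xlink:type="simple" xlink:href="plan01.jpg" xlink:show="embed"/>
  <seealso xlink:type="simple" xlink:href="appendix.xml" xlink:show="replace">Appendix</seealso>
</report>"""


def simple_links(xml_text):
    """Return (element name, target, show behaviour) for every simple XLink."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.get(f"{{{XLINK}}}href"), el.get(f"{{{XLINK}}}show"))
            for el in root.iter()
            if el.get(f"{{{XLINK}}}type") == "simple"]


for name, target, show in simple_links(DOC):
    print(name, target, show)
```

Any element can carry the linking attributes, which is why the link on figure can request that the image be embedded while the link on seealso replaces the current document.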

3.5.4 Other XML languages

There are a number of other technologies within the XML family, including WML and SVG.

3.5.4.1 The Wireless Markup Language (WML)

The Wireless Markup Language (WML) is an XML-based language developed to allow mobile devices with limited memory, limited processing power, limited bandwidth and small screens to view simplified webpages (Mann and Sbihli 2000). Details of the WML 2.0 DTD can be found through the WAP Forum, now consolidated into the Open Mobile Alliance.

WML is built into the micro-browsers on mobile devices using the Wireless Application Protocol (WAP), a group of networking applications that enable small quantities of data to be transferred across the Internet. Rather than using HTML files, which are generally too large, WAP stores a series of mini-pages called cards in a single file and, via a WAP Gateway, converts data into a format usable by mobile devices, for example by compressing often-used words into code (Deane and Henderson 2004). A practical application of WML is presented in Section 4 using the WAP emulator, WinWAP 3.1 PRO (see 4.4.4).
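The idea of several cards held in one file can be sketched by generating a small WML file with Python. The structure shown (a wml root containing card elements, each with an id, a title and a paragraph) follows WML 1.x; the DOCTYPE declaration is omitted for brevity, and the card contents and the function build_deck are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_deck(cards):
    """Build one WML file from a list of (id, title, text) mini-pages."""
    wml = ET.Element("wml")
    for card_id, title, text in cards:
        card = ET.SubElement(wml, "card", {"id": card_id, "title": title})
        ET.SubElement(card, "p").text = text  # WML text must sit inside a paragraph
    return ET.tostring(wml, encoding="unicode")


# Hypothetical content: two short cards delivered to the device in a single file.
deck = build_deck([
    ("summary", "Summary", "Two ditches were recorded."),
    ("finds", "Finds", "Roman pottery and a bronze brooch."),
])
print(deck)
```

Because both cards arrive together, the micro-browser can move from one to the other without a further request across the network.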

3.5.4.2 Scalable Vector Graphics (SVG)

Scalable Vector Graphics is an XML-based language for Web graphics from the W3C. In the practical case-study presented in Section 4, the image content of the grey literature reports was not available in digital format, and was scanned to create .jpg files for display in a Web browser. Future projects looking to XML-encode archaeological reports that have vector-based digital imagery available may consider the use of SVG markup for their Web presentation and manipulation. The online excavation reports in the Archaeology of York Web Series referred to in Section 2.1.2 are examples of how SVG can bring enhanced visualisation and depth to archaeological site drawings and illustrations on the Web. Rains (2004) has recently discussed these in the Archaeology and XML Newsletter (see 3.6). Wright (2003) has recently explored potential archaeological applications of SVG, utilising digitised vector drawings from the 1975 excavations at Cricklade in Wiltshire as a case-study.

3.5.5 The future of XML technology

As identified above, the standards body for XML and related technologies is the W3C. The definition of the specifications for these technologies is ongoing and new versions are being drafted at the time of writing. There are presently W3C Working Drafts for XSLT Version 2.0, XPath 2.0, XQuery 1.0 and XHTML 2.0.

Many countries, including the UK, have shown their commitment to the use and development of XML technology by formally adopting it as their standard for e-Government (Cabinet Office 2000; see 2.3.2). In the Netherlands, for example, XML is specified in the Regulation for ordered and accessible archival records (Digital Preservation Testbed Project 2002).

XML has become an important part of IT infrastructure due to the many advantages identified in Section 3.2, and it is likely to remain so indefinitely. Web-browser support will probably improve with future releases, but this cannot be relied upon, and server-side, rather than client-side, processing is likely to be the way forward. As Harold and Means (2002, 11) observe, 'XML has proven itself a solid foundation for many diverse technologies'.



© Internet Archaeology URL: http://intarch.ac.uk/journal/issue17/5/gf3-5to3-5-5.html
Last updated: Wed Apr 6 2005