Internet Archaeol. 7. Gray and Walford. XML example frameset

<body>  <h2><a name="p6">A Sample SSD</a></h2> <p>We present below a sample structured site description (SSD) showing the XML markup, with comments on specific elements. Although this information is intended to be parsed by a computer, it is text-based and capable of human interpretation (that it was prepared and formatted by a human makes this easier). In practice, this SSD could be embedded in or attached to a report along with the Dublin Core metadata that describes the report itself.</p> <pre> <a href="fnote1.html" title="XML header (identifies document as XML)"> <?xml version="1.0"?> </a> <site provider="example" sitecode="IA99" version="1"> <<a href="fnote2.html" title="Name, size and location of the site; size and location can be used inside other elements.">name</a>>Brown's Farm, Long Heslington</name> <<a href="fnote2.html" title="Name, size and location of the site; size and location can be used inside other elements.">size</a> type="area" unit="ha">1.2</size> <invtype>evaluation</invtype> <<a href="fnote2.html" title="Name, size and location of the site; size and location can be used inside other elements.">location</a> scheme="osgb">SJ41625978</location> <<a href="fnote3.html" title="Dating uses a simple scheme 'code' and lists two dating codes (applying to the site as a whole).">dating</a> scheme="code">ba med</dating> <<a href="fnote4.html" title="Group of features. <type> describes the feature in terms of a thesaurus. We've used an abbreviated form for the top-level term 'Religious, Ritual and Funerary' from the RCHME TMT hierarchy in this case. There are obviously advantages and disadvantages to using the hierarchical form rather than a single term. The major disadvantage is extra verbosity; the advantage is that partial matches can be used i.e. a search for a general category will yield specific instances without any special support from the search system. 
Features within the group can have their own specific dating, location and description.">feature</a>> <type scheme="rchme-tmt">RRF:Cemetery:Cremation Cemetery</type> <dating scheme="code">ba</dating> <feature> <type scheme="rchme-tmt">RRF:Burial:Cremation Burial</type> </feature> <feature> <type scheme="rchme-tmt">RRF:Burial:Cremation Burial</type> <<a href="fnote5.html" title="A find within a feature. This has explicitly been assigned a scheme of 'none', otherwise it would have inherited its scheme from its parent element.">find</a>> <type scheme="none">Comb</type> </find> </feature> </feature> <<a href="fnote6.html" title=" A second feature group, showing a different date range scheme.">feature</a>> <type scheme="rchme-tmt">Domestic:Settlement</type> <dating scheme="crange">12:13</dating> <feature> <type>Domestic:Dwelling:House</type> <<a href="fnote7.html" title="Note the use of 'qualifier' to give additional keyword descriptions; these may or may not come from a controlled vocabulary.">qualifier</a>>timber-framed, clay-walled, thatched</qualifier> </feature> <<a href="fnote8.html" title=" Further features within this group. They are contained within separate feature tags as this is the only way to identify them as separate features. Three type lines in a row would provide three different thesaurus terms for the same feature (they could be terms from different wordlists). Note also that as no scheme is specified 'rchme-tmt' (from the parent element) is assumed.">feature</a>> <type>Unassigned:Drain</type> </feature> <<a href="fnote8.html" title=" Further features within this group. They are contained within separate feature tags as this is the only way to identify them as separate features. Three type lines in a row would provide three different thesaurus terms for the same feature (they could be terms from different wordlists). 
Note also that as no scheme is specified 'rchme-tmt' (from the parent element) is assumed.">feature</a>> <type>Unassigned:Boundary</type> </feature> <<a href="fnote8.html" title=" Further features within this group. They are contained within separate feature tags as this is the only way to identify them as separate features. Three type lines in a row would provide three different thesaurus terms for the same feature (they could be terms from different wordlists). Note also that as no scheme is specified 'rchme-tmt' (from the parent element) is assumed.">feature</a>> <type>Agricultural:Stock Enclosure</type> </feature> </feature> </site> </pre> <p><a name="p61"><strong>There are several points worth noting in this example:</strong></a></p> <ul> <li>Tags can be nested. This allows features and finds to be grouped in a logical way and makes it straightforward to express the concept of a find being contained within a feature, and of sub-features. <li>We have used tag content rather than attributes to hold key values. This is primarily to make the description easier to read, and to make a distinction between the content data which may either be free text or from a large controlled vocabulary, and attributes which will generally have a small number of possible values. <li>Most of the tags are optional. A fixed scheme that forces the specification of unknown or irrelevant data makes the automated searching and indexing process more complicated. It is possible to attach a confidence value to descriptions or dates which are uncertain. <li>Provision is made for different schemes for type, dating, location, size and so on, to reflect the preferences of authors. There is obviously a benefit to the community in adopting common standards where possible, though for some items (size, dating, location) translation between schemes is feasible. 
Translation between different thesauri for feature and find types is a significantly more involved task, so adoption of common standards here is appropriate. <li>There are mandatory attributes for provider and site/report code in order to give every site a unique identifier. A version number is provided to make it possible to distinguish a newer version of the description of a site, should it be revised. </ul> <p>A few notes on terminology are relevant. We have used 'feature' as the primary element description tag, rather than 'context', as this structure is not intended for recording at the context level. Instead, the intent is to record the "most interesting" facts about the site in a structured manner. As a consequence, it is not a replacement for traditional data repositories, and does not contain all the possible information that could be recorded about a site. The design criterion has been to include the things that are the most likely grounds for searching and indexing, as a trade-off between compactness and completeness; like metadata, this information is ultimately just a pointer to a full report.</p> <p>The definition of "interesting" is flexible; it covers the kind of material that would be highlighted in an abstract, or that the author would particularly mention to their peers. Obviously, things which are commonplace or standard features of sites need not be included. A general &lt;find&gt; entry for 'animal bone' is probably redundant unless there is something unusual about its presence, quantity, location or type. It may be interesting to note negative information (i.e. the absence of certain kinds of finds or features) in some circumstances.</p> <p>The other information which would be better recorded in a site database is a precise description of the relationships between elements. The SSD structure that we propose does not provide a means to report relationships between contexts. 
Instead, it is intended to provide a general outline of what there is, and possibly where it is.</p> <p><a name="p62"><strong>This should be seen very much as a work in progress. There are plenty of issues to be resolved, for example:</strong></a></p> <ul> <li>Should a description include any methodological information? For example, should features inferred from geophysical data be marked as such? <li>How much contextual information about site location could usefully be specified (e.g. whether it is urban, hilltop, by a river or underwater)? This is potentially of interest to data users, but requires the selection (or possibly development) of suitable vocabularies. <li>Is it useful to specify the shape of items in a general sense? A vector graphics addition (see below) would provide a more precise representation, but without this, what general terms should be used to describe object shapes? <li>Confidentiality of site locations. There is obviously some concern about making the exact locations of sites available on the web. There is no simple automatic solution to this. The accurate location data could be omitted and, if the site proved to be interesting on other grounds, the report author or original contractor could be contacted. An encryption-based scheme is also a possibility but involves administrative overhead. </ul> <p><strong>It is clear that much work remains in the identification of suitable controlled vocabularies, and that wider consultation with potential users is necessary. Other developments could include (in descending order of priority):</strong></p> <ul> <li>A search engine (and "web crawler") tailored to the SSD data format. <li>Simplified site plans. Embedding XML vector graphics tags in an investigation description could provide a pictorial representation of the position and extent of features. 
Because of the compactness of the XML representation, the plan would be available on the search engine; the items (features or findspots) relating to query terms could be highlighted. <li>A user-friendly interface that will present the user with, say, a set of simple forms and then generate the SSD. This could be done by providing an applet which users could run locally, or via a web-based interface. </ul> </body>