4. Discussion and Evaluation

The Demonstrator scenarios and reflections on the STAR implementation experience raise various issues, which are discussed in this section. These range from archaeological modelling to technical development issues, discussion of cost-benefits and future work.

4.1 CRM-EH ontology modelling

The CRM-EH model is based on the archaeological recording methodology known as 'single context recording'. This is the most widely adopted recording methodology currently used in the UK and is widespread elsewhere, with origins in recording systems from the Museum of London and English Heritage (Chadwick 1998; Richards and Hardman 2008). It is generally the approach used by EH and therefore reflected in the archaeological records made for Raunds and other excavations consistent with the EH recording system. Data collected under another archaeological recording methodology that does not record individual contexts as single units of stratigraphy may not fit the conceptual framework of single context recording that is incorporated in the CRM-EH model. Nevertheless, it is possible to envisage that datasets resulting from different approaches that recorded finds objects from archaeological layers could still be mapped to the CRM-EH (if necessary via the CRM).

Representing details of physical relationships

In the Demonstrator it is possible to search the immediate stratigraphic relationships between contexts (Above/Below/Equals), as discussed in the stratigraphic scenarios. The stratigraphic browsing interface is illustrated in Figure 6.

Figure 6: Screendump from Demonstrator with stratigraphy context view

However, the interface (and the mapping) does not present all the details of physical relationships, such as Cut by, Cuts, Butted by etc. An exception was made, where present in the data, for Contains, which was considered particularly relevant for understanding relationships between contexts and features - see Figure 7, for example, which shows one context contained by another, in addition to the context–group (in green) relationship.

Figure 7: Screendump from Demonstrator with hierarchical context view

The EH recording manual makes a clear conceptual distinction between physical relationships and what are called direct stratigraphic relationships; i.e. where one layer is immediately above or below another, as shown in Harris matrix diagrams, and which carry the direct temporal relationship such as 'layer A directly “is before” layer B'.
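As an illustration of how direct stratigraphic relationships carry temporal ordering, the following minimal Python sketch derives every context that must be earlier than a given context from a set of direct 'above' pairs. The context numbers and the function name `earlier_contexts` are hypothetical and are not taken from the Demonstrator or the EH recording manual.

```python
from collections import defaultdict

# Hypothetical direct stratigraphic pairs (upper, lower): "upper is directly above lower".
DIRECT_ABOVE = [(101, 102), (102, 103), (102, 104), (104, 105)]


def earlier_contexts(context, direct_above):
    """Return every context lying directly or indirectly below `context`.

    Each of these is earlier ("is before") in the stratigraphic sequence.
    """
    below = defaultdict(set)
    for upper, lower in direct_above:
        below[upper].add(lower)

    found, stack = set(), [context]
    while stack:
        for lower in below[stack.pop()]:
            if lower not in found:
                found.add(lower)
                stack.append(lower)
    return found


print(sorted(earlier_contexts(102, DIRECT_ABOVE)))
```

Only the direct pairs need to be recorded; the wider temporal ordering follows by traversal, which is the sense in which a Harris matrix carries the sequence.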

Physical relationships are characterised by physical properties identified during excavation, such as a foundation trench cutting through a whole collection of layers, or perhaps a well or deep pit that may cut down through a sequence of 20-30 or more different layers (indeed wells may go through the entire stratigraphic sequence). In such cases, the well is noted to 'cut' various layers. There is clearly a broader temporal relationship implicit in the statement that X cuts Y. However, the stratigraphic relationship of the well cut (the one most important for dating when the well was first dug) is only directly regarded as being 'above' the very first layer (the most recent level in the stratigraphic matrix) that the well cuts through. Future work could consider more detailed treatment of physical relationships. For example, Chadwick (1998) outlines the potential for recording physical alongside stratigraphic relationships, as a means of incorporating more contextual information and allowing archaeologists on site to contribute more actively to the interpretative process.
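The distinction can be sketched in a few lines of Python. The example assumes, purely for illustration, a simple linear sequence of layers (a real stratigraphic matrix is a partial order), and the layer labels and the function name `direct_stratigraphic_below` are hypothetical.

```python
# Hypothetical stratigraphic sequence, listed from most recent (top) to earliest.
SEQUENCE = ["L1", "L2", "L3", "L4", "L5"]


def direct_stratigraphic_below(cut_layers, sequence):
    """Return the one layer a cut feature is directly 'above'.

    Of all the layers the feature physically cuts, only the most recent
    carries the direct stratigraphic relationship.
    """
    position = {layer: i for i, layer in enumerate(sequence)}
    return min(cut_layers, key=position.__getitem__)


# Physical relationships: the well cut is noted to cut three layers.
well_cuts = {"L3", "L4", "L5"}
print(direct_stratigraphic_below(well_cuts, SEQUENCE))  # L3
```

The three 'cuts' statements are physical relationships; only the relationship to L3 would appear as a direct link in the matrix.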

Granularity of detail in the model

One factor that influences the level of detail in the search interface and results is the granularity of detail in the ontological modelling; some areas of the CRM-EH go to greater levels of detail. For example, the measurements of environmental sample sizes and quantities are included in the model and have been implemented in the Demonstrator, unlike similar details for soil descriptions and sizes of deposit inclusions. For STAR Demonstrator purposes, we judged it useful to show how detail could be included, but also took account of the probable requirements for detailed search across widely differing datasets.

This issue is linked to the different levels of granularity in the datasets and to whether the elements considered for cross-search reach a critical mass. For example, all datasets included context records, but not all held data on deposit inclusions, clay, silt and sand content, method of excavation, etc. Such details seemed less likely to be relevant for inter-site analysis across a number of different sites/projects, particularly considering the more general level of detail in grey literature reports. On the other hand, for intra-site analysis, a higher level of detail in relating context data might be fruitful, and one future scenario could be that researchers are referred to native project databases, having identified them via a broader cross-search facility such as STAR. Alternatively, if necessary, greater detail could be extracted via STAR techniques.

The datasets also differed as to their respective stages in the project management process, which archaeological projects tend to follow. Thus Raunds Prehistoric was excavation data archived after work on the site was completed, while Raunds Environmental was produced by specialist environmental assessment work. The extract from Raunds Roman was from the analysis stage, while Silchester LEAP and MoLA ROP could be considered at the publication stage. The CRM-EH ontology covers various stages in the archaeological process, from fieldwork excavation and survey to the later processes of finds conservation, environmental analysis and site analysis of grouping and phasing, where contexts are amalgamated into interpretative groups and broader dating phases are attached. While assistance with the initial stages can be offered by the semantic techniques, our conclusion is that it is generally most useful to search across project datasets that have effectively reached publication stage. However, future research might also explore the evolution of data and interpretation throughout the archaeological process.

As explained in section 1.2, the original aim of the CRM-EH was to model the EH (Centre for Archaeology) archaeological processes in order to inform future systems design. The Demonstrator shows that the CRM-EH can also serve cross-search purposes. To achieve this, a particular subset of the model was employed; some of the detail required for systems design was not required, for example procedural aspects of how EH carried out their recording practices (e.g. sample and finds processing and measurements). The original CRM-EH model was simplified to focus on the four main facets of archaeological data presented in the Demonstrator user interface and the STELLAR templates (Groups – Contexts – Finds – Samples), together with associated attributes and properties.
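One way to picture the simplified four-facet view is as a small nested data structure. The Python sketch below is an assumption made for illustration only: the class and field names are hypothetical, they are not CRM-EH entity names, and the actual model expresses these facets as ontology classes and properties rather than as program objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Find:
    find_id: str
    description: str = ""


@dataclass
class Sample:
    sample_id: str
    description: str = ""


@dataclass
class Context:
    context_id: str
    finds: List[Find] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)


@dataclass
class Group:
    group_id: str
    contexts: List[Context] = field(default_factory=list)


def finds_in_group(group):
    """Collect the finds from every context amalgamated into a group."""
    return [find for context in group.contexts for find in context.finds]


# Hypothetical records, for illustration only.
pit = Group("G1", [
    Context("C10", finds=[Find("F1", "pottery sherd")]),
    Context("C11", finds=[Find("F2", "coin")], samples=[Sample("S1", "soil sample")]),
])
print([f.find_id for f in finds_in_group(pit)])  # ['F1', 'F2']
```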

While this may not be sufficient for all possible enquiries, the scenarios discussed in section 3.1 are intended to provide some evidence that this core subset provides an appropriate balance of granularity for inter-site search and demonstrates the potential for complex semantic search.

In one sense, shielding users from the full complexity of the CRM (and CRM-EH) is effectively exposing a simpler ontology. As described in section 4.2, the STELLAR templates largely correspond to the main elements of the Demonstrator user interface, a particular subset of the CRM-EH. It is not necessary for users, either as data providers or as searchers, to be exposed to the full ontology.

Spatial and temporal modelling

Although spatial attributes were mapped and extracted, the Demonstrator, within the constraints of the available project resources, focused on the project's research goals and did not reproduce the spatial functionality available via archaeological GIS or ADS ArchSearch, for example. However, if required by future development, it would be straightforward to transform the site coordinate data extracted by STAR to universal coordinate systems for spatial visualisation and retrieval tools. Figure 2 illustrates this point, since it was produced using Google Maps on RDF data, in the same basic format employed by the STELLAR templates for location information.
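A minimal sketch of such a transformation is given below in Python. It assumes that the local site grid is related to a national grid by a rigid rotation and translation. The function name, parameter names and coordinate values are all hypothetical, and a further conversion to latitude/longitude would need a map projection step that is not shown.

```python
import math


def site_to_national(x, y, origin_easting, origin_northing, rotation_deg=0.0):
    """Convert local site-grid coordinates (metres) to national-grid coordinates.

    Assumes the site grid is the national grid rotated anticlockwise by
    `rotation_deg` and shifted so that site (0, 0) falls at the given origin.
    """
    theta = math.radians(rotation_deg)
    easting = origin_easting + x * math.cos(theta) - y * math.sin(theta)
    northing = origin_northing + x * math.sin(theta) + y * math.cos(theta)
    return easting, northing


# Hypothetical site origin; a find recorded 10 m east and 20 m north of it.
print(site_to_national(10.0, 20.0, 500000.0, 270000.0))
```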

Similarly, on the temporal side, Figure 8 shows the capability to extract time period information. The extract is produced from the Silchester LEAP data using a custom XSL transformation that generates CRM-based RDF. It also illustrates an advantage of using a conceptual model, in that the STAR.TIMELINE process (section 2.1) has automatically created temporal interval relationships from finds to 'known' period concepts. These could be utilised by a future inference capability for querying over temporal relationships.

Figure 8: Silchester LEAP time period data expressed as RDF

  <crmeh:EHE0009.ContextFind rdf:about="http://tempuri/star/"/>
  <crmeh:EHE0038.ContextFindProductionEventTimespan rdf:about="http://tempuri/star/">
    <crm:E61.Time_Primitive rdf:about="http://tempuri/star/">
      <claros:not_before rdf:datatype="">43</claros:not_before>
      <claros:not_after rdf:datatype="">410</claros:not_after>
<crm:P114F.is_equal_in_time_to rdf:resource="http://tempuri/star/concept#134738"/>
<crm:P117B.includes rdf:resource="http://tempuri/star/concept#134823"/>
<crm:P118B.is_overlapped_in_time_by rdf:resource="http://tempuri/star/concept#134822"/>
<crm:P118F.overlaps_in_time_with rdf:resource="http://tempuri/star/concept#135956"/>
<crm:P119B.is_met_in_time_by rdf:resource="http://tempuri/star/concept#134803"/>
<crm:P120B.occurs_after rdf:resource="http://tempuri/star/concept#136075"/>
<crm:P120F.occurs_before rdf:resource="http://tempuri/star/concept#135957"/>
<crm:P119F.meets_in_time_with rdf:resource="http://tempuri/star/concept#134743"/>
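The kind of interval comparison that underlies the relationships in Figure 8 can be sketched as follows. This Python fragment is not the STAR.TIMELINE implementation; it is a minimal illustration of naming the relation between two dated intervals, using labels modelled on the CRM temporal properties (P114 to P120). Only the 43 to 410 span comes from the extract; the period names and their dates are hypothetical placeholders.

```python
def allen_relation(a, b):
    """Name the temporal relation of interval a to interval b.

    Intervals are (start, end) tuples of years with start < end.
    """
    (a0, a1), (b0, b1) = a, b
    if (a0, a1) == (b0, b1):
        return "is_equal_in_time_to"
    if a1 < b0:
        return "occurs_before"
    if b1 < a0:
        return "occurs_after"
    if a1 == b0:
        return "meets_in_time_with"
    if b1 == a0:
        return "is_met_in_time_by"
    if a0 == b0:
        return "starts" if a1 < b1 else "is_started_by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "is_finished_by"
    if b0 < a0 and a1 < b1:
        return "occurs_during"
    if a0 < b0 and b1 < a1:
        return "includes"
    # Only partial overlaps remain at this point.
    return "overlaps_in_time_with" if a0 < b0 else "is_overlapped_in_time_by"


find_span = (43, 410)  # not_before / not_after from the extract

# Hypothetical period concepts with placeholder dates.
periods = {
    "Period A": (43, 410),
    "Period B": (100, 200),
    "Period C": (-800, 43),
    "Period D": (410, 1066),
    "Period E": (300, 600),
}
for name, span in periods.items():
    print(name, allen_relation(find_span, span))
```

Once each find carries a dated span, relations of this kind to every known period can be generated mechanically, which is what would make querying over temporal relationships feasible.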


© Internet Archaeology/Author(s)
File last updated: Mon July 18 2011