4.2 Data mapping process and STELLAR project

When considering wider use of these methods, the complexity and possible subjectivity of mapping and extracting data to the ontology is potentially the most significant cost issue, in terms of both the human resources and the expertise required. Many of the entities in the ontology are fairly abstract, and the conceptual complexity of the CIDOC CRM poses a challenge to some non-specialists. Different people can also make alternative valid mappings to the ontology for the same situation, raising difficulties for semantic interoperability, as encountered in the European FP6 Bricks project (Nußbaumer and Haslhofer 2007).

The CRM-EH extensions to the CRM were useful in this regard for the STAR project mappings. Because CRM-EH entities are sub-classed from the CRM, they inherit the same properties as their parent classes. In addition, the CRM-EH entities have been given further definitions (via scope notes) with more specific archaeological meanings. Thus mapping the data about a ContextFind (EHE0009 isa CRM E19) and relating it to a Context (EHE0007 isa CRM E53) is a more obvious mapping than trying to relate all the E19 Physical Objects from an excavation to the relevant E53 Places.
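The sub-class relationship can be illustrated with a minimal Python sketch. Only the two sub-class links (EHE0009 to E19, EHE0007 to E53) come from the text; the property names and the helper functions are hypothetical placeholders, chosen purely to show how a property declared on a parent class becomes usable on the more specific CRM-EH entity.

```python
# The two sub-class links below are those named in the text.
SUBCLASS_OF = {
    "EHE0009_ContextFind": "E19_Physical_Object",
    "EHE0007_Context": "E53_Place",
}

# Hypothetical property table (placeholder names, not real CRM properties):
# which properties are declared directly on which class.
DECLARED_PROPERTIES = {
    "E19_Physical_Object": ["illustrative_location_property"],
    "E53_Place": ["illustrative_containment_property"],
    "EHE0009_ContextFind": [],
    "EHE0007_Context": [],
}


def ancestors(cls):
    """Return cls followed by each of its super-classes, nearest first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def usable_properties(cls):
    """Collect properties declared on cls or inherited from its parents."""
    props = []
    for c in ancestors(cls):
        props.extend(DECLARED_PROPERTIES.get(c, []))
    return props


def is_a(cls, parent):
    """True if cls is parent itself or a (transitive) sub-class of it."""
    return parent in ancestors(cls)
```

Under these assumptions, a query phrased against E19 would also match every ContextFind, while the data provider only has to think in terms of the more familiar archaeological entity.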

The STAR mapping work was performed by the project team, assisted by the mapping and extraction tool described in section 2.2. It is clear, however, that the process of CRM mapping remains complex for potential data providers. The follow-on STELLAR project (in collaboration with ADS) has therefore addressed this issue by providing more support and guidance to data providers. The data mapping/extraction techniques have been generalised to help third-party data providers undertake this work. The aim is to make mapping/extraction easier for users who are familiar with their own data but less familiar with the ontology, and to encourage consistency in the mapping.

Following discussions with archaeologists at the STAR final workshop, the STELLAR approach is to work back from the ontology to the datasets. Guidelines identify templates for mapping to key areas of the ontology (drawing on STAR experience). The current set of templates corresponds to the rationale of cross search for inter-site analysis outlined in section 4.1, although other templates for different purposes could be envisaged. These templates largely correspond to the archaeological concepts underpinning the STAR user interface (Groups, Contexts, Finds, Samples). Choosing a template and providing the associated data amounts to making a mapping to the CRM and CRM-EH entities associated with that template.
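The idea that filling in a template amounts to making a mapping can be sketched as follows. This is an illustration only and does not reproduce the STELLAR template format: the template structure, the `crmeh:` and `ex:` prefixes, the `ex:foundIn` property, and the column names `find_id` and `context_id` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical "Finds" template: each entry is a (subject, predicate, object)
# pattern whose {placeholders} are filled from one row of the provider's data.
FIND_TEMPLATE = [
    ("find:{find_id}", "rdf:type", "crmeh:EHE0009_ContextFind"),
    ("context:{context_id}", "rdf:type", "crmeh:EHE0007_Context"),
    ("find:{find_id}", "ex:foundIn", "context:{context_id}"),
]


def apply_template(template, rows):
    """Fill the template once per data row and return de-duplicated triples."""
    triples = []
    for row in rows:
        for subj, pred, obj in template:
            triples.append((subj.format(**row), pred, obj.format(**row)))
    # dict.fromkeys keeps first-seen order while dropping repeated triples,
    # e.g. the same context typed once per find.
    return list(dict.fromkeys(triples))


# Invented sample data standing in for a provider's tabular export.
SAMPLE = "find_id,context_id\nF1,C10\nF2,C10\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
triples = apply_template(FIND_TEMPLATE, rows)

for triple in triples:
    print(" ".join(triple))
```

The provider supplies only the two columns; the choice of ontology entities is fixed by the template, which is what encourages different providers to arrive at consistent mappings.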

The STELLAR website makes available for download a command line mapping/extraction tool and a simpler browser-based application, together with a set of templates for CRM-EH and SKOS, and also broader CIDOC CRM templates conforming to the CLAROS classical art project format. An online tutorial uses the Silchester LEAP dataset as an example, extracting the LEAP sample data discussed in the Hearth Metalworking scenario.


© Internet Archaeology/Author(s)
File last updated: Mon July 18 2011