1.3 Data selection

The initial research proposal involved various independent datasets held in different systems but all relating to a single overall project: Raunds, an English Heritage (EH) project studying a large area in Northamptonshire, with a wide range of finds and environmental data. As the project progressed, we realised that a broader range of datasets, drawn from a number of organisations, would test the mapping and cross-searching methods more fully.

Given the initial selection of the Raunds Roman (RRAD - Iron Age and Roman) dataset, with associated Stanwick (STAN) sampling data, and the Raunds Prehistoric (RPRE - Neolithic and Bronze Age) dataset, we looked for related datasets, particularly for the Roman period. There were a number of reasons for mapping to the Museum of London Archaeology (MoLA) data. The MoLA recording system is well documented (Spence 1990) and has become well established over the last 20-25 years, being widely adopted, with various modifications, by a range of organisations in the UK and beyond. Because these organisations share a data structure, mapping the MoLA project structure to the CIDOC CRM and CRM-EH potentially extends to the large group of projects that have used the system for recording over the years. Accordingly, a set of different projects using the MoLA database format was mapped and their data extracted, the largest being the Royal Opera House (ROP) project.
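The reuse argument can be illustrated with a minimal sketch. It is not the STAR mapping itself: the column names, the `ex:` property labels, the two projects and the `map_records` helper are all hypothetical, chosen only to show how a mapping defined once for a shared table layout might be applied unchanged to every project recorded in that layout.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical column-to-property mapping for one recording system.
# Neither the column names nor the property labels are taken from the
# MoLA database or from CRM-EH; they are placeholders for illustration.
SCHEMA_MAPPING: Dict[str, str] = {
    "context_type": "ex:hasContextType",
    "period": "ex:hasPeriod",
}


def map_records(project: str, rows: Iterable[Dict[str, str]],
                mapping: Dict[str, str]) -> List[Triple]:
    """Turn rows from one project into (subject, predicate, object) triples."""
    triples: List[Triple] = []
    for row in rows:
        subject = f"ex:{project}/context/{row['context_id']}"
        for column, predicate in mapping.items():
            value = row.get(column)
            if value:  # skip empty or missing fields
                triples.append((subject, predicate, value))
    return triples


# Two invented projects that share the same table layout reuse one mapping.
project_a = [{"context_id": "101", "context_type": "pit", "period": "Roman"}]
project_b = [{"context_id": "7", "context_type": "ditch", "period": ""}]

triples_a = map_records("projectA", project_a, SCHEMA_MAPPING)
triples_b = map_records("projectB", project_b, SCHEMA_MAPPING)
```

In this sketch the only project-specific input is the project name used to build the subject identifier; the mapping itself is written once for the schema, which is the property that makes a widely adopted recording system an attractive mapping target.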

The data from the Silchester Roman town life project (LEAP) was also identified as a useful test bed since it relates well to RRAD by period. The Silchester IADB dataset employed in STAR was published as part of the AHRC-funded LEAP project, and LEAP is used as an acronym for the dataset in this article. Since the site database was the York Archaeological Trust Integrated Archaeological Database (IADB) system, mapping the LEAP Silchester data potentially extends to other IADB excavation datasets.

An extract of 2460 reports from the OASIS (Online AccesS to the Index of archaeological investigationS) index of grey literature, made available by the ADS, was an obvious choice for a grey literature dataset. The OASIS project aims to enhance access to grey literature and provides a data capture form, encouraging providers from developer-funded and research fieldwork to contribute reports. While the resulting digital library can be searched via the ADS ArchSearch catalogue utility, there is currently no provision for detailed cross-searching with online datasets. One goal of STAR was to take a first step towards connecting archaeological grey literature with published online data.

Figure 2: Google Maps plot of the three principal Investigation Projects providing data for STAR: Raunds, Northants; Royal Opera House, London; Silchester, Hants. RDF triple data is displayed for Silchester in the ALT tag box.

The selected datasets came in a variety of structures and formats. Mapping the various datasets gave confidence that a wide range of different organisations could potentially map their data records to the ontology. Figure 2 shows the distribution of the main excavation data sites (Raunds, MoLA ROP, Silchester).


© Internet Archaeology/Author(s)
File last updated: Mon July 18 2011