3. Methodological Challenges

3.1 Database construction

Fulfilling our initial objective, the creation of a database for the whole of the Yorkshire region, was no easy matter, given that it had to draw on material held in a great variety of institutions and systems employing diverse terminology. Clearly, any scheme endeavouring to reconcile such diversity had to be sufficiently generic to encompass these disparate holdings, yet sufficiently detailed to do justice to the variety of the archaeological evidence being gathered. That said, our initial work had a head start in certain areas. First, of the seven Sites and Monuments Records (SMRs) that actively curate data for the region (the North York Moors National Park, North Yorkshire County Council, West Yorkshire Archaeology Service, South Yorkshire Archaeology Service, City of York Council, Humber Archaeological Partnership and the Yorkshire Dales National Park; the Peak Park receives only data subsets from adjoining areas), the majority employed the ExeGesis system to hold data. In addition, many used terms derived from the National Monuments Record's Thesaurus to describe monuments and objects. This allowed the team to define concordances between those terms and the project's functional categories.

Secondly, bespoke strategies were required to incorporate most museum data because of the diverse database systems which these institutions employ. Data was gathered from Harrogate and District Museums Service, the Yorkshire Museum, Hull and East Riding Museum, Wakefield Museums Service, Doncaster Museum, Sheffield Museum Trust and Leeds Museum. Some of these museums, however, use the common MODES database, and many have constructed their record-holding systems on SPECTRUM, the MDA-approved documentation standard. In addition, the organisations using SPECTRUM employ the approved MIDAS list of terms for their holdings, a terminology also used in the classification of objects generated by the Treasure Act and held by the British Museum, and of artefacts reported to the Portable Antiquities Scheme, two of our further data sources. Finally, recent commercial data could be accessed from the Archaeological Investigations Project at Bournemouth and from English Heritage's Excavations Index hosted by the Archaeology Data Service, although coverage in each case was neither complete nor wholly accurate.

To be relevant to our objectives, each data entry required spatial, chronological and functional attributes: a dated archaeological observation was of limited use without some knowledge of the human activity associated with it, while knowing date and function but having no record of where the data point lies in the region also ruled it out for our purposes. Although designing a system to record the spatial and chronological components proved fairly straightforward, dealing with different functional attributes proved rather more time-consuming. However, creating a structure for the latter was essential if we were to allow the character of Yorkshire's archaeological resource to be evaluated, both within and across periods. Each is discussed in more detail below.
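The three-attribute requirement can be sketched as a simple record type. This is an illustration only, not the project's actual schema: the class name `Entry`, its field names, the `is_usable` check and the sample coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Entry:
    """One archaeological observation (hypothetical schema)."""
    source: str                                  # contributing institution
    location: Optional[Tuple[int, int]] = None   # (easting, northing), assumed form
    period: Optional[str] = None                 # e.g. "BRONZE AGE"
    function: Optional[str] = None               # e.g. "funerary"

    def is_usable(self) -> bool:
        # An entry is relevant only if all three attributes are present.
        return None not in (self.location, self.period, self.function)


# Illustrative records: the coordinates are invented for the example.
urn = Entry("museum", location=(460000, 452000), period="BRONZE AGE", function="funerary")
unlocated = Entry("museum", period="ROMAN", function="production")
```

Here `urn.is_usable()` is true, whereas `unlocated.is_usable()` is false because the record has date and function but no position.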

For spatial data, most observations existed within a coherent system of grid references and were thus easily dealt with. Beyond this, the main issue was how to cater for the fact that some locations were very specific, but others much less so. For example, a recently excavated building might be precisely positioned, whereas an artefact recovered in the 19th century might be recorded merely as coming from a particular parish. Our system has been set up to cope with these different levels of spatial resolution.
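One way to carry different levels of spatial resolution, sketched here under stated assumptions, is to store an explicit uncertainty alongside each grid reference. The `Location` type, the `precision_m` field, the coordinates and the 5,000 m figure used for a parish-level find are hypothetical and are not taken from the project database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A grid reference plus how precisely it is known (hypothetical)."""
    easting: int
    northing: int
    precision_m: int  # assumed: side length in metres of the square containing the find


def plottable_at(loc: Location, max_uncertainty_m: int) -> bool:
    # Keep a point only if its positional uncertainty suits the map scale in use.
    return loc.precision_m <= max_uncertainty_m


building = Location(460100, 451900, precision_m=1)        # recently excavated, precisely positioned
parish_find = Location(455000, 450000, precision_m=5000)  # known only to parish level
```

With these values, `plottable_at(building, 100)` holds but `plottable_at(parish_find, 100)` does not, so a detailed local plot would omit the parish-level find while a regional overview could still include it.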

The chronological divisions that we created were designed to reconcile period categories from different sources. As most data sources explicitly utilised conventional period categories, it was relatively easy to define an overarching structure. The categories we settled on are listed below (the last, however, lying beyond the scope of our own project), together with approximate absolute dates that could be attached to each.

PALAEOLITHIC (up to c. 8,000BCE)
MESOLITHIC (c. 8,000BCE–c. 3,800BCE)
NEOLITHIC (c. 3,800BCE–c. 2,000BCE)
BRONZE AGE (c. 2,000BCE–c. 800BCE)
IRON AGE (c. 800BCE–c. 100CE)
ROMAN (c. 100CE–c. 400CE)
MEDIEVAL (c. 1100CE–c. 1550CE)
POST-MEDIEVAL/MODERN (c. 1550CE–present day)
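
The period list can be held as a small lookup table. The sketch below is an assumption about representation, not the project's implementation: BCE years are written as negative integers, `None` marks an open-ended boundary, and ranges are treated as half-open so that a boundary year belongs to the later period. The function name `period_for_year` is hypothetical.

```python
from typing import Optional

# (name, start, end) in calendar years; BCE years are negative.
# None marks an open-ended boundary. Dates are the approximate ones listed above.
PERIODS = [
    ("PALAEOLITHIC", None, -8000),
    ("MESOLITHIC", -8000, -3800),
    ("NEOLITHIC", -3800, -2000),
    ("BRONZE AGE", -2000, -800),
    ("IRON AGE", -800, 100),
    ("ROMAN", 100, 400),
    ("MEDIEVAL", 1100, 1550),
    ("POST-MEDIEVAL/MODERN", 1550, None),
]


def period_for_year(year: int) -> Optional[str]:
    """Return the period containing `year`, or None if no listed range covers it."""
    for name, start, end in PERIODS:
        after_start = start is None or year >= start
        before_end = end is None or year < end  # half-open: a boundary year starts the later period
        if after_start and before_end:
            return name
    return None
```

For example, `period_for_year(-3000)` returns `"NEOLITHIC"` and `period_for_year(-800)` returns `"IRON AGE"`; a year covered by none of the listed ranges returns `None`.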

In deciding on the above list, we were aware that the stated boundaries are approximate and vary somewhat between different institutions: in every case, the dates given above can be contested. Indeed, they are much debated, a process that can end in doubting the utility or viability of the chronological category as a whole (see further discussion). In addition, some of the data was held at a finer level than our categories allowed. For example, certain sources might record entries within particular periods as having early, middle and late components. However, there was no great consistency between different institutions in choosing the boundaries of such sub-divisions: a museum might divide the Roman period between early (AD70–AD150), middle (AD150–AD300) and late (AD300–AD410); an SMR might distinguish early- (AD50–AD200) from late- (AD200–AD400) Roman entries; and a commercial excavation might talk of the period as a whole. Where such greater precision exists, however, it has been incorporated into our underlying database and can be accessed by drilling down to that lower level. Naturally, to make use of data with this increased level of resolution, any researcher would have to understand the terminology and associated boundaries employed by the individual organisation concerned.
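
The point about inconsistent sub-divisions can be made concrete with a small lookup keyed by source as well as label. The date ranges are those quoted in the example above; the keys `"museum"` and `"smr"`, and the helper names `date_range` and `overlap`, are hypothetical.

```python
# Roman sub-period boundaries (years AD) for the two kinds of source in the example above.
SUBPERIODS = {
    ("museum", "early"): (70, 150),
    ("museum", "middle"): (150, 300),
    ("museum", "late"): (300, 410),
    ("smr", "early"): (50, 200),
    ("smr", "late"): (200, 400),
}


def date_range(source: str, label: str) -> tuple:
    """Resolve a sub-period label using the boundaries of the source that applied it."""
    return SUBPERIODS[(source, label)]


def overlap(a: tuple, b: tuple) -> int:
    """Number of years two ranges share (0 if they do not overlap)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


# The museum's 'middle' Roman shares 50 years with the SMR's 'early' Roman.
shared = overlap(date_range("museum", "middle"), date_range("smr", "early"))
```

Because the same label resolves to different ranges depending on the source, a label such as 'early' cannot be compared across institutions without first translating it into dates.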

The date of each of our entries had to be classified at one of two levels of resolution: those which could be unambiguously attributed to a period (for example, a Bronze Age urn), and those possibly belonging to a particular period (for example a settlement site seen by aerial photography and recorded as being 'late-prehistoric' in date). Making this distinction is obviously vital, as each set of data derived from these date allocations has different archaeological implications. These differences show up when one compares two distributions: one displaying entries which might be attributable to a particular period, for example belonging to the Neolithic (Figure 2a), and a second displaying entries which were definitely of a given date (Figure 2b).

Such disparities highlight current limitations to our knowledge of the ancient landscapes, monuments and artefacts known from the region, thus indicating areas where existing evidence would benefit from further research to clarify and refine its chronology. In addition, by noting variations in the patterning of 'ONLY' distributions (entries definitely of a given period) and 'ALL' distributions (entries that might be attributable to it), and then relating them to functional attributes, it proved possible to draw out some more wide-ranging implications for how we approach the region's archaeology (the latter ideas are discussed in the next section of this article).
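
The two levels of chronological resolution can be sketched as a flag on each period attribution. This is a minimal illustration with invented sample records; it assumes that an 'ALL' distribution includes definite as well as possible attributions, and the names `Attribution`, `only`, `all_` and `uncertain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    entry_id: int
    period: str
    definite: bool  # True: unambiguously of this period; False: only possibly


ATTRIBUTIONS = [
    Attribution(1, "BRONZE AGE", True),    # e.g. a Bronze Age urn
    Attribution(2, "BRONZE AGE", False),   # e.g. a 'late-prehistoric' cropmark site
    Attribution(2, "IRON AGE", False),
    Attribution(3, "IRON AGE", True),
]


def only(period: str) -> set:
    """Entries definitely of the period (the 'ONLY' distribution)."""
    return {a.entry_id for a in ATTRIBUTIONS if a.period == period and a.definite}


def all_(period: str) -> set:
    """Entries definitely or possibly of the period (assumed meaning of 'ALL')."""
    return {a.entry_id for a in ATTRIBUTIONS if a.period == period}


def uncertain(period: str) -> set:
    """Entries that appear in 'ALL' but not in 'ONLY'."""
    return all_(period) - only(period)
```

The set returned by `uncertain` corresponds to the entries whose chronology would benefit from further work.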

Defining the functional divisions which could reconcile the holdings of diverse organisations proved much more difficult than accommodating the spatial or chronological components of their data. For a number of the institutions included in the project, the terminology that they employed to categorise their material had evolved gradually, and had been applied inconsistently. In some cases, systems had also come to embody a number of hierarchically related names as if they existed on the same ontological level. Indeed, even in cases where institutions held this information using an explicit, clearly defined thesaurus of terms, it proved challenging to incorporate them into a single, overarching structure.

Eventually, working with the terminology of most SMRs and a significant number of museums, we created a system of six headings, each with a number of sub-headings. This gave 37 distinct functional categories to which any given archaeological encounter might be attributed.

a: woodland                     a: temporary                        a: extraction
b: pasture/meadowland           b: subsistence                      b: processing
c: agricultural/horticultural   c: farmstead                        c: production
d: heathland/undeveloped        d: farmstead complex – dispersed    d: storage/exchange
e: other                        e: farmstead complex – nucleated    e: disposal
                                f: elite residence
                                g: urban settlement
                                h: conurbation

a: dyke system                  a: production                       a: recreational
b: enclosure                    b: processing                       b: legal
c: water supply/drainage        c: storage/exchange                 c: educational
d: other water related          d: disposal                         d: medical
e: boundary marker                                                  e: sacral
f: road/trackway                                                    f: funerary
                                                                    g: ceremonial/commemorative
                                                                    h: military (structural)
                                                                    i: military (event)
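
The two-level scheme can be written down as a nested mapping, which also lets the stated total of 37 categories be checked. The sub-headings are transcribed from the table above; the names of the six headings are not reproduced there, so the keys `heading_1` to `heading_6` are placeholders only, and the `code` helper is hypothetical.

```python
# Sub-headings transcribed from the table above. The six heading names are not
# given there, so the keys "heading_1".."heading_6" are placeholders only.
FUNCTIONS = {
    "heading_1": ["woodland", "pasture/meadowland", "agricultural/horticultural",
                  "heathland/undeveloped", "other"],
    "heading_2": ["temporary", "subsistence", "farmstead",
                  "farmstead complex - dispersed", "farmstead complex - nucleated",
                  "elite residence", "urban settlement", "conurbation"],
    "heading_3": ["extraction", "processing", "production", "storage/exchange", "disposal"],
    "heading_4": ["dyke system", "enclosure", "water supply/drainage",
                  "other water related", "boundary marker", "road/trackway"],
    "heading_5": ["production", "processing", "storage/exchange", "disposal"],
    "heading_6": ["recreational", "legal", "educational", "medical", "sacral", "funerary",
                  "ceremonial/commemorative", "military (structural)", "military (event)"],
}


def code(heading: str, index: int) -> str:
    """Build a category code such as 'heading_2:c' from a heading and list position."""
    return f"{heading}:{'abcdefghi'[index]}"


TOTAL = sum(len(subs) for subs in FUNCTIONS.values())  # 5 + 8 + 5 + 6 + 4 + 9
```

Note that sub-heading names such as 'production' and 'processing' recur under more than one heading, so a category is identified only by heading and sub-heading together, which is why the sketch builds a combined code.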

The resulting database can generate, on a period-by-period basis, distributions across the region for each higher-order or lower-order category, or for any combination of them (for example Mesolithic habitation sites, Neolithic funerary practice, Bronze Age land enclosure). Equally, we can look at particular types of activity across period divisions (for example late prehistoric ceremonial sites, or Iron Age and Roman farmsteads). However, to make sense of any of these archaeological distributions, we had to take on board the fact that current land-use and administrative structures have affected the intensity of past fieldwork and data storage, while underlying geological formations influenced ancient settlement and land exploitation and still affect present site visibility.
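
Queries of this kind amount to filtering on period and function together. The sketch below is a hedged illustration using an in-memory list rather than the project's database; the `Record` type, the `select` function and the sample records are all invented for the example.

```python
from typing import Iterable, List, NamedTuple, Set


class Record(NamedTuple):
    entry_id: int
    period: str
    function: str


def select(records: Iterable[Record], periods: Set[str], functions: Set[str]) -> List[int]:
    """Return ids of records matching any listed period AND any listed function."""
    return [r.entry_id for r in records
            if r.period in periods and r.function in functions]


RECORDS = [
    Record(1, "IRON AGE", "farmstead"),
    Record(2, "ROMAN", "farmstead"),
    Record(3, "ROMAN", "funerary"),
    Record(4, "NEOLITHIC", "funerary"),
]

# A cross-period query: Iron Age and Roman farmsteads.
farmsteads = select(RECORDS, {"IRON AGE", "ROMAN"}, {"farmstead"})
```

Passing sets for both arguments lets the same function serve single-period queries (Neolithic funerary practice) and cross-period ones (Iron Age and Roman farmsteads).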

In order to help understand the implications of the above factors, we incorporated 'underlays' into our database, against which the various types of archaeological evidence might be set. We needed to view our distributions in relation to institutional boundaries such as county divisions or National Parks, and in relation to the present coastline and river courses, road network, and modern settlements. This information is not only essential to locate archaeological distributions readily, but also serves a more analytical purpose. For example, rivers have obviously been important through time for the resources associated with them and the communications corridors they provide. Equally, contemporary settlements and roads dictate the presence and movement of modern humans in the landscape, thus influencing the distribution of known archaeological material. In addition, we included vertical aerial photographs to provide evidence of current land-use, another major influence on the preservation and visibility of archaeological remains. For example, arable cultivation may create good opportunities for aerial photography, whereas forested land may obscure features from the air, and military landholdings may have prevented overflying altogether, at least before the recent advent of satellite imagery. Finally, we incorporated information on the distribution of land of different quality, on elevation and topography, and on solid and drift geology. For example, solid geology dictates access to mineral sources, thus impacting on human activity and settlement in all periods, and drift geology affects farming regimes in the past and site visibility in the present.

Thus far, the project has managed to incorporate data from a wide range of sources, giving a total of almost 60,000 entries, represented by c. 45,000 separate data points (some points cover material from more than one period or indicate more than one function). Naturally, additional work would allow us to incorporate the holdings of a wider range of institutions and to enhance the quality of the data generated by recent commercial projects. We might also extend our work to other types of data, such as that generated by ground-based geophysics and aerial photography, produced by surveys of standing buildings, or incorporated in documentary sources. We could also look more thoroughly at later periods, for example the post-medieval/early modern (included in our project only on the basis of selected case studies). Finally, we need to test patterning more systematically, to make the data accessible to a wider range of users, and to define mechanisms for updating it on a regular basis. However, even at this less-developed stage, it is clear that the distribution of data for our various period and functional categories shows definite patterning that demands explanation.
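
The difference between the number of entries and the number of data points can be illustrated with a toy example. It assumes, as one plausible reading of the figures above, that each period/function attribution recorded at a point counts as one entry; the point names and attributions are invented.

```python
# Each data point is a location; each (period, function) pair recorded there is one entry.
POINTS = {
    "point_a": [("BRONZE AGE", "funerary")],
    "point_b": [("IRON AGE", "farmstead"), ("ROMAN", "farmstead")],  # two periods
    "point_c": [("ROMAN", "production"), ("ROMAN", "disposal")],     # two functions
}

n_points = len(POINTS)                                    # 3 separate data points
n_entries = sum(len(pairs) for pairs in POINTS.values())  # 5 entries in total
```

On this reading, three points yield five entries, in the same way that c. 45,000 points yield almost 60,000 entries in the database.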


© Internet Archaeology URL:
Last updated: Mon Nov 26 2007