7. Related Work

This section surveys existing efforts in archaeological data representation and management, and compares the Faceted Query Engine with these systems.

7.1 Faceted navigation and search over text documents

Archaeo-Browser from the Common Information Environment is a search tool that provides unified access to information on the historic environment from a range of public and academic institutions in the UK. The tool enables faceted search over a repository of articles. To search for articles of interest, the user may select one or more categories, and documents that match all selected categories will be returned as results. In addition to filtering results by category, the user may choose to filter by keyword or a set of keywords, as well as by profile – a pre-defined label or set of labels. Keyword, category, and profile-based search can be combined within the same query.
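The matching behaviour described above can be illustrated with a minimal Python sketch. It is not Archaeo-Browser's implementation: the `Article` type, its field names, the `faceted_search` function and the sample data are all hypothetical, and we assume that every supplied keyword must occur in the article (the text above does not say whether keywords are combined with "and" or "or").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical repository entry; field names are illustrative only."""
    title: str
    categories: frozenset
    text: str = ""


def faceted_search(articles, categories=(), keywords=()):
    """Return articles that carry ALL selected categories and ALL keywords.

    Assumption: keywords are matched as case-insensitive substrings.
    """
    wanted = set(categories)
    terms = [k.lower() for k in keywords]
    results = []
    for article in articles:
        # Conjunctive facet filter: every selected category must be present.
        if not wanted <= article.categories:
            continue
        body = f"{article.title} {article.text}".lower()
        if all(term in body for term in terms):
            results.append(article)
    return results


# Invented sample data, for illustration only.
ARTICLES = [
    Article("Roman villa survey", frozenset({"Roman", "Settlement"}), "mosaic floors"),
    Article("Roman coin hoard", frozenset({"Roman", "Finds"}), "silver denarii"),
    Article("Medieval abbey", frozenset({"Medieval", "Settlement"}), "cloister remains"),
]

roman_settlements = faceted_search(ARTICLES, categories=["Roman", "Settlement"])
```

Under this sketch a profile would simply be a stored set of labels passed as `categories`, so category, keyword and profile filters compose within one call.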

The Faceted Query Engine combines faceted navigation capabilities with an expressive data model and query language, and allows for more sophisticated data representation and manipulation compared to Archaeo-Browser. Nonetheless, projects like the Archaeo-Browser demonstrate the appropriateness of the faceted model for archaeology.

7.2 Semi-structured representations of archaeological data

The Online Cultural Heritage Research Environment (OCHRE, formerly XSTAR) is an XML-based system for textual and archaeological research. The OCHRE project defines an Archaeological Markup Language (ArchaeoML) Schema that adheres to the W3C XML Schema specification, and implements a graphical user interface that, when completed, will support data maintenance and querying.

OCHRE allows for very flexible and expressive modelling of cultural heritage data by adopting an item-based approach (a term coined by the authors of OCHRE). The framework defines a set of 20 general categories that describe different aspects of the cultural heritage domain, and places items (entities, units of observation) into one of these categories. However, categories do not rigidly prescribe a schema: each item may describe additional attributes of its own. The authors of OCHRE argue that item-based modelling is required in complex domains, and that the class-based approach taken by the relational model does not give adequate expressiveness. However, as we argued in Section 3.3, albeit in a slightly different setting, flexibility and expressiveness come at the price of higher conceptual complexity. The Faceted Query Engine overcomes the rigidity of relational modelling and of monolithic hierarchies by offering a flexible framework that is still class-based. Unlike OCHRE, our model allows the expression of complex hierarchies and cross-referencing of the data while keeping the conceptual complexity low. In our model an item may belong to multiple classes; the item's attributes are always defined by the classes to which it belongs, and not by the individual item itself.
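The class-based rule stated in the last sentence can be sketched in Python as follows. This is an illustration under our own assumptions, not the engine's code: `ItemClass`, `Item`, and the example classes and attribute names are hypothetical.

```python
class ItemClass:
    """Hypothetical class definition: a name plus the attributes it declares."""

    def __init__(self, name, attributes):
        self.name = name
        self.attributes = frozenset(attributes)


class Item:
    """An item may belong to several classes; only their attributes are allowed."""

    def __init__(self, item_id, classes, values):
        self.item_id = item_id
        self.classes = tuple(classes)
        # Reject item-specific (ad hoc) attributes: the schema comes from classes.
        unknown = set(values) - self.allowed_attributes()
        if unknown:
            raise ValueError(f"attributes not declared by any class: {sorted(unknown)}")
        self.values = dict(values)

    def allowed_attributes(self):
        """Union of the attributes declared by all classes of this item."""
        allowed = set()
        for cls in self.classes:
            allowed |= cls.attributes
        return allowed


# Invented example classes, for illustration only.
FIND = ItemClass("Find", {"context", "material"})
BEAD = ItemClass("Bead", {"diameter_mm", "colour"})

bead = Item("F12", [FIND, BEAD], {"material": "glass", "colour": "blue"})
```

In an item-based model the constructor would instead accept arbitrary extra attributes per item; the check above is what keeps the schema tied to classes.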

7.3 Structured representations

The Integrated Archaeological Database (IADB), developed by York Archaeological Trust, has been used as a platform in several projects, including the Silchester Virtual Research Environment and Walmgate. IADB models archaeological data as Finds and Contexts, referred to as primary records. At the post-excavation analysis stage, the IADB provides a means of grouping these primary records into Sets, Groups and Phases. An additional datatype, Objects, allows the addition of documentation and interpretation to primary records.

Like the Faceted Query Engine, IADB implements browsing and navigation facilities, enriched by cross-references. However, unlike our solution, IADB employs monolithic (as opposed to faceted) hierarchical organisation: each hierarchically organised entity always belongs to a single place in the hierarchy. So, in the IADB framework, each Find is a member of a single Context, each Context is a member of a single Set, and so on. In contrast, the Faceted Query Engine allows an entity to be placed into several positions in the hierarchy. Please refer to Section 4 for more information regarding this distinction.
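The difference between the two kinds of hierarchy can be made concrete with a small Python sketch. It reflects neither IADB's nor the Faceted Query Engine's actual implementation; the class names `MonolithicHierarchy` and `FacetedHierarchy` and the example labels are hypothetical.

```python
from collections import defaultdict


class MonolithicHierarchy:
    """Each entity occupies exactly one place: at most one parent."""

    def __init__(self):
        self._parent = {}

    def place(self, entity, parent):
        current = self._parent.get(entity)
        if current is not None and current != parent:
            raise ValueError(f"{entity!r} already belongs to {current!r}")
        self._parent[entity] = parent

    def parents(self, entity):
        return {self._parent[entity]} if entity in self._parent else set()


class FacetedHierarchy:
    """An entity may be placed at several positions, e.g. one per facet."""

    def __init__(self):
        self._parents = defaultdict(set)

    def place(self, entity, parent):
        self._parents[entity].add(parent)

    def parents(self, entity):
        return set(self._parents.get(entity, ()))


# Invented example: the same find filed by context and by material.
faceted = FacetedHierarchy()
faceted.place("Find 12", "Context 101")
faceted.place("Find 12", "Glass beads")
```

With `MonolithicHierarchy`, the second placement of "Find 12" under a different parent would raise an error, which mirrors the single-membership rule described above.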

Another important difference between IADB and the Faceted Query Engine is the availability of query facilities. For data analysis that goes beyond browsing and navigation, IADB allows its users to pose SQL queries against the relational tables where the data is stored. Such query facilities are powerful, but writing SQL queries is burdensome for non-technical users. The Faceted Query Engine implements query facilities as part of its interface. Please refer to Section 3.3 for a discussion of this point.

7.4 Ontologies

The Comité International pour la Documentation (CIDOC) developed a conceptual reference model (CRM), a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. CIDOC CRM is a rich hierarchical model that supports multiple-inheritance hierarchies (ones where an entity has more than one parent) on both classes and attributes.

The main conceptual difference between our approach and the CIDOC CRM is, again, the representation of hierarchies: CIDOC uses monolithic hierarchies, whereas ours are faceted (see Section 4). Most features of the cultural heritage domain laid out by the CIDOC specification can be represented by the faceted data model, and manipulated using our query language. We are currently working on incorporating hierarchies on attributes into the model (see Section 8 for details).

7.5 Controlled vocabularies

The need for a standard vocabulary for describing archaeological data has been recognised by the community, and several projects are under way to address this issue. While we have not utilised a standard vocabulary for the Thulamela or Memphis projects, a schema designer could use our system in combination with a standard ontology.

The SPECTRUM Terminology project by MDA is an on-line thesaurus that consists of a repository of cultural heritage terminologies, along with tools and guidelines for creating terminologies. The MDA Archaeological Object Thesaurus is a hierarchically organised list of common archaeological terms.

CIDOC created a time-period thesaurus as part of its data standardisation and integration effort (Doerr et al. 2004).


© Internet Archaeology URL:
Last updated: Mon April 30 2007