4.4 Semantic implementation issues

The user interface is intended to demonstrate the research aims while shielding the user from the complexities of the underlying ontology. It combines various interface elements that, in an operational system, could be separated and given more space, leaving room to show additional details alongside the results. It also illustrates the potential of structured semantic search, which has been relatively under-explored. More common semantic web user interfaces follow a faceted 'filter-flow' browsing paradigm (e.g. Hearst et al. 2002), as seen in the ADS ArchSearch system or in triple store implementations such as CLAROS, MuseumFinland, or slashfacet. Alternatively, an initial string search (as in the Pilot) delivers results with accompanying semantic links, which can be short-cuts for ontology relationship chains.

While the user interface permits complex queries to be constructed, there are known performance issues with SPARQL generally, and with free text queries in particular. Consequently, some complex searches, or free text queries without semantic restriction, may time out on the current server. More recent SPARQL platforms and future versions of SPARQL are expected to improve performance. Additionally, targeted indexing of the underlying triple store database has been found helpful in the CLAROS project. Another solution found in semantic applications is the prior computation of common 'short cuts', which is particularly relevant to some of the long CRM relationship chains.
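The prior computation of 'short cuts' mentioned above can be illustrated with a minimal sketch. It assumes the triples are held in memory as plain tuples; the function name materialise_shortcut and the ex: property names are hypothetical and are not CRM or CRM-EH identifiers. The idea is that a multi-step relationship chain is followed once, ahead of query time, and the result is stored as a single direct triple.

```python
from collections import defaultdict


def materialise_shortcut(triples, chain, shortcut):
    """Return (start, shortcut, end) for every path that follows `chain` in order."""
    # Index objects by (subject, predicate) so each hop is a dictionary lookup.
    objects = defaultdict(set)
    for s, p, o in triples:
        objects[(s, p)].add(o)

    shortcuts = set()
    # Only subjects that carry the first predicate can start a chain.
    for start in {s for s, p, _ in triples if p == chain[0]}:
        frontier = {start}
        for predicate in chain:
            frontier = {o for node in frontier
                        for o in objects.get((node, predicate), ())}
        # An empty frontier means the chain was broken; nothing is added.
        shortcuts.update((start, shortcut, end) for end in frontier)
    return shortcuts


# Hypothetical data: a find located in a context that belongs to a group.
TRIPLES = {
    ("find:1", "ex:foundIn", "context:10"),
    ("context:10", "ex:withinGroup", "group:A"),
    ("find:2", "ex:foundIn", "context:11"),  # context:11 has no group recorded
}

SHORTCUTS = materialise_shortcut(
    TRIPLES, ["ex:foundIn", "ex:withinGroup"], "ex:foundInGroup")
```

A query that needs the group of a find could then match the single ex:foundInGroup triple instead of joining across the full chain, at the cost of storing and maintaining the extra triples.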

Another limitation of the SPARQL platform currently employed is its lack of support for concept-based expansion (Tudhope et al. 2006) and inferencing. While the pilot system (section 3) afforded concept expansion in an initial string search, a similar facility is not available in the SPARQL concept search in the Demonstrator. For example, concept expansion over types using SKOS thesauri could allow advanced query options that combined (say) hearth and hearth: debris (as discussed in section 3.1), avoiding the need for separate queries. Alternatively, concept expansion could incorporate an 'entry vocabulary', such as mosaic for floor: tessellated, or effective synonyms for metal, so that advantage could be taken of the synonyms and semantic relationships in a thesaurus.
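As a rough illustration of concept expansion, the sketch below maps a user's term to a preferred concept through an entry vocabulary and then collects its narrower concepts, so that one query can cover all of them. The toy thesaurus is an assumption made for the example: the narrower link between hearth and hearth: debris is illustrative only and is not taken from the actual thesauri, and the names NARROWER, ENTRY_VOCABULARY and expand_concept are hypothetical.

```python
from collections import deque

# Toy thesaurus (illustrative only): preferred concept -> narrower concepts.
NARROWER = {
    "hearth": ["hearth: debris"],
    "floor": ["floor: tessellated"],
}

# Entry vocabulary: non-preferred term -> preferred concept.
ENTRY_VOCABULARY = {
    "mosaic": "floor: tessellated",
}


def expand_concept(term, narrower, entry_vocabulary, max_depth=None):
    """Return the preferred concept for `term` followed by its narrower concepts."""
    preferred = entry_vocabulary.get(term, term)
    expanded = [preferred]
    seen = {preferred}
    queue = deque([(preferred, 0)])
    while queue:
        concept, depth = queue.popleft()
        # Stop descending once the optional depth limit is reached.
        if max_depth is not None and depth >= max_depth:
            continue
        for child in narrower.get(concept, []):
            if child not in seen:
                seen.add(child)
                expanded.append(child)
                queue.append((child, depth + 1))
    return expanded


HEARTH_TERMS = expand_concept("hearth", NARROWER, ENTRY_VOCABULARY)
MOSAIC_TERMS = expand_concept("mosaic", NARROWER, ENTRY_VOCABULARY)
```

The expanded list could then be used to restrict a single query to any of the returned concepts, rather than issuing one query per concept. The optional max_depth parameter is one possible way of letting the user control how far the expansion reaches.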

Another avenue of future work concerns more advanced use of inferences over the CRM-EH relationships. The current Demonstrator makes some use of the semantic structure, as in the ability to query over stratigraphic relationships. Additional reasoning over CRM-EH classes and properties could, for example, allow advanced query options that took into account both groups and contexts, as in the furnace scenario discussion. It could also support search over a wide range of physical relationships, in addition to or in combination with stratigraphic relationships (as discussed in section 4.1).

Further inferences would permit a 'transitive' query, whose scope extended to a complete group hierarchy, or to all contexts stratigraphically above a particular context, as opposed to just the immediately connected ones. Such a development would be significant, in that it would permit the varying practices followed by different archaeological organisations to be reflected within the CRM-EH framework and to be integrated semantically via advanced search options. For example, RRAD and MoLA tend to follow a group/subgroup/context structure, whereas LEAP tends to follow a group/context pattern, including more context hierarchies. These differences of recording practice could be reflected within the semantic model but be meaningfully connected in search results.
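A 'transitive' query of this kind amounts to following a relationship for one or more steps rather than exactly one. The sketch below shows the idea over in-memory dictionaries; the identifiers and the names reachable, IMMEDIATELY_ABOVE, CONTAINS and contexts_in are hypothetical and stand in for the corresponding CRM-EH relationships. SPARQL 1.1 property paths offer a comparable one-or-more-steps operator, although whether a given platform supports them is a separate question.

```python
def reachable(start, direct):
    """Return every node reachable from `start` in one or more direct steps."""
    reached = set()
    stack = list(direct.get(start, ()))
    while stack:
        node = stack.pop()
        if node in reached:
            continue  # already visited; also guards against cycles
        reached.add(node)
        stack.extend(direct.get(node, ()))
    return reached


# Hypothetical stratigraphy: context -> contexts immediately above it.
IMMEDIATELY_ABOVE = {
    "context:1": ["context:2"],
    "context:2": ["context:3", "context:4"],
}

# Hypothetical recording structures: group:A uses group/subgroup/context,
# group:B uses group/context directly.
CONTAINS = {
    "group:A": ["subgroup:A1"],
    "subgroup:A1": ["context:1", "context:2"],
    "group:B": ["context:9"],
}


def contexts_in(group):
    """Return all contexts under `group`, however many levels lie between."""
    return {n for n in reachable(group, CONTAINS) if n.startswith("context:")}


ALL_ABOVE = reachable("context:1", IMMEDIATELY_ABOVE)
```

Because contexts_in follows the containment relationship to any depth, the same call works for both recording structures, which is the kind of integration across organisational practice that the paragraph above describes.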


© Internet Archaeology/Author(s)
File last updated: Mon July 18 2011