9. Conclusion and Future Work

The ArchAIDE project was ambitious and wide-ranging, but accomplished the majority of its aims over its three short years. Much was learned not only about the use of image recognition to identify archaeological pottery, but also about the importance of access to comparative collections and of intellectual property rights. ArchAIDE partners worked hard to develop innovative technologies while still ensuring the knowledge and workflows of archaeological pottery specialists were respected. By being as outward facing as possible, both through the inclusion of Associate Partners and a comprehensive programme of communication and dissemination, the project made contacts across the archaeological community who wanted to offer ongoing support and continue to contribute; sustaining that engagement has become a challenge now that the project has formally ended. Along the way, several unexpected lessons were learned.

As the project progressed, it became evident that the comparative data necessary to implement the ArchAIDE database and app must be derived from a variety of sources, each with different advantages and restrictions. Comparative data (data that is meant to show typical pottery types and characteristics, against which pottery to be identified by the user is compared) is often useful only if it is considered authoritative. For instance, an analogue equivalent of the Roman Amphorae archive might be a particular comparative paper catalogue for Majolica of Montelupo that is accepted by specialists in that pottery type as a required citation in any peer-reviewed paper, and is also therefore considered authoritative. Roman Amphorae is free (as long as users abide by the appropriate terms and conditions of use). This is not the case for the paper catalogue described in the second example, where conversion into a dynamic digital resource was never envisioned.

While CNR developed useful tools to help digitise the authoritative paper catalogues necessary to show the technical proof of concept of the ArchAIDE app, this does not mean the ArchAIDE project necessarily now holds copyright to the newly digitised, remixed data (although the metadata created as part of this process by the ArchAIDE project can be argued to be new data, for which the project can claim copyright). Whether these data can be made available outside the proof of concept would need to be negotiated with each copyright holder, which represents a major logistical and (potentially) financial difficulty. This becomes even more complicated if the ArchAIDE app were to be monetised in any way. The issue cannot, of course, be solved by ArchAIDE, but it does provide another important proof of concept opportunity for the project. Showing the potential of digitising paper catalogues in a way that demonstrates how their content can be actively reused allows ArchAIDE to open a discussion with publishers and other data providers about the importance of making their resources available in new ways, with a tangible benefit (seeing their data in use within the app), thus furthering the long-term discourse around making research data open and accessible.

Another key lesson was the amount of training data necessary for the image recognition algorithm to return useful results. The researchers at Tel Aviv University were used to working with thousands of examples to train for a particular type, but ArchAIDE partners often struggled to obtain 10 per type. Ten per type was not ideal, but was determined to be the minimum amount that had to be acquired. This resulted in far more time and effort spent on digitising the paper catalogues and undertaking the enormous photo campaigns to capture the necessary primary data. This effort helped partners to understand the importance of working together if the humanities wish to take advantage of the many machine-learning methods now available. Datasets are small, fragmented, and rarely optimised for machine-learning applications.

It was also hoped the photo campaigns might result in new comparative collections that could be made freely available as part of the ArchAIDE archive, but intellectual property rights in many European countries are restrictive, and did not allow photos taken by ArchAIDE partners of sherds held in national and regional collections to be made available. It is hoped that seeing the usefulness of these data within an example application such as ArchAIDE may also help convince the holders of these resources to move towards more open data policies.

Figure 24 (a, b): To address the paucity of training data available, 3D models were broken into 'virtual sherds' and used to train the shape-based algorithm

One of the most fundamental lessons was that it was not possible to design an image recognition system that could identify pottery using both decoration-based and shape-based characteristics. It took considerable effort and discussion, but it became clear that it was necessary to separate them, developing two different algorithms. This allowed a creative outcome, as separating out shape-based recognition allowed the 3D models to be used to create desperately needed training data. By breaking the 3D models into 'virtual sherds' and using the sherds to train the shape-based image recognition algorithm, the accuracy rate was increased to an acceptable level (Figure 24).
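The idea of generating 'virtual sherds' from a 3D model can be illustrated with a minimal sketch. It is not the ArchAIDE implementation, whose fracturing method is not described here. It assumes a vessel can be approximated by revolving a 2D profile around its vertical axis, and that a sherd can be approximated by the points falling inside a random angular wedge and height band. The function names, parameters and the type labels `type_A` and `type_B` are all hypothetical.

```python
import math
import random


def revolve_profile(profile, n_angles=72):
    """Sweep a 2D profile [(radius, height), ...] around the vertical axis,
    giving a point cloud [(x, y, z), ...] that stands in for a 3D vessel model."""
    points = []
    for radius, height in profile:
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            points.append((radius * math.cos(theta), radius * math.sin(theta), height))
    return points


def cut_virtual_sherd(points, rng, angular_width=math.pi / 3, height_fraction=0.4):
    """Keep the points inside a random angular wedge and height band.
    This is a crude, assumed stand-in for one fragment of a broken vessel."""
    heights = [z for _, _, z in points]
    z_min, z_max = min(heights), max(heights)
    band = (z_max - z_min) * height_fraction
    z_start = rng.uniform(z_min, z_max - band)
    theta_start = rng.uniform(0, 2 * math.pi)
    sherd = []
    for x, y, z in points:
        theta = math.atan2(y, x) % (2 * math.pi)
        offset = (theta - theta_start) % (2 * math.pi)
        if offset <= angular_width and z_start <= z <= z_start + band:
            sherd.append((x, y, z))
    return sherd


def make_training_set(models, sherds_per_model=50, seed=0):
    """Turn a few labelled vessel profiles into many labelled virtual sherds."""
    rng = random.Random(seed)
    samples = []
    for label, profile in models.items():
        cloud = revolve_profile(profile)
        for _ in range(sherds_per_model):
            sherd = cut_virtual_sherd(cloud, rng)
            if sherd:  # skip the rare empty cut
                samples.append((label, sherd))
    return samples


# Hypothetical profiles: (radius, height) pairs from base to rim.
models = {
    "type_A": [(3.0, 0.0), (5.0, 2.5), (6.0, 5.0), (4.0, 7.5), (4.5, 10.0)],
    "type_B": [(2.0, 0.0), (2.5, 2.5), (3.0, 5.0), (3.5, 7.5), (5.0, 10.0)],
}
samples = make_training_set(models, sherds_per_model=10, seed=1)
```

In this sketch each model yields many labelled fragments, which is the property that makes the approach attractive when only a handful of real examples per type can be obtained.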

ArchAIDE has shown the potential of using automated image recognition to identify archaeological pottery, while also illustrating some of the challenges that will need to be addressed in future. ArchAIDE has also shown that the approach may be used for a variety of pottery types (and potentially other artefact types as well) if the necessary comparative data can be gathered, as virtually all pottery identification relies on recognition based on either the shape or decorative elements of a vessel (or both). The ArchAIDE partners hope to continue working together, further developing the ArchAIDE mobile and desktop apps, but also working collaboratively across the archaeology community to build open comparative collections for use by all.

For other publications on aspects of the ArchAIDE project, please see (Anichini and Gattiglia 2017; Anichini et al. 2018; Banterle et al. 2017a; Banterle et al. 2017b; Dellepiane et al. 2017; Gattiglia 2018; Gualandi et al. 2016; Wright and Gattiglia 2018) all of which can be downloaded from the ArchAIDE Zenodo Community.


Internet Archaeology is an open access journal based in the Department of Archaeology, University of York. Except where otherwise noted, content from this work may be used under the terms of the Creative Commons Attribution 3.0 (CC BY) Unported licence, which permits unrestricted use, distribution, and reproduction in any medium, provided that attribution to the author(s), the title of the work, the Internet Archaeology journal and the relevant URL/DOI are given.

Internet Archaeology content is preserved for the long term with the Archaeology Data Service.