Is it the Thought that Counts? An evaluation of Digital Archaeological Data Archiving in Catalonia

Archaeological data archiving has not been a major concern in Catalonia. The heritage legal corpus does not engage with the archiving and curation of archaeological data other than the excavation reports. However, it highlights the responsibility of the administration in cataloguing and disseminating the cultural heritage. For this reason, a lot of effort has been invested during the past few years in inventorying known archaeological sites and publishing archaeological reports, with the aim of increasing the transparency of the administration towards its citizens. This article describes the present situation for archaeology in Catalonia, its legal framework and the main initiatives carried out to archive, manage, and publish archaeological data from a user's point of view. Its main aim is to evaluate the current state of archaeological data archives and public databases by analysing the existing platforms with a set of indicators. This assessment leads to the conclusion that the current repositories and databases could be more worthwhile if some limitations were overcome, but also that the advance in archaeological data archiving is restricted by existing law.


Introduction
Spanish cultural heritage, defined as movable or immovable elements of artistic, historic, palaeontological, archaeological, ethnographical, scientific or technical interest, is managed by the governments of the autonomous regions (Comunidades Autónomas). Even though there is common ground established by the Heritage law of 1985 (Spain 155/1985), there are differences between the regions and their heritage legislation, especially relating to cataloguing sites and the management of reports and archaeological data.
In the case of Catalonia, cultural heritage management falls within the purview of the Culture Department, the General Directorate of Cultural Heritage and, in the specific case of archaeological heritage, the Archaeological Service of Catalonia. Since 1993, when the Catalan heritage law was issued (Catalonia 9/1993), different repositories and platforms have been developed by the Catalan administration to archive, manage and disseminate archaeological information. However, this has also been a period of evolution for archaeological fieldwork, and the changes experienced have set new possibilities and challenges for heritage management.
This article describes the current situation in Catalan archaeology, outlining the organisation and management of archaeological fieldwork. It also provides an overview of the state of archaeological data archives and public databases. To do so, a set of descriptors has been selected and applied, where possible, to each platform (repository, database, etc.). These indicators are described in Table 1. The final aim of the article is to analyse and evaluate the current situation on this issue in Catalonia. It must be said that all descriptions and analyses are made from the point of view of a user, and therefore represent an external and inevitably subjective opinion. The main strengths, limitations and future challenges of the existing repositories and platforms are also addressed.

Archaeology in Catalonia Today
In Catalonia, municipal councils are responsible for knowing, inventorying and cataloguing the archaeological elements that exist in their territory. This information is subsequently coordinated by the regional administration, where the management of cultural heritage falls under the responsibility of the Culture Department and its General Directorate of Cultural Heritage (Vila Recio 2006). Inside this department, the Archaeological Service of Catalonia (SAC) controls the archaeological and palaeontological interventions and evaluates the archaeological heritage, determines and assures its safeguarding, and promotes archaeological research (Miró i Alaix 2004). Inside this same department, the Technical Support and Inventory Service of the Archives, Libraries, Museums and Heritage Directorate (SSTI) is responsible for the coordination of the inventory and cataloguing of heritage elements and the associated data, as well as their protection.
Archaeological interventions are classified by their purpose. Interventions that derive from a research project instigated by universities or research institutions (research or programmed interventions) (Miró i Alaix 2004) are treated slightly differently from those not motivated by a research project, which can be carried out in advance of land development projects (preventive interventions) or prompted by the unexpected finding of archaeological remains during construction work (urgent or salvage interventions).
In 2019, there were 1024 archaeological interventions in Catalonia, most of which (868, or 85% of the total) were rescue and salvage excavations, while 157 (15%) derived from a research project. While the number of rescue interventions has fluctuated over recent years in relation to the construction market, the number of research interventions has been slowly growing (Generalitat de Catalunya 2019).

Legal framework
In Catalonia, the archaeological and palaeontological heritage is regulated by the law on cultural heritage issued in 1993 (Catalonia 9/1993), which establishes the definition, protection, study and management of cultural heritage. This law builds upon the National Heritage law of 1985 (Spain 16/1985), which determined the creation of the entities that are now managing archaeological interventions and archaeological heritage, and their responsibilities.
Since then, even though archaeological fieldwork and research have undergone profound changes, no further major legislation has been written. The 1993 law was only updated by a decree issued in 2002 (Catalonia 78/2002), during the construction boom, which defined the current workflow to manage and carry out archaeological interventions. It described the procedure to be followed in order to authorise an intervention, the responsibilities of actors during the full process, where the finds must be deposited, and which reports must be written. This decree also established that all materials and inventories coming from an archaeological intervention must be accessible to everyone, even though no additional details were given on how, or under what conditions, this requirement should be met (art. 28.1).
In addition, a comprehensive plan (PIACAT) was written in 2009 and implemented in the early 2010s (Departament de Cultura i Mitjans de Comunicació 2009). It was drafted after a participatory process with different stakeholders within the archaeological community, and implied changes in archaeological practice (Junyent 2009). None of the aforementioned laws, decrees or plans make much reference to archiving the archaeological data or sharing the data collected during archaeological interventions or analysis of archaeological artefacts. As a result, the project directors (see further discussion below) are not specifically obliged to publish or share any kind of information or raw data, other than the report, and the administration has no responsibility to provide a public repository for archaeological data.

Archaeological fieldwork
Before starting any archaeological intervention, archaeologists must ask for permission from the SAC. This entity is responsible for evaluating the project, its promoters, and directors, and for granting licences. The only requirement to direct an archaeological intervention is to hold a degree in history or archaeology and not to have any overdue reports, as established in the 2002 decree (Catalonia 78/2002).
Within three months after conducting fieldwork, project directors are expected to submit an initial report summarising the actions performed and the preliminary results. They then have two years to write a more complete report, which will be evaluated by the SAC and then archived.

Reporting and documentation
During the two years allowed for writing the report documenting the works carried out and their results and interpretations, projects must also document the analysis performed on the materials and samples collected. This report, considered an objective account of the data collected during the intervention and all the results from the scientific study (Departament de Cultura Direccio 2016; Departament de Cultura i Mitjans de Comunicació 2009, art. B.I), is mandatory. If the report has not been submitted after two years, project directors cannot undertake further archaeological work until the outstanding report is submitted to the administration (Catalonia 78/2002).
The finds and find inventories must be available for public consultation, and must be given to a public institution (museum, public cultural heritage storage facility) together with a copy of the report. This report is considered of public interest and the archaeologist(s) must provide access to it on request, even if it has not yet been published by the administration.
The required content of the final report is also defined in the 2002 decree (Catalonia 78/2002, art. 12.1). It includes both structural recommendations (sections such as motivation, goals, description of the work, methodology, conclusions, and interpretations) and specific content and documentation (such as stratigraphic, planimetric and photographic records, and an inventory of the finds).
A more detailed explanation of the requirements for the report is included in the style manual published by the Archives, Libraries, Museums and Heritage Directorate (Departament de Cultura Direccio 2016), which is considered to be under constant revision and improvement. Although this document has no legal power, it describes Submission was normally made on paper, together with a CD containing the images from the report, the planimetric files, and a copy of the text of the report. Since the COVID-19 pandemic, submission has been made online only.
After being accepted by the SAC, the reports are managed by the SSTI. This service archives a paper copy of each report (a second paper copy is archived in the corresponding territorial office of the SAC) and coordinates the digitisation and archiving of the reports into the digital report repository (Calaix). Apart from the report and all the data that it must include, there is no other official interest in the archiving, curation or preservation of the data collected during the research related to the archaeological intervention.

Digital Infrastructures for Archaeological Data
The Catalan public administration (mainly via the SSTI and the SAC) has created different digital infrastructures that make data and information about the archaeological heritage accessible, as is its responsibility as stated in legislation.
Most of the data available come from the administration itself. The situation is constantly evolving, and many tools have changed in the last five years, which is a good sign of the administration's interest in these platforms and this issue.

Immovable Cultural Heritage Inventory (Invarque)
The SSTI, together with the SAC, keeps an updated inventory of all the known cultural heritage in Catalonia (Invarque), considered the main information source for archaeological sites in Catalonia (Departament de Cultura i Mitjans de Comunicació 2009, art. B.I). It was started in 1982 (Vila Recio 2006) and has its origin in archaeological mapping (Cartes Arqueològiques). Together with the Architectural Heritage Inventory and other similar inventories, it constitutes the Cultural Heritage Inventory detailed in the 1993 law, which aims to 'allow systematic documentation and collection, investigation and dissemination of all the heritage elements included' (Catalonia 9/1993). It is also established that public access to the data within the inventory must be granted by the Administration.
The database is freely accessible online, but the data are not downloadable. In June 2021, the inventory database included information about 11,954 sites, some of which have not been studied but have been documented or identified during visual surveys. The data are being constantly updated and completed with information collected from the archaeological reports of the latest interventions. Despite this there is a considerable backlog, and the data are not up to date; as of July 2021 the latest data were from 2015.
Each record has a standardised form, and the data available include the name of the site (Name, Other names), its location (Location, Situation), the approximate time period and chronology (Chronologies), and the site type (Type). To prevent looting, the exact site coordinates are not shown publicly, but the approximate location of the site is given by the name of the municipality to which it belongs. Since 2005, the database has also been associated with a GIS (Geographic Information System) which includes georeferencing of the sites. The records can be browsed by filtering using the name of the site, location (comarca and/or municipality), type of site, chronology, and type of legal protection.
The Catalan version of the Getty Art and Architecture Thesaurus (AAT) is used in the Inventory, primarily in the fields Chronology and Typology.

eGIPCI
The same database used to be accessed through the platform eGIPCI, which stands for eGestió Integral del Patrimoni Cultural Immoble (Integrated e-Management of Immovable Cultural Heritage). This platform was developed in 2005 to allow designated cultural heritage stakeholders (researchers, administration workers, institutions, municipalities, and archaeological companies) to access the geographical cultural heritage data. It was maintained until 2015, when the new heritage GIS was created (Geoportal). eGIPCI continued to operate after that date, but in June 2021 its geographical viewer ceased to be available, as the platform was considered outdated and was no longer maintained. Most of this information can now be accessed through the Geoportal of Cultural Heritage.
eGIPCI was initially designed for registered users to query geographical heritage data connected to the inventory after acceptance of a justified petition. To access the information available in eGIPCI, users are asked their name, national ID number, address, affiliation, email address, and the reason for their request.

Geoportal
The Cultural Heritage Geoportal was implemented in the mid 2010s, built on the Geographical Information System developed in 2005 and connected to the Immovable Cultural Heritage Inventory and other complementary datasets. The platform grants access to the geographical information of Catalan archaeological heritage and is open and publicly accessible. It now maps around 25,000 monuments, 12,000 archaeological sites and 400 palaeontological sites (Redacció 2017), which is more than 90% of the catalogued cultural heritage items in the Invarque inventory (Departament de Cultura 2017).
The Geoportal is considered a tool for use in preventive archaeology, although its main goal is to publish and disseminate the data. In June 2021, it was still in development, and new features are expected to be added in the future, such as data download. The dataset is being updated, as the digitisation of the geometries of heritage elements has been an ongoing task for the past few years.
The main layer of the GIS is a vector file with the geometry of the catalogued archaeological sites (polygons). Although the data comes from the Invarque inventory, only a few fields of each record (Name, Other names, Municipality, Protection and Classification, UTM X, UTM Y, Latitude, Longitude) are displayed in this platform, and no link to the same record in the inventory is available. No other data or metadata are available. The site records can be browsed with searches over these fields, and the user can also apply some basic analysis tools in combination with multiple layers such as property registry maps, municipalities, etc.
The platform uses standard technologies from the Open Geospatial Consortium (OGC), which allow interoperability of the data (Redacció 2017). All datasets (layers) are expected to be downloadable in the future.

Interventions Map
The Interventions Map was created in 2017 by the local branch offices of the SAC with the goal of visualising data about the archaeological interventions carried out in Catalonia. The map is publicly and freely accessible, and it displays basic data about archaeological interventions and their promotion, coming from the database used by the SAC to manage intervention petitions. Currently, the datasets containing archaeological intervention data have no more than 1477 records in total, as only data from 2014 to 2018 from Barcelona and surroundings are available. However, it is planned to extend it to the whole Catalan territory in the future.
The data are organised in vector-based layers where each layer contains the data about the interventions that took place in a specific year, digitised in points. Each record contains information about the name of the intervention (Name), a photo, the director and/or the company that created it (Director/Company), the category of the site (Category), the type of the intervention according to its motivation (Type), a short description of the found elements (Description), the chronology of the site (Chronology), the coordinates (UTM X and UTM Y), and information about the municipality where the intervention took place (Municipality and Comarca). No link is made to the site in question (via the Inventory) or to the resulting report (via the report repository Calaix). There is no reference to any thesaurus or controlled vocabulary if used.
The data and all their related metadata are downloadable in different geographical standard formats (GeoJSON, ESRI Shapefile, .kml, .gxf and .gpx) and in different EPSG projections.
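As a minimal sketch of how such a download might be reused, the following Python example tallies interventions by type from a GeoJSON file, using only the standard library. It assumes that the feature properties mirror the field labels listed above; the property names `Name`, `Type` and `Municipality`, and all sample values, are illustrative assumptions rather than content taken from the actual dataset.

```python
import json
from collections import Counter

# Hypothetical sample; a real export may use different property names.
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [2.17, 41.38]},
     "properties": {"Name": "Intervention A", "Type": "Preventive",
                    "Municipality": "Barcelona"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [2.10, 41.36]},
     "properties": {"Name": "Intervention B", "Type": "Research",
                    "Municipality": "Barcelona"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [2.19, 41.40]},
     "properties": {"Name": "Intervention C", "Type": "Preventive",
                    "Municipality": "Barcelona"}}
  ]
}
"""


def count_by_property(geojson_text, key):
    """Count features by the value of one property.

    Features whose properties are null or lack the key are counted under None.
    """
    collection = json.loads(geojson_text)
    return Counter(
        (feature.get("properties") or {}).get(key)
        for feature in collection["features"]
    )


counts = count_by_property(SAMPLE_GEOJSON, "Type")
for intervention_type, total in counts.most_common():
    print(intervention_type, total)
```

Because each yearly layer is a separate file, the same function could be applied to each download in turn and the resulting counters added together.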

Calaix
Calaix is the digital repository of the Culture department of the Catalan government. This public repository stores different collections about archaeological, architectural and popular heritage, and data about museums, libraries, archives and cultural institutions across Catalonia. Started in 2010, the archaeology section of the repository contains data from archaeological excavations, mainly intervention reports, but also graphic documentation from the interventions, such as planimetric files and images, and aerial imaging of Catalan archaeological sites.
Its most important archaeological collection contains the digitisation of the final reports of the archaeological interventions. This collection, which consists of 5870 reports (as of June 2021), has been created with the digitisation of archaeological reports since 1981. The reports are available in .pdf format; the older ones are PDFs generated from the digitisation of the paper reports submitted to the SAC, while the newer ones are already submitted in a digital format. Although there are some reports available from recent years, most of the reports of interventions from 2015 and later have not been uploaded. Despite this existing delay, upload of recent reports is an ongoing task.
Calaix uses the OAI protocol. Each report is given a unique identifier (URI) and is enriched and described with some metadata; Dublin Core and the Getty AAT controlled vocabulary are used, as well as geographical standards. Some of the most common fields that describe the records include the date of creation of the record (dc.date.available), the date stated in the report (dc.date), the main authors of the report (dc.creator) and collaborators (dc.contributor.author), the identifier of the record (dc.identifier.uri), the format (dc.format), the language (dc.language.iso), the title of the report (dc.title), its type (dc.type), the municipality or municipalities where the intervention took place (dc.coverage.municipi) and their administrative region (dc.coverage.comarca), the chronology of the site (dc.subject.cronologia), the name of the site (dc.subject.jaciment), its typology (dc.subject.tipologia), the format of the file (dc.format.type) and whether the report includes an inventory of materials (dc.description.inventari) and an inventory of stratigraphical contexts (dc.description.ue). In some cases, however, some fields are missing; similarly, there are some records where metadata include other fields, such as a reference (field dc.identifier.inventari) to the site record in the Inventory (referenced by the database ID), which is not a globally unique and persistent identifier. The names of the authors are not normalised or referenced in external databases such as ORCID.
Nor is the name of the site normalised, as it can be written differently from the official one stated in the Invarque inventory. Although Getty AAT is mostly used, there are variations in the vocabulary and categories used in some of the fields, such as dc.subject.cronologia and dc.subject.tipologia. There is also little information about the intervention that prompted the report, and a lack of data about the company or institution that prompted it, and the purpose.
Search facilities are quite limited: searches can be made only on the author, the title, the date of the report, and the typology, chronology, name, town or administrative region of the site. As not all records include these fields, and as the values of these fields are not completely normalised, searches can become slow and unsuccessful.
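To illustrate why unnormalised values hinder searching, the following Python sketch compares names after folding case, accents and spacing, and tolerates records where the field is absent. It is only an assumption about how a user might post-process metadata, not a description of how Calaix itself searches; the records, titles and site name are invented for the example, and only the field names come from the description above.

```python
import unicodedata


def normalise_name(name):
    """Return a key that ignores case, accents and extra whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_marks.casefold().split())


def search(records, field, query):
    """Return records whose field contains the query after normalisation.

    Records that lack the field are skipped rather than raising an error.
    """
    wanted = normalise_name(query)
    return [r for r in records if wanted in normalise_name(r.get(field, ""))]


# Invented records: the same hypothetical site written in three ways.
records = [
    {"dc.title": "Report A", "dc.subject.jaciment": "Turó Gran"},
    {"dc.title": "Report B", "dc.subject.jaciment": "TURO  GRAN "},
    {"dc.title": "Report C"},  # field missing
    {"dc.title": "Report D", "dc.subject.jaciment": "turó gran (sector nord)"},
]

matches = search(records, "dc.subject.jaciment", "turo gran")
print([r["dc.title"] for r in matches])
```

An exact string comparison would find none of these variants from the query "turo gran"; a normalised key recovers the three records that carry the field, while the record without it remains invisible to any search on that field.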
The PDFs are freely accessible and easily downloadable from the repository, and no authentication is required. Metadata, however, are not downloadable. All data, not just metadata, are under the CC0 Creative Commons licence, as stated in the footer of all the official web pages from the Catalan administration.
The photographs included in the report are uploaded separately and linked to the report record, as is the planimetric record. In both cases, the files are easily downloadable from the record page. As of June 2021, even though archaeologists are asked to submit the planimetric records in .dwg format (Departament de Cultura Direccio 2016), no file had yet been uploaded in this more useful vector-based format. All planimetric records available are still raster-based images.
The usage of this resource varies depending on the report. According to the statistics displayed in the repository itself, the whole archaeological collection is accessed around 300 times per month and most of these views are of the reports collection. Each report is accessed around 3 times per month on average, although there is a lot of variability.

Carta arqueològica de Barcelona
Barcelona is the capital and biggest city in Catalonia, and it has a rich archaeological heritage. Although the city only represents 0.32% of the Catalan territory, around a quarter of the archaeological interventions of 2018 took place in Barcelona, most of which were preventive interventions and development led (Generalitat de Catalunya 2019).
To better manage this situation, Barcelona has its own municipal Archaeological Service with responsibility to ensure the protection of the archaeological heritage of the city, and to document, preserve and disseminate information about it. During the last decade, this office has focused on data communication and openness, by working on data archiving and mapping (Miró i Alaix 2016). It has developed its own archaeological report archive, where some of the reports of the interventions that took place in Barcelona between 1962 and 2021 have been digitised and uploaded, and some mapping tools to browse the archaeological remains located in the municipality.
The main tool developed by the Archaeological Service of Barcelona is the Carta Arqueològica, a free, open map published in 2012 that allows the user to query the areas affected by archaeological interventions, as well as some data about the located remains (Miró i Alaix 2014). Its main goal is to provide an integrated project with a focus on analysis, diagnosis, and evaluation of the archaeological remains of the city (Miró i Alaix 2016). This tool therefore is designed to be used by researchers, but also by land developers and professional archaeologists, to safeguard known archaeological heritage (Miró i Alaix 2014). However, it was also envisioned as a dissemination tool, to bring together society and the common archaeological heritage.
The dataset mapped in the Carta Arqueològica includes information about more than 3000 archaeological elements in Barcelona. It consists of a layer, where each element is represented by its geometry (polygon). The register for each archaeological element is identified with a unique and permanent URL and is documented with some metadata. It contains information about the element's location (Context, District, Plot code, Addresses, UTM X and UTM Y), its description (Name, Description, Historical mentions), its chronology (Typology, Initial chronology, Final chronology), its legal protection (State, Existing protections, Category, Classification, and Level of protection) and the archaeological interventions undertaken (Date, Type, Administrative type, Director/Author, Motivation, Promoter/Owner), together with documentation and associated references (Miró i Alaix 2016). The register may also be linked to photographs of the site and the digitised report, if it is available in the repository. A special thesaurus has been created for the fields Chronology and Typology (Miró i Alaix 2014).
The interface allows users to search and filter through the name of the site or intervention (Name), the name of its director (Director/Author), the district where it took place (District), the type of remains and their chronology (Typology, Initial chronology, Final chronology), as well as their current legal protection (Existing protections).
The same dataset is openly available in the Open Data Portal of the city council from where it can be downloaded in different formats (KML, GeoJSON, KMZ). However, the data about the sites depicted in the online platform when accessing the registers from the map (intervention date, name, authors, documentation, chronological data, etc.) are not included in the downloadable dataset, which is reduced to a set of geographical points with minimal data.
No mention of other types of data related to or resulting from archaeological interventions is made in the current legislation, and therefore no further action has been taken in this direction.
The different platforms described above allow users to freely access various datasets containing data about cultural heritage and its management, as well as its archaeological resources, which primarily include archaeological data from the administration. Users range from the general public to professional archaeologists, and potential uses span research, heritage management and protection, land development and heritage dissemination.
With some exceptions, most of these databases have a backlog that should be addressed. They have enormous potential that is currently limited by the delay in updating the data (in the case of the catalogue) and in the upload of recent reports (in the case of Calaix), which is becoming a bigger problem every day, as the number of archaeological interventions continues to increase.
Similarly, the full potential of the data available is not exploited because of the search limitations. When searching for archaeological sites, it is exceedingly difficult to browse the database, as not all fields are searchable. Also, the lack of normalisation of some fields (especially in Calaix) and the limited application of thesauri make searches less efficient and less successful.
Another major issue is the lack of interoperability. Most of the time there is no connection between the different databases, even though their relationship is obvious. For example, when consulting information about an archaeological site in the Inventory or the Geoportal, there is no option to access the reports from that site that are available online in the Calaix repository. In this sense, the Carta Arqueològica de Barcelona is an example of interoperability, as each site register is connected to its interventions and to the digitised reports. However, the existence of the Carta Arqueològica itself demonstrates the fragmentation of the CRM platforms in Catalonia. As a result, there is a complex panorama of single, disconnected, and complementary tools that allow access to the same or related datasets, which could be simplified by creating a unified platform providing access to the entire dataset. This situation reduces the usefulness of these platforms, as it fragments the available information, does not allow the user to exploit the relationships between the registers, and diminishes their search potential. These limitations might seem insignificant to non-professionals accessing a platform to get information about their local archaeology, but they restrict the ability of researchers to find relationships within and between sites. It remains to be seen whether the Geoportal, still in development, will provide further advances in this direction.
Finally, there is also a lack of documentation in general, and specifically about the provenance of the data and the processes used to collect them, as well as the criteria used to build the dataset. For example, no information about the reports that are still to be uploaded is given to the user, so it is impossible to know whether a missing report was never written or submitted or if it simply has not been digitised. Similarly, no information is given about the criteria used to decide what is considered an archaeological site, which defines whether it will be included in the Inventory. This lack of information prevents large-scale analysis in the current system.
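The connection between site records and reports that is currently missing can be sketched in a few lines of Python. The example assumes that report metadata and inventory records could both be obtained as simple records, and that reports carry the dc.identifier.inventari field mentioned earlier; the site `id` key, the URIs and all values are hypothetical, and the sketch is not a description of any existing service.

```python
from collections import defaultdict


def link_reports_to_sites(sites, reports):
    """Group report URIs under the inventory ID that each report references.

    Returns a mapping of site ID to report URIs, plus the list of reports
    that carry no inventory reference and so cannot be linked.
    """
    by_site = defaultdict(list)
    unlinked = []
    for report in reports:
        site_id = report.get("dc.identifier.inventari")
        uri = report["dc.identifier.uri"]
        if site_id is None:
            unlinked.append(uri)
        else:
            by_site[site_id].append(uri)
    linked = {site["id"]: by_site.get(site["id"], []) for site in sites}
    return linked, unlinked


# Hypothetical records; IDs and URIs are placeholders.
sites = [{"id": "101", "name": "Site X"}, {"id": "102", "name": "Site Y"}]
reports = [
    {"dc.identifier.uri": "https://example.org/report/1",
     "dc.identifier.inventari": "101"},
    {"dc.identifier.uri": "https://example.org/report/2",
     "dc.identifier.inventari": "101"},
    {"dc.identifier.uri": "https://example.org/report/3"},  # no reference
]

linked, unlinked = link_reports_to_sites(sites, reports)
print(linked)
print(unlinked)
```

The list of unlinked reports makes the cost of incomplete metadata visible: any report without an inventory reference cannot be reached from its site record, however complete the two databases are individually.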
Despite these limitations, the importance of these platforms must be highlighted, especially Calaix, which is used by many archaeologists every day. This is the only public platform that allows access to archaeological research data directly created by the archaeologists, and not by the administration. As such, it could play a significant role in opening the door to sharing further data, because we must not forget that intervention reports are only one of the various outputs of an archaeological excavation. Indeed, the Catalan archaeological community still lacks a repository to store research data, and Calaix is well positioned to fill this need. The platform is widely known and used by the archaeological community for research and professional reasons. The use of new formats and methodologies, such as 3D modelling, will create fresh needs for this repository, and we will see if it can keep up with all the new data that will come.
In summary, multiple platforms have been developed over the past few years by the Department of Culture with the aim of opening up cultural heritage data (data coming from the administration or via archaeological reports) to the citizens, as stated in the heritage legislation. Analysing the current situation of archaeological data archiving in Catalonia, we can draw two main conclusions and paths for future action. On the one hand, there is a complex network of databases to manage cultural heritage data, and multiple powerful online platforms developed to allow public access to these data. We should take advantage of these existing tools and make them more useful for researchers, allowing the user to cross-search them with more user-friendly search interfaces, permitting data and metadata downloads, and making interoperability possible, among other options. In short, we should plan to leverage the full potential of those platforms to fully benefit from the existing data. On the other hand, other than the administration's cultural heritage data and reports, there is no place for the archiving and publication of data, and no initiative on this issue coming from the public administration. It seems therefore that no real advances in archaeological data archiving will be made until the existing law is updated and these issues are included in future heritage legislation.