The ArchAIDE Archive
Gabriele Gattiglia is a Researcher in Archaeology at the MAPPA Lab of the University of Pisa. He leads the MAPPA Lab, which manages the MOD (Mappa Open Data), the Italian repository for Open Archaeological Data. His research interests are Digital Archaeology and Archaeological Method and Theory, with a focus on mathematical applications and Big Data issues in archaeology. He has been the coordinator of the ArchAIDE project (http://www.archaide.eu).
Francesca Anichini is a scientific technician at the MAPPA Lab at the University of Pisa. She teaches Archaeological Communication Management at the School of Specialisation in Archaeology at the University of Pisa. Since 2019 she has also been a PhD student in Contemporary Archaeology. She works on archaeological methods, archaeological open data, archaeological potential and archaeological communication. She has been project and communication manager of the ArchAIDE project (http://www.archaide.eu).
This paper focuses on one of the lesser-known aspects of the ArchAIDE project: the open data policy and the management of material covered by copyright. The ArchAIDE project developed a system for the automatic recognition of pottery, delivered through an innovative app for tablets and smartphones. This goal was achieved through the development of two distinct neural networks, one for appearance-based and one for shape-based recognition, and relies on the creation of a digital comparative collection incorporating existing digital collections, digitised paper catalogues and multiple photography campaigns. To ensure the correct management of material that falls under copyright or database protection, the EU directives on Copyright (2001/29/EC) and Database protection (96/9/EC) were analysed. The scientific research exception permitted the implementation of the project, in particular: (i) as regards copyright, published works can be used, mentioning the source and the authors’ names, to the extent justified by a non-commercial purpose, and the structure of published databases can be used, mentioning the source, only to the extent justified by a non-commercial purpose; (ii) as regards the sui generis right, databases can be used, even if scientific research is not the sole purpose, mentioning the source and the authors’ names, to the extent justified by a non-commercial purpose. This does not mean the ArchAIDE project necessarily holds the copyright to the newly digitised, remixed data. Whether these data can be made available outside the project would need to be negotiated with each copyright holder.
By showing how the content of digitised paper catalogues can be actively reused, ArchAIDE can open a discussion with publishers and other data providers about the importance of making their resources available in new ways, with a tangible benefit (seeing their data in use within the app). This furthers the long-term discourse around making research data open and accessible.
As a participant in the H2020 open data pilot, ArchAIDE was committed to creating sustainable outputs where the project held the copyright. This included making available the interoperable, multilingual vocabularies and the video corpus created by the project, as well as the 2D and 3D models created from the ADS archive Roman Amphorae: a digital resource. This aspect of the archive is a good exemplar of best-practice reuse. When this digital resource was first deposited in 2005, nobody could have envisioned that it would be used to create automated 2D and 3D models, which in turn produce the ‘virtual sherds’ that train the deep learning algorithm. As 2D and 3D models were created for every type from Roman Amphorae, it was possible to link the two archives, amplifying the usefulness of both. The ArchAIDE archive includes 2D vector drawings in SVG format for download, and 3D models for interactive use within the 3D viewer (created using 3DHOP). The 3D models can also be downloaded for use with 3D software and 3D printing.
It was also hoped that the thousands of photos taken by the project for training the algorithms might result in new comparative collections that could be made freely available as part of the ArchAIDE archive. However, intellectual property rights in many European countries are restrictive and did not allow photos taken by ArchAIDE partners of sherds held in national and regional collections to be made available. It is hoped that seeing the usefulness of these data within an example application such as ArchAIDE may also help convince the holders of these resources to move towards more open data policies. Finally, the source code and neural network models will shortly be made publicly available as open source.
Acknowledgements. This research was supported by the EU Horizon 2020 grant agreement No. 693548. We thank all the members of the ArchAIDE team (http://www.archaide.eu).