Heritage data-centric research: are FAIR data fair enough?
To participate in this workshop, please fill in the Registration Form.

Date of workshop: to be announced soon.
Time of workshop: to be announced soon.
Duration of workshop: one session
Place: Cultural Conference Centre of Heraklion (CCCH)

Archaeology is no laggard in the current trend towards e-Science, i.e. collaborative, computationally or data-intensive research. A number of initiatives are addressing how to manage and use the data produced by heritage research, most notably ARIADNE in the archaeological domain (https://www.ariadne-infrastructure.eu). ARIADNE presently involves the most important research centres from all European countries in creating a comprehensive, integrated archaeological data infrastructure, which has so far registered just under 2,000,000 archaeological datasets. This infrastructure is bringing archaeology out of the “long tail of science”, i.e. those disciplines that make little use of data-centric research. It is also revolutionising the concept of Big Data: not relatively few datasets, each with terabytes of numbers, as in nuclear physics, but millions of small datasets, all potentially relevant to a specific research question, of which a large (and unknown) majority is probably not relevant at all.

E-Science relies on the well-known FAIR principles (https://www.force11.org/fairprinciples), which state that data should be Findable, Accessible, Interoperable and Re-usable. While “F”, “A” and “I” mainly depend on the technical way in which data and metadata are generated, stored, managed and curated, the “R” has less technical (but no less important) implications. It involves theoretical, methodological and epistemological aspects that have not received enough attention in the current debate. It has been argued that e-science discovery could be modelled as a deterministic discovery process; even from this perspective, however, modelling the provenance of the data alone is not sufficient: the provenance of the hypotheses and results generated from analyzing the data needs to be modelled as well.

Thus, to re-use data in cultural heritage it is necessary to expand the “R” facet of the FAIR principles into at least R3: Re-usable, Relevant and Reliable. Judging relevance and reliability may appear obvious to the human eye, but it is not obvious to machine processing. Data reliability depends on a chain of trust that needs to be adequately supported by documentation, and in this regard the CIDOC CRM may play a key role. In the past, reference to previous discoveries published in journals and books was based on the academic practice of peer review and on the authoritativeness of the author and of the publication; the re-use of data created by others still lacks a comparable good practice.

The session will discuss these aspects and propose ways to address the issue. Contributions will range from purely cultural heritage practice (“What would you need to rely on somebody else’s data?”) to semantics (“What would you suggest to document, in order to support reliability?”). Both aspects will be analysed in light of the CRM: does it already provide a sufficiently rich toolbox, or are additions required? If so, which ones?

Chair:

Franco Niccolucci

PIN, Prato, Italy

Presenters:

Nicola Barbuti

University of Bari, Italy

The R4 to Identify Born and Digitized Cultural Heritage: Re-usable, Relevant, Reliable and Resistant

It is urgent and imperative to establish which, and how many, of the digital resources produced to date can be identified as “born digital and digitized cultural heritage”. This process requires the definition of clear and homogeneous criteria by which we can distinguish digital cultural entities from the daily magmatic production of data. As the FAIR Principles alone do not seem to be sufficient for this purpose, we believe that the FAIR R should be quadrupled into R4: Re-usable, Relevant, Reliable and Resistant. We think that these requirements will give digital data the value of Cultural Heritage, as they exactly mirror the definition we can give of what we commonly consider tangible and intangible cultural heritage.

Martin Doerr

FORTH, Foundation for Research and Technology – Hellas

CRMInf: Supporting Facts by Arguments

In the current practice of documenting cultural heritage, the maintainers of databases mostly present facts to the best of their knowledge, adding some citations, but without analyzing the reasons why a particular fact is believed or not. Archaeological records may contain more detailed justifications, but only in limited cases related to individual facts. On the other hand, computer scientists have developed advanced argumentation systems, but more to support an expert dialogue than to justify and maintain the validity of facts in documentation systems. CRMInf is a CIDOC CRM-compatible extension designed for the latter purpose. Currently, it contains a basic model of ways to acquire new knowledge, and it is being further specialized to support more directly the discourse with historical sources and with scientific observations. We will present the theory underlying CRMInf, the current state of development, and implementation issues.

Achille Felicetti

PIN, Prato, Italy

Heritage Science and Cultural Heritage: a CIDOC CRM-enabled Model for Integration and Interoperability

The main goal of our model is to collect provenance data for scientific datasets resulting from Heritage Science research, and to document it in a standard and accessible way. Our approach inherits and adapts common logics and concepts of existing models and takes inspiration from the semantic principles of CIDOC CRM. It proposes a schema composed of reusable XML modules, intended to describe Heritage Science entities (including actors, devices, datasets, analyses and other events) in great detail, and dynamically organised in a common framework by means of a set of internal links based on persistent identifiers. Such a structure implements a platform-independent meta-format able to express the essence of the data while remaining unbound to any specific system or software, and supports the confidence in somebody else’s data that re-use requires.

Marianna Figuera

University of Catania, Italy

A Fuzzy Approach to Evaluate the Attributions’ Reliability in the Archaeological Sources

The problem of the relevance of archaeological sources can be addressed from a different perspective: by considering the concept of reliability as linked to the subjectivity inherent in archaeological data. I would like to present a case study of the so-called small finds from Phaistos and Ayia Triada (Crete). The unusual finds analyzed and the specific excavation history of the two sites led to a procedure in which a fuzzy approach is used to preserve the degree of uncertainty of the functional attributions. The concept of “probability of belonging” and the management of the sources’ attributions through multi-assignment could suggest a possible methodological approach to validating the relevance and reliability of archaeological data.

Sorin Hermon

STARC, The Cyprus Institute, Nicosia, Cyprus

How FAIR are the FAIR principles for archaeological data?

The aim of the presentation is to discuss the added value of making archaeological data FAIR, in particular primary data collected during fieldwork, such as 3D models of excavation units, analytical measurements and geodesic data. The main argument of the discussion is that without a formal representation of data provenance, such data can be FAIR but of little use for archaeological research.

Olivier Marlet & Pierre-Yves Buard

University of Tours, France / University of Caen, France

Logicist writing for reliability of data-centric research in archaeology

Within the framework of the activities conducted by the consortium MASA (Memory of Archaeologists and Archaeological Site) of the very large facility Huma-Num, the Laboratoire Archéologie et Territoires (University of Tours/CNRS, France), in collaboration with the MRSH (University of Caen/CNRS, France), has set up a logicist publication of the results of the excavation of the Rigny cemetery. For this publication, Elisabeth Zadora-Rio formalized her archaeological reasoning according to the precepts of Jean-Claude Gardin, thus proposing a clear structuring of the chain of inferences that leads from field observations to the most synthetic interpretations. The web application developed for it makes it possible to read the publication in a synthetic way, or to read in greater depth by going as far as the excavation data, which are directly linked to ArSol, our online database.

Christian-Emile Smith Ore

University of Oslo, Norway

Joseph Padfield

The National Gallery, London, UK

Putting theory into practice - Using a CIDOC based venue ontology to describe the movement of paintings within the National Gallery

Using the CIDOC CRM together with the Venue ontology at the National Gallery (NG) offers an opportunity to consider how the CRM is used in practice. This includes the development of a simple internal persistent identifier (PID) system and its incorporation within a practical tool for capturing and recording the movement of paintings, thus documenting their provenance and their relationship with the parts of the gallery and the whole.