Developing robust data technology is key to ensuring open science
The National Strategy for Open Science (ENCA) defines the roadmap for ensuring that, by 2027, the principles of open science are integrated into how research is executed, funded, evaluated, and communicated in Spain. This strategy not only represents a paradigm shift for the people who do science, but also poses a major challenge: to develop data technology that is powerful and secure enough to carry out this transformation and to connect with the European open ecosystem. Among the technologies envisioned are the so-called data infrastructures. They consist of information management systems that can deal with diverse data (research, industrial or even personal) for different purposes and for different end users.
CREAF is taking up this challenge by participating in innovative projects such as AquaINFRA, AD4GD, OEMC and EOSC Focus. The latter is defining the future of the European Open Science Cloud (EOSC), an initiative funded by the European Commission through the Horizon 2020 program. Behind the EOSC acronym lies the desire to ensure that scientific data and services financed by European public funds can be easily accessed and reused for research, innovation and education with greater quality, transparency, and efficiency. CREAF became a member of this network a few months after its constitution, committing to making our research data available to the European scientific community.
"Developing robust and reliable web technology, together with a commitment to providing open data, broadens the scope of available data and allows us to analyse the health of our planet from different prisms."
KAORI OTSU, CREAF researcher.
In addition, CREAF researcher Kaori Otsu is actively participating in the technical deployment of this digital ecosystem through the EOSC Focus project. Her most recent achievement has been to help organize a campus for experts in Thessaloniki (Greece). The Winter School 2024 brought together more than 100 professionals who are building the EOSC through Horizon 2020 projects. The aim was to identify synergies and opportunities for collaboration.
The balance between opening and closing
One of the most repeated mottos when it comes to open science is that data should be as open as possible, but as closed as necessary. This means that the infrastructures where they are hosted should not only facilitate access, but also ensure that sensitive data, such as those that affect people's privacy or health or those that are confidential to companies, are secure, controlled and maintain their intellectual property. "This is exactly the goal of the cutting-edge data spaces that we are also developing: virtual environments where data can be exchanged between users under a common agreement established between the parties," explains CREAF researcher Joan Masó.
Responsible access can be guaranteed in the long term if research data and derived services follow the international FAIR (Findable, Accessible, Interoperable and Reusable) principles and comply with standards such as the EU General Data Protection Regulation (GDPR). It is also essential that the data are interoperable, i.e., that they use consistent concepts and measurement variables based on common vocabularies and international standards. Researcher Joan Masó has collaborated in the description of some of these standards, such as the OGC API Tiles.
Local becomes global and vice versa
Open data environments developed by supranational entities are more useful if they are grounded in local realities, which are usually a source of useful data. "Data democratization can potentially break down boundaries if these technologies include diverse communities from the outset, which greatly enriches the scientific community," says Kaori Otsu. For example, the implementation of EOSC is also expected to reflect the CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) principles, which uphold the rights that indigenous peoples have over their data.
And that’s not all; local research can contribute to European science and vice versa; it is a relationship that benefits both scales. For example, the AD4GD project, coordinated by CREAF, is testing a possible use for a data space dedicated to the objectives of the European Green Deal. To do so, they are collecting data on habitat connectivity coming from different origins, both local and European: data from the Global Biodiversity Information Facility, land cover maps, sensors and camera trapping, citizen science, government data, etc. Connectivity is one of the factors that most condition the dispersion of animals and plants throughout the territory and functions as an indicator of biodiversity. This information, therefore, can be critical for a town council to decide where to build a new road and equally essential for the European Parliament to evaluate the objectives of the new law on nature restoration.
The role of citizen science
Citizen science is the process of generating scientific knowledge with the voluntary participation of non-specialists and the support of professionals in the field. "We believe that citizen science can be fully scientific," states Kaori Otsu; "it is a type of research that generates a very powerful group force that contributes to generating knowledge". The EOSC is preparing its digital environment to be able to integrate this type of research in a secure way that respects the privacy of the people who generate it.
This integration is relevant because, for example, the habitat connectivity indices mentioned above could be refined if they incorporated data from species observations in the territory, which could come from citizen science initiatives. CREAF, in fact, already participates in revolutionary projects such as Mosquito Alert, RitmeNatura, AlertaForestal or some of the services developed by the Cos4Cloud project.
The future of EOSC
The EOSC as a virtual open science space is an action promoted by the European Research Area (ERA) Policy Agenda, as are the Horizon Europe research and innovation program and the Coalition for Advancing Research Assessment (CoARA), an agreement that seeks to reform research assessment criteria and to which CREAF has also recently adhered.
The expert meeting in Thessaloniki in which Kaori Otsu participated revealed the challenges to be faced in the future to transform all EOSC needs into efficient technology. These include further work on data interoperability, the implementation of FAIR principles or the deployment of so-called persistent identifiers, which enable the generation of imperishable references for documents, files, and data sets, as is already done by the well-known DOIs.