
Towards Supporting Climate Scientists and Impact Assessment Analysts with the Big Data Europe Platform

The EU Horizon 2020 project Big Data Europe (BDE) aims to support European companies and institutions in
effectively managing and making use of big data in activities critical to their progress and success. BDE focuses
on seven areas of societal impact: Health, Food and Agriculture, Energy, Transport, Climate, Social Sciences and
Security. By reaching out to partners and stakeholders, BDE aims to elicit data-intensive requirements for, and
deliver, an ICT platform that covers aspects of publishing and consuming semantically interoperable, large-scale,
multi-lingual data assets and knowledge.

In this presentation we will describe the first BDE pilot for Climate, focusing on SemaGrow, its core component,
which provides data querying and management based on data semantics.

Over the last few decades, extended scientific effort in understanding climate change has resulted in a huge volume
of model and observational data. Large international global and regional model inter-comparison projects have
focused on creating a framework in support of climate model diagnosis, validation, documentation and data access.
The use of climate model ensembles, each consisting of different possible realisations of a climate
model, has further significantly increased the amount of climate and weather data generated. Providing such
models serves the crucial objective of assessing the potential impacts of climate change on well-being, in support
of adaptation, prevention and mitigation.

One of the methodologies applied by the climate research and impact assessment communities is that of dynamical
downscaling. This calculates values of atmospheric variables at smaller spatial and temporal scales, given a
global model. At the company or institution level, this process can be greatly improved in terms of querying, data
ingestion from various sources and formats, automatic data mapping, etc.

The first Climate BDE pilot will facilitate the process of dynamical downscaling by
1. providing a semantics-based interface to climate open data, such as ESGF services,
2. searching, downloading and indexing climate model and observational data, according to user requirements,
such as coverage and experimental scenarios,
3. executing dynamical downscaling models on institutional computing resources, and
4. establishing a framework for metadata mappings and data lineage.
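
As a rough illustration of step 2, the selection of datasets according to user requirements could be sketched as below. This is a minimal sketch under stated assumptions, not the pilot's implementation: the record fields, the dataset identifiers, the scenario labels and the helper select_datasets are all hypothetical, and bounding boxes that cross the antimeridian are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Geographic extent in degrees (hypothetical, simplified model)."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, other: "BoundingBox") -> bool:
        """Return True if `other` lies entirely inside this box."""
        return (self.west <= other.west and self.south <= other.south
                and self.east >= other.east and self.north >= other.north)


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record for one indexed dataset."""
    dataset_id: str
    scenario: str
    coverage: BoundingBox


def select_datasets(records, scenario, region):
    """Keep records for `scenario` whose coverage fully contains `region`."""
    return [r for r in records
            if r.scenario == scenario and r.coverage.contains(region)]


# Illustrative catalogue; identifiers and scenario labels are made up.
GLOBAL = BoundingBox(-180.0, -90.0, 180.0, 90.0)
catalogue = [
    DatasetRecord("global-rcp45", "rcp45", GLOBAL),
    DatasetRecord("global-rcp85", "rcp85", GLOBAL),
    DatasetRecord("north-atlantic-rcp85", "rcp85",
                  BoundingBox(-80.0, 0.0, 0.0, 70.0)),
]

# Target region for downscaling (an arbitrary box used only as an example).
target = BoundingBox(19.0, 34.0, 29.0, 42.0)
selected = select_datasets(catalogue, "rcp85", target)
print([r.dataset_id for r in selected])
```

In this sketch only the global dataset for the requested scenario is selected, because the regional record does not cover the target box. A real index would also need to handle temporal coverage, variables and resolution.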

The objectives of this pilot will be met by building on the SemaGrow system and tools, which have been developed
as part of the SemaGrow project in order to scale data-intensive techniques up to extremely large data volumes
and improve real-time performance for agricultural experiments and analyses. SemaGrow is a query resolution and
ingestion system for data and semantics. It is able to extract semantic features from data, index them and expose
APIs to other BDE platform components. Moreover, SemaGrow provides tools for transforming and managing
data in various formats (e.g. NetCDF), and their metadata. It can also interface between users and distributed,
external data sources via SPARQL endpoints. This has been demonstrated as part of the SemaGrow project, on
diverse and large-scale scientific use cases. SemaGrow is an active data service in agINFRA, a data infrastructure
for agriculture.
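
To make the SPARQL-based access mentioned above more concrete, the following minimal sketch builds a query and the corresponding HTTP GET URL, using the `query` parameter defined by the SPARQL 1.1 Protocol. The endpoint URL, the `ex:` vocabulary and the helper names are assumptions made for illustration only; they do not describe SemaGrow's actual endpoints or schema, and no request is sent.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with a real SPARQL endpoint URL.
ENDPOINT = "http://example.org/sparql"


def build_query(variable: str, limit: int = 10) -> str:
    """Build a SELECT query over an assumed, illustrative vocabulary."""
    # Accept only simple identifiers to avoid injecting text into the query.
    if not variable.isidentifier():
        raise ValueError(f"unexpected variable name: {variable!r}")
    return f"""
PREFIX ex: <http://example.org/climate#>
SELECT ?dataset ?scenario WHERE {{
  ?dataset ex:variable "{variable}" ;
           ex:scenario ?scenario .
}}
LIMIT {int(limit)}
""".strip()


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


# "tas" is used here only as an example variable name.
url = build_request_url(ENDPOINT, build_query("tas"))
print(url)
```

The resulting URL could be fetched with any HTTP client, typically with an Accept header such as application/sparql-results+json to request JSON results.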