Periodic Reporting for period 1 - EUXDAT (European e-Infrastructure for Extreme Data Analytics in Sustainable Development)


A growing number of data sources generate large volumes of data every day, which could be exploited in many areas to improve the way we do things or to understand our environment better. At the same time, sustainability has become a key concern for humanity in general, and for Europe in particular, as the population grows and the environment needs to be protected from over-exploitation.
Agriculture is a sector that has an important impact on our environment (especially in rural areas) and on food security for our growing society. We therefore have the opportunity to apply novel analytics techniques to these huge amounts of data. Such analytics will help us take care of our environment (and the soil) while optimizing how we use it, keeping food demand and supply balanced.
As a result, EUXDAT proposes solutions for monitoring crops, improving land use maps, making better decisions about which crops to grow in certain types of soil, and improving farm management in general.
EUXDAT is therefore an e-Infrastructure for Large Data Analytics-as-a-Service. It aims to bring heterogeneous data sources (Copernicus, climate data, sensor data, UAV data, machinery data, land use data, hydrology data, etc.) together with advanced data analysis tools that can use both Cloud and HPC resources, in order to process huge amounts of data in support of agriculture.
The main objectives of the project are the following:
• Develop a set of tools for managing extremely large datasets, taking into account storage requirements, different formats and management policies for reducing data movement latency and protecting the information;
• Adapt and evolve, as required, existing data processing tools by adding new features so that they can be provisioned as Large Data Analytics-as-a-Service. The main changes will focus on exploiting HPC capabilities with the new data management tools, improving the user portal and adapting resource management;
• Carry out service activities based on an integrated e-infrastructure, where three data-intensive pilots from the Sustainable Development domain will validate the proposed solutions;
• Carry out an important networking activity, especially in the domain of Sustainable Development, in order to motivate the adoption of the proposed tools among a wider European community.

Work performed

The first stage of the project consisted of setting up the communication, cooperation, quality and financial means needed to ensure efficient and smooth project management throughout the project.
Work toward implementing the EUXDAT e-Infrastructure started with the functional definition of v1 of the e-Infrastructure, performed within WP2. A group of agriculture experts conducted and evaluated interviews with farmers to gather feedback and functionally define the scenarios to be implemented. By consolidating this feedback and combining it with the consortium's technical knowledge of EO data exploitation, the scenarios and the functional requirements of the e-Infrastructure were produced.
An e-Infrastructure architecture and technical requirements were then derived to drive the implementation performed within WP3 and WP4.
In WP3, the end-user platform was built: on top of a cloud environment, a PaaS layer able to host all the platform components was configured. The scope of v1 of the platform was defined according to the technical and functional requirements, and a first set of components was implemented and/or integrated into the platform: prototyping environment, data analytics tools, data connectors, user management, orchestrator and data catalogue.
In WP4, both HPC and elastic cloud services were studied to provide scalable data processing capabilities to users of the EUXDAT e-Infrastructure. On the HPC side, the HPC facility provided by HLRS was integrated with the e-Infrastructure by developing an orchestrator plugin. On the elastic cloud side, different strategies were studied and the conclusions were reported in a deliverable; the actual implementation will start at the beginning of the next period.
From the set of functionally defined scenarios, three were chosen as implementation targets for v1 of the e-Infrastructure: Monitoring of crop status, Agro-Climatic zones and Open Land Use Map. Specific components (data processing applications, graphical interfaces, etc.) were developed, integrated with the platform services (orchestrator, data connectors API, etc.) and deployed on the end-user platform PaaS. These scenarios have so far been qualified in a limited context; qualification in the e-Infrastructure context will be done later in the project.
A Data Management Plan was produced to describe the position of EUXDAT in terms of data generation, collection, sharing, exploitation, cataloguing and standardization.
The dissemination and communication strategy of the project was defined, and an official website was put online.
On the business side, the first Market Analysis activities were performed and completed by M6. From this starting point, development of the exploitation plan began, and the first conclusions were reported in the first version of the Exploitation Plan.

Final results

The EUXDAT platform is a thematic cloud platform in the domain of sustainable development with unique features.
EUXDAT is based on a high-performance e-Infrastructure that enables the processing of large amounts of data on both HPC and Cloud, with an orchestration mechanism that optimizes execution according to processing characteristics.
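As a rough illustration of what choosing an execution target "according to processing characteristics" could look like, the following is a minimal Python sketch. It is not the EUXDAT orchestrator: the `Job` fields, the thresholds and the `choose_target` function are all hypothetical, and the report does not state which criteria the real mechanism uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical description of a processing request."""
    name: str
    input_gb: float        # size of the input data in gigabytes
    cpu_cores: int         # number of cores requested
    tightly_coupled: bool  # True for jobs needing a fast interconnect


# Illustrative thresholds only; they are assumptions, not project values.
HPC_MIN_CORES = 64
HPC_MIN_INPUT_GB = 500.0


def choose_target(job: Job) -> str:
    """Return "hpc" or "cloud" for a job, using simple made-up rules."""
    if job.tightly_coupled:
        return "hpc"
    if job.cpu_cores >= HPC_MIN_CORES or job.input_gb >= HPC_MIN_INPUT_GB:
        return "hpc"
    return "cloud"
```

Under these assumed rules, a small job such as `Job("ndvi-tile", 2.0, 4, False)` would be sent to the cloud, while a job requesting many cores or reading a very large input would be sent to the HPC facility.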
The EUXDAT platform already offers a set of data access connectors to a large number of datasets:
- Copernicus data (through Mundi web services platform),
- Open land use map,
- DEM,
- Meteorological data (through Meteoblue),
- Field sensor data (via Pessl instrument API).
Additional connectors, such as those for UAV data or soil maps, are planned before the end of the project.
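One way to picture a set of connectors that can be extended with new data sources is a small registry behind a common interface. The sketch below is an assumption made for illustration only: `ConnectorRegistry`, the bounding-box argument and the `stub_dem` function are hypothetical and do not reflect the actual EUXDAT connector API or any of the services listed above.

```python
from typing import Callable, Dict, List, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat); an assumed convention.
BBox = Tuple[float, float, float, float]
# A connector is modelled as a callable returning a list of records.
Connector = Callable[[BBox], List[dict]]


class ConnectorRegistry:
    """Hypothetical registry mapping dataset names to connector callables."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Connector] = {}

    def register(self, name: str, connector: Connector) -> None:
        """Add a connector; refuse to silently overwrite an existing one."""
        if name in self._connectors:
            raise ValueError(f"connector {name!r} is already registered")
        self._connectors[name] = connector

    def names(self) -> List[str]:
        """Return the registered dataset names in sorted order."""
        return sorted(self._connectors)

    def fetch(self, name: str, bbox: BBox) -> List[dict]:
        """Call the named connector for the given bounding box."""
        if name not in self._connectors:
            raise KeyError(f"no connector registered for {name!r}")
        return self._connectors[name](bbox)


def stub_dem(bbox: BBox) -> List[dict]:
    """Stand-in connector that returns a placeholder record, no real data."""
    return [{"source": "dem", "bbox": bbox}]
```

With such a design, adding a new source (for example UAV data) would amount to one more `register` call, and application code would keep calling `fetch` without changes.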
The EUXDAT platform simplifies the building of new applications in the domain of Sustainable Development from a set of existing unitary functionalities, such as the data access connectors and the already implemented scenarios, offering:
- an integrated prototyping environment for the initial steps of application development (Jupyter notebook)
- a scalable and evolvable e-Infrastructure enabling the deployment of processing services on both HPC and Cloud
- a large number of datasets and the capacity to add new specific data access connectors
- the capacity to deploy a specific front-end for an application.

