The Square Kilometre Array (SKA) is an ambitious project to construct the world’s most powerful radio telescope and enable transformational scientific discoveries across a wide range of topics in physics and astronomy. The telescope will be physically distributed over two sites in the deserts of South Africa and Western Australia, while the operational headquarters of the SKA Observatory is based in the UK, and European partners make up a major part of the global community assembling this new facility. Based on its value to the European science community, the SKA has been designated a Landmark facility by the European Strategy Forum on Research Infrastructures (ESFRI). Construction of the telescope is expected to begin late in 2019, with first science operations in the early 2020s.
Once operational, the SKA is expected to produce an archive of science data products with an impressive growth rate on the order of 700 petabytes per year. Storage and computing resources associated with the SKA Observatory itself, however, are expected to be highly constrained in order to support operations and the creation of these basic data products. Any further processing and subsequent science extraction by users will require a global research infrastructure providing additional capacity in networking, storage, computing, and expertise. This research infrastructure is currently foreseen to take the form of a federated, global network of SKA Regional Centres (SRCs). These SRCs will be the primary interface for researchers in extracting scientific results from SKA data and, as such, are essential to the ultimate success of the telescope.
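To give a sense of scale, the quoted archive growth rate can be converted into an average sustained data rate. The short Python sketch below is only an illustration: it assumes decimal petabytes (10^15 bytes), a perfectly uniform flow over a 365.25-day year, and a hypothetical helper name, sustained_rate, none of which come from the SKA or AENEAS design documents.

```python
# Illustrative arithmetic only; the constants below are stated assumptions.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year
BYTES_PER_PETABYTE = 10**15            # assumption: decimal petabytes


def sustained_rate(petabytes_per_year: float) -> tuple[float, float]:
    """Return the average rate as (gigabytes per second, gigabits per second)."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PETABYTE / SECONDS_PER_YEAR
    gigabytes_per_second = bytes_per_second / 1e9
    return gigabytes_per_second, gigabytes_per_second * 8


gbytes, gbits = sustained_rate(700)
print(f"{gbytes:.1f} GB/s, {gbits:.0f} Gbit/s")
```

Under these assumptions, 700 petabytes per year corresponds to an average of roughly 22 gigabytes per second, or about 177 gigabits per second, before any allowance for peaks, retransmission, or replication between centres.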
The primary objective of the AENEAS (Advanced European Network of E-infrastructures for Astronomy with the SKA) project is to develop a concept and design for a distributed SKA Regional Centre in Europe. This design will consider all the major components necessary to establish such a European SRC, including data storage and management, networking, computing, and user support. By engaging directly with European SKA astronomers, AENEAS will assess the analysis needs of the community to ensure that the final design for the SRC network can deliver their science goals. Alongside the technical design, a primary goal of the AENEAS project is to identify the resources available amongst the European SKA partners and how these resources may be integrated into a unified analysis platform for SKA science.
Bringing the relevant community together to both design and resource the proposed European SKA Regional Centre (ESRC) is a central goal of the AENEAS project, and identifying the relevant partners and resources is a necessary first step. Working with the various European national representatives, the governance work package team has conducted a detailed census of over 47 scientific institutes, expertise centres, infrastructure providers, national research and education networks (NRENs), and commercial partners that are likely to comprise the underlying infrastructure of the ESRC. A report detailing the capacities of these nodes, in terms of expertise and computational resources, has been compiled. It is a key input to the other work packages because it sets the initial scale of the resources available to the ESRC and provides a starting topology for the network design.
For the more technical work packages, substantial progress has been made on assessing the variety of data products the ESRC will need to ingest in order to support SKA science. Using inputs from the SKAO, the computing and storage work package has characterized the type, number, and expected growth rate of the storage resources necessary to host the SKA science archive. In addition, it has begun characterizing the workflows and associated computing resources needed to perform further post-processing and analysis on that data once it is hosted in the ESRC. This characterization includes a census of current relevant computing tools and middleware and will ultimately lead to a design recommendation for the ESRC computing platform.
Similarly, the networking work package has conducted an extensive set of data transfer tests to characterize the expected network performance for data movement from the telescope sites in South Africa and Australia to the ESRC, and to identify best practices for optimizing this performance. A report summarizing these results has been prepared and delivered during the current reporting period. Going forward, the networking work package team will use the results of the resource census discussed above to examine different data distribution topologies within Europe, as well as the impact on network performance of different types of user access to, and analysis of, the distributed SKA data.
Finally, the work package team on user access and knowledge creation has been analyzing existing user interaction models for radio astronomy data in order to identify where modifications or enhancements will be necessary for the ESRC to support SKA science extraction. As part of this analysis, a series of surveys has been conducted of both the user community and several representative, operational radio astronomy facilities. These surveys considered, among other topics, types of data, methods of access, level of additional processing required, models for user support, and commonly used analysis tools. The results of these surveys have been collected into two series of reports delivered during the current period. Using these initial results as a starting point, the next steps will focus on recommendations for the ESRC user interfaces necessary to support both data discovery and analysis, including possible improvements to the underlying Virtual Observatory (VO) software stack.
The unprecedented scale of the expected SKA data stream requires a fundamental change in the way radio astronomy approaches science extraction. For most radio astronomers, the standard analysis scenario involves a moderately sized dataset, obtained from a given facility, and analyzed personally on their individual workstation or other local facilities. Deploying an analysis platform that can accommodate the scale of SKA data, use the necessarily distributed computing and storage infrastructure, and still maintain workflows and techniques familiar to the astronomy community will require technological innovation as well as education and advanced user support. The AENEAS project aims to define an operational analysis platform for SKA science that satisfies these boundary conditions and will enhance the possibilities for European scientists to exploit SKA data. Although tailored to meet the demands of the SKA science community, such an analysis platform could be of value to other data-intensive research domains as well.
More info: https://www.aeneas2020.eu/.