Periodic Reporting for period 1 - I-BiDaaS (Industrial-Driven Big Data as a Self-Service Solution)

Summary

The I-BiDaaS project aims to empower end-users and practitioners to utilize and interact with big data technologies more easily. This is achieved by designing, building, and demonstrating a unified solution that (i) increases the speed of data analysis which is necessary to cope with the rate of data asset growth, and (ii) facilitates cross-domain data-flow, matching the needs of a thriving data-driven EU economy. Overall, the I-BiDaaS solution will help increase the efficiency and competitiveness of EU companies.
The objectives of the I-BiDaaS project are the following:
• Develop, validate, demonstrate, and support a complete and solid big data solution that can be easily configured and adopted by practitioners.
• Break inter- and intra-sectoral data silos, create a data market and offer new business opportunities, and support data sharing, exchange, and interoperability.
• Construct a safe environment for methodological big data experimentation, for the development of new products, services, and tools.
• Develop data processing tools and techniques applicable in real-world settings, and demonstrate a significant increase in the speed of data throughput and access.
• Develop technologies that will increase the efficiency and competitiveness of all EU companies and organisations that need to manage vast and complex amounts of data.

Work performed

Some of the project highlights of this period are:

Platform integration and use-case demonstrations. The project successfully delivered the Minimum Viable Product (M12), which includes both batch and streaming processing, and the first full prototype of the platform (M18), which includes different modes of operation and real-world scenarios.

Breaking silos. The project provided several realistic synthetic data sets generated with IBM’s TDF tool, according to the guidelines and insights provided by the data providers (CAIXA, CRF, TID), as well as real data that were properly anonymized (using masking/tokenization procedures).

External feedback. The project has carried out several activities to acquire external feedback and evaluation, including two EAB meetings and two Info Days. The Info Days drew a mixed audience, including the local and regional community of data experts, students, researchers, and employees of SMEs. The feedback received from all these events is considered invaluable for the growth of the project.

Work Highlights per WP

WP1: The work carried out led to the identification of industrial challenges, the elicitation of user requirements, the specification of the architecture, and the definition of the validation approach.

WP2: The work carried out led to the definition of the datasets for the industrial use cases, the deployment of an end-to-end solution for data on-boarding, the fabrication of synthetic data for several use cases, and the creation of the visualization tools and interfaces.

WP3: The work carried out led to scientific advances in the context of data analytics algorithms, the setup of an initial prototype of the batch platform, and an innovative way to visualize high-dimensional data for exploring anomalies.

WP4: The work carried out led to the design of the streaming component of the I-BiDaaS platform, which allows a large number of input streams to be analyzed in a distributed fashion using a complex event processing engine. In addition, the offloading of compute-intensive tasks to GPUs was designed and tested.

WP5: The work carried out within this period provides an operational infrastructure environment in which the software modules of the I-BiDaaS solution have been deployed, supporting the realization of the I-BiDaaS MVP (M12) and the first I-BiDaaS prototype (M18).

WP6: The work carried out covers the first evaluation results, the preparation of the datasets to be used in each experiment, and the indicators to be measured, according to the I-BiDaaS experimental protocol.

WP7: The work carried out established a baseline for the market definition and analysis, within which the industrial challenges faced will be analyzed, and which will eventually form the groundwork for developing the individual and collective exploitation strategies of the partners and the project.

WP8: The goal of WP8 is to set up and maintain the administrative, financial, and management infrastructure of the I-BiDaaS project, as well as to describe a clear plan for the data that will be used in the project.

Final results

The major I-BiDaaS state-of-the-art advances and innovations are:

WP1 provides a clear specification of a Big Data as-a-Self-Service platform that extends existing Big Data Reference Architectures and provides a safe experimentation environment for Big Data products, services, and tools.

WP2 provides a clear plan to fabricate realistic synthetic data that has a statistical structure similar to that of the original data but hides any potentially confidential information that the project use cases might contain. Moreover, I-BiDaaS will develop advanced machine-learning-based analytics to provide descriptive statistics about the data.

WP3 advances the state of the art by providing advanced data analytics using structured convex optimization algorithms and by implementing machine learning algorithms using a patented data sampling technique, among others. The expected impact is a speed-up of data analytics processes and the creation of a BSC spin-off company (planned for the end of 2019).

WP4 advances the state of the art by extending the traditional distributed stream processing model to allow the offloading of compute-intensive calculations to external parallel accelerators (i.e., GPUs). The impact will be boosted through relevant scientific publications and potential novel integrations of several streaming analytics tools that have not been combined and integrated before.

WP5 innovations include the realization of the I-BiDaaS solution, which allows both experts and non-experts to carry out experiments on big data: non-experts are offered different pre-defined analytics capabilities as a self-service, while more advanced users are able to customize their big data analysis.

WP6 has judiciously defined several industrial experiments across three different domains (manufacturing, telecom, and banking), along with quantitative and qualitative (business-related) metrics that will allow the impact of the innovations to be assessed carefully and systematically. The corresponding I-BiDaaS iterative experimental protocol takes into consideration both technical benchmarks and business requirements, aiming not only to evaluate the performance of the I-BiDaaS solution but also to validate its alignment with the needs of the industrial users.

WP7 has determined that significant evolution is being observed in the Big Data market, which will have important economic as well as societal impact. Solutions such as I-BiDaaS are therefore critical instruments for the decision-making process. For instance, according to the market analysis we performed in the context of this project, Big Data is delivering the most value to enterprises by decreasing expenses (49.2%) and creating new avenues for innovation and disruption (44.3%).

Website & more info

More info: http://www.ibidaas.eu.