Coordinator | NORGES TEKNISK-NATURVITENSKAPELIGE UNIVERSITET NTNU
Organization address: HOGSKOLERINGEN 1 |
Coordinator nationality | Norway [NO] |
Total cost | EUR 212 225 |
EC contribution | EUR 212 225 |
Programme | FP7-PEOPLE
Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013) |
Code Call | FP7-PEOPLE-2010-IEF |
Funding Scheme | MC-IEF |
Start year | 2011 |
Period (year-month-day) | 2011-09-01 - 2013-08-31 |
# | Participant | Country (city) | Role | EC contribution (EUR) |
---|---|---|---|---|
1 | NORGES TEKNISK-NATURVITENSKAPELIGE UNIVERSITET NTNU, HOGSKOLERINGEN 1 | NO (TRONDHEIM) | coordinator | 212 225.60 |
'The advent of cloud computing comprises a new paradigm that entails a large number of low-end processors that perform parallel processing of (usually data-intensive and resource-demanding) computing jobs. This is particularly appealing for a wide variety of data-intensive applications, including scientific data management, analysis and mining of complex data, and internet-scale service provisioning. Nevertheless, processing complex queries in the cloud is still in a primitive stage, largely relying on the concept of map-reduce, without inherent indexing mechanisms that can boost the performance of query processing. In contrast, database research prides itself on scalable techniques for advanced query processing.
CloudIX intends to transfer knowledge from data management technology to the cloud infrastructure, in order to support advanced query processing. To this end, CloudIX aims to propose a generic, distributed, dynamic and adaptive indexing framework for handling multidimensional data in the cloud. Capitalizing on this indexing infrastructure, CloudIX will present novel efficient algorithms for processing advanced query types, such as range, nearest-neighbor, top-k, rank joins, and skyline queries. Furthermore, CloudIX will propose new economical cost models that need to be tightly integrated in the process of query optimization.
The research results of CloudIX are expected to act as a catalyst that attracts applications that rely on advanced query processing to be moved to the cloud. As a showcase scenario, two applications will be deployed on top of CloudIX, one related to scientific data management, to demonstrate the feasibility and efficiency of the proposed framework.'
Cloud computing has revolutionised the landscape of the information technology (IT) world with affordable computing resources. An EU-funded project has developed tools that selectively inspect only the most useful data in cloud data sets.
Computer users increasingly need ways to store vast amounts of data. Larger hard drives meet some of these needs, but there is a growing trend towards saving data on off-site storage systems. Within just a few years, companies have switched from their own hardware to such third-party cloud services.
The advent of cloud infrastructures has also made it feasible to analyse massive data sets with parallel processing integrated into the new virtual environment. The 'Cloud-based indexing and query processing' (CLOUDIX) project (http://research.idi.ntnu.no/cloudix/) adopted MapReduce to process and generate large data sets. The research conducted during the two-year project significantly increased the performance of MapReduce.
MapReduce is a programming model widely used for special-purpose computations over large amounts of data, such as web request logs. It is also used to derive various kinds of data, including inverted indices. A "map" function is applied to each logical "record" to compute a set of intermediate key/value pairs. A "reduce" function then gathers all values that share the same key and combines them into the derived result.
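The map and reduce steps described above can be illustrated with a minimal single-process sketch in Python that builds an inverted index. It is not CLOUDIX code and does not use any real MapReduce framework: the function names (`map_record`, `reduce_key`, `run_mapreduce`) and the sample documents are assumptions made up for illustration, and the grouping step that a real framework would distribute across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_record(doc_id, text):
    """Map step: emit one (word, doc_id) pair per distinct word in a record."""
    for word in sorted(set(text.lower().split())):
        yield word, doc_id


def reduce_key(word, doc_ids):
    """Reduce step: combine all values of one key into a sorted posting list."""
    return word, sorted(set(doc_ids))


def run_mapreduce(records, mapper, reducer):
    """Hypothetical single-process stand-in for the framework.

    Applies the mapper to every record, groups the intermediate
    pairs by key, then applies the reducer to each group.
    """
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


# Made-up sample records: (document id, document text)
docs = [
    (1, "cloud query processing"),
    (2, "cloud indexing"),
    (3, "query indexing"),
]
inverted_index = run_mapreduce(docs, map_record, reduce_key)
print(inverted_index)
```

Each word ends up mapped to the list of documents that contain it, which is the inverted-index use case mentioned in the text. Note that this plain form reads every input record, whatever the question being asked.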
The CLOUDIX researchers provided mechanisms for accessing only a subset of the input data, rather than scanning all of it, while still producing the same result. Specifically, advanced algorithms support early termination of data processing once enough data has been accessed to produce the correct result. Decisive first steps have also been made towards integrating efficient ranking techniques that sort results according to their relevance.
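The idea of early termination can be sketched with a simple top-k query. The example below is an illustration under stated assumptions, not the CLOUDIX algorithm: it assumes that each data partition is already sorted by descending score (a hypothetical pre-built index), and the function name `top_k_early_termination` and the sample data are invented. Under that assumption, a k-way merge can stop as soon as k results are confirmed, without reading the rest of the input.

```python
import heapq


def top_k_early_termination(partitions, k):
    """Return the k highest-scoring (score, item) pairs and the number of reads.

    Assumes every partition is sorted by descending score, so the best
    unread record overall is always at the head of some partition.
    """
    iterators = [iter(p) for p in partitions]
    heap = []  # max-heap via negated scores: (-score, partition index, item)
    reads = 0

    # Read only the head of each partition to start.
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            reads += 1
            heapq.heappush(heap, (-first[0], idx, first[1]))

    results = []
    while heap and len(results) < k:
        neg_score, idx, item = heapq.heappop(heap)
        results.append((-neg_score, item))
        if len(results) == k:
            break  # early termination: no unread record can score higher
        nxt = next(iterators[idx], None)
        if nxt is not None:
            reads += 1
            heapq.heappush(heap, (-nxt[0], idx, nxt[1]))
    return results, reads


# Made-up sample data: three partitions, each sorted by descending score.
partitions = [
    [(0.9, "a"), (0.5, "b"), (0.1, "c")],
    [(0.8, "d"), (0.7, "e"), (0.2, "f")],
    [(0.6, "g"), (0.4, "h"), (0.3, "i")],
]
top, reads = top_k_early_termination(partitions, k=3)
print(top, reads)
```

In this toy run the three best records are found after reading 5 of the 9 input records. The saving depends entirely on the sorted-partition assumption; without some index or ordering, a full scan would still be needed.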
During the CLOUDIX project, different approaches were combined to address the shortcomings of MapReduce, the most prominent framework for parallel query processing in the cloud. The framework's merits, on the other hand, include scalability, fault tolerance, load balancing and, most importantly, simplicity. The CLOUDIX results, published in peer-reviewed scientific journals, are expected to help scientists and professionals save working hours when analysing large data sets.