CLOUDIX

CloudIX: Cloud-based Indexing and Query Processing

 Coordinator: NORGES TEKNISK-NATURVITENSKAPELIGE UNIVERSITET NTNU

 Organization address: HOGSKOLERINGEN 1
city: TRONDHEIM
postcode: 7491

contact info
Title: Mr.
First name: Øyvin
Last name: Sæther
Phone: +47 73597679

 Coordinator nationality: Norway [NO]
 Total cost: 212,225 €
 EC contribution: 212,225 €
 Programme: FP7-PEOPLE
Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)
 Call code: FP7-PEOPLE-2010-IEF
 Funding scheme: MC-IEF
 Start year: 2011
 Period (year-month-day): 2011-09-01 to 2013-08-31

 Participants

#  Participant                                          Country         Role         EC contrib. [€]
1  NORGES TEKNISK-NATURVITENSKAPELIGE UNIVERSITET NTNU  NO (TRONDHEIM)  coordinator  212,225.60

 Project objective

'The advent of cloud computing comprises a new paradigm that entails a large number of low-end processors that perform parallel processing of (usually data-intensive and resource-demanding) computing jobs. This is particularly appealing for a wide variety of data-intensive applications, including scientific data management, analysis and mining of complex data, and internet-scale service provisioning. Nevertheless, processing complex queries in the cloud is still in a primitive stage, largely relying on the concept of map-reduce, without inherent indexing mechanisms that can boost the performance of query processing. In contrast, database research prides itself on scalable techniques for advanced query processing.

CloudIX intends to transfer knowledge from data management technology to the cloud infrastructure, in order to support advanced query processing. To this end, CloudIX aims to propose a generic, distributed, dynamic and adaptive indexing framework for handling multidimensional data in the cloud. Capitalizing on this indexing infrastructure, CloudIX will present novel efficient algorithms for processing advanced query types, such as range, nearest-neighbor, top-k, rank joins, and skyline queries. Furthermore, CloudIX will propose new economical cost models that need to be tightly integrated in the process of query optimization.

The research results of CloudIX are expected to act as a catalyst that attracts applications that rely on advanced query processing to be moved to the cloud. As a showcase scenario, two applications will be deployed on top of CloudIX, one of them related to scientific data management, to demonstrate the feasibility and efficiency of the proposed framework.'

Introduction (Teaser)

Cloud computing has revolutionised the landscape of the information technology (IT) world with affordable computing resources. An EU-funded project has developed tools that selectively inspect only the most useful data in cloud data sets.

Project description (Article)

Computer users increasingly need ways to store vast amounts of data. Larger hard drives meet some of these needs, but there is a growing trend towards saving data on off-site storage systems. Within just a few years, companies have switched from their own hardware to such third-party cloud services.

The advent of cloud infrastructures has also made it feasible to analyse massive data sets with parallel processing integrated into the new virtual environment. The 'Cloud-based indexing and query processing' (CLOUDIX, http://research.idi.ntnu.no/cloudix/) project adopted MapReduce to process and generate large data sets. The cutting-edge research work conducted during the two-year project significantly increased the performance of MapReduce.

MapReduce is a programming model widely used for special-purpose computations over large amounts of data, such as web request logs. It is also used to derive various kinds of data, including inverted indices. A "map" function is applied to each logical "record" to compute a set of intermediate key/value pairs. Then, a "reduce" function is applied to all values that share the same key, combining the derived data appropriately.
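The map and reduce steps described above can be illustrated with a small single-process sketch that builds an inverted index. It is only meant to show the shape of the programming model, not how any real MapReduce framework or CLOUDIX is implemented; the function names (map_record, shuffle, reduce_postings, build_inverted_index) and the sample records are assumptions made for this illustration.

```python
from collections import defaultdict


def map_record(doc_id, text):
    """Map step: emit one (word, doc_id) pair per distinct word in a record."""
    for word in set(text.lower().split()):
        yield word, doc_id


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_postings(word, doc_ids):
    """Reduce step: combine all values sharing a key into a sorted posting list."""
    return word, sorted(doc_ids)


def build_inverted_index(records):
    """Run map, shuffle and reduce sequentially over a dict of doc_id -> text."""
    intermediate = [
        pair
        for doc_id, text in records.items()
        for pair in map_record(doc_id, text)
    ]
    grouped = shuffle(intermediate)
    return dict(reduce_postings(word, ids) for word, ids in grouped.items())


# Hypothetical sample input: three tiny "documents".
records = {
    1: "cloud query processing",
    2: "cloud indexing",
    3: "query indexing in the cloud",
}
index = build_inverted_index(records)
```

In a distributed setting the map calls would run in parallel on different machines and the grouping by key would be performed by the framework; here everything runs in one process so that the data flow is easy to follow.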

The CLOUDIX researchers provided mechanisms that access only a subset of the input data, rather than scanning all of it, while still producing the same result. Specifically, advanced algorithms support early termination: processing stops as soon as enough data has been accessed to produce the correct result. Decisive first steps have also been made towards integrating efficient ranking techniques that sort results according to their relevance.
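The idea of early termination can be sketched with a simple top-k query. This is not the CLOUDIX algorithm, which the text does not describe in detail; it is a minimal illustration under the assumption that every data partition is already sorted by descending score (for example, thanks to an index). The function name top_k_early_termination and the sample partitions are hypothetical.

```python
import heapq
from itertools import islice


def top_k_early_termination(sorted_partitions, k):
    """Return the k highest-scoring records and the number of records read.

    Assumption: each partition is a sequence of (score, record_id) tuples
    already sorted by descending score.
    """
    accessed = 0

    def counted(partition):
        # Wrap a partition so that every record actually read is counted.
        nonlocal accessed
        for record in partition:
            accessed += 1
            yield record

    # heapq.merge is lazy: it pulls records from the partitions on demand,
    # so reading stops once islice has received k results.
    merged = heapq.merge(
        *(counted(p) for p in sorted_partitions),
        key=lambda rec: rec[0],
        reverse=True,
    )
    return list(islice(merged, k)), accessed


# Hypothetical sample data: three partitions, nine records in total.
partitions = [
    [(0.9, "a"), (0.5, "b"), (0.1, "c")],
    [(0.8, "d"), (0.7, "e"), (0.2, "f")],
    [(0.6, "g"), (0.4, "h"), (0.3, "i")],
]
result, accessed = top_k_early_termination(partitions, k=3)
total = sum(len(p) for p in partitions)
```

Because the partitions are sorted, the three best records are found after reading only part of the input, and the answer matches what a full scan followed by a sort would return.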

During the CLOUDIX project, different approaches were combined to address the shortcomings of MapReduce, the most prominent framework for parallel query processing in the cloud. Its merits, on the other hand, include scalability, fault tolerance, load balancing and, most importantly, simplicity. The CLOUDIX results, published in peer-reviewed scientific journals, are expected to help scientists and professionals save working hours when analysing large data sets.
