HEISENDATA

HeisenData - Towards a Next-Generation Uncertain-Data Management System

 Coordinator: THE RESEARCH COMMITTEE OF THE TECHNICAL UNIVERSITY OF CRETE

 Organization address: BUILDING E4 CAMPUS KONOUPIDIANA
city: CHANIA
postcode: 73132

contact info
Title: Prof.
First name: Nikolaos
Last name: Varotsis
Telephone: -65183
Fax: -56597

 Coordinator nationality: Greece [EL]
 Total cost: 100,000 €
 EC contribution: 100,000 €
 Programme: FP7-PEOPLE
Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)
 Call code: FP7-PEOPLE-2009-RG
 Funding scheme: MC-IRG
 Start year: 2010
 Period (year-month-day): 2010-03-01 - 2014-02-28

 Participants

#  participant  country  role  EC contrib. [€]
1  THE RESEARCH COMMITTEE OF THE TECHNICAL UNIVERSITY OF CRETE  EL (CHANIA)  coordinator  100,000.00


 Project objective

Several real-world applications need to manage and reason about large amounts of data that are inherently uncertain. For instance, pervasive computing applications must constantly reason about volumes of noisy sensory readings, e.g., for motion prediction and human behavior modeling; information-extraction tools can assign different possible labels with varying degrees of confidence to segments of text, due to the uncertainties and noise present in free-text data. Such probabilistic data analyses require sophisticated machine-learning tools that can effectively model the complex correlation patterns present in real-life data. Unfortunately, to date, approaches to Probabilistic Database Systems (PDBSs) have relied on somewhat simplistic models of uncertainty that can be easily mapped onto existing relational architectures: probabilities are typically associated with individual data tuples, with little or no support for capturing data correlations.

This research proposal aims to design and build a novel, extensible PDBS that supports a broad class of statistical models and probabilistic-reasoning tools as first-class system objects, alongside a traditional relational-table store. Our proposed architecture will employ statistical models to effectively encode data-correlation patterns, and promote probabilistic inference as part of the standard database operator repertoire to support efficient and sound query processing. This tight coupling of relational databases and statistical models represents a major departure from conventional database systems, and many of the core system components need to be revisited and fundamentally re-thought. The proposed research will attack several of the key challenges arising in this novel PDBS paradigm (including query processing, query optimization, data summarization, extensibility, and model learning and evolution), build usable prototypes, and investigate key application domains (e.g., information extraction).
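The limitation of per-tuple probabilities described in the objective can be illustrated with a small sketch. It is not taken from the project: the function names and the toy joint distribution over two uncertain tuples are assumptions made purely for illustration. It compares the answer to "are both tuples present?" computed from marginal probabilities alone (the tuple-independent assumption) with the answer computed from the full joint distribution.

```python
def marginals(joint):
    """Return the per-tuple presence probabilities implied by a joint distribution.

    `joint` maps (a_present, b_present) -> probability (hypothetical toy format).
    """
    p_a = sum(p for (a, _), p in joint.items() if a)
    p_b = sum(p for (_, b), p in joint.items() if b)
    return p_a, p_b


def prob_both_independent(p_a, p_b):
    """Probability that both tuples exist, assuming they are independent."""
    return p_a * p_b


def prob_both_joint(joint):
    """Probability that both tuples exist, read directly from the joint."""
    return joint[(True, True)]


# Toy example: two positively correlated sensor readings (made-up numbers).
joint = {
    (True, True): 0.4,
    (True, False): 0.1,
    (False, True): 0.1,
    (False, False): 0.4,
}

p_a, p_b = marginals(joint)
independent_estimate = prob_both_independent(p_a, p_b)
correlated_answer = prob_both_joint(joint)
print(f"marginals: {p_a:.2f}, {p_b:.2f}")
print(f"independence assumption: {independent_estimate:.2f}")
print(f"using the joint model:   {correlated_answer:.2f}")
```

Both tuples have the same marginal probability, yet the independence assumption underestimates the probability of the conjunctive query because it discards the correlation between them.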

Introduction (Teaser)

An EU team developed data systems that use statistical and probabilistic reasoning to reduce uncertainty. The project helped to unify such methods with conventional databases, in part by developing scalable algorithms and a variety of new tools.

Project description (Article)

Various software applications must manage and make decisions using data with high levels of uncertainty. While certain tools can fill in the gaps to some degree, such tools are generally simplistic and limited.

The EU-funded project 'HeisenData - towards a next-generation uncertain-data management system' (HEISENDATA, http://heisendata.softnet.tuc.gr/) aimed to improve matters. The team planned to design and build new probabilistic database systems (PDBSs) that support statistical models and probabilistic reasoning in addition to conventional database structures. The project intended to address the challenges involved in supporting such a novel union, including the redesign of key system components. HEISENDATA ran for four years to February 2014.

Project work covered three main branches: new probabilistic data synopses for query optimisation, new PDBS algorithms and architectures, and scalable algorithms and tools.

The data-synopsis work involved defining and implementing algorithms for building histograms. For various error metrics, the new algorithms constructed optimal or near-optimal histograms and wavelet synopses. Further work introduced probabilistic histograms, which allowed a more accurate representation of the data's uncertainty characteristics.
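As a point of reference for what "optimal histogram" means, here is a minimal sketch of the classic dynamic-programming construction of a V-optimal histogram, which minimises the sum of squared errors for a fixed number of buckets. It operates on ordinary deterministic values and is not the project's algorithm for probabilistic data; the function name `v_optimal_histogram` and the sample data are assumptions for illustration.

```python
def v_optimal_histogram(values, num_buckets):
    """Split `values` into contiguous buckets minimising total squared error.

    Returns (total_error, buckets), where each bucket is (start, end, mean)
    covering values[start:end].
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums of values and squared values give O(1) bucket error.
    ps = [0.0] * (n + 1)
    pss = [0.0] * (n + 1)
    for i, v in enumerate(values):
        ps[i + 1] = ps[i] + v
        pss[i + 1] = pss[i] + v * v

    def sse(i, j):
        """Squared error of representing values[i:j] by their mean."""
        s = ps[j] - ps[i]
        return (pss[j] - pss[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[b][j]: minimum error covering the first j values with b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    split = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = cost[b - 1][i] + sse(i, j)
                if candidate < cost[b][j]:
                    cost[b][j] = candidate
                    split[b][j] = i

    # Walk the split table backwards to recover bucket boundaries.
    buckets = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = split[b][j]
        buckets.append((i, j, (ps[j] - ps[i]) / (j - i)))
        j = i
    buckets.reverse()
    return cost[num_buckets][n], buckets


error, buckets = v_optimal_histogram([1, 1, 1, 10, 10, 10], 2)
print(error, buckets)
```

The triple loop runs in O(B·n²) time for B buckets over n values. Swapping `sse` for another bucket-error function is the usual way to target a different error metric, provided the total error remains a sum over buckets.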

Additionally, the team addressed problems related to unstructured text containing units of structured information. The solutions extended a leading information extraction (IE) model by developing two query approaches. The efficiency and effectiveness of the approaches were compared using real-life data sets. The result was a set of rules for choosing appropriate inference algorithms under various conditions, yielding up to 10-fold speed improvements.

The project also devised a framework for scaling any generic entity resolution algorithm, and demonstrated the framework's effectiveness. Further work helped to integrate the IE pipeline with probabilistic query processing.
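The article does not say how the scaling framework works. One common, generic way to scale a pairwise entity resolution algorithm is blocking: records are grouped by a cheap key and the expensive matcher runs only within each group. The sketch below shows that general idea under this assumption; `resolve_with_blocking`, the blocking key and the matcher are hypothetical and do not describe the project's framework.

```python
from collections import defaultdict
from itertools import combinations


def resolve_with_blocking(records, block_key, is_match):
    """Run a pairwise matcher only on records that share a blocking key.

    Returns (matches, comparisons): matched index pairs and the number of
    matcher calls actually made.
    """
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[block_key(rec)].append(idx)

    matches = []
    comparisons = 0
    for members in blocks.values():
        for i, j in combinations(members, 2):
            comparisons += 1
            if is_match(records[i], records[j]):
                matches.append((i, j))
    return matches, comparisons


def normalise(name):
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


# Made-up records; any matcher with the same signature could be plugged in.
records = ["Alice Smith", "alice smith", "Bob Jones", "Bob  Jones", "Carol White"]
matches, comparisons = resolve_with_blocking(
    records,
    block_key=lambda r: normalise(r)[0],               # first letter as block
    is_match=lambda a, b: normalise(a) == normalise(b),
)
all_pairs = len(records) * (len(records) - 1) // 2
print(matches, f"{comparisons} comparisons instead of {all_pairs}")
```

The matcher is passed in as a function, so the wrapper is independent of the specific resolution algorithm. The trade-off is that true matches falling into different blocks are missed, so the blocking key must be chosen with care.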

HEISENDATA found new statistical methods for processing data with high uncertainties, and integrated the methods into conventional database structures. The work addressed a topic of interest to the academic and commercial sectors.
