HELENLP

Heterogeneous Learning for Natural Language Processing

 Coordinator TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY 

 Organization address: TECHNION CITY - SENATE BUILDING
city: HAIFA
postcode: 32000

contact info
Title: Mr.
First name: Mark
Surname: Davison
Phone: +972 4 829 4854
Fax: +972 4 823 2958

 Coordinator nationality Israel [IL]
 Total cost 100,000 €
 EC contribution 100,000 €
 Programme FP7-PEOPLE
Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)
 Call code FP7-PEOPLE-2009-RG
 Funding Scheme MC-IRG
 Start year 2010
 Period (year-month-day) 2010-12-01   -   2014-11-30

 Participants

# participant  country  role  EC contrib. [€] 
1  TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY  IL (HAIFA)  coordinator  100,000.00




 Project objective

A major challenge in machine learning and artificial intelligence is to reduce the dependency on full, direct supervision and to learn from various undirected resources as well. Most successful machine-learning systems require some amount of human supervision. Currently, a dominant paradigm for building a statistical parser, for example, is to first have human annotators manually parse a large number of sentences, and then use the parsed sentences to learn the parameters of the parsing system. A parser built using the Penn Treebank, a large corpus of parsed sentences from the Wall Street Journal, is expected to parse newswire text well, but not e-mails, which are different in nature. Yet one would like to employ all the data available from various resources, genres and types to build either a general system or a system adapted to a particular task. The goal of the proposed project is to design new paradigms for large-scale learning of natural language problems in various languages from heterogeneous data sources that vary in size, quality, amount of supervision and type. Our primary objective is to develop theory, design and analyze algorithms, and build systems for processing written and spoken natural language. Furthermore, the World Wide Web and similar available resources contain huge, heterogeneous collections of data. I propose to use this heterogeneous data, together with the tools I will develop, to build statistical automated systems for various natural language processing tasks, with applications ranging from automatic document classification, through a full range of information extraction tasks, to speech analysis and recognition.

Other projects from the same programme (FP7-PEOPLE)

SAMUL-NANO-HEP (2014)

Self-Assembling Multivalent Biodegradable Ligands for Nanoscale Heparin Binding

BINOS (2012)

Bistable Nano-Objects on Surfaces

GEOMCRITRAND (2011)

The geometry of critical random and pseudorandom systems