MULTILEX

Multilingual Lexicon Extraction from Comparable Corpora

Coordinator	JOHANNES GUTENBERG UNIVERSITAET MAINZ. Address: SAARSTRASSE 21, 55099 MAINZ. Contact: Dr. Sascha Hofmann, phone +49 7274 508 35111, fax +49 7274 508 35412
Coordinator country	Germany [DE]
Total cost	100,000 €
EC contribution	100,000 €
Programme	FP7-PEOPLE: Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)
Call code	FP7-PEOPLE-2013-CIG
Funding Scheme	MC-CIG
Start year	2014
Period (year-month-day)	2014-09-01 - 2018-08-31

Participants

#	Participant	Country	Role	EC contribution [€]
1	JOHANNES GUTENBERG UNIVERSITAET MAINZ	DE (MAINZ)	coordinator	100,000.00

Word cloud

Explore the word cloud to get a rough idea of the project.

of words translated human languages translations texts acquisition or corpora alignments multiword parallel word cross language

Project objective

Given large collections of parallel (i.e. translated) texts, it is well known how to establish correspondences between words across languages by successively applying a sentence-alignment step and a word-alignment step. However, parallel texts are a scarce resource for most language pairs involving lesser-used languages. Human second language acquisition, on the other hand, does not seem to require exposure to large amounts of translated text, which indicates that there must be another way of crossing the language barrier. Apparently, human capabilities are based on comparable resources, i.e. texts or speech on related topics in different languages which are not translations of each other. Comparable (written or spoken) corpora are far more common than parallel corpora and thus offer the chance to overcome the data acquisition bottleneck.

Despite this cognitive motivation, the proposed project will not attempt to simulate the complexities of human second language acquisition. Instead, it will show that information on word and multiword translations can be extracted automatically from comparable corpora by purely technical means. The aim is to push the boundaries of current approaches, which typically utilize correlations between co-occurrence patterns across languages, in several ways:

1) Eliminating the need for initial lexicons by using a bootstrapping approach which requires only a few seed translations.
2) Implementing a new methodology which first establishes alignments between comparable documents across languages, and then computes cross-lingual alignments between words and multiword units.
3) Improving the quality of computed word translations by applying an interlingua approach which, by relying on several pivot languages, allows a highly effective multi-dimensional cross-check.
4) Showing that, by looking at foreign citations, translations can even be derived from a single monolingual text corpus.
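
The co-occurrence-based baseline that the objective describes as typical of current approaches can be illustrated roughly as follows. This is a minimal sketch and not the project's own method: the toy corpora, the seed dictionary, the window size, and all function names are assumptions made for illustration. Each word is represented by the seed words it co-occurs with, and a source word is matched to the target word whose co-occurrence pattern is most similar.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def project(vector, seeds):
    """Keep only context words that have a seed translation, renamed to the target language."""
    projected = Counter()
    for context_word, count in vector.items():
        if context_word in seeds:
            projected[seeds[context_word]] += count
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_translations(source_word, src_vectors, tgt_vectors, seeds):
    """Rank target words that are not seed translations by similarity to `source_word`."""
    query = project(src_vectors[source_word], seeds)
    seed_targets = set(seeds.values())
    scores = []
    for word, vector in tgt_vectors.items():
        if word in seed_targets:
            continue
        # Compare only along the dimensions covered by the seed dictionary.
        restricted = Counter({k: v for k, v in vector.items() if k in seed_targets})
        scores.append((cosine(query, restricted), word))
    return sorted(scores, reverse=True)


# Hypothetical comparable (not parallel) toy corpora: related topics, no sentence alignment.
german = [["hund", "trinkt", "wasser"], ["lehrer", "liest", "buch"], ["hund", "trinkt", "milch"]]
english = [["dog", "drinks", "water"], ["teacher", "reads", "book"], ["teacher", "drinks", "water"]]

# Hypothetical seed translations (assumed, not taken from the project).
seeds = {"trinkt": "drinks", "wasser": "water", "liest": "reads", "buch": "book"}

src_vectors = cooccurrence_vectors(german)
tgt_vectors = cooccurrence_vectors(english)

best_for_hund = rank_translations("hund", src_vectors, tgt_vectors, seeds)[0][1]
best_for_lehrer = rank_translations("lehrer", src_vectors, tgt_vectors, seeds)[0][1]
print(best_for_hund, best_for_lehrer)
```

In this sketch the seed dictionary defines the shared dimensions of the two vector spaces, which is why the approach depends on an initial lexicon. The first aim listed above targets that dependency by starting from only a few seeds.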

Other projects from the same programme (FP7-PEOPLE)