HYGHTRA

A Hybrid High Quality Translation System

 Coordinator UNIVERSITY OF LEEDS

 Organization address: WOODHOUSE LANE
city: LEEDS
postcode: LS2 9JT

contact info
Title: Ms.
First name: Keri
Surname: Dunning
Phone: +44 113 343 7690
Fax: +44 113 343 4058

 Coordinator nationality United Kingdom [UK]
 Total cost 833,193 €
 EC contribution 833,193 €
 Programme FP7-PEOPLE
Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)
 Call code FP7-PEOPLE-2009-IAPP
 Funding scheme MC-IAPP
 Start year 2010
 Period (year-month-day) 2010-12-01 - 2014-11-30

 Participants

#    participant    country    role    EC contribution [€]
1    UNIVERSITY OF LEEDS    UK (LEEDS)    coordinator    571,811.00

 Organization address: WOODHOUSE LANE
city: LEEDS
postcode: LS2 9JT

contact info
Title: Ms.
First name: Keri
Surname: Dunning
Phone: +44 113 343 7690
Fax: +44 113 343 4058

2    LINGENIO GMBH    DE (HEIDELBERG)    participant    261,382.00

 Organization address: KARLSRUHER STRASSE 10
city: HEIDELBERG
postcode: 69126

contact info
Title: Dr.
First name: Kurt
Surname: Eberle
Phone: +49 6221 189827
Fax: +49 6221 619766

 Project objective

'For roughly a decade, statistical machine translation (SMT) has predominated in academic research. However, most commercial MT suppliers continue to offer systems based on more traditional rule-based architectures (RBMT). The difficulty of replacing the translation engine in an existing product set-up may explain part of this discrepancy. The main reasons, however, are that RBMT offers a range of functions that SMT does not provide, including human-readable, fully worked-out 'conventional' dictionaries, and that RBMT quality is still higher for a number of language pairs.

SMT needs huge bilingual text corpora to compute satisfactory translation models, and it is inherently weak when dealing with rare data and non-local phenomena. Its advantages are low cost and robustness. The main disadvantages of RBMT are high cost and shortcomings with respect to resolving structural and lexical ambiguities.

We propose a hybrid architecture for high-quality machine translation which combines the strengths of both approaches and minimizes their weaknesses. At the core is a rule-based MT system which provides morphology, declarative grammars, semantic categories, and small dictionaries, but which avoids all expensive kinds of intellectual knowledge acquisition. Instead of manually working out large dictionaries and compiling information on disambiguation preferences, we suggest a novel corpus-based bootstrapping method for automatically expanding dictionaries, and for training both the analysis component and the selection among transfer alternatives.

As bilingual corpora with good literal translations are a scarce resource, we focus in particular on exploiting comparable monolingual corpora. We locate unknown words and expressions, and then use a statistically tuned analysis component in combination with similarity assumptions to identify relations across languages. This approach should make it possible to overcome the data acquisition bottleneck of conventional SMT.'
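
The idea of expanding a dictionary from comparable monolingual corpora can be illustrated with a minimal sketch. This is not the project's method, which the objective does not specify in detail. It assumes the classic context-vector approach: a word and its translation tend to occur near words that are themselves translations of each other. The function names, the toy German and English sentences, and the seed dictionary are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def propose_translations(unknown, src_vecs, tgt_vecs, seed, top_n=1):
    """Rank target words not yet in the seed dictionary as translations of `unknown`."""
    # Project the source context vector into the target vocabulary
    # using the small seed dictionary (assumed to exist beforehand).
    projected = Counter()
    for ctx_word, count in src_vecs.get(unknown, {}).items():
        if ctx_word in seed:
            projected[seed[ctx_word]] += count
    known_targets = set(seed.values())
    scored = [(cosine(projected, vec), word)
              for word, vec in tgt_vecs.items() if word not in known_targets]
    scored.sort(reverse=True)
    return [word for score, word in scored[:top_n] if score > 0]


# Toy comparable corpora (hypothetical data, not parallel text by assumption).
german = [["der", "hund", "bellt", "laut"],
          ["der", "hund", "frisst", "fleisch"],
          ["die", "katze", "frisst", "fisch"]]
english = [["the", "dog", "barks", "loudly"],
           ["the", "dog", "eats", "meat"],
           ["the", "cat", "eats", "fish"]]
# Small seed dictionary; "hund" and "katze" are the unknown words.
seed = {"der": "the", "die": "the", "bellt": "barks", "laut": "loudly",
        "frisst": "eats", "fleisch": "meat", "fisch": "fish"}

src_vecs = context_vectors(german)
tgt_vecs = context_vectors(english)
print(propose_translations("hund", src_vecs, tgt_vecs, seed))   # ['dog']
print(propose_translations("katze", src_vecs, tgt_vecs, seed))  # ['cat']
```

Accepted pairs could be added to the seed dictionary and the procedure repeated, which is one simple reading of "bootstrapping". A realistic system would also need lemmatization, frequency weighting, and much larger corpora.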
