Opendata, web and dolomites


MMT will deliver a language independent commercial online translation service based on a new open-source machine translation distributed architecture

Project "MMT" data sheet

The following table provides information about the project.


Organization address
address: PIAZZA CITERA 1
postcode: 40
website: n.a.


 Coordinator Country Italy [IT]
 Project website
 Total cost 3˙695˙200 €
 EC max contribution 2˙994˙700 € (81%)
 Programme 1. H2020-EU. (Content technologies and information management: ICT for digital content, cultural and creative industries)
 Code Call H2020-ICT-2014-1
 Funding Scheme IA
 Starting year 2015
 Duration (year-month-day) from 2015-01-01   to  2017-12-31


Take a look at the project's partnership.

#  participant                  country (city)  role         EC contrib. [€]
1  TRANSLATED SRL               IT (POMEZIA)    coordinator  1˙004˙500.00
2  FONDAZIONE BRUNO KESSLER     IT (TRENTO)     participant    757˙500.00
3  TAUS BV                      NL (DE RIJP)    participant    630˙000.00
4  THE UNIVERSITY OF EDINBURGH  UK (EDINBURGH)  participant    602˙700.00
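
The funding figures on this page can be cross-checked with a few lines of arithmetic. The sketch below uses only the amounts listed in the data sheet and the partnership table; the variable names are illustrative and not part of any CORDIS format.

```python
# Amounts in euros, copied from the data sheet and the partnership table.
total_cost = 3_695_200
ec_max_contribution = 2_994_700
partner_contributions = {
    "TRANSLATED SRL": 1_004_500,
    "FONDAZIONE BRUNO KESSLER": 757_500,
    "TAUS BV": 630_000,
    "THE UNIVERSITY OF EDINBURGH": 602_700,
}

# The partner contributions should add up to the EC max contribution.
partners_sum = sum(partner_contributions.values())

# EC share of the total cost, rounded to a whole percent as on the data sheet.
ec_share_percent = round(100 * ec_max_contribution / total_cost)

print(f"Sum of partner contributions: {partners_sum} EUR")
print(f"EC share of total cost: {ec_share_percent}%")

# Each partner's share of the EC contribution, one decimal place.
for name, amount in partner_contributions.items():
    print(f"{name}: {100 * amount / ec_max_contribution:.1f}%")
```

The partner amounts sum to the stated EC max contribution, and the ratio to the total cost rounds to the 81% shown in the data sheet.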


 Project objective

The goal of MMT is to deliver a language-independent commercial online translation service based on a new open-source distributed machine translation architecture.

MMT does not require any initial training phase. Once fed with training data, MMT will be ready to translate. MMT will de facto merge translation memory and machine translation technology into a single product. Translation quality will increase as soon as new training data are added.

MMT manages context automatically, so it will not require building domain-specific systems. MMT will provide the best translation quality for any topic or domain by storing training segments together with context linking information.
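
To make the idea of storing segments together with context information concrete, here is a toy sketch. It is not MMT's actual implementation: the `Segment` class, the bag-of-words overlap score, and the example data are all assumptions chosen for illustration. It shows how a stored segment could be selected according to the context of the text being translated, without building a separate system per domain.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """A stored translation pair plus words describing where it came from."""
    source: str
    target: str
    context: Counter  # hypothetical context representation: a bag of words


def bag(text: str) -> Counter:
    """Lowercased bag of whitespace-separated words."""
    return Counter(text.lower().split())


def context_overlap(a: Counter, b: Counter) -> int:
    """Number of word occurrences shared by two bags."""
    return sum((a & b).values())


def best_match(source: str, context_text: str, memory: list[Segment]):
    """Return the stored segment for `source` whose context best fits."""
    query = bag(context_text)
    candidates = [s for s in memory if s.source == source]
    if not candidates:
        return None
    return max(candidates, key=lambda s: context_overlap(s.context, query))


# Hypothetical English-Italian memory with one ambiguous source segment.
memory = [
    Segment("the bank", "la banca", bag("loan interest account money")),
    Segment("the bank", "la riva", bag("river water shore fishing")),
]

match = best_match("the bank", "we walked along the river", memory)
print(match.target)
```

With this toy data, the river-related context selects the segment stored with river-related words. A real system would use far richer context signals than word overlap.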

MMT enables scalability of data and users, so that expensive ad-hoc hardware installations are no longer needed. The MMT architecture will support high performance and linear scalability up to thousands of nodes. The same software will work to set up a personal translation system or to create a web-based service on a cluster of commodity nodes able to handle terabytes of data and millions of users.

MMT will create a data collection infrastructure that accelerates the process of filling the data gap between large IT companies and the MT industry. MMT will leverage the data crawled on the web by Common Crawl, TAUS, Translated’s MyMemory and Matecat data and facilities to set up a processing pipeline that will create unprecedented amounts of clean parallel and monolingual data to develop machine translation systems.


List of deliverables.
Open source distribution Other 2019-05-30 13:29:06
First Design and Specifications Report Documents, reports 2019-05-30 13:29:02
Third Report on Database and MT Infrastructure Documents, reports 2019-05-30 12:21:27
First Evaluation Plan Report Documents, reports 2019-05-30 12:21:21
First Report on Database and MT Infrastructure Documents, reports 2019-05-30 12:21:27
Second Report on Database and MT Infrastructure Documents, reports 2019-05-30 12:21:25
Second Design and Specifications Report Documents, reports 2019-05-30 12:21:15
First Report on Data Supply Documents, reports 2019-05-30 12:21:05
Data Management Plan Documents, reports 2019-05-30 12:21:04
Second Report on Data Supply Documents, reports 2019-05-30 12:21:16
Second Evaluation Plan Report Documents, reports 2019-05-30 12:21:18
Report on Data Repository Documents, reports 2019-05-30 12:21:24
Live Integration with TAUS and MyMemory Other 2019-05-30 12:21:22
Legal framework for the usage of data collected Documents, reports 2019-05-30 13:29:04
First Technology Assessment Report Documents, reports 2019-05-30 12:21:22
Open Source Installer, Billion words system with APIs & live interface Other 2019-05-30 12:21:14
Second Technology Assessment Report Documents, reports 2019-05-30 12:21:13

Take a look at the deliverables list in detail: detailed list of MMT deliverables.


List of publications.
year, authors and title, journal, last update
2017 M. Amin Farajian, Marco Turchi, Matteo Negri, Nicola Bertoldi, and Marcello Federico
Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario
published pages: , ISSN: , DOI:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers 2019-05-30
2017 Surafel Melaku Lakew, Quintino Francesco Lotito, Marco Turchi, Matteo Negri, and Marcello Federico
Improving Zero-Shot Translation of Low-Resource Languages
published pages: , ISSN: , DOI:
Proceedings of the 14th International Workshop on Spoken Language Translation 2019-05-30
2017 Mattia Antonino Di Gangi and Marcello Federico
Monolingual Embeddings for Low Resourced Neural Machine Translation
published pages: , ISSN: , DOI:
Proceedings of the 14th International Workshop on Spoken Language Translation 2019-05-30
2016 A. Ruopp
The Reasonable Effectiveness of Data
published pages: 123-142, ISSN: , DOI:
Proceedings of 12th Conference of the Association for Machine Translation in the Americas 2019-05-30
2016 U. Germann, E. Barbu, L. Bentivogli, N. Bertoldi, N. Bogoychev, C. Buck, D. Caroselli, L. Carvalho, A. Cattelan, R. Cattoni, M. Cettolo, M. Federico, B. Haddow, D. Madl, L. Mastrostefano, P. Mathur, A. Ruopp, A. Samiotou, V. Sudharshan, M. Trombetti, J. van der Meer
Modern MT: A New Open-Source Machine Translation Platform for the Translation Industry
published pages: 397–397, ISSN: , DOI:
Baltic Journal of Modern Computing Vol. 4, No. 2 2019-05-30
2016 M. J. Sabet, M. Negri, Marco Turchi, E. Barbu
An Unsupervised Method for Automatic Translation Memory Cleaning
published pages: 287–292, ISSN: , DOI:
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics August 7-12, 2016 2019-05-30
2017 Rico Sennrich, Alexandra Birch, Anna Currey, Ulrich Germann, Barry Haddow, Kenneth Heafield, Antonio Valerio Miceli Barone, and Philip Williams
The University of Edinburgh's Neural MT systems for WMT17
published pages: , ISSN: , DOI:
Proceedings of the Second Conference on Machine Translation 2019-05-30
2017 Mattia Antonino Di Gangi and Marcello Federico
Can Monolingual Embeddings Improve Neural Machine Translation?
published pages: , ISSN: , DOI:
Proceedings of the 4th Italian Conference on Computational Linguistics (CLIC-IT) 2019-05-30
2016 M. J. Sabet, M. Negri, M. Turchi, J. G. C. de Souza, M. Federico
TMOP: A Tool for Unsupervised Translation Memory Cleaning
published pages: 49-54, ISSN: , DOI:
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics System Demonstrations, August 7 2019-05-30
2017 Surafel Melaku Lakew, Quintino Francesco Lotito, Marco Turchi, Matteo Negri, and Marcello Federico
FBK’s Multilingual Neural Machine Translation System for IWSLT 2017
published pages: , ISSN: , DOI:
Proceedings of the 14th International Workshop on Spoken Language Translation 2019-05-30
2017 M. A. Farajian, Marco Turchi, Matteo Negri and Marcello Federico
Multi-Domain Neural Machine Translation through Unsupervised Adaptation
published pages: , ISSN: , DOI:
Proceedings of the 2nd Conference on Machine Translation (WMT17), Volume 1: Research Papers 2019-05-30
2016 M. Federico
MT Adaptation from TMs in ModernMT
published pages: 19-57, ISSN: , DOI:
Proceedings of 12th Conference of the Association for Machine Translation in the Americas vol. 2: MT Users' Track 2019-05-30
2017 Marcin Junczys-Dowmunt, Roman Grundkiewicz
An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing
published pages: , ISSN: , DOI:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers) 2019-05-30
2017 Rajen Chatterjee, Matteo Negri, Marco Turchi, Marcello Federico, Lucia Specia, and Frédéric Blain
Guiding Neural Machine Translation Decoding with External Knowledge
published pages: , ISSN: , DOI:
Proceedings of the 2nd Conference on Machine Translation (WMT17), Volume 1: Research Papers 2019-05-30
2016 Nikolay Bogoychev and Hieu Hoang
Fast and Highly Parallelizable PhraseTable for Statistical Machine Translation
published pages: 102–109, ISSN: , DOI:
Proceedings of the First Conference on Machine Translation Vol. 1 2019-05-30
2017 Mattia A. Di Gangi, Nicola Bertoldi and Marcello Federico
FBK’s Participation to the English-to-German News Translation Task of WMT 2017
published pages: , ISSN: , DOI:
Proceedings of the 2nd Conference on Machine Translation (WMT17), Volume 2: Shared Task Papers 2019-05-30
2016 L. Bentivogli, M. Cettolo, M. A. Farajian, M. Federico
WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation of Rare Words
published pages: 3535-3542, ISSN: , DOI:
Proceedings of the 10th Language Resources and Evaluation Conference 2019-05-30
2016 L. Bentivogli, A. Bisazza, M. Cettolo, M. Federico
Neural versus Phrase-Based Machine Translation Quality: a Case Study
published pages: 257-267, ISSN: , DOI:
Proceedings of Conference on Empirical Methods in Natural Language Processing November 1-5, 2016 2019-05-30
2015 Ulrich Germann
Sampling Phrase Tables for the Moses Statistical Machine Translation System
published pages: 39-50, ISSN: 0032-6585, DOI: 10.1515/pralin-2015-0012
The Prague Bulletin of Mathematical Linguistics No. 104, October 2015 2019-05-30
2017 Matteo Negri, Duygu Ataman, Masoud Jalili Sabet, Marco Turchi, Marcello Federico
Automatic translation memory cleaning
published pages: 93-115, ISSN: 0922-6567, DOI: 10.1007/s10590-017-9191-5
Machine Translation 31/3 2019-05-30
2017 Nicola Bertoldi, Roldano Cattoni, Mauro Cettolo, Amin Farajian, Marcello Federico, Davide Caroselli, Luca Mastrostefano, Andrea Rossi, Marco Trombetti, Ulrich Germann, and David Madl
MMT: New Open Source MT for the Translation Industry
published pages: , ISSN: , DOI:
Proceedings of the 20th Annual Conference of the European Association for Machine Translation 2019-05-30
2017 Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, and Rico Sennrich
Regularization techniques for fine-tuning in neural machine translation
published pages: , ISSN: , DOI:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2019-05-30
2016 Nikolay Bogoychev and Adam Lopez
N-gram Language Models for Massively Parallel Devices
published pages: , ISSN: , DOI:
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics 2019-05-30
2016 M. A. Farajian, R. Chatterjee, C. Conforti, S. Jalalvand, V. Balaraman, M. A. Di Gangi, D. Ataman, M. Turchi, M. Negri, M. Federico
FBK's Neural Machine Translation Systems for IWSLT 2016
published pages: , ISSN: , DOI:
Proceedings of the 13th Workshop on Spoken Language Translation December 8-9, 2016 2019-05-30
2017 Surafel Melaku Lakew, Mattia Antonino Di Gangi, and Marcello Federico
Multilingual Neural Machine Translation for Low Resource Languages
published pages: , ISSN: , DOI:
Proceedings of the 4th Italian Conference on Computational Linguistics (CLIC-IT) 2019-05-30
2016 C. Buck, P. Koehn
Findings of the WMT 2016 Bilingual Document Alignment Shared Task
published pages: 554–563, ISSN: , DOI:
Proceedings of the First Conference on Machine Translation Volume 2: Shared Task Papers 2019-05-30
2016 M. Cettolo, J. Niehues, S. Stüker, L. Bentivogli, M. Federico
The IWSLT 2016 Evaluation Campaign
published pages: , ISSN: , DOI:
Proceedings of the 13th Workshop on Spoken Language Translation December 8-9, 2016 2019-05-30

The information about "MMT" is provided by the European Opendata Portal: CORDIS opendata.
