A Test Suite and Manual Evaluation of Document-Level NMT at WMT19
Related papers
LIUM Machine Translation Systems for WMT17 News Translation Task
Proceedings of the Second Conference on Machine Translation
This paper describes LIUM submissions to WMT17 News Translation Task for English↔German, English↔Turkish, English→Czech and English→Latvian language pairs. We train BPE-based attentive Neural Machine Translation systems with and without factored outputs using the open source nmtpy framework. Competitive scores were obtained by ensembling various systems and exploiting the availability of target monolingual corpora for back-translation. The impact of back-translation quantity and quality is also analyzed for English→Turkish where our post-deadline submission surpassed the best entry by +1.6 BLEU.
The Karlsruhe Institute of Technology Systems for the News Translation Task in WMT 2016
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
In this paper, we present the KIT translation systems as well as the KIT-LIMSI systems for the ACL 2016 First Conference on Machine Translation. We participated in the shared task Machine Translation of News and submitted translation systems for three different directions: English→German, German→English and English→Romanian. We used a phrase-based machine translation system and investigated several models to rescore the system. We used neural network language and translation models. Using these models, we could improve the translation performance in all language pairs in which we participated.
The TALP-UPC Participation in WMT21 News Translation Task: an mBART-based NMT Approach
2021
This paper describes the submission to the WMT 2021 news translation shared task by the UPC Machine Translation group. The goal of the task is to translate German to French (De-Fr) and French to German (Fr-De). Our submission focuses on fine-tuning a pre-trained model to take advantage of monolingual data. We fine-tune mBART50 using the filtered data, and additionally, we train a Transformer model on the same data from scratch. In the experiments, we show that fine-tuning mBART50 results in 31.69 BLEU for De-Fr and 23.63 BLEU for Fr-De, an increase of 2.71 and 1.90 BLEU, respectively, over the model we train from scratch. Our final submission is an ensemble of these two models, which adds a further 0.3 BLEU for Fr-De.
The ADAPT System Description for the WMT20 News Translation Task
2020
This paper describes the ADAPT Centre’s submissions to the WMT20 News translation shared task for English-to-Tamil and Tamil-to-English. We present our machine translation (MT) systems that were built using the state-of-the-art neural MT (NMT) model, Transformer. We applied various strategies in order to improve our baseline MT systems, e.g. monolingual sentence selection for creating synthetic training data, mining monolingual sentences for adapting our MT systems to the task, and hyperparameter search for Transformer in low-resource scenarios. Our experiments show that adding the aforementioned techniques to the baseline yields an excellent performance in the English-to-Tamil and Tamil-to-English translation tasks.
PROMT Systems for WMT 2020 Shared News Translation Task
2020
This paper describes the PROMT submissions for the WMT 2020 Shared News Translation Task. This year we participated in four language pairs and six directions: English-Russian, Russian-English, English-German, German-English, Polish-English and Czech-English. All our submissions are MarianNMT-based neural systems. We use more data compared to last year and update our back-translations with better models from the previous year. We show competitive results in terms of BLEU in most directions.
ViNMT: Neural Machine Translation Toolkit
ArXiv, 2021
We present an open-source toolkit for neural machine translation (NMT). The new toolkit is mainly based on the vaunted Transformer (Vaswani et al., 2017) along with many other improvements detailed below, in order to create a self-contained, simple to use, consistent and comprehensive framework for Machine Translation tasks of various domains. It is tooled to support both bilingual and multilingual translation tasks, starting from building the model from respective corpora, to inferring new predictions or packaging the model to serving-capable JIT format. The source code and data are available at https://github.com/KCDichDaNgu/MultilingualMT-UET-KC4.0.
PROMT Systems for WMT 2019 Shared Translation Task
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), 2019
This paper describes the PROMT submissions for the WMT 2019 Shared News Translation Task. This year we participated in two language pairs and in three directions: English-Russian, English-German and German-English. All our submissions are MarianNMT-based neural systems. We use significantly more data compared to last year. We also present our improved data filtering pipeline.
An in-depth Study of Neural Machine Translation Performance
2019
With the rise of deep learning and its rapidly increasing popularity, neural machine translation (NMT) has become one of the major research areas. Sequence-to-sequence models are widely used in NMT tasks, and one of the state-of-the-art models, the Transformer, also has an encoder-decoder architecture with an additional attention mechanism. Despite a substantial amount of research on improving NMT models’ translation quality and speed, to the best of our knowledge, none of it gives a detailed performance analysis of each step in a model. In this paper we analyze the Transformer model’s performance and translation quality in different settings. We conclude that beam search is the bottleneck of NMT inference and analyze beam search’s effect on performance and quality in detail. We observe that the beam size is one of the largest contributors to the Transformer’s execution time. Additionally, we observe that the beam size only affects BLEU score at word level, and not at to...
PROMT Systems for WMT 2018 Shared Translation Task
Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018
This paper describes the PROMT submissions for the WMT 2018 Shared News Translation Task. This year we participated only in the English-Russian language pair. We built two primary neural network-based systems: 1) a pure Marian-based neural system and 2) a hybrid system which incorporates an OpenNMT-based neural post-editing component into our RBMT engine. We also submitted pure rule-based translation (RBMT) for contrast. We show competitive results with both primary submissions, which significantly outperform the RBMT baseline.