COMET-22: Unbabel-IST 2022 Submission for the Metrics Shared Task
Abstract
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2022 Metrics Shared Task. Our primary submission, dubbed COMET-22, is an ensemble of a COMET estimator model trained with Direct Assessments and a newly proposed multitask model trained to predict sentence-level scores along with OK/BAD word-level tags derived from Multidimensional Quality Metrics (MQM) error annotations. The two models are ensembled using a hyper-parameter search that weights different features extracted from both evaluation models and combines them into a single score. For reference-free evaluation, we present CometKiwi. Like our primary submission, CometKiwi is an ensemble of two models: a traditional predictor-estimator model inspired by OpenKiwi and our new multitask model trained on MQM, which can also be used without references. Both submissions show improved correlations compared to last year's state-of-the-art metrics, as well as increased robustness to critical errors.
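The abstract only states that features from both models are weighted by a hyper-parameter search and combined into a single score. The following is a minimal sketch of that general idea, not the authors' implementation: the choice of random search, the Pearson objective, the z-normalization step, and all function names (`z_normalize`, `combine`, `search_weights`) are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def z_normalize(values):
    """Standardize one feature column to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def combine(features, weights):
    """Weighted sum of per-segment features, giving one score per segment."""
    return [sum(w * f for w, f in zip(weights, row)) for row in features]


def search_weights(features, human_scores, trials=500, seed=0):
    """Random search over non-negative weights that sum to 1.

    Assumption: the objective is Pearson correlation with human scores;
    the paper's actual search procedure and objective may differ.
    """
    rng = random.Random(seed)
    n_features = len(features[0])
    # Normalize each feature column so the weights are comparable.
    columns = [z_normalize([row[j] for row in features]) for j in range(n_features)]
    normalized = [list(row) for row in zip(*columns)]
    # Start from uniform weights as the baseline.
    best_weights = [1.0 / n_features] * n_features
    best_corr = pearson(combine(normalized, best_weights), human_scores)
    for _ in range(trials):
        raw = [rng.random() for _ in range(n_features)]
        total = sum(raw)
        weights = [r / total for r in raw]
        corr = pearson(combine(normalized, weights), human_scores)
        if corr > best_corr:
            best_weights, best_corr = weights, corr
    return best_weights, best_corr
```

Here each row of `features` would hold the hypothetical per-segment outputs of the two models (for example, one sentence-level score from each), and `human_scores` the corresponding human judgments on a held-out development set.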
Anthology ID:
2022.wmt-1.52
Volume:
Proceedings of the Seventh Conference on Machine Translation (WMT)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Philipp Koehn, Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Marco Turchi, Marcos Zampieri
Publisher:
Association for Computational Linguistics
Pages:
578–585
URL:
https://aclanthology.org/2022.wmt-1.52
Cite (ACL):
Ricardo Rei, José G. C. de Souza, Duarte Alves, Chrysoula Zerva, Ana C Farinha, Taisiya Glushkova, Alon Lavie, Luisa Coheur, and André F. T. Martins. 2022. COMET-22: Unbabel-IST 2022 Submission for the Metrics Shared Task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 578–585, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
COMET-22: Unbabel-IST 2022 Submission for the Metrics Shared Task (Rei et al., WMT 2022)
PDF:
https://aclanthology.org/2022.wmt-1.52.pdf