Joining linguistic and statistical methods for Spanish-to-Basque speech translation
Related papers
Computer-assisted translation using finite-state transducers
2005
Abstract: Computer-Assisted Translation (CAT) is an alternative approach to machine translation that integrates human expertise into the automatic translation process. In this framework, a human translator interacts with a translation system that dynamically offers a list of translations that best complete the part of the sentence already translated. Stochastic finite-state transducer technology is proposed to support this CAT system. The system was assessed on two real tasks of different complexity in several languages. Keywords: machine translation, computer-assisted translation, stochastic finite-state transducers
Machine Translation with Inferred Stochastic Finite-State Transducers
Computational Linguistics, 2004
Finite-state transducers are models that are being used in different areas of pattern recognition and computational linguistics. One of these areas is machine translation, in which the approaches that are based on building models automatically from training examples are becoming more and more attractive. Finite-state transducers are very adequate for use in constrained tasks in which training samples of pairs of sentences are available. A technique for inferring finite-state transducers is proposed in this article. This technique is based on formal relations between finite-state transducers and rational grammars. Given a training corpus of source-target pairs of sentences, the proposed approach uses statistical alignment methods to produce a set of conventional strings from which a stochastic rational grammar (e.g., an n-gram) is inferred. This grammar is finally converted into a finite-state transducer. The proposed methods are assessed through a series of machine translation experiments within the framework of the EuTrans project.
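The pipeline described in this abstract (aligned pairs, then strings of joint symbols, then an n-gram, then a transducer) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a monotone one-to-one word alignment instead of statistical alignment, an unsmoothed bigram, and greedy rather than Viterbi decoding. The function names `extended_strings`, `train_bigram`, and `translate` and the example sentences are hypothetical.

```python
from collections import defaultdict


def extended_strings(pairs):
    """Turn each sentence pair into a string of 'source|target' symbols.

    Assumption: words align one-to-one and in order; the real method
    derives the symbols from statistical alignments.
    """
    return [
        [f"{s}|{t}" for s, t in zip(src.split(), tgt.split())]
        for src, tgt in pairs
    ]


def train_bigram(strings):
    """Count bigrams over the extended symbols (no smoothing)."""
    counts = defaultdict(lambda: defaultdict(int))
    for symbols in strings:
        prev = "<s>"
        for sym in symbols + ["</s>"]:
            counts[prev][sym] += 1
            prev = sym
    return counts


def translate(counts, sentence):
    """Read the bigram as a transducer: each history is a state, and each
    extended symbol is an edge that reads its source word and emits its
    target word. Follows the most frequent matching edge at each step and
    returns None if no edge accepts the next source word.
    """
    state, output = "<s>", []
    for word in sentence.split():
        edges = {
            sym: c
            for sym, c in counts.get(state, {}).items()
            if sym.split("|")[0] == word
        }
        if not edges:
            return None
        best = max(edges, key=edges.get)
        output.append(best.split("|")[1])
        state = best
    return " ".join(output)


# Toy training corpus (hypothetical sentences).
corpus = [
    ("una habitación", "a room"),
    ("una llave", "a key"),
    ("la habitación", "the room"),
]
model = train_bigram(extended_strings(corpus))
print(translate(model, "una llave"))  # a key
```

Without smoothing the model only accepts word sequences whose bigrams were seen in training, so `translate(model, "la llave")` returns `None`; a practical system would back off to lower-order n-grams.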
2007
The goal of this work is to improve current translation models by taking into account additional knowledge sources such as semantically motivated segmentation or statistical categorization. Specifically, two approaches are discussed: a phrase-based approach and categorization. For each, both statistical and linguistic alternatives are explored. Finite-state transducers are used as the translation framework. These are versatile models that can be easily integrated on the fly with acoustic models for speech translation purposes. As for the experimental framework, all the models presented were evaluated and compared taking confidence intervals into account.
Speech translation can be tackled by means of the so-called decoupled approach: a speech recognition system followed by a text translation system. The major drawback of this two-pass decoding approach lies in the fact that the translation system has to cope with the errors derived from the speech recognition system. There is hardly any cooperation between the acoustic and the translation knowledge sources. There is a line of research focusing on alternatives to implement speech translation efficiently: ranging from semi-decoupled to tightly integrated approaches. The goal of integration is to make acoustic and translation models cooperate in the underlying decision problem. That is, the translation is built by virtue of the joint action of both models. As a side-advantage of the integrated approaches, the translation is obtained in a single-pass decoding strategy. The aim of this paper is to assess the quality of the hypotheses explored within different speech translation approaches. Evidence of the performance is given through experimental results on a limited-domain task.
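The difference between the two decoding strategies can be written as decision rules. The notation below is an assumption made for illustration, not taken from the abstract: x is the acoustic signal, s a source-language sentence, t a target-language sentence, and x is assumed to be conditionally independent of t given s.

```latex
% Decoupled (two-pass): recognize first, then translate the single best transcript
\hat{s} = \arg\max_{s} P(s)\, P(x \mid s), \qquad
\hat{t} = \arg\max_{t} P(t \mid \hat{s})

% Integrated (single-pass): acoustic and translation models are searched jointly
\hat{t} = \arg\max_{t} \sum_{s} P(s, t)\, P(x \mid s)
        \approx \arg\max_{t} \max_{s} P(s, t)\, P(x \mid s)
```

In the decoupled rule a recognition error in the single best transcript is passed on to the translator unchanged, whereas the integrated rule keeps all source hypotheses in play until the translation is chosen.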
Finite-state models for computer assisted translation
2004
Current methodologies for automatic translation cannot be expected to produce high-quality translations. However, some techniques based on these methodologies can increase the productivity of human translators. One of these methodologies is based on finite-state transducers, which are adequate models for computer-assisted translation. These models have proved their efficiency in many pattern recognition and artificial intelligence tasks such as speech recognition, handwriting recognition, and machine translation for specific domains.
2005
Statistical techniques and grammatical inference have been used successfully for automatic speech recognition, and can also be used for speech-to-speech machine translation. In this paper, new advances on a method for finite-state transducer inference are presented. This method has been tested experimentally in a speech-input translation task using a recognizer that allows a flexible use of models by means of efficient algorithms for on-the-fly transducer composition. These are the first reported results of a speech-to-speech translation task involving European Portuguese input that we know of.
Speech-to-speech translation based on finite-state transducers
2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), 2001
Nowadays, the most successful speech recognition systems are based on stochastic finite-state networks (hidden Markov models and n-grams). Speech translation can be accomplished in much the same way as speech recognition. Stochastic finite-state transducers, which are specific stochastic finite-state networks, have proved very adequate for translation modeling. In this work a speech-to-speech translation system, the EUTRANS system, is presented. The acoustic, language and translation models are finite-state networks that are automatically learnt from training samples. This system was assessed in a series of translation experiments from Spanish to English and from Italian to English in an application involving the interaction (by telephone) of a customer with a receptionist at the front desk of a hotel.
From machine translation to computer assisted translation using finite-state models
2004
State-of-the-art machine translation techniques are still far from producing high-quality translations. This drawback leads us to introduce an alternative approach to the translation problem that brings human expertise into the machine translation scenario. In this framework, namely Computer-Assisted Translation (CAT), human translators interact with a translation system that acts as an assistance tool and dynamically offers a list of translations that best complete the part of the sentence already translated. In this paper, finite-state transducers are presented as a candidate technology for the CAT paradigm. The appropriateness of this technique is evaluated on a printer manual corpus, and results from preliminary experiments confirm that human translators would reduce the amount of work needed for the same task to less than 25%.
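The interaction loop described in this abstract can be sketched as prefix-constrained completion. This is a minimal illustration, not the system from the paper: a small scored list of full-sentence hypotheses stands in for the search over a finite-state transducer, matching is done on whole words only, and the function name `complete`, the sentences, and the scores are all hypothetical.

```python
def complete(prefix, hypotheses, n_best=3):
    """Return the best completions of the translation prefix typed so far.

    hypotheses: list of (sentence, score) pairs, higher score is better.
    Only hypotheses whose first words equal the prefix are kept; the
    remaining words of each are offered as a suggestion.
    """
    typed = prefix.split()
    matches = []
    for sentence, score in hypotheses:
        words = sentence.split()
        if words[: len(typed)] == typed:
            matches.append((score, " ".join(words[len(typed):])))
    # Stable sort: ties keep their original order.
    matches.sort(key=lambda m: m[0], reverse=True)
    return [suffix for _, suffix in matches[:n_best]]


# Toy hypotheses with made-up scores.
hypotheses = [
    ("press the power button", 0.5),
    ("press the reset button", 0.3),
    ("push the power button", 0.2),
]
print(complete("press the", hypotheses))  # ['power button', 'reset button']
```

Each time the translator accepts or corrects a word, the prefix grows and the suggestions are recomputed, which is where the reduction in typing effort would come from.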