Investigating lexical substitution scoring for subtitle generation
Related papers
Multi-lingual dependency parsing at NAIST
Proceedings of the Tenth …, 2006
CoNLL has turned ten! With a mix of pride and amazement over how time flies, we now celebrate the tenth time that ACL's special interest group on natural language learning, SIGNLL, holds its yearly conference.
Proceedings of the Thirteenth Conference on Computational Natural Language Learning Shared Task - CoNLL '09, 2009
For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages. This shared task combines the shared tasks of the previous five years under a unique dependency-based formalism similar to the 2008 task. In this paper, we define the shared task, describe how the data sets were created, report and analyze the results and summarize the approaches of the participating systems.
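For readers unfamiliar with the 2009 formalism, here is a minimal sketch of how a jointly annotated token might be represented; the class and field names are hypothetical, loosely mirroring the CoNLL-2009 columns (HEAD, DEPREL, PRED, APREDs), and this is not the official shared-task tooling.

```python
# Illustrative sketch (not the official CoNLL-2009 format reader): one token
# of the joint representation, carrying both a syntactic dependency
# (head, deprel) and semantic arguments (one label per predicate in the
# sentence, "_" if the token is not an argument of that predicate).
from dataclasses import dataclass, field

@dataclass
class JointToken:
    form: str
    head: int                      # index of the syntactic head, 0 = root
    deprel: str                    # syntactic dependency label
    is_pred: bool = False          # whether this token is itself a predicate
    apreds: list = field(default_factory=list)  # arg label per predicate

# "She gave him flowers": 'gave' is both the syntactic root and a predicate.
sent = [
    JointToken("She",     2, "SBJ", apreds=["A0"]),
    JointToken("gave",    0, "ROOT", is_pred=True, apreds=["_"]),
    JointToken("him",     2, "OBJ", apreds=["A2"]),
    JointToken("flowers", 2, "OBJ", apreds=["A1"]),
]
```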
The EVALITA Dependency Parsing Task: From 2007 to 2011
Lecture Notes in Computer Science, 2013
Established in 2007, EVALITA (http://www.evalita.it) is the evaluation campaign of Natural Language Processing and Speech Technologies for the Italian language, organized around shared tasks focusing on the analysis of written and spoken language respectively. EVALITA's shared tasks aim to contribute to the development and dissemination of natural language resources and technologies by proposing a shared context for training and evaluation.

Following the success of previous editions, we organized EVALITA 2014, the fourth evaluation campaign, with the aim of continuing to provide a forum for the comparison and evaluation of research outcomes on Italian from both academic institutions and industrial organizations. The event has been supported by the NLP Special Interest Group of the Italian Association for Artificial Intelligence (AI*IA) and by the Italian Association of Speech Science (AISV). The novelty of this year is that the final workshop of EVALITA is co-located with the 1st Italian Conference of Computational Linguistics (CLiC-it, http://clic.humnet.unipi.it/), a new event aiming to establish a reference forum for research on Computational Linguistics in the Italian community, with contributions from a wide range of disciplines: Computational Linguistics, Linguistics, Cognitive Science, Machine Learning, Computer Science, Knowledge Representation, Information Retrieval, and Digital Humanities. The co-location with CLiC-it widens the potential audience of EVALITA.

The final workshop, held in Pisa on 11 December 2014 within the context of the XIII AI*IA Symposium on Artificial Intelligence (Pisa, 10-12 December 2014, http://aiia2014.di.unipi.it/), gathers the results of 8 tasks, 4 focusing on written language and 4 on speech technologies. In this EVALITA edition, we received 30 expressions of interest, 55 registrations and 43 actual submissions to the 8 proposed tasks, distributed as follows:
2.1 The CoNLL-X Shared Task
2012
This paper addresses the problem of optimizing the training treebank data, because the size and quality of the data have always been a bottleneck for training. In previous studies we observed that current corpora used for training machine learning–based dependency parsers contain a significant proportion of redundant information at the syntactic structure level. Since the development of such training corpora involves considerable effort, we argue that an appropriate process for selecting the sentences to be included in them can yield parsing models as accurate as those trained on larger, non-optimized corpora (or, alternatively, higher accuracy for an equivalent annotation effort). This argument is supported by the results of the study we carried out, which is presented in this paper. The paper thus demonstrates that the training corpora contain more information than is needed for training accurate data-driven dependency parsers.
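As a hedged illustration of the idea (the concrete selection criterion below is our own, for exposition, not necessarily the paper's): a greedy pass over the treebank that keeps only sentences contributing syntactic patterns not yet covered by the selected training set.

```python
# Illustrative sketch: greedily keep only sentences that add dependency
# patterns not yet seen, discarding syntactically redundant ones.
def select_training_sentences(treebank, budget):
    """treebank: list of sentences; each sentence is a list of hashable
    (head_pos, deprel, dep_pos) triples describing its dependencies.
    budget: maximum number of sentences to keep."""
    seen_patterns = set()
    selected = []
    for sentence in treebank:
        patterns = set(sentence)
        if patterns - seen_patterns:       # sentence adds new structure
            selected.append(sentence)
            seen_patterns |= patterns
        if len(selected) >= budget:
            break
    return selected
```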
Introduction to the CoNLL-2005 shared task
Proceedings of the Ninth Conference on Computational Natural Language Learning - CONLL '05, 2005
In this paper we describe the CoNLL-2005 shared task on Semantic Role Labeling. We introduce the specification and goals of the task, describe the data sets and evaluation methods, and present a general overview of the 19 systems that have contributed to the task, providing a comparative description and results.
The Association for Computational Linguistics, 2014
This tutorial discusses a framework for incremental left-to-right structured prediction, which makes use of global discriminative learning and beam-search decoding. The method has been applied to a wide range of NLP tasks in recent years, achieving competitive accuracy and efficiency. We give an introduction to the algorithms and efficient implementations, and discuss their applications to a range of NLP tasks.
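As a concrete illustration of the decoding framework the tutorial covers, here is a minimal beam-search sketch for incremental left-to-right structured prediction. The action set and the `score_action` scorer are placeholders we introduce for exposition, not the tutorial's own code; in a real global discriminative model the scorer would be a dot product of feature weights.

```python
import heapq

# Minimal sketch of incremental left-to-right decoding with beam search over
# a globally scored linear model. Each hypothesis is (score, action_history).
def beam_search(tokens, actions, score_action, beam_size=8):
    """score_action(history, token, a) returns the model score of appending
    action `a` given the partial structure built so far."""
    beam = [(0.0, ())]                       # start from the empty history
    for token in tokens:                     # consume the input left to right
        expanded = [
            (score + score_action(history, token, a), history + (a,))
            for score, history in beam
            for a in actions
        ]
        # prune to the top-k partial structures by *global* (accumulated) score
        beam = heapq.nlargest(beam_size, expanded, key=lambda h: h[0])
    return max(beam, key=lambda h: h[0])     # highest-scoring full structure
```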
Analyzing the CoNLL-X Shared Task from a Sentence Accuracy Perspective
2012
Abstract: Nowadays, given the relevance of the CoNLL shared tasks for dependency parsing, the most widely used measures are the ones computed there. Those measures are based on computing accuracy globally, word by word (or token by token), over the whole set of sentences. In our opinion, the end user of a dependency parser might instead expect a local accuracy based on evaluating sentence by sentence. In such cases, different measures can add potentially relevant information about which parser returns the better result. We therefore present the study in this paper with the intention of enriching the description of the behavior of dependency parsers. Keywords: Dependency parsing, CoNLL shared tasks, sentence-level accuracy.
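To make the contrast concrete, the following sketch computes both the token-level accuracy used in the CoNLL shared tasks and the sentence-level exact-match accuracy the paper argues for; the function names and the toy data are ours, for illustration only.

```python
# Token-level vs sentence-level accuracy for dependency parsing.
def token_accuracy(gold, pred):
    """gold/pred: lists of sentences; a sentence is a list of head indices."""
    correct = sum(g == p for gs, ps in zip(gold, pred)
                         for g, p in zip(gs, ps))
    total = sum(len(gs) for gs in gold)
    return correct / total

def sentence_accuracy(gold, pred):
    """Fraction of sentences parsed entirely correctly (exact match)."""
    return sum(gs == ps for gs, ps in zip(gold, pred)) / len(gold)

gold = [[2, 0, 2], [2, 0, 2, 3]]
pred = [[2, 0, 2], [2, 0, 1, 3]]      # one wrong head in sentence 2
print(token_accuracy(gold, pred))     # 6/7 ~= 0.857
print(sentence_accuracy(gold, pred))  # 1/2 = 0.5: a different picture
```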
The CoNLL-2009 shared task: Syntactic and semantic dependencies in multiple languages
2009
For the 11th straight year, the Conference on Computational Natural Language Learning has been accompanied by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2009, the shared task was dedicated to the joint parsing of syntactic and semantic dependencies in multiple languages. This shared task combines the shared tasks of the previous five years under a unique dependency-based formalism similar to the 2008 task.
The CoNLL 2007 shared task on dependency parsing
2007
The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.
2 Task Definition
In this section, we provide the task definitions that were used in the two tracks of the CoNLL 2007 Shared Task, the multilingual track and the domain adaptation track, together with some background and motivation for the design choices made. First of all, we give a brief description of the data format and evaluation metrics, which were common to the two tracks.
2.1 Data Format and Evaluation Metrics
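As a concrete illustration of the evaluation metrics referred to above, here is a small sketch of the labeled and unlabeled attachment scores computed over per-token (HEAD, DEPREL) pairs, as in the CoNLL-X/2007 evaluations; the toy data is invented and this is not the official evaluation script.

```python
# Labeled (LAS) and unlabeled (UAS) attachment scores over (head, deprel)
# pairs: LAS credits a token only if both head and label match; UAS only
# requires the head to match.
def las_uas(gold_rows, pred_rows):
    """gold_rows/pred_rows: lists of (head, deprel) pairs, one per token."""
    las = sum(g == p for g, p in zip(gold_rows, pred_rows))
    uas = sum(g[0] == p[0] for g, p in zip(gold_rows, pred_rows))
    n = len(gold_rows)
    return las / n, uas / n

gold = [(2, "SBJ"), (0, "ROOT")]
pred = [(2, "OBJ"), (0, "ROOT")]   # right head, wrong label on token 1
print(las_uas(gold, pred))         # (0.5, 1.0)
```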