Ana Garcia Serrano | Universidad Nacional de Educación a Distancia
Papers by Ana Garcia Serrano
arXiv (Cornell University), May 18, 2022
Revista de Humanidades Digitales, 2017
Digital Humanities aims to facilitate access to and understanding of historical documents through computer applications. A key stage in this process is the formal, digital representation of the contents, which eases their later processing in applications such as access and visualization, search, and automatic organization of content. This work presents different representation approaches within the DIMH project (accessible at https://dimh.hypotheses.org/). In particular, it details the development of an ontology for representing the contents of the DIMH corpus. To reinforce understanding of the ontological structure and to show the power of ontological models, a series of practical examples and queries is presented.
Proces. del Leng. Natural, 2021
Unsupervised Named Entity Recognition (NER) approaches do not depend on labelled data to function properly but rather on a source of knowledge, in which promising candidates can be looked up to find the corresponding concept. In the biomedical domain, a knowledge source like this already exists, namely the Unified Medical Language System (UMLS). In this paper, three different unsupervised NER models using UMLS, namely MetaMap, cTakes and MetaMapLite, are evaluated and compared based on the results published by Demner-Fushman, Rogers and Aronson (2017) and Reategui and Ratte (2018). The Unsupervised Biomedical Named Entity Recognition framework (UB-NER) is developed, and the results of its experiments with the three models, five datasets and two NER tasks are presented.
Lecture Notes in Computer Science, 2004
This paper describes the first set of experiments defined by the MIRACLE (Multilingual Information RetrievAl for the CLEf campaign) research group for some of the cross-language tasks defined by CLEF. These experiments combine different basic techniques, both linguistically oriented and statistically oriented, applied to the indexing and retrieval processes.
Proceedings of the XVII International Conference on Human Computer Interaction, 2016
Developing a traditional Higher Education (HE) eLearning course and developing a MOOC (Massive Open Online Course) have some similarities, because both rest on the basics of eLearning instructional design. In MOOCs, however, students should be continually influenced by information, social interactions and experiences, which forces the faculty to come up with new approaches and ideas to develop a really engaging course. In this paper, the process of MOOCifying an online course on Universal Accessibility is detailed. The required quality model is based on the one used for all online degree programs at our university and on a variable metric specially designed for UNED MOOC courses, which makes it possible to control how each course was structured, what kind of resources were used, and how activities, interaction and assessment were included. The learning activities were completely adapted, along with the content itself and the online assessment. For this purpose, Gardner's Multiple Intelligences Product Grid was selected.
The UNED-UV group participated in the Scalable Concept Image Annotation subtask of the ImageCLEF2013 Campaign. We present a multimedia IR-based system for the annotation task. In this collection, the images do not have any associated textual description, so we downloaded and preprocessed the web pages that contain the images. Regarding the concepts, we expanded their textual descriptions with additional information from external resources such as Wikipedia or WordNet, and we generated a KLD concept model using the recovered textual information. The multimedia IR-based system uses a logistic relevance algorithm to obtain a model for each of the concepts, trained using visual image features. Finally, the fusion subsystem merges the textual and visual scores for a given image to belong to a concept, and decides on the presence of the concept in the images.
Procesamiento Del Lenguaje Natural, Mar 4, 2015
Abstract: Time is an element of capital importance in any information space, and Twitter is no exception. The exploitation of temporal information in information retrieval and organization tasks has a long tradition. However, content-based approaches of this kind have not been widely explored for the Twitter domain, and consequently corpora of tweets annotated with temporal information are scarce. This article proposes a model for annotating temporal information in the Twitter domain, based on Formal Concept Analysis, in which the attributes of the context are the temporal expressions, events and event types present in the tweets. A calendar especially suited to the commemoration of anniversaries and notable dates on Twitter is defined, the Collective-Imaginary Calendar (Calendario Imaginario-Colectivo). The study corpus was extracted from the RepLab2013 collection, and a complete analysis of it from a temporal perspective is included. Keywords: temporal information, temporal annotation of tweets, content-based information representation
Transportation Research Part C: Emerging Technologies, 2005
Modern decision support systems (DSS) not only store large amounts of decision-relevant data, but also aim at assisting decision-makers in exploring the meaning of that data and in taking decisions based on understanding. In transportation domains, a multiagent approach to the construction of DSS is becoming increasingly popular, because it not only reduces design complexity but also adequately supports a dialogue-based stance on decision support interactions. However, despite recent advances in the field of agent-oriented software engineering, a principled approach to the design of multiagent systems for decision support is still to come. In this paper, we outline a design method for the construction of agent-based DSS. Setting out from an organisational and communicative model of decision support environments, we present an abstract
Collaborative dialogue technologies in distance learning, 1994
Additional file 1: We provide Appendix A, entitled "The reproducible benchmarks of biomedical semantic measures libraries", as supplementary material in one additional file. Appendix A introduces a detailed experimental setup based on a publicly available reproducibility dataset [65], provided as supplementary material to allow the exact replication of all the experiments and results reported herein, and it also provides the source code of our benchmarks.
This protocol introduces a set of reproducibility resources with the aim of allowing the exact replication of the experiments introduced by our main paper [1], which presents the largest, and for the first time reproducible, experimental survey on biomedical sentence similarity. HESML V2R1 [2] is the sixth release of our Half-Edge Semantic Measures Library (HESML), which is a linearly scalable and efficient Java software library of ontology-based semantic similarity measures and Information Content (IC) models for ontologies like WordNet, SNOMED-CT, MeSH and GO. This protocol sets up a self-contained reproducibility platform which contains the Java source code and binaries of our main benchmark program, as well as a Docker image which allows the exact replication of our experiments on any software platform supported by Docker, such as all Linux-based operating systems, Windows or macOS. All the necessary resources for executing the experiments are published in the permanent repository ...
This dataset introduces a set of reproducibility resources with the aim of allowing the exact replication of the experiments introduced by our companion paper, which compares the performance of the three UMLS-based semantic similarity libraries reported in the literature: (1) UMLS::Similarity [20], (2) Semantic Measures Library (SML) [3], and (3) the latest version of our Half-Edge Semantic Measures Library (HESML), introduced in our aforementioned companion paper. HESML V1R5 is the fifth release of HESML, detailed in [15], which is a linearly scalable and efficient Java software library of ontology-based semantic similarity measures and Information Content (IC) models for ontologies like WordNet, SNOMED-CT, MeSH and GO. This dataset sets up a self-contained reproducibility platform which contains the Java source code and binaries of our main benchmark program, as well as a Docker image which allows the exact replication of our experiments i...
Whitestein Series in Software Agent Technologies
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2009
INTELIGENCIA ARTIFICIAL, 2002
One hypothesis for improving human-computer interaction rests on the use of natural language; at a first, basic level, simple linguistic knowledge is applied in the less complex processes involved in the interaction. This work presents aspects of integrating available natural language processing technology into the development of a metasearch engine that achieves a higher degree of accuracy than the information retrieval performed by a traditional search engine, as well as in the subsequent processing of the retrieved documents. In particular, it describes the process used to extend user queries with linguistic information by means of two lexical resources for Spanish: ARIES for morphology and EuroWordnet for semantics. This work is part of the MESIA system (Modelo computacional para extracción selectiva de información de textos cortos, a computational model for selective information extraction from short texts), which extends the usual search (query and presentation of results) with new morphological and semantic capabilities and analyses other aspects derived from the structure of the pages, from the linguistic processing of some automatically selected text units, and from usage experience. The system is designed for the website of the Comunidad Autónoma de Madrid (CAM), which restricts the amount of available information. The general problem of information search remains, however, because the information on these pages covers practically every category of information that the Administration can offer citizens.
IFIP International Federation for Information Processing
Virtual assistants are a promising business for the near future in the web era. This implies that the supporting applications have to be endowed with advanced capabilities to service offerings and to communicate with users in a more direct and natural way. This paper presents the agent-based architecture of the virtual assistant and focuses on the dialogue module. The content exchange between the agents is based on communicative acts, to cope with the complexity of the unrestricted language used by human users communicating with online assistants. The assistant is capable of interacting with users and of providing the right output by exploiting different information sources. The approach was applied and tested in the insurance field within the framework of the European research project VIP-Advisor.
Lecture Notes in Computer Science, 2004
Interacting with Computers, 2006
IEEE Transactions on Multimedia, 2013