Marco Roberti - Academia.edu

Papers by Marco Roberti

Copy mechanism and tailored training for character-based data-to-text generation

In the last few years, many methods have focused on using deep recurrent neural networks for natural language generation. The most widely used sequence-to-sequence neural methods are word-based: as such, they need a pre-processing step called delexicalization (and, conversely, relexicalization) to deal with uncommon or unknown words. These forms of processing, however, give rise to models that depend on the vocabulary used and are not completely neural. In this work, we present an end-to-end sequence-to-sequence model with an attention mechanism which reads and generates at the character level, no longer requiring delexicalization, tokenization, or even lowercasing. Moreover, since characters constitute the common "building blocks" of every text, it also allows a more general approach to text generation, enabling the possibility of exploiting transfer learning for training. These skills are obtained thanks to two major features: (i) the possibility to alternate between...
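The copy mechanism described in the abstract can be illustrated with a minimal, pointer-generator-style decoding step at the character level. This is a toy sketch with made-up scores and a fixed switch probability, not the paper's implementation:

```python
import numpy as np

# Toy character vocabulary and source string we may copy from.
vocab = list("abcdefgh ")
src = "bad cafe"

rng = np.random.default_rng(0)
gen_logits = rng.normal(size=len(vocab))   # decoder's generation scores (random stand-in)
attn_logits = rng.normal(size=len(src))    # attention scores over input positions

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

p_gen_dist = softmax(gen_logits)           # P(char | generate)
attn = softmax(attn_logits)                # attention weights over the source

# Copy distribution: scatter attention mass onto the characters it points at.
p_copy_dist = np.zeros(len(vocab))
for pos, ch in enumerate(src):
    p_copy_dist[vocab.index(ch)] += attn[pos]

p_gen = 0.6                                # in a real model, a learned switch
p_final = p_gen * p_gen_dist + (1 - p_gen) * p_copy_dist
assert abs(p_final.sum() - 1.0) < 1e-9     # still a valid distribution
print(vocab[int(p_final.argmax())])        # next character emitted
```

Because the final distribution mixes generation and copying, characters that are rare in training but present in the input still receive probability mass.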

I Learn. You Learn. We Learn? An Experiment in Collaborative Concept Mapping

In this paper we present an experiment on digitally supported collaborative Concept Maps, focused on asynchronous and remote collaboration. We investigated the integration of multiple perspectives on the same topic, providing users with a tool that allows an individual perspective for each user plus a shared one for the group. Several user actions were made available, affecting one or both perspectives depending on the context. Results show that integrating different perspectives in a way that everyone can relate to is indeed a complex task: users need to be supported not only in the production of a shared Concept Map, but also in the process of adapting their mental representations, in order to understand, compare and possibly integrate others’ points of view. Our experiment shows that both collaboration in concept mapping (emphasis on the process) and collaboration on a Concept Map (emphasis on the result) are needed, whereas most tools, including the one we experimented with, focus ...

Controlling Hallucinations at Word Level in Data-to-Text Generation

ArXiv, 2021

Data-to-Text Generation (DTG) is a subfield of Natural Language Generation aimed at transcribing structured data into natural language descriptions. The field has recently been boosted by the use of neural generators which, on the one hand, exhibit great syntactic skills without the need for hand-crafted pipelines; on the other hand, the quality of the generated text reflects the quality of the training data, which in realistic settings only offer imperfectly aligned structure-text pairs. Consequently, state-of-the-art neural models include misleading statements, usually called hallucinations, in their outputs. Controlling this phenomenon is today a major challenge for DTG, and is the problem addressed in this paper. Previous work deals with this issue at the instance level, using an alignment score for each table-reference pair. In contrast, we propose a finer-grained approach, arguing that hallucinations should rather be treated at the word level. Specifically, we propose a Multi-Branch...
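As a rough illustration of treating hallucinations at the word level rather than the instance level, the following sketch marks each reference word as grounded or not in the source table and down-weights ungrounded words. The scorer, stopword list, and weights are toy assumptions; the paper's multi-branch model is far more involved:

```python
# Toy table-reference pair: "sushi" is not supported by the table.
table = {"name": "Blue Spice", "eatType": "coffee shop", "area": "riverside"}
reference = "Blue Spice is a coffee shop near the riverside serving sushi".split()

table_tokens = {tok.lower() for v in table.values() for tok in v.split()}
stopwords = {"is", "a", "the", "near", "serving"}  # toy function-word list

weights = []
for w in reference:
    if w.lower() in stopwords:
        weights.append(1.0)      # function words always keep full weight
    elif w.lower() in table_tokens:
        weights.append(1.0)      # content word grounded in the table
    else:
        weights.append(0.1)      # likely hallucination: down-weight its loss
print(dict(zip(reference, weights)))
```

An instance-level score would penalize (or discard) the whole sentence; the word-level labels keep the well-aligned words at full weight and only damp the ungrounded one.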

The Rare Word Issue in Natural Language Generation: A Character-Based Solution

In this paper, we analyze the problem of generating fluent English utterances from tabular data, focusing on the development of a sequence-to-sequence neural model with two major features: the ability to read and generate character-wise, and the ability to switch between generating characters and copying them from the input: an essential feature when inputs contain rare words like proper names, telephone numbers, or foreign words. Working with characters instead of words is a challenge that can bring problems such as a harder training phase and a higher error probability during inference. Nevertheless, our work shows that these issues can be solved, and the effort is repaid by the creation of a fully end-to-end system whose inputs and outputs are not constrained to be part of a predefined vocabulary, as in word-based models. Furthermore, our copying technique is integrated with an innovative shift mechanism, which enhances the ability to produce outputs dir...
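The interplay of copying and shifting can be pictured as follows: once the model decides to copy, a pointer advances one source character per emitted character, so a multi-character span such as a phone number is reproduced verbatim. This is a hypothetical toy with a hand-picked start position and stop rule, not the paper's mechanism:

```python
src = "call 555-0199 now"

start = src.index("555")        # in a real model, attention picks this position
copied, ptr = [], start
while ptr < len(src) and src[ptr] != " ":  # toy stop criterion
    copied.append(src[ptr])     # emit the character under the pointer
    ptr += 1                    # shift: advance to the next source character
print("".join(copied))          # the rare token is copied verbatim
```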

Correction to: Anomaly Detection Techniques in the Gaia Space Mission Data

Journal of Signal Processing Systems

Copy mechanism and tailored training for character-based data-to-text generation

ArXiv, 2019


Anomaly Detection Techniques in the Gaia Space Mission Data

Journal of Signal Processing Systems


A Deep Learning Approach to Anomaly Detection in the Gaia Space Mission Data

Advances in Computational Intelligence

The data reduction system of the Gaia space mission generates a large amount of intermediate data and diagnostic plots, beyond any practical possibility of full human evaluation. We investigate the feasibility of adopting deep learning tools for the automatic detection of data anomalies, focusing on convolutional neural networks and comparing them with a multilayer perceptron. The results show very good accuracy (∼99.7%) in the classification of the selected anomalies.
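A minimal illustration of convolution-based anomaly screening on a diagnostic plot rendered as a 2-D array. The filter here is hand-set purely for demonstration; the actual work trains convolutional networks end to end:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation, written out explicitly."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

normal = np.zeros((8, 8))                 # clean diagnostic image
anomalous = np.zeros((8, 8))
anomalous[3, 2:6] = 1.0                   # horizontal streak artifact

streak_filter = np.array([[1.0, 1.0, 1.0]])  # responds to horizontal runs
for name, img in [("normal", normal), ("anomalous", anomalous)]:
    score = conv2d(img, streak_filter).max()
    print(name, "anomaly" if score > 2.0 else "ok")
```

A trained CNN learns many such filters (plus pooling and dense layers) instead of one hand-set kernel, which is what makes the reported classification accuracy possible.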
