Comparison Between Variational Autoencoder and Encoder-Decoder Models for Short Conversation
Related papers
A Neural Conversational Model for Automatic Generation of Conversations
2018
1Department of Computer Science and Engineering, VJTI, University of Mumbai, India. 2Assistant Professor, Department of Computer Science and Engineering, VJTI, University of Mumbai, India.
Abstract: Conversation between humans and machines is regarded as one of the most challenging problems in computer technology, involving interdisciplinary techniques from information retrieval, machine learning, natural language understanding, and artificial intelligence. An interactive entity automatically generates conversations to exchange information smoothly with people who have little knowledge of the computer. The challenges lie in how to respond so as to maintain a relevant and continuous conversation with humans. Conversational modeling is an important task in natural language understanding and machine learning. This research applies a generative-model-based method...
Generative Deep Neural Networks for Dialogue: A Short Review
Researchers have recently started investigating deep neural networks for dialogue applications. In particular, generative sequence-to-sequence (Seq2Seq) models have shown promising results for unstructured tasks, such as word-level dialogue response generation. The hope is that such models will be able to leverage massive amounts of data to learn meaningful natural language representations and response generation strategies, while requiring a minimum amount of domain knowledge and hand-crafting. An important challenge is to develop models that can effectively incorporate dialogue context and generate meaningful and diverse responses. In support of this goal, we review recently proposed models based on generative encoder-decoder neural network architectures, and show that these models have better ability to incorporate long-term dialogue history, to model uncertainty and ambiguity in dialogue, and to generate responses with high-level compositional structure.
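The encoder-decoder data flow these models share can be sketched with a toy numpy RNN. The weights below are random and untrained, and the vocabulary ids, dimensions, and greedy decoding loop are illustrative assumptions rather than any paper's actual model; the point is only the shape of the computation: encode the dialogue context into a hidden state, then decode a response token by token.

```python
import numpy as np

def encode(tokens, Wx, Wh):
    """Fold a token-id sequence into a single hidden state (vanilla RNN)."""
    h = np.zeros(Wh.shape[0])
    for t in tokens:
        h = np.tanh(Wx[:, t] + Wh @ h)
    return h

def decode_greedy(h, Wx, Wh, Wy, bos, eos, max_len=10):
    """Generate a response word by word, feeding each output back in."""
    tok, out = bos, []
    for _ in range(max_len):
        h = np.tanh(Wx[:, tok] + Wh @ h)
        tok = int(np.argmax(Wy @ h))   # greedy pick of the next token id
        if tok == eos:
            break
        out.append(tok)
    return out

rng = np.random.default_rng(0)
V, H = 8, 16                           # toy vocabulary and hidden sizes
Wx = rng.normal(scale=0.5, size=(H, V))
Wh = rng.normal(scale=0.5, size=(H, H))
Wy = rng.normal(scale=0.5, size=(V, H))

context = [3, 5, 2]                    # token ids of the dialogue history
reply = decode_greedy(encode(context, Wx, Wh), Wx, Wh, Wy, bos=0, eos=1)
print(reply)                           # arbitrary ids: the weights are untrained
```

In a trained system the same loop runs over learned weights, and the greedy `argmax` is typically replaced by beam search or sampling to improve response quality.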
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
Proceedings of the AAAI Conference on Artificial Intelligence
We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.
Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017
Sequence-to-sequence models have been applied to the conversation response generation problem, where the source sequence is the conversation history and the target sequence is the response. Unlike translation, conversation responding is inherently creative. The generation of long, informative, coherent, and diverse responses remains a hard task. In this work, we focus on the single-turn setting. We add self-attention to the decoder to maintain coherence in longer responses, and we propose a practical approach, called the glimpse model, for scaling to large datasets. We introduce a stochastic beam-search algorithm with segment-by-segment reranking which lets us inject diversity earlier in the generation process. We trained on a combined dataset of over 2.3B conversation messages mined from the web. In human evaluation studies, our method produces longer responses overall, with a higher proportion rated as acceptable and excellent as length increases, compared to baseline sequence-to-sequence models with explicit length promotion. A back-off strategy produces better responses overall, across the full spectrum of lengths. * Both authors contributed equally to this work. † Work done as a member of the Google Brain Residency program (g.co/brainresidency).
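The core idea behind injecting diversity into beam search can be sketched in a few lines of pure Python. This is not the paper's segment-by-segment reranking; it only contrasts the deterministic top-k expansion of standard beam search with sampling k continuations in proportion to their probabilities, over a hypothetical toy transition table.

```python
import math
import random

# Hypothetical next-token distributions for a toy language model.
trans = {
    "<s>": {"hi": 0.5, "hey": 0.3, "yo": 0.2},
    "hi":  {"there": 0.6, "friend": 0.4},
    "hey": {"there": 0.7, "you": 0.3},
    "yo":  {"there": 0.5, "dude": 0.5},
}

def expand(beams):
    """All one-token extensions of the current beams with their log-probs."""
    cands = []
    for toks, lp in beams:
        for nxt, p in trans[toks[-1]].items():
            cands.append((toks + [nxt], lp + math.log(p)))
    return cands

def topk_step(beams, k):
    """Standard beam search: keep the k highest-scoring extensions."""
    return sorted(expand(beams), key=lambda c: -c[1])[:k]

def stochastic_step(beams, k, rng):
    """Stochastic variant: sample k extensions weighted by probability."""
    cands = expand(beams)
    weights = [math.exp(lp) for _, lp in cands]
    return rng.choices(cands, weights=weights, k=k)

rng = random.Random(0)
beams = [(["<s>"], 0.0)]
for _ in range(2):
    beams = stochastic_step(beams, k=3, rng=rng)
print([" ".join(t) for t, _ in beams])
```

Because the expansion step samples rather than always keeping the top candidates, different runs (or different seeds) explore different high-probability prefixes, which is the mechanism for injecting diversity early in generation.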
Predict and Use Latent Patterns for Short-Text Conversation
ArXiv, 2020
Many neural network models nowadays have achieved promising performance in chit-chat settings. The majority of them rely on an encoder for understanding the post and a decoder for generating the response. Without assigned semantics, the models lack fine-grained control over responses, as the semantic mapping between posts and responses is hidden on the fly within the end-to-end manner. Some previous works utilize sampled latent words as a controllable semantic form to drive the generated response around the word, but few works attempt to use more complex semantic forms to guide the generation. In this paper, we propose to use more detailed semantic forms, including latent responses and part-of-speech sequences sampled from the corresponding distributions, as the controllable semantics to guide the generation. Our experimental results show that the richer semantics are not only able to provide informative and diverse responses, but also increase the overall performance of ...
A Conditional Variational Framework for Dialog Generation
Deep latent variable models have been shown to facilitate response generation for open-domain dialog systems. However, these latent variables are highly randomized, leading to uncontrollable generated responses. In this paper, we propose a framework allowing conditional response generation based on specific attributes. These attributes can be either manually assigned or automatically detected. Moreover, the dialog states for both speakers are modeled separately in order to reflect personal features. We validate this framework on two different scenarios, where the attribute refers to genericness and sentiment states, respectively. The experimental results testify to the potential of our model, where meaningful responses can be generated in accordance with the specified attributes.
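The conditioning mechanism such frameworks rely on can be sketched with the reparameterization trick: the latent variable is drawn from a Gaussian whose mean and variance are computed from the attribute, so the attribute steers generation instead of leaving the latent fully randomized. The attribute vectors, weight matrices, and dimensions below are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

rng = np.random.default_rng(1)
A, Z = 2, 4                              # toy attribute and latent dimensions
Wm = rng.normal(size=(Z, A))             # maps attribute -> latent mean
Ws = rng.normal(scale=0.1, size=(Z, A))  # maps attribute -> log std-dev

def sample_latent(attr, rng):
    """Reparameterization: z = mu(attr) + sigma(attr) * eps, eps ~ N(0, I)."""
    mu = Wm @ attr
    sigma = np.exp(Ws @ attr)
    eps = rng.standard_normal(Z)
    return mu + sigma * eps, mu

# Hypothetical one-hot attributes, e.g. "generic" vs. "positive sentiment".
generic = np.array([1.0, 0.0])
positive = np.array([0.0, 1.0])

z_g, mu_g = sample_latent(generic, rng)
z_p, mu_p = sample_latent(positive, rng)
print(mu_g, mu_p)   # different attributes give different latent means
```

Because the noise enters only through `eps`, gradients flow through `mu` and `sigma` during training, while at inference time changing the attribute vector shifts the latent distribution, and with it, the generated response.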
Sensors
Generative conversational systems consisting of a neural network-based structural model and a linguistic model have long been considered an attractive research area. However, conversational systems tend to generate single-turn responses that lack diversity and informativeness. For this reason, the conversational system method is further developed by modeling and analyzing the joint structural and linguistic model, as presented in this paper. Firstly, we establish a novel dual-encoder structural model based on a new Convolutional Neural Network architecture and attention strengthened with intention. It is able to effectively extract the features of variable-length sequences and then mine their deep semantic information. Secondly, a linguistic model combining maximum mutual information with a foolish-punishment mechanism is proposed. Thirdly, the conversational system for the joint structural and linguistic model is observed and discussed. Then, to validate the effectiveness...
Variational Hierarchical User-based Conversation Model
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Generating appropriate conversation responses requires careful modeling of the utterances and speakers together. Some recent approaches to response generation model both the utterances and the speakers, but these approaches tend to generate responses that are overly tailored to the speakers. To overcome this limitation, we propose a new model with a stochastic variable designed to capture the speaker information and deliver it to the conversational context. An important part of this model is the network of speakers, in which each speaker is connected to one or more conversational partners, and this network is then used to model the speakers better. To test whether our model generates more appropriate conversation responses, we build a new conversation corpus containing approximately 27,000 speakers and 770,000 conversations. With this corpus, we run experiments on generating conversational responses and compare our model with other state-of-the-art models. By automatic evaluation metrics and human evaluation, we show that our model outperforms other models in generating appropriate responses. An additional advantage of our model is that it generates better responses for various new user scenarios, for example when one of the speakers is a known user in our corpus but the partner is a new user. For replicability, we make available all our code and data.
Generative Dialogue System Using Neural Network
SSRN Electronic Journal, 2019
A conversation between humans and computers is regarded as one of the most challenging problems in computer science, involving interdisciplinary techniques from information retrieval, machine learning, natural language processing, and artificial intelligence. It is an interactive entity that automatically generates conversations to exchange information smoothly with people who have little knowledge of the computer. The challenges lie in how to respond so as to maintain a relevant and continuous conversation with humans. This research applies a generative-model-based method for conversation generation, developing a conversational agent that generates conversations using a recurrent neural network and its coupled memory unit.
A Neural Conversational Model
Oriol Vinyals
Conversational modeling is an important task in natural language understanding and machine intelligence. Although previous approaches exist, they are often restricted to specific domains (e.g., booking an airline ticket) and require handcrafted rules. In this paper, we present a simple approach for this task which uses the recently proposed sequence to sequence framework. Our model converses by predicting the next sentence given the previous sentence or sentences in a conversation. The strength of our model is that it can be trained end-to-end and thus requires much fewer hand-crafted rules. We find that this straightforward model can generate simple conversations given a large conversational training dataset. Our preliminary results suggest that, despite optimizing the wrong objective function, the model is able to extract knowledge from both a domain-specific dataset, and from a large, noisy, and general domain dataset of movie subtitles. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common sense reasoning. As expected, we also find that the lack of consistency is a common failure mode of our model.