Personalized Dialogue Response Generation Learned from Monologues
Related papers
Proceedings of the 12th International Conference on Natural Language Generation
Encoder-decoder based neural architectures serve as the basis of state-of-the-art approaches in end-to-end open-domain dialog systems. Since most such systems are trained with a maximum likelihood (MLE) objective, they suffer from issues such as lack of generalizability and the generic response problem, i.e., a system response that can be an answer to a large number of user utterances, e.g., "Maybe, I don't know." Having explicit feedback on the relevance and interestingness of a system response at each turn can be a useful signal for mitigating such issues and improving system quality by selecting responses from different approaches. Towards this goal, we present a system that evaluates chatbot responses at each dialog turn for coherence and engagement. Our system provides explicit turn-level dialog quality feedback, which we show to be highly correlated with human evaluation. To show that incorporating this feedback in neural response generation models improves dialog quality, we present two different and complementary mechanisms to incorporate explicit feedback into a neural response generation model: reranking and direct modification of the loss function during training. Our studies show that a response generation model that incorporates these combined feedback mechanisms produces more engaging and coherent responses in an open-domain spoken dialog setting, significantly improving response quality under both automatic and human evaluation.
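A minimal sketch of the reranking idea mentioned in the abstract, not the authors' implementation: candidate responses are rescored by a turn-level quality model and the highest-scoring one is returned. The `quality_score` callable is a hypothetical stand-in for a learned coherence/engagement scorer.

```python
from typing import Callable, List

def rerank_responses(
    context: List[str],
    candidates: List[str],
    quality_score: Callable[[List[str], str], float],
) -> str:
    """Pick the candidate with the highest predicted turn-level quality."""
    scored = [(quality_score(context, cand), cand) for cand in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1]

if __name__ == "__main__":
    # Toy scorer (illustration only): reward longer replies, penalize a known generic one.
    toy_score = lambda ctx, resp: len(resp.split()) - 5.0 * (resp.lower() == "maybe, i don't know.")
    context = ["What did you think of the movie?"]
    candidates = ["Maybe, I don't know.", "I loved the soundtrack, especially the opening theme."]
    print(rerank_responses(context, candidates, toy_score))
```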
Viola: A Topic Agnostic Generate-and-Rank Dialogue System
Cornell University - arXiv, 2021
We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach. Leveraging recent advances of generative dialogue systems powered by large language models, Viola fetches a batch of response candidates from various neural dialogue models trained with different datasets and knowledge-grounding inputs. Additional responses originating from template-based generators are also considered, depending on the user's input and detected entities. The hand-crafted generators build on a dynamic knowledge graph injected with rich content that is crawled from the web and automatically processed on a daily basis. Viola's response ranker is a fine-tuned polyencoder that chooses the best response given the dialogue history. While dedicated annotations for the polyencoder alone can indirectly steer it away from choosing problematic responses, we add rule-based safety nets to detect neural degeneration and a dedicated classifier to filter out offensive content. We analyze conversations that Viola took part in for the Alexa Prize Socialbot Grand Challenge 4 and discuss the strengths and weaknesses of our approach. Lastly, we suggest future work with a focus on curating conversation data specifically for socialbots that will contribute towards a more robust data-driven socialbot. 4th Proceedings of Alexa Prize (Alexa Prize 2020).
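A rough sketch of a generate-and-rank dialogue manager in the spirit described above (simplified, not Viola's actual code): candidates are pooled from several generators, unsafe ones are filtered out, and a ranker picks the best. The generator, safety, and ranking callables are assumed interfaces.

```python
from typing import Callable, List, Sequence

def generate_and_rank(
    history: List[str],
    generators: Sequence[Callable[[List[str]], List[str]]],
    is_safe: Callable[[str], bool],
    rank: Callable[[List[str], str], float],
) -> str:
    """Fetch candidates from every generator, drop unsafe ones, return the top-ranked one."""
    candidates: List[str] = []
    for gen in generators:
        candidates.extend(gen(history))
    candidates = [c for c in candidates if is_safe(c)]
    if not candidates:
        return "I'm not sure what to say to that."  # neutral fallback when everything is filtered
    return max(candidates, key=lambda c: rank(history, c))
```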
Diversifying Dialogue Generation with Non-Conversational Text
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Neural network-based sequence-to-sequence (seq2seq) models strongly suffer from the low-diversity problem when it comes to open-domain dialogue generation. As bland and generic utterances usually dominate the frequency distribution in our daily chitchat, avoiding them in order to generate more interesting responses requires complex data filtering, sampling techniques or modifying the training objective. In this paper, we propose a new perspective to diversify dialogue generation by leveraging non-conversational text. Compared with bilateral conversations, non-conversational text is easier to obtain, more diverse and covers a much broader range of topics. We collect a large-scale non-conversational corpus from multiple sources including forum comments, idioms and book snippets. We further present a training paradigm to effectively incorporate these texts via iterative back translation. The resulting model is tested on two conversational datasets and is shown to produce significantly more diverse responses without sacrificing the relevance with context. Example from the paper: for the context 暗恋的人却不喜欢我 ("The one I have a crush on doesn't like me."), a typical conversational response is the generic 摸摸头 ("Head pat."), whereas a non-conversational forum comment offers the richer 暗恋这碗酒，谁喝都会醉啊 ("Crush is an alcoholic drink; whoever drinks it will get intoxicated.").
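A conceptual sketch of the iterative back-translation loop described above. This is my reading of the paradigm, not the authors' code; `forward_model` (context to response), `backward_model` (response to context), and their `train`/`generate` methods are assumed interfaces.

```python
def iterative_back_translation(forward_model, backward_model,
                               dialogue_pairs, non_conversational_texts,
                               n_iterations=3):
    for _ in range(n_iterations):
        # 1. Train the backward model on real (response -> context) pairs.
        backward_model.train([(resp, ctx) for ctx, resp in dialogue_pairs])

        # 2. Generate a pseudo-context for each non-conversational sentence,
        #    treating the sentence as a would-be response.
        pseudo_pairs = [(backward_model.generate(text), text)
                        for text in non_conversational_texts]

        # 3. Retrain the forward model on real plus pseudo pairs, so the more
        #    diverse non-conversational text shapes its response distribution.
        forward_model.train(dialogue_pairs + pseudo_pairs)
    return forward_model
```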
Large Scale Multi-Actor Generative Dialog Modeling
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
Non-goal oriented dialog agents (i.e. chatbots) aim to produce varying and engaging conversations with a user; however, they typically exhibit either inconsistent personality across conversations or the average personality of all users. This paper addresses these issues by controlling an agent's persona upon generation via conditioning on prior conversations of a target actor. In doing so, we are able to utilize more abstract patterns within a person's speech and better emulate them in generated responses. This work introduces the Generative Conversation Control model, an augmented and fine-tuned GPT-2 language model that conditions on past reference conversations to probabilistically model multi-turn conversations in the actor's persona. We introduce an accompanying data collection procedure to obtain 10.3M conversations from six months' worth of Reddit comments. We demonstrate that scaling model sizes from 117M to 8.3B parameters yields an improvement from 23.14 to 13.14 perplexity on 1.7M held-out Reddit conversations. Increasing model scale yielded similar improvements in human evaluations that measure preference of model samples over the held-out target distribution in terms of realism (31% increased to 37% preference), style matching (37% to 42%), grammar and content quality (29% to 42%), and conversation coherency (32% to 40%). We find that conditionally modeling past conversations improves perplexity by 0.47 in automatic evaluations. Through human trials we identify positive trends between conditional modeling and style matching and outline steps to further improve persona control.
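An illustrative sketch (my own, not the paper's code) of the core conditioning idea: prepend a reference conversation from the target actor to the current dialogue context before decoding with a GPT-2 language model, so generation picks up that actor's style. It assumes the Hugging Face `transformers` library; the example text and the plain-text turn format are arbitrary choices.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Past reference conversation of the target actor (assumed example data).
reference_conversation = (
    "A: I spent the weekend rebuilding my bike.\n"
    "B: Nice, fixed gear or geared?\n"
    "A: Geared, the old derailleur finally gave out.\n"
)
current_context = "B: Any plans for next weekend?\nA:"

# Condition on the reference conversation followed by the current context.
prompt = reference_conversation + "\n" + current_context
input_ids = tokenizer.encode(prompt, return_tensors="pt")

output_ids = model.generate(
    input_ids,
    max_length=input_ids.shape[1] + 40,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
# Print only the newly generated continuation.
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```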
Towards Robust Online Dialogue Response Generation
ArXiv, 2022
Although pre-trained sequence-to-sequence models have achieved great success in dialogue response generation, chatbots still suffer from generating inconsistent responses in real-world practice, especially in multi-turn settings. We argue that this can be caused by a discrepancy between training and real-world testing. At training time, the chatbot generates responses given the gold context, while at real-world testing time it has to generate from a context consisting of both user utterances and its own previously predicted utterances. As the number of utterances grows, this discrepancy becomes more serious in multi-turn settings. In this paper, we propose a hierarchical sampling-based method consisting of both utterance-level sampling and semi-utterance-level sampling to alleviate the discrepancy, which implicitly increases dialogue coherence. We further adopt reinforcement learning and re-ranking methods to explicitly optimize dialogue coherence during training and in...
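A minimal sketch of utterance-level sampling in the spirit of the hierarchical method described above (not the authors' implementation): with some probability, each system turn in the gold context is replaced by a model-generated one, so training contexts better resemble the mixed gold/predicted contexts seen at deployment time. The `generate` callable is an assumed model interface.

```python
import random
from typing import Callable, List, Tuple

def sample_training_context(
    gold_context: List[Tuple[str, str]],                # (speaker, utterance) pairs
    generate: Callable[[List[Tuple[str, str]]], str],   # assumed model interface
    replace_prob: float = 0.3,
) -> List[Tuple[str, str]]:
    mixed: List[Tuple[str, str]] = []
    for speaker, utterance in gold_context:
        if speaker == "system" and random.random() < replace_prob:
            # Replace the gold system turn with the model's own prediction,
            # conditioned on the (possibly already mixed) preceding turns.
            utterance = generate(mixed)
        mixed.append((speaker, utterance))
    return mixed
```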
Generative Dialogue System Using Neural Network
SSRN Electronic Journal, 2019
Conversation between humans and computers is regarded as one of the hardest problems in computer science, involving interdisciplinary techniques from information retrieval, machine learning, natural language processing, and artificial intelligence. The goal is an interactive entity that automatically generates conversation, allowing people with little knowledge of computers to exchange information smoothly. The challenge lies in how to respond so as to maintain a relevant and continuous conversation with humans. This research applies a generative model-based method for conversation generation and develops a conversational agent that generates responses using a recurrent neural network with a coupled memory unit.
Generative Deep Neural Networks for Dialogue: A Short Review
Researchers have recently started investigating deep neural networks for dialogue applications. In particular, generative sequence-to-sequence (Seq2Seq) models have shown promising results for unstructured tasks, such as word-level dialogue response generation. The hope is that such models will be able to leverage massive amounts of data to learn meaningful natural language representations and response generation strategies, while requiring a minimum amount of domain knowledge and hand-crafting. An important challenge is to develop models that can effectively incorporate dialogue context and generate meaningful and diverse responses. In support of this goal, we review recently proposed models based on generative encoder-decoder neural network architectures, and show that these models have better ability to incorporate long-term dialogue history, to model uncertainty and ambiguity in dialogue, and to generate responses with high-level compositional structure.
A Neural Conversational Model for Automatic Generation of Conversations
2018
Abstract: Conversation between humans and machines is regarded as one of the hardest problems in computer technology, involving interdisciplinary techniques from information retrieval, machine learning, natural language understanding and artificial intelligence. The goal is an interactive entity that automatically generates conversation, allowing people with little knowledge of the computer to exchange information smoothly. Challenges lie in how to respond so as to maintain a relevant and continuous conversation with humans. Conversational modeling is an important task in natural language understanding and machine learning. This research applies generative model-based metho...
Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017
Sequence-to-sequence models have been applied to the conversation response generation problem, where the source sequence is the conversation history and the target sequence is the response. Unlike translation, conversation responding is inherently creative. The generation of long, informative, coherent, and diverse responses remains a hard task. In this work, we focus on the single-turn setting. We add self-attention to the decoder to maintain coherence in longer responses, and we propose a practical approach, called the glimpse-model, for scaling to large datasets. We introduce a stochastic beam-search algorithm with segment-by-segment reranking which lets us inject diversity earlier in the generation process. We trained on a combined data set of over 2.3B conversation messages mined from the web. In human evaluation studies, our method produces longer responses overall, with a higher proportion rated as acceptable and excellent as length increases, compared to baseline sequence-to-sequence models with explicit length promotion. A back-off strategy produces better responses overall, across the full spectrum of lengths.
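A simplified sketch of the segment-by-segment reranking idea as I read the abstract (not the authors' implementation): generation proceeds one segment at a time, several candidate segments are sampled stochastically at each step, and a scoring function reranks them before the best is appended, injecting diversity early rather than only over full responses. `sample_segment` and `score` are assumed interfaces to a stochastic decoder and a segment scorer.

```python
from typing import Callable

def generate_by_segments(
    context: str,
    sample_segment: Callable[[str], str],   # assumed stochastic decoder step
    score: Callable[[str, str], float],     # assumed (context, partial response) scorer
    n_candidates: int = 5,
    max_segments: int = 4,
) -> str:
    response = ""
    for _ in range(max_segments):
        # Sample several candidate continuations for the next segment.
        candidates = [sample_segment(context + response) for _ in range(n_candidates)]
        # Rerank them and keep the best before moving on to the next segment.
        best = max(candidates, key=lambda seg: score(context, response + seg))
        response += best
        if best.strip().endswith((".", "!", "?")):   # crude stop condition
            break
    return response
```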
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent
2022
We present Chirpy Cardinal, an open-domain social chatbot. Aiming to be both informative and conversational, our bot chats with users in an authentic, emotionally intelligent way. By integrating controlled neural generation with scaffolded, handwritten dialogue, we let both the user and bot take turns driving the conversation, producing an engaging and socially fluent experience. Deployed in the fourth iteration of the Alexa Prize Socialbot Grand Challenge, Chirpy Cardinal handled thousands of conversations per day, placing second out of nine bots with an average user rating of 3.58/5.