Nhat Le - Academia.edu

Papers by Nhat Le

2021 IEEE 15th International Conference on Semantic Computing (ICSC)

Most existing commercial goal-oriented chatbots are diagram-based; i.e., they follow a rigid dialog flow to fill the slot values needed to achieve a user's goal. Diagram-based chatbots are predictable, which explains their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, may violate the intended business constraints, and require large training datasets to produce a mature policy. We propose a framework that achieves a middle ground between the diagram-based and RL-based chatbots: we constrain the space of possible chatbot responses using a novel structure, the chatbot dependency graph, and use RL to dynamically select the best valid responses. Dependency graphs are directed graphs that conveniently express a chatbot's logic by defining the dependencies among slots: all valid dialog flows are encapsulated in one dependency graph. Our experiments in several domains show that our framework quickly adapts to user characteristics and achieves a success rate up to 23.77% higher than that of a state-of-the-art RL model.
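
The idea of constraining responses with a dependency graph can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the slot names, the `DEPENDENCIES` mapping, the rule that a slot may be requested once all of its prerequisite slots are filled, and the use of a plain score table (`q_values`) in place of a trained RL policy.

```python
from typing import Dict, List, Optional, Set

# Hypothetical dependency graph for a ticket-booking chatbot:
# each slot maps to the slots that must be filled before it can be requested.
DEPENDENCIES: Dict[str, Set[str]] = {
    "movie": set(),
    "city": set(),
    "theater": {"city"},
    "showtime": {"movie", "theater"},
    "num_tickets": {"showtime"},
}


def valid_requests(filled: Set[str]) -> List[str]:
    """Return the unfilled slots whose prerequisites are all filled."""
    return sorted(
        slot
        for slot, deps in DEPENDENCIES.items()
        if slot not in filled and deps <= filled
    )


def choose_request(filled: Set[str], q_values: Dict[str, float]) -> Optional[str]:
    """Pick the highest-scoring slot among the valid ones only.

    q_values stands in for a learned policy; invalid slots are never
    considered, no matter how high their score is.
    """
    candidates = valid_requests(filled)
    if not candidates:
        return None
    return max(candidates, key=lambda slot: q_values.get(slot, 0.0))
```

In this sketch the graph acts as a mask over the action space: the scoring function is free to rank the remaining slots in any order, but it can never select a request that the graph rules out.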

2019 IEEE International Conference on Humanized Computing and Communication (HCC), 2019

Transactional chatbots have become popular today, as they can automate repetitive transactions such as making an appointment or buying a ticket. As users interact with a chatbot, rich chat logs are generated that can be used to evaluate and improve the chatbot's effectiveness, defined as the ratio of chats that lead to a successful state such as buying a ticket. A fundamental operation in such analyses is clustering the chats in the chat log, which requires effective distance functions between a pair of chat sessions. In this paper, we propose and compare various distance measures for individual messages as well as whole sessions. We evaluate these measures using user studies on Mechanical Turk, where we first ask users to use our chatbots, and then ask them to judge the similarity of messages and sessions. Finally, we provide anecdotal results showing that our distance functions are effective in clustering messages and sessions.

International Journal of Semantic Computing, 2021

Most existing commercial goal-oriented chatbots are diagram-based; i.e., they follow a rigid dialog flow to fill the slot values needed to achieve a user’s goal. Diagram-based chatbots are predictable, which explains their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, may violate the intended business constraints, and require large training datasets to produce a mature policy. We propose a framework that achieves a middle ground between the diagram-based and RL-based chatbots: we constrain the space of possible chatbot responses using a novel structure, the chatbot dependency graph, and use RL to dynamically select the best valid responses. Dependency graphs are directed graphs that conveniently express a chatbot’s logic by definin...

JMIR Medical Informatics, 2020

Background: Medicine 2.0 (the adoption of Web 2.0 technologies such as social networks in health care) creates the need for apps that can find other patients with similar experiences and health conditions based on a patient’s electronic health record (EHR). Concurrently, there is an increasing number of longitudinal EHR data sets with rich information, which are essential to fulfill this need. Objective: This study aimed to evaluate the hypothesis that we can leverage similar EHRs to predict possible future medical concepts (e.g., disorders) from a patient’s EHR. Methods: We represented patients’ EHRs using time-based prefixes and suffixes, where each prefix or suffix is a set of medical concepts from a medical ontology. We compared the prefixes of other patients in the collection with the state of the current patient using various interpatient distance measures. The set of similar prefixes yields a set of suffixes, which we used to determine probable future concepts for the current patien...
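
The prefix/suffix approach described in the Methods can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's implementation: Jaccard distance is only one possible interpatient distance measure (the abstract does not name the ones used), the nearest-`k` voting scheme is hypothetical, and the concept names in the usage note are made up.

```python
from collections import Counter
from typing import List, Set, Tuple

# A record is a (prefix, suffix) pair: concepts seen before a cut-off
# time and concepts seen after it. This layout is an assumption.
Record = Tuple[Set[str], Set[str]]


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def predict_future_concepts(
    current: Set[str], records: List[Record], k: int = 2
) -> List[str]:
    """Rank concepts from the suffixes of the k most similar prefixes.

    Concepts the current patient already has are skipped; the rest are
    ordered by how many of the k nearest records contain them.
    """
    ranked = sorted(records, key=lambda rec: jaccard_distance(current, rec[0]))
    votes: Counter = Counter()
    for _prefix, suffix in ranked[:k]:
        votes.update(suffix - current)
    return [concept for concept, _count in votes.most_common()]
```

For example, with a current state of `{"diabetes", "hypertension"}` and three toy records whose prefixes are `{"diabetes", "hypertension"}`, `{"diabetes"}`, and `{"asthma"}`, the two nearest prefixes are the first two, and only their suffixes contribute candidate future concepts.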

Background: An increasing number of doctor reviews are being generated by patients on the internet. These reviews address a diverse set of topics (features), including wait time, office staff, doctor’s skills, and bedside manner. Most previous work on automatic analysis of Web-based customer reviews assumes that (1) product features are described unambiguously by a small number of keywords, for example, battery for phones, and (2) the opinion for each feature has a positive or negative sentiment. However, in the domain of doctor reviews, this setting is too restrictive: a feature such as visit duration may be expressed in many ways and does not necessarily have a positive or negative sentiment. Objective: This study aimed to adapt existing text classification methods, and to propose novel ones, for the domain of doctor reviews. These methods are evaluated on how accurately they classify a diverse set of doctor review features. Methods: We first manually examined a large number of...
