Pankaj Rajan - Academia.edu
Papers by Pankaj Rajan
arXiv (Cornell University), May 31, 2022
International Conference on Natural Language Generation, 2021
Incorporating external knowledge sources effectively in conversations is a longstanding problem in open-domain dialogue research. The existing literature on open-domain knowledge selection is limited and makes certain brittle assumptions on knowledge sources to simplify the overall task (Dinan et al., 2019), such as the existence of a single relevant knowledge sentence per context. In this work, we evaluate the existing state of open-domain conversation knowledge selection, showing where the existing methodologies regarding data and evaluation are flawed. We then improve on them by proposing a new framework for collecting relevant knowledge, and create an augmented dataset based on the Wizard of Wikipedia (WOW) corpus, which we call WOW++. WOW++ averages 8 relevant knowledge sentences per dialogue context, embracing the inherent ambiguity of open-domain dialogue knowledge selection. We then benchmark various knowledge ranking algorithms on this augmented dataset with both intrinsic evaluation and extrinsic measures of response quality, showing that neural rerankers that use WOW++ can outperform rankers trained on standard datasets.
arXiv (Cornell University), Sep 13, 2022
Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as conversational agents attempt to assist users with increasingly complex tasks, new conversational AI techniques and evaluation platforms are needed. The Alexa Prize TaskBot challenge, established in 2021, builds on the success of the SocialBot challenge by introducing the requirements of interactively assisting humans with real-world Cooking and Do-It-Yourself tasks, while making use of both voice and visual modalities. This challenge requires the TaskBots to identify and understand the user's need, identify and integrate task and domain knowledge into the interaction, and develop new ways of engaging the user without distracting them from the task at hand, among other challenges. This paper provides an overview of the TaskBot challenge, describes the infrastructure support provided to the teams with the CoBot Toolkit, and summarizes the approaches the participating teams took to overcome the research challenges. Finally, it analyzes the performance of the competing TaskBots during the first year of the competition.
Lecture notes in electrical engineering, 2021
User ratings play a significant role in spoken dialogue systems. Typically, such ratings tend to be averaged across all users and then utilized as feedback to improve the system or personalize its behavior. While this method can be useful to understand broad, general issues with the system and its behavior, it does not take into account differences between users that affect their ratings. In this work, we conduct a study to better understand how people rate their interactions with conversational agents. One macro-level characteristic that has been shown to correlate with how people perceive their interpersonal communication is personality [1, 2, 12]. We specifically focus on agreeableness and extraversion as variables that may explain variation in ratings and therefore provide a more meaningful signal for training or personalization. In order to elicit those personality traits during an interaction with a conversational agent, we designed and validated a fictional story, grounded in prior work in psychology. We then implemented the story in an experimental conversational agent that allowed users to opt in to hearing the story. Our results suggest that for human-conversational agent interactions, extraversion may play a role in user ratings, but more data is needed to determine if the relationship is significant. Agreeableness, on the other hand, plays a statistically significant role in conversation ratings: users who are more agreeable are more likely to provide a higher rating for their interaction. In addition, we found that users who opted to hear the story were, in general, more likely to rate their conversational experience higher than those who did not.
With advances in computing technology, the field of Artificial Neural Networks, which replicate the logic of the brain, gained vast interest during the 1980s. With growing interest in the unexplored field of neural networks and the rapid growth of related technology, tremendous work was undertaken in the development of intelligent machines and robotics. Most earlier research focused on transferring skill from humans to machines. This paper presents a neural network as a learner in the first step and as a teacher in the second step. It demonstrates a methodology for transferring skills from an expert to a less skilled apprentice using a neural network as an intermediate transfer medium. The network is trained with a random selection of input and output datasets generated during the expert's performance of the task.
2009 Seventh International Conference on Advances in Pattern Recognition, 2009
Shadows are the natural accomplice of objects. As such, they have affected various algorithms dealing with image segmentation, object tracking and recognition. A lot of research has been focused on removing shadows from images while preserving the information available in the shadow region. In this paper, we present a simple yet robust algorithm for cast shadow removal utilizing images taken
Building open domain conversational systems that allow users to have engaging conversations on topics of their choice is a challenging task. The Alexa Prize Socialbot Grand Challenge was launched in 2016 to tackle the problem of achieving natural, sustained, coherent and engaging open-domain dialogs. In the third iteration of the competition, university teams have moved the needle on the state of the art, bringing together common sense knowledge representations, neural response generation models, NLU systems enhanced by large-scale transformer models and improved dialog policies to switch between graph-based representations or retrieval-based or templated dialog fragments, along with generated responses. The Third Socialbot Grand Challenge included an improved version of the CoBot (conversational bot) toolkit from the prior competition, along with topic and dialog act detection models, conversation evaluators, and a sensitive content detection model so that the competing teams could...
ArXiv, 2021
Incorporating external knowledge sources effectively in conversations is a longstanding problem in open-domain dialogue research. The existing literature on open-domain knowledge selection is limited and makes certain brittle assumptions on knowledge sources to simplify the overall task, such as the existence of a single relevant knowledge sentence per context. In this work, we evaluate the existing state of open-domain conversation knowledge selection, showing where the existing methodologies regarding data and evaluation are flawed. We then improve on them by proposing a new framework for collecting relevant knowledge, and create an augmented dataset based on the Wizard of Wikipedia (WOW) corpus, which we call WOW++. WOW++ averages 8 relevant knowledge sentences per dialogue context, embracing the inherent ambiguity of open-domain dialogue knowledge selection. We then benchmark various knowledge ranking algorithms on this augmented dataset with both intrinsic evaluation and extrin...
The retrieval and browsing of diagrammatic information extracted from hand-drawn diagrams would open up a rich form of information interaction. However, such sketches currently require hand annotations in order to be understood by the computer. While improvements in sketch recognition algorithms have enabled automatic recognition for Tablet PC-sketched diagrams, such progress has been constrained to online algorithms. As a result, offline algorithms that are relevant to diagrams sketched on paper remain predominantly domain-dependent, and are also restrictive in the number of diagrams that can be understood. In this paper, we discuss our research aims for providing users with information interaction that combines the automatic correction capabilities found in online sketch recognition algorithms with the low-cost advantages of paper.
Sketch Based Interfaces and Modeling, 2008
Sketching is a way of conveying ideas to people of diverse backgrounds and culture without any linguistic medium. With the advent of inexpensive tablet PCs, online sketches have become more common, allowing for stroke-based sketch recognition techniques, more powerful editing techniques, and automatic simulation of recognized diagrams. Online sketches provide significantly more information than paper sketches, but they still do not provide the flexibility, naturalness, and simplicity of a simple piece of paper. Recognition methods exist for paper sketches, but they tend to be domain specific and don't benefit from the advances of stroke-based sketch recognition. Our goal is to combine the power of stroke-based sketch recognition with the flexibility and ease of use of a piece of paper. In this paper we present a stroke-tracing algorithm that can be used to extract stroke data from the pixelated image of a sketch drawn on paper. The presented method handles overlapping strokes and also attempts to capture sequencing information, which is helpful in many sketch recognition techniques. We present preliminary results of our algorithm on several paper-drawn, hand-sketched, scanned-in pixelated images. Categories and Subject Descriptors (according to ACM CCS): I.4.6 [Image Processing and Computer Vision]: Edge and Feature Detection.
CHI '08 Extended Abstracts on Human Factors in Computing Systems, 2008
Sketch recognition techniques have generally fallen into two camps. Gesture-based techniques, such as those used by the Palm Pilot's Graffiti, can provide high accuracy, but require the user to learn a particular drawing style in order for shapes to be recognized. Free-sketch recognition allows users to draw shapes as they would naturally, but most current techniques have low accuracies or require significant domain-level tweaking to make them usable. Our goal is to recognize free-hand sketches with high accuracy by developing generalized techniques that work for a variety of domains, including design and education. This is a work-in-progress, but we have made significant advancements toward our over-arching goal.
The goal of our research is to combine the power of stroke-based sketch recognition with the flexibility and ease of use of a piece of paper. In this paper we present preliminary results of our algorithm integrated with an online sketch recognition system built with LADDER. We also present a comparison of our paper-based interface with a tablet-based sketching interface.
As pen-based interfaces become more popular in today's applications, the need for algorithms to accurately recognize hand-drawn sketches and shapes has increased. In many cases, complex shapes can be constructed hierarchically as a combination of smaller primitive shapes meeting certain geometric constraints. However, in order to construct higher-level shapes, it is imperative to accurately recognize the lower-level primitives. Two approaches have become widespread in the sketch recognition field for recognizing lower-level primitives: gesture-based recognition and geometric-based recognition. Our goal is to use a hybrid approach that combines features from both traditional gesture-based recognition systems and geometric-based recognition systems. In this paper, we show that we can produce a system with high recognition rates while providing the added benefit of being able to produce normalized confidence values for alternative interpretations, something most geometric-based recognizers lack. More significantly, results from feature subset selection indicate that geometric features aid the recognition process more than gesture-based features when given naturally sketched data.
Journal of Cultural Heritage, 2010
One essential approach in preserving architectural heritage is the documentation of 3D geometries and surface textures of historic buildings. For example, precise colour information, excluding lighting effects, is an intrinsic property of the surface materials of building interiors and exteriors. However, while colour information has been recorded for small sample areas, it has not been accurately documented on the scale of entire building surfaces. This is critical, because building materials decay and their colours fade with time. The goal of this project is to develop a method to assist in recording and documenting the chromatic information of interiors and exteriors of historic buildings with low cost and high efficiency. The method takes advantage of emerging high dynamic range imaging (HDRI) technology, which can store rich information about colour and illumination through digital photography. By recording the colour information, in addition to the geometry and texture information obtained through other existing technologies, we can achieve more complete documentation for architectural heritage. In this paper, we discuss an overview of the problem and present our algorithms for utilizing computer vision techniques to retrieve chromatic information of historic buildings. We also present and discuss our experiments and results of applying our method to studies of lab objects and the Hall of Supreme Harmony in the Forbidden City, Beijing.