Sadhana Kumaravel - Academia.edu

Papers by Sadhana Kumaravel

Circles are like Ellipses, or Ellipses are like Circles? Measuring the Degree of Asymmetry of Static and Contextual Word Embeddings and the Implications to Representation Learning

Proceedings of the ... AAAI Conference on Artificial Intelligence, May 18, 2021

Beyond Backprop: Alternating Minimization with co-Activation Memory

We propose a novel online algorithm for training deep feedforward neural networks that employs alternating minimization (block-coordinate descent) between the weights and activation variables. It extends off-line alternating minimization approaches to online, continual learning, and improves over stochastic gradient descent (SGD) with backpropagation in several ways: it avoids the vanishing gradient issue, it allows for non-differentiable nonlinearities, and it permits parallel weight updates across the layers. Unlike SGD, our approach employs co-activation memory inspired by the online sparse coding algorithm of [17]. Furthermore, local iterative optimization with explicit activation updates is a potentially more biologically plausible learning mechanism than backpropagation. We provide theoretical convergence analysis and promising empirical results on several datasets.
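
The alternating scheme described above can be pictured with a toy example. The sketch below is a minimal illustration only, not the paper's algorithm: the local losses, step sizes, and projection step are assumptions, and it omits the co-activation memory. It alternates between a step on free activation variables and a step on the weights of a two-layer network.

```python
# A rough, self-contained illustration of alternating minimization
# (block-coordinate descent) between weights and activation variables for a
# tiny 2-layer network. Loss terms, step sizes, and the projection step are
# illustrative assumptions, not the paper's exact algorithm.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10))          # mini-batch of inputs
Y = rng.standard_normal((64, 3))           # regression targets

W1 = 0.1 * rng.standard_normal((10, 20))   # layer-1 weights
W2 = 0.1 * rng.standard_normal((20, 3))    # layer-2 weights
A1 = np.maximum(X @ W1, 0.0)               # auxiliary activation variables

rho, lr = 1.0, 0.05                        # penalty weight, step size

for _ in range(200):
    # (1) Activation step: weights fixed, move A1 toward the layer-1 output
    #     while also reducing the prediction error through W2.
    grad_A1 = rho * (A1 - np.maximum(X @ W1, 0.0)) + (A1 @ W2 - Y) @ W2.T
    A1 = np.maximum(A1 - lr * grad_A1, 0.0)   # stay in the ReLU range

    # (2) Weight step: activations fixed, each layer gets a local update
    #     (a gradient step on its own reconstruction / prediction error).
    pre1 = X @ W1
    mask = (pre1 > 0).astype(float)
    W1 -= lr * X.T @ ((np.maximum(pre1, 0.0) - A1) * mask) / len(X)
    W2 -= lr * A1.T @ (A1 @ W2 - Y) / len(X)

loss = np.mean((np.maximum(X @ W1, 0.0) @ W2 - Y) ** 2)
print(f"feedforward loss after alternating updates: {loss:.4f}")
```

Note that neither block of updates requires backpropagating through the other layer, which is what allows the per-layer parallelism mentioned in the abstract.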

Beyond backprop: Online alternating minimization with auxiliary variables

International Conference on Machine Learning, May 24, 2019

Human-AI Collaboration in a Cooperative Game Setting

Proceedings of the ACM on Human-Computer Interaction, 2020

DocAMR: Multi-Sentence AMR Representation and Evaluation

ArXiv, 2021

Despite extensive research on the parsing of English sentences into Abstract Meaning Representation (AMR) graphs, which are compared to gold graphs via the Smatch metric, full-document parsing into a unified graph representation lacks a well-defined representation and evaluation. Taking advantage of a super-sentential level of coreference annotation from previous work, we introduce a simple algorithm for deriving a unified graph representation, avoiding the pitfalls of information loss from over-merging and lack of coherence from under-merging. Next, we describe improvements to the Smatch metric to make it tractable for comparing document-level graphs, and use it to re-evaluate the best published document-level AMR parser. We also present a pipeline approach combining the top-performing AMR parser and coreference resolution systems, providing a strong baseline for future research.
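
To make the "unified graph representation" idea concrete, here is a minimal sketch of how sentence-level graphs can be merged through coreference clusters. The data layout (plain dicts and triples with hypothetical node ids) and the merge policy are illustrative assumptions, not DocAMR's actual format or algorithm.

```python
# Toy sketch: combine per-sentence AMR-style graphs plus coreference clusters
# into one document-level graph by giving coreferent nodes a shared canonical id.

# Each sentence graph: a concept label per node and (source, relation, target) triples.
sent_graphs = [
    {"concepts": {"s0_p": "person", "s0_w": "work-01"},
     "triples": [("s0_w", ":ARG0", "s0_p")]},
    {"concepts": {"s1_p": "she", "s1_l": "leave-01"},
     "triples": [("s1_l", ":ARG0", "s1_p")]},
]

# Coreference clusters over node ids ("person" in sentence 0 corefers with "she" in sentence 1).
clusters = [{"s0_p", "s1_p"}]

def merge_document_graph(sent_graphs, clusters):
    # Coreferent nodes map to one canonical id; all other nodes keep their own.
    canon = {node: f"coref_{i}" for i, cluster in enumerate(clusters) for node in cluster}

    concepts, triples = {}, set()
    for graph in sent_graphs:
        for node, concept in graph["concepts"].items():
            # Keep the first concept seen for a merged node; a fuller treatment
            # would prefer the most specific mention to avoid losing information.
            concepts.setdefault(canon.get(node, node), concept)
        for src, rel, tgt in graph["triples"]:
            triples.add((canon.get(src, src), rel, canon.get(tgt, tgt)))
    return concepts, sorted(triples)

concepts, triples = merge_document_graph(sent_graphs, clusters)
for src, rel, tgt in triples:
    print(f"{src} {rel} {tgt}")   # both sentences now point at the same coref_0 node
```

Over-merging (collapsing nodes that merely share a concept) loses information, while under-merging (never linking mentions) leaves the document graph incoherent; the coreference clusters are what keep the sketch between those two failure modes.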

Circles are like Ellipses, or Ellipses are like Circles? Measuring the Degree of Asymmetry of Static and Contextual Embeddings and the Implications to Representation Learning

Human judgments of word similarity have been a popular method of evaluating the quality of word embeddings, but they fail to measure geometric properties such as asymmetry. For example, it is more natural to say "Ellipses are like Circles" than "Circles are like Ellipses". Such asymmetry has been observed in a psychoanalysis test called the word evocation experiment, where one word is used to recall another. Although useful, such experimental data have been significantly understudied for measuring embedding quality. In this paper, we use three well-known evocation datasets to gain insights into the asymmetry encoding of embeddings. We study both static embeddings and contextual embeddings, such as BERT. Evaluating asymmetry for BERT is generally hard due to the dynamic nature of its embeddings. Thus, we probe BERT's conditional probabilities (as a language model) using a large number of Wikipedia contexts to derive a theoretically justifiable Bayesian asymmetry sco...
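
A toy illustration of turning conditional probabilities into an asymmetry score follows. The probability values are made-up placeholders; in the setting described above they would be estimated by probing a masked language model over many Wikipedia contexts. The log-ratio form is one natural choice and is not claimed to be the paper's exact Bayesian score.

```python
# Hypothetical asymmetry score from directional conditional probabilities.
import math

# Placeholder averaged estimates (assumptions, not measured values):
#   p_circle_given_ellipse ~ P("circle" is evoked | "ellipse" appears in context)
#   p_ellipse_given_circle ~ P("ellipse" is evoked | "circle" appears in context)
p_circle_given_ellipse = 0.08
p_ellipse_given_circle = 0.02

def asymmetry(p_b_given_a: float, p_a_given_b: float) -> float:
    """Positive when the a -> b direction ('a is like b') is the more natural one."""
    return math.log(p_b_given_a) - math.log(p_a_given_b)

score = asymmetry(p_circle_given_ellipse, p_ellipse_given_circle)
print(f"asymmetry(ellipse -> circle) = {score:.3f}")   # > 0: "ellipses are like circles"
```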

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to efficiently act in the world by pruning out implausible actions, and to perform look-ahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called TextWorld Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents which track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents which incorporate commonsense knowledge in TWC perform better, while acting more efficiently. We conduct user-studies to...
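
A minimal sketch of the kind of commonsense retrieval described above: given objects mentioned in the current observation, look up related facts in a small ConceptNet-style table. The facts, the matching rule, and the pruning comment are illustrative assumptions, not the TWC agents' actual retrieval mechanism.

```python
# Tiny stand-in for ConceptNet triples: (subject, relation, object).
COMMONSENSE = [
    ("apple", "AtLocation", "refrigerator"),
    ("dirty sock", "AtLocation", "washing machine"),
    ("coat", "AtLocation", "wardrobe"),
    ("knife", "UsedFor", "cutting"),
]

def retrieve_facts(observation: str, knowledge=COMMONSENSE):
    """Return facts whose subject string appears in the observation text."""
    obs = observation.lower()
    return [fact for fact in knowledge if fact[0] in obs]

obs = "You see a dirty sock on the floor and an apple on the table."
for subj, rel, tgt in retrieve_facts(obs):
    print(f"{subj} --{rel}--> {tgt}")
# An agent could then prune implausible actions, e.g. prefer
# "put dirty sock in washing machine" over other placements.
```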

Beyond Backprop: Alternating Minimization with co-Activation Memory

ArXiv, 2018

We propose a novel online algorithm for training deep feedforward neural networks that employs alternating minimization (block-coordinate descent) between the weights and activation variables. It extends off-line alternating minimization approaches to online, continual learning, and improves over stochastic gradient descent (SGD) with backpropagation in several ways: it avoids the vanishing gradient issue, it allows for non-differentiable nonlinearities, and it permits parallel weight updates across the layers. Unlike SGD, our approach employs co-activation memory inspired by the online sparse coding algorithm of [17]. Furthermore, local iterative optimization with explicit activation updates is a potentially more biologically plausible learning mechanism than backpropagation. We provide theoretical convergence analysis and promising empirical results on several datasets.

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several well-known issues, such as vanishing and exploding gradients, inability to handle non-differentiable nonlinearities and to parallelize weight-updates across layers, and biological implausibility. These limitations continue to motivate exploration of alternative training algorithms, including several recently proposed auxiliary-variable methods which break the complex nested objective function into local subproblems. However, those techniques are mainly offline (batch), which limits their applicability to extremely large datasets, as well as to online, continual or reinforcement learning. The main contribution of our work is a novel online (stochastic/mini-batch) alternating minimization (AM) approach for training deep neural networks, together wi...
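
The "auxiliary variables" mentioned above refer to decompositions of the nested network objective into per-layer subproblems. One common penalty-style formulation of that idea (a generic illustration used by several offline methods, not necessarily this paper's exact objective) is:

```latex
% Generic auxiliary-variable (penalty) reformulation of deep network training.
% The a_l are free activation variables tied to the layer maps f_l by a
% quadratic penalty with weight rho; minimizing over {W_l} and {a_l} in
% alternation yields local, per-layer subproblems. Shown only to illustrate
% the decomposition idea, not as this paper's exact formulation.
\min_{\{W_l\},\,\{a_l\}} \;
  \mathcal{L}\bigl(a_L,\, y\bigr)
  \;+\; \frac{\rho}{2} \sum_{l=1}^{L}
        \bigl\lVert a_l - f_l(W_l\, a_{l-1}) \bigr\rVert^2,
\qquad a_0 = x .
```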

Effects of Communication Directionality and AI Agent Differences in Human-AI Interaction

Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems

In Human-AI collaborative settings that are inherently interactive, the direction of communication plays a role in how users perceive their AI partners. In an AI-driven cooperative game with partially observable information, players (be they the AI or the human) need their actions to be interpreted accurately by the other player to yield a successful outcome. In this paper, we investigate social perceptions of AI agents under various directions of communication in a cooperative game setting. We measure participants' subjective social perceptions (rapport, intelligence, and likeability) of their partners, varying whether participants believe they are playing with an AI or with a human and the nature of the communication (responsiveness and leading roles). We ran a large-scale study on Mechanical Turk (n=199) of this collaborative game and find significant differences in gameplay outcome and social perception across different AI agents, different directions of communication, and whether the agent is perceived to be an AI or a human. We find that the bias against the AI demonstrated in prior studies varies with the direction of the communication and with the AI agent.

Human-AI Collaboration in a Cooperative Game Setting: Measuring Social Perception and Outcomes

Human-AI interaction is pervasive across many areas of our day-to-day lives. In this paper, we investigate human-AI collaboration in the context of a collaborative AI-driven word association game with partially observable information. In our experiments, we test various dimensions of participants' subjective social perceptions (rapport, intelligence, creativity, and likeability) of their partners when participants believe they are playing with an AI or with a human. We also test participants' subjective social perceptions of their partners when participants are presented with a variety of confidence levels. We ran a large-scale study on Mechanical Turk (n=164) of this collaborative game. Our results show that when participants believed their partners were human, they found their partners more likeable, intelligent, and creative, felt more rapport with them, and used more positive words to describe their partner's attributes than when they believed they were interacting wi...

Cross Sentence Inference for Process Knowledge

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Mental Models of AI Agents in a Cooperative Game Setting

Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

As more and more forms of AI become prevalent, it becomes increasingly important to understand how people develop mental models of these systems. In this work we study people's mental models of AI in a cooperative word guessing game. We run think-aloud studies in which people play the game with an AI agent; through thematic analysis we identify features of the mental models developed by participants. In a large-scale study we have participants play the game with the AI agent online and use a post-game survey to probe their mental model. We find that those who win more often have better estimates of the AI agent's abilities. We present three components for modeling AI systems, propose that understanding the underlying technology is insufficient for developing appropriate conceptual models (analysis of behavior is also necessary), and suggest future work for studying the revision of mental models over time.
