Ross Knepper - Profile on Academia.edu
Papers by Ross Knepper
Modern robots have to interact with their environment, search for objects, and move them around. Yet, for a robot to pick up an object, it needs to identify the object's orientation and locate it to within centimeter-scale accuracy. Existing systems that provide such information are either very expensive (e.g., the VICON motion capture system valued at hundreds of thousands of dollars) and/or suffer from occlusion and narrow field of view (e.g., computer vision approaches). This paper presents RF-Compass, an RFID-based system for robot navigation and object manipulation. RFIDs are low-cost and work in non-line-of-sight scenarios, allowing them to address the limitations of existing solutions. Given an RFID-tagged object, RF-Compass accurately navigates a robot equipped with RFIDs toward the object. Further, it locates the center of the object to within a few centimeters and identifies its orientation so that the robot may pick it up. RF-Compass's key innovation is an iterative algorithm formulated as a convex optimization problem. The algorithm uses the RFID signals to partition the space and keeps refining the partitions based on the robot's consecutive moves. We have implemented RF-Compass using USRP software radios and evaluated it with commercial RFIDs and a KUKA youBot robot. For the task of furniture assembly, RF-Compass can locate furniture parts to a median of 1.28 cm, and identify their orientation to a median of 3.3 degrees.
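The abstract's core idea is spatial partition refinement from relative RFID measurements. As an illustration only (not the paper's convex program), the sketch below shrinks a discretized candidate region for the object's center using pairwise "tag A reads closer than tag B" comparisons; the coordinates and grid are invented.

```python
# Toy sketch: refine a candidate region for an RFID-tagged object's centre from
# pairwise "this tag is closer than that tag" comparisons gathered as the robot
# moves.  All positions here are illustrative placeholders.
import numpy as np

def refine_region(candidates, comparisons):
    """candidates: (N, 2) array of hypothesised object positions.
    comparisons: list of (closer_tag_xy, farther_tag_xy) pairs derived from
    relative RFID measurements at one robot pose."""
    keep = np.ones(len(candidates), dtype=bool)
    for near, far in comparisons:
        near, far = np.asarray(near), np.asarray(far)
        # The object must lie on the 'near' side of the perpendicular bisector.
        keep &= (np.linalg.norm(candidates - near, axis=1)
                 <= np.linalg.norm(candidates - far, axis=1))
    return candidates[keep]

# One refinement step: two robot-mounted tags at (0, 0) and (1, 0), with the
# (0, 0) tag reading a stronger signal from the object's tag.
grid = np.stack(np.meshgrid(np.linspace(-2, 2, 81),
                            np.linspace(-2, 2, 81)), -1).reshape(-1, 2)
region = refine_region(grid, [((0.0, 0.0), (1.0, 0.0))])
print(region.mean(axis=0))  # shrinking estimate of the object centre
```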
The International Journal of Robotics Research, Jul 24, 2015
We present algorithms and experiments for multi-scale assembly of complex structures by multi-robot teams. We focus on tasks where successful completion requires multiple types of assembly operations with a range of precision requirements. We develop a hierarchical planning approach to multi-scale perception in support of multi-scale manipulation, in which the resolution of the perception operation is
2013 8th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Mar 1, 2013
We describe an approach for enabling robots to recover from failures by asking for help from a human partner. For example, if a robot fails to grasp a needed part during a furniture assembly task, it might ask a human partner to "Please hand me the white table leg near you." After receiving the part from the human, the robot can recover from its grasp failure and continue the task autonomously. This paper describes an approach for enabling a robot to automatically generate a targeted natural language request for help from a human partner. The robot generates a natural language description of its need by minimizing the entropy of the command with respect to its model of language understanding for the human partner, a novel approach to grounded language generation. Our long-term goal is to compare targeted requests for help to more open-ended requests where the robot simply asks "Help me," demonstrating that targeted requests are more easily understood by human partners.
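To make the entropy-minimization idea concrete, here is a toy sketch: score each candidate request by the entropy of the listener's inferred referent under a made-up model of the human partner, and pick the lowest-entropy request. The candidate phrasings and probabilities are placeholders, not values from the paper.

```python
# Minimal sketch of choosing a help request by minimising the entropy of the
# listener's inferred referent.  The listener model below is an invented example.
import math

def entropy(dist):
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

# P(object the human hands over | request), from a model of the human partner.
listener_model = {
    "Hand me the leg.": {"white leg": 0.4, "black leg": 0.4, "table top": 0.2},
    "Hand me the white table leg.": {"white leg": 0.9, "black leg": 0.05, "table top": 0.05},
    "Hand me the white table leg near you.": {"white leg": 0.97, "black leg": 0.02, "table top": 0.01},
}

best_request = min(listener_model, key=lambda r: entropy(listener_model[r]))
print(best_request)  # the most targeted, least ambiguous request
```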
Humans expect their collaborators to look beyond the explicit interpretation of their words. Implicature is a common form of implicit communication that arises in natural language discourse when an utterance leverages context to imply information beyond what the words literally convey. Whereas computational methods have been proposed for interpreting and using different forms of implicature, its role in human and artificial agent collaboration has not yet been explored in a concrete domain. The results of this paper provide insights into how artificial agents should be structured to facilitate natural and efficient communication of actionable information with humans. We investigated implicature by implementing two strategies for playing Hanabi, a cooperative card game that relies heavily on communication of actionable implicit information to achieve a shared goal. In a user study with 904 completed games and 246 completed surveys, human players randomly paired with an implicature AI are 71% more likely to think their partner is human than players paired with a non-implicature AI. These teams demonstrated game performance similar to other state-of-the-art approaches.
arXiv (Cornell University), May 31, 2018
We introduce a method for following high-level navigation instructions by mapping directly from images, instructions and pose estimates to continuous low-level velocity commands for real-time control. The Grounded Semantic Mapping Network (GSMN) is a fully-differentiable neural network architecture that builds an explicit semantic map in the world reference frame by incorporating a pinhole camera projection model within the network. The information stored in the map is learned from experience, while the local-to-world transformation is computed explicitly. We train the model using DAGGERFM, a modified variant of DAGGER that trades tabular convergence guarantees for improved training speed and memory use. We test GSMN in virtual environments on a realistic quadcopter simulator and show that incorporating explicit mapping and grounding modules allows GSMN to outperform strong neural baselines and almost reach the performance of an expert policy. Finally, we analyze the learned map representations and show that using an explicit map leads to an interpretable instruction-following model.
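The explicit local-to-world transformation mentioned above can be illustrated with a plain pinhole projection onto a ground plane; the sketch below is a minimal stand-alone example with invented intrinsics and camera pose, not GSMN's actual parameters.

```python
# Minimal sketch of projecting an image-plane feature onto a world-frame ground
# plane with a pinhole model and a known camera pose.  Intrinsics and pose are
# illustrative placeholders.
import numpy as np

K = np.array([[200.0, 0.0, 160.0],      # pinhole intrinsics
              [0.0, 200.0, 120.0],
              [0.0,   0.0,   1.0]])
R_wc = np.array([[1.0, 0.0, 0.0],        # camera-to-world rotation
                 [0.0, 0.0, 1.0],
                 [0.0, -1.0, 0.0]])
t_wc = np.array([0.0, 0.0, 2.0])         # camera 2 m above the ground plane z = 0

def pixel_to_ground(u, v):
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing ray, camera frame
    ray_world = R_wc @ ray_cam                            # viewing ray, world frame
    s = -t_wc[2] / ray_world[2]                           # intersect plane z = 0
    return t_wc + s * ray_world

print(pixel_to_ground(160, 200))  # world (x, y, 0) cell to write the feature into
```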
arXiv (Cornell University), Nov 14, 2020
We study the problem of learning a robot policy to follow natural language instructions that can be easily extended to reason about new objects. We introduce a few-shot language-conditioned object grounding method trained from augmented reality data that uses exemplars to identify objects and align them to their mentions in instructions. We present a learned map representation that encodes object locations and their instructed use, and construct it from our few-shot grounding output. We integrate this mapping approach into an instruction-following policy, thereby allowing it to reason about previously unseen objects at test time by simply adding exemplars. We evaluate on the task of learning to map raw observations and instructions to continuous control of a physical quadcopter. Our approach significantly outperforms the prior state of the art in the presence of new objects, even when the prior approach observes all objects during training.
Robot navigation in human spaces today largely relies on the construction of precise geometric maps and a global motion plan. In this work, we navigate with only local sensing by using available signage, as designed for humans, in human-made environments such as airports. We propose a formalization of "signage" and define four levels of signage that we call complete, fully-specified, consistent, and valid. The signage formalization can be used on many space skeletonizations, but we specifically provide an approach for navigation on the medial axis. We prove that we can achieve global completeness guarantees without requiring a global map to plan. We validate with two sets of experiments: (1) with real-world airports and their real signs and (2) with real New York City neighborhoods. In (1), we show that we can use real-world airport signage to improve on a simple random-walk approach, and we explore augmenting signage to further study signs' impact on trajectory length. In (2), we navigate in variously sized subsets of New York City to show that, since we only use local sensing, our approach scales linearly with trajectory length rather than with free-space area.
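As a toy illustration of sign-following with only local sensing, the sketch below walks a skeleton graph (a stand-in for the medial axis), taking at each junction whichever edge the locally visible sign indicates for the destination. The graph and signs are invented and are not the paper's formalization.

```python
# Toy sketch of local, map-free sign following on a skeleton graph: at each
# junction the agent reads the locally posted signs and takes the edge a sign
# indicates for its destination.  Graph and signage are invented examples.
graph = {                      # junction -> neighbouring junctions
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C", "gate7"],
    "gate7": ["D"],
}
signs = {                      # junction -> {destination: edge to take}
    "A": {"gate7": "B"}, "B": {"gate7": "D"}, "D": {"gate7": "gate7"},
}

def follow_signs(start, goal, max_steps=20):
    path, node = [start], start
    for _ in range(max_steps):
        if node == goal:
            return path
        # Only local information: signs visible at the current junction.
        node = signs.get(node, {}).get(goal, graph[node][0])
        path.append(node)
    return path

print(follow_signs("A", "gate7"))  # ['A', 'B', 'D', 'gate7']
```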
ACM transactions on human-robot interaction, Feb 8, 2022
Mobile robots struggle to integrate seamlessly in crowded environments such as pedestrian scenes, often disrupting human activity. One obstacle preventing their smooth integration is our limited understanding of how humans may perceive and react to robot motion. Motivated by recent studies highlighting the benefits of intent-expressive motion for robots operating close to humans, we describe Social Momentum (SM), a planning framework for legible robot motion generation in multiagent domains. We investigate the properties of motion generated by SM via two large-scale user studies: an online, video-based study (N = 180) focusing on the legibility of motion produced by SM and a lab study (N = 105) focusing on the perceptions of users navigating next to a robot running SM in a crowded space. Through statistical and thematic analyses of collected data, we present evidence suggesting that (a) motion generated by SM enables quick inference of the robot's navigation strategy; (b) humans navigating close to a robot running SM follow comfortable, low-acceleration paths; and (c) robot motion generated by SM is positively perceived and indistinguishable from a teleoperated baseline. Through the discussion of experimental insights and lessons learned, this article aspires to inform future algorithmic and experimental design for social robot navigation.
Springer proceedings in advanced robotics, 2020
We present a novel planning framework for navigation in dynamic, multi-agent environments with no explicit communication among agents, such as pedestrian scenes. Inspired by the collaborative nature of human navigation, our approach treats the problem as a coordination game, in which players coordinate to avoid each other as they move towards their destinations. We explicitly encode the concept of coordination into the agents' decision-making process through a novel inference mechanism about future joint strategies of avoidance. We represent joint strategies as equivalence classes of topological trajectory patterns using the formalism of braids. This topological representation naturally generalizes to any number of agents and provides the advantage of adaptability to different environments, in contrast to the majority of existing approaches. At every round, the agents simultaneously decide on their next action, which contributes collision-free progress towards their destinations but also towards a global joint strategy that appears to be in compliance with all agents' preferences, as inferred from their past behaviors. This policy leads to a smooth and rapid decrease in uncertainty regarding the emerging joint strategy, which is promising for real-world scenarios. Simulation results highlight the importance of reasoning about joint strategies and demonstrate the efficacy of our approach.
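A minimal sketch of the inference step described above, under strong simplifying assumptions: the joint strategies are reduced to two labels standing in for braid equivalence classes, and a Bayesian update over them is driven by made-up observation likelihoods.

```python
# Minimal sketch: maintain a belief over candidate joint avoidance strategies
# (stand-ins for braid classes) and update it from observed motion, so that
# uncertainty about the emerging joint strategy shrinks each round.
strategies = ["pass_left", "pass_right"]
belief = {s: 0.5 for s in strategies}

def update(belief, likelihoods):
    """likelihoods: P(observed motion this round | joint strategy)."""
    post = {s: belief[s] * likelihoods[s] for s in belief}
    z = sum(post.values())
    return {s: p / z for s, p in post.items()}

# Two rounds in which both agents drift slightly to their left.
for obs in [{"pass_left": 0.7, "pass_right": 0.3},
            {"pass_left": 0.8, "pass_right": 0.2}]:
    belief = update(belief, obs)
    print(belief)   # belief concentrates on the jointly preferred strategy
```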
Springer eBooks, Jan 29, 2008
A smooth-primitive, constrained-optimization-based path-tracking algorithm for mobile robots that compensates for rough terrain, predictable vehicle dynamics, and vehicle mobility constraints has been developed, implemented, and tested on the DARPA LAGR platform. Traditional methods for the geometric path-following control problem involve trying to meet position constraints at fixed or velocity-dependent look-ahead distances using arcs. We have reformulated the problem as an optimal control problem, using a trajectory generator that can meet arbitrary boundary state constraints. The goal state along the target path is determined dynamically by minimizing a utility function based on corrective trajectory feasibility and cross-track error. A set of field tests compared the proposed method to an implementation of the pure pursuit algorithm and showed that the smooth corrective trajectory constrained optimization approach exhibited higher performance than pure pursuit, achieving roughly four times lower average cross-track error and two times lower heading error.
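As a hedged sketch of the goal-selection step (not the fielded LAGR implementation), the code below scores candidate goal states sampled along the target path with a simple utility combining a feasibility check and a corrective-effort/progress trade-off, and picks the minimizer; the path, weights, and feasibility test are placeholders.

```python
# Hedged sketch: pick a goal state along the target path by minimizing a simple
# utility over feasibility, corrective effort, and progress.
import math

robot = (0.0, 0.6)                               # current position, off the path
path = [(0.5 * i, 0.0) for i in range(1, 8)]     # sampled target path (along y = 0)

def feasible(goal, max_heading=1.0):
    # Stand-in check: reject goals that would require too sharp an initial turn.
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    return dx > 0 and abs(math.atan2(dy, dx)) < max_heading

def utility(goal, w_corr=1.0, w_prog=0.2):
    if not feasible(goal):
        return float("inf")
    corrective_effort = abs(goal[1] - robot[1])  # lateral correction needed (proxy)
    progress = goal[0] - robot[0]                # progress along the path
    return w_corr * corrective_effort - w_prog * progress

goal = min(path, key=utility)
print(goal)   # goal state handed to the trajectory generator
```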
We present a novel, data-driven framework for planning socially competent robot behaviors in crowded environments. The core of our approach is a topological model of collective navigation behaviors, based on braid groups. This model constitutes the basis for the design of a human-inspired probabilistic inference mechanism that predicts the topology of multiple agents' future trajectories, given observations of the context. We derive an approximation of this mechanism by employing a neural network learning architecture on synthetic data of collective navigation behaviors. Our planner makes use of this mechanism as a tool for interpreting the context and understanding what future behaviors are in compliance with it. The planning agent uses this understanding to determine a personal action that contributes to the context in the clearest way possible, while ensuring progress towards its destination. Our simulations provide evidence that our planning framework results in socially competent navigation behaviors not only for the planning agent, but also for interacting naive agents. Performance benefits include (1) early conflict resolutions and (2) faster uncertainty decrease for the other agents in the scene.
The goal of motion planning is to find a feasible path that connects two positions and is free from collision with obstacles. Path sets are a robust approach to this problem in the face of real-world complexity and uncertainty. A path set is a collection of feasible paths and their corresponding control sequences. A path-set-based planner navigates by repeatedly testing each of these robot-fixed paths for collision with obstacles. A heuristic function selects which of the surviving paths to follow next. At each step, the robot follows a small piece of the path selected while simultaneously planning the subsequent trajectory. A path set possesses high path diversity if it performs well at obstacle-avoidance and goal-seeking behaviors. Previous work in path diversity has tacitly assumed that a correlation exists between this dynamic planning problem and a simpler, static path diversity problem: a robot placed randomly into an obstacle field evaluates its path set for collision a single time before following the chosen path in its entirety. Although these problems might intuitively appear to be linked, this paper shows that static and dynamic path diversity are two distinct properties. After empirically demonstrating this fact, we discuss some of the factors that differentiate the two problems.
We present heuristic algorithms for pruning large sets of candidate paths or trajectories down to smaller subsets that maintain desirable characteristics in terms of overall reachability and path length. Consider the example of a set of candidate paths in an environment that is the result of a forward search tree built over a set of actions or behaviors. The tree is precomputed and stored in memory to be used online to compute collision-free paths from the root of the tree to a particular goal node. In general, such a set of paths may be quite large, growing exponentially in the depth of the search tree. In practice, however, many of these paths may be close together and could be pruned without a loss to the overall problem of path-finding. The best such pruning for a given resulting tree size is the one that maximizes path diversity, which is quantified as the probability of the survival of paths, averaged over all possible obstacle environments. We formalize this notion and provide formulas for computing it exactly. We also present experimental results for two approximate algorithms for path set reduction that are efficient and yield desirable properties in terms of overall path diversity. The exact formulas and approximate algorithms generalize to the computation and maximization of spatio-temporal diversity for trajectories.
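The survival-probability notion of path diversity can be illustrated with a Monte Carlo estimate and a greedy pruning loop; the sketch below uses toy straight-line paths and random point obstacles, not the paper's exact formulas or approximate algorithms.

```python
# Illustrative sketch: estimate, by Monte Carlo over random obstacle fields, the
# probability that at least one path in a subset survives, and greedily keep the
# paths that raise that probability most.  Paths and obstacles are toy stand-ins.
import random

random.seed(0)
paths = [[(0.1 * t, b * 0.1 * t) for t in range(11)] for b in (-2, -1, 0, 1, 2)]

def survives(path, obstacles, radius=0.15):
    return all(min((px - ox) ** 2 + (py - oy) ** 2 for ox, oy in obstacles) > radius ** 2
               for px, py in path)

def diversity(subset, trials=500):
    hits = 0
    for _ in range(trials):
        obstacles = [(random.uniform(0, 1), random.uniform(-1, 1)) for _ in range(3)]
        hits += any(survives(p, obstacles) for p in subset)
    return hits / trials

# Greedy pruning: grow a small subset that keeps survival probability high.
chosen = []
for _ in range(2):
    best = max((p for p in paths if p not in chosen), key=lambda p: diversity(chosen + [p]))
    chosen.append(best)
print(diversity(chosen), diversity(paths))   # pruned subset vs. full set
```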
This paper presents a solution to the problem of finding an effective yet admissible heuristic function for A* by precomputing a look-up table of solutions. This is necessary because traditional heuristic functions such as Euclidean distance often produce poor performance for certain problems. In this case, the technique is applied to the state lattice, which is used for full state space motion planning. However, the approach is applicable to many applications of heuristic search algorithms. The look-up table is demonstrated to be feasible to generate and store. A principled technique is presented for selecting which queries belong in the table. Finally, the results are validated through testing on a variety of path planning problems.
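A minimal sketch of the look-up-table heuristic idea, assuming the table stores exact optimal costs for selected relative queries (the entries below are invented): answer a heuristic query from the table when possible, otherwise fall back to Euclidean distance, and take the maximum of the two to stay admissible.

```python
# Minimal sketch: precomputed exact costs for selected relative queries, with a
# Euclidean fallback.  Table contents are invented for illustration.
import math

# Offline: exact cost-to-go for selected (dx, dy, heading-change) queries,
# e.g. computed by exhaustive search on the state lattice.
heuristic_table = {
    (1, 0, 0): 1.0,
    (1, 1, 90): 2.4,     # a turn makes this query much costlier than straight-line distance
    (0, 1, 180): 3.7,
}

def heuristic(dx, dy, dtheta):
    euclid = math.hypot(dx, dy)
    exact = heuristic_table.get((dx, dy, dtheta))
    # The stored value is an exact optimal cost, so taking the larger of the two
    # estimates stays admissible while being far more informative near turns.
    return max(euclid, exact) if exact is not None else euclid

print(heuristic(1, 1, 90))   # 2.4, instead of the overly optimistic 1.41
```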
Herb 2.0: Lessons Learned From Developing a Mobile Manipulator for the Home
Proceedings of the IEEE, Aug 1, 2012
arXiv (Cornell University), Sep 2, 2020
Robots can be used to collect environmental data in regions that are difficult for humans to traverse. However, limitations remain in the size of region that a robot can directly observe per unit time. We introduce a method for selecting a limited number of observation points in a large region, from which we can predict the state of unobserved points in the region. We combine a low-rank model of a target attribute with an information-maximizing path planner to predict the state of the attribute throughout a region. Our approach is agnostic to the choice of target attribute and robot monitoring platform. We evaluate our method in simulation on two real-world environment datasets, each containing observations from one to two million possible sampling locations. We compare against a random sampler and four variations of a baseline sampler from the ecology literature. Our method outperforms the baselines in terms of average Fisher information gain per sample taken and performs comparably for average reconstruction error in most trials.
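To illustrate the low-rank idea (separately from the information-maximizing planner), the sketch below reconstructs a synthetic rank-one field from a sparse set of sampled cells via iterative truncated-SVD imputation, a generic matrix-completion routine rather than the paper's estimator.

```python
# Hedged sketch: reconstruct an entire field from a handful of sampled cells by
# iterative truncated-SVD imputation (a simple matrix-completion routine).
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.random(30), rng.random(40)
field = np.outer(u, v)                      # ground-truth attribute, rank 1

mask = rng.random(field.shape) < 0.15       # ~15% of cells actually observed
est = np.where(mask, field, field[mask].mean())
for _ in range(50):
    U, s, Vt = np.linalg.svd(est, full_matrices=False)
    est = (U[:, :1] * s[:1]) @ Vt[:1]       # keep only the leading component
    est[mask] = field[mask]                 # re-impose the observed samples

print(np.abs(est - field).mean())           # small reconstruction error
```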
arXiv (Cornell University), Jun 16, 2019
Despite recent major advances in robotics research, massive injections of capital into robotics startups, and significant market appetite for robotic solutions, large-scale real-world deployments of robotic systems remain relatively scarce outside of heavy industry and (recently) warehouse logistics. In this paper, we posit that this scarcity comes from the difficulty of building even merely functional, first-pass robotic applications without a dizzying breadth and depth of expertise, in contrast to the relative ease with which non-experts in cloud computing can build complex distributed applications that function reasonably well. We trace this difficulty in application building to the paucity of good systems research in robotics, and lay out a path toward enabling application building by centering usability in systems research in two different ways: privileging the usability of the abstractions defined in systems research, and ensuring that the research itself is usable by application developers in the context of evaluating it for its applicability to their target domain by following principles of realism, empiricism, and exhaustive explication. In addition, we make some suggestions for community-level changes, incentives, and initiatives to create a better environment for systems work in robotics.
arXiv (Cornell University), Jun 8, 2017
This paper presents a machine learning approach to map outputs from an embedded array of sensors distributed throughout a deformable body to continuous and discrete virtual states, and its application to interpret human touch in soft interfaces. We integrate stretchable capacitors into a rubber membrane, and use a passive addressing scheme to probe sensor arrays in real-time. To process the signals from this array, we feed capacitor measurements into convolutional neural networks that classify and localize touch events on the interface. We implement this concept with a device called OrbTouch. To modularize the system, we use a supervised learning approach wherein a user defines a set of touch inputs and trains the interface by giving it examples; we demonstrate this by using OrbTouch to play the popular game Tetris. Our regression model localizes touches with mean test error of 0.09 mm, while our classifier recognizes gestures with a mean test accuracy of 98.8%. In a separate demonstration, we show that OrbTouch can discriminate between different users with a mean test accuracy of 97.6%. At test time, we feed the outputs of these models into a debouncing algorithm to provide a nearly error-free experience.
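As an illustrative stand-in for the OrbTouch networks (architecture, grid size, and class count are assumptions, not the published model), a small convolutional network with a classification head for gestures and a regression head for touch location might look like this:

```python
# Illustrative sketch (not the OrbTouch code): a small CNN mapping a frame of
# capacitance readings from a sensor grid to a gesture class and a touch location.
import torch
import torch.nn as nn

class TouchNet(nn.Module):
    def __init__(self, num_gestures=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.gesture_head = nn.Linear(32, num_gestures)   # gesture classification
        self.location_head = nn.Linear(32, 2)             # (x, y) touch regression

    def forward(self, x):
        h = self.features(x)
        return self.gesture_head(h), self.location_head(h)

net = TouchNet()
frame = torch.randn(1, 1, 8, 8)           # one 8x8 capacitance frame (assumed size)
gesture_logits, xy = net(frame)
print(gesture_logits.shape, xy.shape)     # torch.Size([1, 4]) torch.Size([1, 2])
```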
Conference on Robot Learning, Oct 21, 2019
We propose a joint simulation and real-world learning framework for mapping navigation instructions and raw first-person observations to continuous control. Our model estimates the need for environment exploration, predicts the likelihood of visiting environment positions during execution, and controls the agent to both explore and visit high-likelihood positions. We introduce Supervised Reinforcement Asynchronous Learning (SuReAL). Learning uses both simulation and real environments without requiring autonomous flight in the physical environment during training, and combines supervised learning for predicting positions to visit and reinforcement learning for continuous control. We evaluate our approach on a natural language instruction-following task with a physical quadcopter, and demonstrate effective execution and exploration behavior.
Conference on Robot Learning, Oct 23, 2018
We propose an approach for mapping natural language instructions and raw observations to continuous control of a quadcopter drone. Our model predicts interpretable position-visitation distributions indicating where the agent should go during execution and where it should stop, and uses the predicted distributions to select the actions to execute. This two-step model decomposition allows for simple and efficient training using a combination of supervised learning and imitation learning. We evaluate our approach with a realistic drone simulator, and demonstrate absolute task-completion accuracy improvements of 16.85% over two state-of-the-art instruction-following methods.
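A toy sketch of acting on predicted position-visitation distributions: step to the neighboring cell with the highest predicted visit probability and stop once the predicted stop probability at the current cell crosses a threshold. The grids and threshold below are made up; the actual model predicts these distributions from instructions and observations.

```python
# Toy sketch: greedy action selection from position-visitation and stop
# distributions over a small grid.  All values are invented placeholders.
import numpy as np

visit = np.array([[0.05, 0.10, 0.15],
                  [0.05, 0.20, 0.25],
                  [0.00, 0.05, 0.35]])     # P(position is visited during execution)
stop = np.zeros_like(visit)
stop[2, 2] = 0.9                           # P(execution stops here)

pos, trace = (0, 0), [(0, 0)]
while stop[pos] < 0.5 and len(trace) < 10:
    r, c = pos
    neighbours = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                  if 0 <= r + dr < 3 and 0 <= c + dc < 3]
    pos = max(neighbours, key=lambda p: visit[p])   # move toward high visit probability
    trace.append(pos)
print(trace)   # greedy path ending where the stop distribution peaks
```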