An Adaptable Approach to Learn Realistic Legged Locomotion without Examples

Learning from demonstration and adaptation of biped locomotion

2004

In this paper, we introduce a framework for learning biped locomotion using dynamical movement primitives based on non-linear oscillators. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a central pattern generator (CPG) of a biped robot, an approach we have previously proposed for learning and encoding complex human movements.
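The abstract does not spell out the equations, so the following is only a minimal sketch of a rhythmic dynamical movement primitive in its commonly used form: a phase oscillator drives a learned periodic forcing term, which in turn shapes a spring-damper system. The gains `ALPHA` and `BETA`, the kernel count and width, and the sine wave used as the "demonstration" are illustrative assumptions, not values from the paper.

```python
import numpy as np

ALPHA, BETA = 25.0, 6.25  # assumed spring-damper gains (critically damped)

def basis(phi, centers, width):
    """Periodic (von Mises-shaped) kernels; rows are samples, columns are kernels."""
    phi = np.atleast_1d(phi)
    return np.exp(width * (np.cos(phi[:, None] - centers[None, :]) - 1.0))

def fit_weights(phi, f_target, centers, width):
    """One weight per kernel: the kernel-weighted average of the target forcing term."""
    psi = basis(phi, centers, width)
    return (psi * f_target[:, None]).sum(axis=0) / psi.sum(axis=0)

def forcing(phi, weights, centers, width):
    """Normalized weighted sum of kernels evaluated at phase phi."""
    psi = basis(phi, centers, width)
    return (psi @ weights) / psi.sum(axis=1)

def rollout(weights, centers, width, tau, y0, z0, goal, dt, steps):
    """Euler-integrate tau*dz = ALPHA*(BETA*(goal - y) - z) + f, tau*dy = z, tau*dphi = 1."""
    y, z, phi = y0, z0, 0.0
    traj = np.empty(steps)
    for k in range(steps):
        traj[k] = y
        f = forcing(phi, weights, centers, width)[0]
        dz = (ALPHA * (BETA * (goal - y) - z) + f) / tau
        dy = z / tau
        y, z, phi = y + dy * dt, z + dz * dt, phi + dt / tau
    return traj

def demo_tracking_error(period=1.0, dt=0.001, n_kernels=25, width=50.0):
    """Fit the primitive to one cycle of a sine wave and replay two cycles."""
    tau = period / (2.0 * np.pi)
    centers = np.linspace(0.0, 2.0 * np.pi, n_kernels, endpoint=False)
    phi = np.arange(0.0, period, dt) / tau
    y_demo = np.sin(phi)      # demonstrated joint trajectory (assumed)
    z_demo = np.cos(phi)      # tau * dy/dt of the demonstration
    tau_dz = -np.sin(phi)     # tau * dz/dt of the demonstration
    # Solve the transformation system for the forcing term that reproduces the demo.
    f_target = tau_dz - ALPHA * (BETA * (0.0 - y_demo) - z_demo)
    weights = fit_weights(phi, f_target, centers, width)
    steps = 2 * len(phi)
    traj = rollout(weights, centers, width, tau, 0.0, 1.0, 0.0, dt, steps)
    reference = np.sin(np.arange(steps) * dt / tau)
    return float(np.max(np.abs(traj - reference)))

if __name__ == "__main__":
    print("max tracking error:", demo_tracking_error())
```

In this form, changing `tau` at replay time rescales the gait frequency without refitting the weights, which is one reason oscillator-based primitives are attractive as pattern generators.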

Toward Fast Policy Search for Learning Legged Locomotion

Legged locomotion is one of the most versatile forms of mobility. However, despite the importance of legged locomotion and the large number of legged robotics studies, no biped or quadruped matches the agility and versatility of their biological counterparts to date. Approaches to designing controllers for legged locomotion systems are often based on either the assumption of perfectly known dynamics or mechanical designs that substantially reduce the dimensionality of the problem. The few existing approaches for learning controllers for legged systems either require exhaustive real-world data or they improve controllers only conservatively, leading to slow learning. We present a data-efficient approach to learning feedback controllers for legged locomotive systems, based on learned probabilistic forward models for generating walking policies. On a compass walker, we show that our approach allows for learning gait policies from very little data. Moreover, we analyze learned locomotion models of a biomechanically inspired biped. Our approach has the potential to scale to high-dimensional humanoid robots with little loss in efficiency.

Learning hybrid locomotion skills—Learn to exploit residual actions and modulate model-based gait control

Frontiers in Robotics and AI, 2023

This work has developed a hybrid framework that combines machine learning and control approaches for legged robots to achieve new capabilities of balancing against external perturbations. The framework embeds a kernel, a model-based, fully parametric, closed-loop analytical controller, as the gait pattern generator. On top of that, a neural network with symmetric partial data augmentation learns to automatically adjust the parameters of the gait kernel and to generate compensatory actions for all joints, thus significantly augmenting stability under unexpected perturbations. Seven neural network policies with different configurations were optimized to validate the effectiveness and the combined use of the modulation of the kernel parameters and the compensation for the arms and legs using residual actions. The results validated that modulating kernel parameters alongside the residual actions improves stability significantly. Furthermore, the performance of the proposed framework was evaluated across a set of challenging simulated scenarios and demonstrated considerable improvements compared to the baseline in recovering from large external forces (up to 118%). In addition, the robustness of the proposed framework to measurement noise and model inaccuracies was assessed through simulations, which demonstrated robustness in the presence of these uncertainties. Finally, the trained policies were validated across a set of unseen scenarios and showed generalization to dynamic walking.
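The two roles the abstract gives the learned policy, modulating the gait kernel's parameters and adding residual joint actions, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's controller: `gait_kernel`, `toy_policy`, the two-joint layout, and all numeric values are hypothetical stand-ins.

```python
import numpy as np

def gait_kernel(phase, params):
    """Hypothetical model-based gait generator: two joint targets in antiphase."""
    amplitude, offset = params
    return offset + amplitude * np.sin(phase + np.array([0.0, np.pi]))

def hybrid_action(phase, observation, policy, nominal_params):
    """Joint command = kernel output under modulated parameters + residual action."""
    delta_params, residual = policy(observation)
    reference = gait_kernel(phase, nominal_params + delta_params)
    return reference + residual

def zero_policy(observation):
    """Baseline: no modulation and no residual, so the kernel acts alone."""
    return np.zeros(2), np.zeros(2)

def toy_policy(observation):
    """Stand-in for a trained network; reacts to an assumed torso-tilt reading."""
    tilt = observation[0]
    delta_params = np.array([-0.1 * tilt, 0.0])      # shrink the step amplitude
    residual = np.array([0.2 * tilt, -0.2 * tilt])   # compensatory joint offsets
    return delta_params, residual

if __name__ == "__main__":
    nominal = np.array([0.3, 0.0])
    obs = np.array([0.5])
    print(hybrid_action(np.pi / 2, obs, zero_policy, nominal))
    print(hybrid_action(np.pi / 2, obs, toy_policy, nominal))
```

Keeping the kernel in the loop means that a policy outputting zeros reproduces the model-based gait exactly, so learning only has to supply corrections.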

A framework for learning biped locomotion with dynamical movement primitives

2004

This article summarizes our framework for learning biped locomotion using dynamical movement primitives based on nonlinear oscillators. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a central pattern generator (CPG) of a biped robot, an approach we have previously proposed for learning and encoding complex human movements.

Learning Hybrid Locomotion Skills -- Learn to Exploit Residual Dynamics and Modulate Model-based Gait Control

arXiv (Cornell University), 2020

This work combines machine learning and control approaches for legged robots and develops a hybrid framework to achieve new capabilities of balancing against external perturbations. The framework embeds a kernel which is a fully parametric closed-loop gait generator based on analytical control. On top of that, a neural network with symmetric partial data augmentation learns to automatically adjust the parameters for the gait kernel and to generate compensatory actions for all joints as the residual dynamics, thus significantly augmenting the stability under unexpected perturbations. The performance of the proposed framework was evaluated across a set of challenging simulated scenarios. The results showed considerable improvements compared to the baseline in recovering from large external forces. Moreover, the produced behaviours are more natural, human-like and robust against noisy sensing.

Autonomous learning of stable quadruped locomotion

2007

A fast gait is an essential component of any successful team in the RoboCup 4-legged league. However, quickly moving quadruped robots, including those with learned gaits, often move in a way that causes unsteady camera motions, which degrade the robot's visual capabilities. This paper presents an implementation of the policy gradient machine learning algorithm that searches for a parameterized walk while optimizing for both speed and stability.
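The abstract does not say which policy gradient variant is used, so the sketch below substitutes a simple finite-difference estimator: it perturbs the walk parameters in random sign directions, scores each perturbed walk, and steps along the estimated gradient. The `walk_score` function is a hypothetical stand-in for a timed robot trial, and its speed and wobble terms, the `stability_weight`, and all step sizes are assumptions made for illustration.

```python
import numpy as np

def walk_score(params, stability_weight=0.5):
    """Hypothetical trial score: a speed term minus a weighted camera-wobble penalty."""
    speed = -np.sum((params - 1.0) ** 2)   # toy model: fastest walk at params == 1
    wobble = 0.1 * np.sum(params ** 2)     # toy model: larger parameters shake more
    return speed - stability_weight * wobble

def policy_search(score_fn, params, iters=100, n_pairs=8, eps=0.05, step=0.05, seed=0):
    """Finite-difference policy search with random sign perturbations."""
    rng = np.random.default_rng(seed)
    params = np.asarray(params, dtype=float).copy()
    for _ in range(iters):
        grad = np.zeros_like(params)
        for _ in range(n_pairs):
            direction = rng.choice([-1.0, 1.0], size=params.shape)
            # Central difference along the sampled direction.
            diff = score_fn(params + eps * direction) - score_fn(params - eps * direction)
            grad += diff / (2.0 * eps) * direction
        grad /= n_pairs
        norm = np.linalg.norm(grad)
        if norm > 0.0:
            params += step * grad / norm   # fixed-length step along the estimate
    return params

if __name__ == "__main__":
    start = np.zeros(4)
    tuned = policy_search(walk_score, start)
    print("score before:", walk_score(start), "after:", walk_score(tuned))
```

Folding speed and stability into one scalar score is the simplest way to optimize both at once; the weight sets how much speed is traded for a steadier camera.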

Learning Memory-Based Control for Human-Scale Bipedal Locomotion

Robotics: Science and Systems XVI, 2020

Controlling a non-statically stable biped is a difficult problem largely due to the complex hybrid dynamics involved. Recent work has demonstrated the effectiveness of reinforcement learning (RL) for simulation-based training of neural network controllers that successfully transfer to real bipeds. The existing work, however, has primarily used simple memoryless network architectures, even though more sophisticated architectures, such as those including memory, often yield superior performance in other RL domains. In this work, we consider recurrent neural networks (RNNs) for sim-to-real biped locomotion, allowing for policies that learn to use internal memory to model important physical properties. We show that while RNNs are able to significantly outperform memoryless policies in simulation, they overfit to the simulation physics and do not exhibit superior behavior on the real biped unless trained using dynamics randomization; this leads to consistently better sim-to-real transfer. We also show that RNNs could use their learned memory states to perform online system identification by encoding parameters of the dynamics into memory.

Learning from Demonstration and Adaptation of Biped Locomotion with Dynamical Movement Primitives

Robotics and Autonomous Systems, 2003

In this paper, we report on our research for learning biped locomotion from human demonstration. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a CPG of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated

Imitation learning of humanoid locomotion using the direction of landing foot

International Journal of Control, Automation and Systems, 2009

Since it is quite difficult to create motions for humanoid robots having fairly large numbers of degrees of freedom, it would be very convenient indeed if robots could observe and imitate what they want to create. Toward this end, this paper discusses how humanoid robots learn through imitation, considering that demonstrator and imitator robots may have different kinematics and dynamics. As part of a wider interest in humanoid motion generation in general, this work mainly investigates how imitator robots adapt a reference locomotion gait captured from a demonstrator robot. Specifically, a self-adjusting adaptor is proposed, where the perceived locomotion pattern is modified to keep the direction of the lower leg contacting the ground identical between the demonstrator and the imitator, and to sustain dynamic stability by controlling the position of the center of mass. The validity of the proposed scheme is verified through simulations on OpenHRP and real experiments.