An advanced reinforcement learning control method for quadruped robots in typical urban terrains
Abstract
Quadruped robots, with their exceptional flexibility and stable structure, are well suited to traversing the complex unstructured terrains found in urban environments. However, the flexibility and stability of current reinforcement-learning-based quadruped controllers remain unsatisfactory on such terrains. To address this limitation, an end-to-end teacher-student learning framework built on large-scale parallel simulation is proposed, in which a Gated Recurrent Unit produces a latent estimate of the terrain heights surrounding the robot. In addition, an omnidirectional terrain learning curriculum enables the robot to move in any commanded direction while producing smooth motor joint-angle outputs and tracking. Using a state machine, the model trained in simulation is deployed on a Unitree Go1 robot via zero-shot transfer. Simulation and real-world experiments demonstrate that this approach significantly enhances the robot's adaptability and mobility across urban terrains such as gravel, grass, slopes, and steps.
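To make the estimator idea in the abstract concrete, the following is a minimal NumPy sketch of a GRU that maps a stream of proprioceptive observations to an estimate of the heights around the robot's base. It is an illustration only: the class name, the 45-dimensional observation, the 187 height samples, and the single linear readout are assumptions chosen for the example, not details taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUHeightEstimator:
    """Illustrative GRU that turns a proprioceptive observation history
    into a latent state, then decodes surrounding terrain heights."""

    def __init__(self, obs_dim, hidden_dim, n_height_points, seed=0):
        rng = np.random.default_rng(seed)
        w = lambda *shape: rng.normal(0.0, 0.1, shape)
        # Update (z), reset (r), and candidate (h) gate parameters
        self.Wz, self.Uz, self.bz = w(obs_dim, hidden_dim), w(hidden_dim, hidden_dim), np.zeros(hidden_dim)
        self.Wr, self.Ur, self.br = w(obs_dim, hidden_dim), w(hidden_dim, hidden_dim), np.zeros(hidden_dim)
        self.Wh, self.Uh, self.bh = w(obs_dim, hidden_dim), w(hidden_dim, hidden_dim), np.zeros(hidden_dim)
        # Linear head: hidden state -> height samples around the base
        self.Wo, self.bo = w(hidden_dim, n_height_points), np.zeros(n_height_points)
        self.h = np.zeros(hidden_dim)

    def step(self, obs):
        z = sigmoid(obs @ self.Wz + self.h @ self.Uz + self.bz)
        r = sigmoid(obs @ self.Wr + self.h @ self.Ur + self.br)
        h_cand = np.tanh(obs @ self.Wh + (r * self.h) @ self.Uh + self.bh)
        self.h = (1.0 - z) * self.h + z * h_cand   # standard GRU state update
        return self.h @ self.Wo + self.bo           # decoded height estimate
```

In a teacher-student setup of this kind, the teacher policy typically observes privileged terrain heights directly in simulation, while the student must recover an equivalent latent from proprioception alone; a recurrent estimator like the one above is one common way to structure that student-side inference.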
Data availability
Partial code is publicly available at https://github.com/dstx123/unitree_rl. Demonstration videos showcasing the results can be accessed at https://dstx123.github.io/RL-control/. These resources are freely available for further research purposes.
Change history
30 January 2025
The original online version of this article was revised due to change in order of affiliation.
18 February 2025
A Correction to this paper has been published: https://doi.org/10.1007/s13042-025-02564-6
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant No. U2013601), in part by the Anhui Province Natural Science Funds for Distinguished Young Scholars (Grant No. 2308085J02), and in part by the Innovation Leading Talent of Anhui Province (TeZhi) plan.
Author information
Author notes
- Chi Yan and Ning Wang: These authors contributed equally to this work.
Authors and Affiliations
- School of Information Science and Technology, University of Science and Technology of China, No. 96 Jinzhai Road, Hefei, 230026, Anhui, China
  Chi Yan, Hongbo Gao, Xinmiao Wang, Chao Tang & Lin Zhou
- School of Information and Security, Chongqing College of Mobile Communication, No. 36 Dengying Avenue, Qijiang District, Chongqing, 401520, China
  Ning Wang
- Institute of Advanced Technology, University of Science and Technology of China, No. 5089 Wangjiang West Road, Hefei, 230088, Anhui, China
  Hongbo Gao
- School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
  Hongbo Gao
- Zhejiang Lab, Kechuang Avenue, Zhongtai Sub-District, Hangzhou, 311121, Zhejiang, China
  Yuehua Li
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
  Yue Wang
Authors
- Chi Yan
- Ning Wang
- Hongbo Gao
- Xinmiao Wang
- Chao Tang
- Lin Zhou
- Yuehua Li
- Yue Wang
Corresponding author
Correspondence to Hongbo Gao.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, C., Wang, N., Gao, H. et al. An advanced reinforcement learning control method for quadruped robots in typical urban terrains. Int. J. Mach. Learn. & Cyber. 16, 3747–3757 (2025). https://doi.org/10.1007/s13042-024-02478-9
- Received: 09 April 2024
- Accepted: 21 November 2024
- Published: 03 December 2024
- Version of record: 03 December 2024
- Issue date: June 2025
- DOI: https://doi.org/10.1007/s13042-024-02478-9