Humanoid Foundation
Scalable, Portable, and Holistic Humanoid Data Collection System
Yanjie Ze, Siheng Zhao, Weizhuo Wang
Angjoo Kanazawa†, Rocky Duan†, Pieter Abbeel†, Guanya Shi†, Jiajun Wu†, C. Karen Liu†
arXiv 2025 | Website | arXiv | Code | Twitter | YouTube | Bilibili
TWIST2 is a scalable, portable, and holistic whole-body humanoid data collection system. It enables efficient egocentric teleoperation and large-scale loco-manipulation data gathering across diverse environments and subjects, supporting downstream visuomotor policy learning.
With TWIST2, we can collect 50 demonstrations in 15 minutes.
Visual Humanoid Loco-Manipulation via Motion Tracking and Generation
Shaofeng Yin*, Yanjie Ze*, Hong-Xing Yu, C. Karen Liu†, Jiajun Wu†
arXiv 2025 | Website | Code | Twitter
VisualMimic enables visuomotor skills that generalize across time and space, allowing humanoid robots to perform complex loco-manipulation tasks by learning from visual demonstrations. The system demonstrates robustness across different lighting conditions (morning, dusk, evening, midnight) and diverse environments (Hoover Tower, Engineering Building, Robotics Center, Memorial Church).
VisualMimic enables humanoid robots to learn complex loco-manipulation behaviors with whole-body dexterity.
The framework combines motion tracking and generation capabilities to transfer human-like behaviors to humanoid robots, enabling them to perform tasks that require both locomotion and manipulation skills using hands, feet, shoulders, and other body parts. VisualMimic achieves this through a novel interface design for high-level/low-level communication and teacher-student distillation for training low-level trackers.
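The text above names teacher-student distillation but does not describe the training recipe, so the following is only a toy sketch of the general idea, not VisualMimic's implementation. It assumes a 1-D system in which a hypothetical teacher (a PD controller) sees privileged velocity, while the student sees only the current and previous position and is fit to the teacher's actions by least squares. All names, gains, and the time step are illustrative assumptions.

```python
import random

DT = 0.1  # hypothetical control time step (s)

def teacher_action(x, v, kp=2.0, kd=0.5):
    """Teacher policy: uses privileged state (position x AND velocity v)."""
    return -kp * x - kd * v

def collect(n, seed=0):
    """Sample states, label them with teacher actions, and keep only
    what the student is allowed to observe: (x_now, x_prev)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x_prev = rng.uniform(-1.0, 1.0)
        v = rng.uniform(-1.0, 1.0)
        x_now = x_prev + v * DT
        data.append(((x_now, x_prev), teacher_action(x_now, v)))
    return data

def fit_student(data):
    """Fit a linear student a = w0 * x_now + w1 * x_prev to the teacher
    labels by solving the 2x2 normal equations with Cramer's rule."""
    s00 = s01 = s11 = b0 = b1 = 0.0
    for (x_now, x_prev), a in data:
        s00 += x_now * x_now
        s01 += x_now * x_prev
        s11 += x_prev * x_prev
        b0 += x_now * a
        b1 += x_prev * a
    det = s00 * s11 - s01 * s01
    w0 = (b0 * s11 - b1 * s01) / det
    w1 = (s00 * b1 - s01 * b0) / det
    return w0, w1

def student_action(weights, x_now, x_prev):
    """Student policy: no access to velocity, only a short position history."""
    return weights[0] * x_now + weights[1] * x_prev

weights = fit_student(collect(200))
```

In this noise-free toy the student can recover velocity from its two position observations, so the fitted weights approach -(kp + kd/DT) and kd/DT. A real pipeline would typically use neural network policies and roll out the student in simulation, which this sketch does not attempt.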
General Motion Retargeting
Yanjie Ze, João Pedro Araújo, Jiajun Wu, and C. Karen Liu
Open-Source Software | Code | Twitter
Human motion serves as a unified form factor that provides scalable data for controlling diverse robotic systems, eliminating the need for extensive robot-specific motion collection. General Motion Retargeting (GMR) retargets human motions to diverse humanoid robots via real-time multi-objective inverse kinematics, jointly solving for rotation and position constraints to preserve rich spatial information from humans. Built upon mink, GMR enables real-time inference and is utilized in TWIST for real-time whole-body teleoperation.
GMR transfers human motions to diverse humanoid robots in real time.
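To illustrate what jointly solving position and rotation constraints with inverse kinematics can look like, here is a minimal sketch on a planar 3-link chain. It is not GMR's code and does not use the mink API. The link lengths, task weights, damping value, and function names are all hypothetical. Each step solves one weighted, damped least-squares problem that stacks the position error and the heading error.

```python
import math

LINKS = (0.4, 0.3, 0.2)  # hypothetical link lengths (m)

def forward(q):
    """End-effector (x, y, heading) of a planar 3-link chain."""
    x = y = angle = 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y, angle

def jacobian(q):
    """3x3 Jacobian; rows are d(x)/dq, d(y)/dq, d(heading)/dq."""
    angles, a = [], 0.0
    for qi in q:
        a += qi
        angles.append(a)
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):  # joint j moves every link from j onward
        J[0][j] = -sum(LINKS[k] * math.sin(angles[k]) for k in range(j, 3))
        J[1][j] = sum(LINKS[k] * math.cos(angles[k]) for k in range(j, 3))
        J[2][j] = 1.0
    return J

def solve3(A, b):
    """Solve a 3x3 system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= f * M[c][k]
    x = [0.0] * 3
    for r in reversed(range(3)):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

def ik_step(q, target, weights=(1.0, 1.0, 0.5), damping=1e-4):
    """One step: solve (J^T W J + damping * I) dq = J^T W err."""
    x, y, th = forward(q)
    dth = target[2] - th
    err = [target[0] - x, target[1] - y,
           math.atan2(math.sin(dth), math.cos(dth))]  # wrap to (-pi, pi]
    J = jacobian(q)
    A = [[sum(weights[t] * J[t][i] * J[t][j] for t in range(3))
          + (damping if i == j else 0.0) for j in range(3)] for i in range(3)]
    b = [sum(weights[t] * J[t][i] * err[t] for t in range(3)) for i in range(3)]
    return [qi + d for qi, d in zip(q, solve3(A, b))]

def solve_ik(q, target, iters=50):
    """Iterate ik_step from an initial guess q."""
    for _ in range(iters):
        q = ik_step(q, target)
    return q

target = forward([0.3, 0.5, -0.2])     # a pose known to be reachable
q_sol = solve_ik([0.0, 0.8, 0.0], target)
```

The weights let one objective dominate when the targets conflict, which matters when a human pose is not exactly reachable by the robot. The damping term keeps steps bounded near singular configurations. A whole-body retargeter would stack many such tasks across links in 3-D and add joint limits, which this sketch leaves out.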
Teleoperated Whole-Body Imitation System
Yanjie Ze*, Zixuan Chen*, João Pedro Araújo*, Zi-ang Cao, Xue Bin Peng, Jiajun Wu†, C. Karen Liu†
CoRL 2025 | Paper | Code | Website | Twitter | TechXplore
We want humanoids to have the same level of whole-body dexterity as humans. Imagine a messy kitchen: a person can hold things with both hands while using their feet to move obstacles, such as a basket on the ground, or open a door with the side of their body or an elbow. We want humanoids to achieve the same by imitating humans directly.
Unprecedented human-like loco-manipulation abilities on a real humanoid robot.
TWIST, the Teleoperated Whole-Body Imitation System, uses data captured by MoCap devices to precisely track human body movements. Compared with many earlier teleoperation systems, TWIST uses joints across the humanoid robot's entire body to closely replicate human movements while keeping the motions of different limbs coordinated.