Embodied AI Workshop

Overview

Minds live in bodies, and bodies move through a changing world. The goal of embodied artificial intelligence is to create agents, such as robots, which learn to creatively solve challenging tasks requiring interaction with the environment. While this is a tall order, fantastic advances in deep learning, the explosive growth of large language models, and the increasing availability of large datasets like ImageNet have enabled superhuman performance on a variety of AI tasks previously thought intractable. Computer vision, speech recognition and natural language processing have experienced transformative revolutions at passive input-output tasks like language translation and image processing, and reinforcement learning has similarly achieved world-class performance at interactive tasks like games. These advances have supercharged embodied AI, enabling a growing collection of researchers to make rapid progress towards intelligent agents which can:

See: perceive their environment through vision or other senses.
Talk: hold a natural language dialog grounded in their environment.
Listen: understand and react to audio input anywhere in a scene.
Act: navigate and interact with their environment to accomplish goals.
Reason: consider and plan for the long-term consequences of their actions.

The goal of the Embodied AI workshop is to bring together researchers from computer vision, language, graphics, and robotics to share and discuss the latest advances in embodied intelligent agents. EAI 2026’s overarching theme is World Models for Embodied AI: embodied AI agents that create models of the world to help them imagine and act, or to help researchers test and evaluate them. This umbrella theme is divided into three topics:

World Models for Action and Evaluation: Explores both dynamics models which incorporate physics and geometry, and video models where dynamics are implicit.
The Resurgence of Classic Methods: Examines new applications of techniques such as reinforcement learning and model-predictive control to embodied AI.
Long-Horizon Embodied Intelligence: Explores benchmarks and methods for multi-step tasks, robust testing, and, in particular, safe operation.

For more information on the Embodied AI Workshop series, see our Retrospectives paper on the first three years of the workshop. For the latest updates, follow the Embodied AI Medium blog at medium.com/embodied-artificial-intelligence.

Attending

The Embodied AI 2026 workshop was held in conjunction with CVPR 2026 in Denver, Colorado. It featured a host of invited talks covering a variety of topics in Embodied AI, many exciting Embodied AI challenges, a poster session, and panel discussions. The Embodied AI workshop was held in-person with remote options on June 4th from 8:45 AM to 5:30 PM MDT:

In-Person: Workshop talks and panels were held in room 107 from 8:45-noon and 1:30-5:30 MDT.
Remote: Zoom info for remote CVPR attendees can be found on our official CVPR workshop page.
Questions: A microphone will be available in the room; questions (in-person or remote) can also be asked via Slack.
Posters: Posters were held in Exhibit Hall A from 12:00 PM to 1:30 PM MDT at boards 262 - 276. Oral presentations were held in room 107 from 4:30-5:00 PM MDT.
Printing: Information on poster printing was available on CVPR's website.

For late-breaking updates from CVPR, see the workshop's CVPR page.

Timeline

Workshop Announced: February 2nd, 2026
Paper Submission Deadline: May 15th, 2026
Paper Notification Deadline: ~~April 24th~~ May 27th, 2026
Challenge Submission Deadlines: May-June, 2026. Check each challenge for the specific date.
Camera Ready Copy Deadline: June 1st, 2026
Seventh Annual Embodied AI Workshop at CVPR: June 4th, 2026
Challenge Winners Announced: At the workshop. Check each challenge for specifics.

Workshop Schedule

Embodied AI will be a hybrid workshop, with both in-person talks and streaming via Zoom.

Workshop Talks: 8:45AM-5:30PM MDT - Room 107
Poster Session: 12:00PM-1:30PM MDT - Exhibit Hall A Boards 262-276
Virtual Sessions: Workshop page available to registered CVPR attendees.

Note: an earlier version of the website said CDT, but the timezone is MDT, the same as the rest of CVPR.
Zoom information for CVPR attendees will be posted on our official CVPR workshop page when it becomes available.
Remote and in-person attendees are welcome to ask questions via Slack:

Workshop Introduction: Embodied AI
8:45 - 9:00 AM MDT
Location: Room 107
Anthony Francis
Logical Robotics
Challenge Presentations - Winning Methods
9:00 - 10:00 AM MDT
Location: Room 107
Moderator - David Hall
CSIRO
Challenge Q&A
10:00 - 10:30 AM MDT
Location: Room 107
Invited Talk - Siyuan Huang, BIGAI
Title: Understanding the 3D World for General Agents
10:30 - 11:00 AM MDT
Location: Room 107
Bio: Siyuan Huang is a Research Scientist at the Beijing Institute for General Artificial Intelligence (BIGAI), directing the Center of Embodied AI and Robotics. He received his Ph.D. from the Department of Statistics at the University of California, Los Angeles (UCLA). His research aims to build a general robot capable of understanding and interacting with 3D environments like humans. His research has received multiple awards including the best paper award of CoRL2025 and several workshop best papers.
Abstract: While current world models exhibit impressive predictive capabilities, their reliance on 2D image sequences masks a critical lack of genuine geometric, spatial, and physical understanding. For general embodied agents to interact reliably wi...
Invited Talk - Stefan Leutenegger, ETH Zurich
Title: Spatial AI and Robot Learning for the Real World
11:00 - 11:30 AM MDT
Location: Room 107
Bio: Prof. Dr. Stefan Leutenegger is an Associate Professor in the Department of Mechanical and Process Engineering of ETH Zurich.
Invited Talk - Lewis Chiang, Google DeepMind
Title: Why Are Robot Agents So Hard?
11:30 AM - 12:00 PM MDT
Location: Room 107
Bio: Lewis Chiang is a Research Scientist at Google DeepMind, where he works on Gemini Robotics. His research focuses on developing real-time robot agents. Prior to joining Google DeepMind, Lewis worked at Waymo, where he worked on motion prediction and planning.
Lunch / Accepted Papers Poster Session
12:00 PM - 1:30 PM MDT
Location: Exhibit Hall A, Boards 262 - 276
Invited Talk - Ruiqi Gao, Google DeepMind
Title: World Models for Embodied AI
1:30 - 2:00 PM MDT
Location: Room 107
Bio: I am a Research Scientist at Google DeepMind. I am mainly interested in generative models and representation learning. My recent research focus is to construct powerful generative AI models that can comprehend, generate, and reason with multi-modal data, including natural language, images, videos and 3D. I obtained my Ph.D. from UCLA advised by Song-Chun Zhu and Ying Nian Wu. Prior to that, I received my B.S. degree in Statistics from Peking University.
Invited Talk - Tapomayukh Bhattacharjee, Cornell University
Title: Embodied Intelligence for Physical Contact with Humans: Towards Safe Caregiving Robots in the Real World
2:00 - 2:30 PM MDT
Location: Room 107
Bio: Tapomayukh "Tapo" Bhattacharjee is an Assistant Professor in the Department of Computer Science at Cornell University where he directs the EmPRISE Lab (https://emprise.cs.cornell.edu/). He completed his Ph.D. in Robotics from Georgia Institute of Technology and was an NIH Ruth L. Kirschstein NRSA postdoctoral research associate in Computer Science & Engineering at the University of Washington. His primary research interests are in the area of physical robot caregiving and physical human-robot interaction. He is the recipient of TRI Young Faculty Researcher Award'24, NSF CAREER Award'23, AFCEA 40 under 40 Award'22, and his work has won Best Systems Paper Award at HRI’26, Best Paper Award at RSS’25, Best Paper and Student Paper Award Finalist and Best HRI Paper Award Finalist at ICRA’25, Best Systems Paper Award Finalist at HRI'24, Best Demo Award at HRI'24, Best RoboCup Paper Award at IROS’22, Best Paper Award Finalist and ABB Best Student Paper Award Finalist at IROS’22, Best Technical Advances Paper Award at HRI'19, and Best Demonstration Award at NeurIPS’18. His work has also been featured in many media outlets including the BBC, Reuters, New York Times, IEEE Spectrum, and GeekWire and his robot-assisted feeding work was selected to be one of the best interactive designs of 2019 by Fast Company.
Abstract: Physical contact with humans remains one of the most important and underexplored challenges in embodied AI. To operate safely and effectively in real-world environments shared with humans, robots must reason about and adapt to the diverse b...
Invited Talk - Yilun Du, Harvard
Title: World Models for Robot Manipulation and Planning
2:30 - 3:00 PM MDT
Location: Room 107
Bio: I am an Assistant Professor at Harvard in the Kempner Institute and CS, where I run the Embodied Minds lab. I received my PhD at MIT EECS, advised by Prof. Leslie Kaelbling, Prof. Tomas Lozano-Perez and Prof. Joshua B. Tenenbaum. Previously, I also obtained my bachelor's degree from MIT, was a research fellow at OpenAI, and a senior research scientist at Google DeepMind. My research focuses on generative models, decision making, robot learning, embodied agents, and the applications of such tools to scientific domains.
Abstract: I'll talk about a couple methods in which world models can be useful for robotics applications. First, I'll talk about how they can be used as policies or imaginations depicting what to do in future steps. I'll talk about how they can be us...
Invited Talk - Wayne Wu, UCLA
Title: From Scaling up to Scaling out: Reality World Simulators for Physical AI
3:00 - 3:30 PM MDT
Location: Room 107
Bio: I am an AI Researcher in the Department of Computer Science at the University of California, Los Angeles (UCLA), working with Bolei Zhou, and collaborating with Trevor Darrell (UC Berkeley EECS) and Jiaqi Ma (UCLA CEE). I was a Visiting PhD at Nanyang Technological University, working with Chen Change Loy. I received my Ph.D. from the Department of Computer Science and Technology at Tsinghua University.
Abstract: Recent progress in large language and vision models demonstrates how far we can go by scaling with vast internet-scale data. In contrast, physical AI, agents that perceive and act in the real world, still lags far behind. Today, both academ...
Industry Talk - Sarah Parisot, Microsoft Research Cambridge
Title: Building World Models for Creative Use
3:30 - 4:00 PM MDT
Location: Room 107
Bio: I am a Principal Researcher in the Game Intelligence team which develops novel machine learning technology with applications to video games and beyond. My research interests and experience include parameter efficient learning, computer vision and generative AI. My recent work has focused on text-to-image generative models, with an emphasis on controllability and interactivity. Prior to joining Microsoft, I was a Senior Research Scientist and Team Leader at Huawei Noah’s Ark Lab in London.
Abstract: World models offer a path toward interactive, co‑creative systems that support iteration, exploration, and sustained creative control. To be useful to creators, such models must balance expressiveness with practical constraints such as data efficienc...
Invited Talk - Dinesh Jayaraman, UPenn GRASP Lab
Title: Coding Agent-Driven Robot Learning
4:00 - 4:30 PM MDT
Location: Room 107
Bio: I am an associate professor at UPenn’s GRASP lab, with a primary appointment in CIS, and a secondary appointment in ESE. I lead the Perception, Action, and Learning (PennPAL) Research Group, where we work on problems at the intersection of robotics, machine learning, and computer vision.
Accepted Paper Highlights
4:30 - 5:00 PM MDT
Location: Room 107
Debate - Long-Horizon Safety in Embodied AI
5:00 - 5:30 PM MDT
Location: Room 107
Moderator - Anthony Francis
Logical Robotics
Workshop Concludes
5:30 PM MDT
Location: Room 107

Challenges

The Embodied AI 2026 workshop is hosting many exciting challenges covering a wide range of topics. More details regarding data, submission instructions, and timelines can be found on the individual challenge websites.

The workshop organizers will award each first-prize challenge winner a cash prize, sponsored by Logical Robotics and our other sponsors.

Challenge winners may be given the opportunity to present during their challenge's presentation at the workshop. Since many challenges can be grouped into similar tasks, we encourage participants to submit models to more than one challenge. The table below describes, compares, and links each challenge.

Call for Papers

We invite high-quality 2-page extended abstracts on embodied AI, especially in areas relevant to the themes of this year's workshop:

Embodied AI Solutions
World Models for Action and Evaluation
Classical Methods for Embodied AI
Long-Horizon Embodied Intelligence

as well as themes related to embodied AI in general:

Visual Navigation
Embodied Mobile Manipulation
Embodied Question Answering
Embodied AI Foundation Models
Embodied Vision & Language
Language Model Planning
Advances in Simulation for Embodied AI

Accepted papers will be presented as posters or spotlight talks at the workshop. These papers will be made publicly available in a non-archival format, allowing future submission to archival journals or conferences. Paper submissions do not have to be anonymized. Per CVPR rules regarding workshop papers, at least one author must register for CVPR using an in-person registration.

Submission

The submission deadline is May 15th, 2026 (Anywhere on Earth - for clarity, 00:01 in GMT as computed by OpenReview). Papers should be no longer than 2 pages (excluding references) and styled in the CVPR format.

Paper submissions will close May 15th, 2026.
Notifications will be sent May 27th, 2026.
Camera-ready copies of accepted papers will be due June 1st, 2026.

Accepted Papers


Organizers

The Embodied AI 2026 workshop is a joint effort by many researchers from a variety of organizations. Each year, a set of lead organizers takes point coordinating with the CVPR conference, backed up by a team of workshop organizers, challenge organizers, and scientific advisors.

Lead Organizers

Anthony Francis
Logical Robotics

Organizing Committee

Challenge Organizers

Xiaodan Liang
SYSU, MBZUAI

Yu Sun
SYSU, X Square Robot

Scientific Advisory Board

Claudia Pérez D’Arpino
NVIDIA

Roberto Martín-Martín
Stanford