Zhuokai Zhao -- Personal Webpage

Biography

News & Updates

Publications (* indicates co-first authorship, † means joint last author)

2026

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

Yuhang Zhou, Lizhu Zhang, Yifan Wu, Mingyi Wang, Bo Peng, Jiayi Liu, Xiangjun Fan, and Zhuokai Zhao

In submission, 2026

Self-Policy Distillation via Capability-Selective Subspace Projection

Guangya Hao, Yitong Shang, Yunbo Long, Zhuokai Zhao†, and Hanxue Liang†

In submission, 2026

Self-Evolving Multi-Agent Systems via Decentralized Memory

Guangya Hao, Yunbo Long, and Zhuokai Zhao

In submission, 2026

LILO: Bayesian Optimization with Interactive Natural Language Feedback

43rd International Conference on Machine Learning (ICML), 2026

Token-Level LLM Collaboration via FusionRoute

43rd International Conference on Machine Learning (ICML), 2026

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu, Yinglong Xia, Qiang Zhang, Qifan Wang, Ren Chen, Dongqi Fu, Jiayi Liu, Zhuokai Zhao, Xiangjun Fan, Benyu Zhang, and Yixin Chen

43rd International Conference on Machine Learning (ICML), 2026

TARo: Token-level Adaptive Routing for LLM Test-time Alignment

Arushi Rai, Qiang Zhang, Hanqing Zeng, Yunkai Zhang, Dipesh Tamboli, Xiangjun Fan, Zhuokai Zhao†, and Lizhu Zhang†

Findings of The 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026

Mixture-of-Minds: Multi-Agent Reinforcement Learning for Table Understanding

Yuhang Zhou, Mingrui Zhang, Ke Li, Mingyi Wang, Qiao Liu, Qifei Wang, Jiayi Liu, Fei Liu, Serena Li, Weiwei Li, Mingze Gao, Abhishek Kumar, Xiangjun Fan, Zhuokai Zhao†, and Lizhu Zhang†

The 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026

Synthetic Sandbox for Training Machine Learning Engineering Agents

Yuhang Zhou*, Lizhu Zhang*, Yifan Wu, Jiayi Liu, Xiangjun Fan, Zhuokai Zhao†, and Hong Yan†

In submission, 2026

CircuitProbe: Tracing Visual Temporal Evidence Flow in Video Language Models

Yiming Zhang*, Zhuokai Zhao*, Chengzhang Yu, Kun Wang, Zhendong Chu, Qiankun Li, Zihan Chen, Yang Liu, Zenghui Ding, Yining Sun, and Qingsong Wen

In submission, 2026

Accelerating PDE Surrogates via RL-Guided Mesh Optimization

The 29th International Conference on Artificial Intelligence and Statistics (AISTATS), 2026

Scaling Agent Learning via Experience Synthesis

Zhaorun Chen, Zhuokai Zhao, Kai Zhang, Bo Liu, Qi Qi, Yifan Wu, Tarun Kalluri, Sara Cao, Yuanhao Xiong, Haibo Tong, Huaxiu Yao, Hengduo Li, Jiacheng Zhu, Xian Li, Dawn Song, Bo Li, Jason Weston†, and Dat Huynh†

The Fourteenth International Conference on Learning Representations (ICLR), 2026

2025

Exploring System 1 and 2 Communication for Latent Reasoning in LLMs

Julian Coda-Forno, Zhuokai Zhao, Qiang Zhang, Dipesh Tamboli, Weiwei Li, Xiangjun Fan, Lizhu Zhang, Eric Schulz, and Hsiao-Ping Tseng

39th NeurIPS Workshop on Foundations of Reasoning in Language Models, 2025

Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning

In submission, 2025

Thought Communication in Multiagent Collaboration

Advances in Neural Information Processing Systems (NeurIPS) Spotlight, 2025

S'MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning

Hanqing Zeng, Yinglong Xia, Zhuokai Zhao, Gilbert Jiang, Qiang Zhang, Jiayi Liu, Lizhu Zhang, Xiangjun Fan, and Benyu Zhang

Advances in Neural Information Processing Systems (NeurIPS), 2025

MJ-BENCH: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Zhaorun Chen*, Yichao Du*, Zichen Wen*, Yiyang Zhou*, Chenhang Cui, Zhenzhen Weng, Haoqin Tu, Chaoqi Wang, Zhengwei Tong, Qinglan Huang, Canyu Chen, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, and Huaxiu Yao

Advances in Neural Information Processing Systems (NeurIPS), 2025

Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding

Yixiong Fang, Ziran Yang, Zhaorun Chen, Zhuokai Zhao†, and Jiawei Zhou†

Advances in Neural Information Processing Systems (NeurIPS), 2025

StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding

In submission, 2025

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

Yuhang Zhou*, Jing Zhu*, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, and Furong Huang

Empirical Methods in Natural Language Processing (EMNLP), 2025

Boosting LLM Reasoning via Spontaneous Self-Correction

Xutong Zhao, Tengyu Xu, Xuewei Wang, Zhengxing Chen, Di Jin, Liang Tan, Yen-Ting Lin, Zishun Yu, Zhuokai Zhao, Yun He, Sinong Wang, Han Fang, Sarath Chandar, and Chen Zhu

Conference on Language Modeling (COLM), 2025

RankCLIP: Ranking-Consistent Language-Image Pretraining

Yiming Zhang*, Zhuokai Zhao*, Zhaorun Chen, Zhili Feng, Zenghui Ding, and Yining Sun

International Conference on Computer Vision (ICCV), 2025

Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding

Yiming Zhang, Zhuokai Zhao, Zhaorun Chen, Zenghui Ding, Xianjun Yang, and Yining Sun

International Conference on Computer Vision (ICCV), 2025

Transfer between Modalities with MetaQueries

Xichen Pan, Satya Narayan Shukla, Aashu Singh, Zhuokai Zhao, Shlok Kumar Mishra, Jialiang Wang, Zhiyang Xu, Jiuhai Chen, Kunpeng Li, Felix Juefei-Xu, Ji Hou, and Saining Xie

In submission, 2025

CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning

Hao Yu, Zhuokai Zhao, Shen Yan, Lukasz Korycki, Jianyu Wang, Baosheng He, Jiayi Liu, Lizhu Zhang, Xiangjun Fan, and Hanchao Yu

ICCV Findings, 2025

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Yuhong Zhang*, Guanlin Wu*, Ling-Hao Chen, Zhuokai Zhao, Jing Lin, Xiaoke Jiang, Jiamin Wu, Zhuoheng Li, Hao Frank Yang, Haoqian Wang, and Lei Zhang

The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025

Autonomous Multimodal Reasoning via Implicit Chain-of-Vision

Yiqiao Huang*, He Qi*, Zhaorun Chen, Haopeng Zhang, Hanchao Yu, and Zhuokai Zhao

CVPR Workshop on Multimodal Algorithmic Reasoning Oral Presentation, 2025

Quantifying Generalization Complexity for Large Language Models

International Conference on Learning Representations (ICLR), 2025

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Chaoqi Wang*, Zhuokai Zhao*, Yibo Jiang*, Zhaorun Chen*, Chen Zhu, Yuxin Chen, Jiayi Liu, Lizhu Zhang, Hao Ma, and Sinong Wang

In submission, 2025

2024

Direct Acquisition Optimization for Low-Budget Active Learning

Zhuokai Zhao, Yibo Jiang, and Yuxin Chen

38th NeurIPS Workshop on Bayesian Decision-making and Uncertainty Spotlight Talk, 2024

Evaluating Machine Learning Models with NERO: Non-Equivariance Revealed on Orbits

38th NeurIPS Workshop on Interpretable AI, 2024

EscIRL: Evolving Self-Contrastive IRL for Trajectory Prediction in Autonomous Driving

Siyue Wang*, Zhaorun Chen*, Zhuokai Zhao, Chaoli Mao, Yiyang Zhou, Jiayu He, and Albert Sibo Hu

8th Annual Conference on Robot Learning (CoRL), 2024

Preference Optimization with Multi-Sample Comparisons

In submission, 2024

Multimodal Guidance Network for Missing-Modality Inference in Content Moderation

Zhuokai Zhao, Harish Palani, Tianyi Liu, Lena Evans, and Ruth Toner

IEEE International Conference on Multimedia and Expo (ICME), 2024

PANDORA: Detailed LLM Jailbreaking via Collaborated Phishing Agents with Decomposed Reasoning

12th ICLR Workshop on Secure and Trustworthy Large Language Models, 2024

Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding

41st International Conference on Machine Learning (ICML), 2024. Preliminary version appeared in 12th ICLR Workshop on Reliable and Responsible Foundation Models, 2024.

AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition

Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024. Preliminary version appeared in ICLR Workshop on Reliable and Responsible Foundation Models, 2024.

2023

RELAX: Reinforcement Learning Enabled 2D-LiDAR Autonomous System for Parsimonious UAVs

Guanlin Wu, Zhuokai Zhao, and Yutao He

39th AAAI Workshop on Planning and Reinforcement Learning (PRL), 2023

Breaking the Curse of Quality Saturation with User-Centric Ranking

29th Conference on Knowledge Discovery and Data Mining (KDD), 2023

2020

System and Method for Assisted Patient Positioning

U.S. Patent No. 10,783,655, 2020

Early Feasibility Studies of Augmented Reality Navigation for Lateral Skull Base Surgery

Otology & Neurotology 41(7):883-888, 2020

Dissertations

Enhanced Data Utilization for Efficient and Trustworthy Deep Learning

Zhuokai Zhao

Ph.D. in Computer Science, 2024

Utilizing Both Past and Future: Multi-Frame Memory Based Network in Solving Particle Image Velocimetry

Zhuokai Zhao

MS in Computer Science, 2021

Other Projects

OpenChemistry/Stempy: Stable Version

Zenodo, 2024

Trajectory Planning and Control for Nonholonomic Robot Among Obstacles

Nonlinear Control and Planning in Robotics, 2018

Head-Mounted Display Integration for Orthopedic Surgery

Advanced Computer-Integrated Surgery, 2017

Service

Conference Area Chair

Conference Reviewer

Journal Reviewer

Last Updated: June 11, 2026