Xian Liu's Homepage

Biography

I am a Research Scientist at NVIDIA Research, Deep Imagination Research Group. I am also a final-year Ph.D. student at CUHK Multi-Media Lab (MMLab), supervised by Prof. Dahua Lin, Prof. Ziwei Liu, and Prof. Xihui Liu. Before that, I received my Bachelor's degree from Zhejiang University in 2021, advised by Prof. Xiaowei Zhou.

I was fortunate to gain extensive industrial experience during my Ph.D. studies through internships at several leading research institutes, including NVIDIA Research, Snap Research, Tencent AI Lab, SenseTime Research, and Shanghai AI Lab.

My research interests include computer vision and generative modeling, especially foundation GenAI pre-training and post-training, vision-language models, multi-modal tokenizers, and their applications in digital humans and physical AI.

I am always open to discussions and collaborations. Feel free to drop me an email if you are interested :)

News

[02/2025] Two papers are accepted to CVPR 2025.
[01/2025] Cosmos won the Best of CES, Best of AI, and Best Overall Awards in CNET 2025!
[01/2025] We release Cosmos, a world foundation model platform for Physical AI. Models open-sourced on Github and HF!
[01/2025] Four papers are accepted to ICLR 2025.
[12/2024] One paper is accepted to AAAI 2025.
[11/2024] We release Cosmos-Tokenizer, a suite of SOTA image/video tokenizers with models available on Github and HF!
[09/2024] Honored to receive ECCV 2024 Outstanding Reviewer Award. Great thanks for the recognition!
[07/2024] Two papers are accepted to ECCV 2024.
[05/2024] One paper is accepted to ICML 2024.
[03/2024] Start my internship at NVIDIA Research. See you in Santa Clara!
[03/2024] Two papers are accepted to CVPR 2024, with HumanGaussian accepted as Highlight (Top 2.8%). See you in Seattle!
[01/2024] One paper is accepted to ICLR 2024, with HyperHuman receiving review score of 6, 6, 8, 10 (Top 1.6%, Rank).
[01/2024] I will intern at GenAI Team @ Meta AI Research in 2024 Fall. See you in Menlo Park!
[11/2023] I will intern at Deep Imagination Research @ NVIDIA Research in 2024 Spring with Ming-Yu Liu. See you in Santa Clara!
[11/2023] A high-quality 3D human generation framework HumanGaussian is released, with all the code and models available!
[10/2023] A hyper-realistic human generation foundation model HyperHuman collaborated with Snap Research is on arXiv!
[07/2023] One paper is accepted to ICCV 2023.
[05/2023] Start my internship at Snap Research. See you in Los Angeles!
[03/2023] Two papers are accepted to CVPR 2023.
[03/2023] One paper is accepted to TMLR 2023.
[09/2022] One paper is accepted to NeurIPS 2022, with ANGIE accepted as Spotlight (Top 5%)!
[07/2022] Three papers are accepted to ECCV 2022, with SSP-NeRF accepted as Oral (Top 2.7%)!
[03/2022] One paper is accepted to CVPR 2022.
[12/2021] One paper is accepted to AAAI 2022.

Industrial Research

Selected Publications [ Full List ] (* indicates equal contribution)

	HMAR: Efficient Hierarchical Masked AutoRegressive Image Generation. Hermann Kumbong, Xian Liu, Tsung-Yi Lin, Xihui Liu, Ziwei Liu, Daniel Y. Fu, Ming-Yu Liu, Christopher Ré, David W. Romero. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. CVPR 2025 / Paper (Coming Soon)
	T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation. Kaiyue Sun, Kaiyi Huang, Xian Liu, Yue Wu, Zihan Xu, Zhenguo Li, Xihui Liu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. CVPR 2025 / HuggingFace Leaderboard / Project / Code / arXiv
	3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation. Xiao Fu, Xian Liu, Xintao Wang, Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview / Project / Code / arXiv / Dataset
	High-Quality Joint Image and Video Tokenization with Causal VAE. Dawit Mureja Argaw, Xian Liu, Qinsheng Zhang, Joon Son Chung, Ming-Yu Liu, Fitsum Reda. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview
	Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding. Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview / Code / arXiv
	EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation. Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, Qinsheng Zhang. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview / Project / Code / arXiv / Mesh Tokenizer
	HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting. Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 2.8%) CVPR 2024 / Project / Paper / arXiv / Demo Video / Code / Models
	HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion. Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov. International Conference on Learning Representations (ICLR), 2024. (Review Score 6, 6, 8, 10, Top 1.6%, Rank) ICLR 2024 / OpenReview / Project / Paper / arXiv / Short Demo (3min) / Long Demo (10min) / Github
	Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation. Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou. European Conference on Computer Vision (ECCV), 2022. (Oral, Top 2.7%) ECCV 2022 / Paper / Project / Code
	Audio-Driven Co-Speech Gesture Video Generation. Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu. Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight, Top 5%) NeurIPS 2022 / Paper / Project / Code
	Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation. Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Also appears at CVPR 2022 Sight and Sound Workshop. [5-min Invited Talk] (link) CVPR 2022 / Paper / Poster / Project / Code
	Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation. Xian Liu, Lingting Zhu, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. CVPR 2023 / Paper / Code
	Visual Sound Localization in the Wild by Cross-Modal Interference Erasing. Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou. AAAI Conference on Artificial Intelligence (AAAI), 2022. AAAI 2022 / Paper / Poster
	TC4D: Trajectory-Conditioned Text-to-4D Generation. Sherwin Bahmani, Xian Liu, Yifan Wang, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B. Lindell. European Conference on Computer Vision (ECCV), 2024. ECCV 2024 / Project / arXiv / Github
	Object-Compositional Neural Implicit Surfaces. Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng. European Conference on Computer Vision (ECCV), 2022. ECCV 2022 / Paper / Project / Code
	BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion. Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, Qiang Xu. European Conference on Computer Vision (ECCV), 2024. ECCV 2024 / Project / arXiv / Demo Video / Dataset / Github / HuggingFace Demo
	TextCraftor: Your Text Encoder Can be Image Quality Controller. Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. CVPR 2024 / Project / Code / Paper / arXiv

Experiences

	Research Scientist, Deep Imagination Research Group, NVIDIA Research. Jun. 2024 - Present. Manager: Ming-Yu Liu.
	Generative AI Research Intern, Deep Imagination Research, NVIDIA Research. Mar. 2024 - Jun. 2024. Topic: Image/Video Foundation Models, Tokenizers, Multi-Modal Language Models. Supervised by: Ting-Chun Wang, Yogesh Balaji, Ming-Yu Liu.
	Research Visiting Student, Toronto Computational Imaging Group. Dec. 2023 - Mar. 2024. Topic: Text-to-4D Generation. Hosted by: Sherwin Bahmani, David B. Lindell.
	Research Intern, Tencent AI Laboratory. Sept. 2023 - Dec. 2023. Topic: Text-Driven 3D Human Generation. Supervised by: Xiaohang Zhan, Ying Shan.
	Research Intern, Creative Vision Group, Snap Research. May 2023 - Sept. 2023. Topic: Human Generation Foundation Model. Supervised by: Jian Ren, Aliaksandr Siarohin, Sergey Tulyakov.
	Research Intern, Digital Content Group, Shanghai AI Laboratory. Jul. 2021 - Feb. 2022. Topic: Digital Human, Gesture Generation. Supervised by: Hang Zhou, Wayne Wu.
	Research Intern, Intelligent Video Group, SenseTime Research. Aug. 2020 - Jun. 2021. Topic: Digital Human, Face Animation. Supervised by: Qianyi Wu, Bo Dai.

Invited Talks

AI TIME: HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion.
2024
AI TIME: Audio-Driven Co-Speech Gesture Video Generation.
2023
TechBeat: Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation.
2022

Professional Services

Conference Program Committee / Reviewer: CVPR, ECCV, ICCV, SIGGRAPH, SIGGRAPH Asia, NeurIPS, ICML, ICLR, AISTATS, AAAI.
Journal Reviewer: TPAMI, IJCV, TVCG, EG, CGF, PG.

Selected Honors & Awards

CNET 2025 Best of CES, Best of AI, and Best Overall.
2025
ECCV Outstanding Reviewer Award.
2024
CVPR Travel Award.
2024
ICLR Travel Award.
2024
National Scholarship.
2019, 2020
Hong Kong Ph.D. Fellowship Scheme (HKPFS).
2021 - 2025
Outstanding Graduate of Zhejiang Province.
2021
Outstanding Bachelor Thesis Award of Zhejiang University, Top 1%.
2021
UCLA CSST Scholarship Program.
2020
SenseTime Scholarship.
2020
Tang Lixin Scholarship.
2019
First Class Scholarship for Academic Excellence.
2019, 2020

Teaching Experience

ENGG 1120, Linear Algebra for Engineers.
Spring 2022.
ENGG 2440, Discrete Mathematics for Engineers.
Fall 2021.