Xian Liu's Homepage

Biography

I am a Research Scientist at NVIDIA Research, Deep Imagination Research Group. I am also a final-year Ph.D. student at CUHK Multi-Media Lab (MMLab), supervised by Prof. Dahua Lin, Prof. Ziwei Liu, and Prof. Xihui Liu. Before that, I received my Bachelor's degree from Zhejiang University in 2021, advised by Prof. Xiaowei Zhou.

I am fortunate to have gained extensive industrial experience during my Ph.D. studies, with internships at several leading research institutes, including NVIDIA Research, Snap Research, Tencent AI Lab, SenseTime Research, and Shanghai AI Lab.

My research interests include computer vision and generative modeling, especially foundation GenAI pre-training and post-training, vision-language models, multi-modal tokenizers, and their applications in digital humans and physical AI.

I am always open to discussions and collaborations. Feel free to drop me an email if you are interested :)

News

Industrial Research

Selected Publications [ Full List ] (* indicates equal contribution)

HMAR: Efficient Hierarchical Masked AutoRegressive Image Generation Hermann Kumbong*, Xian Liu*, Tsung-Yi Lin, Xihui Liu, Ziwei Liu, Daniel Y. Fu, Ming-Yu Liu, Christopher Ré, David W. Romero. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. CVPR 2025 / Paper (Coming Soon)
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation Kaiyue Sun, Kaiyi Huang, Xian Liu, Yue Wu, Zihan Xu, Zhenguo Li, Xihui Liu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. CVPR 2025 / HuggingFace Leaderboard / Project / Code / arXiv
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation Xiao Fu, Xian Liu, Xintao Wang, Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview / Project / Code / arXiv / Dataset
High-Quality Joint Image and Video Tokenization with Causal VAE Dawit Mureja Argaw, Xian Liu, Qinsheng Zhang, Joon Son Chung, Ming-Yu Liu, Fitsum Reda. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview / Code / arXiv
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, Qinsheng Zhang. International Conference on Learning Representations (ICLR), 2025. ICLR 2025 / OpenReview / Project / Code / arXiv / Mesh Tokenizer
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. (Highlight, Top 2.8%) CVPR 2024 / Project / Paper / arXiv / Demo Video / Code / Models
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov. International Conference on Learning Representations (ICLR), 2024. (Review Score 6, 6, 8, 10, Top 1.6%, Rank) ICLR 2024 / OpenReview / Project / Paper / arXiv / Short Demo (3min) / Long Demo (10min) / Github
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou. European Conference on Computer Vision (ECCV), 2022. (Oral, Top 2.7%) ECCV 2022 / Paper / Project / Code
Audio-Driven Co-Speech Gesture Video Generation Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu. Advances in Neural Information Processing Systems (NeurIPS), 2022. (Spotlight, Top 5%) NeurIPS 2022 / Paper / Project / Code
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Also appears at CVPR 2022 Sight and Sound Workshop. [5-min Invited Talk] (link) CVPR 2022 / Paper / Poster / Project / Code
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation Xian Liu*, Lingting Zhu*, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. CVPR 2023 / Paper / Code
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing Xian Liu*, Rui Qian*, Hang Zhou*, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou. AAAI Conference on Artificial Intelligence (AAAI), 2022. AAAI 2022 / Paper / Poster
TC4D: Trajectory-Conditioned Text-to-4D Generation Sherwin Bahmani*, Xian Liu*, Yifan Wang*, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B. Lindell. European Conference on Computer Vision (ECCV), 2024. ECCV 2024 / Project / arXiv / Github
Object-Compositional Neural Implicit Surfaces Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng. European Conference on Computer Vision (ECCV), 2022. ECCV 2022 / Paper / Project / Code
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, Qiang Xu. European Conference on Computer Vision (ECCV), 2024. ECCV 2024 / Project / arXiv / Demo Video / Dataset / Github / HuggingFace Demo
TextCraftor: Your Text Encoder Can be Image Quality Controller Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. CVPR 2024 / Project / Code / Paper / arXiv

Experiences

Research Scientist. Jun. 2024 - Now NVIDIA Research, Deep Imagination Research Group. Manager: Ming-Yu Liu.
Generative AI Research Intern, Deep Imagination Research, NVIDIA Research. Mar. 2024 - Jun. 2024 Topic: Image/Video Foundation Models, Tokenizers, Multi-Modal Language Models. Supervised by: Ting-Chun Wang, Yogesh Balaji, Ming-Yu Liu.
Research Visiting Student, Toronto Computational Imaging Group. Dec. 2023 - Mar. 2024 Topic: Text-to-4D Generation. Hosted by: Sherwin Bahmani, David B. Lindell.
Research Intern, Tencent AI Laboratory. Sept. 2023 - Dec. 2023 Topic: Text-Driven 3D Human Generation. Supervised by: Xiaohang Zhan, Ying Shan.
Research Intern, Creative Vision Group, Snap Research. May 2023 - Sept. 2023 Topic: Human Generation Foundation Model. Supervised by: Jian Ren, Aliaksandr Siarohin, Sergey Tulyakov.
Research Intern, Digital Content Group, Shanghai AI Laboratory. Jul. 2021 - Feb. 2022 Topic: Digital Human, Gesture Generation. Supervised by: Hang Zhou, Wayne Wu.
Research Intern, Intelligent Video Group, SenseTime Research. Aug. 2020 - Jun. 2021 Topic: Digital Human, Face Animation. Supervised by: Qianyi Wu, Bo Dai.

Invited Talks

Professional Services

Selected Honors & Awards

Teaching Experience