Haozhe Zhao
Hi, I am currently a graduate student in the Institute of Computational Linguistics at Peking University, advised by Prof. Baobao Chang. Before that, I earned my bachelor’s degree from
Tianjin University. I have had the privilege of collaborating with Prof. Minjia Zhang at the University of Illinois Urbana-Champaign and Prof. Junjie Hu at the University of Wisconsin–Madison. I have also engaged in research projects supported by
Microsoft Research Asia, gaining exposure to both academic and industrial AI research settings.
I am fortunate to have collaborated with researchers at PKU, UIUC, Wisc, THU, and Microsoft Research Asia.
More information is available in the 🔗 Resume.
My research interests revolve around Foundation Language Models, Multimodal Learning, and Data-centric AI:
- Data-centric Approaches: Constructing and curating synthetic and structured data to enhance foundation model capabilities while reducing training costs.
- Multimodal Integration: Improving vision-language models through multimodal instruction tuning, alignment strategies, and reducing hallucinations.
My research interests include neural machine translation and computer vision. I have published 8 papers at top international AI conferences.
🔥 News
- 2024.09: 🎉🎉 Our paper, “UltraEdit: Instruction-based Fine-Grained Image Editing at Scale”, has been accepted by NeurIPS 2024!
📝 Publications
ICLR 2024
MMICL: Empowering vision-language model with multi-modal in-context learning
Haozhe Zhao, Zefan Cai, Shuzheng Si, Xiaojian Ma, Kaikai An, Liang Chen, Zixuan Liu, Sheng Wang, Wenjuan Han, Baobao Chang
- The Twelfth International Conference on Learning Representations (ICLR), 2024.
NeurIPS 2024
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
Haozhe Zhao, Xiaojian Ma, Liang Chen, Shuzheng Si, Rujie Wu, Kaikai An, Peiyu Yu, Minjia Zhang, Qing Li, Baobao Chang
- The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS), 2024.
preprint
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Liang Chen*, Zekun Wang*, Shuhuai Ren*, Lei Li*, Haozhe Zhao*, Yunshui Li*, Zefan Cai, Hongcheng Guo, Lei Zhang, Yizhe Xiong, Yichi Zhang, Ruoyu Wu, Qingxiu Dong, Ge Zhang, Jian Yang, Lingwei Meng, Shujie Hu, Yulong Chen, Junyang Lin, Shuai Bai, Andreas Vlachos, Xu Tan, Minjia Zhang, Wen Xiao, Aaron Yee, Tianyu Liu, Baobao Chang
- We investigate the emerging trend in multimodal intelligence: the paradigm of next-token prediction.
📖 Education
- 2022.09 - 2025.06 (Expected), M.S. in Software Engineering, Peking University.
- 2018.09 - 2022.06, Bachelor's degree in Software Engineering, Tianjin University.