Linli Yao
I'm a PhD student at the Language Computing and Machine Learning Group (Lanco), MOE Key Laboratory of Computational Linguistics, School of Computer Science, Peking University. I am supervised by Prof. Xu Sun.
I received my Master's and Bachelor's degrees from Renmin University of China (RUC) in 2023 and 2020, respectively, where I was advised by Prof. Qin Jin, who directs the AI·M3 Lab.
Research Interests
- Multi-modal Understanding and Generation
- Vision and Language
- Large Multi-modal Models
Publications (Full List)
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
(* indicates equal contribution)
Preprint, arXiv:2504.17343, 2025.
Generative Frame Sampler for Long Video Understanding
Linli Yao, Haoning Wu, Kun Ouyang, Yuanxing Zhang, Caiming Xiong, Bei Chen, Xu Sun, Junnan Li
Preprint, arXiv:2503.09146, 2025.
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Linli Yao, Lei Li, Shuhuai Ren, Lean Wang, Yuanxin Liu, Xu Sun, Lu Hou
Preprint, arXiv:2405.20985, 2024.
Temporal Reasoning Transfer from Text to Video
Lei Li*, Yuanxin Liu*, Linli Yao, Peiyuan Zhang, Chenxin An, Lean Wang, Xu Sun, Lingpeng Kong, Qi Liu
ICLR 2025.
Edit As You Wish: Video Caption Editing with Multi-grained User Control
Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Xu Sun, Qin Jin
ACM MM 2024.
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Shuhuai Ren*, Linli Yao*, Shicheng Li, Xu Sun, Lu Hou
(* indicates equal contribution)
CVPR 2024.
UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos
Yuting Mei, Linli Yao, Qin Jin
ICMR 2024.
LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation?
Yuchi Wang, Shuhuai Ren, Rundong Gao, Linli Yao, Qingyan Guo, Kaikai An, Jianhong Bai, Xu Sun
NAACL 2024.
CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge
Linli Yao, Weijing Chen, Qin Jin
The Web Conference (WWW) 2023.
Rethinking Benchmarks for Cross-modal Image-text Retrieval
Weijing Chen, Linli Yao, Qin Jin
SIGIR 2023, long paper.
Image Difference Captioning with Pre-training and Contrastive Learning
Linli Yao, Weiying Wang, Qin Jin
AAAI 2022.
Education
Period | Degree | Institution |
---|---|---|
2023.09 - Present | PhD Student | School of Computer Science, Peking University |
2020.09 - 2023.06 | Master | School of Information, Renmin University of China |
2016.09 - 2020.06 | Bachelor | School of Information, Renmin University of China |
Experience
2022.10 - 2023.07
Research Intern
Alimama CV&NLP Group @ Alibaba, advised by Tiezheng Ge.
2022.04 - 2022.10
Organizer / Workshop Chair
Person in Context (PIC) Workshop @ ACM MM 2022
The MTVG and MDVC tasks attracted 40 teams worldwide, including academic teams from Tsinghua University, Peking University, and the University of Hong Kong, as well as industry teams from Tencent, JD.com, Xiaomi, and Bilibili.
Awards
Year | Award | Issuer / Region |
---|---|---|
2022 | National Scholarship | Ministry of Education of China |
2023 & 2020 | Outstanding Graduate | Renmin University of China |
2022 & 2021 | 1st Class Grade Scholarship | Renmin University of China |
2021 & 2018 | Merit Student | Renmin University of China |
2019 | 1st Prize of China Undergraduate Mathematical Contest in Modeling (Beijing) | Beijing |
2018 | Meritorious Winner of American Mathematical Contest in Modeling | U.S. |
Academic Service
- Reviewer: AAAI 2023/2024, CVPR 2024/2025, NeurIPS 2024, ACM MM 2024/2025, IEEE Transactions on Image Processing.
- Teaching assistant: Spoken Language Processing (RUC, 2020), Multimedia Application Technology (RUC, 2020), Academic Criterion and Writing (RUC, 2022), Human Language and Artificial Intelligence (PKU, 2024).