Popular repositories
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
  Jupyter Notebook, 144 stars, 7 forks
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
  Java, 75 stars, 6 forks
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
  Python, 66 stars, 3 forks
- [ACL 2025] GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
  Python, 21 stars, 1 fork
- [CVPR 2025] Official Implementation for Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
  17 stars, 2 forks
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
  13 stars, 1 fork