OpenGVLab/Ask-Anything: [CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

🦜 VideoChat Family: Ask-Anything

Open in OpenXLab | Open in Spaces [VideoChat-7B-8Bit] (End2End ChatBot for video and image) | Open in Spaces [InternVideo2-Chat-8B-HD]

Chinese README and Chinese discussion group | Paper

⭐️ We are also working on an updated version, so stay tuned!

πŸ”₯ Updates

πŸ”¨ Getting Started

Build video chat with:

🎬 [End2End ChatBot]

english.mp4
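As background: end-to-end video chatbots like this typically sample a small, fixed number of frames spread evenly across the clip before passing them to the model. The sketch below shows only that generic uniform-sampling step; the function name and details are illustrative assumptions, not this repo's actual API.

```python
def sample_frame_indices(num_total_frames: int, num_samples: int) -> list[int]:
    """Pick `num_samples` frame indices spread evenly across a clip.

    Illustrative sketch of uniform frame sampling; this is NOT the
    repo's actual preprocessing code.
    """
    if num_total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_total_frames:
        # Clip is shorter than the budget: just take every frame.
        return list(range(num_total_frames))
    # Take the midpoint of each of `num_samples` equal segments.
    step = num_total_frames / num_samples
    return [int(step * (i + 0.5)) for i in range(num_samples)]

print(sample_frame_indices(100, 4))  # → [12, 37, 62, 87]
```

Midpoint sampling avoids always picking frame 0 and keeps the selection stable regardless of clip length.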

πŸŽ₯ [Communication with ChatGPT]

intro.mp4
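In the ChatGPT-communication setup, vision models first describe sampled frames, and those timestamped captions are serialized into a text prompt that a chat LLM answers questions against. A minimal, assumed sketch of that serialization step (the function name and prompt wording are hypothetical, not taken from this repo):

```python
def build_video_prompt(captions: list[tuple[float, str]], question: str) -> str:
    """Serialize timestamped frame captions plus a user question into one
    text prompt for a chat LLM.

    Hypothetical sketch; the repo's actual prompt template may differ.
    """
    lines = ["The following are captions of frames sampled from a video:"]
    for t, caption in captions:
        # One bullet per caption, tagged with its timestamp in seconds.
        lines.append(f"- [{t:.1f}s] {caption}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

prompt = build_video_prompt(
    [(0.0, "a parrot on a branch"), (2.5, "the parrot flies away")],
    "What does the bird do?",
)
print(prompt)
```

The resulting string can be sent to any chat-completion API; keeping timestamps in the prompt lets the LLM reason about temporal order.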

πŸ“„ Citation

If you find this project useful in your research, please consider citing:

@article{2023videochat,
  title   = {VideoChat: Chat-Centric Video Understanding},
  author  = {KunChang Li and Yinan He and Yi Wang and Yizhuo Li and Wenhai Wang and Ping Luo and Yali Wang and Limin Wang and Yu Qiao},
  journal = {arXiv preprint arXiv:2305.06355},
  year    = {2023}
}

@inproceedings{li2024mvbench,
  title     = {Mvbench: A comprehensive multi-modal video understanding benchmark},
  author    = {Li, Kunchang and Wang, Yali and He, Yinan and Li, Yizhuo and Wang, Yi and Liu, Yi and Wang, Zun and Xu, Jilan and Chen, Guo and Luo, Ping and others},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {22195--22206},
  year      = {2024}
}

🌀️ Discussion Group

If you have any questions during trial, running, or deployment, or any ideas and suggestions for the project, feel free to join our WeChat discussion group!


We are hiring researchers, engineers, and interns in the General Vision Group, Shanghai AI Lab. If you are interested in working with us, please contact Yi Wang (wangyi@pjlab.org.cn).