GitHub - microsoft/X-Decoder at xgpt

X-GPT: Connecting generalist X-Decoder with GPT-3


🔥 News

🎶 Introduction


What is unique?

This project grew out of our two previous demos, all-in-one and instruct x-decoder. Inspired by visual-chatgpt, developed by our MSRA colleagues, we then used LangChain to build a conversational X-Decoder that encompasses all the capabilities of our single X-Decoder model.

Our X-GPT has several unique and new features:

Next, we will:

Getting Started

Installation

set up environment

conda create -n xgpt python=3.8
conda activate xgpt

install dependencies

pip3 install torch==1.13.1 torchvision==0.14.1 --extra-index-url https://download.pytorch.org/whl/cu113
python -m pip install 'git+https://github.com/MaureenZOU/detectron2-xyz.git'
pip install git+https://github.com/cocodataset/panopticapi.git
python -m pip install -r requirements.txt
sh install_cococapeval.sh
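After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch and not part of the repository; the helper names `base_version` and `check_pins` are hypothetical, and it assumes an installed wheel may report a local build suffix after a `+` sign.

```python
from importlib import metadata

# Versions pinned in the install command above.
PINNED = {"torch": "1.13.1", "torchvision": "0.14.1"}


def base_version(version):
    """Drop a local build suffix, e.g. '1.13.1+cu117' -> '1.13.1'."""
    return version.split("+", 1)[0]


def check_pins(pinned=PINNED, get_version=metadata.version):
    """Return {package: problem} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if base_version(installed) != wanted:
            problems[name] = f"found {installed}, expected {wanted}"
    return problems
```

Calling `check_pins()` inside the `xgpt` environment returns an empty dictionary when both packages match their pins; any entry it does return names the package and what was found instead.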

download x-decoder tiny model

wget https://projects4jw.blob.core.windows.net/x-decoder/release/xdecoder_focalt_last_novg.pt
wget https://projects4jw.blob.core.windows.net/x-decoder/release/xdecoder_focalt_vqa.pt

create folders for image uploading and image retrieval; note that image_pool is for image retrieval and image is for cache

mkdir image && mkdir image_pool
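Because `mkdir` reports an error when a folder already exists, re-running the setup can be noisy. As a hedged alternative, the same two folders can be created idempotently from Python. This is only a sketch; the helper name `ensure_folders` is hypothetical and is not something the repository provides.

```python
import os

# Folder names taken from the mkdir command above.
FOLDERS = ("image", "image_pool")


def ensure_folders(root=".", names=FOLDERS):
    """Create each folder under root if it is missing and return the paths."""
    paths = []
    for name in names:
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths
```

Calling `ensure_folders()` from the repository root is safe to repeat, since existing folders are left untouched.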

export the OpenAI key; for details, please follow https://langchain.readthedocs.io/en/latest/modules/llms/integrations/azure_openai_example.html

export OPENAI_API_TYPE=azure
export OPENAI_API_VERSION=2022-12-01
export OPENAI_API_BASE=https://your-resource-name.openai.azure.com
export OPENAI_API_KEY=
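A missing or empty variable tends to surface only once the demo makes its first API call, so a quick preflight check may save a restart. The sketch below is an assumption-laden illustration, not code from the repository; `missing_env_vars` is a hypothetical helper that looks only at the four variable names exported above.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_VERSION",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in env."""
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name, "").strip()]
```

Running `missing_env_vars()` in the same shell session before `python xgpt.py` returns an empty list when all four variables are set, and otherwise lists the names that still need a value.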

Run Demo

Simply run this single line and enjoy it!!!

python xgpt.py

Acknowledgement

Our use of LangChain was heavily inspired by visual-chatgpt, and we build on top of many great open-source projects: HuggingFace Transformers, LangChain, Stable Diffusion, InstructPix2Pix.

Citation

@article{zou2022xdecoder,
  author      = {Zou, Xueyan and Dou, Zi-Yi and Yang, Jianwei and Gan, Zhe and Li, Linjie and Li, Chunyuan and Dai, Xiyang and Wang, Jianfeng and Yuan, Lu and Peng, Nanyun and Wang, Lijuan and Lee, Yong Jae and Gao, Jianfeng},
  title       = {Generalized Decoding for Pixel, Image and Language},
  publisher   = {arXiv},
  year        = {2022},
}

Contact Information

For problems using X-GPT, please submit a GitHub issue or contact Jianwei Yang (jianwyan@microsoft.com).