[SDXL and IP2P]: instruction pix2pix XL training and pipeline by kfzyqin · Pull Request #4079 · huggingface/diffusers

kfzyqin

Support for Training Instruct Pix2Pix with Stable Diffusion XL

Description:

This pull request adds support for training InstructPix2Pix with Stable Diffusion XL. It introduces two new files:

train_instruct_pix2pix_xl.py: the training script for InstructPix2Pix on top of Stable Diffusion XL. It contains the training loop, the loss computation, and the backpropagation steps.

pipeline_stable_diffusion_xl_instruct_pix2pix.py: the inference pipeline for the trained InstructPix2Pix model. It handles pre-processing of the inputs, model inference, and post-processing of the outputs.

With these files, users can train an InstructPix2Pix model on Stable Diffusion XL and run inference with it, which may lead to better editing results.
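
For readers unfamiliar with how an InstructPix2Pix pipeline combines its conditioning at inference time, here is a minimal, framework-free sketch of the two-scale classifier-free guidance described in the InstructPix2Pix paper. It is not copied from pipeline_stable_diffusion_xl_instruct_pix2pix.py: the function and argument names are illustrative assumptions, and plain lists of floats stand in for the UNet's noise-prediction tensors.

```python
from typing import List, Sequence


def combine_ip2p_guidance(
    eps_text: Sequence[float],
    eps_image: Sequence[float],
    eps_uncond: Sequence[float],
    guidance_scale: float,
    image_guidance_scale: float,
) -> List[float]:
    """Combine three noise predictions using separate text and image scales.

    eps_text   : prediction conditioned on both the instruction and the image
    eps_image  : prediction conditioned on the image only
    eps_uncond : prediction with neither condition
    (All names here are hypothetical, chosen for this sketch.)
    """
    if not (len(eps_text) == len(eps_image) == len(eps_uncond)):
        raise ValueError("all three predictions must have the same length")
    return [
        # Start from the unconditional prediction, then add the text
        # direction and the image direction, each with its own scale.
        u + guidance_scale * (t - i) + image_guidance_scale * (i - u)
        for t, i, u in zip(eps_text, eps_image, eps_uncond)
    ]
```

With both scales set to 1.0 the result reduces to the fully conditioned prediction. Raising the image scale pulls the output toward the input image, while raising the text scale pushes it toward the instruction.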

kfzyqin changed the title from "Support instruction pix2pix sdxl" to "SDXL and IP2P: Support instruction pix2pix sdxl" (Jul 13, 2023)

@HuggingFaceDocBuilderDev

The documentation is not available anymore as the PR was closed or merged.

@kfzyqin

Bugs observed. Fixing now.

@andreemic

Hey what bugs did you observe? I was working on the same thing, maybe I can help.

@kfzyqin

> Hey what bugs did you observe? I was working on the same thing, maybe I can help.

Training generates weird artifacts like the one below.
[image: edited_image_xl]
Looking into this now.

kfzyqin changed the title from "SDXL and IP2P: Support instruction pix2pix sdxl" to "SDXL and IP2P: Support instruction pix2pix sdxl (WIP)" (Jul 16, 2023)

@kfzyqin

> > Hey what bugs did you observe? I was working on the same thing, maybe I can help.
>
> Training generates weird artifacts like below. Looking into this now.

The bugs have now been fixed, and reasonable images can be generated. I am cleaning up the code now.

@andreemic

Did you take an SDXL training pipeline as base? Or what did you start with?

@kfzyqin

> Did you take an SDXL training pipeline as base? Or what did you start with?

I modified the original instruct pix2pix training pipeline to be compatible with SDXL. You may refer to the code.
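
One change that such an adaptation typically needs, and which the original InstructPix2Pix training script in diffusers makes as far as I know, is widening the UNet's first convolution from 4 to 8 input channels so that the noisy latents and the source-image latents can be concatenated. The pretrained kernels are kept for the first 4 channels and the added channels start at zero. The sketch below illustrates only that idea with nested Python lists; it is a toy under stated assumptions, not the PR's code, and the function name is hypothetical.

```python
from typing import List

Kernel = List[List[float]]       # [kernel_h][kernel_w]
ConvWeight = List[List[Kernel]]  # [out_channels][in_channels][kernel_h][kernel_w]


def expand_input_channels(weight: ConvWeight, new_in_channels: int) -> ConvWeight:
    """Return a copy of `weight` whose input dimension is padded with zero kernels."""
    expanded: ConvWeight = []
    for filters in weight:
        old_in = len(filters)
        if new_in_channels < old_in:
            raise ValueError("new_in_channels must be >= the current input channel count")
        kernel_h, kernel_w = len(filters[0]), len(filters[0][0])
        # Keep (copies of) the pretrained kernels for the original channels,
        # i.e. the ones that read the noisy latents.
        kept = [[row[:] for row in kernel] for kernel in filters]
        # Zero kernels for the added channels (the source-image latents), so
        # the widened layer initially behaves exactly like the pretrained one.
        zeros = [
            [[0.0] * kernel_w for _ in range(kernel_h)]
            for _ in range(new_in_channels - old_in)
        ]
        expanded.append(kept + zeros)
    return expanded
```

Because the new kernels are zero, the model's output at the start of fine-tuning is unchanged by the extra image channels, and training gradually learns to use them.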

@tchambon @kfzyqin

…e#3996)

@kfzyqin

Hi @sayakpaul and @patrickvonplaten,

I have addressed the code review. Once I have run more experiments, I will give more detailed documentation on training IP2P with SDXL.

I will become very busy starting this week, since the semester has started and I need to teach and supervise students. If possible, can we merge this PR? If we have more interesting thoughts, we can address them in the future when I am not so busy.

@sayakpaul

> I have addressed the code review. Once I have run more experiments, I will give more detailed documentation on training IP2P with SDXL.

Having some good results documented (like the one you presented on interior design), preferably with the trained checkpoints, will be very beneficial to the community.

Maybe we could simply write something like WIP on the corresponding pages. @patrickvonplaten WDYT?

@kfzyqin

Sure, we can do this in another PR, maybe this weekend. Let's please complete this one now.

@patrickvonplaten

> I have addressed the code review. Once I have run more experiments, I will give more detailed documentation on training IP2P with SDXL.

Ok for me!

@kfzyqin

> > I have addressed the code review. Once I have run more experiments, I will give more detailed documentation on training IP2P with SDXL.
>
> Ok for me!

@patrickvonplaten Can you merge the code? I believe this one is self-contained :-)

sayakpaul

Let's merge. I think it's safe to revisit some stuff (documentation, examples, etc.) later. At least with the PR, we can enable the community.

@sayakpaul

@harutatsuakiyama I took the liberty of adding mentions of SDXL InstructPix2Pix to the READMEs and also applied @patrickvonplaten's suggestions.

Will merge after the CI is green :)

@sayakpaul

Thanks for your contribution!

@blldd

> > @harutatsuakiyama oh wow! InstructPix2Pix is always very dear to my heart. This is why I worked on this: https://huggingface.co/blog/instruction-tuning-sd.
> > But anyway, could I get some results from your training runs before I do a deep review of your PR if it's not too much?
>
> Sure. I trained on this dataset: https://www.zheyuanliu.me/CIRR/, since I have trained this dataset with IP2P before.
>
> These are some examples:
>
> Has dining room view, Change stools to long white benches, Add a chair on the right corner [image: reference_0] [image: edited_image_xl_new_1]
>
> Has similar layout of bed and pillows, Change bed to pink [image: reference_3] [image: edited_image_xl_new_3_2]
>
> The dataset is not very pix2pix, hence the change is big.

It's really great, but when I follow the training script in the README, the loss doesn't converge well. Could you share a checkpoint?

@kfzyqin

I will give more details over the weekend. You will need to increase the batch size to 4, which improves performance. Also, be patient: I found that the loss may oscillate even though the generated results are good.

If you have QQ, you can add me: 290956355. We can discuss more.
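
To make the two pieces of advice above concrete, here is a small sketch. It assumes that the training script supports gradient accumulation (as the diffusers example scripts generally do), so that a batch size of 4 can be reached even when only one sample fits in memory, and it uses a bias-corrected exponential moving average to read the trend of an oscillating loss. Both helper names are hypothetical and neither is part of the PR.

```python
from typing import Iterable, List


def effective_batch_size(
    per_device_batch: int, grad_accum_steps: int = 1, num_processes: int = 1
) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_processes


def ema_smooth(losses: Iterable[float], beta: float = 0.9) -> List[float]:
    """Bias-corrected exponential moving average of a loss curve."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    smoothed: List[float] = []
    running = 0.0
    for step, loss in enumerate(losses, start=1):
        running = beta * running + (1.0 - beta) * loss
        # Divide by (1 - beta^step) so early values are not biased toward 0.
        smoothed.append(running / (1.0 - beta ** step))
    return smoothed
```

For example, `effective_batch_size(1, grad_accum_steps=4)` gives 4 on a single device. Plotting `ema_smooth(losses)` instead of the raw per-step loss makes it easier to tell whether training is still improving despite the step-to-step noise.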

@blldd

Nice

orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request

Aug 1, 2023

…gface#4079)

Co-authored-by: yiyixuxu <yixu310@gmail,com>

force accelerate to be installed

Co-authored-by: Harutatsu Akiyama <kf.zy.qin@gmail.com>
Co-authored-by: Thomas Chambon <36728882+tchambon@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

@bghira

Thanks to your great work, @harutatsuakiyama, I created an extension for SimpleTuner for general fine-tuning. You might find some improvements or changes there that you like, including min-SNR, which greatly assisted with convergence, and options to use D-adaptation for the same reason.
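
For context on the min-SNR option mentioned above, here is a minimal sketch of Min-SNR-gamma loss weighting for an epsilon-prediction objective, where the per-timestep weight is min(SNR, gamma) / SNR. It is not SimpleTuner's or diffusers' implementation. The schedule parameters are the commonly used Stable Diffusion defaults and are assumptions here, as is gamma = 5.0, and all function names are hypothetical.

```python
import math
from typing import List


def alphas_cumprod(
    num_steps: int = 1000, beta_start: float = 0.00085, beta_end: float = 0.012
) -> List[float]:
    """Cumulative product of (1 - beta_t) for a 'scaled linear' beta schedule.

    The default values are assumed, not read from any model config.
    Requires num_steps >= 2.
    """
    sqrt_start, sqrt_end = math.sqrt(beta_start), math.sqrt(beta_end)
    out: List[float] = []
    running = 1.0
    for t in range(num_steps):
        frac = t / (num_steps - 1)
        # Interpolate linearly in sqrt(beta), then square.
        beta = (sqrt_start + frac * (sqrt_end - sqrt_start)) ** 2
        running *= 1.0 - beta
        out.append(running)
    return out


def snr(alpha_bar: float) -> float:
    """Signal-to-noise ratio of a timestep with cumulative alpha `alpha_bar`."""
    return alpha_bar / (1.0 - alpha_bar)


def min_snr_weight(alpha_bar: float, gamma: float = 5.0) -> float:
    """Min-SNR-gamma loss weight for epsilon prediction: min(SNR, gamma) / SNR."""
    s = snr(alpha_bar)
    return min(s, gamma) / s
```

Timesteps with little noise (high SNR) are down-weighted, while noisy timesteps keep a weight of 1. This evens out how much each timestep contributes to the loss, which is one plausible reason it helps convergence.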

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request

Dec 25, 2023

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request

Apr 26, 2024
