Qwen Image Edit Support by naykun · Pull Request #12164 · huggingface/diffusers

This pull request introduces support for Qwen Image Edit.

For additional information, please visit the Qwen-Image repository.
If you find our work beneficial, we kindly encourage you to star the repository, which will help accelerate the release of the checkpoint.

cc @yiyixuxu @a-r-r-o-w

add qwen-image-edit support

self.register_buffer("neg_freqs", neg_freqs, persistent=False)

# Whether to use scale rope
# DO NOT USE REGISTER BUFFER HERE; IT WILL CAUSE COMPLEX NUMBERS TO LOSE THEIR IMAGINARY PART
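
The warning in the comment above can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: `BufferRope`, `AttributeRope`, and `make_freqs` are hypothetical names introduced for illustration. It relies on the documented behavior that `nn.Module.to(dtype)` casts floating-point and complex parameters and buffers, so a complex buffer cast to a real dtype (float32 here) keeps only its real part, while a plain tensor attribute is left untouched.

```python
import warnings

import torch
from torch import nn


def make_freqs(n: int = 4) -> torch.Tensor:
    # Hypothetical helper: unit-magnitude complex rotations e^{i*theta},
    # the kind of table a complex-valued RoPE multiplies against.
    angles = torch.arange(n, dtype=torch.float32)
    return torch.polar(torch.ones(n), angles)


class BufferRope(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered buffers are converted by Module.to(dtype).
        self.register_buffer("freqs", make_freqs(), persistent=False)


class AttributeRope(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain tensor attribute is not visited by Module.to(dtype).
        self.freqs = make_freqs()


with warnings.catch_warnings():
    # PyTorch warns that casting complex to real discards the imaginary part.
    warnings.simplefilter("ignore")
    buffered = BufferRope().to(torch.float32)
    plain = AttributeRope().to(torch.float32)

print(buffered.freqs.dtype, buffered.freqs.is_complex())
print(plain.freqs.dtype, plain.freqs.is_complex())
```

The trade-off of the plain attribute is that it also does not follow the module across devices, so the calling code has to move it explicitly before use.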

cc @sayakpaul for review of rope changes

@naykun We refactored the RoPE logic in #12061 to make it compatible with torch.compile. Would you be able to add the relevant changes for QwenEdit with the current implementation?

In the latest commit, I have refactored the code as described in #12061.
Please kindly review and let me know if it looks good. Local testing confirms that it correctly generates images.

awesome! thank you!

compatible with torch.compile in new rope setting
fix init import
add prompt truncation in img2img and inpaint pipe
remove unused logic and comment
add copy statement
guard logic for rope video shape tuple

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

make fix-copies
update doc

Thank you!
Will merge now!
(Not sure if the RoPE refactor works completely - I think we need to test for recompilation, but that can be updated and reviewed in a follow-up PR, cc @a-r-r-o-w)

hi @naykun, where do you find PREFERRED_QWENIMAGE_RESOLUTIONS? Are you basing it solely on FluxKontext?

Hi @TuanNT-ZenAI, while Qwen-Image-Edit was trained on multiple resolutions, we find that the resolution settings from FluxKontext offer a reliable and consistent baseline for inference. These defaults may be refined in future updates.
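
For readers wondering how such a list of preferred resolutions is typically consumed, here is a rough sketch. It assumes the list is used to snap an input image to the candidate with the closest aspect ratio, as the FluxKontext pipeline does with its own list. The values in `EXAMPLE_RESOLUTIONS` and the helper `pick_resolution` are illustrative only and are not the actual contents of `PREFERRED_QWENIMAGE_RESOLUTIONS`.

```python
# Illustrative (width, height) pairs only; NOT the real PREFERRED_QWENIMAGE_RESOLUTIONS.
EXAMPLE_RESOLUTIONS = [
    (672, 1568),
    (832, 1248),
    (1024, 1024),
    (1248, 832),
    (1568, 672),
]


def pick_resolution(width, height, candidates=EXAMPLE_RESOLUTIONS):
    """Return the candidate (width, height) whose aspect ratio is closest to the input's."""
    aspect_ratio = width / height
    # Compare aspect ratios only; the absolute input size is ignored.
    return min(candidates, key=lambda wh: abs(aspect_ratio - wh[0] / wh[1]))


print(pick_resolution(1920, 1080))  # landscape input -> (1248, 832)
print(pick_resolution(512, 512))    # square input -> (1024, 1024)
```

Matching on aspect ratio rather than on pixel count keeps the composition of the input image while moving it to a size the model handles reliably.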