Fix unloading of LoRAs when xformers attention procs are in use by isidentical · Pull Request #4179 · huggingface/diffusers
Conversation
What does this PR do?
Fixes #4178. The LoRA unloading logic now respects LoRAXFormersAttnProcessor.
I also consolidated all LoRA-related attention processors in a single place so they can be referenced from other parts of the codebase (I already found two call sites that needed this and replaced them).
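For context, here is a minimal sketch of the scenario this PR addresses (the model ID and LoRA path are placeholders, not taken from this PR): load a pipeline with xformers attention enabled, apply LoRA weights, then unload them.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder checkpoint and LoRA path; any SD checkpoint + LoRA works the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

pipe.enable_xformers_memory_efficient_attention()  # attention procs become xformers variants
pipe.load_lora_weights("path/to/lora")             # procs become LoRAXFormersAttnProcessor
pipe.unload_lora_weights()                         # before this fix, the xformers LoRA procs were not reverted
```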
@@ -464,6 +464,14 @@ def test_lora_unet_attn_processors_with_xformers(self):
if isinstance(module, Attention):
    self.assertIsInstance(module.processor, LoRAXFormersAttnProcessor)

# unload lora weights
sd_pipe.unload_lora_weights()
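A small helper (not part of the actual test file, just a hedged sketch) showing one way to verify that no xformers LoRA processors remain after `unload_lora_weights()`:

```python
from diffusers.models.attention_processor import Attention, LoRAXFormersAttnProcessor

def assert_lora_unloaded(unet):
    # Walk the UNet's attention modules and check none still use the xformers LoRA processor.
    for module in unet.modules():
        if isinstance(module, Attention):
            assert not isinstance(module.processor, LoRAXFormersAttnProcessor)
```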
I wasn't able to run this test as is (although I've confirmed the fix works by testing it on a production application of ours). If anyone has a machine with the proper requirements and can run it to ensure it is not flaky, I'd be super glad (and if it is failing, let me know, I'll try my best to fix it)!
Tested with RUN_SLOW=1 pytest python tests/models/test_lora_layers.py
and it passed. So, I guess we're good here.
The documentation is not available anymore as the PR was closed or merged.
if unet_attention_classes.issubset(LORA_ATTENTION_PROCESSORS):
    # Handle attention processors that are a mix of regular attention and AddedKV
    # attention.
    if len(unet_attention_classes) > 1 or LoRAAttnAddedKVProcessor in unet_attention_classes:
        self.unet.set_default_attn_processor()
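Roughly, the unloading path works as sketched below. This is not the merged implementation, just an illustration of the idea; the exact class mapping (including whether LoRAAttnProcessor2_0 is handled here) is an assumption.

```python
from diffusers.models.attention_processor import (
    AttnProcessor,
    AttnProcessor2_0,
    XFormersAttnProcessor,
    LoRAAttnProcessor,
    LoRAAttnProcessor2_0,
    LoRAXFormersAttnProcessor,
    LoRAAttnAddedKVProcessor,
)

def revert_lora_attn_processors(unet):
    # Assumes the caller already verified that every processor class is a LoRA
    # processor (the issubset check shown in the diff above).
    regular_classes = {
        LoRAAttnProcessor: AttnProcessor,
        LoRAAttnProcessor2_0: AttnProcessor2_0,
        LoRAXFormersAttnProcessor: XFormersAttnProcessor,
    }
    unet_attention_classes = {type(proc) for proc in unet.attn_processors.values()}

    if len(unet_attention_classes) > 1 or LoRAAttnAddedKVProcessor in unet_attention_classes:
        # Mixed or AddedKV processors: fall back to the default attention processor.
        unet.set_default_attn_processor()
    else:
        # Exactly one LoRA processor class: swap it for its plain counterpart.
        (attn_class,) = unet_attention_classes
        unet.set_attn_processor(regular_classes[attn_class]())
```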
This still has limited support for AddedKV processors with xformers, i.e. it simply reverts back to the default state. I'm not sure that combination can ever happen in practice, though, since I haven't seen it in the wild.
Thanks for fixing!
@@ -1623,6 +1623,13 @@ def __call__(self, attn: "Attention", hidden_states, encoder_hidden_states=None,
    CustomDiffusionXFormersAttnProcessor,
]

LORA_ATTENTION_PROCESSORS = (
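The consolidated tuple presumably looks something like the following (a sketch; the exact membership in the merged file may differ, and the inclusion of LoRAAttnProcessor2_0 is an assumption). It lives alongside the processor classes in diffusers/models/attention_processor.py, so other modules can import it instead of listing the classes themselves.

```python
LORA_ATTENTION_PROCESSORS = (
    LoRAAttnProcessor,
    LoRAAttnProcessor2_0,
    LoRAXFormersAttnProcessor,
    LoRAAttnAddedKVProcessor,
)
```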
clean!
Thanks a lot for the fix!
orpatashnik pushed commits to orpatashnik/diffusers that referenced this pull request
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request