Fix unloading of LoRAs when xformers attention procs are in use by isidentical · Pull Request #4179 · huggingface/diffusers



Conversation


isidentical

What does this PR do?

Fixes #4178. The LoRA-unloading logic now respects LoRAXFormersAttnProcessor. I also consolidated all LoRA-related attention processors in a single place so they can be referenced from other parts of the codebase (I already found two call sites that needed it and replaced them).
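For context, here is a minimal sketch of the shape of the change, assuming the consolidated tuple is the LORA_ATTENTION_PROCESSORS constant shown in the diffs below; its exact contents and the helper here are my reconstruction, not copied from the PR:

```python
# Sketch only: the real definitions live in diffusers.models.attention_processor
# and in LoraLoaderMixin.unload_lora_weights().
from diffusers.models.attention_processor import (
    LoRAAttnAddedKVProcessor,
    LoRAAttnProcessor,
    LoRAAttnProcessor2_0,
    LoRAXFormersAttnProcessor,
)

# One shared definition instead of ad-hoc lists scattered around the codebase.
LORA_ATTENTION_PROCESSORS = (
    LoRAAttnProcessor,
    LoRAAttnProcessor2_0,
    LoRAXFormersAttnProcessor,  # previously missing from the unload check
    LoRAAttnAddedKVProcessor,
)


def unet_uses_only_lora_processors(unet) -> bool:
    """True if every attention processor on the UNet is a LoRA variant."""
    classes = {type(proc) for proc in unet.attn_processors.values()}
    return classes.issubset(LORA_ATTENTION_PROCESSORS)
```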

@isidentical

isidentical

@@ -464,6 +464,14 @@ def test_lora_unet_attn_processors_with_xformers(self):
            if isinstance(module, Attention):
                self.assertIsInstance(module.processor, LoRAXFormersAttnProcessor)

        # unload lora weights
        sd_pipe.unload_lora_weights()


I wasn't able to run this test as-is (although I've confirmed the fix works by testing it on a production application of ours). If anyone has a machine with the proper requirements and can run it to make sure it is not flaky, I'd be very glad; and if it fails, let me know and I'll do my best to fix it!
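For anyone who wants to sanity-check this outside the test suite, a rough verification sketch against the public pipeline API; the checkpoint ID is just an example and the LoRA path is a placeholder:

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.models.attention_processor import (
    LoRAXFormersAttnProcessor,
    XFormersAttnProcessor,
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()

# With xformers enabled, loading LoRA weights installs LoRAXFormersAttnProcessor.
pipe.load_lora_weights("path/to/lora.safetensors")  # placeholder path
assert any(
    isinstance(proc, LoRAXFormersAttnProcessor)
    for proc in pipe.unet.attn_processors.values()
)

# Before this fix, unloading did not recognize the xformers LoRA processor;
# with it, the UNet should be back on plain xformers attention.
pipe.unload_lora_weights()
assert all(
    isinstance(proc, XFormersAttnProcessor)
    for proc in pipe.unet.attn_processors.values()
)
```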


Tested with RUN_SLOW=1 pytest tests/models/test_lora_layers.py and it passed. So I guess we're good here.

@HuggingFaceDocBuilderDev

The documentation is not available anymore as the PR was closed or merged.

isidentical

        if unet_attention_classes.issubset(LORA_ATTENTION_PROCESSORS):
            # Handle attention processors that are a mix of regular attention and AddedKV
            # attention.
            if len(unet_attention_classes) > 1 or LoRAAttnAddedKVProcessor in unet_attention_classes:
                self.unet.set_default_attn_processor()


This still has limited support for AddedKV processors with xformers, i.e. in that case it reverts back to the original (default) attention state. I'm not sure that combination can ever happen in practice, though, since I haven't seen it in the wild.
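For readers following along, this is roughly the branching being described: mixed or AddedKV LoRA processors fall back to the defaults, while a single "regular" LoRA class can be mapped back to its non-LoRA counterpart. A sketch of that idea (the mapping and helper are my reconstruction, not the PR's code verbatim):

```python
from diffusers.models.attention_processor import (
    AttnProcessor,
    AttnProcessor2_0,
    LoRAAttnAddedKVProcessor,
    LoRAAttnProcessor,
    LoRAAttnProcessor2_0,
    LoRAXFormersAttnProcessor,
    XFormersAttnProcessor,
)

# Each "regular" LoRA processor maps cleanly back to a non-LoRA counterpart.
REGULAR_ATTENTION_CLASSES = {
    LoRAAttnProcessor: AttnProcessor,
    LoRAAttnProcessor2_0: AttnProcessor2_0,
    LoRAXFormersAttnProcessor: XFormersAttnProcessor,
}


def restore_non_lora_processors(unet):
    unet_attention_classes = {type(proc) for proc in unet.attn_processors.values()}
    if len(unet_attention_classes) > 1 or LoRAAttnAddedKVProcessor in unet_attention_classes:
        # Mixed processors or AddedKV LoRA: there is no clean one-to-one mapping,
        # so revert to the default processors. This is the "limited support" case:
        # xformers attention is dropped even if it was enabled before.
        unet.set_default_attn_processor()
    else:
        (attn_class,) = unet_attention_classes
        unet.set_attn_processor(REGULAR_ATTENTION_CLASSES[attn_class]())
```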

sayakpaul


Thanks for fixing!

patrickvonplaten

@@ -1623,6 +1623,13 @@ def __call__(self, attn: "Attention", hidden_states, encoder_hidden_states=None,
    CustomDiffusionXFormersAttnProcessor,
]
LORA_ATTENTION_PROCESSORS = (


clean!
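One nice side effect of the consolidation: because it is a plain tuple of classes, the other call sites mentioned in the PR description can reuse it with isinstance or issubset checks instead of maintaining their own lists. A hypothetical consumer, assuming the constant is importable from attention_processor as the diff suggests:

```python
from diffusers.models.attention_processor import LORA_ATTENTION_PROCESSORS


def is_lora_attn_processor(processor) -> bool:
    # isinstance accepts a tuple of classes, so there is no per-call-site
    # list of LoRA processor types to keep in sync.
    return isinstance(processor, LORA_ATTENTION_PROCESSORS)
```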

patrickvonplaten


Thanks a lot for the fix!

orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request

Aug 1, 2023

@isidentical @orpatashnik

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request

Dec 25, 2023

@isidentical