Add Support for Z-Image Series by JerryWu-code · Pull Request #12703 · huggingface/diffusers

@yiyixuxu
By the way, while testing the `_flash_3` and `_flash_varlen_3` backends, we noticed that the current implementation in `attention_dispatch.py` is incompatible with the latest Flash Attention 3 API.

A recent FA3 commit (Dao-AILab/flash-attention@203b9b3) introduced a `return_attn_probs` argument and changed the default return behavior: the functions now return a single output tensor by default (instead of a tuple), which causes the current tuple-unpacking logic in diffusers to fail.

We have implemented a fix that handles this while maintaining backward compatibility:

JerryWu-code@de4c6f1#diff-b027e126a86a26981384b125714e0f3bd9923eaa8322f1ae5f6b53fe3e3481c2

Should we include this fix in the current PR, or would you prefer us to open a separate PR for it?