Fix Chroma attention padding order and update docs to use lodestones/Chroma1-HD by josephrocca · Pull Request #12508 · huggingface/diffusers
What does this PR do?
Chroma inference is currently incorrect: the padding token should be unmasked for the transformer forward pass, not for the T5 encoder forward pass. The T5 embedding step should use the regular attention mask.
This change fixes that to align with official code and ComfyUI.
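To make the padding-order fix concrete, here is a minimal sketch of the mask handling described above. It uses plain Python lists for clarity (the real pipeline operates on torch tensors), and `build_masks` is a hypothetical helper, not a diffusers function: the T5 encoder receives the unmodified attention mask, while the transformer's mask additionally unmasks one padding token after the last real token.

```python
def build_masks(attention_mask):
    """Given a batch of 0/1 attention masks, return (encoder_mask, transformer_mask).

    encoder_mask:     unchanged -- the T5 encoder sees the regular mask.
    transformer_mask: same mask with the first padding position unmasked,
                      so the transformer attends to one padding token.
    """
    encoder_mask = [row[:] for row in attention_mask]
    transformer_mask = [row[:] for row in attention_mask]
    for row in transformer_mask:
        num_real = sum(row)  # number of real (non-padding) tokens
        # Unmask the first padding token; clamp in case the prompt
        # already fills the whole sequence.
        pad_index = min(num_real, len(row) - 1)
        row[pad_index] = 1
    return encoder_mask, transformer_mask

enc, tr = build_masks([[1, 1, 1, 0, 0]])
# enc stays [[1, 1, 1, 0, 0]]; tr becomes [[1, 1, 1, 1, 0]]
```

The previous behavior effectively applied the transformer-style mask at the T5 encoding step, which this PR reverts to the regular mask.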
I've also snuck in an update to use the final checkpoint in the docs/comments: https://huggingface.co/lodestones/Chroma1-HD
Top is before the fix, bottom is after. I used lodestones/Chroma1-Base, since it's what I had on hand. There doesn't seem to be a huge difference, except in the first column. The fix might have a stronger effect for shorter prompts, but I didn't test that.
