Chop & Learn: Recognizing and Generating Object-State Compositions
Compositional Image Generation: Given training images of various objects in different states, generate new images of object-state pairs that were not seen during training.
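The task setup above can be sketched as a split over object-state pairs. This is a minimal illustration only, not the authors' data pipeline: the function names `compositional_split` and `make_prompt`, the example objects and states, and the prompt template are all assumptions made for this sketch. It holds out some pairs as unseen and checks that each held-out object and each held-out state still appears in at least one training pair, which is what makes the held-out pair a new composition rather than a new concept.

```python
from itertools import product


def compositional_split(objects, states, held_out):
    """Split all (object, state) pairs into seen and unseen sets.

    `held_out` lists the pairs reserved for generation at test time.
    Hypothetical helper for illustration; not the authors' code.
    """
    held_out = set(held_out)
    all_pairs = list(product(objects, states))
    seen = [pair for pair in all_pairs if pair not in held_out]
    unseen = [pair for pair in all_pairs if pair in held_out]

    # Each held-out object and state must occur in some seen pair,
    # otherwise the model has no training signal for that concept.
    seen_objects = {obj for obj, _ in seen}
    seen_states = {state for _, state in seen}
    for obj, state in unseen:
        if obj not in seen_objects or state not in seen_states:
            raise ValueError(f"({obj}, {state}) has no training coverage")
    return seen, unseen


def make_prompt(obj, state):
    """Build a text prompt for one composition (assumed template)."""
    return f"a photo of {state} {obj}"


# Illustrative names only; the real dataset defines its own splits.
seen_pairs, unseen_pairs = compositional_split(
    objects=["apple", "carrot"],
    states=["whole", "sliced"],
    held_out=[("apple", "sliced")],
)
test_prompts = [make_prompt(obj, state) for obj, state in unseen_pairs]
```

Under these assumptions, a generative model would be trained or fine-tuned on images for `seen_pairs` and then prompted with `test_prompts` to produce the unseen compositions.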
We consider these methods:
- Stable Diffusion (SD)
- Stable Diffusion + Textual Inversion (SD + TI)
- DreamBooth
- Stable Diffusion + Fine-tuning (SD + FT)
- Stable Diffusion + Textual Inversion + Fine-tuning (SD + TI + FT)
Ground Truth (GT) real images are shown in the first row for reference.
Please select different splits, objects, and states to view the generated images of different compositions.