🌀 Awesome Generative Explainability
This repository collects frontier research on explainable generative models, with a strong emphasis on diffusion-based models. As generative AI continues to evolve, understanding how and why these models produce their outputs is critical for building transparent, safe, and controllable AI systems.
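Many entries below (DAAM, Prompt-to-Prompt, Attend-and-Excite, Training-Free Layout Control) revolve around one primitive: reading out a text-to-image model's cross-attention maps, which link each prompt token to image regions. As an orientation example, here is a minimal sketch of capturing those maps with Hugging Face diffusers by swapping in a custom attention processor. It mirrors the classic AttnProcessor interface; newer diffusers releases default to fused SDPA processors and the model id is only an example, so treat this as a starting point rather than any paper's reference implementation.

```python
import torch
from diffusers import StableDiffusionPipeline

class StoreCrossAttnProcessor:
    """Attention processor that records cross-attention maps as it runs.

    Mirrors the classic diffusers AttnProcessor; newer releases default to
    fused (SDPA) processors, so treat this as a sketch, not a drop-in.
    """

    def __init__(self, store, name):
        self.store = store  # shared dict: layer name -> list of per-step maps
        self.name = name

    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, **kwargs):
        is_cross = encoder_hidden_states is not None
        context = encoder_hidden_states if is_cross else hidden_states

        query = attn.head_to_batch_dim(attn.to_q(hidden_states))
        key = attn.head_to_batch_dim(attn.to_k(context))
        value = attn.head_to_batch_dim(attn.to_v(context))

        # Shape: [batch * heads, image tokens, text tokens] on cross-attn layers.
        probs = attn.get_attention_scores(query, key, attention_mask)
        if is_cross:
            self.store.setdefault(self.name, []).append(probs.detach().cpu())

        out = attn.batch_to_head_dim(torch.bmm(probs, value))
        out = attn.to_out[0](out)   # output projection
        return attn.to_out[1](out)  # dropout

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

store = {}
pipe.unet.set_attn_processor(
    {name: StoreCrossAttnProcessor(store, name) for name in pipe.unet.attn_processors}
)
image = pipe("a corgi wearing a red hat").images[0]
# Averaging `store` over steps and heads, then reshaping per token, yields
# DAAM-style heatmaps of where each prompt token attended.
```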
2025
STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from Text-to-Image Diffusion Models, CVPR 2025 Highlight
[Paper] [Code]
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models, ICLR 2025
[Paper] [Code]
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
[Paper] [Code]
Emergence and Evolution of Interpretable Concepts in Diffusion Models
[Paper]
SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders (see the sparse-autoencoder sketch after this year's list)
[Paper] [Code]
Interpretable Generative Models through Post-hoc Concept Bottleneck, CVPR 2025
[Paper] [Code]
TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation
[Paper]
Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations
[Paper] [Code]
Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models, CVPR 2025
[Paper]
Controlling Language and Diffusion Models by Transporting Activations, ICLR 2025
[Paper] [Code]
Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models
[Paper] [Code]
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models, ICLR 2025
[Paper]
A General Framework for Inference-time Scaling and Steering of Diffusion Models
[Paper] [Code]
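Several of the 2025 entries above (SAeUron, TIDE, Concept Steerers, Unpacking SDXL Turbo) share the same primitive: a sparse autoencoder trained to reconstruct internal activations of the generator, whose sparse latents are then read as concepts or edited at sampling time. Below is a minimal top-k SAE sketch in PyTorch; the dictionary size, sparsity level k, and the layer you cache activations from are illustrative choices, not any specific paper's settings.

```python
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    """Minimal k-sparse autoencoder over cached activations (a sketch,
    not any listed paper's exact architecture)."""

    def __init__(self, d_model: int, d_dict: int, k: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model, bias=False)
        self.k = k

    def forward(self, x):
        z = torch.relu(self.encoder(x))
        # Keep the k largest latents per example; zero out the rest.
        idx = torch.topk(z, self.k, dim=-1).indices
        z = torch.zeros_like(z).scatter_(-1, idx, z.gather(-1, idx))
        return self.decoder(z), z

# Training is plain reconstruction on activations cached from the generator.
sae = TopKSAE(d_model=1280, d_dict=16384, k=32)   # sizes are illustrative
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
acts = torch.randn(4096, 1280)  # stand-in for cached UNet/DiT features
for batch in acts.split(256):
    recon, z = sae(batch)
    loss = (recon - batch).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Interpreting a latent then amounts to finding the inputs that maximally activate it; unlearning or steering (as in SAeUron and Concept Steerers) edits the sparse code before decoding.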
2024
Training-Free Layout Control with Cross-Attention Guidance, WACV 2024
[Paper] [Code]
Understanding Hallucinations in Diffusion Models through Mode Interpolation, NeurIPS 2024
[Paper] [Code]
Unified Concept Editing in Diffusion Models, WACV 2024
[Paper] [Code]
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models, ECCV 2024
[Paper] [Code]
Exploring Diffusion Time-steps for Unsupervised Representation Learning, ICLR 2024
[Paper] [Code]
Revelio: Interpreting and Leveraging Semantic Information in Diffusion Models
[Paper] [Code]
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
[Paper]
Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models
[Paper]
Concept Bottleneck Generative Models, ICLR 2024
[Paper] [Code]
Generalization in Diffusion Models Arises from Geometry-adaptive Harmonic Representations, ICLR 2024 Oral
[Paper] [Code]
ConceptAttention: Diffusion Transformers Learn Interpretable Features
[Paper] [Code]
ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction, ECCV 2024
[Paper] [Code]
Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation, CVPR 2024
[Paper]
PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control, ECCV 2024
[Paper] [Code]
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models, ECCV 2024
[Paper] [Code]
Diffexplainer: Towards Cross-modal Global Explanations with Diffusion Models
[Paper]
Explaining Generative Diffusion Models via Visual Analysis for Interpretable Decision-making Process
[Paper]
Trade-offs in Fine-tuned Diffusion Models Between Accuracy and Interpretability
[Paper]
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders
[Paper] [Code]
Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering
[Paper] [Code]
Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models
[Paper]
Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing
[Paper] [Code]
Interpreting the Weight Space of Customized Diffusion Models, NeurIPS 2024
[Paper] [Code]
2023
Diffusion Self-Guidance for Controllable Image Generation, NeurIPS 2023
[Paper] [Code]
Null-text Inversion for Editing Real Images using Guided Diffusion Models, CVPR 2023
[Paper] [Code]
Prompt-to-Prompt Image Editing with Cross Attention Control, ICLR 2023
[Paper] [Code]
Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models, SIGGRAPH 2023
[Paper] [Code]
Concept Algebra for (Score-Based) Text-Controlled Generative Models, NeurIPS 2023
[Paper] [Code]
Cones: Concept Neurons in Diffusion Models for Customized Generation, ICML 2023 Oral
[Paper]
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models, ICCV 2023
[Paper]
Your Diffusion Model is Secretly a Zero-Shot Classifier, ICCV 2023
[Paper] [Code]
Diffusion Models Already Have a Semantic Latent Space, ICLR 2023 (see the h-space sketch after this year's list)
[Paper] [Code]
Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry, NeurIPS 2023
[Paper] [Code]
Emergent Correspondence from Image Diffusion, NeurIPS 2023
[Paper] [Code]
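A thread running through the 2023 entries above ("Diffusion Models Already Have a Semantic Latent Space", the Riemannian-geometry analysis) is that the UNet's bottleneck activations, often called h-space, behave as a semantic latent space: adding a fixed direction there shifts generations along an interpretable attribute. The sketch below shows the mechanics with a plain PyTorch forward hook in diffusers; the random direction is a stand-in (the actual methods derive it from data, e.g. via PCA over cached activations, and typically apply it only over part of the sampling trajectory), and the shapes assume SD v1.5 at 512x512.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Random stand-in for a semantic h-space direction; real methods derive it
# from data (e.g. PCA over cached mid-block activations) instead.
direction = torch.randn(1, 1280, 8, 8, dtype=torch.float16, device="cuda")
direction = direction / direction.norm()
alpha = 8.0  # edit strength

def shift_h(module, inputs, output):
    # The mid-block emits the UNet's bottleneck feature map ("h-space");
    # shifting it here lets the up-blocks decode the edited semantics.
    return output + alpha * direction

handle = pipe.unet.mid_block.register_forward_hook(shift_h)
edited = pipe("a photo of a person", num_inference_steps=30).images[0]
handle.remove()
```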
2022 and Earlier
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
[Paper] [Code]
What the DAAM: Interpreting Stable Diffusion Using Cross Attention
[Paper] [Code]
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders, NeurIPS 2022
[Paper] [Code]
Understanding Diffusion Models: A Unified Perspective, 2022
[Paper]
Discovering Latent Concepts Learned in BERT, ICLR 2022
[Paper] [Code]
GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019
[Paper] [Code]
GANSpace: Discovering Interpretable GAN Controls, NeurIPS 2020
[Paper] [Code]
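For the GAN-era entries, GANSpace's core recipe is compact enough to sketch directly: sample many latents, push them through the generator's mapping network, run PCA, and use the principal components as edit directions. The `mapping` handle below is an assumed StyleGAN-style mapping network, not a real API.

```python
import torch

@torch.no_grad()
def ganspace_directions(mapping, n_samples=10_000, z_dim=512, device="cuda"):
    """GANSpace-style: PCA over sampled W-space latents; the rows of `v`
    are candidate edit directions. `mapping` is an assumed StyleGAN-style
    mapping network returning [n_samples, w_dim]."""
    z = torch.randn(n_samples, z_dim, device=device)
    w = mapping(z)
    w_mean = w.mean(dim=0, keepdim=True)
    # PCA via SVD of the centered samples.
    _, _, v = torch.linalg.svd(w - w_mean, full_matrices=False)
    return w_mean, v

# Editing: move a latent along the k-th component before synthesis,
# w_edit = w + sigma * v[k], then feed w_edit to the synthesis network.
```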