
🌀 Awesome Generative Explainability

This repository collects frontier research on explainable generative models, with a strong emphasis on diffusion-based approaches. As generative AI continues to evolve, understanding how and why these models produce their outputs is critical for building transparent, safe, and controllable systems.


2025

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models, ICLR 2025
[Paper] [Code]

No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
[Paper] [Code]

Emergence and Evolution of Interpretable Concepts in Diffusion Models
[Paper]

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders
[Paper] [Code]

Interpretable Generative Models through Post-hoc Concept Bottleneck, CVPR 2025
[Paper] [Code]

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation
[Paper]

Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations
[Paper] [Code]

Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models, CVPR 2025
[Paper]

Controlling Language and Diffusion Models by Transporting Activations, ICLR 2025
[Paper] [Code]

Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models
[Paper] [Code]
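
Several of the 2025 entries above (SAeUron, TIDE, Concept Steerers, and the concept-erasing classifier directly above) share a common primitive: a sparse autoencoder trained on activations cached from a diffusion model, whose sparse latents are then read, or steered, as concepts. As a rough illustration of that primitive only, here is a minimal k-sparse autoencoder in PyTorch; every name and hyperparameter is illustrative and not taken from any of these papers.

```python
# Minimal k-sparse autoencoder over cached model activations.
# Illustrative sketch only; not the recipe of any paper listed here.
import torch
import torch.nn as nn

class KSparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int, k: int):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model, bias=False)

    def forward(self, x):
        z = torch.relu(self.encoder(x))
        # Keep only the k largest latents per sample; zero out the rest.
        topk = torch.topk(z, self.k, dim=-1)
        z_sparse = torch.zeros_like(z).scatter(-1, topk.indices, topk.values)
        return self.decoder(z_sparse), z_sparse

# Usage: `acts` stands in for activations cached from any layer of a
# generative model (e.g. a U-Net block or a DiT residual stream).
sae = KSparseAutoencoder(d_model=1280, d_dict=16384, k=32)
acts = torch.randn(8, 1280)
recon, codes = sae(acts)
loss = torch.mean((recon - acts) ** 2)  # plain reconstruction objective
```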

Restyling Unsupervised Concept Based Interpretable Networks with Generative Models, ICLR 2025
[Paper]

A General Framework for Inference-time Scaling and Steering of Diffusion Models
[Paper] [Code]

STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from Text-to-Image Diffusion Models, CVPR 2025 Highlight
[Paper] [Code]

ConceptAttention: Diffusion Transformers Learn Interpretable Features, ICML 2025
[Paper] [Code]

2024

Training-Free Layout Control with Cross-Attention Guidance, WACV 2024
[Paper] [Code]

Understanding Hallucinations in Diffusion Models through Mode Interpolation, NeurIPS 2024
[Paper] [Code]

Unified Concept Editing in Diffusion Models, WACV 2024
[Paper] [Code]

Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models, ECCV 2024
[Paper] [Code]

Exploring Diffusion Time-steps for Unsupervised Representation Learning, ICLR 2024
[Paper] [Code]

Revelio: Interpreting and Leveraging Semantic Information in Diffusion Models
[Paper] [Code]

Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
[Paper]

Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models
[Paper]

Concept Bottleneck Generative Models, ICLR 2024
[Paper] [Code]

Generalization in Diffusion Models Arises from Geometry-adaptive Harmonic Representations, ICLR 2024 Oral
[Paper] [Code]

ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction, ECCV 2024
[Paper] [Code]

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation, CVPR 2024
[Paper]

PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control, ECCV 2024
[Paper] [Code]

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models, ECCV 2024
[Paper] [Code]

DiffExplainer: Towards Cross-modal Global Explanations with Diffusion Models
[Paper]

Explaining Generative Diffusion Models via Visual Analysis for Interpretable Decision-making Process
[Paper]

Trade-offs in Fine-tuned Diffusion Models Between Accuracy and Interpretability
[Paper]

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders
[Paper] [Code]

Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering
[Paper] [Code]

Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models
[Paper]

Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing
[Paper] [Code]

Interpreting the Weight Space of Customized Diffusion Models, NeurIPS 2024
[Paper] [Code]

2023

Diffusion Self-Guidance for Controllable Image Generation, NeurIPS 2023
[Paper] [Code]

Null-text Inversion for Editing Real Images using Guided Diffusion Models, CVPR 2023
[Paper] [Code]

Prompt-to-Prompt Image Editing with Cross Attention Control, ICLR 2023
[Paper] [Code]

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models, SIGGRAPH 2023
[Paper] [Code]

Concept Algebra for (Score-Based) Text-Controlled Generative Models, NeurIPS 2023
[Paper] [Code]

Cones: Concept Neurons in Diffusion Models for Customized Generation, ICML 2023 Oral
[Paper]

Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models, ICCV 2023
[Paper]

Your Diffusion Model is Secretly a Zero-Shot Classifier, ICCV 2023
[Paper] [Code]
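
The paper above scores each candidate label by how well the text-conditioned denoiser predicts the noise injected into the input image, then picks the label with the lowest average error. A hedged sketch of that scoring loop; `denoiser` and the toy cosine schedule are stand-ins for the real Stable Diffusion components, not the paper's implementation.

```python
# Diffusion-classifier idea in miniature: lower denoising error under a
# label's conditioning => higher likelihood for that label.
import torch

def diffusion_classify(denoiser, x0, label_embs, n_trials=16):
    scores = torch.zeros(len(label_embs))
    for _ in range(n_trials):
        t = torch.randint(0, 1000, (1,))
        noise = torch.randn_like(x0)
        alpha_bar = torch.cos(t / 1000 * torch.pi / 2) ** 2  # toy schedule
        x_t = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * noise
        for i, emb in enumerate(label_embs):
            eps_hat = denoiser(x_t, t, emb)  # eps-prediction network
            scores[i] += torch.mean((eps_hat - noise) ** 2).item()
    return int(scores.argmin())  # label whose conditioning denoises best

# Toy usage with a random stand-in denoiser, just to show the interface:
dummy = lambda x_t, t, emb: torch.randn_like(x_t)
x0 = torch.randn(1, 4, 64, 64)                     # a latent-space "image"
labels = [torch.randn(77, 768) for _ in range(3)]  # stand-in text embeddings
print(diffusion_classify(dummy, x0, labels))
```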

Diffusion Models Already Have a Semantic Latent Space, ICLR 2023
[Paper] [Code]
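
This entry (like the Riemannian-geometry paper below and the latent-direction papers in the 2024 section) edits generations by shifting the U-Net's bottleneck activation, often called h-space, along a semantic direction. A minimal sketch with diffusers, assuming an SD 1.x checkpoint; the random `direction` is a placeholder for a genuinely discovered one.

```python
# h-space style editing sketch: add a fixed direction to the U-Net
# bottleneck at every denoising step via a forward hook.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
direction = torch.randn(1, 1280, 8, 8)  # placeholder for a discovered direction

def shift_bottleneck(module, inputs, output):
    # Returning a value from a forward hook replaces the module's output.
    return output + 2.0 * direction.to(output.device, output.dtype)

handle = pipe.unet.mid_block.register_forward_hook(shift_bottleneck)
image = pipe("a photo of a cat", num_inference_steps=30).images[0]
handle.remove()  # restore the unedited model
```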

Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry, NeurIPS 2023
[Paper] [Code]

Emergent Correspondence from Image Diffusion, NeurIPS 2023
[Paper] [Code]

2022 and Earlier

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
[Paper] [Code]
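
Textual Inversion's core loop is small enough to sketch: freeze the generator and optimize a single new token embedding against the ordinary denoising loss. Everything below (the schedule, the `denoiser` interface, the embedding size) is an illustrative stand-in, not the paper's exact setup.

```python
# Learn one new "word": only `new_emb` receives gradients; the
# denoiser and the rest of the prompt embeddings stay frozen.
import torch

new_emb = torch.nn.Parameter(torch.randn(768) * 0.02)
opt = torch.optim.AdamW([new_emb], lr=5e-3)

def training_step(denoiser, x0, prompt_embs, slot):
    embs = prompt_embs.clone()
    embs[:, slot] = new_emb  # splice the learned token into the prompt
    t = torch.randint(0, 1000, (x0.shape[0],))
    noise = torch.randn_like(x0)
    alpha_bar = (torch.cos(t.float() / 1000 * torch.pi / 2) ** 2).view(-1, 1, 1, 1)
    x_t = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * noise
    loss = torch.mean((denoiser(x_t, t, embs) - noise) ** 2)
    opt.zero_grad()
    loss.backward()  # gradient flows only into new_emb
    opt.step()
    return loss.item()
```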

What the DAAM: Interpreting Stable Diffusion Using Cross Attention
[Paper] [Code]
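
DAAM, like Prompt-to-Prompt and Attend-and-Excite above, works by reading (or steering) the U-Net's cross-attention maps. With diffusers, one way to capture those maps is to install a custom attention processor; the sketch below follows the shape of diffusers' basic AttnProcessor but is not DAAM's actual code.

```python
# Record cross-attention probabilities during sampling by swapping in
# a minimal attention processor on every attention layer of the U-Net.
import torch
from diffusers import StableDiffusionPipeline

class StoreCrossAttnProcessor:
    def __init__(self, store: list):
        self.store = store

    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, **kwargs):
        is_cross = encoder_hidden_states is not None
        ctx = encoder_hidden_states if is_cross else hidden_states
        query = attn.head_to_batch_dim(attn.to_q(hidden_states))
        key = attn.head_to_batch_dim(attn.to_k(ctx))
        value = attn.head_to_batch_dim(attn.to_v(ctx))
        probs = attn.get_attention_scores(query, key, attention_mask)
        if is_cross:  # shape: (batch * heads, image_tokens, text_tokens)
            self.store.append(probs.detach().cpu())
        out = attn.batch_to_head_dim(torch.bmm(probs, value))
        out = attn.to_out[0](out)   # output projection
        return attn.to_out[1](out)  # dropout

maps = []
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.unet.set_attn_processor(StoreCrossAttnProcessor(maps))
_ = pipe("a dog wearing a red hat", num_inference_steps=20)
# Entries of `maps` can now be reshaped into per-token heatmaps.
```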

CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders, NeurIPS 2022
[Paper] [Code]

Understanding Diffusion Models: A Unified Perspective, 2022
[Paper]

Discovering Latent Concepts Learned in BERT, ICLR 2022
[Paper] [Code]

GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019
[Paper] [Code]

GANSpace: Discovering Interpretable GAN Controls, NeurIPS 2020
[Paper] [Code]