Understanding Generalizability of Diffusion Models through Low-dimensional Distribution Learning
This is the official implementation of the following papers:
- Paper 1: The Emergence of Reproducibility and Consistency in Diffusion Models (NeurIPS 2023 Workshop Best Paper; ICML 2024)
- Paper 2: Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering
The codebase mainly focuses on reproducing the following figures from these two papers:
- "Memorization" and "Generalization" regimes for unconditional diffusion models. (Figure 2 in Paper 1)
- Convergence of the optimal denoiser. (Figure 4 Left in Paper 1)
- Similarity among different unconditional diffusion model settings in generalization regime. (Figure 6 and Figure 12 in Paper 1)
- Low-rank property of the denoising autoencoder of trained diffusion models. (Figure 3 in Paper 2)
For the implementation of Figure 1 in Paper 2 (the correspondence between the singular vectors of the DAE Jacobian and semantic image attributes), please refer to our concurrent work Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing; its codebase can be found here.
Requirements
conda env create -f edm/environment.yml -n generalizability
conda activate generalizability
"Memorization" and "Generalization" regimes for unconditional diffusion models.
Slightly differently from Figure 2 in Paper 1, the released code uses a fine-tuning setting: the training dataset is generated by a pre-trained diffusion model (the teacher model).
Create Dataset
Create a dataset of a specific size as follows.
Generate images from the teacher model:
python edm/generate.py --outdir=out --seeds=0-49999 --batch=64 --network=https://nvlabs-fi-cdn.nvidia.com/edm/pretrained/edm-cifar10-32x32-uncond-vp.pkl
Create datasets of different sizes:
python edm/dataset_tool.py --source=out --max-images=128 --dest=datasets/synthetic-cifar10-32x32-n128.zip
Training
torchrun --standalone --nproc_per_node=1 edm/train.py --outdir=training --data=datasets/synthetic-cifar10-32x32-n128.zip --cond=0 --arch=ddpmpp --duration 50 --batch 128 --snap 500 --dump 500 --precond vp --model_channels 64
Evaluation
All released checkpoints can be found here, and all released training datasets can be found here.
Generate images from the trained diffusion models. Note that these seeds differ from the ones (0-49999) used to generate the training images from the teacher model:
python edm/generate.py --outdir=evaluation/ddpm-dim64-n64 --seeds=100000-109999 --batch=64 --network=training/ckpt/ddpm-dim64-n64.pkl
python edm/generate.py --outdir=evaluation/ddpm-dim128-n64 --seeds=100000-109999 --batch=64 --network=training/ckpt/ddpm-dim128-n64.pkl
Calculate SSCD features:
torchrun --standalone --nproc_per_node=1 edm/sscd.py feature --images ./evaluation/ddpm-dim64-n64 --features ./evaluation/sscd-dim64-n64.npz
torchrun --standalone --nproc_per_node=1 edm/sscd.py feature --images ./evaluation/ddpm-dim128-n64 --features ./evaluation/sscd-dim128-n64.npz
torchrun --standalone --nproc_per_node=1 edm/sscd.py feature --images datasets/synthetic-cifar10-32x32-n64.zip --features ./evaluation/sscd-training-dataset-synthetic-cifar10-32x32-n64.npz
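For reference, SSCD features are descriptor vectors from a self-supervised copy-detection model. Below is a minimal sketch of what the extraction step computes, assuming the publicly released `sscd_disc_mixup.torchscript.pt` checkpoint and standard ImageNet normalization; the repo's `edm/sscd.py` additionally handles batching, the dataset zip format, and saving to `.npz`:

```python
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

# Assumptions: the public SSCD torchscript checkpoint and ImageNet
# normalization; the actual preprocessing in edm/sscd.py may differ.
model = torch.jit.load("sscd_disc_mixup.torchscript.pt").eval()
preprocess = transforms.Compose([
    transforms.Resize(288),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def sscd_feature(path):
    """L2-normalized SSCD descriptor for one image file."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        f = model(x)  # (1, feature_dim)
    return torch.nn.functional.normalize(f, dim=1).squeeze(0).numpy()

# features = np.stack([sscd_feature(p) for p in image_paths])
# np.savez("features.npz", features=features)
```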
Compute the reproducibility (RP) score:
python edm/sscd.py rpscore --source ./evaluation/sscd-dim128-n64.npz --target ./evaluation/sscd-dim64-n64.npz
Compute the generalization score:
python edm/sscd.py mscore --source ./evaluation/sscd-dim128-n64.npz --target ./evaluation/sscd-training-dataset-synthetic-cifar10-32x32-n64.npz
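Both scores reduce to cosine similarities between SSCD features. Here is a minimal sketch, assuming each `.npz` stores one array of L2-normalized features (the key name `features` is a guess) and using the paper's 0.6 similarity threshold: the RP score matches the i-th sample of each model (same initial noise), while the memorization-style score compares each generated sample against its nearest training image.

```python
import numpy as np

THRESH = 0.6  # SSCD similarity threshold used in the paper

def rp_score(src, tgt):
    """Fraction of same-seed sample pairs whose SSCD similarity
    exceeds the threshold (samples are matched by index)."""
    a, b = np.load(src)["features"], np.load(tgt)["features"]
    sims = (a * b).sum(axis=1)  # cosine sims of matched pairs
    return (sims > THRESH).mean()

def mem_score(gen, train):
    """Fraction of generated samples whose nearest training image
    (by SSCD similarity) exceeds the threshold."""
    g, t = np.load(gen)["features"], np.load(train)["features"]
    sims = g @ t.T  # all pairwise cosine sims
    return (sims.max(axis=1) > THRESH).mean()
```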
Convergence of the optimal denoiser.
We implement the optimal denoiser (derived from the score function of the empirical training distribution) and compare the RP score between a trained diffusion model and this optimal denoiser.
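For a finite training set, the optimal denoiser (the posterior mean E[x_0 | x_t]) has a closed form: a softmax-weighted average of the training images. A minimal sketch, assuming the EDM convention x_sigma = x_0 + sigma * n (the VP parameterization differs only by a time-dependent scaling):

```python
import torch

def optimal_denoiser(x_t, sigma, train_images):
    """Closed-form posterior mean E[x_0 | x_t] under the empirical
    distribution of `train_images` (shape (N, C, H, W)), assuming
    x_t = x_0 + sigma * n with n ~ N(0, I)."""
    N = train_images.shape[0]
    Y = train_images.reshape(N, -1)                   # (N, d)
    diff = x_t.reshape(1, -1) - Y                     # (N, d)
    log_w = -(diff ** 2).sum(dim=1) / (2 * sigma ** 2)
    w = torch.softmax(log_w, dim=0)                   # posterior weights
    return (w[:, None] * Y).sum(dim=0).reshape(x_t.shape)
```

As sigma approaches 0 the weights concentrate on the nearest training image, which is why sampling with this denoiser reproduces the training set (the memorization regime).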
Generate images from the optimal denoiser:
python edm/generate.py --outdir=evaluation/memorization-n64 --seeds=100000-109999 --batch=64 --optimal_denoiser --dataset=datasets/synthetic-cifar10-32x32-n64.zip --network=https://nvlabs-fi-cdn.nvidia.com/edm/pretrained/edm-cifar10-32x32-uncond-vp.pkl
Calculate SSCD features:
torchrun --standalone --nproc_per_node=1 edm/sscd.py feature --images ./evaluation/memorization-n64 --features ./evaluation/sscd-memorization-n64.npz
Compute the RP score between the trained diffusion model and the optimal denoiser:
python edm/sscd.py rpscore --source ./evaluation/sscd-dim128-n64.npz --target ./evaluation/sscd-memorization-n64.npz
Similarity among different unconditional diffusion model settings in generalization regime.
We provide generated samples from these different diffusion models here. To generate new samples, go through each model's own GitHub repo and use the same initial noise for generation.
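The key requirement is that sample i in every model's output is produced from the same initial latent; edm implements this with its per-seed StackedRandomGenerator. A minimal sketch of the convention (not edm's exact generator):

```python
import torch

def initial_noise(seed, shape=(3, 32, 32)):
    """One latent per sample index; reusing the same seed across
    repos yields identical initial noise."""
    g = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=g)
```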
Calculate SSCD features:
torchrun --standalone --nproc_per_node=1 edm/sscd.py feature --images ./samples/ddpmv4 --features ./evaluation/sscd-ddpmv4.npz
torchrun --standalone --nproc_per_node=1 edm/sscd.py feature --images ./samples/ddpmv6 --features ./evaluation/sscd-ddpmv6.npz
Compute the reproducibility score:
python edm/sscd.py rpscore --source ./evaluation/sscd-ddpmv4.npz --target ./evaluation/sscd-ddpmv6.npz
Low-rank property of the denoising autoencoder of trained diffusion models.
These figures illustrate the low-dimensionality of the Jacobian of the denoising autoencoder (DAE) for diffusion models trained on real datasets and on the Mixture of Low-Rank Gaussians (MoLRG) distribution.
To train a diffusion model on MoLRG data:
torchrun --standalone --nproc_per_node=1 edm/trainMoLRG.py --outdir training --path datasets --img_res 4 --class_num 2 --per_class_dim 7 --sample_per_class 350 --embed_channels 128
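For context, a MoLRG dataset places each class on its own low-dimensional subspace: class k has a random orthonormal basis U_k with per_class_dim columns, and samples are x = U_k a with a ~ N(0, I). A minimal sketch (parameter names mirror the flags above; the exact construction in edm/trainMoLRG.py, e.g. the channel layout, may differ):

```python
import torch

def sample_molrg(img_res=4, class_num=2, per_class_dim=7,
                 sample_per_class=350, seed=0):
    """Mixture of Low-Rank Gaussians: each class lives in a random
    `per_class_dim`-dimensional subspace of the ambient space."""
    g = torch.Generator().manual_seed(seed)
    d = img_res * img_res  # ambient dim (single channel assumed)
    data, labels = [], []
    for k in range(class_num):
        A = torch.randn(d, per_class_dim, generator=g)
        U, _ = torch.linalg.qr(A)  # random orthonormal basis U_k
        a = torch.randn(sample_per_class, per_class_dim, generator=g)
        data.append(a @ U.T)       # samples lie in span(U_k)
        labels += [k] * sample_per_class
    return (torch.cat(data).reshape(-1, 1, img_res, img_res),
            torch.tensor(labels))
```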
To evaluate the rank of the Jacobian:
torchrun --standalone --nproc_per_node=1 edm/jacobian.py --network_pkl <path/to/pkl>
e.g. torchrun --standalone --nproc_per_node=1 edm/jacobian.py --network_pkl https://nvlabs-fi-cdn.nvidia.com/edm/pretrained/edm-cifar10-32x32-uncond-vp.pkl
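What this measures is the numerical rank of the DAE Jacobian d x_hat(x_t; sigma) / d x_t at sampled noisy inputs. A minimal sketch, assuming a `denoiser(x, sigma)` callable (in this repo that would be the preconditioned EDM network); for CIFAR-sized inputs the full 3072x3072 Jacobian is small enough to materialize:

```python
import torch

def jacobian_numerical_rank(denoiser, x_t, sigma, tol=1e-2):
    """Numerical rank of the DAE Jacobian at one noisy input: count
    singular values above `tol` times the largest one. The threshold
    rule is an assumption; edm/jacobian.py may use another."""
    d = x_t.numel()
    J = torch.autograd.functional.jacobian(
        lambda x: denoiser(x, sigma), x_t)
    s = torch.linalg.svdvals(J.reshape(d, d))
    return int((s > tol * s[0]).sum())
```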
Notably, since our codebase is built upon NVlabs/edm, it is compatible with all training checkpoints released in their repo, which can be found here and here.
Acknowledgements
This repository is heavily based on NVlabs/edm.
BibTeX
@inproceedings{
zhang2024the,
title={The Emergence of Reproducibility and Consistency in Diffusion Models},
author={Huijie Zhang and Jinfan Zhou and Yifu Lu and Minzhe Guo and Peng Wang and Liyue Shen and Qing Qu},
booktitle={Forty-first International Conference on Machine Learning},
year={2024},
url={https://openreview.net/forum?id=HsliOqZkc0}
}
@article{wang2024diffusion,
title={Diffusion models learn low-dimensional distributions via subspace clustering},
author={Wang, Peng and Zhang, Huijie and Zhang, Zekai and Chen, Siyi and Ma, Yi and Qu, Qing},
journal={arXiv preprint arXiv:2409.02426},
year={2024}
}