GitHub - humansensinglab/PATMAT: [ICCV 2023] PATMAT Person Aware Tuning of Mask Aware Transformer for Face Inpainting

PATMAT enables personalization of the Mask-Aware Transformer (MAT) model, given reference images of a face.

Description

Our two-step framework, PAT and MAT, builds extensively on the paper and code of Pivot Tuning (PTI) and on the paper and code of MAT.

Getting Started

Installation

Clone the repository.
git clone https://github.com/humansensinglab/PATMAT
Install the dependencies.
- Python 3.7
- PyTorch 1.7.1
- Cuda 11.0
- Other packages
  pip install -r requirements.txt
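
As a quick sanity check after installation, the sketch below compares the running interpreter and the installed PyTorch build against the versions listed above. The helper names `matches_pinned` and `report_environment` are ours and are not part of the repository; the PyTorch and CUDA checks only run if `torch` is importable.

```python
import importlib.util
import sys


def matches_pinned(found: str, pinned: str) -> bool:
    """Return True if `found` begins with the same version components as `pinned`.

    Local build suffixes such as "+cu110" are ignored.
    """
    found_parts = found.split("+")[0].split(".")
    pinned_parts = pinned.split(".")
    return found_parts[: len(pinned_parts)] == pinned_parts


def report_environment() -> dict:
    """Report whether Python, PyTorch, and CUDA match the versions listed above."""
    python_version = "{}.{}".format(*sys.version_info[:2])
    report = {"python": matches_pinned(python_version, "3.7")}
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = matches_pinned(torch.__version__, "1.7.1")
        # torch.version.cuda is None for CPU-only builds.
        report["cuda"] = matches_pinned(torch.version.cuda or "", "11.0")
    return report
```

Calling `report_environment()` returns a dictionary such as `{"python": ..., "torch": ..., "cuda": ...}`, where a `False` entry means the installed version differs from the one listed above.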

Pretrained Models

Please download the pretrained models from the following links.

Auxiliary Models for PAT

Various auxiliary models are needed for the PAT inversion task.
These include the StyleGAN generator and the pre-trained models used for loss computation.

Path	Description
FFHQ StyleGAN	StyleGAN2-ada model trained on FFHQ with 1024x1024 output resolution.
Dlib alignment	Dlib alignment used for image preprocessing.
FFHQ e4e encoder	Pretrained e4e encoder. Used for StyleCLIP editing.
Glint360K can be downloaded from this link: https://drive.google.com/file/d/1pRDYnndOUemVrZaFV6ZGpH3eQowQpQlL/view?usp=sharing

Auxiliary Models for MAT

The MAT repo provides models trained on CelebA-HQ, FFHQ, and Places365-Standard at 512x512 resolution. Download the models from OneDrive and put them into the 'pretrained' directory.

Note: The StyleGAN model is used directly from the official stylegan2-ada-pytorch implementation. For StyleCLIP pretrained mappers, please see StyleCLIP's official repository.

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models. However, you may use your own paths by changing the necessary values in configs/paths_config.py.
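
Before launching a run, it can help to confirm that the downloads actually landed where the code will look for them. The sketch below assumes the default `pretrained_models` directory mentioned above. The entries in `EXPECTED_FILES` are hypothetical placeholders, not the repository's real file names, so replace them with the names set in your paths config.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical file names, for illustration only.
EXPECTED_FILES = ["ffhq.pkl", "align.dat", "e4e_ffhq_encode.pt"]


def missing_models(root: str, names: Iterable[str]) -> List[str]:
    """Return the names that do not exist as files under `root`."""
    root_path = Path(root)
    return [name for name in names if not (root_path / name).is_file()]
```

For example, `missing_models("pretrained_models", EXPECTED_FILES)` returns the subset of names still to be downloaded, or an empty list when everything is in place.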

Running PAT

The main training script is PAT/scripts/run_pat.py. The script receives aligned and cropped images from the paths configured in the "Input info" subsection of configs/paths_config.py, where you can also find the expected data structure and file naming convention. I hope to make this step more user-friendly, but for now please bear with me. Results are saved to the directories listed under "Dirs for output files" in configs/paths_config.py. This includes inversion latent codes and tuned generators. The hyperparameters for the inversion task can be found in configs/hyperparameters.py.

Running MAT

To inpaint desired images after tuning your network with PAT, you can run:

python generate_image.py --network model_path --dpath data_path --refpath reference_path --outdir out_path [--mpath mask_path]

where model_path is the path to PAT's output model and reference_path is the path to a few reference images of the identity you are inpainting (these can be a subset of PAT's training data).
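
When inpainting many identities, it may be convenient to assemble this command from Python. The following is a minimal sketch that uses only the flags shown in the command above. The function names `build_inpaint_command` and `run_inpainting` are ours, and any paths passed in are placeholders that you supply.

```python
import subprocess
import sys
from typing import List, Optional


def build_inpaint_command(
    network: str,
    dpath: str,
    refpath: str,
    outdir: str,
    mpath: Optional[str] = None,
) -> List[str]:
    """Assemble the generate_image.py call using the flags shown above."""
    command = [
        sys.executable, "generate_image.py",
        "--network", network,
        "--dpath", dpath,
        "--refpath", refpath,
        "--outdir", outdir,
    ]
    # --mpath is optional in the command above, so add it only when given.
    if mpath is not None:
        command += ["--mpath", mpath]
    return command


def run_inpainting(**kwargs) -> None:
    """Run the command; call this from the directory containing generate_image.py."""
    subprocess.run(build_inpaint_command(**kwargs), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.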

Credits

Pivot Tuning and implementation:
https://github.com/richzhang/PerceptualSimilarity

MAT model and implementation:
https://github.com/omertov/encoder4editing

StyleGAN2-ada model and implementation:
https://github.com/NVlabs/stylegan2-ada-pytorch
Copyright © 2021, NVIDIA Corporation.
Nvidia Source Code License https://nvlabs.github.io/stylegan2-ada-pytorch/license.html

Acknowledgments

This repository's structure is based on MAT and Pivot Tuning.

Contact

For any inquiries, please contact us by email: sam(dot)motamed(at)insait(dot)ai

Citation

If you use this code for your research, please cite:

@InProceedings{Motamed_2023_ICCV,
    author    = {Motamed, Saman and Xu, Jianjin and Wu, Chen Henry and H\"ane, Christian and Bazin, Jean-Charles and De la Torre, Fernando},
    title     = {PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22778-22787}
}