GitHub - azrealwang/iwmfdiff
Iterative Window Mean Filter: Thwarting Diffusion-based Adversarial Purification
Hanrui Wang, Ruoxi Sun, Cunjian Chen, Minhui Xue, Lay-Ki Soon, Shuo Wang, Zhe Jin
Face authentication systems have brought significant convenience and advanced developments, yet they have become unreliable due to their sensitivity to inconspicuous perturbations, such as adversarial attacks. Existing defenses often exhibit weaknesses when facing various attack algorithms and adaptive attacks or compromise accuracy for enhanced security. To address these challenges, we have developed a novel and highly efficient non-deep-learning-based image filter called the Iterative Window Mean Filter (IWMF) and proposed a new framework for adversarial purification, named IWMF-Diff, which integrates IWMF and denoising diffusion models. These methods can function as pre-processing modules to eliminate adversarial perturbations without necessitating further modifications or retraining of the target system. We demonstrate that our proposed methodologies fulfill four critical requirements: preserved accuracy, improved security, generalizability to various threats in different settings, and better resistance to adaptive attacks. This performance surpasses that of the state-of-the-art adversarial purification method, DiffPure.
- This is also an efficient implementation of diffusion-based purification for face recognition, CIFAR-10, and ImageNet (use --lambda_0 0).
Contents
- Introduction
- Main Requirements
- Installation
- Data Preparation
- Pretrained Models
- Usage
- Results
- Citation
- Acknowledgement
- Contacts
Introduction
IWMF-Diff defends the authentication system as follows:
- Step 1: Blur the input image with IWMF. Perturbations on adversarial examples are largely removed, but facial features are partially distorted.
- Step 2: Restore the image with DDRM. This improves robustness for both genuine images and adversarial examples.
- Step 3: Verify the pre-processed image with a regular authentication system. Users do not need to re-enroll when the defense is added.
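To give an intuition for the blurring in Step 1, here is a minimal sketch of a plain window mean filter. It is not the authors' IWMF: it makes a single pass over non-overlapping windows of a grayscale image stored as nested lists, and the function name `window_mean_blur` and its parameter `s` (window size) are hypothetical names chosen for illustration.

```python
def window_mean_blur(img, s=3):
    """Replace each non-overlapping s x s window of a 2D image with its mean.

    Simplified illustration only, not the IWMF algorithm from the paper.
    `img` is a list of equal-length rows of numbers.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # copy so the input is left untouched
    for top in range(0, h, s):
        for left in range(0, w, s):
            # Windows at the border are clipped to the image size.
            rows = range(top, min(top + s, h))
            cols = range(left, min(left + s, w))
            mean = sum(img[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


if __name__ == "__main__":
    # A single bright pixel is spread evenly over its window.
    print(window_mean_blur([[0, 0, 9], [0, 0, 0], [0, 0, 0]], s=3))
```

Averaging inside a window spreads a small, localized perturbation over its neighbours, which is also why fine facial detail is lost and a restoration step follows.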
Main Requirements
- Python (3.9.18)
- torch (2.1.2+cu118)
- torchvision (0.16.2+cu118)
- PyYAML (6.0.1)
- tqdm (4.66.2)
- facenet-pytorch (2.5.3)
The versions in parentheses have been tested.
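The following sketch compares the installed package versions with the tested ones listed above. It uses only the standard library; the helper name `check_versions` is hypothetical, and local build tags such as `+cu118` are ignored in the comparison.

```python
from importlib import metadata

# Tested versions from the list above (build tags such as +cu118 omitted).
TESTED = {
    "torch": "2.1.2",
    "torchvision": "0.16.2",
    "PyYAML": "6.0.1",
    "tqdm": "4.66.2",
    "facenet-pytorch": "2.5.3",
}


def check_versions(tested=TESTED):
    """Return {package: (installed_or_None, tested, matches)}."""
    report = {}
    for name, want in tested.items():
        try:
            # Strip a local build tag such as "+cu118" before comparing.
            have = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            have = None  # package is not installed
        report[name] = (have, want, have == want)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK" if ok else "differs"
        print(f"{name}: installed={have} tested={want} {status}")
```

A differing version is not necessarily a problem; it only means that combination has not been tested.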
Installation
git clone https://github.com/azrealwang/iwmfdiff.git
cd iwmfdiff
pip3 install -r requirements.txt
If a GPU is available:
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
Otherwise:
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2
Data Preparation
Image names must follow the pattern 00000_0.jpg, where 00000 is the image id and _0 is the user id/class/label. Image ids must be unique and auto-increment from 00000. The extension .jpg can be replaced by any image file format.
20 target and source images have been prepared in imgs/ for running demos.
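As a sanity check before running the demos, the naming rule can be verified with a short script. This is a minimal sketch with hypothetical helper names. It assumes the label is an integer and reads "auto-increment" as ids running 0, 1, 2, ... without gaps.

```python
from pathlib import Path


def parse_image_name(path):
    """Split a name like '00000_0.jpg' into (image_id, label) as integers."""
    stem = Path(path).stem                 # e.g. '00000_0'
    image_id, label = stem.split("_")      # exactly one underscore expected
    return int(image_id), int(label)       # assumption: label is an integer


def check_image_ids(paths):
    """Return True if image ids are unique and run 0, 1, 2, ... without gaps."""
    ids = sorted(parse_image_name(p)[0] for p in paths)
    return ids == list(range(len(ids)))


if __name__ == "__main__":
    names = ["00000_0.jpg", "00001_0.png", "00002_1.jpg"]
    print([parse_image_name(n) for n in names])
    print(check_image_ids(names))
```

A name with a missing or extra underscore raises `ValueError`, which flags the offending file early.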
Pretrained Models
- InsightFace: iresnet100 pretrained using the CASIA dataset; automatically downloaded
- FaceNet: InceptionResnetV1 pretrained using the VGG2FACE dataset; automatically downloaded
- Denoising diffusion models: pretrained using the CelebA-HQ dataset; automatically downloaded
The denoising diffusion model can be slow to download. In that case, manually download the pretrained model from Google Drive and place it at exp/logs/celeba/celeba_hq.ckpt.
Usage
Regular attack
python attack.py --attack APGD --norm Linf --eps 0.03 --model insightface --thres 0.6351
Purify adversarial examples
python defense.py --lambda_0 0.25 --sigma_y 0.15 --folder APGD-Linf-0.03-insightface-0.6351 --input imgs/adv --eval_adv --model insightface --thres 0.6351
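In the example above, the value passed to --folder appears to be built from the attack settings (attack, norm, eps, model, threshold). The sketch below reproduces that pattern and assembles the matching defense.py command. The naming rule is inferred from this single example and is an assumption, not confirmed behavior of attack.py; both helper names are hypothetical.

```python
def adv_folder_name(attack, norm, eps, model, thres):
    """Build a folder name such as 'APGD-Linf-0.03-insightface-0.6351'.

    Assumption: the pattern is inferred from the README example only.
    """
    return f"{attack}-{norm}-{eps}-{model}-{thres}"


def defense_command(folder, lambda_0=0.25, sigma_y=0.15,
                    model="insightface", thres=0.6351):
    """Return the argument list for purifying the adversarial examples in `folder`."""
    return [
        "python", "defense.py",
        "--lambda_0", str(lambda_0),
        "--sigma_y", str(sigma_y),
        "--folder", folder,
        "--input", "imgs/adv",
        "--eval_adv",
        "--model", model,
        "--thres", str(thres),
    ]


if __name__ == "__main__":
    folder = adv_folder_name("APGD", "Linf", 0.03, "insightface", 0.6351)
    print(" ".join(defense_command(folder)))
```

The returned list can be passed to `subprocess.run` from the repository root if the commands are scripted.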
Purify genuine images
python defense.py --lambda_0 0.25 --sigma_y 0.15 --folder target --input imgs --eval_genuine --model insightface --thres 0.6351
Adaptive attack
python attack.py --attack Adaptive --norm Linf --eps 0.03 --defense 0.25 0.15 --model insightface --thres 0.6351
A subset of the available options:
- --model allows facenet or insightface
- --attack allows APGD (white-box attack), APGD_EOT (adaptive attack), Square (black-box attack), or Adaptive (strong adaptive attack)
- --eval_genuine runs the task that computes FRR for genuine images before or after purification
- --eval_adv runs the task that computes FAR and FRR for adversarial examples before or after purification
For other options, refer to --help.
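For readers unfamiliar with the two metrics, the sketch below shows how FRR and FAR follow from matching scores and a threshold such as the value given to --thres. It assumes the scores are similarities, so a comparison is accepted when the score is at or above the threshold; the repository may use a different convention, and the function names are hypothetical.

```python
def frr(genuine_scores, thres):
    """False rejection rate (%): genuine comparisons scored below the threshold."""
    rejected = sum(1 for s in genuine_scores if s < thres)
    return 100.0 * rejected / len(genuine_scores)


def far(attack_scores, thres):
    """False acceptance rate (%): adversarial comparisons scored at or above the threshold."""
    accepted = sum(1 for s in attack_scores if s >= thres)
    return 100.0 * accepted / len(attack_scores)


if __name__ == "__main__":
    thres = 0.6351
    print(frr([0.9, 0.7, 0.5, 0.8], thres))  # one of four genuine pairs rejected
    print(far([0.2, 0.7], thres))            # one of two attacks accepted
```

A good defense keeps FRR low on genuine images while driving FAR down on adversarial examples.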
Import for pre-processing
from functions.defense import iwmfdiff
def iwmfdiff(
imgs_input: Tensor,
lambda_0: float,
sigma_y: float,
s: int = 3,
batch: int = 1,
seed: int = None,
data: str = 'celeba_hq', # option for celeba_hq, cifar10, imagenet_256
) -> Tensor:
CIFAR-10 and ImageNet
The defense settings (i.e., lambda_0 and sigma_y) should be determined according to the trade-off between clean and robust accuracy.
If the download is slow, manually download the pretrained model from Google Drive and place it at the following path:
- CIFAR-10: exp/logs/diffusion_models_converted/ema_diffusion_cifar10_model/model-790000.ckpt
- ImageNet: exp/logs/imagenet/256x256_diffusion_uncond.pt
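When placing checkpoints manually, a quick check that each file sits at the expected path can save a failed run. The sketch below uses the paths given in this README; pairing them with the `data` option values (celeba_hq, cifar10, imagenet_256) is an assumption, and `missing_checkpoints` is a hypothetical helper.

```python
from pathlib import Path

# Expected checkpoint locations, copied from the paths in this README.
# Assumption: keys correspond to the `data` option of iwmfdiff.
CHECKPOINTS = {
    "celeba_hq": "exp/logs/celeba/celeba_hq.ckpt",
    "cifar10": "exp/logs/diffusion_models_converted/ema_diffusion_cifar10_model/model-790000.ckpt",
    "imagenet_256": "exp/logs/imagenet/256x256_diffusion_uncond.pt",
}


def missing_checkpoints(root="."):
    """Return the dataset names whose checkpoint file is absent under `root`."""
    root = Path(root)
    return [name for name, rel in CHECKPOINTS.items() if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the repository root.
    for name in missing_checkpoints():
        print(f"missing checkpoint for {name}: {CHECKPOINTS[name]}")
```

Only the checkpoint for the dataset actually in use needs to be present.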
Results (%)
Defense | FRR-Genuine | FAR-APGD | FAR-APGD_EOT | FAR-Square | FAR-Adaptive | Time Cost (s) |
---|---|---|---|---|---|---|
Insightface | 0.28 | 100 | 100 | 100 | N/A | N/A |
DiffPure | 5.00 | 17.4 | 17.6 | 20.4 | 99.4 | 3.41 |
IWMF | 6.36 | 9.2 | 7.6 | 28.8 | 80.4 | 0.36 |
IWMF-Diff | 3.22 | 6.6 | 5.0 | 19.8 | 77.4 | 3.79 |
Citation
@article{wang2024iterative,
title={Iterative Window Mean Filter: Thwarting Diffusion-based Adversarial Purification},
author={Wang, Hanrui and Sun, Ruoxi and Chen, Cunjian and Xue, Minhui and Soon, Lay-Ki and Wang, Shuo and Jin, Zhe},
journal={IEEE Transactions on Dependable and Secure Computing},
year={2024},
publisher={IEEE}
}
Acknowledgement
The implementation is partially inspired by:
- AutoAttack (adversarial examples generation)
- DDRM (image restoration)
Contacts
If you have any questions about our work, please do not hesitate to contact us by email.
Hanrui Wang: hanrui_wang@nii.ac.jp