Linear Semantics in Generative Adversarial Networks

We propose two image editing applications: few-shot Semantic Image Editing (SIE) and few-shot Semantic Conditional Sampling (SCS). Both are formulated as an optimization problem: find a latent vector that produces an image satisfying given semantic constraints. For SIE, we optimize the initial latent vector so that the generated image's semantics get close to the user-modified semantic mask. For SCS, we optimize several random latent vectors and return the one closest to the user-specified semantic mask.
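The optimization view above can be sketched in a few lines. The sketch below is a toy illustration, not the paper's implementation: the generator and the Linear Semantic Extractor are replaced by random linear maps `A` and `W`, and the user's semantic target by a random vector; all of these names are assumptions for illustration. The latent vector is updated by gradient descent on a semantic loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions for illustration, not the paper's models):
# "generator" features G(z) = A @ z, and an LSE-like linear classifier W.
A = rng.standard_normal((32, 8))    # latent (8-d) -> features (32-d)
W = rng.standard_normal((4, 32))    # features -> per-class semantic scores
M = W @ A                           # end-to-end linear map, for convenience
target = rng.standard_normal(4)     # stand-in for the user's semantic target

def semantic_loss(z):
    # Squared error between predicted semantics and the target.
    r = M @ z - target
    return float(r @ r)

def edit_latent(z, steps=300):
    # Gradient descent on z; the gradient of ||M z - t||^2 is 2 M^T (M z - t).
    lr = 1.0 / (2.0 * np.linalg.norm(M, 2) ** 2)  # step size below the stability limit
    for _ in range(steps):
        z = z - lr * 2.0 * M.T @ (M @ z - target)
    return z

z0 = rng.standard_normal(8)   # initial latent vector
z1 = edit_latent(z0)          # edited latent: semantic loss has decreased
```

In the actual method, the squared-error loss would be replaced by a segmentation loss between the LSE's predicted mask and the user-edited mask, and the gradient would flow through the pretrained GAN.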

Figure: comparison of color-based editing, SIE using UNet (baseline), SIE using 8-shot LSE (ours), and SIE using fully trained LSE.

Semantic Image Editing. In many cases, the user may want to control a GAN’s image generation process. For example, they might want to change the hair color of a generated face from blond to red, and drawing a red stroke on the hair is an easy way to specify that intent. However, without an explicit notion of semantics, minimizing the color difference to the stroke may not respect image semantics, leading to undesired changes in shape and texture. Leveraging LSE, we propose an approach called Semantic Image Editing (SIE) to enable semantics-aware image editing. Our method edits images more faithfully than the baseline, SIE(UNet). More results are shown in the video below.

Video: more Semantic Image Editing results using our web application.

Semantic Conditional Sampling (SCS) aims to synthesize an image that conforms to a given semantic mask. The baseline uses a pretrained segmentation network to guide sampling; our proposed method uses few-shot LSEs instead. Comparisons are shown below.
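The candidate-selection step of SCS can be illustrated with a toy sketch. Here the composition of generator and LSE is replaced by a random linear map followed by thresholding, and masks are binary over ten "pixels"; `W`, `predict_mask`, and `target` are all assumptions for illustration. The paper additionally refines each candidate latent by optimization, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in (an assumption for illustration): a "generator + LSE" that
# maps an 8-d latent vector to a binary semantic mask over 10 "pixels".
W = rng.standard_normal((10, 8))

def predict_mask(z):
    return W @ z > 0  # boolean mask, one entry per "pixel"

# User-specified mask (here we pick a reachable one from a random latent).
target = predict_mask(rng.standard_normal(8))

# Draw several random latents and keep the one whose predicted mask agrees
# with the target on the most pixels.
candidates = [rng.standard_normal(8) for _ in range(64)]
scores = [int((predict_mask(z) == target).sum()) for z in candidates]
best = candidates[int(np.argmax(scores))]
```

In the actual method, the agreement score would be a segmentation metric between the LSE's predicted mask and the user-specified mask, computed after optimizing each candidate latent.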