With the release of the new TensorFlow implementation of the unsupervised generative network U-GAT-IT, anyone can simply upload a selfie to the ‘Selfie 2 Waifu’ website to create their own AI-generated waifu-style anime character in seconds.
For best results, participants are advised to submit a clear passport-style picture with a plain background. Although the site can also generate male anime results, as the name suggests the system generally performs better on female selfies.
Proposed by researchers from South Korea’s Clova AI Research, NCSOFT, and the Boeing Korea Engineering and Technology Center, U-GAT-IT incorporates a new attention module and a new learnable normalization function in an end-to-end setup for unsupervised image-to-image translation.
“Our model guides the translation to focus on more important regions and ignore minor regions by distinguishing between source and target domains based on the attention map obtained by the auxiliary classifier,” the researchers explain. “These attention maps are embedded into the generator and discriminator to focus on semantically important areas, thus facilitating the shape transformation.”
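The attention maps described above follow the Class Activation Map (CAM) idea: the per-channel weights of the auxiliary domain classifier indicate which feature channels matter for telling source from target, and weighting the feature maps by them highlights the important regions. The sketch below is an illustrative NumPy reconstruction of that step, not the authors' TensorFlow code; the function name and the normalization to [0, 1] are assumptions.

```python
import numpy as np

def cam_attention_map(features, w):
    """CAM-style attention map (hedged sketch, not the official code).

    features: encoder feature maps of shape (H, W, C).
    w: per-channel weights (C,) learned by the auxiliary classifier that
       distinguishes source from target domain.

    Returns an (H, W) map highlighting regions that most influence the
    domain decision; U-GAT-IT embeds such maps into both the generator
    and the discriminator.
    """
    # Weighted sum over channels: each spatial location is scored by how
    # strongly its features activate the classifier's important channels.
    attn = np.tensordot(features, w, axes=([2], [0]))  # shape (H, W)
    attn = np.maximum(attn, 0.0)            # keep positive evidence only
    if attn.max() > 0:
        attn = attn / attn.max()            # assumed: rescale to [0, 1]
    return attn
```

Multiplying the feature maps by this mask then lets later layers concentrate on the semantically important areas (e.g. the face) while downplaying background.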
An Adaptive Layer-Instance Normalization (AdaLIN) normalization function helps the model to flexibly control changes in shape and texture without modifying model architecture or hyperparameters. This is essential for translating datasets that contain different amounts of geometry and style changes.
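AdaLIN interpolates between instance normalization (statistics per sample and per channel, which preserves style changes) and layer normalization (statistics per sample across all channels, which helps larger shape changes), using a learnable ratio ρ clipped to [0, 1]. A minimal NumPy sketch of the formula, assuming NHWC tensors and that the affine parameters γ and β are supplied externally (in the paper they are produced by fully connected layers):

```python
import numpy as np

def adalin(x, gamma, beta, rho, eps=1e-5):
    """Adaptive Layer-Instance Normalization (illustrative sketch).

    x: feature maps of shape (N, H, W, C).
    gamma, beta: affine parameters (scalars or broadcastable arrays);
                 assumed given here rather than computed from the input.
    rho: learnable mixing ratio, constrained to [0, 1].
    """
    # Instance norm: statistics over spatial dims, per sample and channel.
    mu_in = x.mean(axis=(1, 2), keepdims=True)
    var_in = x.var(axis=(1, 2), keepdims=True)
    x_in = (x - mu_in) / np.sqrt(var_in + eps)
    # Layer norm: statistics over spatial dims and channels, per sample.
    mu_ln = x.mean(axis=(1, 2, 3), keepdims=True)
    var_ln = x.var(axis=(1, 2, 3), keepdims=True)
    x_ln = (x - mu_ln) / np.sqrt(var_ln + eps)
    # The paper clips rho to [0, 1] after each update.
    rho = np.clip(rho, 0.0, 1.0)
    return gamma * (rho * x_in + (1.0 - rho) * x_ln) + beta
```

With ρ near 1 the layer behaves like instance norm (texture-oriented translation); with ρ near 0 it behaves like layer norm, which the authors found better suited to larger geometric changes.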
The researchers compared their method with existing models CycleGAN, UNIT, MUNIT, DRIT, AGGAN, and CartoonGAN, evaluating performance on four representative unpaired image translation datasets and a newly created unpaired image dataset consisting of real photos and anime artwork.
In a user study with 135 participants, the proposed method scored higher than existing state-of-the-art models on not only style transfer but also object transfiguration.
The paper U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation has been accepted as a conference paper at ICLR 2020 and is on arXiv. The code is on GitHub.
Journalist: Yuan Yuan | Editor: Michael Sarazen