
DeNA’s PSGAN Dresses Anime Characters at 1024×1024 Pixels

Founded in 1999, Tokyo-based DeNA has developed popular platforms and services for gaming, e-commerce, automotive, healthcare and entertainment content distribution. As AI continues transforming all things digital, DeNA is expanding its deep learning capabilities to support R&D on new techniques.

AI-powered automatic generation of characters is a time-saving and cost-effective solution for video game and anime producers. It can also provide creative inspiration. Although a variety of face image generation techniques have already been explored, processes for generating full-body anime characters have thus far been under-researched. The resolution and quality of existing results are relatively low and do not meet the high standards demanded in anime production.

To tackle these challenges, DeNA has proposed the Progressive Structure-conditional Generative Adversarial Network (PSGAN), an AI algorithm tailored to generating anime characters in specified poses and outfits. DeNA’s previous research in this area had generated images at 512 x 512 pixel resolution. The upgraded method raises the output resolution to 1024 x 1024 pixels.

PSGAN generator and discriminator architecture (NxN white boxes on the right represent learnable convolution layers operating on NxN spatial resolution. NxN gray boxes are non-learnable downsampling layers for structural conditions)

The framework architecture is based on NVIDIA’s progressive growing of GANs, extended with multiscale structural conditions. Starting with low-resolution images, the model first learns the large-scale structure of the image distribution and then shifts to finer details as both the generator and discriminator grow progressively. This enables stable, fast training and the creation of impressive high-resolution images. DeNA researchers also added novel high-dimensional pose maps at the corresponding resolution to each layer, which significantly improved training stability.
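As a rough illustration of the multiscale structural conditions, the sketch below builds one copy of a pose map per training resolution by repeated non-learnable downsampling. The use of 2x2 max pooling, the function names and the 16x16 toy input are assumptions made for this example, not details confirmed by the article.

```python
def max_pool_2x2(grid):
    """Halve a square 2D grid by taking the max of each 2x2 block."""
    n = len(grid)
    return [
        [
            max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            for c in range(0, n, 2)
        ]
        for r in range(0, n, 2)
    ]


def condition_pyramid(pose_map, min_size=4):
    """Map each resolution to a downsampled copy of the pose map.

    pose_map is a list of channels; each channel is a square 2D list
    whose side length is a power of two (an assumption of this sketch).
    """
    pyramid = {}
    current = pose_map
    size = len(current[0])
    while size >= min_size:
        pyramid[size] = current
        # Non-learnable downsampling: no trainable weights are involved.
        current = [max_pool_2x2(channel) for channel in current]
        size //= 2
    return pyramid


# Toy input: a single 16x16 channel with one keypoint at row 5, column 10.
channel = [[0.0] * 16 for _ in range(16)]
channel[5][10] = 1.0
pyramid = condition_pyramid([channel])
```

Max pooling is chosen here because a single active pixel survives every halving, whereas averaging would gradually wash it out. Each entry of `pyramid` could then be supplied to the generator and discriminator layers that operate at the matching resolution.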

The training dataset, Avatar Anime-Character, contains 47,400 background-removed images covering 69 different costumes and 600 varied poses. Each pose is defined by the exact coordinates of 20 keypoints.
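To make the keypoint representation concrete, here is a minimal sketch of turning 20 keypoint coordinates into a pose map with one channel per keypoint. The normalized (x, y) input format, the single-pixel encoding and the function name are assumptions for illustration; the article does not describe the dataset's actual file format.

```python
NUM_KEYPOINTS = 20  # keypoints per pose, as stated in the article


def keypoints_to_pose_map(keypoints, size):
    """Encode keypoints as a list of size x size channels.

    keypoints: list of (x, y) pairs normalized to [0, 1] (assumed format).
    Each channel is zero everywhere except at its keypoint's pixel.
    """
    pose_map = []
    for x, y in keypoints:
        channel = [[0.0] * size for _ in range(size)]
        # Clamp so that a coordinate of exactly 1.0 stays inside the grid.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        channel[row][col] = 1.0
        pose_map.append(channel)
    return pose_map


# Hypothetical pose: 20 keypoints spread along a horizontal line.
keypoints = [(i / NUM_KEYPOINTS, 0.5) for i in range(NUM_KEYPOINTS)]
pose = keypoints_to_pose_map(keypoints, size=64)
```

Giving every keypoint its own channel keeps the joints distinguishable from one another, which is one plausible reading of the "high-dimensional pose maps" mentioned above.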

PSGAN full-body anime character image generation process

Full-body anime generation involves interpolating latent variables to create outfits for an anime character, and then adding poses or actions to the character. Repeating these two steps results in variously dressed and dynamic anime characters.
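The latent interpolation step can be sketched as a linear blend between two latent vectors. The latent dimension, the random seed and the choice of linear (rather than, say, spherical) interpolation are assumptions for this example, and the trained generator itself is not shown.

```python
import random


def lerp(z_a, z_b, t):
    """Linearly blend two latent vectors; t=0 gives z_a, t=1 gives z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def interpolation_path(z_a, z_b, steps):
    """Return `steps` evenly spaced latent vectors from z_a to z_b."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]


LATENT_DIM = 8  # hypothetical size, kept small for readability
rng = random.Random(0)
z_start = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
z_end = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

# Each vector on the path would be fed to the generator with a pose map.
path = interpolation_path(z_start, z_end, steps=5)
```

Holding the pose map fixed while walking along `path` would vary the outfit, and holding the latent vector fixed while changing the pose map would animate a single outfit, which mirrors the two steps described above.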

Animation of generated characters with pose sequences at 1024×1024

PSGAN has a leg up on other full-body character generation software. It creates more natural images with much higher structure consistency than Progressive GAN, and generates smoother and clearer images than the Pose Guided Person Generation Network (PG2) proposed by Ma et al. PSGAN also outperforms these methods in image resolution (1024 x 1024 vs. 256 x 256).

The DeNA paper “Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks” was presented at the 2018 European Conference on Computer Vision (ECCV) workshop in Munich last month and is now available on arXiv.

Source: Synced China


Localization: Tingting Cao | Editor: Michael Sarazen

 
