Image-to-image style translation models have already been applied impressively to transfer natural images into artistic painting styles, change a scene’s weather or seasonal properties, generate cartoon versions of selfies and so on. Now, researchers from Hacettepe University and Middle East Technical University in Turkey have proposed a novel generator network specialized for illustrations in children’s books.
Existing approaches for image-to-image transfer include paired and unpaired transfer, optimization-based online methods and offline methods based on convolutional neural networks (CNNs) or generative adversarial networks (GANs). However, the current state-of-the-art unpaired image-to-image translation models can struggle to accurately transfer image content and style simultaneously. The new GANILLA (Generative Adversarial Networks for Image to Illustration Translation) generator network tackles this problem in the context of children’s illustrations, a domain the researchers argue is qualitatively different from art paintings and cartoons due in large part to its higher abstraction levels.
The researchers adopted an unpaired approach with two separate, unaligned sets: natural images and illustrations. They used a dataset comprising 9,448 illustrations from 363 different books and 24 different artists, such as Studio Ghibli’s My Neighbor Totoro, to generate illustrations from natural images by transferring illustration style.
To strike a balance between the images’ highly abstract style and their content (compositional elements such as mountains, people, toys, trees, etc.), the GANILLA generator network downsamples the feature map at each residual layer, while skip connections and upsampling merge low-level and high-level features to improve content transfer accuracy.
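The data flow described above can be illustrated with a toy, single-channel sketch in plain Python. This is not the authors' implementation: the real generator uses learned convolutional layers, whereas here average pooling stands in for downsampling, nearest-neighbour repetition stands in for upsampling, and an element-wise sum stands in for the skip-connection merge. The function names (downsample, upsample, merge, toy_generator) and the num_layers parameter are all hypothetical and chosen only for illustration.

```python
# Toy sketch only: feature maps are plain 2D lists of floats (one channel).
# Pooling, repetition and summation are stand-ins for learned layers.

def downsample(fmap):
    """Halve height and width with 2x2 average pooling."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, w - 1, 2)
        ]
        for i in range(0, h - 1, 2)
    ]


def upsample(fmap):
    """Double height and width by repeating each value (nearest neighbour)."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def merge(low_level, high_level):
    """Skip connection (assumed here to be an element-wise sum)."""
    return [
        [a + b for a, b in zip(row_low, row_high)]
        for row_low, row_high in zip(low_level, high_level)
    ]


def toy_generator(image, num_layers=3):
    # Downsampling path: keep every intermediate map for the skip connections.
    skips = [image]
    for _ in range(num_layers):
        skips.append(downsample(skips[-1]))

    # Upsampling path: start from the coarsest map and merge finer maps back in.
    x = skips.pop()
    while skips:
        x = merge(skips.pop(), upsample(x))
    return x


image = [[float(i + j) for j in range(8)] for i in range(8)]
output = toy_generator(image)
```

In this sketch the coarsest map carries the most abstract summary of the input, while each skip connection reintroduces finer spatial detail on the way back up, which is the intuition behind using the merge to preserve content.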
The researchers proposed a quantitative evaluation framework based on content and style classifiers as a metric for unpaired image-to-image illustration models. Compared on unpaired data with the state-of-the-art GAN methods CartoonGAN, CycleGAN and DualGAN, GANILLA outperformed the others in terms of style and content identifiability on this metric, as well as in an online human evaluation study.
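A classifier-based metric of this kind can be sketched as a simple accuracy computation. The sketch below assumes that a style classifier and a content classifier have already produced one label per generated image; the identifiability function, the label names and the example predictions are all invented for illustration and are not results or code from the paper.

```python
def identifiability(predicted_labels, expected_labels):
    """Fraction of generated images whose predicted label matches the expected one."""
    if len(predicted_labels) != len(expected_labels):
        raise ValueError("predicted and expected labels must have the same length")
    if not expected_labels:
        return 0.0
    matches = sum(p == e for p, e in zip(predicted_labels, expected_labels))
    return matches / len(expected_labels)


# Hypothetical classifier outputs for four generated images.
style_predictions = ["illustrator_a", "illustrator_a", "photo", "illustrator_a"]
target_styles = ["illustrator_a"] * 4      # style each image should carry

content_predictions = ["mountain", "forest", "sea", "forest"]
source_contents = ["mountain", "forest", "sea", "sea"]  # content of the input photos

style_score = identifiability(style_predictions, target_styles)
content_score = identifiability(content_predictions, source_contents)
```

Reporting the two scores separately makes the trade-off visible: a model that copies its input would score high on content and low on style, while one that ignores its input could do the reverse.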
The paper GANILLA: Generative Adversarial Networks for Image to Illustration Translation is on arXiv. Related code and pretrained models can be found on GitHub.
Author: Yuqing Li | Editor: Michael Sarazen