In an iconic scene from the 1997 film “Titanic,” Kate Winslet’s oceangoing character Rose asks charming artist Jack Dawson (Leonardo DiCaprio) to “draw me like one of your French girls,” that is, reclining nude on a chaise lounge. A flustered Jack obliges and this kindles a romance, but (spoiler alert) the ship hits an iceberg and Jack perishes protecting Rose from the icy North Atlantic waters. On a more robust vessel, who knows what additional portrait styles the young lovebirds might have explored? For example, with the help of a new AI algorithm, Jack could have drawn Rose as a cute cartoon character.
A group of researchers from the Chinese University of Hong Kong, Harbin Institute of Technology and Tencent have proposed a method to create such cartoon faces from photos of human faces via a novel CycleGAN model informed by facial landmarks.
In the paper Landmark Assisted CycleGAN for Cartoon Face Generation, the researchers first acknowledge the differences between natural human faces and typical cartoon faces, such as the big dramatic eyes that characterize the latter. Because it is difficult to generate cartoon faces using CycleGAN without explicit correspondence between the two domains, the researchers introduce face landmarks to define a landmark consistency loss and to guide the training of local discriminators. The approach produces promising results generating Bitmoji and Japanese manga style faces from human portrait photos.
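For context, CycleGAN is trained on unpaired images with a cycle consistency term: an image translated to the other domain and back should match the original. The sketch below is a minimal NumPy illustration, not the paper's code. The functions `to_cartoon` and `to_photo` are hypothetical stand-ins for the two generators, implemented here as simple horizontal flips.

```python
import numpy as np

def cycle_consistency_loss(x, reconstructed_x):
    """Mean absolute (L1) difference between an image and its round trip."""
    x = np.asarray(x, dtype=np.float64)
    reconstructed_x = np.asarray(reconstructed_x, dtype=np.float64)
    return float(np.mean(np.abs(x - reconstructed_x)))

# Hypothetical stand-ins for the two generators (assumption for illustration):
# each one just mirrors the image, so every facial feature changes position.
def to_cartoon(photo):
    return np.flip(photo, axis=1)

def to_photo(cartoon):
    return np.flip(cartoon, axis=1)

# A random array stands in for a 64x64 RGB portrait.
photo = np.random.default_rng(0).random((64, 64, 3))
round_trip = to_photo(to_cartoon(photo))

# The round trip reproduces the photo exactly, so the loss is zero even
# though the intermediate "cartoon" has its geometry rearranged.
loss = cycle_consistency_loss(photo, round_trip)
```

The toy mappings show why this term alone leaves room for geometric errors: it only checks the round trip, not where the eyes, nose and mouth end up in the intermediate cartoon.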
The researchers observed that if they used only CycleGAN for the face-to-cartoon conversion, geometric differences between source and target images could produce badly distorted results such as twisted faces, crooked teeth and displaced noses. To put everything back where it’s supposed to be on the face, they added more spatial structure information to the framework.
The researchers designed a landmark consistency loss and a landmark-matched global discriminator to enforce similarity of facial structure between the input photo and the generated cartoon. They also noted that using face landmarks to define local discriminators produced “visually more plausible result generation.” Even without paired training data, the new method ensures semantic properties can still be matched, literally an eye for an eye.
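The two landmark-based ideas can be sketched roughly as follows. This is a simplified NumPy illustration under stated assumptions, not the authors' implementation: it assumes landmarks are (x, y) pixel coordinates already available for both the photo and the generated cartoon, measures consistency as the mean Euclidean distance between them, and treats a local discriminator's input as a square patch cropped around one landmark. The function names, landmark values and patch size are all hypothetical.

```python
import numpy as np

def landmark_consistency_loss(source_landmarks, generated_landmarks):
    """Mean Euclidean distance, in pixels, between matching landmarks."""
    src = np.asarray(source_landmarks, dtype=np.float64)
    gen = np.asarray(generated_landmarks, dtype=np.float64)
    if src.shape != gen.shape:
        raise ValueError("landmark arrays must have the same shape")
    return float(np.mean(np.linalg.norm(src - gen, axis=-1)))

def crop_local_patch(image, center, size):
    """Crop a size x size patch around (x, y), clamped to the image bounds."""
    height, width = image.shape[:2]
    half = size // 2
    x, y = int(round(center[0])), int(round(center[1]))
    left = min(max(x - half, 0), max(width - size, 0))
    top = min(max(y - half, 0), max(height - size, 0))
    return image[top:top + size, left:left + size]

# Hypothetical landmarks (left eye, right eye, nose) as (x, y) coordinates.
photo_landmarks = np.array([[20.0, 24.0], [44.0, 24.0], [32.0, 38.0]])
# Pretend the generated cartoon shifted every landmark by (3, 4) pixels.
cartoon_landmarks = photo_landmarks + np.array([3.0, 4.0])

loss = landmark_consistency_loss(photo_landmarks, cartoon_landmarks)

# Patch around the cartoon's left eye, as a local discriminator might see it.
cartoon = np.zeros((64, 64, 3))
eye_patch = crop_local_patch(cartoon, cartoon_landmarks[0], size=16)
```

In this toy case every landmark moves by the same 3-4-5 offset, so the loss comes out to 5 pixels. In a real training loop the landmarks of the generated image would have to be predicted by a model, and the loss would be computed in a differentiable framework.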
The researchers selected 37,794 frontal human face images from the classic face dataset CelebA for training and validation. For their two cartoon styles, they manually marked face landmarks on 2,125 Bitmoji-style images gathered from the Internet, and similarly collected and annotated 17,920 images of Japanese manga characters from Getchu.com.
The landmark assisted CycleGAN has some limitations. When tested on the Cartoonset10k dataset, the generated faces lose many of the original human image features and end up looking very similar to one another. The researchers, however, attribute this to a lack of variance among training samples in the dataset.
The paper Landmark Assisted CycleGAN for Cartoon Face Generation is on arXiv.
Journalist: Fangyu Cai | Editor: Michael Sarazen