If you’ve ever wanted to see Einstein play charades, Rodin’s “The Thinker” wink at you, or an ancient Chinese Emperor cast in a Chaplin movie — then the AI-powered video transformation tech you’re looking for is “face reenactment,” which can digitally deliver all such fantastic scenarios.
Unlike face swapping, which transfers a face from one source to another, face reenactment captures the movements of a driver face and expresses them through the identity of a target face. Starting with a dynamic driver face, researchers can manipulate any target face — from today’s celebrities to historical figures, including any age, ethnicity or gender — to perform any humanly possible face-based task.
Previous approaches to synthesizing a reenacted face used generative adversarial networks (GANs), which have demonstrated tremendous ability in a wide range of image generation tasks. GAN-based models, however, require at least a few minutes of training data for each target.
A big challenge when transferring facial expressions from a driver face to a target face is identity preservation. Factors such as identity mismatch, large unseen poses, and leakage of driver face details into the target face can significantly degrade the output.
Researchers from the South Korea-based tech company Hyperconnect recently proposed a new framework, MarioNETte, which aims to reenact target faces in a few-shot manner (from even a single image) while preserving identity without any fine-tuning.
“We adopt image attention block and target feature alignment, which allow MarioNETte to directly inject features from the target when generating images,” the researchers explain. “In addition, we propose a novel landmark transformer which further mitigates the identity preservation problem by adjusting for the identity mismatch in an unsupervised fashion, without any additional labeled data.”
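To make the two components concrete, here is a minimal toy sketch of the ideas the researchers describe: an attention block in which the driver's pose features act as queries over target-image features, and a landmark adjustment that re-expresses driver motion relative to the target's face geometry. All function and variable names are illustrative assumptions, not the paper's actual code, and the real model operates on learned deep features rather than random vectors.

```python
import numpy as np

def image_attention(driver_feat, target_feats):
    # Sketch of an image attention block (names are hypothetical):
    # driver pose features are queries; features extracted from one or
    # more target images supply keys and values, so target identity
    # details can be injected directly during generation.
    d = driver_feat.shape[-1]
    scores = driver_feat @ target_feats.T / np.sqrt(d)       # (Q, K)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over target features
    return weights @ target_feats                            # (Q, d)

def landmark_transform(driver_lm, driver_neutral, target_neutral):
    # Hypothetical landmark adjustment: express the driver's landmark
    # motion as an offset from its neutral face, then apply that offset
    # to the target's neutral geometry, limiting identity leakage.
    return target_neutral + (driver_lm - driver_neutral)

# Few-shot setting: K target images provide identity features,
# a single driver frame provides the pose queries.
rng = np.random.default_rng(0)
target_feats = rng.normal(size=(4, 16))    # K = 4 target feature vectors
driver_feat = rng.normal(size=(10, 16))    # spatial pose queries from the driver
out = image_attention(driver_feat, target_feats)
print(out.shape)  # (10, 16)
```

Because the attention pools over however many target images are available, the same mechanism covers both the one-shot and few-shot cases without per-target fine-tuning.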
The researchers compared MarioNETte with current SOTA face reenactment methods on the VoxCeleb1 and CelebV datasets. They tested the models on self-reenactment, in which the target and driver identities coincide, and on reenactment across different identities. The results show that MarioNETte outperforms the current SOTA methods in most cases.
The team suggests future work in this area could focus on improving the landmark transformer to better handle landmark disentanglement so that the reenactments will appear even more convincing.
The paper MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets is available on arXiv.
Journalist: Yuan Yuan | Editor: Michael Sarazen