In the recently published paper Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation, researchers from Penta-AI and Tel-Aviv University introduce a generic image-to-image translation framework dubbed Pixel2Style2Pixel (pSp).
Unlike previous methods that employ dedicated task-specific architectures, the proposed framework is designed to address a wide range of image-to-image tasks using the same architecture, a global approach that avoids possible locality bias. The method shows strong advantages in tasks such as face frontalization, where its encoder can be trained in a fully unsupervised manner to align a given face image to a frontal pose with a neutral expression.
The researchers noted that the state-of-the-art image generation method StyleGAN not only produces images of phenomenal realism, but also has a disentangled latent space W in which meaningful manipulations can be made. As numerous methods leveraging this latent space have shown promising image-to-image translation results, it has become common practice to encode real images into an extended latent space, W+, for applications such as high-resolution synthesis, multi-modal image synthesis, multi-domain image synthesis, and conditional image synthesis. However, performing a fast, direct, and accurate learned inversion of real images into W+ remains a challenge.
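To see why a learned encoder is attractive, it helps to contrast it with the slower alternative: optimizing a latent code per image. The toy sketch below (not the authors' code) inverts a target through a frozen linear map standing in for the generator; all names, dimensions, and the learning rate are illustrative assumptions.

```python
import numpy as np

# Toy sketch of optimization-based latent inversion: recover a latent w
# whose "generated" output matches a target image. The generator here is
# a random linear map standing in for a frozen StyleGAN.
rng = np.random.default_rng(0)
latent_dim, image_dim = 8, 32
G = rng.standard_normal((image_dim, latent_dim))  # frozen toy generator

w_true = rng.standard_normal(latent_dim)
target = G @ w_true                               # "real" image to invert

# Gradient descent on the reconstruction loss ||G w - target||^2.
w = np.zeros(latent_dim)
lr = 0.005
for _ in range(2000):
    grad = 2.0 * G.T @ (G @ w - target)
    w -= lr * grad

reconstruction_error = np.linalg.norm(G @ w - target)
```

A per-image loop like this is accurate but slow; pSp instead trains an encoder once, so inversion becomes a single forward pass.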
The team focused on the task of latent space embedding, which aims to retrieve a latent vector that generates a desired, not necessarily known, image. They proposed a novel encoder architecture that encodes an arbitrary image directly into W+. Because the encoder is built on a feature pyramid network, style vectors are extracted from the different pyramid scales and inserted directly into a fixed, pretrained StyleGAN generator in correspondence with their spatial scales. The researchers observed that when the network is trained with an ID similarity loss, it preserves identity better than previous direct approaches.
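The mapping from pyramid scales to generator inputs can be sketched as follows. This is an illustrative NumPy mock-up, not the released pSp code: the feature-map sizes, the coarse/medium/fine style counts, and the pooling used as a stand-in for the learned map2style blocks are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
style_dim = 512

# Feature maps from three scales of a hypothetical FPN backbone.
pyramid = {
    "coarse": rng.standard_normal((style_dim, 16, 16)),
    "medium": rng.standard_normal((style_dim, 32, 32)),
    "fine":   rng.standard_normal((style_dim, 64, 64)),
}
# An 18-input StyleGAN generator: coarse styles drive pose/shape,
# medium styles drive facial features, fine styles drive color details.
styles_per_scale = {"coarse": 3, "medium": 4, "fine": 11}

def map2style(feat, n_styles):
    # Stand-in for pSp's learned map2style blocks (small strided
    # convnets); here we simply global-average-pool the feature map
    # and repeat it once per style slot.
    pooled = feat.mean(axis=(1, 2))            # (512,)
    return np.tile(pooled, (n_styles, 1))      # (n_styles, 512)

# Concatenate per-scale styles into the full W+ code: one 512-d style
# vector per generator input, matched to its spatial scale.
w_plus = np.concatenate(
    [map2style(pyramid[k], styles_per_scale[k])
     for k in ("coarse", "medium", "fine")]
)
```

The key design point survives the simplification: each generator layer receives a style vector derived from the pyramid level whose spatial resolution matches it.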
In experiments, the team demonstrated that their image-to-image translation framework achieves compelling results across various applications. The researchers propose that the global approach can further support multi-modal synthesis through the resampling of styles. They also note that some of the method's inherent assumptions warrant further investigation: because the approach does not exploit locality, preserving fine details of the input image, such as earrings or background details, remains a challenge.
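The style-resampling idea can be illustrated with a short sketch: keep the encoder's coarse styles, which fix the input's overall structure, and swap the fine styles for randomly sampled ones to obtain output variants. The split index and the blending weight below are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, style_dim = 18, 512

# W+ code produced by a hypothetical encoder for one input image.
encoded = rng.standard_normal((n_layers, style_dim))

def resample_variant(encoded, split=7, alpha=1.0, rng=rng):
    # Replace (or blend in, for alpha < 1) random styles in the fine
    # layers only; coarse layers keep the input's structure.
    random_styles = rng.standard_normal((n_layers, style_dim))
    variant = encoded.copy()
    variant[split:] = ((1 - alpha) * encoded[split:]
                       + alpha * random_styles[split:])
    return variant

# Two variants share coarse structure but differ in fine styles.
v1 = resample_variant(encoded)
v2 = resample_variant(encoded)
```

Each call yields a different plausible output for the same input, which is how a single encoded image supports multi-modal synthesis.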
The paper Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation is available on arXiv.
Reporter: Fangyu Cai | Editor: Michael Sarazen