
BIGO and iQIYI’s ClothFormer: Realistic Video Virtual Try-on Come True


Total global retail e-commerce sales have more than tripled over the last six years and are projected to top US$7 trillion by 2025. With fashion claiming a growing share of this market, suppliers are increasingly deploying AI-powered virtual try-on systems. Such systems are not only changing buyers’ shopping habits and boosting the e-commerce industry but also have applications in short video and other popular domains. While the quality of image-based virtual try-on methods has improved dramatically, video-based virtual try-on remains relatively underdeveloped, as it is difficult and computationally costly to generate visually pleasing and temporally coherent video results.

In the new paper ClothFormer: Taming Video Virtual Try-on in All Module, a research team from BIGO Technology and iQIYI Inc. presents ClothFormer, a novel video virtual try-on framework that preserves clothes’ and humans’ features and details to generate realistic and temporally smooth try-on videos that surpass the outputs of current state-of-the-art virtual try-on systems by a large margin.

The team summarizes their main contributions as:

  1. A novel warp module that combines the advantages of TPS-based methods and appearance-flow-based methods is designed to address the problem of inaccurate warp due to occlusions appearing in the clothing region.
  2. A tracking module based on ridge regression and optical flow correction is proposed to deform a temporally smooth warped clothing sequence, which provides a prerequisite for the try-on module to generate coherent videos.
  3. The MPDT generator is designed carefully in the try-on module, which can extract and fuse clothing textures, person features and environment information to generate realistic try-on videos. To the best of our knowledge, this is the first time that transformer has been applied to video virtual try-on.
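The article does not spell out how the tracking module uses ridge regression, so the following is only a minimal sketch of the general idea: fitting a regularised motion model over a short window of frames to suppress frame-to-frame jitter. It assumes a single scalar flow component tracked over time and a local linear model; the function name `ridge_smooth` and the parameter `lam` are illustrative and do not come from the paper.

```python
def ridge_smooth(values, lam=1.0):
    """Smooth a short per-frame sequence with a ridge-regularised linear fit.

    Assumed model (illustrative, not the paper's formulation):
        value(t) ~ intercept + slope * t
    with an L2 penalty `lam` on the slope only. Frame indices are centred,
    so the intercept is simply the mean of the sequence.
    """
    n = len(values)
    t = [i - (n - 1) / 2 for i in range(n)]        # centred frame indices
    mean_y = sum(values) / n
    # Closed-form ridge slope for centred inputs: sum(t*y) / (sum(t^2) + lam)
    num = sum(ti * (y - mean_y) for ti, y in zip(t, values))
    den = sum(ti * ti for ti in t) + lam
    slope = num / den
    return [mean_y + slope * ti for ti in t]


if __name__ == "__main__":
    # Horizontal flow of one clothing point over six frames (made-up numbers).
    noisy_flow_x = [0.0, 1.2, 1.8, 3.3, 3.9, 5.1]
    print(ridge_smooth(noisy_flow_x, lam=0.5))
```

A larger `lam` pulls the fitted motion toward a constant value, trading responsiveness for smoothness; in practice such a fit would be applied per point and per window rather than to a whole video at once.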

The main limitations of existing video virtual try-on methods are their poor performance with regard to frame consistency and spatio-temporal smoothness. The researchers trace these problems to two factors: 1) Existing models focus too much on the try-on module while neglecting the spatio-temporal dimensions, which leads to blurring and temporal artifacts in the generated videos; and 2) Most models were trained on simple datasets with clean backgrounds, and thus struggle in more complex real-life environments.

ClothFormer is designed to address both issues. The team first designs a clothing-agnostic person representation that removes all clothing information while preserving backgrounds and occlusions. They then employ a frame-level TPS-based (thin-plate spline) warp method to predict and mask the clothes’ occlusion regions, and feed these predicted results to an appearance flow-based method to obtain accurate and anti-occlusion dense flow pairs between the body and clothing regions. They also use an appearance flow tracking module to obtain warped clothing sequences with improved spatio-temporal consistency. Finally, they introduce a novel Multi-scale Patch-based Dual-stream Transformer (MPDT) generator, which extracts and fuses clothing textures, person features such as pose, and environment information to synthesize the final output video sequence.
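To make the appearance-flow step more concrete, here is a small sketch of what applying a dense flow field to a clothing image looks like once the flow and an occlusion mask are available. It is not the team's implementation: the learned networks that predict the flow and the mask are omitted, images are plain nested lists of grayscale values, and the names `bilinear_sample`, `warp_with_flow` and `occluded` are assumptions made for illustration.

```python
import math


def bilinear_sample(img, x, y):
    """Sample a 2D grid (list of rows) at fractional (x, y).

    Neighbours that fall outside the grid contribute 0.
    """
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    dx, dy = x - x0, y - y0
    total = 0.0
    for yy, wy in ((y0, 1.0 - dy), (y0 + 1, dy)):
        for xx, wx in ((x0, 1.0 - dx), (x0 + 1, dx)):
            if 0 <= xx < w and 0 <= yy < h:
                total += wx * wy * img[yy][xx]
    return total


def warp_with_flow(cloth, flow, occluded):
    """Backward-warp `cloth` with a dense flow field.

    flow[y][x] = (fx, fy) means output pixel (x, y) is sampled from
    cloth at (x + fx, y + fy). Pixels flagged in `occluded` are left
    at 0 so that later stages can fill them from the person image.
    """
    h, w = len(cloth), len(cloth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if occluded[y][x]:
                continue  # skip regions covered by e.g. arms or hair
            fx, fy = flow[y][x]
            out[y][x] = bilinear_sample(cloth, x + fx, y + fy)
    return out


if __name__ == "__main__":
    cloth = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    flow = [[(1.0, 0.0)] * 3 for _ in range(3)]   # sample one pixel to the right
    occluded = [[False] * 3 for _ in range(3)]
    occluded[0][0] = True                          # pretend a hand covers this pixel
    for row in warp_with_flow(cloth, flow, occluded):
        print(row)
```

Because every output pixel has its own offset, a dense flow can follow fine wrinkles and folds that a global TPS transform cannot, which is why masking occluded regions before estimating the flow matters: a flow fitted through an arm or a strand of hair would drag clothing texture into the wrong place.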

To validate the effectiveness of the proposed ClothFormer, the team compared its outputs with those of existing state-of-the-art methods (FW-GAN, MV-TON, CP-VTON, ACGPN and PB-AFN), using both image-based and video-based evaluation metrics on the VVT video virtual try-on dataset. In the experiments, ClothFormer achieved significant quantitative and qualitative improvements in high-quality and spatio-temporally consistent try-on video generation, surpassing current systems by a large margin.

Demos in video format are available on the project’s GitHub. The paper ClothFormer: Taming Video Virtual Try-on in All Module is on arXiv.


Author: Hecate He | Editor: Michael Sarazen


