Buying books, electronics or food online is quick and convenient. However, even the most e-commerce-savvy consumers may hesitate to buy clothes on the Internet. The reason is simple: there’s no virtual fitting room. But don’t worry, AI is working on that.
Accurate 3D human digitalization requires accurate modelling of the unclothed human body along with plenty of labelled 3D garment data. Recent deep learning-based approaches have made impressive progress in reconstructing human body shape and pose from multiple images or even a single image, improvements made possible by massive amounts of labelled training data.
“Most of our previous research was to reconstruct naked bodies from a single image,” says Xiaoguang Han, a researcher with the Shenzhen Research Institute of Big Data (SRIBD) and a research assistant professor at the Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen). “The reconstruction of clothing is a very important part of human digitalization, but as my team moved from human modelling to garment modelling, we soon hit the first big obstacle — the scarcity of 3D garment datasets.”
Han’s team, consisting of researchers from CUHK-Shenzhen, SRIBD, Zhejiang University, Xidian University, Tencent America, and the University of Science and Technology of China, spent eight months building Deep Fashion3D — the largest collection of 3D garment models to date — with the goal of establishing a novel benchmark and dataset for the evaluation of image-based garment reconstruction systems.
Deep Fashion3D contains 2,078 3D garment models reconstructed from real-world garments in 10 different clothing categories. The researchers used image-based geometry reconstruction software to generate high-resolution garment reconstructions from multiview images in the form of dense point clouds.
To facilitate future research on 3D garment reasoning, the researchers provide additional annotations specifically tailored for Deep Fashion3D, including 3D feature lines, 3D body pose and the corresponding multi-view real images. In addition, each garment is randomly posed in order to enhance the dataset’s capacity for modelling features such as dynamic wrinkles.
The team also proposed a novel baseline approach that is capable of inferring realistic 3D garments from a single image, as well as a novel “adaptable template” representation that enables a single network to learn all types of clothing. The researchers believe these can lead to more expressive reconstructions.
The team compared a baseline model trained on Deep Fashion3D against six state-of-the-art (SOTA) single-view reconstruction approaches that use different 3D representations. The experimental results show that the new approach achieves the highest reconstruction accuracy on single-view garment reconstruction tasks.
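For readers wondering how “reconstruction accuracy” can be quantified, one commonly used measure for comparing a reconstructed point cloud with a reference scan is the Chamfer distance. The article does not say which metric the team used, so the sketch below is only an illustration under that assumption; the function names `nearest_sq_dist` and `chamfer_distance` are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence

Point = Sequence[float]


def nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared Euclidean distance from p to its nearest neighbour in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two point clouds.

    Averages the squared nearest-neighbour distance from each predicted
    point to the reference cloud, does the same in the other direction,
    and adds the two averages. Lower values mean a closer match.
    """
    if not pred or not truth:
        raise ValueError("both point clouds must be non-empty")
    pred_to_truth = sum(nearest_sq_dist(p, truth) for p in pred) / len(pred)
    truth_to_pred = sum(nearest_sq_dist(q, pred) for q in truth) / len(truth)
    return pred_to_truth + truth_to_pred


# Toy example: a two-point "scan" and a reconstruction offset by 0.1 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
score = chamfer_distance(reconstruction, reference)
```

This brute-force version compares every pair of points, so it only suits small clouds; dense scans like those described above would normally call for a spatial index to speed up the nearest-neighbour search.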
Han tells Synced that even with the new SOTA approach, accurate 3D clothing reconstruction from a single photo with one mouse click remains a challenge. “We merely moved one step forward. Our team is still working hard to improve the results we can get. For example, the current method struggles to generate a bubble skirt with realistic geometric details.”
“How to generate 3D data using deep learning algorithms itself remains an open question, and the geometry of the clothes can be very complicated for the current algorithms to handle. In addition, we’re still facing the domain gap between our garment dataset and real clothing in outdoor settings,” says Han.
The research team plans to continue studying algorithm limitations based on the current Deep Fashion3D dataset, and is considering expanding the dataset in the future, although they say this may take years.
The potential applications of such technologies are wide and could be game-changing. Along with virtual fitting rooms, Han sees the method’s potential in intelligent photo-editing, gaming and simulation animation, augmented reality and beyond.
The paper Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images is on arXiv.
Journalist: Yuan Yuan | Editor: Michael Sarazen