Even as fashion image analysis gains traction among image recognition researchers, understanding fashion images remains challenging in real-world applications due to large deformations, occlusions, and discrepancies in clothing across domains, particularly between consumer and commercial images.
DeepFashion is a large-scale clothes database introduced last year by a research team from the Chinese University of Hong Kong (CUHK). The dataset contains over 800K diverse fashion images annotated with 50 clothing categories, 1,000 descriptive attributes, bounding boxes, and clothing landmarks.
DeepFashion was a solid foundation, but it left a number of areas for improvement: it was limited to a single clothing item per image, provided only sparse landmarks (4–8 per item), and had no per-pixel masks. CUHK researchers recently teamed up with Chinese AI giant SenseTime to develop a greatly improved iteration, DeepFashion2, a large-scale benchmark with comprehensive tasks and annotations for fashion image understanding.
DeepFashion2 contains 491K images spanning 13 popular clothing categories. A full spectrum of tasks is defined, including clothes detection and recognition, landmark and pose estimation, segmentation, and verification and retrieval. All of these tasks are supported by rich annotations.
The dataset also includes a total of 801K individual clothing items. Each item is labeled with scale, occlusion, zoom level, viewpoint, bounding box, dense landmarks, and a per-pixel mask. These items are grouped into 43.8K clothing identities, where a clothing identity represents a class of apparel with nearly identical cut, pattern, and design. Images of the same clothing identity are taken from both buyers and sellers, and a buyer's item together with the corresponding seller's item forms a pair.
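To make the annotation structure concrete, the sketch below shows what a per-item record and buyer–seller pairing could look like. The field names and values here are illustrative assumptions, not the dataset's exact schema; the key idea is that each item carries its own labels, and items sharing a clothing identity across consumer and commercial photos are linked by a shared pair identifier.

```python
from collections import defaultdict

# Two illustrative item records for the same clothing identity: one from a
# seller (commercial) image and one from a buyer (consumer) image.
# Field names are hypothetical, modeled on the labels described above.
items = [
    {"source": "seller", "pair_id": 7, "category": "short sleeve top",
     "scale": 2, "occlusion": 1, "zoom": 1, "viewpoint": 2,
     "bounding_box": [34, 50, 210, 320],
     "landmarks": [[60, 80], [120, 75]],    # truncated for brevity
     "mask": [[34, 50, 210, 50, 210, 320, 34, 320]]},
    {"source": "buyer", "pair_id": 7, "category": "short sleeve top",
     "scale": 1, "occlusion": 2, "zoom": 1, "viewpoint": 2,
     "bounding_box": [12, 40, 180, 300],
     "landmarks": [[40, 70], [100, 66]],
     "mask": [[12, 40, 180, 40, 180, 300, 12, 300]]},
]

# Group items by clothing identity: a seller item and a buyer item sharing
# a pair_id form one consumer-to-shop retrieval pair.
by_identity = defaultdict(list)
for item in items:
    by_identity[item["pair_id"]].append(item["source"])

print(by_identity[7])  # ['seller', 'buyer']
```

In a retrieval benchmark built on such records, the buyer-side item would serve as the query and the seller-side item as the gallery target.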