Pencil sketch portrait generation has emerged as a fun and popular new application of the University of Alberta’s U^2-Net. The project’s GitHub page has received over 2,400 stars in the three days since the novel deep network architecture for salient object detection was open-sourced.
The process of detecting and segmenting the most visually attractive objects in natural scenes is known in computer vision as salient object detection (SOD). Most existing SOD networks share a similar design and focus on leveraging deep features extracted by backbone networks such as AlexNet, VGG, ResNet, ResNeXt and DenseNet. These backbones, however, were originally built for image classification, so they extract features that represent semantic meaning rather than the local details and global contrast information critical for salient object detection. Such networks also tend to require data-inefficient pretraining on ImageNet.
U^2-Net is a simple yet powerful deep network architecture with a novel two-level nested U-shaped structure designed to address these problems. Its proposed ReSidual U-block (RSU) mixes receptive fields of different sizes, enabling it to better capture contextual information at different scales. The pooling operations inside each RSU also let the overall architecture grow deeper without significantly increasing computational cost.
In their paper, the researchers introduce the RSU and the nested U-architecture built from RSU blocks, and describe the network's supervision strategy and training loss.
The RSU has three main components: an input convolutional layer, a U-Net-like symmetric encoder-decoder structure of height L, and a residual connection that fuses local and multiscale features through summation.
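The RSU differs from the original residual block in two main ways: it replaces the ordinary single-stream convolution with a U-Net-like structure, and it replaces the original input feature in the skip connection with a local feature transformed by a weight layer. Where a residual block computes F(x) + x, the RSU computes U(F1(x)) + F1(x), with F1 the input convolutional layer and U the U-shaped encoder-decoder.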
The team notes that the RSU’s computational cost is relatively small. They attribute this to its U-shaped structure, in which most operations are applied to downsampled feature maps.
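The cost argument above can be illustrated with a rough count of how many spatial positions each layer of an RSU processes. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the layer layout (height minus one encoder layers with 2x pooling between them, one bottom layer, and a mirrored decoder) is an assumption, and channel counts are ignored so that position counts stand in for cost.

```python
import math

def rsu_layer_sides(height, side):
    """Side length of the feature map seen by each layer of a toy RSU.

    Assumed layout (hypothetical): height-1 encoder layers with 2x pooling
    between them, one bottom layer at the deepest resolution, and a decoder
    that mirrors the encoder.
    """
    encoder = []
    s = side
    for level in range(height - 1):
        encoder.append(s)
        if level < height - 2:      # no pooling after the deepest encoder layer
            s = math.ceil(s / 2)    # 2x downsampling
    bottom = [encoder[-1]]
    decoder = encoder[::-1]
    return encoder + bottom + decoder

def relative_cost(height, side):
    """Positions processed by the toy RSU, relative to running the same
    number of layers at full resolution (equal channels assumed)."""
    sides = rsu_layer_sides(height, side)
    nested = sum(s * s for s in sides)
    flat = len(sides) * side * side
    return nested / flat

if __name__ == "__main__":
    print(rsu_layer_sides(7, 320))
    print(round(relative_cost(7, 320), 3))
```

Under these assumptions, a block of height 7 on a 320×320 input touches only about a fifth of the positions that the same number of full-resolution layers would, because all but two of its layers run on downsampled maps.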
Based on the RSU, the researchers developed U^2-Net, their novel nested U-shaped structure for salient object detection. U^2-Net consists of a six-stage encoder, a five-stage decoder, and a saliency map fusion module attached to the decoder stages and the last encoder stage.
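The fusion module described above combines one side output per attached stage into a single prediction. The sketch below shows one plausible way to do that: resize every side map to the input resolution, then apply a per-pixel weighted sum followed by a sigmoid (which is what a 1×1 convolution over the stacked maps amounts to). The function names, the nearest-neighbour resizing, and the weights are all illustrative assumptions rather than details taken from the paper or the released code.

```python
import math

def upsample_nearest(m, out_h, out_w):
    """Resize a 2-D list to (out_h, out_w) with nearest-neighbour sampling.
    A simplification: a real model would typically use bilinear upsampling."""
    in_h, in_w = len(m), len(m[0])
    return [
        [m[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def fuse_side_maps(side_maps, weights, bias, out_h, out_w):
    """Fuse side outputs of different resolutions into one saliency map.

    Each map is resized to the target size, then every pixel is a weighted
    sum across maps plus a bias, squashed to (0, 1) by a sigmoid.
    """
    resized = [upsample_nearest(m, out_h, out_w) for m in side_maps]
    fused = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            z = bias + sum(w * m[r][c] for w, m in zip(weights, resized))
            row.append(1.0 / (1.0 + math.exp(-z)))
        fused.append(row)
    return fused

if __name__ == "__main__":
    # Three toy side outputs at 1x1, 2x2 and 4x4 (a real model has six).
    sides = [
        [[2.0]],
        [[1.0, -1.0], [-1.0, 1.0]],
        [[0.5] * 4 for _ in range(4)],
    ]
    for row in fuse_side_maps(sides, [1.0, 1.0, 1.0], 0.0, 4, 4):
        print([round(v, 2) for v in row])
```

In a trained network the weights and bias would be learned along with everything else; here they are fixed by hand purely to show the data flow.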
Overall, the U^2-Net design builds a deep architecture with rich multiscale features and low computational and memory costs. In addition, since the architecture is built on RSU blocks alone and does not use any pretrained backbone adapted from image classification, it can be flexibly and easily adapted to different working environments with minimal performance loss.
To train U^2-Net, the researchers took DUTS-TR, the largest and most commonly used salient object detection training dataset, and horizontally flipped its 10,553 images to obtain a total of 21,106 training images. Six public benchmark datasets for salient object detection (DUT-OMRON, DUTS-TE, HKU-IS, ECSSD, PASCAL-S and SOD) were used for evaluation.
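The training-set size follows from the flip itself: 21,106 is exactly twice 10,553, consistent with each image contributing itself and its mirror. A minimal sketch of that offline augmentation, using nested lists as stand-in images and hypothetical helper names:

```python
def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def augment_with_flips(images):
    """Return the original images followed by their horizontal mirrors."""
    return images + [hflip(img) for img in images]

if __name__ == "__main__":
    toy = [[1, 2, 3],
           [4, 5, 6]]
    print(hflip(toy))
    print(len(augment_with_flips([toy] * 5)))
    print(2 * 10553)
```

For a segmentation task such as SOD, the ground-truth mask would need to be flipped together with its image so that the pair stays aligned.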
In the experiments, the proposed models achieved performance competitive with 20 state-of-the-art SOD methods in both qualitative and quantitative evaluations.
Readers who would like to experiment with the portrait application can do so by downloading u2net_portrait.pth and running it on the APDrawingGAN test set. You can also prepare your own images in sizes close to or larger than 512×512, preferably with relatively clear backgrounds. Classic oil on canvas portraits don’t seem to work quite as well, though.
The paper U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection is on arXiv, and the code is on GitHub.
Analyst: Hecate He | Editor: Michael Sarazen; Yuan Yuan