
University of Alberta U^2-Net: Generating Realistic Pencil Portraits Using Salient Object Detection

University of Alberta recently proposed U^2-Net, a novel deep network architecture that achieves very competitive performance in salient object detection.

Pencil sketch portrait generation has emerged as a fun and popular new application of the University of Alberta’s U^2-Net. The project’s GitHub page has received over 2,400 stars in the three days since the novel deep network architecture for salient object detection was open-sourced.

The process of detecting and segmenting the most visually attractive objects in natural scenes is known in computer vision as salient object detection (SOD). Most existing SOD networks share a similar design and focus on leveraging deep features extracted by backbone networks such as AlexNet, VGG, ResNet, ResNeXt and DenseNet. These backbones, however, were originally built for image classification, so they extract features that represent semantic meaning rather than the local details and global contrast information critical for salient object detection. Such networks also tend to require pretraining on ImageNet, which is data-inefficient.

U^2-Net is a simple yet powerful deep network architecture with a novel two-level nested U-shaped structure designed to address these problems. Its building block, the ReSidual U-block (RSU), mixes receptive fields of different sizes, enabling it to better capture contextual information at different scales. The RSU also uses pooling operations to increase the depth of the overall architecture without significantly increasing computational cost.


In their paper, the researchers introduce the RSU and nested U-architecture built with them, and describe the supervision strategy and training loss of the network.
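
The training loss mentioned here can be summarized in a short formula. The notation below is a sketch recalled from the paper rather than a verbatim copy, so the symbols should be checked against the original: M is the number of side outputs (six in U^2-Net), the w terms are per-output weights, and each loss term is a standard binary cross-entropy between a predicted saliency map P_S and the ground truth P_G.

```latex
\[
\mathcal{L} \;=\; \sum_{m=1}^{M} w_{side}^{(m)}\,\ell_{side}^{(m)} \;+\; w_{fuse}\,\ell_{fuse}
\]
\[
\ell \;=\; -\sum_{(r,c)}^{(H,W)} \Big[\, P_G(r,c)\,\log P_S(r,c) \;+\; \big(1 - P_G(r,c)\big)\,\log\big(1 - P_S(r,c)\big) \Big]
\]
```

Supervising every side output as well as the fused map is a form of deep supervision: each stage of the network receives a direct training signal instead of relying only on gradients from the final prediction.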

Existing convolution blocks and the proposed residual U-block RSU: (a) Plain convolution block PLN, (b) Residual-like block RES, (c) Inception-like block INC, (d) Dense-like block DSE and (e) Our residual U-block RSU.

The RSU has three main components: an input convolutional layer, a U-Net-like symmetric encoder-decoder structure of height L, and a residual connection that fuses local and multiscale features through summation.
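
The data flow through these three components can be illustrated with a toy example. The following is a minimal one-dimensional sketch in plain Python, not the authors' implementation: the function names are hypothetical, a 3-tap mean filter stands in for a learned convolution, and averaging stands in for the concatenation-plus-convolution that merges encoder and decoder features in a real network.

```python
from typing import List

Signal = List[float]


def conv3(x: Signal) -> Signal:
    """Toy stand-in for a conv layer: 3-tap mean filter with edge padding."""
    padded = [x[0]] + x + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(x))]


def downsample(x: Signal) -> Signal:
    """Max pooling with window 2 and stride 2 (length must be even)."""
    return [max(x[i], x[i + 1]) for i in range(0, len(x), 2)]


def upsample(x: Signal) -> Signal:
    """Nearest-neighbour upsampling by a factor of 2."""
    return [v for v in x for _ in range(2)]


def u_block(x: Signal, height: int) -> Signal:
    """Symmetric encoder-decoder of the given height, built recursively."""
    enc = conv3(x)
    if height == 1 or len(enc) % 2:
        return enc  # bottom of the "U": no further downsampling
    inner = u_block(downsample(enc), height - 1)
    up = upsample(inner)
    # Merge decoder and encoder features (assumption: averaging replaces
    # the concatenation + convolution used in a real network).
    return conv3([(a + b) / 2.0 for a, b in zip(up, enc)])


def rsu(x: Signal, height: int) -> Signal:
    local = conv3(x)                      # 1. input convolutional layer
    multiscale = u_block(local, height)   # 2. U-Net-like encoder-decoder
    # 3. residual connection: fuse local and multiscale features by summation
    return [m + l for m, l in zip(multiscale, local)]


features = rsu([0.0, 1.0, 4.0, 2.0, 3.0, 5.0, 1.0, 0.0], height=3)
```

The output has the same length as the input, which is what allows the summation in the last step and lets RSU blocks be stacked like ordinary convolution blocks.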

Residual block and RSU comparison

The main difference between the RSU and the original residual block is that the RSU replaces the plain single-stream convolution with a U-Net-like structure, and replaces the original input feature with a local feature transformed by a weight layer.
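
Written as formulas, the comparison looks as follows. This is a sketch in notation recalled from the paper: x is the input feature, F_1 and F_2 are weight (convolution) layers, and U denotes the multi-layer U-shaped structure.

```latex
\[
H_{res}(x) \;=\; F_2\big(F_1(x)\big) + x
\qquad \text{(original residual block)}
\]
\[
H_{RSU}(x) \;=\; U\big(F_1(x)\big) + F_1(x)
\qquad \text{(residual U-block)}
\]
```

In the second line both changes are visible: U takes the place of the single-stream convolution F_2, and the skip path carries the transformed local feature F_1(x) instead of the raw input x.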

Computational cost of the RSU versus other feature extraction modules

The team notes that the RSU’s computational cost is relatively small. They attribute this to its U-shaped structure, in which most operations are applied to downsampled feature maps.
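
A rough back-of-the-envelope calculation shows why downsampling keeps the cost low. The sketch below counts multiply-accumulate operations for a 3×3 convolution at successively halved resolutions; the feature map size and channel count are hypothetical values chosen only for illustration and are not taken from the paper.

```python
def conv_macs(height: int, width: int, c_in: int, c_out: int, k: int = 3) -> int:
    """Multiply-accumulates for one k x k convolution, stride 1, 'same' padding."""
    return height * width * c_in * c_out * k * k


# Hypothetical feature map size and channel count, for illustration only.
H, W, C = 320, 320, 64

full = conv_macs(H, W, C, C)

# One convolution per level of a 5-level "U", halving the resolution each time.
levels = [conv_macs(H >> d, W >> d, C, C) for d in range(5)]

# Cost of each level relative to a full-resolution convolution.
relative = [cost / full for cost in levels]

# Five convolutions spread over the pyramid versus five at full resolution.
pyramid_total = sum(levels) / full
plain_total = 5.0
```

Each halving of the resolution cuts the cost of a convolution to a quarter, so under these assumptions the five layers on the pyramid cost about 1.33 times one full-resolution convolution, compared with 5 times for five layers that all run at full resolution.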

Illustration of the proposed U^2-Net architecture

Based on the RSU, the researchers developed U^2-Net, a novel stacked U-shaped structure for salient object detection. U^2-Net consists of a 6-stage encoder, a 5-stage decoder, and a saliency map fusion module attached to the decoder stages and the last encoder stage.
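
The role of the fusion module can be sketched in a few lines of plain Python. This is an illustrative simplification, not the released code: the function names are hypothetical, nearest-neighbour resizing replaces the interpolation a real implementation would use, the per-pixel weighted sum plays the part of a 1×1 convolution, and the side output values are made up.

```python
import math
from typing import List

Map = List[List[float]]


def upsample_nearest(m: Map, size: int) -> Map:
    """Resize a square map to size x size by nearest-neighbour lookup."""
    n = len(m)
    return [[m[r * n // size][c * n // size] for c in range(size)]
            for r in range(size)]


def fuse(side_maps: List[Map], weights: List[float], bias: float, size: int) -> Map:
    """Upsample every side output, mix them per pixel, then apply a sigmoid."""
    ups = [upsample_nearest(m, size) for m in side_maps]
    fused = []
    for r in range(size):
        row = []
        for c in range(size):
            # Per-pixel weighted sum across side outputs (acts like a 1x1 conv).
            z = bias + sum(w * u[r][c] for w, u in zip(weights, ups))
            row.append(1.0 / (1.0 + math.exp(-z)))
        fused.append(row)
    return fused


# Six side outputs: five decoder stages plus the last encoder stage.
# Assumption: resolution halves from one stage to the next.
size = 32
side_maps = [[[0.5] * (size >> d) for _ in range(size >> d)] for d in range(6)]
prob = fuse(side_maps, weights=[1 / 6] * 6, bias=0.0, size=size)
```

Because every side output is brought back to the input resolution before mixing, the fused map combines coarse, context-heavy predictions from deep stages with fine-grained predictions from shallow ones.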

Overall, the U^2-Net design builds a deep architecture with rich multiscale features and low computational and memory costs. In addition, since the U^2-Net architecture is built on RSU blocks alone and does not use any pretrained backbone adapted from image classification, it can be flexibly and easily adapted to different working environments with minimal performance loss.

To train U^2-Net, the researchers took DUTS-TR, the largest and most commonly used salient object detection training dataset, and horizontally flipped its images to obtain a total of 21,106 training images. Six public benchmark datasets for salient object detection, DUT-OMRON, DUTS-TE, HKU-IS, ECSSD, PASCAL-S and SOD, were used for evaluation.
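
The flipping step is simple enough to show directly. The sketch below is a minimal illustration in plain Python with images as nested lists; the function names are hypothetical and this is not the project's data loader. The key point is that the image and its ground-truth mask must be flipped together so they stay aligned.

```python
from typing import List, Tuple

Image = List[List[int]]
Pair = Tuple[Image, Image]  # (image, ground-truth saliency mask)


def hflip(img: Image) -> Image:
    """Mirror an image left to right by reversing every row."""
    return [row[::-1] for row in img]


def augment(pairs: List[Pair]) -> List[Pair]:
    """Keep each original pair and add its horizontally flipped copy."""
    out: List[Pair] = []
    for image, mask in pairs:
        out.append((image, mask))
        out.append((hflip(image), hflip(mask)))
    return out


# Tiny made-up example: one 2x3 image with its mask.
image = [[1, 2, 3],
         [4, 5, 6]]
mask = [[0, 0, 1],
        [0, 1, 1]]
training_set = augment([(image, mask)])
```

Flipping doubles the dataset, which is consistent with the reported total: 21,106 training images corresponds to 10,553 originals plus their mirrored copies.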

Results of ablation study on different blocks, architectures and backbones. “PLN”, “RES”, “DSE”, “INC”, “PPM” and “RSU” denote plain convolution block, residual block, dense block, inception block, pyramid pooling module and the proposed residual U-block respectively.

Comparison of the proposed method and 20 SOTA methods on DUT-OMRON, DUTS-TE, HKU-IS in terms of model size. Red, Green, and Blue indicate the best, second best and third best performance.

Comparison of the proposed method and 20 SOTA methods on ECSSD, PASCAL-S, SOD in terms of model size.

In the experiments, the proposed models achieved performance competitive with 20 SOTA SOD methods on qualitative and quantitative measures.

Readers who would like to experiment with the portrait application can do so by downloading u2net_portrait.pth and running it on the APDrawingGAN test set. You can also prepare your own images in sizes close to or larger than 512×512, preferably with relatively clear backgrounds. Classic oil on canvas portraits don’t seem to work quite as well, though.

The paper U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection is on arXiv, and the code is on GitHub.


Analyst: Hecate He | Editor: Michael Sarazen; Yuan Yuan

