NVIDIA has delivered numerous noteworthy research results to the machine learning community — StyleGAN, video-to-video translation, WaveGlow, and more — and the US chip giant shows no sign of slowing down.
At the NVIDIA GPU Technology Conference (GTC), which kicked off today, NVIDIA unveiled its latest image processing research effort: GauGAN, a generative adversarial network-based technique capable of transforming segmentation maps into realistic photos. The GauGAN research paper has been accepted to CVPR 2019 as an oral presentation.
In a GTC demo this morning, NVIDIA researcher Ming-Hsuan Yang illustrated how the technology works: as Yang used a “cloud” brush to paint on a segmentation map “canvas,” the model generated realistic cloud patterns in the corresponding position of an output image. The generated content appears in a style consistent with the rest of the scene.
GauGAN improves on pix2pixHD, an NVIDIA research paper accepted to CVPR 2018 that features a coarse-to-fine generator, multi-scale discriminators, and an improved adversarial loss. The new model has fewer parameters and offers more options for the images it can generate. It was trained on one million images NVIDIA collected from Flickr.
“It’s much easier to brainstorm designs with simple sketches, and this technology is able to convert sketches into highly realistic images,” said NVIDIA Vice President of Applied Deep Learning Research Bryan Catanzaro.
Catanzaro explained that the paper’s innovation is “spatially-adaptive normalization,” a technique that presents visual information in a spatially uniform manner, with more natural relationships between compositional elements.
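To make the idea more concrete, here is a minimal sketch of what a spatially-adaptive normalization step might look like. In the paper's formulation, activations are normalized and then rescaled and shifted by values that depend on the segmentation map at each pixel. The sketch below is an illustration under simplifying assumptions, not NVIDIA's implementation: it handles a single channel in plain Python, and it looks up the scale and shift per label from hypothetical tables (`gamma_by_label`, `beta_by_label`) instead of predicting them with learned convolutions as the paper does. The function name and all values are invented for the example.

```python
from math import sqrt
from statistics import fmean, pvariance


def spade_like_normalize(features, seg_map, gamma_by_label, beta_by_label, eps=1e-5):
    """Normalize one feature channel, then modulate it per pixel by segmentation label.

    features: 2D list of floats (a single channel of activations).
    seg_map:  2D list of labels with the same shape as `features`.
    gamma_by_label / beta_by_label: hypothetical per-label scale and shift tables
    (assumption: stands in for the learned, convolution-predicted values).
    """
    flat = [value for row in features for value in row]
    mean = fmean(flat)
    std = sqrt(pvariance(flat) + eps)  # eps avoids division by zero

    output = []
    for feature_row, label_row in zip(features, seg_map):
        output_row = []
        for value, label in zip(feature_row, label_row):
            normalized = (value - mean) / std
            # The scale and shift vary with the label at this pixel,
            # so the semantic layout survives the normalization step.
            output_row.append(gamma_by_label[label] * normalized + beta_by_label[label])
        output.append(output_row)
    return output


# Toy 2x2 example: the top row is labeled "sky", the bottom row "grass".
features = [[1.0, 3.0], [1.0, 3.0]]
seg_map = [["sky", "sky"], ["grass", "grass"]]
gamma_by_label = {"sky": 1.0, "grass": 2.0}
beta_by_label = {"sky": 0.0, "grass": 10.0}

modulated = spade_like_normalize(features, seg_map, gamma_by_label, beta_by_label)
```

In this toy example the two rows hold identical activations, yet they come out different after modulation because their labels differ. A conventional normalization layer with a single scale and shift per channel would treat both rows identically.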
The model also allows users to change a generated image to adopt the style of a particular painter, or to turn a daytime scene into a sunset.
The effect demonstrated is similar to GANpaint, a paper Synced reported on last November from a team of high-profile MIT, IBM, Google, and Chinese University of Hong Kong researchers. GANpaint enables anyone to paint incredibly complex and detailed photorealistic scenes. Yang explained that the two papers were proposed at about the same time, but employ different techniques and are designed for use in different scenarios.
Although GauGAN is still a proof of concept, the technique has potential applications in gaming, film-making, and many other visual presentation and editing areas. Game designers, for example, could use the technology to render more realistic simulated environments faster and more efficiently.
NVIDIA now has 175 full-time researchers worldwide focusing on both the supply side (GPU card development) and the demand side, specifically the AI research that drives demand for NVIDIA graphics cards. In 2018 NVIDIA produced 104 publications, 51 patent applications, and 12 open-source software packages.
NVIDIA also introduced an AI playground which enables users to test and play with demos of NVIDIA’s cutting-edge deep learning models.
Read more on the paper Semantic Image Synthesis with Spatially-Adaptive Normalization.
Journalist: Tony Peng | Editor: Michael Sarazen