
BigGAN: A New State of the Art in Image Synthesis

“Best GAN samples ever yet? Very impressive ICLR submission! BigGAN improves Inception Scores by >100.”

The above Tweet is from renowned Google DeepMind research scientist Oriol Vinyals. It was retweeted last week by Google Brain researcher and “Father of Generative Adversarial Networks” Ian Goodfellow, and picked up momentum and praise from AI researchers on social media.

All the attention surrounds the paper Large Scale GAN Training for High Fidelity Natural Image Synthesis, which recently began circulating on social media. The paper is an internship project by Andrew Brock of Heriot-Watt University, in collaboration with Jeff Donahue and Karen Simonyan from DeepMind. It is under review for next spring’s ICLR 2019.
Figure 1 shows that the model generates highly convincing images, narrowing the gap in fidelity and variety between generated and real samples. When trained on the ImageNet dataset at 128×128 resolution, BigGAN achieves an Inception Score (IS) of 166.3, an improvement of more than 100 points over the previous state-of-the-art (SotA) score of 52.52. The Fréchet Inception Distance (FID) also improves from 18.65 to 9.6 (lower is better).
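For readers unfamiliar with the metric: the Inception Score is defined as IS = exp(E_x[KL(p(y|x) ‖ p(y))]), where p(y|x) is a classifier's label distribution for a generated image x and p(y) is the marginal over all samples. A minimal sketch of the computation, assuming the classifier's softmax outputs are already available as a NumPy array (in practice the score is computed with a pretrained Inception network):

```python
import numpy as np

def inception_score(probs):
    # probs: (N, C) array of softmax outputs p(y|x) for N generated images.
    # A confident classifier (sharp p(y|x)) plus diverse samples
    # (near-uniform marginal p(y)) yields a high score.
    p_y = probs.mean(axis=0)                                   # marginal p(y)
    kl = (probs * (np.log(probs) - np.log(p_y))).sum(axis=1)   # KL(p(y|x) || p(y))
    return float(np.exp(kl.mean()))                            # IS = exp(E_x[KL])
```

With identical, uniform predictions the score is exactly 1 (its minimum); with sharp predictions spread evenly over C classes it approaches C, the theoretical maximum.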

The authors proposed a model (BigGAN) with modifications focused on the following three aspects:

- Scale: training GANs at far larger scale than prior work, with two to four times as many parameters and an eight-fold larger batch size, along with architectural changes that improve scalability;
- The “truncation trick”: sampling latent vectors from a truncated normal distribution at generation time, giving fine-grained control over the trade-off between sample fidelity and variety;
- Stability: characterizing the instabilities specific to large-scale GAN training and introducing techniques, such as orthogonal regularization, to mitigate them.

In addition to its performance boost at 128×128 resolution, BigGAN also outperformed the previous SotA at 256×256 and 512×512 resolutions on ImageNet. The model was also trained on the much larger JFT-300M dataset to demonstrate that its gains transfer beyond ImageNet.
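One of the paper's techniques, the “truncation trick”, amounts to resampling any latent component whose magnitude exceeds a threshold before feeding the vector to the generator. A minimal NumPy sketch of that sampling step (the function name and rejection-resampling loop here are illustrative, not the authors' implementation):

```python
import numpy as np

def truncated_z(n, dim, threshold=0.5, seed=None):
    # Draw n latent vectors from N(0, 1), then resample every component
    # whose magnitude exceeds the threshold until all lie within it.
    # Lower thresholds trade variety for fidelity, per the paper.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, dim))
    mask = np.abs(z) > threshold
    while mask.any():
        z[mask] = rng.standard_normal(mask.sum())
        mask = np.abs(z) > threshold
    return z
```

Setting the threshold near zero collapses samples toward the generator's modal output, while a large threshold recovers ordinary Gaussian sampling.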

Although BigGAN appears to be the new SotA in class-conditional image synthesis, some questions remain regarding “how much distribution did it capture and what would the unconditional version look like?” according to XGBoost, MXNet, and TVM contributor Tianqi Chen.

The paper first appeared on OpenReview, where it was uploaded anonymously. More recently, it was posted on arXiv and shared on Twitter by the authors. The paper is currently under double-blind review. Posting such papers on public forums or arXiv is permitted under ICLR/NIPS/ICML conference rules, although submissions that are not properly anonymized are barred from consideration by the ACL (Association for Computational Linguistics).

Additional BigGAN generated samples can be downloaded in different resolutions at https://drive.google.com/drive/folders/1lWC6XEPD0LT5KUnPXeve_kWeY-FxH002.


Author: Mos Zhang | Editor: Michael Sarazen
