Generative adversarial networks (GANs) have grabbed the media spotlight with their outstanding performance in image synthesis, but GANs’ potential strength in other research fields remains relatively unexplored. Leading UK AI research company DeepMind has now proposed “BigBiGAN” (not BigBigGAN, but BigBiGAN), which achieves state-of-the-art results in both unsupervised representation learning on ImageNet and in unconditional image generation.
Representation learning means learning representations of data that make it easier to extract useful information when building classifiers or other predictors. The ML research community is increasingly interested in unsupervised representation learning as a way to exploit massive amounts of unlabeled data. As algorithms improve, combining a self-supervised representation obtained during pretraining with model fine-tuning through transfer learning promises strong performance on many computer vision and language tasks.
Two years ago, a research group from UC Berkeley and UT Austin proposed Bidirectional GAN (BiGAN) as a means of representation learning. Offering a new theoretical framework for training GANs, BiGAN added an encoder to the standard generator-discriminator GAN architecture: the encoder takes data as input and outputs a latent representation, the reverse of the generator, which maps latents to data. The discriminator is then trained on joint data-latent pairs rather than on data alone, distinguishing real data paired with its encoding from generated samples paired with the latents that produced them. Fooling this discriminator pushes the encoder and the generator toward being inverses of each other. As a result, an encoder trained via the BiGAN framework becomes an effective means of visual representation learning on ImageNet for downstream tasks.
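The joint discriminator objective described above can be sketched in a few lines of Python. This is a minimal illustration rather than the BiGAN authors' implementation: the `encoder`, `generator` and `discriminator` functions are hypothetical single-layer stand-ins for deep networks, the weight names and sizes are invented for the example, and only the standard logistic discriminator loss is shown.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical toy networks: single linear layers standing in for deep models.
def encoder(x, w_e):
    """Map data x to a latent code E(x)."""
    return [dot(row, x) for row in w_e]


def generator(z, w_g):
    """Map a latent code z to a generated sample G(z)."""
    return [dot(row, z) for row in w_g]


def discriminator(x, z, w_d):
    """Score a joint (data, latent) pair: probability it came from the encoder side."""
    return sigmoid(dot(w_d, x + z))  # x + z concatenates the two lists


def bigan_discriminator_loss(x_real, z_prior, w_e, w_g, w_d):
    """Logistic loss on the two kinds of pairs the discriminator must tell apart."""
    p_enc = discriminator(x_real, encoder(x_real, w_e), w_d)      # pair (x, E(x))
    p_gen = discriminator(generator(z_prior, w_g), z_prior, w_d)  # pair (G(z), z)
    return -(math.log(p_enc) + math.log(1.0 - p_gen))


# Toy sizes (assumed for illustration): 4-dimensional data, 2-dimensional latents.
x_real = [0.5, -1.0, 0.25, 2.0]
z_prior = [0.3, -0.7]
w_e = [[0.1, 0.0, -0.2, 0.3], [0.0, 0.4, 0.1, -0.1]]
w_g = [[0.2, -0.1], [0.0, 0.3], [0.5, 0.5], [-0.4, 0.1]]
w_d = [0.05, -0.02, 0.1, 0.0, 0.3, -0.2]

loss = bigan_discriminator_loss(x_real, z_prior, w_e, w_g, w_d)
```

In a real training loop the discriminator would minimise this loss while the encoder and generator are updated to increase it, which is what drives the two mappings toward inverting each other.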
DeepMind researchers, however, found that the DCGAN architecture used in BiGAN is not capable of modeling high-quality images. That motivated them to bring the generator and discriminator architecture of BigGAN, the state-of-the-art generative model for high-quality image generation that DeepMind proposed last year, into the BiGAN framework. They also discovered that a tweak to the discriminator, adding terms that score the data and the latents separately alongside the joint data-latent term, improved the results of representation learning without compromising generation.
As is standard in unsupervised representation learning experiments, the researchers pretrained the BigBiGAN model on unlabeled ImageNet data to obtain a data representation, then trained a linear classifier on top of that representation with labeled data for a supervised downstream task. In these experiments, BigBiGAN boosted unsupervised learning results from 55.4 percent to 60.8 percent top-1 accuracy. In a separate unsupervised image generation experiment, BigBiGAN improved both Inception Score (IS) and Fréchet Inception Distance (FID) over the baseline unconditional BigGAN generation results.
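The linear evaluation protocol described above, a frozen representation plus a trainable linear classifier, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's setup: `frozen_encoder` is a hypothetical fixed function standing in for a pretrained encoder, the two-class data is synthetic, and the classifier is a plain logistic regression trained by gradient descent.

```python
import math
import random


def frozen_encoder(x):
    # Hypothetical stand-in for a pretrained encoder; it is never updated.
    return [x[0] + x[1], x[0] - x[1]]


def predict_proba(w, b, feats):
    t = sum(wi * fi for wi, fi in zip(w, feats)) + b
    return 1.0 / (1.0 + math.exp(-t))


def train_linear_probe(features, labels, lr=0.1, epochs=200):
    """Fit only a linear classifier (w, b) on top of fixed features."""
    w = [0.0] * len(features[0])
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for feats, y in zip(features, labels):
            err = predict_proba(w, b, feats) - y  # gradient of the logistic loss
            for j, fj in enumerate(feats):
                grad_w[j] += err * fj / n
            grad_b += err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def top1_accuracy(w, b, features, labels):
    correct = sum(
        int((predict_proba(w, b, f) >= 0.5) == bool(y))
        for f, y in zip(features, labels)
    )
    return correct / len(labels)


# Synthetic labeled data (assumed for illustration): two well-separated clusters.
rng = random.Random(0)
points, labels = [], []
for label, centre in ((0, -2.0), (1, 2.0)):
    for _ in range(50):
        points.append([centre + rng.gauss(0, 0.5), centre + rng.gauss(0, 0.5)])
        labels.append(label)

features = [frozen_encoder(p) for p in points]  # labels are not used to compute features
w, b = train_linear_probe(features, labels)
accuracy = top1_accuracy(w, b, features, labels)
```

Because only `w` and `b` are trained, the resulting accuracy reflects how linearly separable the classes are in the frozen representation, which is why this protocol is used to compare unsupervised methods.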
While the DeepMind research paper doesn’t introduce any novel techniques or algorithms, the combination of BigGAN and BiGAN explores the potential of GANs for a wide range of applications. Ian Goodfellow, the inventor of GANs, tweeted “While we were writing the original GAN paper, my co-author @dwf tried to get something similar to BiGAN working for representation learning. It’s pretty cool to see that work out 5 years later.”
Read the paper Large Scale Adversarial Representation Learning on arXiv for more information. Authors Jeff Donahue and Karen Simonyan are also the main contributors to the paper Large Scale GAN Training for High Fidelity Natural Image Synthesis (BigGAN).
Journalist: Tony Peng | Editor: Michael Sarazen