Paper Source: https://pdfs.semanticscholar.org/ec94/874d38378f53319d467412a124809542d3db.pdf?_ga=1.46132652.922857708.1488461012
Paper Authors:
1. Introduction
This paper presents ArtGAN, a model for generating complex images such as artworks, as shown in the figure below.
Prior GAN-based works are usually used to generate images that have clear, distinguishable foregrounds and backgrounds and often contain only one or two (main) objects per image, with relatively structured shapes. Compared with these prior works, ArtGAN can generate images carrying abstract information, such as images in a certain art style.
Technically, this paper makes it possible to back-propagate the gradient of the discriminator's label loss all the way into the generator.
2. Model Architecture and Experiments
2.1 Generative Adversarial Networks
A Generative Adversarial Network (GAN) mainly consists of two parts: a generator (G) and a discriminator (D). G generates images whose distribution is close to that of the training samples (real images), while D learns to distinguish generated images from training samples. The goal is to find optimal parameters for both D and G so that the generated images look as realistic as possible. Mathematically, training a GAN amounts to optimizing the following objective function:
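In the standard formulation of Goodfellow et al. (2014), this is the minimax game

\[
\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\!\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right].
\]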
Here, D is trained by maximizing the probability of the training data (first term), while minimizing the probability of the samples from G (second term).
2.2 ArtGAN based on conditional GAN
Building on the GAN in subsection 2.1, the ArtGAN in this paper has one key trick: it feeds the labels assigned to each generated image back to G through the loss function of D. That is, additional label information y-hat is added to the GAN network. This mirrors the way humans learn to draw, by receiving feedback on their attempts.
D is updated by minimizing function (2), which follows the same idea as prior conditional GANs. At the same time, G is updated by maximizing function (2) with respect to theta_G in order to compete with D; this maximization is reformulated as the minimization problem of function (3).
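As a rough sketch of the general form (my assumption, not the paper's exact equations (2) and (3)), with D_c(·) denoting the probability that D assigns to class c and "fake" denoting an extra class for generated images:

\[
\mathcal{L}_D(\theta_D) = -\,\mathbb{E}_{(x, y) \sim p_{\text{data}}}\!\left[\log D_{y}(x)\right] - \mathbb{E}_{\hat{z}, \hat{y}}\!\left[\log D_{\text{fake}}\!\left(G(\hat{z}, \hat{y})\right)\right],
\]
\[
\mathcal{L}_G(\theta_G) = -\,\mathbb{E}_{\hat{z}, \hat{y}}\!\left[\log D_{\hat{y}}\!\left(G(\hat{z}, \hat{y})\right)\right].
\]

Written this way, the gradient of the label term flows through D and into G, which is exactly the feedback described above.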
Another trick used in this paper is an L2 loss, function (4), added only in G for pixel-wise reconstruction in order to improve the training stability of ArtGAN.
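In generic form (the exact reconstruction target x used in the paper is an assumption here), such a pixel-wise term looks like

\[
\mathcal{L}_{\ell_2}(\theta_G) = \left\lVert G(\hat{z}, \hat{y}) - x \right\rVert_2^2 .
\]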
Based on the above functions, the entire architecture of ArtGAN is constructed as shown in Fig. 2.
Note that z-hat and y-hat are concatenated into a single dense vector as the input of this network.
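To make the label-feedback mechanism concrete, below is a minimal PyTorch-style training-step sketch. It assumes a discriminator with K+1 output logits (K labels plus a "fake" class), hypothetical shapes and a lambda_l2 weight, and it illustrates the idea rather than reproducing the authors' implementation; in particular, the exact reconstruction target of the L2 term in the paper may differ.

# A minimal sketch of the label-feedback training step (assumed shapes and
# hyper-parameters; not the authors' implementation).
import torch
import torch.nn.functional as F

K = 10                   # number of categorical labels (e.g. artists or styles); assumed
FAKE = K                 # index of the extra "fake" class predicted by D
Z_DIM, BATCH = 100, 32   # assumed noise dimension and batch size

def train_step(G, D, opt_G, opt_D, x_real, y_real, lambda_l2=1.0):
    """One adversarial update; y_real holds integer class labels,
    and D is assumed to return (BATCH, K+1) logits."""
    # Concatenate the dense noise vector z_hat with the one-hot label y_hat,
    # matching the concatenated z-hat / y-hat input described above.
    z_hat = torch.randn(BATCH, Z_DIM)
    y_hat = torch.randint(0, K, (BATCH,))
    x_fake = G(torch.cat([z_hat, F.one_hot(y_hat, K).float()], dim=1))

    # Update D: real images should be classified with their true labels,
    # generated images with the extra "fake" class.
    d_loss = F.cross_entropy(D(x_real), y_real) + \
             F.cross_entropy(D(x_fake.detach()),
                             torch.full((BATCH,), FAKE, dtype=torch.long))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Update G: push D to assign the conditioned label y_hat to the generated
    # image (the label feedback), plus a pixel-wise L2 term used only on the
    # generator side (sketched against the real batch; the paper's exact
    # reconstruction target may differ).
    g_loss = F.cross_entropy(D(x_fake), y_hat) + lambda_l2 * F.mse_loss(x_fake, x_real)
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()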
2.3 Experiments and Results
In this work, the Wikiart dataset with around 80,000 labelled artworks is used to train ArtGAN.
Fig. 3 shows images generated from a random vector with an “artist” label. It is clear that each artist's preferences in color and line are well extracted and expressed in these examples. Fig. 4 uses “style” as the additional label vector to generate images. The texture and color features are well maintained, but no meaningful semantic content can be extracted from these generated images.
3. Conclusion
Advantages: This paper proposes a novel ArtGAN to synthesize images with complex and abstract characteristics. The key innovation, feeding label information back to the generator during back-propagation, enhances the quality of the generated images.
Future work:
- Using a deeper ArtGAN may preserve more detailed information, so that the generated images could be better.
- Jointly learning these label modes (e.g., artist and style), so that ArtGAN can create artwork based on a combination of several modes.
4. Thoughts from the Reviewer
General Comments: This paper presents a conditional GAN that uses additional label information to generate complex images. ArtGAN produces visually better results than some prior works, such as DCGAN or GAN/VAE. The L2 loss is used only in G because the authors found that it degrades the quality of the generated images when used in D, which is an interesting finding. Although the network is a supervised model, it does not require very complex labels for each image.
Possible Problems: It is very difficult to define “complex”. The authors consider artworks more complex than the typical training data of prior GANs, but in my opinion some artworks, especially abstract ones, are composed of many color patches with simpler structures than natural images. So whether it is more difficult for a GAN to generate paintings remains an open question. Also, “style” does not have a strict definition in art: it is not only the “texture” or “color” of an image, but also the “preference” of the artist.
Additional Thoughts: Artwork generation is a hard problem. On one hand, it is a machine learning problem, as shown in this paper. On the other hand, it is also a problem of art and cognition, that is, how we define art and how we observe the world. Concretely, each artist draws the world as seen through his or her own eyes, which means that objects in the real world are mapped onto other objects in the artist's mind. For generating more “realistic” artwork, I think we should not only focus on the given artworks, but also try to find out how real-world objects are mapped into art objects.
Analyst: Yiwen Liao |Editor: Hao Wang | Localized by Synced Global Team : Xiang Chen