Hierarchical image analysis and synthesis with DTLC-GAN
Abstract
We aim to develop a generative model that makes it easier for a person to create the image they have in mind. When we draw an object from scratch, we typically sketch it coarsely first and then refine the details. For example, to create an image of a face with glasses, we first select the main category (e.g., transparent or colorful glasses) and then define the details (e.g., small or big colorful glasses). This observation motivated us to build hierarchical selection functionality into a generative model. One possible solution would be to collect sufficiently detailed annotations and solve the problem in a fully supervised manner; however, this incurs high annotation costs. To avoid this, we propose an extension of the generative adversarial network (GAN), called the decision tree latent controller GAN (DTLC-GAN), that can discover detailed categories from data without relying on detailed supervision. DTLC-GAN is a natural extension of GANs, and possible future work includes applying it to other data and tasks.
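The coarse-to-fine selection described above can be illustrated with a small sketch. This is not the authors' implementation: it only assumes that a hierarchical code can be written as a one-hot parent choice followed by one block of child choices per parent, with only the selected parent's block active. The function name `sample_tree_code` and its parameters are hypothetical.

```python
import random

def sample_tree_code(num_parents=2, num_children=2, rng=random):
    """Sample a two-layer code: a coarse category, then a detail within it.

    Hypothetical illustration only; the layout of the code vector is an assumption.
    """
    parent = rng.randrange(num_parents)   # coarse choice, e.g. transparent vs. colorful
    child = rng.randrange(num_children)   # fine choice, e.g. small vs. big
    parent_code = [1.0 if i == parent else 0.0 for i in range(num_parents)]
    # One child block per parent; every block except the selected parent's stays zero.
    child_code = [
        1.0 if (p == parent and c == child) else 0.0
        for p in range(num_parents)
        for c in range(num_children)
    ]
    return parent_code + child_code

code = sample_tree_code(rng=random.Random(0))
```

In a GAN setting, a vector like `code` would typically be concatenated with a noise vector and fed to the generator. That wiring is likewise an assumption and is not shown here.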
References
[1] T. Kaneko, K. Hiramatsu, K. Kashino, “Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks,” in Proc. The 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[2] T. Kaneko, K. Hiramatsu, K. Kashino, “Generative Adversarial Image Synthesis with Decision Tree Latent Controller,” in Proc. The 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.