
Deformable Generator Networks: Unsupervised Disentanglement of Appearance and Geometry



Abstract:

We present a deformable generator model to disentangle the appearance and geometric information of both image and video data in a purely unsupervised manner. The appearance generator network models appearance-related information, including color, illumination, identity, and category, while the geometric generator performs geometric warping, such as rotation and stretching, by generating a deformation field that warps the generated appearance into the final image or video sequence. The two generators take independent latent vectors as input, which disentangles the appearance and geometric information in images or video sequences. For video data, a nonlinear transition model is introduced into both the appearance and geometric generators to capture the dynamics over time. The proposed scheme is general and can be easily integrated into different generative models. An extensive set of qualitative and quantitative experiments shows that the appearance and geometric information can be well disentangled, and that the learned geometric generator can be conveniently transferred to other image datasets that share similar structural regularity, facilitating knowledge transfer tasks.
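To make the mechanism concrete, the following is a minimal PyTorch sketch of the idea described in the abstract. The layer sizes, latent dimensions (za_dim, zg_dim), and the 0.1 displacement scale are illustrative assumptions, not the authors' implementation; only the overall structure follows the text: two independent latent vectors, an appearance image, and a generated deformation field that warps the appearance into the output.

# Minimal sketch of a deformable generator (illustrative, not the paper's
# exact architecture): z_a drives appearance, z_g drives a deformation
# field that warps the generated appearance into the final image.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableGenerator(nn.Module):
    def __init__(self, za_dim=64, zg_dim=64, size=64):
        super().__init__()
        self.size = size
        # Appearance generator: latent z_a -> canonical RGB image.
        self.appearance = nn.Sequential(
            nn.Linear(za_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * size * size), nn.Tanh())
        # Geometric generator: latent z_g -> dense 2-D displacement field.
        self.geometry = nn.Sequential(
            nn.Linear(zg_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * size * size), nn.Tanh())

    def forward(self, z_a, z_g):
        n, s = z_a.size(0), self.size
        app = self.appearance(z_a).view(n, 3, s, s)       # (N, 3, S, S)
        disp = 0.1 * self.geometry(z_g).view(n, s, s, 2)  # (N, S, S, 2)
        # Identity sampling grid in [-1, 1], the grid_sample convention.
        theta = torch.eye(2, 3, device=z_a.device).repeat(n, 1, 1)
        grid = F.affine_grid(theta, (n, 3, s, s), align_corners=False)
        # Differentiable bilinear warp of the appearance by the
        # deformation field; gradients reach both generators.
        return F.grid_sample(app, grid + disp, align_corners=False)

z_a, z_g = torch.randn(4, 64), torch.randn(4, 64)
image = DeformableGenerator()(z_a, z_g)  # (4, 3, 64, 64)

Because the bilinear warp (grid_sample) is differentiable, gradients flow back into both generators, so the two latent vectors can be learned jointly without supervision; holding z_a fixed while varying z_g changes pose and shape but not appearance, which is precisely the disentanglement the abstract describes. For video, the transition model mentioned above would evolve z_a and z_g over time instead of sampling them once.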
Page(s): 1162 - 1179
Date of Publication: 04 August 2020

PubMed ID: 32749961


1 Introduction

Learning disentangled structures of observations [1], [2] is a fundamental problem in controlling modern deep models and understanding the world. Conceptual understanding requires a disentangled representation that separates the underlying explanatory factors and explicitly exposes the important attributes of real-world data [3], [4]. For instance, given an image dataset of human faces, a disentangled representation can separate the face's appearance attributes, such as color, light source, identity, and gender, from its geometric attributes, such as face shape and viewing angle. Such disentangled representations are semantically meaningful: they not only make generative models more transparent and interpretable, but are also useful for a wide variety of downstream AI tasks, such as transfer learning and zero-shot inference, where humans excel but machines struggle [5]. It has also been shown that disentangled representations are more generalizable and robust against adversarial attacks [6].

References
[1] Y. Bengio, A. Courville, and P. Vincent, "Representation learning: A review and new perspectives," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 1798-1828, Aug. 2013.
[2] M. F. Mathieu, J. J. Zhao, J. Zhao, A. Ramesh, P. Sprechmann, and Y. LeCun, "Disentangling factors of variation in deep representation using adversarial training," Proc. Advances Neural Inf. Process. Syst., pp. 5040-5048, 2016.
[3] C. P. Burgess et al., "Understanding disentangling in β-VAE," 2018.
[4] A. Achille and S. Soatto, "Emergence of invariance and disentanglement in deep representations," Proc. Int. Conf. Mach. Learn., pp. 1-9, 2018.
[5] B. M. Lake, T. D. Ullman, J. B. Tenenbaum, and S. J. Gershman, "Building machines that learn and think like people," Behavioral Brain Sci., vol. 40, 2017.
[6] A. A. Alemi, I. Fischer, J. V. Dillon, and K. Murphy, "Deep variational information bottleneck," 2017.
[7] A. Brock, J. Donahue, and K. Simonyan, "Large scale GAN training for high fidelity natural image synthesis," 2019.
[8] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 4401-4410, 2019.
[9] M. Lucic, K. Kurach, M. Michalski, S. Gelly, and O. Bousquet, "Are GANs created equal? A large-scale study," pp. 700-709, 2018.
[10] T. Han, E. Nijkamp, X. Fang, M. Hill, S.-C. Zhu, and Y. N. Wu, "Divergence triangle for joint training of generator model, energy-based model, and inferential model," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 8670-8679, 2019.
[11] H.-Y. Lee, H.-Y. Tseng, J.-B. Huang, M. Singh, and M.-H. Yang, "Diverse image-to-image translation via disentangled representations," Proc. Eur. Conf. Comput. Vis., pp. 35-51, 2018.
[12] X. Huang, M.-Y. Liu, S. Belongie, and J. Kautz, "Multimodal unsupervised image-to-image translation," Proc. Eur. Conf. Comput. Vis., pp. 172-189, 2018.
[13] J. Xie, R. Gao, Z. Zheng, S.-C. Zhu, and Y. N. Wu, "Motion-based generator model: Unsupervised disentanglement of appearance, trackable and intrackable motions in dynamic patterns," pp. 12442-12451, 2019.
[14] T. Han, X. Xing, and Y. N. Wu, "Learning multi-view generator network for shared representation," Proc. 24th Int. Conf. Pattern Recognit., pp. 2062-2068, 2018.
[15] F. Locatello, G. Abbati, T. Rainforth, S. Bauer, B. Schölkopf, and O. Bachem, "On the fairness of disentangled representations," Proc. Advances Neural Inf. Process. Syst., pp. 14584-14597, 2019.
[16] I. Goodfellow et al., "Generative adversarial nets," Proc. Advances Neural Inf. Process. Syst., pp. 2672-2680, 2014.
[17] Z. Li, Y. Tang, and Y. He, "Unsupervised disentangled representation learning with analogical relations," pp. 2418-2424, 2018.
[18] L. Tran, X. Yin, and X. Liu, "Disentangled representation learning GAN for pose-invariant face recognition," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1283-1292, 2017.
[19] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," 2014.
[20] D. J. Rezende, S. Mohamed, and D. Wierstra, "Stochastic backpropagation and approximate inference in deep generative models," Proc. 31st Int. Conf. Mach. Learn., pp. 1278-1286, 2014.
[21] A. Kumar, P. Sattigeri, and A. Balakrishnan, "Variational inference of disentangled latent concepts from unlabeled observations," 2018.
[22] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, "InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets," Proc. Advances Neural Inf. Process. Syst., pp. 2172-2180, 2016.
[23] I. Higgins et al., "beta-VAE: Learning basic visual concepts with a constrained variational framework," 2017.
[24] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of StyleGAN," Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., pp. 8110-8119, 2020.
[25] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," 2018.
[26] T. Nguyen-Phuoc, C. Li, L. Theis, C. Richardt, and Y.-L. Yang, "HoloGAN: Unsupervised learning of 3D representations from natural images," Proc. IEEE Int. Conf. Comput. Vis., pp. 7588-7597, 2019.
[27] P. W. Hallinan, G. Gordon, A. L. Yuille, P. Giblin, and D. Mumford, Two- and Three-Dimensional Patterns of the Face. Boca Raton, FL, USA: AK Peters/CRC Press, 1999.
[28] T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6, pp. 681-685, Jun. 2001.
[29] J. Kossaifi, G. Tzimiropoulos, and M. Pantic, "Fast and exact Newton and bidirectional fitting of active appearance models," IEEE Trans. Image Process., vol. 26, no. 2, pp. 1040-1053, Feb. 2017.
[30] J. Kossaifi, L. Tran, Y. Panagakis, and M. Pantic, "GAGAN: Geometry-aware generative adversarial networks," pp. 878-887, 2018.