
Complementary, Heterogeneous and Adversarial Networks for Image-to-Image Translation



Abstract:

Image-to-image translation transfers images from a source domain to a target domain. Conditional Generative Adversarial Networks (GANs) have enabled a variety of applications in this area. Early GANs typically contain a single generator for producing a target image. Recently, using multiple generators has shown promising results in various tasks. However, the generators in these works typically share homogeneous architectures. In this paper, we argue that heterogeneous generators are complementary to each other and benefit image generation. By heterogeneous, we mean that the generators have different architectures, focus on different positions, and operate over multiple scales. To this end, we build two generators: a deep U-Net and a shallow residual network. The former consists of a series of down-sampling and up-sampling layers, which typically yield a large receptive field and strong spatial locality. In contrast, the residual network has small receptive fields and works well in characterizing details, especially textures and local patterns. We then use a gated fusion network to combine the two generators and produce the final output. The gated fusion unit automatically induces the heterogeneous generators to focus on different positions and complement each other. Finally, we propose a novel approach to integrating multi-level and multi-scale features in the discriminator. This multi-layer integration discriminator encourages the generators to produce realistic details from coarse to fine scales. We evaluate our model quantitatively and qualitatively on several benchmark datasets. Experimental results demonstrate that our method significantly improves the quality of transferred images across a variety of image-to-image translation tasks. We have made our code and results publicly available: http://aiart.live/chan/.
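One natural reading of the gated fusion described in the abstract is a per-pixel convex combination of the two generators' outputs, with the blending mask predicted by a small gating network. Below is a minimal PyTorch sketch of that idea; it is not the authors' released implementation (see http://aiart.live/chan/ for their code), and the gate's layer sizes (32 hidden channels, 3x3 kernels) and the names GatedFusion, y_unet, and y_resnet are illustrative assumptions.

import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Per-pixel soft gate that blends two candidate translations.

    A sketch of the gated-fusion idea: layer sizes are assumptions,
    not the paper's exact architecture.
    """
    def __init__(self, channels=3):
        super().__init__()
        # The gate sees both candidate outputs and predicts one soft mask.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),  # mask values in (0, 1)
        )

    def forward(self, y_unet, y_resnet):
        # y_unet, y_resnet: (B, C, H, W) outputs of the two generators.
        m = self.gate(torch.cat([y_unet, y_resnet], dim=1))  # (B, 1, H, W)
        # Convex combination: the U-Net branch dominates where m -> 1,
        # the residual branch where m -> 0.
        return m * y_unet + (1.0 - m) * y_resnet

if __name__ == "__main__":
    fuse = GatedFusion(channels=3)
    a = torch.rand(1, 3, 256, 256)  # stand-ins for the two generators' outputs
    b = torch.rand(1, 3, 256, 256)
    print(fuse(a, b).shape)  # torch.Size([1, 3, 256, 256])

Under this reading, a sigmoid mask keeps the blend convex, so the fused image stays within the range spanned by the two candidates, and letting the mask vary spatially is what allows each generator to specialize by position, matching the abstract's claim that the gate induces the generators to focus on different positions.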
Published in: IEEE Transactions on Image Processing (Volume: 30)
Page(s): 3487-3498
Date of Publication: 01 March 2021

PubMed ID: 33646952


I. Introduction

Image-to-image (I2I) translation aims to transfer images from a source domain to a target domain. It has received significant attention because it enables numerous applications, e.g., image style transfer [1], [2], image in-painting [3], [4], face photo-sketch synthesis [5], image super-resolution reconstruction [6], semantic segmentation [7], [8], and data augmentation [9]. These applications are critical in many practical settings in the digital entertainment and public security communities.

References
[1] L. A. Gatys, A. S. Ecker, and M. Bethge, "A neural algorithm of artistic style," arXiv:1508.06576, 2015. [Online]. Available: http://arxiv.org/abs/1508.06576
[2] L. A. Gatys, A. S. Ecker, M. Bethge, A. Hertzmann, and E. Shechtman, "Controlling perceptual factors in neural style transfer," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 3985-3993.
[3] R. A. Yeh, C. Chen, T. Y. Lim, A. G. Schwing, M. Hasegawa-Johnson, and M. N. Do, "Semantic image inpainting with deep generative models," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 5485-5493.
[4] G. Liu, F. A. Reda, K. J. Shih, T.-C. Wang, A. Tao, and B. Catanzaro, "Image inpainting for irregular holes using partial convolutions," in Proc. 15th Eur. Conf. Comput. Vis. (ECCV), Sep. 2018, pp. 89-105.
[5] N. Wang, W. Zha, J. Li, and X. Gao, "Back projection: An effective postprocessing method for GAN-based face sketch synthesis," Pattern Recognit. Lett., vol. 107, pp. 59-65, May 2018.
[6] C. Ledig et al., "Photo-realistic single image super-resolution using a generative adversarial network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 105-114.
[7] Q. Wang, J. Gao, and X. Li, "Weakly supervised adversarial domain adaptation for semantic segmentation in urban scenes," IEEE Trans. Image Process., vol. 28, no. 9, pp. 4376-4386, Sep. 2019.
[8] Y. Li, S. Tang, R. Zhang, Y. Zhang, J. Li, and S. Yan, "Asymmetric GAN for unpaired image-to-image translation," IEEE Trans. Image Process., vol. 28, no. 12, pp. 5881-5896, Dec. 2019.
[9] L. Zhang, A. Gonzalez-Garcia, J. van de Weijer, M. Danelljan, and F. S. Khan, "Synthetic data generation for end-to-end thermal infrared tracking," IEEE Trans. Image Process., vol. 28, no. 4, pp. 1837-1850, Apr. 2019.
[10] I. J. Goodfellow et al., "Generative adversarial nets," in Proc. Int. Conf. Neural Inf. Process. Syst., 2014, pp. 2672-2680.
[11] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 1125-1134.
[12] T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro, "High-resolution image synthesis and semantic manipulation with conditional GANs," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 8798-8807.
[13] R. Chen, W. Huang, B. Huang, F. Sun, and B. Fang, "Reusing discriminators for encoding: Towards unsupervised image-to-image translation," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 8168-8177.
[14] J. Kim, M. Kim, H. Kang, and K. H. Lee, "U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation," in Proc. Int. Conf. Learn. Represent., 2020, pp. 1-19. [Online]. Available: https://openreview.net/forum?id=BJlZ5ySKPH
[15] J. Johnson, A. Alahi, and F. F. Li, "Perceptual losses for real-time style transfer and super-resolution," in Proc. Eur. Conf. Comput. Vis., Oct. 2016, pp. 694-711.
[16] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," in Proc. Int. Conf. Learn. Represent., 2018, pp. 1-26. [Online]. Available: https://openreview.net/forum?id=Hk99zCeAb
[17] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo, "StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 8789-8797.
[18] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2242-2251.
[19] M.-Y. Liu, T. Breuel, and J. Kautz, "Unsupervised image-to-image translation networks," in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 700-708.
[20] X. Huang, M.-Y. Liu, S. Belongie, and J. Kautz, "Multimodal unsupervised image-to-image translation," in Proc. Eur. Conf. Comput. Vis. (ECCV), Sep. 2018, pp. 172-189.
[21] A. Ghosh, V. Kulharia, V. Namboodiri, P. H. S. Torr, and P. K. Dokania, "Multi-agent diverse generative adversarial networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 8513-8521.
[22] H. Zhang et al., "StackGAN++: Realistic image synthesis with stacked generative adversarial networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 8, pp. 1947-1962, Aug. 2019.
[23] J. Yu et al., "Toward realistic face photo-sketch synthesis via composition-aided GANs," IEEE Trans. Cybern., Mar. 2020.
[24] R. Yi, Y.-J. Liu, Y.-K. Lai, and P. L. Rosin, "APDrawingGAN: Generating artistic portrait drawings from face photos with hierarchical GANs," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 10743-10752.
[25] M. Zhang, R. Wang, X. Gao, J. Li, and D. Tao, "Dual-transfer face sketch-photo synthesis," IEEE Trans. Image Process., vol. 28, no. 2, pp. 642-657, Feb. 2019.
[26] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770-778.
[27] A. Vaswani et al., "Attention is all you need," in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 5998-6008.
[28] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, "Self-attention generative adversarial networks," in Proc. Int. Conf. Mach. Learn., May 2019, pp. 7354-7363.
[29] F. Gao, J. Yu, S. Zhu, Q. Huang, and Q. Tian, "Blind image quality prediction by exploiting multi-level deep representations," Pattern Recognit., vol. 81, pp. 432-442, Sep. 2018.
[30] X. Huang and S. Belongie, "Arbitrary style transfer in real-time with adaptive instance normalization," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 1510-1519.