
Semantic Layout Manipulation With High-Resolution Sparse Attention

Abstract:

We tackle the problem of semantic image layout manipulation, which aims to manipulate an input image by editing its semantic label map. A core problem of this task is how to transfer visual details from the input image to the new semantic layout while keeping the resulting image visually realistic. Recent work on learning cross-domain correspondence has shown promising results for global layout transfer with dense attention-based warping. However, this method tends to lose texture details because of its resolution limitation and the lack of a smoothness constraint on the correspondence. To adapt this paradigm to the layout manipulation task, we propose a high-resolution sparse attention module that effectively transfers visual details to new layouts at resolutions up to 512×512. To further improve visual quality, we introduce a novel generator architecture consisting of a semantic encoder and a two-stage decoder for coarse-to-fine synthesis. Experiments on the ADE20k and Places365 datasets demonstrate that our proposed approach achieves substantial improvements over existing inpainting and layout manipulation methods.

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 45, Issue: 3, 01 March 2023)

Page(s): 3768 - 3782

Date of Publication: 13 June 2022

PubMed ID: 35696464

DOI: 10.1109/TPAMI.2022.3181587

Contents

1 Introduction

Semantic layout manipulation refers to the task of editing an image by modifying its semantic label map, i.e., changing the semantic layout or inserting/erasing objects as illustrated in Fig. 1. It has many practical image editing applications such as photo-editing [1], image retargeting [2], restoration [3], composition [4] and image melding [5], but is relatively under-explored due to the challenges of predicting complex, non-rigid spatial deformations and the domain gap between the input image and the target semantic layout. Essentially, developing an effective method to transfer visual patterns from the input image to the target semantic layout is the key to solving the problem.
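The dense attention-based warping mentioned in the abstract can be illustrated with a small, generic sketch. This excerpt does not describe the paper's own sparse attention module, so the code below is not the authors' method: it shows plain similarity-based attention warping, plus a simple top-k sparsification that is an assumption made here for illustration. The function name `attention_warp`, the `temperature` and `top_k` parameters, and the toy features are all hypothetical.

```python
import numpy as np

def attention_warp(target_feat, source_feat, source_pixels,
                   temperature=0.01, top_k=None):
    """Transfer source values to target positions via feature attention.

    target_feat:   (Nt, C) features for the edited layout (hypothetical)
    source_feat:   (Ns, C) features for the input image (hypothetical)
    source_pixels: (Ns, D) values to transfer, e.g. RGB
    top_k:         if set, keep only the k best matches per target position
                   (an illustrative sparsification, not the paper's module)
    """
    # Cosine similarity between every target and every source position.
    t = target_feat / (np.linalg.norm(target_feat, axis=1, keepdims=True) + 1e-8)
    s = source_feat / (np.linalg.norm(source_feat, axis=1, keepdims=True) + 1e-8)
    sim = t @ s.T  # shape (Nt, Ns)

    if top_k is not None and top_k < sim.shape[1]:
        # Mask entries below each row's k-th largest similarity
        # (ties at the threshold are all kept).
        kth = np.partition(sim, -top_k, axis=1)[:, -top_k][:, None]
        sim = np.where(sim >= kth, sim, -np.inf)

    # Numerically stable softmax over source positions.
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each target position receives a weighted average of source values.
    return weights @ source_pixels, weights

# Toy usage: the target features are the source features in reverse order,
# so each target position should attend mostly to its mirrored source position.
rng = np.random.default_rng(0)
source_feat = rng.normal(size=(16, 8))
source_pixels = rng.uniform(size=(16, 3))
target_feat = source_feat[::-1].copy()

dense, w_dense = attention_warp(target_feat, source_feat, source_pixels)
sparse, w_sparse = attention_warp(target_feat, source_feat, source_pixels, top_k=4)
```

The dense variant stores an Nt × Ns weight matrix, which grows quadratically with the number of positions; restricting each target position to a few candidate matches is one common way to reduce that cost, though how the paper achieves this at 512×512 is not stated in this excerpt.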

Cites in Papers - IEEE (5)

Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Qing Liu, Sohrab Amirghodsi, Yuqian Zhou, Jiebo Luo, "Structure-Guided Image Completion With Image-Level and Object-Level Semantic Discriminators", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.46, no.12, pp.7669-7681, 2024.

Meng Ye, Mikael Kanski, Dong Yang, Leon Axel, Dimitris Metaxas, "Unsupervised Exemplar-Based Image-to-Image Translation and Cascaded Vision Transformers for Tagged and Untagged Cardiac Cine MRI Registration", 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp.7629-7639, 2024.

Zuopeng Yang, Tianshu Chu, Xin Lin, Erdun Gao, Daqing Liu, Jie Yang, Chaoyue Wang, "Eliminating Contextual Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion", IEEE Transactions on Circuits and Systems for Video Technology, vol.34, no.2, pp.1316-1320, 2024.

Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Lingjie Liu, Adam Kortylewski, Christian Theobalt, Eric Xing, "Multimodal Image Synthesis and Editing: The Generative AI Era", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.45, no.12, pp.15098-15119, 2023.

Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Changgong Zhang, "Marginal Contrastive Correspondence for Guided Image Generation", 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.10653-10662, 2022.

Cites in Papers - Other Publishers (1)

Difan Liu, Sandesh Shetty, Tobias Hinz, Matthew Fisher, Richard Zhang, Taesung Park, Evangelos Kalogerakis, "ASSET", ACM Transactions on Graphics, vol.41, no.4, pp.1, 2022.
