
ObjectStitch: Object Compositing with Diffusion Model



Abstract:

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results. Furthermore, annotating training data pairs for compositing requires substantial manual effort from professionals, and is hardly scalable. Thus, with the recent advances in generative models, in this work, we propose a self-supervised framework for object compositing by leveraging the power of conditional diffusion models. Our framework can holistically address the object compositing task in a unified model, transforming the viewpoint, geometry, color and shadow of the generated object while requiring no manual labeling. To preserve the input object's characteristics, we introduce a content adaptor that helps to maintain categorical semantics and object appearance. A data augmentation method is further adopted to improve the fidelity of the generator. Our method outperforms relevant baselines in both realism and faithfulness of the synthesized result images, as shown in a user study on various real-world images.
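The abstract names three ingredients: a conditional diffusion generator, a content adaptor that maps an encoding of the input object into the generator's conditioning space, and self-supervised (noise-prediction) training. The PyTorch sketch below illustrates how such a conditioning path could be wired up; it is a minimal sketch under assumed shapes, not the authors' released code. The adaptor design, the token dimensions, and the `unet(zt, t, cond)` signature are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentAdaptor(nn.Module):
    """Maps tokens from a frozen image encoder into the conditioning
    space of a pretrained diffusion U-Net (all shapes are assumptions)."""
    def __init__(self, in_dim=1024, cond_dim=768, n_tokens=77, n_heads=8):
        super().__init__()
        self.proj = nn.Linear(in_dim, cond_dim)
        # Learned queries cross-attend to the object tokens, producing a
        # fixed-length conditioning sequence for the diffusion model.
        self.queries = nn.Parameter(torch.randn(n_tokens, cond_dim))
        self.attn = nn.MultiheadAttention(cond_dim, n_heads, batch_first=True)

    def forward(self, obj_tokens):                    # (B, N, in_dim)
        kv = self.proj(obj_tokens)                    # (B, N, cond_dim)
        q = self.queries.expand(obj_tokens.size(0), -1, -1)
        cond, _ = self.attn(q, kv, kv)                # (B, n_tokens, cond_dim)
        return cond

def training_step(unet, adaptor, z0, obj_tokens, alphas_cumprod):
    """One self-supervised denoising step: reconstruct the image latent z0
    conditioned only on the adapted object tokens (hypothetical unet API)."""
    B = z0.size(0)
    t = torch.randint(0, len(alphas_cumprod), (B,), device=z0.device)
    noise = torch.randn_like(z0)
    a = alphas_cumprod[t].view(B, 1, 1, 1)
    zt = a.sqrt() * z0 + (1.0 - a).sqrt() * noise     # forward (noising) process
    cond = adaptor(obj_tokens)
    eps_pred = unet(zt, t, cond)                      # eps-prediction U-Net
    return F.mse_loss(eps_pred, noise)

Because the training target is the original image itself, no manually annotated composite pairs are needed, which is the scalability point the abstract makes.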
Date of Conference: 17-24 June 2023
Date Added to IEEE Xplore: 22 August 2023
Conference Location: Vancouver, BC, Canada

1. Introduction

Image compositing is an essential task in image editing that aims to insert an object from a given image into another image in a realistic way. Conventionally, compositing an object into a new scene involves many sub-tasks, including color harmonization [6], [7], [19], [51], relighting [52], and shadow generation [16], [29], [43], in order to naturally blend the object into the new image. As shown in Tab. 1, most previous methods [6], [7], [16], [19], [28], [43] focus on a single sub-task required for image compositing. Consequently, they must be appropriately combined (as sketched after Tab. 1 below) to obtain a composite image in which the input object is re-synthesized with color, lighting and shadows consistent with the background scene. As shown in Fig. 1, results produced in this way still look unnatural, partly because the viewpoint of the inserted object differs from that of the background. Prior works focus on only one or two of these aspects and cannot synthesize novel views; in contrast, our model addresses all of the aspects listed.

Method        Geometry  Light  Shadow  View
ST-GAN [28]      ✓
SSH [19]                   ✓
DCCF [51]                  ✓
SSN [43]                           ✓
SGRNet [16]                        ✓
GCC-GAN [5]      ✓         ✓
Ours             ✓         ✓       ✓      ✓
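To make the combination argument above concrete, the sketch below chains single-task models in the order a conventional pipeline would apply them. The `harmonize`, `relight`, and `cast_shadow` callables are placeholders standing in for separately trained networks such as those cited in Tab. 1; they are not real library functions, and the whole routine is a schematic, not any published method.

def conventional_composite(background, object_img, mask,
                           harmonize, relight, cast_shadow):
    """Naive multi-stage compositing over NumPy-like image arrays:
    cut-and-paste, then run each separately trained sub-task model
    in sequence (all three stage callables are hypothetical)."""
    composite = background.copy()
    composite[mask] = object_img[mask]        # cut-and-paste insertion
    composite = harmonize(composite, mask)    # color harmonization stage
    composite = relight(composite, mask)      # relighting stage
    composite = cast_shadow(composite, mask)  # shadow synthesis stage
    return composite  # viewpoint/geometry mismatch is never corrected

Note that no stage in this chain can change the object's viewpoint or geometry, which is exactly the failure mode Fig. 1 illustrates and which a unified generative model avoids.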


Cites in Papers - IEEE (5)

1.
Jae Hyun Cho, Min Seo Shin, So Hyun Kang, Jung Won Yoon, Tae Hyung Kim, Youn Kyu Lee, "Composition-based Detail Preservation in Pose Transformation Using Diffusion Models", 2024 15th International Conference on Information and Communication Technology Convergence (ICTC), pp.25-29, 2024.
2.
Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra, Svetlana Lazebnik, D.A. Forsyth, Anand Bhattad, "Shadows Don't Lie and Lines Can't Bend! Generative Models Don't know Projective Geometry…for Now", 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.28140-28149, 2024.
3.
Yichen Sheng, Zixun Yu, Lu Ling, Zhiwen Cao, Xuaner Zhang, Xin Lu, Ke Xian, Haiting Lin, Bedrich Benes, "Dr.Bokeh: DiffeRentiable Occlusion-Aware Bokeh Rendering", 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.4515-4525, 2024.
4.
Karran Pandey, Paul Guerrero, Matheus Gadelha, Yannick Hold-Geoffroy, Karan Singh, Niloy J. Mitra, "Diffusion Handles Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D", 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.7695-7704, 2024.
5.
Liu He, Daniel Aliaga, "GlobalMapper: Arbitrary-Shaped Urban Layout Generation", 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp.454-464, 2023.

Cites in Papers - Other Publishers (7)

1.
Dongxu Yue, Maomao Li, Yunfei Liu, Qin Guo, Ailing Zeng, Tianyu Yang, Yu Li, "AddMe: Zero-Shot Group-Photo Synthesis by Inserting People Into Scenes", Computer Vision – ECCV 2024, vol.15078, pp.222, 2025.
2.
Yuming Jiang, Nanxuan Zhao, Qing Liu, Krishna Kumar Singh, Shuai Yang, Chen Change Loy, Ziwei Liu, "GroupDiff: Diffusion-Based Group Portrait Editing", Computer Vision – ECCV 2024, vol.15092, pp.221, 2025.
3.
Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang, Xin Tong, "Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-trained Diffusion Priors", Computer Vision – ECCV 2024, vol.15114, pp.441, 2025.
4.
Liu He, Daniel Aliaga, "COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation", Computer Vision – ECCV 2024, vol.15071, pp.1, 2025.
5.
Daniel Winter, Matan Cohen, Shlomi Fruchter, Yael Pritch, Alex Rav-Acha, Yedid Hoshen, "ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion", Computer Vision – ECCV 2024, vol.15135, pp.112, 2024.
6.
Beomjo Kim, Kyung-Ah Sohn, "Text-Free Diffusion Inpainting Using Reference Images for Enhanced Visual Fidelity", Pattern Recognition Letters, 2024.
7.
Rui Jiang, Guang-Cong Zheng, Teng Li, Tian-Rui Yang, Jing-Dong Wang, Xi Li, "A Survey of Multimodal Controllable Diffusion Models", Journal of Computer Science and Technology, vol.39, no.3, pp.509, 2024.
