
ObjectStitch: Object Compositing with Diffusion Model


Abstract:

Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to produce realistic results. Furthermore, annotating training data pairs for compositing requires substantial manual effort from professionals and is hardly scalable. Thus, with recent advances in generative models, we propose in this work a self-supervised framework for object compositing that leverages the power of conditional diffusion models. Our framework holistically addresses the object compositing task in a unified model, transforming the viewpoint, geometry, color and shadow of the generated object while requiring no manual labeling. To preserve the input object's characteristics, we introduce a content adaptor that helps maintain categorical semantics and object appearance. A data augmentation method is further adopted to improve the fidelity of the generator. In a user study on various real-world images, our method outperforms relevant baselines in both the realism and the faithfulness of the synthesized results.
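The abstract describes a content adaptor that maps the input object's appearance into conditioning signals for the diffusion model. Below is a minimal sketch of what such a module could look like; the class name, dimensions, token count and MLP design are illustrative assumptions, not the paper's actual implementation.

import torch
import torch.nn as nn

class ContentAdaptor(nn.Module):
    """Hypothetical adaptor: maps a single object embedding (e.g. from a
    frozen image encoder) to a sequence of conditioning tokens that a
    conditional diffusion model could consume via cross-attention."""
    def __init__(self, embed_dim=1024, cond_dim=768, num_tokens=8):
        super().__init__()
        self.num_tokens = num_tokens
        self.proj = nn.Sequential(
            nn.Linear(embed_dim, cond_dim * num_tokens),
            nn.GELU(),
        )
        self.norm = nn.LayerNorm(cond_dim)

    def forward(self, obj_embed):                # (B, embed_dim)
        b = obj_embed.shape[0]
        tokens = self.proj(obj_embed).view(b, self.num_tokens, -1)
        return self.norm(tokens)                 # (B, num_tokens, cond_dim)

# Toy usage with a stand-in embedding; in practice the tokens would
# replace or augment the text conditioning of a pretrained model.
adaptor = ContentAdaptor()
obj_embed = torch.randn(2, 1024)
print(adaptor(obj_embed).shape)                  # torch.Size([2, 8, 768])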
Date of Conference: 17-24 June 2023
Date Added to IEEE Xplore: 22 August 2023
Conference Location: Vancouver, BC, Canada


1. Introduction

Image compositing is an essential task in image editing that aims to insert an object from a given image into another image in a realistic way. Conventionally, compositing an object into a new scene involves many sub-tasks, including color harmonization [6], [7], [19], [51], relighting [52], and shadow generation [16], [29], [43], in order to naturally blend the object into the new image. As shown in Tab. 1, most previous methods [6], [7], [16], [19], [28], [43] focus on a single sub-task required for image compositing. Consequently, they must be appropriately combined to obtain a composite image in which the input object is re-synthesized with color, lighting and shadow consistent with the background scene. As shown in Fig. 1, results produced in this way still look unnatural, partly because the viewpoint of the inserted object differs from that of the background. Prior works focus on only one or two aspects of object compositing and cannot synthesize novel views. In contrast, our model addresses all of these aspects, as listed in Tab. 1.

Table 1. Sub-tasks of object compositing handled by each method.

Method         Geometry   Light   Shadow   View
ST-GAN [28]       ✓
SSH [19]                    ✓
DCCF [51]                   ✓
SSN [43]                             ✓
SGRNet [16]                          ✓
GCC-GAN [5]       ✓         ✓
Ours              ✓         ✓        ✓       ✓
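To make the contrast in Tab. 1 concrete, the sketch below shows the conventional sequential pipeline the table argues against: each sub-task is handled by a separate model, and the object's viewpoint is never corrected. All function names and signatures are illustrative placeholders, not the cited methods' actual interfaces, and the harmonization and shadow stages are stubbed out.

import numpy as np

def paste(bg, obj, mask):
    """Naive cut-and-paste: copy object pixels onto the background."""
    return np.where(mask[..., None] > 0, obj, bg)

def harmonize(img, mask):
    """Color harmonization stage (e.g. SSH, DCCF); identity placeholder."""
    return img

def add_shadow(img, mask):
    """Shadow generation stage (e.g. SSN, SGRNet); identity placeholder."""
    return img

def conventional_composite(bg, obj, mask):
    """Sequential pipeline: one dedicated model per sub-task, applied in
    order. Nothing in this chain can change the object's viewpoint."""
    out = paste(bg, obj, mask)
    out = harmonize(out, mask)
    out = add_shadow(out, mask)
    return out

# Toy usage on synthetic arrays.
bg = np.zeros((64, 64, 3), dtype=np.float32)
obj = np.ones((64, 64, 3), dtype=np.float32)
mask = np.zeros((64, 64), dtype=np.float32)
mask[16:48, 16:48] = 1.0
print(conventional_composite(bg, obj, mask).sum())  # 32*32*3 = 3072.0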
