1. Introduction
Human-object interaction, in which one or more people interact with objects, is a common scenario in everyday life [1]. In recent years, novel view synthesis (NVS) for human-object interaction has received much attention. NVS generates a synthetic image of a three-dimensional scene as seen from a virtual viewpoint placed at an arbitrarily selected position. NVS in scenes of human-object interaction is necessary for high-level vision tasks such as action analysis, visual scene answering, and video understanding. The main challenges are the complex interaction patterns and the severe occlusions between people and objects [2].