1. Introduction
Novel view synthesis is a classical problem in computer vision that aims to produce photorealistic images from unseen viewpoints [2], [5], [10], [36], [40]. Recently, Neural Radiance Fields (NeRF) [25] achieved novel view synthesis by modeling a scene continuously with a neural network, and quickly attracted widespread attention due to its impressive results. However, the vanilla NeRF is designed to fit the continuous 5D radiance field of a given scene, so it often fails to generalize to new scenes and datasets. Improving the generalization ability of neural scene representations therefore remains a challenging problem.