I. Introduction
Image fusion combines multiple images into a single image that is more precise than any of the original images. One aim of image fusion is to reduce the amount of data while making the result more interpretable for both human and machine perception. In multi-focus image fusion, the single fused output image is constructed from multiple input images, each of which is focused on one part of the scene while the other parts are blurred [1], [2]. Technically speaking, because the optical lenses of CMOS/CCD cameras have a limited depth of focus, recording an image in which every component is distinct is challenging [3]. To create a sharp image, every part of the scene must lie within the depth of focus. However, this is usually not practical: only the fraction of the image that lies within the depth of focus appears sharp, and the rest is blurred [4], [5]. In Visual Sensor Networks (VSNs), multiple cameras with different depths of focus make it possible to capture several images of the same scene [5]–[8].