Introduction
The capture and display of 3-D images have been topics of great interest over the past two centuries. In 1838, Wheatstone was the first to tackle the challenge of displaying 3-D images, through the first stereoscope [1]. Later, Rollmann addressed the same problem of displaying stereoscopic pairs, but proposed the use of anaglyphs [2]. This method is still widely used but reproduces colors poorly. More recent are the use of liquid-crystal shutter glasses and the use of polarizing glasses to induce binocular disparity [3], [4]. The main problem of any stereoscopic technique is that it does not produce a 3-D reconstruction of 3-D scenes. Instead, it displays a pair of images which, when projected onto the retinas of the observer, allow the brain to fuse them, producing the sensation of perspective vision and depth discrimination. But in this process a conflict between convergence and accommodation occurs [5]: the accommodation of the eye lens is fixed at one distance, that of the screen, whereas the convergence of the eyes' axes is set to a different distance, that of the perceived 3-D image. This is a strongly unnatural physiological process that may give rise to visual discomfort after prolonged observation.
The first scientist to propose a method for displaying 3-D images that can be observed without special glasses was G. Lippmann, who proposed integral photography (IP). In 1908, Lippmann [6] postulated the possibility of capturing the 3-D information of 3-D scenes with a microlens array (MLA) whose flat side was coated with photographic film. This device permitted the capture of a collection of elemental images, each with a different perspective of the 3-D object. Lippmann's idea was to use these images for the display of 3-D scenes. Specifically, he proposed to paste the positive of the developed image onto the flat face of the array and illuminate it through a diffuser. Then, any point on the positive image generates a beam of parallel rays. As a result of the integration of these parallel beams, a light distribution similar to the original 3-D scene is reconstructed, which can be observed from a wide range of angles. One of the initial problems faced by IP was pseudoscopy: the reconstructed images were reversed in depth. In 1931, Ives [7] proposed a solution to the pseudoscopy problem based on a two-step capture process.
The system proposed by Lippmann for the capture had some essential problems. One was the lack of flexibility, since the emulsion was pasted to the microlens array. The other was the overlapping of elemental images in the case of wide scenes. To solve these problems, in 1936, Coffey [8] proposed the use of a field lens of large diameter to form the image of the scene on the microlens array conceived by Lippmann. This made it possible to record the perspectives of a far scene, to avoid the overlapping between images, and, mainly, to apply Lippmann's concept in a camera very similar to a conventional photographic camera. Note that since the images obtained with the Coffey camera are much smaller than in the original Lippmann scheme, they are usually called microimages. The Coffey camera was refined by Davies et al. [9] many years later.
Due to the difficulty of manufacturing high-quality microlens arrays, and also to the lack of flexibility of photographic emulsion technology, interest in the IP concept lay dormant for decades. However, thanks to advances in optoelectronic sensors such as CMOS and CCD devices, display devices such as LCDs, and commercially available digital computers, interest in IP was revived during the last quarter of the past century, when it was renamed integral imaging (InI). It is remarkable, in this context, that in 1997, Okano et al. [10] captured, for the first time, integral images at real-time video rates. There were a number of proposals [11] for transmission and real-time visualization. In particular, the use of a multicamera system arranged in matrix form for the capture, and of an MLA placed in front of a high-resolution screen for the visualization, was noteworthy.
In a slightly different context, in 1991 Adelson and Bergen defined the plenoptic function, which describes the radiance of any luminous ray in space as a function of angle and position [12]. The plenoptic function helped to create the first plenoptic camera [13], which was in fact an update of the camera designed by Coffey, now formulated in terms of the plenoptic function. In 2005, R. Ng et al. [14] reported the first portable plenoptic camera, which is now commercialized under the name of the Lytro camera [15].
In this paper, we present the concept of integral imaging for capturing the spatio-angular information of the rays emitted by 3-D scenes, and we review the recent advances in the capture of this radiance map and, mainly, the advances in the display of 3-D images.
The Capture of the Radiance Map
Consider a self-luminous 3-D scene or, more generally, a 3-D scene composed of diffusing objects that are homogeneously illuminated by spatially incoherent, polychromatic light. If the objects can be considered, to a good approximation, as Lambertian diffusers, we can understand the scene as a continuous distribution of point sources that emit light rays isotropically. In this sense, it is very interesting to build a device with the capacity of recording the spatio-angular information of all the rays emitted by the scene. Such a device would capture all the information necessary to reproduce the 3-D structure of the scene. The magnitude that allows an accurate description of this spatio-angular information is the radiance, defined as the radiant flux per unit area and unit solid angle. Leaving aside the chromatic information, the radiance can be described by a 5-D function (three dimensions for the spatial coordinates and two for the angular ones). Note, however, that in any imaging process there is always a principal direction of propagation, which usually defines the optical axis. In that case one dimension can be saved, and the radiance can be described by a 4-D function.
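To fix notation (ours, chosen to match common light-field usage rather than any specific cited paper), the radiance along the rays crossing a reference plane transverse to the optical axis can be written as

$$ L = L(x, y, \theta, \varphi), $$

where $(x, y)$ are the coordinates of the point at which the ray crosses the plane and $(\theta, \varphi)$ are the two angles that fix its direction. An equivalent and widely used alternative is the two-plane parameterization $L(x, y, u, v)$, in which each ray is labeled by its intersections with two parallel reference planes.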
The most common device for capturing the light emitted by 3-D scenes is the photographic camera. Note, however, that the camera does not record the angular information of the rays emitted by the scene, since each pixel of the sensor integrates the irradiance of all the rays impinging on it. In other words, the 2-D sensor registers 2-D pictures of 3-D scenes, that is, 2-D irradiance distributions, which are the result of performing the Abel transform [16] of the 4-D radiance, or plenoptic, function.
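In the notation introduced above, and up to constant and obliquity factors, the irradiance recorded at a sensor point is the angular projection of the radiance,

$$ E(x, y) = \iint_{\Omega} L(x, y, \theta, \varphi)\, d\theta\, d\varphi, $$

where $\Omega$ is the angular domain accepted by the camera aperture. All the angular structure of $L$ is integrated out, which is why a conventional photograph cannot be refocused, or viewed from a different perspective, after the capture.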
A very smart method for recording the radiance map emitted by a 3-D scene was proposed by Lippmann. To understand Lippmann's proposal in the current technological context, we can assume that a high-resolution optoelectronic matrix sensor, such as a CCD, substitutes for the original photographic film. To simplify the discussion, we also assume a geometrical-optics model for the light propagation and pinhole-like behavior for the microlenses. According to this model, the CCD records a 2-D array of elemental images, each showing a different perspective of the 3-D object; see Fig. 1. Each elemental image stores the radiance of a collection of rays passing through the same spatial point (the center of the microlens) but with different slopes. The slope angle is determined by the relative position of each pixel within the elemental image. The recorded integral image is then nothing but a sampled version of the radiance map emitted by the object. The sampling period in the spatial direction equals the pitch of the microlens array, while the sampling period in the angular direction is determined by the microlens focal length and the pixel pitch. In the past few years, apart from the direct use of a microlens array [17], the capture of elemental images has been performed by building large arrays of digital cameras [18], by the synthetic-aperture method [19], or by the integration of small digital cameras [20].
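The sampling geometry can be made explicit with a short sketch. The following Python fragment is a minimal illustration under our own assumptions (pinhole microlenses, small angles; the names p, f, and dp are ours, not taken from the cited papers); it maps one pixel of one elemental image to one spatio-angular sample of the radiance:

```python
import numpy as np

# Sketch of the sampling performed by a Lippmann (integral-imaging) camera.
# Assumed, illustrative values:
p = 1.0e-3        # microlens pitch [m] -> spatial sampling period
f = 3.0e-3        # microlens focal length [m]
dp = 10.0e-6      # sensor pixel pitch [m]

def ray_sample(s, t, u, v):
    """Spatio-angular coordinates of the ray recorded by pixel (u, v)
    of the elemental image behind microlens (s, t).

    (s, t): microlens indices -> spatial position of the sample
    (u, v): pixel indices measured from the center of the elemental image
    """
    x, y = s * p, t * p                # sample position: the microlens center
    theta = np.arctan(u * dp / f)      # angular sample set by pixel pitch and f
    phi = np.arctan(v * dp / f)
    return x, y, theta, phi
```

As stated above, the spatial sampling period is the microlens pitch p, while the angular sampling period, approximately dp/f, is set jointly by the pixel pitch and the microlens focal length.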
The main drawback of the Lippmann scheme is that it is not easy to build a portable integral-imaging camera with reasonable parallax that is useful for both near and far scenes. This problem has been resolved in part with the plenoptic camera, which follows the Coffey concept. As stated above, a conventional camera can be converted into a plenoptic camera by simply inserting an MLA at the back focal plane of the camera lens; see Fig. 2. In order to avoid overlapping between microimages and to maximize their effective size, the f-number of the camera lens and that of the microlenses should be equal. In this case, too, a sampled version of the radiance map is captured; now, however, it is the map imaged through the camera lens. In plenoptic cameras, the array is composed of thousands of microlenses of small pitch and focal length, so that the number of pixels behind any microlens is small. It has been shown that from the radiance map captured with the plenoptic camera it is possible to calculate the radiance map at the camera lens through a simple transposition. In other words, the map captured with the plenoptic camera is the same as the one that could be captured with an array of digital cameras placed at the aperture of the camera lens. This provides the plenoptic camera with its fundamental feature: it permits the capture, in a single shot and with a single CCD, of 3-D scenes with the parallax corresponding to the size of the camera lens. In the past few years this concept has been applied to the capture of static and dynamic scenes with increasing resolution and parallax [21], [22], and also to the capture of 3-D images for industrial [23] and medical applications [24], [25]. Another application of increasing interest is microscopy. The insertion of an MLA at the back focal plane of the tube lens, and the subsequent axial shift of the CCD, makes it possible to build an integral microscope (iMic) for the capture and display of 3-D images of microscopic specimens [26]–[29].
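The transposition mentioned above amounts to swapping the roles of the two index pairs of the sampled radiance map. A minimal sketch, with an array layout assumed by us:

```python
import numpy as np

def transpose_radiance(microimages):
    """Convert a plenoptic capture into the equivalent multi-view capture.

    microimages: 4-D array of shape (S, T, U, V), where (S, T) indexes the
    microimage (spatial sample at the sensor plane) and (U, V) the pixel
    within it (angular sample, i.e., position on the camera-lens aperture).

    The transposed array, of shape (U, V, S, T), collects a U x V grid of
    views: view (u, v) is the image that a small camera placed at point
    (u, v) of the camera-lens aperture would have captured.
    """
    return np.transpose(microimages, (2, 3, 0, 1))
```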
The Reconstruction of the 3-D Scene
Although originally intended by Lippmann for the display of 3-D images, one of the most important features of integral imaging is the possibility of calculating the irradiance distribution at different depths within the 3-D scene. Several essentially similar algorithms for this calculation have been reported. Perhaps the most intuitive was reported in [30], where the irradiance at any point is calculated by projecting that point through the centers of the microlenses and finding the pixels crossed by the projecting lines. The irradiance is then calculated by summing the values of these pixels. The number of pixels at any depth and the number of depth planes can be changed at will. Based on the same concept, but less time consuming, is the back-projection algorithm [31]. In this algorithm the pixels of each elemental image are projected through the center of the corresponding microlens. Then, at a given depth, a collection of magnified and laterally shifted elemental images appears. In planes where the pixels of neighboring elemental images match, the algorithm runs very efficiently and the resolution of the reconstructed image is the same as that of the elemental images. In planes where the projected pixels do not match, the algorithm is slower, since it has to evaluate the contributions to the pixels of the reconstructed images. However, the resolution in these planes is increased, sometimes even doubled [32].
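As an illustration of the back-projection idea, here is a deliberately simplified shift-and-sum variant (our own sketch, restricted to integer pixel shifts and ignoring the magnification of the projected elemental images, which the full algorithm of [31] handles):

```python
import numpy as np

def reconstruct_plane(elemental, shift):
    """Simplified back-projection reconstruction at one depth plane.

    elemental: array of shape (S, T, H, W) with the S x T elemental images.
    shift: integer pixel shift between neighboring elemental images that
           corresponds to the chosen depth (shift = 0 reconstructs the
           plane where all projected elemental images coincide).
    """
    S, T, H, W = elemental.shape
    acc = np.zeros((H, W), dtype=np.float64)
    for s in range(S):
        for t in range(T):
            # Shift each elemental image in proportion to its index and
            # accumulate; np.roll keeps the sketch short, at the price of
            # wrap-around artifacts near the borders.
            acc += np.roll(elemental[s, t],
                           (-(s - S // 2) * shift, -(t - T // 2) * shift),
                           axis=(0, 1))
    return acc / (S * T)
```

Points lying on the depth plane selected by the shift add up consistently and appear sharp, whereas points at other depths are averaged over displaced copies and appear blurred.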
A different approach has been used for the reconstruction of 3-D scenes when the radiance map is captured with a plenoptic camera. In contrast with the integral-imaging camera, the captured frame is now composed of many microimages with few pixels each. The most efficient algorithm, in terms of computation time, was reported by Georgiev and Lumsdaine [33]. In this algorithm the reconstructed image is built by grouping central patches of all the microimages. Although very fast, this procedure has the drawbacks that the resolution depends strongly on the depth distance and that defocused images show noticeable artifacts. Other procedures make use of the propagation properties of radiance maps and calculate the reconstructed images by shearing the map and then performing the Abel transform [14]. This last procedure is sped up by use of the Fourier-slice theorem [34]. These algorithms have been refined with the aim of executing them at video-rate speed [35]. Note that the radiance maps captured with an integral-imaging camera and with a plenoptic camera carry essentially the same information, since one can be obtained from the other through a simple transposition. This means that, in fact, all the above algorithms are somehow equivalent and can be applied to the radiance map captured with either device [36].
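The patch-grouping idea can also be sketched in a few lines (again a minimal illustration of ours, not the actual implementation of [33]): the rendered image is obtained by tiling the central patch of every microimage, with the patch size playing the role of the focusing parameter:

```python
import numpy as np

def render_patches(microimages, patch):
    """Sketch of patch-based rendering from a plenoptic capture.

    microimages: 4-D array of shape (S, T, U, V).
    patch: odd size of the central patch taken from every microimage; the
           chosen size selects the rendered depth plane, which is why the
           resolution of this method depends on depth.
    """
    S, T, U, V = microimages.shape
    c0, c1 = U // 2, V // 2
    h = patch // 2
    # Extract the central patch of every microimage and tile the patches
    # into one output image of size (S * patch) x (T * patch).
    tiles = microimages[:, :, c0 - h:c0 + h + 1, c1 - h:c1 + h + 1]
    return tiles.transpose(0, 2, 1, 3).reshape(S * patch, T * patch)
```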
The Display of 3-D Scenes
Following the original Lippmann concept, the display of 3-D images is possible provided that one projects the microimages onto a high-resolution display covered by an array of microlenses. The initial conditions for this arrangement are that the number of microlenses, and their f-number, are the same as those used in the capture, and that the luminous pixels lie at the front focal plane of the microlenses. In the display process every pixel, and even every sub-pixel, produces a narrow, collimated light beam when passing through the corresponding microlens. All the narrow beams intersect in front of the monitor (or virtually behind the monitor), so that they create a 3-D irradiance distribution similar to the original 3-D scene. In contrast to the case of stereoscopic or auto-stereoscopic monitors, the eye does not receive a pair of images that has to be fused by the brain. Instead, these integral-imaging monitors create a real (or virtual) 3-D structure of light that can be seen by the observer with continuous, full parallax and without the conflict between convergence and accommodation. Based on these features, the integral-imaging concept seems to provide an exciting possibility for the mass production and commercialization of 3-D monitors. This is not yet the case, since there are still problems to be solved. The first problem is the lateral resolution of the displayed 3-D image, which is fixed by the MLA pitch. Another problem is the viewing angle of the monitor. The solution to these problems requires the production of MLAs with very small pitch, and therefore composed of millions of lenses, and also with small f-number. The mass production of such lenses with an acceptable level of aberrations is still a challenge.
Another issue is the spatial resolution of the flat display that is covered by the MLA: in order to have smooth transitions when observing the 3-D image, at least 16 pixels should lie behind any microlens, which demands flat displays with very high pixel density.
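A rough paraxial estimate (a standard geometrical argument, not a result quoted from a specific reference) makes the trade-off explicit: a microlens of pitch $p$ placed at its focal distance $f$ from the pixels emits light within a viewing angle

$$ \alpha = 2 \arctan\!\left( \frac{p}{2f} \right), $$

so a small f-number $f/p$ is needed for a wide viewing angle, while a small pitch $p$ is needed for good lateral resolution; hence the demand, stated above, for MLAs with both small pitch and small f-number.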
Although there are still many challenges to face, the advances achieved in the past few years are remarkable. Among them is the design of methods for the production of orthoscopic images [37], [38]. Important advances have also been achieved in increasing the viewing angle [39]–[41]. An interesting issue that has been addressed satisfactorily is the design of algorithms that permit, from a captured collection of elemental images, the calculation of the collection of microimages ready to be projected onto the integral-imaging monitor. Here we highlight two algorithms, which permit the calculation of the microimages while selecting at will the number of microimages or the size and position of the displayed 3-D scene [42], [43]. Other noteworthy advances are 1) dynamic integral imaging, which employs spatial light modulators such as LCDs to improve display performance [39], [44]; 2) 3-D integral-imaging endoscopy [25]; 3) 3-D integral imaging using QR codes [45]; 4) 3-D integral imaging using flexible sensors placed on nonplanar surfaces [46]; and 5) augmented-reality see-through viewing devices with 3-D integral imaging [47]. Finally, it is worth mentioning that some companies are already producing displays based on the integral-imaging concept. One is Holografika [48], which has produced a monitor that displays 3-D images with a wide horizontal viewing angle. Holografika technology does not use an array of microlenses, but a vertical diffuser screen that is illuminated by a horizontal array of micro-projectors. Another company is Real-Eyes [49], which manufactures large posters covered with more than 250 000 lenses per m².
Conclusion
In this paper, we have reviewed the principal characteristics of integral imaging, a technique that is specially adapted for the capture of 3-D information of near and far, and large and small, 3-D scenes [50]. Although light propagation is essentially a wave phenomenon, integral-imaging technology can be described very accurately in terms of ray propagation. Apart from many other interesting applications, such as the calculation of depth maps, depth images, or synthetic views, integral imaging is specifically adapted to the display of 3-D, color, spatially incoherent images to audiences of more than one person. We have reviewed the main limitations of integral imaging and also the main recent advances. We conclude that while integral imaging requires more technological maturity for mass production, the outlook is very promising for a prominent role in the 3-D field, including healthcare applications [39]–[41], [51], and security and surveillance applications [52]. Additional review material can be found in [53] and [54].