Abstract:
Light field cameras capture the intensity of light rays coming from multiple directions, thus allowing a set of 2D images, named sub-aperture (SA) images, to be rendered. These images correspond to observations of the scene from slightly different angles. The rich spatio-angular information obtained with these cameras is exploited in this paper, for the first time, in the context of facial emotion recognition. A deep learning spatio-angular fusion framework is adopted that models both the intra-view/spatial and the inter-view/angular information, using a VGG-16 convolutional neural network and a long short-term memory (LSTM) recurrent network. The proposed solution creates two view sequences, horizontal and vertical, from selected SA images, for which VGG-Face descriptions are extracted. The resulting descriptions are fed to two LSTM networks, which independently learn horizontal and vertical classification models. The softmax classifier scores obtained for the horizontal and vertical descriptors are then fused to obtain the final emotion recognition labels. A comprehensive set of experiments has been conducted on the IST-EURECOM light field face database using two assessment protocols. The adopted framework achieves superior emotion recognition performance compared with state-of-the-art benchmarking methods.
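To make the described pipeline concrete, below is a minimal PyTorch sketch of the spatio-angular fusion scheme. All names are illustrative: the 4096-D feature size corresponds to a typical VGG-Face fc7 descriptor, the number of SA views per sequence, the LSTM hidden size, the emotion class count, and the score-averaging fusion rule are assumptions, since the abstract states only that softmax scores are fused without fixing these details.

```python
import torch
import torch.nn as nn

class ViewSequenceLSTM(nn.Module):
    """Classifies one view sequence (horizontal or vertical) of
    per-SA-image descriptors with an LSTM followed by softmax."""
    def __init__(self, feat_dim=4096, hidden_dim=256, num_emotions=7):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_emotions)

    def forward(self, seq):              # seq: (batch, n_views, feat_dim)
        _, (h_last, _) = self.lstm(seq)  # final hidden state summarizes the view sequence
        return torch.softmax(self.fc(h_last[-1]), dim=-1)

def fuse_emotion_scores(horiz_feats, vert_feats, lstm_h, lstm_v):
    """Fuses the softmax scores of the horizontal and vertical LSTM
    models. Equal-weight score averaging is assumed here; the paper's
    exact fusion rule is not specified in the abstract."""
    scores = 0.5 * lstm_h(horiz_feats) + 0.5 * lstm_v(vert_feats)
    return scores.argmax(dim=-1)         # predicted emotion labels

# Hypothetical usage: in practice the descriptors would come from a
# pretrained VGG-Face network applied to each selected SA image.
lstm_h, lstm_v = ViewSequenceLSTM(), ViewSequenceLSTM()
horiz = torch.randn(2, 9, 4096)  # 2 light fields, 9 horizontal SA views each
vert = torch.randn(2, 9, 4096)   # 9 vertical SA views each
labels = fuse_emotion_scores(horiz, vert, lstm_h, lstm_v)
```

Training the two LSTMs independently, as the abstract describes, keeps the horizontal and vertical angular models decoupled; fusion then happens only at the score level.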
Published in: 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII)
Date of Conference: 03-06 September 2019
Date Added to IEEE Xplore: 09 December 2019