
Facial expression recognition using face-regions



Abstract:

This paper proposes a facial expression recognition method based on a novel facial decomposition. First, seven regions of interest (ROI), representing the main components of the face (left eyebrow, right eyebrow, left eye, right eye, between eyebrows, nose, and mouth), are extracted using facial landmarks detected by the IntraFace algorithm. Then, different local descriptors, such as LBP, CLBP, LTP and Dynamic LTP, are used to extract features. Finally, the feature vector representing the face image is fed into a multiclass support vector machine to perform the recognition task. Experimental results on two public datasets show that the proposed method outperforms state-of-the-art methods based on other facial decompositions.
Date of Conference: 22-24 May 2017
Date Added to IEEE Xplore: 23 October 2017
Conference Location: Fez, Morocco

I. Introduction

The face is one of the most important means of human communication and plays a central role in all social interactions. Facial expressions are non-verbal cues to emotions. Indeed, some facial muscles are specifically associated with certain emotional states and allow, according to Ekman [8], the expression of the primary emotions (Sadness, Anger, Fear, Joy, Disgust and Surprise). These external signals express the internal emotional state of an individual, and therefore their intentions. In fact, according to Mehrabian [18], 7% of communication relies on verbal interaction, 38% on the tone and sound of the voice, and 55% on gestures and facial expressions. Automatic recognition of facial expressions is an interesting problem that finds applications in several fields such as eLearning and affective computing [20], [15], [21].

When designing an automatic facial expression recognition system, three problems are considered: face detection, facial feature extraction, and classification of expressions. First, face acquisition is a processing stage that automatically locates the face region in the input images. The next step is to extract and represent the facial changes caused by facial expressions. Finally, the classification task infers the facial expressions.

According to the type of feature extraction used, facial expression recognition methods can be divided into appearance-based and geometric-based methods. For instance, the methods addressed in [23], [3], [1] extract textures from the image to characterize facial appearance changes, while the methods in [14], [24] measure geometric displacements, distances and angles between facial landmarks to extract geometric information.

Recently, several works based on appearance features have used local descriptors such as LBP [23], [9] and its variants LTP [9], MTP [2] and CLBP [1]. Similarly, HOG [3], PCA [17], LDA and Wavelets [16] were used for appearance-based feature extraction. Shan et al. [23] carried out extensive experiments in which they evaluated LBP features with different classification techniques and showed that LBP features are effective and efficient for facial expression recognition, even in low-resolution video sequences. Gritti et al. [9] also extensively investigated different local features, such as LBP, LTP, HOG and Gabor, and demonstrated that LBP outperforms all the tested features. In [3], Carcagni et al. thoroughly studied HOG parameter settings (cell size and number of orientation bins) and concluded that a proper choice of HOG parameters can make the HOG descriptor one of the most powerful techniques for facial expression recognition. Ahmed et al. [1] proposed an LBP variant called Compound Local Binary Pattern (CLBP) that combines the magnitude information of the difference between two gray values with the basic LBP. Another local texture pattern, called Median Ternary Pattern (MTP), was proposed in [2]; it combines the advantages of the median filter with the quantization of gray-scale values into 3-valued codes.
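To make the descriptors discussed above concrete, the following is a minimal sketch of basic LBP and LTP coding on a grayscale region, written in plain NumPy. It is illustrative only: the neighbour ordering, the fixed LTP threshold of 5 and the 256-bin histogram are assumptions made for the sketch, not the exact settings used in the cited works or in this paper.

```python
import numpy as np

def lbp_code(patch):
    """Basic 8-neighbour LBP code for a 3x3 grayscale patch (centre at patch[1, 1])."""
    center = patch[1, 1]
    # Neighbours taken clockwise from the top-left corner.
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code

def ltp_code(patch, threshold=5):
    """Basic LTP: neighbours are quantised to {-1, 0, +1} within a tolerance
    `threshold` around the centre, and the ternary pattern is split into an
    'upper' and a 'lower' binary code, as is commonly done for LTP."""
    center = int(patch[1, 1])
    neighbours = np.array([patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                           patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]], dtype=int)
    diff = neighbours - center
    upper = diff >= threshold    # +1 components
    lower = diff <= -threshold   # -1 components
    to_code = lambda bits: sum(int(b) << i for i, b in enumerate(bits))
    return to_code(upper), to_code(lower)

def lbp_histogram(gray, n_bins=256):
    """Normalised LBP histogram of a grayscale region (2-D uint8 array)."""
    h, w = gray.shape
    hist = np.zeros(n_bins, dtype=float)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            hist[lbp_code(gray[i - 1:i + 2, j - 1:j + 2])] += 1.0
    return hist / max(hist.sum(), 1.0)
```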
Geometric-based approaches use facial landmarks to represent the whole face shape. In many research works [10], [24], the movements and positions of facial landmarks are computed to extract geometric information. In [10], the authors detect facial expressions by using Active Appearance Models (AAM) to extract key features and by observing the changes in their values using fuzzy logic. Shbib and Zhou [24] applied the Active Shape Model (ASM) fitting technique to extract facial feature points; the geometric displacement of the projected ASM feature points and the mean ASM shape were then analyzed to recognize facial expressions. Two feature extraction approaches, Correlation Features Selection (CFS) and Empirical Normalized Distances (END), are applied in [6]. Both techniques are based on geometric feature extraction using a Point Distribution Model (PDM) tracker to localize landmark positions.

As reported in the literature [3], [7], [11], appearance feature extraction techniques can be applied to the whole face, to specific face-regions, or to patches around some facial landmarks in order to extract the appearance textures of the face. Recently, many studies have focused on recognizing facial expressions from specific face-regions. In [27], the authors extracted facial regions using AAM and then computed facial features from the defined regions by applying the Gabor wavelet transform. The authors in [5], [7] defined facial components from which they extracted HOG feature descriptors. Almost all the studies cited above used a Support Vector Machine (SVM) as the classifier for facial expression recognition.

The objective of this work is to automatically recognize the six basic emotions as well as the neutral state by applying appearance-based methods, such as LBP and its variants, to new specific face-regions. Two main contributions are reported in this paper: (1) the decomposition of the face into seven regions of interest (ROI), where each ROI contains one face component; this novel decomposition improves facial expression recognition performance compared to state-of-the-art methods based on specific face-regions or on the whole face. (2) A comprehensive study carried out with different descriptors, in particular a dynamic LTP for which different formulas are adopted to calculate its threshold. The evaluation is performed on two databases using SVM.
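As a rough illustration of how a region-based pipeline of this kind could be assembled, the sketch below crops a fixed-size patch around a landmark position for each of the seven components, concatenates per-region LBP histograms (reusing lbp_histogram from the sketch above), and trains a multiclass SVM with scikit-learn. The ROI names, the 32-pixel crop size, the linear kernel and the landmark format are assumptions for illustration; the paper's actual ROI construction from IntraFace landmarks and its SVM settings are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

# Names for the seven components; the mapping from IntraFace landmark indices
# to these centre positions is not given in the excerpt, so it is left to the caller.
ROI_NAMES = ["left_eyebrow", "right_eyebrow", "left_eye", "right_eye",
             "between_eyebrows", "nose", "mouth"]

def crop_roi(gray, center_xy, size=32):
    """Crop an approximately size x size patch centred on a landmark position."""
    x, y = int(center_xy[0]), int(center_xy[1])
    half = size // 2
    return gray[max(y - half, 0):y + half, max(x - half, 0):x + half]

def face_descriptor(gray, roi_centers):
    """Concatenate one LBP histogram per ROI.

    `roi_centers` maps each name in ROI_NAMES to an (x, y) image position;
    lbp_histogram is the function from the previous sketch."""
    parts = [lbp_histogram(crop_roi(gray, roi_centers[name])) for name in ROI_NAMES]
    return np.concatenate(parts)

def train_classifier(X, y):
    """Multiclass SVM over the seven classes (six basic emotions + neutral).

    X: (n_samples, 7 * 256) matrix of concatenated ROI histograms,
    y: emotion labels; SVC handles the one-vs-one multiclass decomposition."""
    clf = SVC(kernel="linear", C=1.0)
    clf.fit(X, y)
    return clf
```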
