I. Introduction
Methods based on linear (Fisher) discriminant analysis (LDA) [1] have been shown to be effective for face recognition, and their superior performance has been reported in many studies [3]–[14] over the last decade. LDA is theoretically sound: its objective is to find the most discriminant features for classification, which makes it well suited to pattern recognition problems. However, LDA suffers from two major drawbacks. First, it is a linear method and therefore has difficulty handling nonlinear problems. Second, it suffers from the small sample size (S3) problem.