1 Introduction
It is common practice to preprocess data by extracting linear or nonlinear features. Many such feature extraction techniques are built around a criterion that assesses the quality of a single feature and is to be optimized. Additional features are then found by optimizing the same criterion under additional constraints with respect to the previously extracted features. The best-known feature extraction technique in this framework is principal component analysis, PCA (e.g., [1]). However, PCA is a linear technique and cannot capture nonlinear structure in a data set. Therefore, nonlinear generalizations have been proposed, among them kernel PCA [2], which computes the principal components of the data set mapped nonlinearly into some high-dimensional feature space. This work generalizes what has been done for kernel PCA to a more general setting. We first recall how prior information can be used to extract meaningful features in a linear setting, which leads us to the Rayleigh coefficient. In a second step, which is the main contribution of this work, we propose a nonlinear variant of the Rayleigh coefficient and discuss regularization approaches and implementation issues.
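To make the kernel PCA idea referenced above concrete, the following is a minimal illustrative sketch (not the implementation used in this work): the data are mapped implicitly into feature space via a kernel, the kernel matrix is centered there, and its leading eigenvectors yield the nonlinear principal components. The RBF kernel and its width parameter `gamma` are assumptions chosen purely for illustration.

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Illustrative kernel PCA sketch: project data onto the leading
    principal components in an RBF-kernel-induced feature space.
    The kernel choice and gamma are hypothetical, not from the paper."""
    # RBF kernel matrix from pairwise squared Euclidean distances
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * d2)
    # Center the kernel matrix, i.e., center the mapped data in feature space
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition; np.linalg.eigh returns eigenvalues in ascending order
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    # Normalize expansion coefficients so feature-space eigenvectors have unit norm
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    # Projections of the training points onto the kernel principal components
    return Kc @ alphas

X = np.random.RandomState(0).randn(50, 3)
Z = kernel_pca(X, n_components=2)
print(Z.shape)  # (50, 2)
```

The linear-algebra structure here (an eigenproblem on a centered kernel matrix) is the same pattern that reappears when the Rayleigh coefficient is kernelized later in the paper.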