1. INTRODUCTION
Speech emotion recognition is one of the latest challenges in speech processing. Besides human facial expressions, speech has proven to be one of the most promising modalities for the automatic recognition of human emotions. Interest has grown over the last year, especially in the field of security systems. Lie detection, video games and psychiatric aid are often cited as further application scenarios for emotion recognition [1].

From a practical point of view, a technical approach to classification can only rely on pragmatic decisions about the kind, extent and number of emotions that suit the situation. To ensure robust classification, it seems reasonable to adapt and limit the number and kind of recognizable emotions to the requirements of the application. As yet, no standard exists for the classification of emotions in technical recognition. A frequently favored approach is to distinguish among a defined set of discrete emotions, but there is no consensus on their number or naming. A recent proposal can be found in the MPEG-4 standard, which names the six emotions anger, disgust, fear, joy, sadness and surprise. Adding a neutral state seems reasonable to represent the absence of any of these emotions. This classification serves as the basis for comparison throughout this work, also in anticipation of further comparisons.

Most current approaches to speech emotion recognition are based on global statistics of a phrase [2]. However, first efforts at recognition from instantaneous features also exist [3] [4]. We present two working engines that cover both of these alternatives using continuous hidden Markov models, which have evolved into a widespread standard technique in speech processing.