I. Introduction
Noise reduction algorithms for head-mounted assistive listening devices (e.g., headsets, hearing aids, cochlear implants) are crucial to improve speech quality and intelligibility in background noise. For binaural microphone configurations, algorithms that use the microphone signals from the left and the right hearing device simultaneously are promising noise reduction techniques, because the spatial information captured by all microphones can be exploited [1]–[4]. Besides noise reduction and limiting speech distortion, another important objective of binaural algorithms is the preservation of the binaural cues of all sound sources. These binaural cues, i.e., the interaural level difference (ILD), the interaural time difference (ITD), and the interaural coherence (IC), are important for spatial awareness, i.e., for source localization and for determining the width of sound fields, and have a major impact on speech intelligibility due to so-called binaural unmasking [5]–[8]. When the desired speech source is spatially separated from the interfering sources and background noise, a binaural hearing advantage over monaural hearing occurs. For example, in an anechoic environment with one desired speech source and one interfering source, both located in front of the listener, a speech reception threshold (SRT) corresponding to 50% speech intelligibility of about −8 dB is obtained [5]. If the sources are spatially separated, i.e., if the interfering source is not located in front of the listener, the SRT may decrease to as low as −20 dB, depending on the position of the interfering source. Although for reverberant environments this SRT difference is smaller than for anechoic environments, SRT differences of up to 6 dB have been reported for spatially separated sources [9].
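As a point of reference, the three binaural cues can be expressed in the frequency domain in terms of the auto- and cross-power spectral densities of the left and right signals; the notation below ($\Phi_{LL}$, $\Phi_{RR}$, $\Phi_{LR}$) is a common convention assumed here, not taken from this paper:

```latex
\mathrm{ILD}(\omega) = 10 \log_{10} \frac{\Phi_{LL}(\omega)}{\Phi_{RR}(\omega)} \;\; \mathrm{[dB]}, \qquad
\mathrm{ITD}(\omega) = \frac{\angle \Phi_{LR}(\omega)}{\omega}, \qquad
\mathrm{IC}(\omega) = \frac{\Phi_{LR}(\omega)}{\sqrt{\Phi_{LL}(\omega)\,\Phi_{RR}(\omega)}},
```

where $\Phi_{LR}(\omega) = \mathcal{E}\{Y_L(\omega) Y_R^*(\omega)\}$ denotes the cross-power spectral density of the left and right signals $Y_L$ and $Y_R$, and the ITD is derived from the interaural phase difference.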
Furthermore, for scenarios with one desired speech source masked by a diffuse noise field, as considered in this paper, an improvement of the speech reception threshold (SRT) of 2–3 dB for both normal-hearing and hearing-impaired listeners has been reported [10], while no improvement in SRT can be observed if the desired speech source and the noise component both come from the same direction, i.e., contain the same spatial information [7].

For combined binaural noise reduction and cue preservation, two main concepts have been established. In the first concept, the same real-valued spectro-temporal gain, which is typically derived from beamforming algorithms, blind source separation techniques, or spatial assumptions, is applied to the reference microphone signals in both hearing devices [11]–[15]. Using this concept allows for perfect preservation of the instantaneous binaural cues, but may introduce audible artifacts.
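The cue-preservation property of this first concept can be verified numerically: because an identical real-valued gain scales both channels, the interaural transfer function (whose magnitude and phase carry the ILD and ITD) is unchanged. The sketch below uses synthetic STFT coefficients; the variable names are illustrative and not taken from this paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 256  # number of STFT frequency bins (illustrative)

# Complex STFT coefficients of the left and right reference microphone signals
y_l = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
y_r = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)

# A real-valued spectro-temporal gain, e.g., from a beamformer-derived postfilter
g = rng.uniform(0.1, 1.0, size=n_bins)

# First concept: apply the SAME gain to both reference signals
z_l, z_r = g * y_l, g * y_r

# Interaural transfer function (left/right ratio) before and after processing
itf_in = y_l / y_r
itf_out = z_l / z_r

# The real common gain cancels in the ratio: ILD (magnitude) and
# ITD/IPD (phase) of all components are preserved exactly.
print(np.allclose(np.abs(itf_in), np.abs(itf_out)))      # ILD preserved
print(np.allclose(np.angle(itf_in), np.angle(itf_out)))  # phase cues preserved
```

Note that this preservation holds for every signal component passing through the gain, including the noise, which is precisely why the residual noise retains its original spatial impression under this concept.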