1. INTRODUCTION
Nonnegative matrix factorization (NMF) consists of decomposing nonnegative data $\mathbf{V} \in \mathbb{R}_{+}^{M \times N}$, such as a spectrogram, into \begin{equation*}{\mathbf{V}} \approx {\mathbf{WH}}\tag{1}\end{equation*} where $\mathbf{W} \in \mathbb{R}_{+}^{M \times K}$ and $\mathbf{H} \in \mathbb{R}_{+}^{K \times N}$ are two nonnegative matrices referred to as the dictionary and the activation matrix, respectively. $K$ is usually chosen such that the decomposition is low-rank ($K < \min(M, N)$). NMF is known to produce a part-based representation of $\mathbf{V}$. This method, popularized by Lee and Seung [1], has led to state-of-the-art results in audio source separation [2], [3], [4] and music transcription [5], [6]. In the context of audio processing, NMF is typically applied to a spectrogram, with each column corresponding to one time frame of data. It can lead to a meaningful decomposition in which the dictionary tends to capture the vertical structure (spectral patterns) while the activation matrix encodes how these patterns are mixed in each time frame.
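To make the factorization in (1) concrete, the following is a minimal sketch and not the method developed in this paper. It assumes a squared Euclidean (Frobenius) cost and the standard multiplicative update rules commonly attributed to Lee and Seung. The function name `nmf_multiplicative`, the iteration count, the small constant `eps`, and the synthetic "spectrogram" are all illustrative assumptions.

```python
import numpy as np


def nmf_multiplicative(V, K, n_iter=500, eps=1e-10, seed=0):
    """Approximate a nonnegative M x N matrix V by W @ H.

    W (M x K) plays the role of the dictionary and H (K x N) the
    activation matrix. Assumes a Frobenius-norm cost; other costs
    would need different update rules.
    """
    rng = np.random.default_rng(seed)
    M, N = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((M, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        # Multiplicative updates preserve nonnegativity of W and H.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: 2 spectral patterns mixed over 30 time frames.
rng = np.random.default_rng(1)
W_true = rng.random((20, 2))   # 20 frequency bins, 2 patterns
H_true = rng.random((2, 30))   # activations per time frame
V = W_true @ H_true            # nonnegative, exactly rank 2

W, H = nmf_multiplicative(V, K=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
```

Here each column of `W` can be read as one spectral pattern and each row of `H` as its gain across time frames. The factorization is not unique, so the recovered `W` and `H` generally match `W_true` and `H_true` only up to scaling and permutation, if at all.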