I. Introduction
Direction of arrival (DOA) estimation is a well-known task in audio signal processing and an important component of many applications, such as speech separation [1] and speech enhancement [2]. Neural networks have been shown to outperform classical parametric approaches in DOA estimation, especially in demanding reverberant, noisy, and low-SNR environments [3]–[6]. In addition, Ambisonics-based audio signal processing is becoming increasingly popular due to the flexibility and generalizability it offers. Therefore, DOA estimation based on first-order Ambisonics (FOA) signals has received considerable attention [6]–[9].