Learning to Answer Questions in Dynamic Audio-Visual Scenarios | IEEE Conference Publication | IEEE Xplore